Commit 736cd2c

Mike Snitzer authored and Anna Schumaker committed
nfs: add "NFS Client and Server Interlock" section to localio.rst
This section answers a new FAQ entry:

9. How does LOCALIO make certain that object lifetimes are managed
   properly given NFSD and NFS operate in different contexts?

   See the detailed "NFS Client and Server Interlock" section below.

The first half of the section details NeilBrown's elegant design for
LOCALIO's nfs_uuid_t based interlock and is heavily based on Neil's
"net namespace refcounting" description here:
https://marc.info/?l=linux-nfs&m=172498546024767&w=2

The second half of the section details the per-cpu-refcount introduced
to ensure NFSD's nfsd_serv isn't destroyed while in use by a LOCALIO
client.

Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: NeilBrown <neilb@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
1 parent f712826 commit 736cd2c

1 file changed, +68 -0 lines changed

Documentation/filesystems/nfs/localio.rst

Lines changed: 68 additions & 0 deletions
@@ -150,6 +150,11 @@ FAQ
    __fh_verify(). So they get handled exactly the same way for LOCALIO
    as they do for non-LOCALIO.
 
+9. How does LOCALIO make certain that object lifetimes are managed
+   properly given NFSD and NFS operate in different contexts?
+
+   See the detailed "NFS Client and Server Interlock" section below.
+
 RPC
 ===
 
@@ -209,6 +214,69 @@ objects to span from the host kernel's nfsd to per-container knfsd
 instances that are connected to nfs clients running on the same local
 host.
 
+NFS Client and Server Interlock
+===============================
+
+LOCALIO provides the nfs_uuid_t object and associated interfaces to
+allow proper network namespace (net-ns) and NFSD object refcounting:
+
+    We don't want to keep a long-term counted reference on each NFSD's
+    net-ns in the client because that prevents a server container from
+    completely shutting down.
+
+    So we avoid taking a reference at all and rely on the per-cpu
+    reference to the server (detailed below) being sufficient to keep
+    the net-ns active. This involves allowing the NFSD's net-ns exit
+    code to iterate all active clients and clear their ->net pointers
+    (which are needed to find the per-cpu-refcount for the nfsd_serv).
+
+    Details:
+
+     - Embed nfs_uuid_t in nfs_client. nfs_uuid_t provides a list_head
+       that can be used to find the client. This adds the 16-byte
+       uuid_t to nfs_client, so it is bigger than needed (given that
+       uuid_t is only used during the initial NFS client and server
+       LOCALIO handshake to determine if they are local to each other).
+       If that is really a problem we can find a fix.
+
+     - When the nfs server confirms that the uuid_t is local, it moves
+       the nfs_uuid_t onto a per-net-ns list in NFSD's nfsd_net.
+
+     - When a server's net-ns is shutting down, a "pre_exit" handler
+       clears ->net in all of these nfs_uuid_t. There is a
+       synchronize_rcu() call between the pre_exit() and exit()
+       handlers, so any caller that sees nfs_uuid_t ->net as not NULL
+       can safely manage the per-cpu-refcount for nfsd_serv.
+
+     - The client's nfs_uuid_t is passed to nfsd_open_local_fh() so it
+       can safely dereference ->net in a private rcu_read_lock() section
+       to allow safe access to the associated nfsd_net and nfsd_serv.
+
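The list handling in the "Details" bullets can be illustrated with a small userspace model. This is only a sketch under stated assumptions: every `model_*` name is hypothetical and is not a kernel identifier, the list is a plain singly linked list rather than a kernel list_head, and the code is single-threaded, so the rcu_read_lock() section and the synchronize_rcu() grace period appear only as comments and are not implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel objects; illustration only. */
struct model_net { int id; };

struct model_uuid {
    struct model_net *net;    /* NULL until the server confirms "local" */
    struct model_uuid *next;  /* link on the server's per-net-ns list */
};

struct model_nfsd_net {
    struct model_net net;
    struct model_uuid *local_clients; /* head of the per-net-ns list */
};

/* Handshake succeeded: the server takes the client's uuid object onto
 * its per-net-ns list and publishes ->net. */
void model_uuid_is_local(struct model_nfsd_net *nn, struct model_uuid *u)
{
    assert(u->net == NULL); /* a uuid object is claimed at most once */
    u->net = &nn->net;
    u->next = nn->local_clients;
    nn->local_clients = u;
}

/* "pre_exit" step: unlink every uuid object and clear its ->net.
 * In the kernel a grace period would follow before exit() runs.
 * Returns how many clients were detached. */
int model_pre_exit(struct model_nfsd_net *nn)
{
    int cleared = 0;

    while (nn->local_clients != NULL) {
        struct model_uuid *u = nn->local_clients;

        nn->local_clients = u->next;
        u->next = NULL;
        u->net = NULL;
        cleared++;
    }
    return cleared;
}

/* Open path: in the kernel this check would sit inside an
 * rcu_read_lock() section. A NULL ->net means the server is gone. */
bool model_can_open_local(const struct model_uuid *u)
{
    return u->net != NULL;
}
```

In this model a caller that observes a NULL ->net must not touch any server-side object; only a non-NULL ->net permits moving on to the reference-taking step.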
+So LOCALIO required the introduction and use of NFSD's percpu_ref to
+interlock nfsd_destroy_serv() and nfsd_open_local_fh(), to ensure each
+nn->nfsd_serv is not destroyed while in use by nfsd_open_local_fh().
+This warrants a more detailed explanation:
+
+    nfsd_open_local_fh() uses nfsd_serv_try_get() before opening its
+    nfsd_file handle and then the caller (NFS client) must drop the
+    reference for the nfsd_file and associated nn->nfsd_serv using
+    nfs_file_put_local() once it has completed its IO.
+
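The try-get and put contract described above can be modelled in userspace with C11 atomics. This sketch is a stand-in resting on assumptions and is not the kernel's percpu_ref: the real percpu_ref keeps per-CPU counters and only switches to a shared atomic counter once it is killed, whereas this model uses a single atomic word throughout. The `model_*` names are hypothetical. They loosely correspond to nfsd_serv_try_get(), to the put performed through nfs_file_put_local(), and to a kill step that is assumed to happen during nfsd_destroy_serv().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Bit 31 marks the reference as killed; the low bits count users. */
#define MODEL_REF_DEAD (1u << 31)

struct model_serv_ref {
    atomic_uint state;
};

void model_ref_init(struct model_serv_ref *r)
{
    atomic_init(&r->state, 0);
}

/* Take a reference only if the server has not started shutting down. */
bool model_serv_try_get(struct model_serv_ref *r)
{
    unsigned int old = atomic_load(&r->state);

    do {
        if (old & MODEL_REF_DEAD)
            return false; /* shutdown began: refuse new users */
        /* On CAS failure 'old' is reloaded and the check repeats. */
    } while (!atomic_compare_exchange_weak(&r->state, &old, old + 1));
    return true;
}

/* Drop a reference previously obtained with model_serv_try_get(). */
void model_serv_put(struct model_serv_ref *r)
{
    atomic_fetch_sub(&r->state, 1);
}

/* Shutdown side: forbid new references from now on. */
void model_serv_kill(struct model_serv_ref *r)
{
    atomic_fetch_or(&r->state, MODEL_REF_DEAD);
}

/* True once the reference is killed and every user has dropped it. */
bool model_serv_can_destroy(struct model_serv_ref *r)
{
    return atomic_load(&r->state) == MODEL_REF_DEAD;
}
```

The point of the design is the ordering: once the kill step has run, no new user can get in, and destruction waits until the existing users have called put.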
+    This interlock relies heavily on nfsd_open_local_fh() being
+    afforded the ability to safely deal with the possibility that the
+    NFSD's net-ns (and nfsd_net by association) may have been destroyed
+    by nfsd_destroy_serv() via nfsd_shutdown_net() -- which is only
+    possible given the nfs_uuid_t ->net pointer management detailed
+    above.
+
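How the two checks compose on the open path can be sketched as follows. This is a single-threaded userspace model: all `model_*` names are hypothetical, the plain `bool` and `int` fields stand in for what would be RCU-protected pointers and a percpu_ref in the kernel, and returning -ENXIO on failure is an assumption made for this illustration, not a statement about what nfsd_open_local_fh() returns.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not kernel types. */
struct model_serv { bool dying; int users; };
struct model_net  { struct model_serv *serv; };
struct model_uuid { struct model_net *net; };

/* Models the try-get: fails once shutdown has begun. */
bool model_try_get(struct model_serv *s)
{
    if (s == NULL || s->dying)
        return false;
    s->users++;
    return true;
}

/* Models the open path: first the ->net check, then the try-get.
 * Returns 0 and sets *out on success, -ENXIO (assumed) otherwise. */
int model_open_local(struct model_uuid *u, struct model_serv **out)
{
    struct model_net *net = u->net; /* kernel: read under rcu_read_lock() */

    if (net == NULL)
        return -ENXIO; /* server net-ns already went through pre_exit */
    if (!model_try_get(net->serv))
        return -ENXIO; /* server is shutting down */
    *out = net->serv;
    return 0;
}

/* Models the client dropping its reference once IO has completed. */
void model_put_local(struct model_serv *s)
{
    s->users--;
}
```

Either failure leaves the caller without a pointer to the server object, which is the property the interlock is meant to guarantee.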
+All told, this elaborate interlock of the NFS client and server has been
+verified to fix an easy-to-hit crash that would occur if an NFSD
+instance running in a container, with a LOCALIO client mounted, is
+shut down. Upon restart of the container and associated NFSD, the client
+would go on to crash due to a NULL pointer dereference caused by the
+LOCALIO client attempting to call nfsd_open_local_fh(), using
+nn->nfsd_serv, without having a proper reference on nn->nfsd_serv.
+
 NFS Client issues IO instead of Server
 ======================================
 