Commit 86fb594

Merge tag 'netfs-lib-20231228' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

Pull netfs updates from David Howells:

The main aims of these patches are to get high-level I/O and knowledge of the pagecache out of the filesystem drivers as much as possible and to get rid, as much as possible, of the knowledge that pages/folios exist. Further, I would like to see ->write_begin, ->write_end and ->launder_folio go away.

Features that are added by these patches to that which is already there in netfslib:

 (1) NFS-style (and Ceph-style) locking around DIO vs buffered I/O calls to prevent these from happening at the same time. mmap'd I/O can, of necessity, happen at any time ignoring these locks.

 (2) Support for unbuffered I/O. The data is kept in the bounce buffer and the pagecache is not used. This can be turned on with an inode flag.

 (3) Support for direct I/O. This is basically unbuffered I/O with some extra restrictions and no RMW.

 (4) Support for using a bounce buffer in an operation. The bounce buffer may be bigger than the target data/buffer, allowing for crypto rounding.

 (5) ->write_begin() and ->write_end() are ignored in favour of merging all of that into one function, netfs_perform_write(), thereby avoiding the function pointer traversals.

 (6) Support for write-through caching in the pagecache. netfs_perform_write() adds the pages it modifies to an I/O operation as it goes and directly marks them writeback rather than dirty. When writing back from write-through, it limits the range written back. This should allow CIFS to deal with byte-range mandatory locks correctly.

 (7) O_*SYNC and RWF_*SYNC writes use write-through rather than writing to the pagecache and then flushing afterwards. An AIO O_*SYNC write will notify of completion when the sub-writes all complete.

 (8) Support for write-streaming where modified data is held in !uptodate folios, with a private struct attached indicating the range that is valid.

 (9) Support for write grouping, multiplexing a pointer to a group in the folio private data with the write-streaming data. The writepages algorithm only writes stuff back that's in the nominated group. This is intended for use by Ceph to write its snaps in order.

 (10) Skipping reads for which we know the server could only supply zeros or EOF (for instance if we've done a local write that leaves a hole in the file and extends the local inode size).

General notes:

 (1) The fscache module is merged into the netfslib module to avoid cyclic exported symbol usage that prevents either module from being loaded.

 (2) Some helpers from fscache are reassigned to netfslib by name.

 (3) netfslib now makes use of folio->private, which means the filesystem can't use it.

 (4) The filesystem provides wrappers to call the write helpers, allowing it to do pre-validation, oplock/capability fetching and the passing in of write group info.

 (5) I want to try flushing the data when tearing down an inode before invalidating it to try and render launder_folio unnecessary.

 (6) Write-through caching will generate and dispatch write subrequests as it gathers enough data to hit wsize and has whole pages that at least span that size. This needs to be a bit more flexible, allowing for a filesystem such as CIFS to have a variable wsize.

 (7) The filesystem driver is just given read and write calls with an iov_iter describing the data/buffer to use. Ideally, they don't see pages or folios at all. A function, extract_iter_to_sg(), is already available to decant part of an iterator into a scatterlist for crypto purposes.

AFS notes:

 (1) I pushed a pair of patches that clean up the trace header down to the base so that they can be shared with another branch.

9P notes:

 (1) Most of xfstests now pass - more, in fact, since upstream 9p lacks a writepages method and can't handle mmap writes. An occasional oops (and sometimes panic) happens somewhere in the pathwalk/FID handling code that is unrelated to these changes.

 (2) Writes should now occur in larger-than-page-sized chunks.

 (3) It should be possible to turn on multipage folio support in 9P now.

All in all these patches remove a little over 800 lines from AFS, 300 from 9P, albeit with around 3000 lines added to netfs. Hopefully, I will be able to remove a bunch of lines from Ceph too.

I've split the CIFS patches out to a separate branch, cifs-netfs, where a further 2000+ lines are removed. I can run a certain amount of xfstests on CIFS, though I'm running into ksmbd issues and not all the tests work correctly because of issues between fallocate and what the SMB protocol actually supports.

I've also dropped the content-crypto patches out for the moment as they're only usable by the ceph changes which I'm still working on.

The patch to use PG_writeback instead of PG_fscache for writing to the cache has also been deferred, pending 9p, afs, ceph and cifs all being converted.

* tag 'netfs-lib-20231228' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: (40 commits)
  9p: Use netfslib read/write_iter
  afs: Use the netfs write helpers
  netfs: Export the netfs_sreq tracepoint
  netfs: Optimise away reads above the point at which there can be no data
  netfs: Implement a write-through caching option
  netfs: Provide a launder_folio implementation
  netfs: Provide a writepages implementation
  netfs, cachefiles: Pass upper bound length to allow expansion
  netfs: Provide netfs_file_read_iter()
  netfs: Allow buffered shared-writeable mmap through netfs_page_mkwrite()
  netfs: Implement buffered write API
  netfs: Implement unbuffered/DIO write support
  netfs: Implement unbuffered/DIO read support
  netfs: Allocate multipage folios in the writepath
  netfs: Make netfs_read_folio() handle streaming-write pages
  netfs: Provide func to copy data to pagecache for buffered write
  netfs: Dispatch write requests to process a writeback slice
  netfs: Prep to use folio->private for write grouping and streaming write
  netfs: Make the refcounting of netfs_begin_read() easier to use
  netfs: Make netfs_put_request() handle a NULL pointer
  ...

Signed-off-by: Christian Brauner <brauner@kernel.org>
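
Feature (10) in the message above, skipping reads that could only return zeros, can be pictured with a small userspace model. This is only a sketch under stated assumptions: `struct read_plan`, `plan_read()` and the `zero_point` parameter are names made up for illustration and are not the netfslib interface. The model assumes the client tracks one file offset at or beyond which the server is known to hold no data, so any part of a read at or above that offset can be satisfied by clearing the buffer locally.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical userspace model, not kernel code: how a read of
 * [start, start + len) is split between the server and local zeroing.
 */
struct read_plan {
	size_t from_server;	/* leading bytes that must be fetched */
	size_t zero_fill;	/* trailing bytes that can be cleared locally */
};

/*
 * zero_point is the assumed offset at or above which the server can
 * only supply zeros or EOF (for example, after a local write extended
 * the file and left a hole).
 */
static struct read_plan plan_read(unsigned long long start, size_t len,
				  unsigned long long zero_point)
{
	struct read_plan plan = { 0, 0 };

	if (start >= zero_point) {
		/* Entirely above the zero point: no server round trip. */
		plan.zero_fill = len;
	} else if (len > zero_point - start) {
		/* Straddles the zero point: fetch the head, zero the tail. */
		plan.from_server = (size_t)(zero_point - start);
		plan.zero_fill = len - plan.from_server;
	} else {
		/* Entirely below the zero point: fetch everything. */
		plan.from_server = len;
	}

	/* The two parts must always cover the request exactly. */
	assert(plan.from_server + plan.zero_fill == len);
	return plan;
}
```

With a zero point of 8192, for instance, `plan_read(4096, 8192, 8192)` would fetch the first 4096 bytes and zero the remaining 4096, while `plan_read(8192, 4096, 8192)` would not contact the server at all.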
2 parents 861deac + 80105ed commit 86fb594

72 files changed, +4248 -2485 lines changed

Documentation/filesystems/netfs_library.rst

Lines changed: 4 additions & 19 deletions
@@ -295,7 +295,6 @@ through which it can issue requests and negotiate::
 struct netfs_request_ops {
 void (*init_request)(struct netfs_io_request *rreq, struct file *file);
 void (*free_request)(struct netfs_io_request *rreq);
-int (*begin_cache_operation)(struct netfs_io_request *rreq);
 void (*expand_readahead)(struct netfs_io_request *rreq);
 bool (*clamp_length)(struct netfs_io_subrequest *subreq);
 void (*issue_read)(struct netfs_io_subrequest *subreq);
@@ -317,20 +316,6 @@ The operations are as follows:
 [Optional] This is called as the request is being deallocated so that the
 filesystem can clean up any state it has attached there.
 
-* ``begin_cache_operation()``
-
-[Optional] This is called to ask the network filesystem to call into the
-cache (if present) to initialise the caching state for this read. The netfs
-library module cannot access the cache directly, so the cache should call
-something like fscache_begin_read_operation() to do this.
-
-The cache gets to store its state in ->cache_resources and must set a table
-of operations of its own there (though of a different type).
-
-This should return 0 on success and an error code otherwise. If an error is
-reported, the operation may proceed anyway, just without local caching (only
-out of memory and interruption errors cause failure here).
-
 * ``expand_readahead()``
 
 [Optional] This is called to allow the filesystem to expand the size of a
@@ -460,14 +445,14 @@ When implementing a local cache to be used by the read helpers, two things are
 required: some way for the network filesystem to initialise the caching for a
 read request and a table of operations for the helpers to call.
 
-The network filesystem's ->begin_cache_operation() method is called to set up a
-cache and this must call into the cache to do the work. If using fscache, for
-example, the cache would call::
+To begin a cache operation on an fscache object, the following function is
+called::
 
 int fscache_begin_read_operation(struct netfs_io_request *rreq,
 struct fscache_cookie *cookie);
 
-passing in the request pointer and the cookie corresponding to the file.
+passing in the request pointer and the cookie corresponding to the file. This
+fills in the cache resources mentioned below.
 
 The netfs_io_request object contains a place for the cache to hang its
 state::

MAINTAINERS

Lines changed: 13 additions & 8 deletions
@@ -8133,6 +8133,19 @@ S: Supported
 F: fs/iomap/
 F: include/linux/iomap.h
 
+FILESYSTEMS [NETFS LIBRARY]
+M: David Howells <dhowells@redhat.com>
+L: linux-cachefs@redhat.com (moderated for non-subscribers)
+L: linux-fsdevel@vger.kernel.org
+S: Supported
+F: Documentation/filesystems/caching/
+F: Documentation/filesystems/netfs_library.rst
+F: fs/netfs/
+F: include/linux/fscache*.h
+F: include/linux/netfs.h
+F: include/trace/events/fscache.h
+F: include/trace/events/netfs.h
+
 FINTEK F75375S HARDWARE MONITOR AND FAN CONTROLLER DRIVER
 M: Riku Voipio <riku.voipio@iki.fi>
 L: linux-hwmon@vger.kernel.org
@@ -8567,14 +8580,6 @@ F: Documentation/power/freezing-of-tasks.rst
 F: include/linux/freezer.h
 F: kernel/freezer.c
 
-FS-CACHE: LOCAL CACHING FOR NETWORK FILESYSTEMS
-M: David Howells <dhowells@redhat.com>
-L: linux-cachefs@redhat.com (moderated for non-subscribers)
-S: Supported
-F: Documentation/filesystems/caching/
-F: fs/fscache/
-F: include/linux/fscache*.h
-
 FSCRYPT: FILE SYSTEM LEVEL ENCRYPTION SUPPORT
 M: Eric Biggers <ebiggers@kernel.org>
 M: Theodore Y. Ts'o <tytso@mit.edu>

arch/arm/configs/mxs_defconfig

Lines changed: 2 additions & 1 deletion
@@ -138,7 +138,8 @@ CONFIG_PWM_MXS=y
 CONFIG_NVMEM_MXS_OCOTP=y
 CONFIG_EXT4_FS=y
 # CONFIG_DNOTIFY is not set
-CONFIG_FSCACHE=m
+CONFIG_NETFS_SUPPORT=m
+CONFIG_FSCACHE=y
 CONFIG_FSCACHE_STATS=y
 CONFIG_CACHEFILES=m
 CONFIG_VFAT_FS=y

arch/csky/configs/defconfig

Lines changed: 2 additions & 1 deletion
@@ -34,7 +34,8 @@ CONFIG_GENERIC_PHY=y
 CONFIG_EXT4_FS=y
 CONFIG_FANOTIFY=y
 CONFIG_QUOTA=y
-CONFIG_FSCACHE=m
+CONFIG_NETFS_SUPPORT=m
+CONFIG_FSCACHE=y
 CONFIG_FSCACHE_STATS=y
 CONFIG_CACHEFILES=m
 CONFIG_MSDOS_FS=y

arch/mips/configs/ip27_defconfig

Lines changed: 2 additions & 1 deletion
@@ -287,7 +287,8 @@ CONFIG_BTRFS_FS_POSIX_ACL=y
 CONFIG_QUOTA_NETLINK_INTERFACE=y
 CONFIG_FUSE_FS=m
 CONFIG_CUSE=m
-CONFIG_FSCACHE=m
+CONFIG_NETFS_SUPPORT=m
+CONFIG_FSCACHE=y
 CONFIG_FSCACHE_STATS=y
 CONFIG_CACHEFILES=m
 CONFIG_PROC_KCORE=y

arch/mips/configs/lemote2f_defconfig

Lines changed: 2 additions & 1 deletion
@@ -238,7 +238,8 @@ CONFIG_BTRFS_FS=m
 CONFIG_QUOTA=y
 CONFIG_QFMT_V2=m
 CONFIG_AUTOFS_FS=m
-CONFIG_FSCACHE=m
+CONFIG_NETFS_SUPPORT=m
+CONFIG_FSCACHE=y
 CONFIG_CACHEFILES=m
 CONFIG_ISO9660_FS=m
 CONFIG_JOLIET=y

arch/mips/configs/loongson3_defconfig

Lines changed: 2 additions & 1 deletion
@@ -356,7 +356,8 @@ CONFIG_QFMT_V2=m
 CONFIG_AUTOFS_FS=y
 CONFIG_FUSE_FS=m
 CONFIG_VIRTIO_FS=m
-CONFIG_FSCACHE=m
+CONFIG_NETFS_SUPPORT=m
+CONFIG_FSCACHE=y
 CONFIG_ISO9660_FS=m
 CONFIG_JOLIET=y
 CONFIG_MSDOS_FS=m

arch/mips/configs/pic32mzda_defconfig

Lines changed: 2 additions & 1 deletion
@@ -68,7 +68,8 @@ CONFIG_EXT4_FS_POSIX_ACL=y
 CONFIG_EXT4_FS_SECURITY=y
 CONFIG_AUTOFS_FS=m
 CONFIG_FUSE_FS=m
-CONFIG_FSCACHE=m
+CONFIG_NETFS_SUPPORT=m
+CONFIG_FSCACHE=y
 CONFIG_ISO9660_FS=m
 CONFIG_JOLIET=y
 CONFIG_ZISOFS=y

arch/s390/configs/debug_defconfig

Lines changed: 2 additions & 1 deletion
@@ -637,8 +637,9 @@ CONFIG_FUSE_FS=y
 CONFIG_CUSE=m
 CONFIG_VIRTIO_FS=m
 CONFIG_OVERLAY_FS=m
+CONFIG_NETFS_SUPPORT=m
 CONFIG_NETFS_STATS=y
-CONFIG_FSCACHE=m
+CONFIG_FSCACHE=y
 CONFIG_CACHEFILES=m
 CONFIG_ISO9660_FS=y
 CONFIG_JOLIET=y

arch/s390/configs/defconfig

Lines changed: 2 additions & 1 deletion
@@ -622,8 +622,9 @@ CONFIG_FUSE_FS=y
 CONFIG_CUSE=m
 CONFIG_VIRTIO_FS=m
 CONFIG_OVERLAY_FS=m
+CONFIG_NETFS_SUPPORT=m
 CONFIG_NETFS_STATS=y
-CONFIG_FSCACHE=m
+CONFIG_FSCACHE=y
 CONFIG_CACHEFILES=m
 CONFIG_ISO9660_FS=y
 CONFIG_JOLIET=y
