Commit 3931e67

Merge branch 'netfs-writeback' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
Pull netfs writeback updates from David Howells:

The primary purpose of these patches is to rework the netfslib writeback
implementation such that pages read from the cache are written to the cache
through ->writepages(), thereby allowing the fscache page flag to be retired.

The reworking also:

 (1) builds on top of the new writeback_iter() infrastructure;

 (2) makes it possible to use vectored write RPCs as discontiguous streams
     of pages can be accommodated;

 (3) makes it easier to do simultaneous content crypto and stream division.

 (4) provides support for retrying writes and re-dividing a stream;

 (5) replaces the ->launder_folio() op, so that ->writepages() is used
     instead;

 (6) uses mempools to allocate the netfs_io_request and netfs_io_subrequest
     structs to avoid allocation failure in the writeback path.

Some code that uses the fscache page flag is retained for compatibility
purposes with nfs and ceph. The code is switched to using the synonymous
private_2 label instead and marked with deprecation comments.

I have a separate set of patches that convert cifs to use this code.

-~-

In this new implementation, writeback_iter() is used to pump folios,
progressively creating two parallel, but separate streams. Either or both
streams can contain gaps, and the subrequests in each stream can be of
variable size, don't need to align with each other and don't need to align
with the folios. (Note that more streams can be added if we have multiple
servers to duplicate data to).

Indeed, subrequests can cross folio boundaries, may cover several folios or a
folio may be spanned by multiple subrequests, e.g.:

             +---+---+-----+-----+---+----------+
    Folios:  |   |   |     |     |   |          |
             +---+---+-----+-----+---+----------+

               +------+------+     +----+----+
    Upload:    |      |      |.....|    |    |
               +------+------+     +----+----+

             +------+------+------+------+------+
    Cache:   |      |      |      |      |      |
             +------+------+------+------+------+

Data that got read from the server that needs copying to the cache is stored
in folios that are marked dirty and have folio->private set to a special
value.

The progressive subrequest construction permits the algorithm to be preparing
both the next upload to the server and the next write to the cache whilst the
previous ones are already in progress. Throttling can be applied to control
the rate of production of subrequests - and, in any case, we probably want to
write them to the server in ascending order, particularly if the file will be
extended.

Content crypto can also be prepared at the same time as the subrequests and
run asynchronously, with the prepped requests being stalled until the crypto
catches up with them. This might also be useful for transport crypto, but
that happens at a lower layer, so probably would be harder to pull off.

The algorithm is split into three parts:

 (1) The issuer. This walks through the data, packaging it up, encrypting
     it and creating subrequests. The part of this that generates
     subrequests only deals with file positions and spans and so is usable
     for DIO/unbuffered writes as well as buffered writes.

 (2) The collector. This asynchronously collects completed subrequests,
     unlocks folios, frees crypto buffers and performs any retries. This
     runs in a work queue so that the issuer can return to the caller for
     writeback (so that the VM can have its kswapd thread back) or async
     writes.

     Collection is slightly complex as the collector has to work out where
     discontiguities happen in the folio list so that it doesn't try and
     collect folios that weren't included in the write out.

 (3) The retryer. This pauses the issuer, waits for all outstanding
     subrequests to complete and then goes through the failed subrequests
     to reissue them. This may involve reprepping them (with cifs, the
     credits must be renegotiated and a subrequest may need splitting), and
     doing RMW for content crypto if there's a conflicting change on the
     server.

* 'netfs-writeback' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: (22 commits)
  netfs, afs: Use writeback retry to deal with alternate keys
  netfs: Miscellaneous tidy ups
  netfs: Remove the old writeback code
  netfs: Cut over to using new writeback code
  netfs, cachefiles: Implement helpers for new write code
  netfs, 9p: Implement helpers for new write code
  netfs, afs: Implement helpers for new write code
  netfs: Add some write-side stats and clean up some stat names
  netfs: New writeback implementation
  netfs: Switch to using unsigned long long rather than loff_t
  mm: Export writeback_iter()
  netfs: Use mempools for allocating requests and subrequests
  netfs: Remove ->launder_folio() support
  afs: Use alternative invalidation to using launder_folio
  9p: Use alternative invalidation to using launder_folio
  mm: Provide a means of invalidation without using launder_folio
  netfs: Use subreq_counter to allocate subreq debug_index values
  netfs: Make netfs_io_request::subreq_counter an atomic_t
  netfs: Remove deprecated use of PG_private_2 as a second writeback flag
  mm: Remove the PG_fscache alias for PG_private_2
  ...

Signed-off-by: Christian Brauner <brauner@kernel.org>
2 parents e67572c + 1ecb146 commit 3931e67
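
The merge message says that each stream is divided into subrequests by file position and span only, without regard to folio boundaries or to the other stream. The following is a minimal user-space sketch of that idea, not the netfslib code: `struct sketch_subreq` and `sketch_divide_span()` are hypothetical names invented for illustration, and the per-stream size cap stands in for whatever limit a real stream would negotiate.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a subrequest: just a file position and a length. */
struct sketch_subreq {
	unsigned long long start;
	size_t len;
};

/*
 * Divide the span [start, start + len) into pieces of at most max_len bytes.
 * Folio boundaries play no part in the division.  Up to out_cap pieces are
 * stored in out[]; the return value is the total number of pieces the span
 * needs, which may exceed out_cap.
 */
static size_t sketch_divide_span(unsigned long long start, size_t len,
				 size_t max_len,
				 struct sketch_subreq *out, size_t out_cap)
{
	size_t n = 0;

	assert(max_len > 0);
	while (len > 0) {
		size_t part = len < max_len ? len : max_len;

		if (n < out_cap) {
			out[n].start = start;
			out[n].len = part;
		}
		n++;
		start += part;
		len -= part;
	}
	return n;
}
```

Calling the function twice over the same dirty span with different caps (say one for an upload stream and one for a cache stream) yields two independent lists whose boundaries need not line up, which is the situation the diagram in the message depicts. Re-dividing after a failure, as the retryer is described as doing, would amount to calling it again on the failed span with a smaller cap.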

35 files changed, 2509 insertions(+), 1755 deletions(-)

fs/9p/vfs_addr.c

Lines changed: 35 additions & 25 deletions
@@ -26,36 +26,38 @@
 #include "cache.h"
 #include "fid.h"
 
-static void v9fs_upload_to_server(struct netfs_io_subrequest *subreq)
+/*
+ * Writeback calls this when it finds a folio that needs uploading. This isn't
+ * called if writeback only has copy-to-cache to deal with.
+ */
+static void v9fs_begin_writeback(struct netfs_io_request *wreq)
 {
-	struct p9_fid *fid = subreq->rreq->netfs_priv;
-	int err, len;
-
-	trace_netfs_sreq(subreq, netfs_sreq_trace_submit);
-	len = p9_client_write(fid, subreq->start, &subreq->io_iter, &err);
-	netfs_write_subrequest_terminated(subreq, len ?: err, false);
-}
+	struct p9_fid *fid;
 
-static void v9fs_upload_to_server_worker(struct work_struct *work)
-{
-	struct netfs_io_subrequest *subreq =
-		container_of(work, struct netfs_io_subrequest, work);
+	fid = v9fs_fid_find_inode(wreq->inode, true, INVALID_UID, true);
+	if (!fid) {
+		WARN_ONCE(1, "folio expected an open fid inode->i_ino=%lx\n",
+			  wreq->inode->i_ino);
+		return;
+	}
 
-	v9fs_upload_to_server(subreq);
+	wreq->wsize = fid->clnt->msize - P9_IOHDRSZ;
+	if (fid->iounit)
+		wreq->wsize = min(wreq->wsize, fid->iounit);
+	wreq->netfs_priv = fid;
+	wreq->io_streams[0].avail = true;
 }
 
 /*
- * Set up write requests for a writeback slice. We need to add a write request
- * for each write we want to make.
+ * Issue a subrequest to write to the server.
  */
-static void v9fs_create_write_requests(struct netfs_io_request *wreq, loff_t start, size_t len)
+static void v9fs_issue_write(struct netfs_io_subrequest *subreq)
 {
-	struct netfs_io_subrequest *subreq;
+	struct p9_fid *fid = subreq->rreq->netfs_priv;
+	int err, len;
 
-	subreq = netfs_create_write_request(wreq, NETFS_UPLOAD_TO_SERVER,
-					    start, len, v9fs_upload_to_server_worker);
-	if (subreq)
-		netfs_queue_write_request(subreq);
+	len = p9_client_write(fid, subreq->start, &subreq->io_iter, &err);
+	netfs_write_subrequest_terminated(subreq, len ?: err, false);
 }
 
 /**
@@ -87,12 +89,16 @@ static int v9fs_init_request(struct netfs_io_request *rreq, struct file *file)
 {
 	struct p9_fid *fid;
 	bool writing = (rreq->origin == NETFS_READ_FOR_WRITE ||
-			rreq->origin == NETFS_WRITEBACK ||
 			rreq->origin == NETFS_WRITETHROUGH ||
-			rreq->origin == NETFS_LAUNDER_WRITE ||
 			rreq->origin == NETFS_UNBUFFERED_WRITE ||
 			rreq->origin == NETFS_DIO_WRITE);
 
+	if (rreq->origin == NETFS_WRITEBACK)
+		return 0; /* We don't get the write handle until we find we
+			   * have actually dirty data and not just
+			   * copy-to-cache data.
+			   */
+
 	if (file) {
 		fid = file->private_data;
 		if (!fid)
@@ -104,6 +110,10 @@ static int v9fs_init_request(struct netfs_io_request *rreq, struct file *file)
 			goto no_fid;
 	}
 
+	rreq->wsize = fid->clnt->msize - P9_IOHDRSZ;
+	if (fid->iounit)
+		rreq->wsize = min(rreq->wsize, fid->iounit);
+
 	/* we might need to read from a fid that was opened write-only
 	 * for read-modify-write of page cache, use the writeback fid
 	 * for that */
@@ -132,7 +142,8 @@ const struct netfs_request_ops v9fs_req_ops = {
 	.init_request = v9fs_init_request,
 	.free_request = v9fs_free_request,
 	.issue_read = v9fs_issue_read,
-	.create_write_requests = v9fs_create_write_requests,
+	.begin_writeback = v9fs_begin_writeback,
+	.issue_write = v9fs_issue_write,
 };
 
 const struct address_space_operations v9fs_addr_operations = {
@@ -141,7 +152,6 @@ const struct address_space_operations v9fs_addr_operations = {
 	.dirty_folio = netfs_dirty_folio,
 	.release_folio = netfs_release_folio,
 	.invalidate_folio = netfs_invalidate_folio,
-	.launder_folio = netfs_launder_folio,
 	.direct_IO = noop_direct_IO,
 	.writepages = netfs_writepages,
 };
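
Both v9fs_begin_writeback() and v9fs_init_request() in the diff above compute the maximum write size the same way: the negotiated message size minus the I/O header, further capped by the fid's iounit when that is non-zero. A minimal user-space sketch of that arithmetic follows. The function name and the `SKETCH_P9_IOHDRSZ` macro are hypothetical, and the header size of 24 is an assumption made for illustration rather than a value taken from this commit.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: illustrative stand-in for the kernel's P9_IOHDRSZ. */
#define SKETCH_P9_IOHDRSZ 24

/*
 * Mirror the clamp from the diff: start from msize less the I/O header, then
 * apply iounit as an upper bound only if the server advertised one (non-zero).
 */
static size_t sketch_v9fs_wsize(size_t msize, size_t iounit)
{
	size_t wsize;

	assert(msize > SKETCH_P9_IOHDRSZ);
	wsize = msize - SKETCH_P9_IOHDRSZ;
	if (iounit && iounit < wsize)
		wsize = iounit;
	return wsize;
}
```

With an msize of 8192 and no iounit, the sketch yields 8168; an iounit of 4096 lowers that to 4096, while an iounit larger than 8168 leaves it unchanged.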

fs/afs/file.c

Lines changed: 5 additions & 3 deletions
@@ -54,7 +54,6 @@ const struct address_space_operations afs_file_aops = {
 	.read_folio = netfs_read_folio,
 	.readahead = netfs_readahead,
 	.dirty_folio = netfs_dirty_folio,
-	.launder_folio = netfs_launder_folio,
 	.release_folio = netfs_release_folio,
 	.invalidate_folio = netfs_invalidate_folio,
 	.migrate_folio = filemap_migrate_folio,
@@ -354,7 +353,7 @@ static int afs_init_request(struct netfs_io_request *rreq, struct file *file)
 	if (file)
 		rreq->netfs_priv = key_get(afs_file_key(file));
 	rreq->rsize = 256 * 1024;
-	rreq->wsize = 256 * 1024;
+	rreq->wsize = 256 * 1024 * 1024;
 	return 0;
 }
 
@@ -369,6 +368,7 @@ static int afs_check_write_begin(struct file *file, loff_t pos, unsigned len,
 static void afs_free_request(struct netfs_io_request *rreq)
 {
 	key_put(rreq->netfs_priv);
+	afs_put_wb_key(rreq->netfs_priv2);
 }
 
 static void afs_update_i_size(struct inode *inode, loff_t new_i_size)
@@ -400,7 +400,9 @@ const struct netfs_request_ops afs_req_ops = {
 	.issue_read = afs_issue_read,
 	.update_i_size = afs_update_i_size,
 	.invalidate_cache = afs_netfs_invalidate_cache,
-	.create_write_requests = afs_create_write_requests,
+	.begin_writeback = afs_begin_writeback,
+	.prepare_write = afs_prepare_write,
+	.issue_write = afs_issue_write,
 };
 
 static void afs_add_open_mmap(struct afs_vnode *vnode)

fs/afs/internal.h

Lines changed: 4 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -916,7 +916,6 @@ struct afs_operation {
916916
loff_t pos;
917917
loff_t size;
918918
loff_t i_size;
919-
bool laundering; /* Laundering page, PG_writeback not set */
920919
} store;
921920
struct {
922921
struct iattr *attr;
@@ -1599,11 +1598,14 @@ extern int afs_check_volume_status(struct afs_volume *, struct afs_operation *);
15991598
/*
16001599
* write.c
16011600
*/
1601+
void afs_prepare_write(struct netfs_io_subrequest *subreq);
1602+
void afs_issue_write(struct netfs_io_subrequest *subreq);
1603+
void afs_begin_writeback(struct netfs_io_request *wreq);
1604+
void afs_retry_request(struct netfs_io_request *wreq, struct netfs_io_stream *stream);
16021605
extern int afs_writepages(struct address_space *, struct writeback_control *);
16031606
extern int afs_fsync(struct file *, loff_t, loff_t, int);
16041607
extern vm_fault_t afs_page_mkwrite(struct vm_fault *vmf);
16051608
extern void afs_prune_wb_keys(struct afs_vnode *);
1606-
void afs_create_write_requests(struct netfs_io_request *wreq, loff_t start, size_t len);
16071609

16081610
/*
16091611
* xattr.c

fs/afs/validation.c

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -365,9 +365,9 @@ static void afs_zap_data(struct afs_vnode *vnode)
365365
* written back in a regular file and completely discard the pages in a
366366
* directory or symlink */
367367
if (S_ISREG(vnode->netfs.inode.i_mode))
368-
invalidate_remote_inode(&vnode->netfs.inode);
368+
filemap_invalidate_inode(&vnode->netfs.inode, true, 0, LLONG_MAX);
369369
else
370-
invalidate_inode_pages2(vnode->netfs.inode.i_mapping);
370+
filemap_invalidate_inode(&vnode->netfs.inode, false, 0, LLONG_MAX);
371371
}
372372

373373
/*
