Skip to content

Commit 5326c19

Browse files
committed
MFC r344183-r344187, r344333-r344334, r344407, r344857, r344865 (by cem)
r344183: FUSE: Respect userspace FS "do-not-cache" of file attributes The FUSE protocol demands that kernel implementations cache user filesystem file attributes (vattr data) for a maximum period of time in the range of [0, ULONG_MAX] seconds. In practice, typical requests are for 0, 1, or 10 seconds; or "a long time" to represent indefinite caching. Historically, FreeBSD FUSE has ignored this client directive entirely. This works fine for local-only filesystems, but causes consistency issues with multi-writer network filesystems. For now, respect 0 second cache TTLs and do not cache such metadata. Non-zero metadata caching TTLs in the range [0.000000001, ULONG_MAX] seconds are still cached indefinitely, because it is unclear how a userspace filesystem could do anything sensible with those semantics even if implemented. In the future, as an optimization, we should implement notify_inval_entry, etc, which provide userspace filesystems a way of evicting the kernel cache. One potentially bogus access to invalid cached attribute data was left in fuse_io_strategy. It is restricted behind the undocumented and non-default "vfs.fuse.fix_broken_io" sysctl or "brokenio" mount option; maybe these are deadcode and can be eliminated? Some minor APIs changed to facilitate this: 1. Attribute cache validity is tracked in FUSE inodes ("fuse_vnode_data"). 2. cache_attrs() respects the provided TTL and only caches in the FUSE inode if TTL > 0. It also grows an "out" argument, which, if non-NULL, stores the translated fuse_attr (even if not suitable for caching). 3. FUSE VTOVA(vp) returns NULL if the vnode's cache is invalid, to help avoid programming mistakes. 4. A VOP_LINK check for potential nlink overflow prior to invoking the FUSE link op was weakened (only performed when we have a valid attr cache). The check is racy in a multi-writer network filesystem anyway -- classic TOCTOU. We have to trust any userspace filesystem that rejects local caching to account for it correctly. PR: 230258 (inspired by; does not fix) r344184: FUSE: Respect userspace FS "do-not-cache" of path components The FUSE protocol demands that kernel implementations cache user filesystem path components (lookup/cnp data) for a maximum period of time in the range of [0, ULONG_MAX] seconds. In practice, typical requests are for 0, 1, or 10 seconds; or "a long time" to represent indefinite caching. Historically, FreeBSD FUSE has ignored this client directive entirely. This works fine for local-only filesystems, but causes consistency issues with multi-writer network filesystems. For now, respect 0 second cache TTLs and do not cache such metadata. Non-zero metadata caching TTLs in the range [0.000000001, ULONG_MAX] seconds are still cached indefinitely, because it is unclear how a userspace filesystem could do anything sensible with those semantics even if implemented. Pass fuse_entry_out to fuse_vnode_get when available and only cache lookup if the user filesystem did not set a zero second TTL. PR: 230258 (inspired by; does not fix) r344185: FUSE: Only "dirty" cached file size when data is dirty Most users of fuse_vnode_setsize() set the cached fvdat->filesize and update the buf cache bounds as a result of either a read from the underlying FUSE filesystem, or as part of a write-through type operation (like truncate => VOP_SETATTR). In these cases, do not set the FN_SIZECHANGE flag, which indicates that an inode's data is dirty (in particular, that the local buf cache and fvdat->filesize have dirty extended data). PR: 230258 (related) r344186: FUSE: The FUSE design expects writethrough caching At least prior to 7.23 (which adds FUSE_WRITEBACK_CACHE), the FUSE protocol specifies only clean data to be cached. Prior to this change, we implement and default to writeback caching. This is ok enough for local only filesystems without hardlinks, but violates the general design contract with FUSE and breaks distributed filesystems or concurrent access to hardlinks of the same inode. In this change, add cache mode as an extension of cache enable/disable. The new modes are UC (was: cache disabled), WT (default), and WB (was: cache enabled). For now, WT caching is implemented as write-around, which meets the goal of only caching clean data. WT can be better than WA for workloads that frequently read data that was recently written, but WA is trivial to implement. Note that this has no effect on O_WRONLY-opened files, which were already coerced to write-around. Refs: * https://sourceforge.net/p/fuse/mailman/message/8902254/ * vgough/encfs#315 PR: 230258 (inspired by) r344187: FUSE: Refresh cached file size when it changes (lookup) The cached fvdat->filesize is indepedent of the (mostly unused) cached_attrs, and we failed to update it when a cached (but perhaps inactive) vnode was found during VOP_LOOKUP to have a different size than cached. As noted in the code comment, this can occur in distributed filesystems or with other kinds of irregular file behavior (anything is possible in FUSE). We do something similar in fuse_vnop_getattr already. PR: 230258 (as reported in description; other issues explored in comments are not all resolved) Reported by: MooseFS FreeBSD Team <freebsd AT moosefs.com> Submitted by: Jakub Kruszona-Zawadzki <acid AT moosefs.com> (earlier version) r344333: fuse: add descriptions for remaining sysctls (Except reclaim revoked; I don't know what that goal of that one is.) r344334: Fuse: whitespace and style(9) cleanup Take a pass through fixing some of the most egregious whitespace issues in fs/fuse. Also fix some style(9) warts while here. Not 100% cleaned up, but somewhat less painful to look at and edit. No functional change. r344407: fuse: Fix a regression introduced in r337165 On systems with non-default DFLTPHYS and/or MAXBSIZE, FUSE would attempt to use a buf cache block size in excess of permitted size. This did not affect most configurations, since DFLTPHYS and MAXBSIZE both default to 64kB. The issue was discovered and reported using a custom kernel with a DFLTPHYS of 512kB. PR: 230260 (comment #9) Reported by: ken@ r344857: FUSE: Prevent trivial panic When open(2) was invoked against a FUSE filesystem with an unexpected flags value (no O_RDONLY / O_RDWR / O_WRONLY), an assertion fired, causing panic. For now, prevent the panic by rejecting such VOP_OPENs with EINVAL. This is not considered the correct long term fix, but does prevent an unprivileged denial-of-service. PR: 236329 Reported by: asomers Reviewed by: asomers Sponsored by: Dell EMC Isilon r344865: fuse: switch from DFLTPHYS/MAXBSIZE to maxcachebuf On GENERIC kernels with empty loader.conf, there is no functional change. DFLTPHYS and MAXBSIZE are both 64kB at the moment. This change allows larger bufcache block sizes to be used when either MAXBSIZE (custom kernel) or the loader.conf tunable vfs.maxbcachebuf (GENERIC) is adjusted higher than the default. Suggested by: ken@
1 parent fc0d9df commit 5326c19

File tree

13 files changed

+640
-653
lines changed

13 files changed

+640
-653
lines changed

sys/fs/fuse/fuse.h

Lines changed: 21 additions & 20 deletions
Original file line numberDiff line numberDiff line change
@@ -197,26 +197,27 @@ do { \
197197
#define FUSE_TRACE 0
198198
#endif
199199

200-
#define DEBUGX(cond, fmt, ...) do { \
201-
if (((cond))) { \
202-
printf("%s: " fmt, __func__, ## __VA_ARGS__); \
203-
} } while (0)
204-
205-
#define fuse_lck_mtx_lock(mtx) do { \
206-
DEBUGX(FUSE_DEBUG_LOCK, "0: lock(%s): %s@%d by %d\n", \
207-
__STRING(mtx), __func__, __LINE__, curthread->td_proc->p_pid); \
208-
mtx_lock(&(mtx)); \
209-
DEBUGX(FUSE_DEBUG_LOCK, "1: lock(%s): %s@%d by %d\n", \
210-
__STRING(mtx), __func__, __LINE__, curthread->td_proc->p_pid); \
211-
} while (0)
212-
213-
#define fuse_lck_mtx_unlock(mtx) do { \
214-
DEBUGX(FUSE_DEBUG_LOCK, "0: unlock(%s): %s@%d by %d\n", \
215-
__STRING(mtx), __func__, __LINE__, curthread->td_proc->p_pid); \
216-
mtx_unlock(&(mtx)); \
217-
DEBUGX(FUSE_DEBUG_LOCK, "1: unlock(%s): %s@%d by %d\n", \
218-
__STRING(mtx), __func__, __LINE__, curthread->td_proc->p_pid); \
219-
} while (0)
200+
#define DEBUGX(cond, fmt, ...) do { \
201+
if (((cond))) { \
202+
printf("%s: " fmt, __func__, ## __VA_ARGS__); \
203+
} \
204+
} while (0)
205+
206+
#define fuse_lck_mtx_lock(mtx) do { \
207+
DEBUGX(FUSE_DEBUG_LOCK, "0: lock(%s): %s@%d by %d\n", \
208+
__STRING(mtx), __func__, __LINE__, curthread->td_proc->p_pid); \
209+
mtx_lock(&(mtx)); \
210+
DEBUGX(FUSE_DEBUG_LOCK, "1: lock(%s): %s@%d by %d\n", \
211+
__STRING(mtx), __func__, __LINE__, curthread->td_proc->p_pid); \
212+
} while (0)
213+
214+
#define fuse_lck_mtx_unlock(mtx) do { \
215+
DEBUGX(FUSE_DEBUG_LOCK, "0: unlock(%s): %s@%d by %d\n", \
216+
__STRING(mtx), __func__, __LINE__, curthread->td_proc->p_pid); \
217+
mtx_unlock(&(mtx)); \
218+
DEBUGX(FUSE_DEBUG_LOCK, "1: unlock(%s): %s@%d by %d\n", \
219+
__STRING(mtx), __func__, __LINE__, curthread->td_proc->p_pid); \
220+
} while (0)
220221

221222
void fuse_ipc_init(void);
222223
void fuse_ipc_destroy(void);

sys/fs/fuse/fuse_device.c

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -317,7 +317,7 @@ fuse_device_read(struct cdev *dev, struct uio *uio, int ioflag)
317317
return (err);
318318
}
319319

320-
static __inline int
320+
static inline int
321321
fuse_ohead_audit(struct fuse_out_header *ohead, struct uio *uio)
322322
{
323323
FS_DEBUG("Out header -- len: %i, error: %i, unique: %llu; iovecs: %d\n",

sys/fs/fuse/fuse_file.c

Lines changed: 9 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -88,14 +88,11 @@ __FBSDID("$FreeBSD$");
8888
static int fuse_fh_count = 0;
8989

9090
SYSCTL_INT(_vfs_fusefs, OID_AUTO, filehandle_count, CTLFLAG_RD,
91-
&fuse_fh_count, 0, "");
91+
&fuse_fh_count, 0, "number of open FUSE filehandles");
9292

9393
int
94-
fuse_filehandle_open(struct vnode *vp,
95-
fufh_type_t fufh_type,
96-
struct fuse_filehandle **fufhp,
97-
struct thread *td,
98-
struct ucred *cred)
94+
fuse_filehandle_open(struct vnode *vp, fufh_type_t fufh_type,
95+
struct fuse_filehandle **fufhp, struct thread *td, struct ucred *cred)
9996
{
10097
struct fuse_dispatcher fdi;
10198
struct fuse_open_in *foi;
@@ -114,8 +111,8 @@ fuse_filehandle_open(struct vnode *vp,
114111
/* NOTREACHED */
115112
}
116113
/*
117-
* Note that this means we are effectively FILTERING OUT open() flags.
118-
*/
114+
* Note that this means we are effectively FILTERING OUT open() flags.
115+
*/
119116
oflags = fuse_filehandle_xlate_to_oflags(fufh_type);
120117

121118
if (vnode_isdir(vp)) {
@@ -159,10 +156,8 @@ fuse_filehandle_open(struct vnode *vp,
159156
}
160157

161158
int
162-
fuse_filehandle_close(struct vnode *vp,
163-
fufh_type_t fufh_type,
164-
struct thread *td,
165-
struct ucred *cred)
159+
fuse_filehandle_close(struct vnode *vp, fufh_type_t fufh_type,
160+
struct thread *td, struct ucred *cred)
166161
{
167162
struct fuse_dispatcher fdi;
168163
struct fuse_release_in *fri;
@@ -265,10 +260,8 @@ fuse_filehandle_getrw(struct vnode *vp, fufh_type_t fufh_type,
265260
}
266261

267262
void
268-
fuse_filehandle_init(struct vnode *vp,
269-
fufh_type_t fufh_type,
270-
struct fuse_filehandle **fufhp,
271-
uint64_t fh_id)
263+
fuse_filehandle_init(struct vnode *vp, fufh_type_t fufh_type,
264+
struct fuse_filehandle **fufhp, uint64_t fh_id)
272265
{
273266
struct fuse_vnode_data *fvdat = VTOFUD(vp);
274267
struct fuse_filehandle *fufh;

sys/fs/fuse/fuse_file.h

Lines changed: 42 additions & 52 deletions
Original file line numberDiff line numberDiff line change
@@ -67,75 +67,65 @@
6767
#include <sys/vnode.h>
6868

6969
typedef enum fufh_type {
70-
FUFH_INVALID = -1,
71-
FUFH_RDONLY = 0,
72-
FUFH_WRONLY = 1,
73-
FUFH_RDWR = 2,
74-
FUFH_MAXTYPE = 3,
70+
FUFH_INVALID = -1,
71+
FUFH_RDONLY = 0,
72+
FUFH_WRONLY = 1,
73+
FUFH_RDWR = 2,
74+
FUFH_MAXTYPE = 3,
7575
} fufh_type_t;
76+
_Static_assert(FUFH_RDONLY == O_RDONLY, "RDONLY");
77+
_Static_assert(FUFH_WRONLY == O_WRONLY, "WRONLY");
78+
_Static_assert(FUFH_RDWR == O_RDWR, "RDWR");
7679

7780
struct fuse_filehandle {
78-
uint64_t fh_id;
79-
fufh_type_t fh_type;
81+
uint64_t fh_id;
82+
fufh_type_t fh_type;
8083
};
8184

8285
#define FUFH_IS_VALID(f) ((f)->fh_type != FUFH_INVALID)
8386

84-
static __inline__
85-
fufh_type_t
87+
static inline fufh_type_t
8688
fuse_filehandle_xlate_from_mmap(int fflags)
8789
{
88-
if (fflags & (PROT_READ | PROT_WRITE)) {
89-
return FUFH_RDWR;
90-
} else if (fflags & (PROT_WRITE)) {
91-
return FUFH_WRONLY;
92-
} else if ((fflags & PROT_READ) || (fflags & PROT_EXEC)) {
93-
return FUFH_RDONLY;
94-
} else {
95-
return FUFH_INVALID;
96-
}
90+
if (fflags & (PROT_READ | PROT_WRITE))
91+
return FUFH_RDWR;
92+
else if (fflags & (PROT_WRITE))
93+
return FUFH_WRONLY;
94+
else if ((fflags & PROT_READ) || (fflags & PROT_EXEC))
95+
return FUFH_RDONLY;
96+
else
97+
return FUFH_INVALID;
9798
}
9899

99-
static __inline__
100-
fufh_type_t
100+
static inline fufh_type_t
101101
fuse_filehandle_xlate_from_fflags(int fflags)
102102
{
103-
if ((fflags & FREAD) && (fflags & FWRITE)) {
104-
return FUFH_RDWR;
105-
} else if (fflags & (FWRITE)) {
106-
return FUFH_WRONLY;
107-
} else if (fflags & (FREAD)) {
108-
return FUFH_RDONLY;
109-
} else {
110-
panic("FUSE: What kind of a flag is this (%x)?", fflags);
111-
}
103+
if ((fflags & FREAD) && (fflags & FWRITE))
104+
return FUFH_RDWR;
105+
else if (fflags & (FWRITE))
106+
return FUFH_WRONLY;
107+
else if (fflags & (FREAD))
108+
return FUFH_RDONLY;
109+
else
110+
panic("FUSE: What kind of a flag is this (%x)?", fflags);
112111
}
113112

114-
static __inline__
115-
int
113+
static inline int
116114
fuse_filehandle_xlate_to_oflags(fufh_type_t type)
117115
{
118-
int oflags = -1;
119-
120-
switch (type) {
121-
122-
case FUFH_RDONLY:
123-
oflags = O_RDONLY;
124-
break;
125-
126-
case FUFH_WRONLY:
127-
oflags = O_WRONLY;
128-
break;
129-
130-
case FUFH_RDWR:
131-
oflags = O_RDWR;
132-
break;
133-
134-
default:
135-
break;
136-
}
137-
138-
return oflags;
116+
int oflags = -1;
117+
118+
switch (type) {
119+
case FUFH_RDONLY:
120+
case FUFH_WRONLY:
121+
case FUFH_RDWR:
122+
oflags = type;
123+
break;
124+
default:
125+
break;
126+
}
127+
128+
return oflags;
139129
}
140130

141131
int fuse_filehandle_valid(struct vnode *vp, fufh_type_t fufh_type);

sys/fs/fuse/fuse_internal.c

Lines changed: 7 additions & 18 deletions
Original file line numberDiff line numberDiff line change
@@ -373,23 +373,18 @@ fuse_internal_readdir_processdata(struct uio *uio,
373373

374374
/* remove */
375375

376-
#define INVALIDATE_CACHED_VATTRS_UPON_UNLINK 1
377376
int
378377
fuse_internal_remove(struct vnode *dvp,
379378
struct vnode *vp,
380379
struct componentname *cnp,
381380
enum fuse_opcode op)
382381
{
383382
struct fuse_dispatcher fdi;
383+
struct fuse_vnode_data *fvdat;
384+
int err;
384385

385-
struct vattr *vap = VTOVA(vp);
386-
387-
#if INVALIDATE_CACHED_VATTRS_UPON_UNLINK
388-
int need_invalidate = 0;
389-
uint64_t target_nlink = 0;
390-
391-
#endif
392-
int err = 0;
386+
err = 0;
387+
fvdat = VTOFUD(vp);
393388

394389
debug_printf("dvp=%p, cnp=%p, op=%d\n", vp, cnp, op);
395390

@@ -399,13 +394,6 @@ fuse_internal_remove(struct vnode *dvp,
399394
memcpy(fdi.indata, cnp->cn_nameptr, cnp->cn_namelen);
400395
((char *)fdi.indata)[cnp->cn_namelen] = '\0';
401396

402-
#if INVALIDATE_CACHED_VATTRS_UPON_UNLINK
403-
if (vap->va_nlink > 1) {
404-
need_invalidate = 1;
405-
target_nlink = vap->va_nlink;
406-
}
407-
#endif
408-
409397
err = fdisp_wait_answ(&fdi);
410398
fdisp_destroy(&fdi);
411399
return err;
@@ -483,13 +471,13 @@ fuse_internal_newentry_core(struct vnode *dvp,
483471
if ((err = fuse_internal_checkentry(feo, vtyp))) {
484472
return err;
485473
}
486-
err = fuse_vnode_get(mp, feo->nodeid, dvp, vpp, cnp, vtyp);
474+
err = fuse_vnode_get(mp, feo, feo->nodeid, dvp, vpp, cnp, vtyp);
487475
if (err) {
488476
fuse_internal_forget_send(mp, cnp->cn_thread, cnp->cn_cred,
489477
feo->nodeid, 1);
490478
return err;
491479
}
492-
cache_attrs(*vpp, feo);
480+
cache_attrs(*vpp, feo, NULL);
493481

494482
return err;
495483
}
@@ -563,6 +551,7 @@ fuse_internal_vnode_disappear(struct vnode *vp)
563551

564552
ASSERT_VOP_ELOCKED(vp, "fuse_internal_vnode_disappear");
565553
fvdat->flag |= FN_REVOKED;
554+
fvdat->valid_attr_cache = false;
566555
cache_purge(vp);
567556
}
568557

0 commit comments

Comments
 (0)