Commit 7ea65c8

Merge tag 'vfs-6.9.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull misc vfs updates from Christian Brauner:
 "Misc features, cleanups, and fixes for vfs and individual filesystems.

  Features:

   - Support idmapped mounts for hugetlbfs.

   - Add RWF_NOAPPEND flag for pwritev2(). This allows us to fix a bug
     where the passed offset is ignored if the file is O_APPEND. The new
     flag allows a caller to enforce that the offset is honored to
     conform to posix even if the file was opened in append mode.

   - Move i_mmap_rwsem in struct address_space to avoid false sharing
     between i_mmap and i_mmap_rwsem.

   - Convert efs, qnx4, and coda to use the new mount api.

   - Add a generic is_dot_dotdot() helper that's used by various
     filesystems and the VFS code instead of open-coding it multiple
     times.

   - Recently we've added stable offsets which allows stable ordering
     when iterating directories exported through NFS on e.g., tmpfs
     filesystems. Originally an xarray was used for the offset map but
     that caused slab fragmentation issues over time. This switches the
     offset map to the maple tree which has a dense mode that handles
     this scenario a lot better. Includes tests.

   - Finally merge the case-insensitive improvement series Gabriel has
     been working on for a long time. This cleanly propagates case
     insensitive operations through ->s_d_op which in turn allows us to
     remove the quite ugly generic_set_encrypted_ci_d_ops() operations.
     It also improves performance by trying a case-sensitive comparison
     first and then fallback to case-insensitive lookup if that fails.
     This also fixes a bug where overlayfs would be able to be mounted
     over a case insensitive directory which would lead to all sort of
     odd behaviors.

  Cleanups:

   - Make file_dentry() a simple accessor now that ->d_real() is
     simplified because of the backing file work we did the last two
     cycles.

   - Use the dedicated file_mnt_idmap helper in ntfs3.

   - Use smp_load_acquire/store_release() in the i_size_read/write
     helpers and thus remove the hack to handle i_size reads in the
     filemap code.

   - The SLAB_MEM_SPREAD is a nop now. Remove it from various places in
     fs/

   - It's no longer necessary to perform a second built-in initramfs
     unpack call because we retain the contents of the previous
     extraction. Remove it.

   - Now that we have removed various allocators kfree_rcu() always
     works with kmem caches and kmalloc(). So simplify various places
     that only use an rcu callback in order to handle the kmem cache
     case.

   - Convert the pipe code to use a lockdep comparison function instead
     of open-coding the nesting making lockdep validation easier.

   - Move code into fs-writeback.c that was located in a header but can
     be made static as it's only used in that one file.

   - Rewrite the alignment checking iterators for iovec and bvec to be
     easier to read, and also significantly more compact in terms of
     generated code. This saves 270 bytes of text on x86-64 (with
     clang-18) and 224 bytes on arm64 (with gcc-13). In profiles it also
     saves a bit of time for the same workload.

   - Switch various places to use KMEM_CACHE instead of
     kmem_cache_create().

   - Use inode_set_ctime_to_ts() in inode_set_ctime_current()

   - Use kzalloc() in name_to_handle_at() to avoid kernel infoleak.

   - Various smaller cleanups for eventfds.

  Fixes:

   - Fix various comments and typos, and unneeded initializations.

   - Fix stack allocation hack for clang in the select code.

   - Improve dump_mapping() debug code on a best-effort basis.

   - Fix build errors in various selftests.

   - Avoid wrap-around instrumentation in various places.

   - Don't allow user namespaces without an idmapping to be used for
     idmapped mounts.

   - Fix sysv sb_read() call.

   - Fix fallback implementation of the get_name() export operation"

* tag 'vfs-6.9.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (70 commits)
  hugetlbfs: support idmapped mounts
  qnx4: convert qnx4 to use the new mount api
  fs: use inode_set_ctime_to_ts to set inode ctime to current time
  libfs: Drop generic_set_encrypted_ci_d_ops
  ubifs: Configure dentry operations at dentry-creation time
  f2fs: Configure dentry operations at dentry-creation time
  ext4: Configure dentry operations at dentry-creation time
  libfs: Add helper to choose dentry operations at mount-time
  libfs: Merge encrypted_ci_dentry_ops and ci_dentry_ops
  fscrypt: Drop d_revalidate once the key is added
  fscrypt: Drop d_revalidate for valid dentries during lookup
  fscrypt: Factor out a helper to configure the lookup dentry
  ovl: Always reject mounting over case-insensitive directories
  libfs: Attempt exact-match comparison first during casefolded lookup
  efs: remove SLAB_MEM_SPREAD flag usage
  jfs: remove SLAB_MEM_SPREAD flag usage
  minix: remove SLAB_MEM_SPREAD flag usage
  openpromfs: remove SLAB_MEM_SPREAD flag usage
  proc: remove SLAB_MEM_SPREAD flag usage
  qnx6: remove SLAB_MEM_SPREAD flag usage
  ...
2 parents 97ec971 + 09406ad commit 7ea65c8
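
The commit message says casefolded lookup now tries a case-sensitive comparison first and falls back to the case-insensitive one only if that fails. A minimal userspace sketch of that ordering follows. The function name ci_name_matches() is hypothetical, and ASCII strncasecmp() stands in for the kernel's Unicode casefolding, which is an assumption made only to keep the example self-contained; it is not the libfs implementation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>

/*
 * Hypothetical model of "exact match first, casefolded match second".
 * ASCII folding is an assumption: with real Unicode casefolding two
 * matching names may have different byte lengths, which this sketch
 * does not handle.
 */
static bool ci_name_matches(const char *a, size_t alen,
			    const char *b, size_t blen)
{
	/* Fast path: byte-identical names match without any folding. */
	if (alen == blen && memcmp(a, b, alen) == 0)
		return true;

	/* Slow path: fall back to the case-insensitive comparison. */
	return alen == blen && strncasecmp(a, b, alen) == 0;
}
```

The fast path pays off when most lookups use the name exactly as it was stored, since the folding step is skipped entirely in that case.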

65 files changed, 816 additions and 503 deletions

Documentation/filesystems/files.rst

Lines changed: 1 addition & 1 deletion
@@ -116,7 +116,7 @@ before and after the reference count increment. This pattern can be seen
 in get_file_rcu() and __files_get_rcu().

 In addition, it isn't possible to access or check fields in struct file
-without first aqcuiring a reference on it under rcu lookup. Not doing
+without first acquiring a reference on it under rcu lookup. Not doing
 that was always very dodgy and it was only usable for non-pointer data
 in struct file. With SLAB_TYPESAFE_BY_RCU it is necessary that callers
 either first acquire a reference or they must hold the files_lock of the

Documentation/filesystems/locking.rst

Lines changed: 1 addition & 1 deletion
@@ -29,7 +29,7 @@ prototypes::
 	char *(*d_dname)((struct dentry *dentry, char *buffer, int buflen);
 	struct vfsmount *(*d_automount)(struct path *path);
 	int (*d_manage)(const struct path *, bool);
-	struct dentry *(*d_real)(struct dentry *, const struct inode *);
+	struct dentry *(*d_real)(struct dentry *, enum d_real_type type);

 locking rules:

Documentation/filesystems/vfs.rst

Lines changed: 7 additions & 9 deletions
@@ -1264,7 +1264,7 @@ defined:
 		char *(*d_dname)(struct dentry *, char *, int);
 		struct vfsmount *(*d_automount)(struct path *);
 		int (*d_manage)(const struct path *, bool);
-		struct dentry *(*d_real)(struct dentry *, const struct inode *);
+		struct dentry *(*d_real)(struct dentry *, enum d_real_type type);
 	};

 ``d_revalidate``
@@ -1419,16 +1419,14 @@ defined:
 	the dentry being transited from.

 ``d_real``
-	overlay/union type filesystems implement this method to return
-	one of the underlying dentries hidden by the overlay. It is
-	used in two different modes:
+	overlay/union type filesystems implement this method to return one
+	of the underlying dentries of a regular file hidden by the overlay.

-	Called from file_dentry() it returns the real dentry matching
-	the inode argument. The real dentry may be from a lower layer
-	already copied up, but still referenced from the file. This
-	mode is selected with a non-NULL inode argument.
+	The 'type' argument takes the values D_REAL_DATA or D_REAL_METADATA
+	for returning the real underlying dentry that refers to the inode
+	hosting the file's data or metadata respectively.

-	With NULL inode the topmost real underlying dentry is returned.
+	For non-regular files, the 'dentry' argument is returned.

 Each dentry has a pointer to its parent dentry, as well as a hash list
 of child dentries. Child dentries are basically like files in a
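
The updated ``d_real`` text describes a simple contract: pick the data or metadata dentry by 'type', and hand back the argument itself for non-regular files. The toy model below restates that contract in plain C. Only the names D_REAL_DATA and D_REAL_METADATA come from the documentation; struct model_dentry and model_d_real() are hypothetical stand-ins and do not reflect how overlayfs actually resolves layers.

```c
#include <stdbool.h>
#include <stddef.h>

/* Values named in the documentation text. */
enum d_real_type {
	D_REAL_DATA,
	D_REAL_METADATA,
};

/* Hypothetical stand-in for an overlay dentry; not the kernel's struct dentry. */
struct model_dentry {
	bool is_regular;
	struct model_dentry *data_dentry;	/* hosts the file's data */
	struct model_dentry *metadata_dentry;	/* hosts the file's metadata */
};

static struct model_dentry *model_d_real(struct model_dentry *dentry,
					 enum d_real_type type)
{
	/* Non-regular files: the 'dentry' argument itself is returned. */
	if (!dentry->is_regular)
		return dentry;

	switch (type) {
	case D_REAL_DATA:
		return dentry->data_dentry;
	case D_REAL_METADATA:
		return dentry->metadata_dentry;
	}
	return NULL;
}
```

Compared with the old signature, the caller states which underlying dentry it wants instead of passing an inode to match against.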

fs/attr.c

Lines changed: 1 addition & 1 deletion
@@ -352,7 +352,7 @@ int may_setattr(struct mnt_idmap *idmap, struct inode *inode,
 EXPORT_SYMBOL(may_setattr);

 /**
- * notify_change - modify attributes of a filesytem object
+ * notify_change - modify attributes of a filesystem object
  * @idmap: idmap of the mount the inode was found from
  * @dentry: object affected
  * @attr: new attributes

fs/backing-file.c

Lines changed: 1 addition & 3 deletions
@@ -325,9 +325,7 @@ EXPORT_SYMBOL_GPL(backing_file_mmap);

 static int __init backing_aio_init(void)
 {
-	backing_aio_cachep = kmem_cache_create("backing_aio",
-					       sizeof(struct backing_aio),
-					       0, SLAB_HWCACHE_ALIGN, NULL);
+	backing_aio_cachep = KMEM_CACHE(backing_aio, SLAB_HWCACHE_ALIGN);
 	if (!backing_aio_cachep)
 		return -ENOMEM;
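
The hunk above swaps an explicit kmem_cache_create() call for the KMEM_CACHE() wrapper. As a rough userspace illustration of what such a wrapper buys, the sketch below derives the cache name, size, and alignment from the struct tag alone. Everything prefixed model_ or MODEL_ is hypothetical, and struct backing_aio_model is an invented placeholder; the macro is only modeled on the kernel's KMEM_CACHE(), not copied from it.

```c
#include <stddef.h>

/* Hypothetical record of what a cache-creation call receives. */
struct model_cache {
	const char *name;
	size_t size;
	size_t align;
	unsigned int flags;
};

/* Stand-in for kmem_cache_create(); it only records its arguments. */
static struct model_cache model_cache_create(const char *name, size_t size,
					     size_t align, unsigned int flags)
{
	struct model_cache c = { name, size, align, flags };

	return c;
}

/*
 * The struct tag is stringified into the cache name, and the size and
 * alignment are taken from the type, so the call site cannot get them
 * out of sync with the struct definition.
 */
#define MODEL_KMEM_CACHE(__struct, __flags)				\
	model_cache_create(#__struct, sizeof(struct __struct),		\
			   _Alignof(struct __struct), (__flags))

/* Invented placeholder type; the real struct backing_aio is not shown here. */
struct backing_aio_model {
	long pos;
	int flags;
};
```

In this model, MODEL_KMEM_CACHE(backing_aio_model, flags) produces a cache named "backing_aio_model" whose alignment comes from the type, whereas the open-coded call removed above passed 0 for the alignment argument.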

fs/buffer.c

Lines changed: 3 additions & 7 deletions
@@ -464,7 +464,7 @@ EXPORT_SYMBOL(mark_buffer_async_write);
  * a successful fsync(). For example, ext2 indirect blocks need to be
  * written back and waited upon before fsync() returns.
  *
- * The functions mark_buffer_inode_dirty(), fsync_inode_buffers(),
+ * The functions mark_buffer_dirty_inode(), fsync_inode_buffers(),
  * inode_has_buffers() and invalidate_inode_buffers() are provided for the
  * management of a list of dependent buffers at ->i_mapping->i_private_list.
  *
@@ -3121,12 +3121,8 @@ void __init buffer_init(void)
 	unsigned long nrpages;
 	int ret;

-	bh_cachep = kmem_cache_create("buffer_head",
-			sizeof(struct buffer_head), 0,
-				(SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|
-				SLAB_MEM_SPREAD),
-				NULL);
-
+	bh_cachep = KMEM_CACHE(buffer_head,
+				SLAB_RECLAIM_ACCOUNT|SLAB_PANIC);
 	/*
 	 * Limit the bh occupancy to 10% of ZONE_NORMAL
 	 */

fs/coda/inode.c

Lines changed: 98 additions & 45 deletions
@@ -24,6 +24,8 @@
 #include <linux/pid_namespace.h>
 #include <linux/uaccess.h>
 #include <linux/fs.h>
+#include <linux/fs_context.h>
+#include <linux/fs_parser.h>
 #include <linux/vmalloc.h>

 #include <linux/coda.h>
@@ -87,10 +89,10 @@ void coda_destroy_inodecache(void)
 	kmem_cache_destroy(coda_inode_cachep);
 }

-static int coda_remount(struct super_block *sb, int *flags, char *data)
+static int coda_reconfigure(struct fs_context *fc)
 {
-	sync_filesystem(sb);
-	*flags |= SB_NOATIME;
+	sync_filesystem(fc->root->d_sb);
+	fc->sb_flags |= SB_NOATIME;
 	return 0;
 }

@@ -102,78 +104,102 @@ static const struct super_operations coda_super_operations =
 	.evict_inode = coda_evict_inode,
 	.put_super = coda_put_super,
 	.statfs = coda_statfs,
-	.remount_fs = coda_remount,
 };

-static int get_device_index(struct coda_mount_data *data)
+struct coda_fs_context {
+	int idx;
+};
+
+enum {
+	Opt_fd,
+};
+
+static const struct fs_parameter_spec coda_param_specs[] = {
+	fsparam_fd ("fd", Opt_fd),
+	{}
+};
+
+static int coda_parse_fd(struct fs_context *fc, int fd)
 {
+	struct coda_fs_context *ctx = fc->fs_private;
 	struct fd f;
 	struct inode *inode;
 	int idx;

-	if (data == NULL) {
-		pr_warn("%s: Bad mount data\n", __func__);
-		return -1;
-	}
-
-	if (data->version != CODA_MOUNT_VERSION) {
-		pr_warn("%s: Bad mount version\n", __func__);
-		return -1;
-	}
-
-	f = fdget(data->fd);
+	f = fdget(fd);
 	if (!f.file)
-		goto Ebadf;
+		return -EBADF;
 	inode = file_inode(f.file);
 	if (!S_ISCHR(inode->i_mode) || imajor(inode) != CODA_PSDEV_MAJOR) {
 		fdput(f);
-		goto Ebadf;
+		return invalf(fc, "code: Not coda psdev");
 	}

 	idx = iminor(inode);
 	fdput(f);

-	if (idx < 0 || idx >= MAX_CODADEVS) {
-		pr_warn("%s: Bad minor number\n", __func__);
-		return -1;
+	if (idx < 0 || idx >= MAX_CODADEVS)
+		return invalf(fc, "coda: Bad minor number");
+	ctx->idx = idx;
+	return 0;
+}
+
+static int coda_parse_param(struct fs_context *fc, struct fs_parameter *param)
+{
+	struct fs_parse_result result;
+	int opt;
+
+	opt = fs_parse(fc, coda_param_specs, param, &result);
+	if (opt < 0)
+		return opt;
+
+	switch (opt) {
+	case Opt_fd:
+		return coda_parse_fd(fc, result.uint_32);
 	}

-	return idx;
-Ebadf:
-	pr_warn("%s: Bad file\n", __func__);
-	return -1;
+	return 0;
+}
+
+/*
+ * Parse coda's binary mount data form. We ignore any errors and go with index
+ * 0 if we get one for backward compatibility.
+ */
+static int coda_parse_monolithic(struct fs_context *fc, void *_data)
+{
+	struct coda_mount_data *data = _data;
+
+	if (!data)
+		return invalf(fc, "coda: Bad mount data");
+
+	if (data->version != CODA_MOUNT_VERSION)
+		return invalf(fc, "coda: Bad mount version");
+
+	coda_parse_fd(fc, data->fd);
+	return 0;
 }

-static int coda_fill_super(struct super_block *sb, void *data, int silent)
+static int coda_fill_super(struct super_block *sb, struct fs_context *fc)
 {
+	struct coda_fs_context *ctx = fc->fs_private;
 	struct inode *root = NULL;
 	struct venus_comm *vc;
 	struct CodaFid fid;
 	int error;
-	int idx;
-
-	if (task_active_pid_ns(current) != &init_pid_ns)
-		return -EINVAL;
-
-	idx = get_device_index((struct coda_mount_data *) data);

-	/* Ignore errors in data, for backward compatibility */
-	if(idx == -1)
-		idx = 0;
-
-	pr_info("%s: device index: %i\n", __func__, idx);
+	infof(fc, "coda: device index: %i\n", ctx->idx);

-	vc = &coda_comms[idx];
+	vc = &coda_comms[ctx->idx];
 	mutex_lock(&vc->vc_mutex);

 	if (!vc->vc_inuse) {
-		pr_warn("%s: No pseudo device\n", __func__);
+		errorf(fc, "coda: No pseudo device");
 		error = -EINVAL;
 		goto unlock_out;
 	}

 	if (vc->vc_sb) {
-		pr_warn("%s: Device already mounted\n", __func__);
+		errorf(fc, "coda: Device already mounted");
 		error = -EBUSY;
 		goto unlock_out;
 	}
@@ -313,18 +339,45 @@ static int coda_statfs(struct dentry *dentry, struct kstatfs *buf)
 	return 0;
 }

-/* init_coda: used by filesystems.c to register coda */
+static int coda_get_tree(struct fs_context *fc)
+{
+	if (task_active_pid_ns(current) != &init_pid_ns)
+		return -EINVAL;

-static struct dentry *coda_mount(struct file_system_type *fs_type,
-	int flags, const char *dev_name, void *data)
+	return get_tree_nodev(fc, coda_fill_super);
+}
+
+static void coda_free_fc(struct fs_context *fc)
 {
-	return mount_nodev(fs_type, flags, data, coda_fill_super);
+	kfree(fc->fs_private);
+}
+
+static const struct fs_context_operations coda_context_ops = {
+	.free = coda_free_fc,
+	.parse_param = coda_parse_param,
+	.parse_monolithic = coda_parse_monolithic,
+	.get_tree = coda_get_tree,
+	.reconfigure = coda_reconfigure,
+};
+
+static int coda_init_fs_context(struct fs_context *fc)
+{
+	struct coda_fs_context *ctx;
+
+	ctx = kzalloc(sizeof(struct coda_fs_context), GFP_KERNEL);
+	if (!ctx)
+		return -ENOMEM;
+
+	fc->fs_private = ctx;
+	fc->ops = &coda_context_ops;
+	return 0;
 }

 struct file_system_type coda_fs_type = {
 	.owner = THIS_MODULE,
 	.name = "coda",
-	.mount = coda_mount,
+	.init_fs_context = coda_init_fs_context,
+	.parameters = coda_param_specs,
 	.kill_sb = kill_anon_super,
 	.fs_flags = FS_BINARY_MOUNTDATA,
 };

fs/crypto/fname.c

Lines changed: 1 addition & 7 deletions
@@ -74,13 +74,7 @@ struct fscrypt_nokey_name {

 static inline bool fscrypt_is_dot_dotdot(const struct qstr *str)
 {
-	if (str->len == 1 && str->name[0] == '.')
-		return true;
-
-	if (str->len == 2 && str->name[0] == '.' && str->name[1] == '.')
-		return true;
-
-	return false;
+	return is_dot_dotdot(str->name, str->len);
 }

 /**
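
The open-coded check removed in this hunk is small enough to restate as a standalone helper that takes a name and a length. The sketch below does that in userspace C, reusing the logic of the removed lines. The name is_dot_dotdot_sketch() is hypothetical and chosen to avoid implying this is the kernel's definition of is_dot_dotdot(), which this hunk does not show.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Returns true only for "." and "..". The length is passed explicitly,
 * so the name does not need to be NUL-terminated (as with a struct
 * qstr's name/len pair).
 */
static bool is_dot_dotdot_sketch(const char *name, size_t len)
{
	if (len == 1 && name[0] == '.')
		return true;

	if (len == 2 && name[0] == '.' && name[1] == '.')
		return true;

	return false;
}
```

A caller holding a qstr would pass str->name and str->len, which is the shape of the replacement line in the hunk above.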

fs/crypto/hooks.c

Lines changed: 5 additions & 10 deletions
@@ -102,11 +102,8 @@ int __fscrypt_prepare_lookup(struct inode *dir, struct dentry *dentry,
 	if (err && err != -ENOENT)
 		return err;

-	if (fname->is_nokey_name) {
-		spin_lock(&dentry->d_lock);
-		dentry->d_flags |= DCACHE_NOKEY_NAME;
-		spin_unlock(&dentry->d_lock);
-	}
+	fscrypt_prepare_dentry(dentry, fname->is_nokey_name);
+
 	return err;
 }
 EXPORT_SYMBOL_GPL(__fscrypt_prepare_lookup);
@@ -131,12 +128,10 @@ EXPORT_SYMBOL_GPL(__fscrypt_prepare_lookup);
 int fscrypt_prepare_lookup_partial(struct inode *dir, struct dentry *dentry)
 {
 	int err = fscrypt_get_encryption_info(dir, true);
+	bool is_nokey_name = (!err && !fscrypt_has_encryption_key(dir));
+
+	fscrypt_prepare_dentry(dentry, is_nokey_name);

-	if (!err && !fscrypt_has_encryption_key(dir)) {
-		spin_lock(&dentry->d_lock);
-		dentry->d_flags |= DCACHE_NOKEY_NAME;
-		spin_unlock(&dentry->d_lock);
-	}
 	return err;
 }
 EXPORT_SYMBOL_GPL(fscrypt_prepare_lookup_partial);

fs/dcache.c

Lines changed: 1 addition & 1 deletion
@@ -3139,7 +3139,7 @@ static void __init dcache_init(void)
 	 * of the dcache.
 	 */
 	dentry_cache = KMEM_CACHE_USERCOPY(dentry,
-		SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|SLAB_MEM_SPREAD|SLAB_ACCOUNT,
+		SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|SLAB_ACCOUNT,
 		d_iname);

 	/* Hash may have been set up in dcache_init_early */
