
Commit 62e72d2

ryncsn authored and akpm committed
mm, madvise: fix potential workingset node list_lru leaks
Since commit 5abc1e3 ("mm: list_lru: allocate list_lru_one only when needed"), all list_lru users need to allocate the items using the new infrastructure that provides list_lru info for slab allocation, ensuring that the corresponding memcg list_lru is allocated before use. For workingset shadow nodes (which are xa_node), users are converted to use the new infrastructure by commit 9bbdc0f ("xarray: use kmem_cache_alloc_lru to allocate xa_node"). The xas->xa_lru will be set correctly for filemap users. However, there is a missing case: xa_node allocations caused by madvise(..., MADV_COLLAPSE). madvise(..., MADV_COLLAPSE) will also read in the absent parts of file map, and there will be xa_nodes allocated for the caller's memcg (assuming it's not rootcg). However, these allocations won't trigger memcg list_lru allocation because the proper xas info was not set. If nothing else has allocated other xa_nodes for that memcg to trigger list_lru creation, and memory pressure starts to evict file pages, workingset_update_node will try to add these xa_nodes to their corresponding memcg list_lru, and it does not exist (NULL). So they will be added to rootcg's list_lru instead. This shouldn't be a significant issue in practice, but it is indeed unexpected behavior, and these xa_nodes will not be reclaimed effectively. And may lead to incorrect counting of the list_lru->nr_items counter. This problem wasn't exposed until recent commit 28e9802 ("mm/list_lru: simplify reparenting and initial allocation") added a sanity check: only dying memcg could have a NULL list_lru when list_lru_{add,del} is called. This problem triggered this WARNING. So make madvise(..., MADV_COLLAPSE) also call xas_set_lru() to pass the list_lru which we may want to insert xa_node into later. And move mapping_set_update to mm/internal.h, and turn into a macro to avoid including extra headers in mm/internal.h. 
Link: https://lkml.kernel.org/r/20241222122936.67501-1-ryncsn@gmail.com
Fixes: 9bbdc0f ("xarray: use kmem_cache_alloc_lru to allocate xa_node")
Reported-by: syzbot+38a0cbd267eff2d286ff@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/lkml/675d01e9.050a0220.37aaf.00be.GAE@google.com/
Signed-off-by: Kairui Song <kasong@tencent.com>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Sasha Levin <sashal@kernel.org>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
1 parent 7d390b5 commit 62e72d2

File tree

3 files changed: +9 −9 lines changed

mm/filemap.c

Lines changed: 0 additions & 9 deletions

@@ -124,15 +124,6 @@
  *    ->private_lock		(zap_pte_range->block_dirty_folio)
  */

-static void mapping_set_update(struct xa_state *xas,
-			       struct address_space *mapping)
-{
-	if (dax_mapping(mapping) || shmem_mapping(mapping))
-		return;
-	xas_set_update(xas, workingset_update_node);
-	xas_set_lru(xas, &shadow_nodes);
-}
-
 static void page_cache_delete(struct address_space *mapping,
 			      struct folio *folio, void *shadow)
 {

mm/internal.h

Lines changed: 6 additions & 0 deletions

@@ -1504,6 +1504,12 @@ static inline void shrinker_debugfs_remove(struct dentry *debugfs_entry,
 /* Only track the nodes of mappings with shadow entries */
 void workingset_update_node(struct xa_node *node);
 extern struct list_lru shadow_nodes;
+#define mapping_set_update(xas, mapping) do {			\
+	if (!dax_mapping(mapping) && !shmem_mapping(mapping)) {	\
+		xas_set_update(xas, workingset_update_node);	\
+		xas_set_lru(xas, &shadow_nodes);		\
+	}							\
+} while (0)

 /* mremap.c */
 unsigned long move_page_tables(struct vm_area_struct *vma,

mm/khugepaged.c

Lines changed: 3 additions & 0 deletions

@@ -19,6 +19,7 @@
 #include <linux/rcupdate_wait.h>
 #include <linux/swapops.h>
 #include <linux/shmem_fs.h>
+#include <linux/dax.h>
 #include <linux/ksm.h>

 #include <asm/tlb.h>
@@ -1837,6 +1838,8 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
 	if (result != SCAN_SUCCEED)
 		goto out;

+	mapping_set_update(&xas, mapping);
+
 	__folio_set_locked(new_folio);
 	if (is_shmem)
 		__folio_set_swapbacked(new_folio);
