
Commit 0583135

Baolin Wang authored and akpm00 committed
mm: shmem: fix potential data corruption during shmem swapin
Alex and Kairui reported some issues (system hang or data corruption) when swapping out or swapping in large shmem folios. This is especially easy to reproduce when the tmpfs is mounted with the 'huge=within_size' parameter. Thanks to Kairui's reproducer, the issue can be easily replicated.

The root cause of the problem is that swap readahead may asynchronously swap in order 0 folios into the swap cache, while the shmem mapping can still store large swap entries. Then an order 0 folio is inserted into the shmem mapping without splitting the large swap entry, which overwrites the original large swap entry, leading to data corruption.

When getting a folio from the swap cache, we should split the large swap entry stored in the shmem mapping if the orders do not match, to fix this issue.

Link: https://lkml.kernel.org/r/2fe47c557e74e9df5fe2437ccdc6c9115fa1bf70.1740476943.git.baolin.wang@linux.alibaba.com
Fixes: 809bc86 ("mm: shmem: support large folio swap out")
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reported-by: Alex Xu (Hello71) <alex_y_xu@yahoo.ca>
Reported-by: Kairui Song <ryncsn@gmail.com>
Closes: https://lore.kernel.org/all/1738717785.im3r5g2vxc.none@localhost/
Tested-by: Kairui Song <kasong@tencent.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Lance Yang <ioworker0@gmail.com>
Cc: Matthew Wilcow <willy@infradead.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
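
The failure mode described in the commit message can be illustrated with a deliberately simplified user-space model. This is only a sketch: the toy_* names, the slot layout, and the way a replaced large entry drops its other indices are assumptions made for illustration, and none of it is the kernel's XArray or shmem code. It shows why inserting an order 0 folio over an unsplit large swap entry loses the swap entries of the neighbouring indices, while splitting first preserves them.

```c
#include <string.h>

#define TOY_NR_SLOTS 16

enum { TOY_EMPTY, TOY_SWAP, TOY_FOLIO };

/* Toy stand-in for one mapping slot; NOT the kernel's XArray. */
struct toy_slot {
	unsigned long swap_off;	/* swap offset of the entry's first page */
	int order;		/* the entry spans 1 << order indices */
	int state;		/* TOY_EMPTY, TOY_SWAP or TOY_FOLIO */
};

struct toy_mapping {
	struct toy_slot slot[TOY_NR_SLOTS];
};

static void toy_init(struct toy_mapping *m)
{
	memset(m, 0, sizeof(*m));	/* all slots start as TOY_EMPTY */
}

/* First index covered by an entry of the given order that contains index. */
static unsigned long toy_base(unsigned long index, int order)
{
	return index & ~((1UL << order) - 1);
}

/* Store one swap entry covering 1 << order naturally aligned indices. */
static void toy_store_swap(struct toy_mapping *m, unsigned long index,
			   int order, unsigned long swap_off)
{
	unsigned long base = toy_base(index, order), i;

	for (i = 0; i < (1UL << order); i++) {
		m->slot[base + i].swap_off = swap_off;
		m->slot[base + i].order = order;
		m->slot[base + i].state = TOY_SWAP;
	}
}

/* Split the entry covering index into order 0 entries; returns the old order. */
static int toy_split(struct toy_mapping *m, unsigned long index)
{
	int order = m->slot[index].order;
	unsigned long base = toy_base(index, order);
	unsigned long first = m->slot[index].swap_off, i;

	for (i = 0; i < (1UL << order); i++)
		toy_store_swap(m, base + i, 0, first + i);
	return order;
}

/* Replace the entry covering index, as one unit, with an order 0 folio. */
static void toy_insert_order0_folio(struct toy_mapping *m, unsigned long index)
{
	int order = m->slot[index].order;
	unsigned long base = toy_base(index, order), i;

	for (i = 0; i < (1UL << order); i++) {
		m->slot[base + i].state = TOY_EMPTY;	/* covered swap entries are dropped */
		m->slot[base + i].order = 0;
	}
	m->slot[index].state = TOY_FOLIO;
}

/* Number of indices that still hold a swap entry. */
static int toy_count_swap(const struct toy_mapping *m)
{
	int i, n = 0;

	for (i = 0; i < TOY_NR_SLOTS; i++)
		n += (m->slot[i].state == TOY_SWAP);
	return n;
}
```

In this model, storing an order 2 swap entry at index 4 and then calling toy_insert_order0_folio() on index 5 leaves no swap entries behind, whereas calling toy_split() first leaves the three entries for indices 4, 6 and 7 in place.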
1 parent c50f8e6 commit 0583135

1 file changed: +27 -4 lines changed

mm/shmem.c

Lines changed: 27 additions & 4 deletions
@@ -2253,7 +2253,7 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 	struct folio *folio = NULL;
 	bool skip_swapcache = false;
 	swp_entry_t swap;
-	int error, nr_pages;
+	int error, nr_pages, order, split_order;
 
 	VM_BUG_ON(!*foliop || !xa_is_value(*foliop));
 	swap = radix_to_swp_entry(*foliop);
@@ -2272,10 +2272,9 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 
 	/* Look it up and read it in.. */
 	folio = swap_cache_get_folio(swap, NULL, 0);
+	order = xa_get_order(&mapping->i_pages, index);
 	if (!folio) {
-		int order = xa_get_order(&mapping->i_pages, index);
 		bool fallback_order0 = false;
-		int split_order;
 
 		/* Or update major stats only when swapin succeeds?? */
 		if (fault_type) {
@@ -2339,14 +2338,38 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 			error = -ENOMEM;
 			goto failed;
 		}
+	} else if (order != folio_order(folio)) {
+		/*
+		 * Swap readahead may swap in order 0 folios into swapcache
+		 * asynchronously, while the shmem mapping can still stores
+		 * large swap entries. In such cases, we should split the
+		 * large swap entry to prevent possible data corruption.
+		 */
+		split_order = shmem_split_large_entry(inode, index, swap, gfp);
+		if (split_order < 0) {
+			error = split_order;
+			goto failed;
+		}
+
+		/*
+		 * If the large swap entry has already been split, it is
+		 * necessary to recalculate the new swap entry based on
+		 * the old order alignment.
+		 */
+		if (split_order > 0) {
+			pgoff_t offset = index - round_down(index, 1 << split_order);
+
+			swap = swp_entry(swp_type(swap), swp_offset(swap) + offset);
+		}
 	}
 
 alloced:
 	/* We have to do this with folio locked to prevent races */
 	folio_lock(folio);
 	if ((!skip_swapcache && !folio_test_swapcache(folio)) ||
 	    folio->swap.val != swap.val ||
-	    !shmem_confirm_swap(mapping, index, swap)) {
+	    !shmem_confirm_swap(mapping, index, swap) ||
+	    xa_get_order(&mapping->i_pages, index) != folio_order(folio)) {
 		error = -EEXIST;
 		goto unlock;
 	}
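
The swap entry recalculation in the new "split_order > 0" branch is plain index arithmetic, and a small user-space sketch may make it easier to follow. The helper names and the ROUND_DOWN_POW2 macro below are hypothetical stand-ins (the kernel uses round_down(), swp_offset() and swp_entry()), and the sketch assumes the swap offset passed in belongs to the first page of the large entry.

```c
/* User-space stand-in for the kernel's round_down(); y must be a power of two. */
#define ROUND_DOWN_POW2(x, y) ((x) & ~((unsigned long)(y) - 1))

/*
 * Hypothetical helper: position of index inside the naturally aligned
 * block of 1 << split_order pages that the old large entry covered.
 */
static unsigned long offset_in_large_entry(unsigned long index, int split_order)
{
	return index - ROUND_DOWN_POW2(index, 1UL << split_order);
}

/*
 * Hypothetical helper: swap offset for index after the split, assuming
 * base_swap_offset is the swap offset of the large entry's first page.
 * When nothing was split (split_order <= 0) the offset is unchanged.
 */
static unsigned long adjusted_swap_offset(unsigned long base_swap_offset,
					  unsigned long index, int split_order)
{
	if (split_order <= 0)
		return base_swap_offset;
	return base_swap_offset + offset_in_large_entry(index, split_order);
}
```

For example, with split_order 4 the old entry covered 16 pages, so index 21 rounds down to 16 and sits at offset 5; a base swap offset of 0x1000 then becomes 0x1005 for that index.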
