Skip to content

Commit 2d5780b

Browse files
ptesarikChristoph Hellwig
authored andcommitted
swiotlb: fix the check whether a device has used software IO TLB
When CONFIG_SWIOTLB_DYNAMIC=y, devices which do not use the software IO TLB can avoid swiotlb lookup. A flag is added by commit 1395706 ("swiotlb: search the software IO TLB only if the device makes use of it"), the flag is correctly set, but it is then never checked. Add the actual check here. Note that this code is an alternative to the default pool check, not an additional check, because: 1. swiotlb_find_pool() also searches the default pool; 2. if dma_uses_io_tlb is false, the default swiotlb pool is not used. Tested in a KVM guest against a QEMU RAM-backed SATA disk over virtio and *not* using software IO TLB, this patch increases IOPS by approx 2% for 4-way parallel I/O. The write memory barrier in swiotlb_dyn_alloc() is not needed, because a newly allocated pool must always be observed by swiotlb_find_slots() before an address from that pool is passed to is_swiotlb_buffer(). Correctness was verified using the following litmus test: C swiotlb-new-pool (* * Result: Never * * Check that a newly allocated pool is always visible when the * corresponding swiotlb buffer is visible. *) { mem_pools = default; } P0(int **mem_pools, int *pool) { /* add_mem_pool() */ WRITE_ONCE(*pool, 999); rcu_assign_pointer(*mem_pools, pool); } P1(int **mem_pools, int *flag, int *buf) { /* swiotlb_find_slots() */ int *r0; int r1; rcu_read_lock(); r0 = READ_ONCE(*mem_pools); r1 = READ_ONCE(*r0); rcu_read_unlock(); if (r1) { WRITE_ONCE(*flag, 1); smp_mb(); } /* device driver (presumed) */ WRITE_ONCE(*buf, r1); } P2(int **mem_pools, int *flag, int *buf) { /* device driver (presumed) */ int r0 = READ_ONCE(*buf); /* is_swiotlb_buffer() */ int r1; int *r2; int r3; smp_rmb(); r1 = READ_ONCE(*flag); if (r1) { /* swiotlb_find_pool() */ rcu_read_lock(); r2 = READ_ONCE(*mem_pools); r3 = READ_ONCE(*r2); rcu_read_unlock(); } } exists (2:r0<>0 /\ 2:r3=0) (* Not found. *) Fixes: 1395706 ("swiotlb: search the software IO TLB only if the device makes use of it") Reported-by: Jonathan Corbet <corbet@lwn.net> Closes: https://lore.kernel.org/linux-iommu/87a5uz3ob8.fsf@meer.lwn.net/ Signed-off-by: Petr Tesarik <petr@tesarici.cz> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
1 parent a6a2417 commit 2d5780b

File tree

2 files changed

+36
-13
lines changed

2 files changed

+36
-13
lines changed

include/linux/swiotlb.h

Lines changed: 16 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -172,14 +172,23 @@ static inline bool is_swiotlb_buffer(struct device *dev, phys_addr_t paddr)
172172
if (!mem)
173173
return false;
174174

175-
if (IS_ENABLED(CONFIG_SWIOTLB_DYNAMIC)) {
176-
/* Pairs with smp_wmb() in swiotlb_find_slots() and
177-
* swiotlb_dyn_alloc(), which modify the RCU lists.
178-
*/
179-
smp_rmb();
180-
return swiotlb_find_pool(dev, paddr);
181-
}
175+
#ifdef CONFIG_SWIOTLB_DYNAMIC
176+
/*
177+
* All SWIOTLB buffer addresses must have been returned by
178+
* swiotlb_tbl_map_single() and passed to a device driver.
179+
* If a SWIOTLB address is checked on another CPU, then it was
180+
* presumably loaded by the device driver from an unspecified private
181+
* data structure. Make sure that this load is ordered before reading
182+
* dev->dma_uses_io_tlb here and mem->pools in swiotlb_find_pool().
183+
*
184+
* This barrier pairs with smp_mb() in swiotlb_find_slots().
185+
*/
186+
smp_rmb();
187+
return READ_ONCE(dev->dma_uses_io_tlb) &&
188+
swiotlb_find_pool(dev, paddr);
189+
#else
182190
return paddr >= mem->defpool.start && paddr < mem->defpool.end;
191+
#endif
183192
}
184193

185194
static inline bool is_swiotlb_force_bounce(struct device *dev)

kernel/dma/swiotlb.c

Lines changed: 20 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -728,9 +728,6 @@ static void swiotlb_dyn_alloc(struct work_struct *work)
728728
}
729729

730730
add_mem_pool(mem, pool);
731-
732-
/* Pairs with smp_rmb() in is_swiotlb_buffer(). */
733-
smp_wmb();
734731
}
735732

736733
/**
@@ -1151,9 +1148,26 @@ static int swiotlb_find_slots(struct device *dev, phys_addr_t orig_addr,
11511148
spin_unlock_irqrestore(&dev->dma_io_tlb_lock, flags);
11521149

11531150
found:
1154-
dev->dma_uses_io_tlb = true;
1155-
/* Pairs with smp_rmb() in is_swiotlb_buffer() */
1156-
smp_wmb();
1151+
WRITE_ONCE(dev->dma_uses_io_tlb, true);
1152+
1153+
/*
1154+
* The general barrier orders reads and writes against a presumed store
1155+
* of the SWIOTLB buffer address by a device driver (to a driver private
1156+
* data structure). It serves two purposes.
1157+
*
1158+
* First, the store to dev->dma_uses_io_tlb must be ordered before the
1159+
* presumed store. This guarantees that the returned buffer address
1160+
* cannot be passed to another CPU before updating dev->dma_uses_io_tlb.
1161+
*
1162+
* Second, the load from mem->pools must be ordered before the same
1163+
* presumed store. This guarantees that the returned buffer address
1164+
* cannot be observed by another CPU before an update of the RCU list
1165+
* that was made by swiotlb_dyn_alloc() on a third CPU (cf. multicopy
1166+
* atomicity).
1167+
*
1168+
* See also the comment in is_swiotlb_buffer().
1169+
*/
1170+
smp_mb();
11571171

11581172
*retpool = pool;
11591173
return index;

0 commit comments

Comments
 (0)