Commit 2a29315

Furong Xu authored and Paolo Abeni committed
net: stmmac: Optimize cache prefetch in RX path
The current code prefetches cache lines for the received frame first and only then calls dma_sync_single_for_cpu() on that frame, which is the wrong order. The cache prefetch should be triggered after dma_sync_single_for_cpu().

This patch brings a ~2.8% driver performance improvement in a TCP RX throughput test with iPerf on a single isolated Cortex-A65 CPU core: throughput increased from 2.84 Gbits/sec to 2.92 Gbits/sec.

Signed-off-by: Furong Xu <0x1207@gmail.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Reviewed-by: Yanteng Si <si.yanteng@linux.dev>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
1 parent 2324c78 commit 2a29315

1 file changed, +1 -4 lines changed
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c

Lines changed: 1 addition & 4 deletions
@@ -5508,10 +5508,6 @@ static int stmmac_rx(struct stmmac_priv *priv, int limit, u32 queue)
 
 		/* Buffer is good. Go on. */
 
-		prefetch(page_address(buf->page) + buf->page_offset);
-		if (buf->sec_page)
-			prefetch(page_address(buf->sec_page));
-
 		buf1_len = stmmac_rx_buf1_len(priv, p, status, len);
 		len += buf1_len;
 		buf2_len = stmmac_rx_buf2_len(priv, p, status, len);
@@ -5533,6 +5529,7 @@ static int stmmac_rx(struct stmmac_priv *priv, int limit, u32 queue)
 
 			dma_sync_single_for_cpu(priv->device, buf->addr,
 						buf1_len, dma_dir);
+			prefetch(page_address(buf->page) + buf->page_offset);
 
 			xdp_init_buff(&ctx.xdp, buf_sz, &rx_q->xdp_rxq);
 			xdp_prepare_buff(&ctx.xdp, page_address(buf->page),
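
Why the order matters can be illustrated with a small userspace toy model. This is only a sketch under stated assumptions, not driver code: it assumes a non-cache-coherent system where syncing a buffer for the CPU invalidates the cached copy, and every name in it (`toy_rx_buf`, `toy_prefetch`, `toy_sync_for_cpu`, `toy_read`) is hypothetical. It models neither the real `prefetch()` nor the real `dma_sync_single_for_cpu()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define BUF_LEN 64

/* Hypothetical toy model: one RX buffer backed by a one-line "cache". */
struct toy_rx_buf {
	unsigned char dram[BUF_LEN];  /* data the device wrote via DMA */
	unsigned char cache[BUF_LEN]; /* the CPU's cached copy */
	bool cache_valid;             /* true while the cached copy is usable */
	unsigned int misses;          /* demand misses counted by toy_read() */
};

/* Stand-in for a prefetch: fill the cache ahead of the first real read. */
static void toy_prefetch(struct toy_rx_buf *b)
{
	if (!b->cache_valid) {
		memcpy(b->cache, b->dram, BUF_LEN);
		b->cache_valid = true;
	}
}

/* Stand-in for a sync-for-CPU on a non-coherent system (assumption):
 * the cached copy is invalidated so the CPU re-reads what the device wrote. */
static void toy_sync_for_cpu(struct toy_rx_buf *b)
{
	b->cache_valid = false;
}

/* A CPU read: if the line is not cached, count a miss and fill it. */
static unsigned char toy_read(struct toy_rx_buf *b, size_t i)
{
	assert(i < BUF_LEN);
	if (!b->cache_valid) {
		b->misses++;
		memcpy(b->cache, b->dram, BUF_LEN);
		b->cache_valid = true;
	}
	return b->cache[i];
}

/* Old order: prefetch first, then sync. The sync discards the prefetched line. */
unsigned int misses_prefetch_then_sync(void)
{
	struct toy_rx_buf b = {0};

	memset(b.dram, 0xab, BUF_LEN); /* pretend the device wrote a frame */
	toy_prefetch(&b);
	toy_sync_for_cpu(&b);
	(void)toy_read(&b, 0);
	return b.misses;
}

/* New order: sync first, then prefetch. The first read hits the cache. */
unsigned int misses_sync_then_prefetch(void)
{
	struct toy_rx_buf b = {0};

	memset(b.dram, 0xab, BUF_LEN);
	toy_sync_for_cpu(&b);
	toy_prefetch(&b);
	(void)toy_read(&b, 0);
	return b.misses;
}
```

In this model, calling `misses_prefetch_then_sync()` reports one demand miss because the invalidation throws away the prefetched line, while `misses_sync_then_prefetch()` reports none. Real hardware is more involved (on cache-coherent systems the sync may do no cache maintenance at all), so the model only shows the direction of the effect, not its size.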
