Commit d2bda3b

Remove Unnecessary/Erroneous Reads In sgemm_tcopy_16.S COPY1x8 Macro
Some code appears to have leaked in when the COPY1x8 macro was copied from the COPY2x8 macro above it: directly after loading 4 bytes into each of s4-s7, the macro loads 8 bytes into each of d4-d7. The 32 bytes loaded into d4-d7 are never used, and the loads can overrun the boundary of allocated memory. Valgrind detected this on a 128,1 copy, which is what drew my attention to it. Additionally, there is no need to update the addresses stored in A01-A08, as the only possible paths after running this macro either overwrite A01-A08 when looping to the next 8 rows, or overwrite A01-A04 when moving on to 4 rows, in which case A05-A08 are unused.
1 parent 903fd85 commit d2bda3b

1 file changed: +0 -10 lines changed

kernel/arm64/sgemm_tcopy_16.S

Lines changed: 0 additions & 10 deletions
@@ -270,11 +270,6 @@ All rights reserved.
 	ldr	s1, [A02]
 	ldr	s2, [A03]
 	ldr	s3, [A04]
-
-	add	A01, A01, #4
-	add	A02, A02, #4
-	add	A03, A03, #4
-	add	A04, A04, #4
 
 	stp	s0, s1, [B04]
 	add	B04, B04, #8
@@ -285,11 +280,6 @@ All rights reserved.
 	ldr	s5, [A06]
 	ldr	s6, [A07]
 	ldr	s7, [A08]
-
-	ldr	d4, [A05], #8
-	ldr	d5, [A06], #8
-	ldr	d6, [A07], #8
-	ldr	d7, [A08], #8
 
 	stp	s4, s5, [B04]
 	add	B04, B04, #8
