This repository was archived by the owner on Nov 8, 2023. It is now read-only.
The codegen for adding architecture-specific stack alignment to the
effective alloca() usage is somewhat inefficient and allows a bit to get
carried beyond the desired entropy range. This isn't really a problem,
but it's unexpected and the codegen is kind of bad.
Quoting Mark[1], the disassembly for arm64's invoke_syscall() looks like:
	// offset = raw_cpu_read(kstack_offset)
	mov     x4, sp
	adrp    x0, kstack_offset
	mrs     x5, tpidr_el1
	add     x0, x0, #:lo12:kstack_offset
	ldr     w0, [x0, x5]

	// offset = KSTACK_OFFSET_MAX(offset)
	and     x0, x0, #0x3ff

	// alloca(offset)
	add     x0, x0, #0xf
	and     x0, x0, #0x7f0
	sub     sp, x4, x0
... which in C would be:
	offset = raw_cpu_read(kstack_offset);
	offset &= 0x3ff;	// [0x0, 0x3ff]
	offset += 0xf;		// [0xf, 0x40e]
	offset &= 0x7f0;	// [0x0, 0x400]
... so when *all* bits [3:0] are 0, they'll have no impact, and when
*any* of bits [3:0] are 1 they'll trigger a carry into bit 4, which
could ripple all the way up and spill into bit 10.
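The carry can be checked exhaustively with a small userspace sketch. The
helper names below (old_mask, round_up_16, old_max_adjust) are made up for
illustration and are not the kernel's macros; they only mirror the
"and #0x3ff", "add #0xf" and "and #0x7f0" steps from the disassembly above.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-in for the old 10-bit mask (and x0, x0, #0x3ff). */
static uint32_t old_mask(uint32_t raw)
{
	return raw & 0x3ff;
}

/* Mirrors the alloca() alignment pair: add #0xf, then and #0x7f0. */
static uint32_t round_up_16(uint32_t offset)
{
	return (offset + 0xf) & 0x7f0;
}

/* Largest stack adjustment reachable across every 10-bit input. */
static uint32_t old_max_adjust(void)
{
	uint32_t max = 0;

	for (uint32_t raw = 0; raw <= 0x3ff; raw++) {
		uint32_t adj = round_up_16(old_mask(raw));

		if (adj > max)
			max = adj;
	}
	return max;
}
```

Under these assumptions old_max_adjust() returns 0x400: any input of
0x3f1 or above rounds up into bit 10, one bit past the 10-bit mask.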
Switch the masking in KSTACK_OFFSET_MAX() to explicitly clear the bottom
bits to avoid the rounding by using 0b1111110000 instead of 0b1111111111:
	// offset = raw_cpu_read(kstack_offset)
	mov     x4, sp
	adrp    x0, 0 <kstack_offset>
	mrs     x5, tpidr_el1
	add     x0, x0, #:lo12:kstack_offset
	ldr     w0, [x0, x5]

	// offset = KSTACK_OFFSET_MAX(offset)
	and     x0, x0, #0x3f0

	// alloca(offset)
	sub     sp, x4, x0
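
As a rough userspace check of why the add/and pair can be dropped: once the
low four bits are cleared by the mask, rounding up to 16 bytes leaves the
value unchanged. The helper names here (new_mask, align_up_16,
rounding_is_noop) are hypothetical and only model the 0x3f0 mask described
above; they are not kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-in for the new mask (and x0, x0, #0x3f0). */
static uint32_t new_mask(uint32_t raw)
{
	return raw & 0x3f0;
}

/* Generic round-up to a 16-byte boundary, as alloca() alignment would do. */
static uint32_t align_up_16(uint32_t offset)
{
	return (offset + 0xf) & ~(uint32_t)0xf;
}

/* Returns 1 if aligning never changes a masked offset, 0 otherwise. */
static int rounding_is_noop(void)
{
	for (uint32_t raw = 0; raw <= 0xffff; raw++) {
		uint32_t off = new_mask(raw);

		if (align_up_16(off) != off)
			return 0;
	}
	return 1;
}
```

With the low bits already clear, the masked offset stays within
[0x0, 0x3f0], so in this model nothing can carry into bit 10.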
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/lkml/ZnVfOnIuFl2kNWkT@J2N7QTR9R3/ [1]
Link: https://lore.kernel.org/r/20240702211612.work.576-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>