You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
x86/hyperv: Properly deal with empty cpumasks in hyperv_flush_tlb_multi()
KASAN detected the following issue:
BUG: KASAN: slab-out-of-bounds in hyperv_flush_tlb_multi+0xf88/0x1060
Read of size 4 at addr ffff8880011ccbc0 by task kcompactd0/33
CPU: 1 PID: 33 Comm: kcompactd0 Not tainted 5.14.0-39.el9.x86_64+debug #1
Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine,
BIOS Hyper-V UEFI Release v4.0 12/17/2019
Call Trace:
dump_stack_lvl+0x57/0x7d
print_address_description.constprop.0+0x1f/0x140
? hyperv_flush_tlb_multi+0xf88/0x1060
__kasan_report.cold+0x7f/0x11e
? hyperv_flush_tlb_multi+0xf88/0x1060
kasan_report+0x38/0x50
hyperv_flush_tlb_multi+0xf88/0x1060
flush_tlb_mm_range+0x1b1/0x200
ptep_clear_flush+0x10e/0x150
...
Allocated by task 0:
kasan_save_stack+0x1b/0x40
__kasan_kmalloc+0x7c/0x90
hv_common_init+0xae/0x115
hyperv_init+0x97/0x501
apic_intr_mode_init+0xb3/0x1e0
x86_late_time_init+0x92/0xa2
start_kernel+0x338/0x3eb
secondary_startup_64_no_verify+0xc2/0xcb
The buggy address belongs to the object at ffff8880011cc800
which belongs to the cache kmalloc-1k of size 1024
The buggy address is located 960 bytes inside of
1024-byte region [ffff8880011cc800, ffff8880011ccc00)
'hyperv_flush_tlb_multi+0xf88/0x1060' points to
hv_cpu_number_to_vp_number() and '960 bytes' means we're trying to get
VP_INDEX for CPU#240. 'nr_cpus' here is exactly 240 so we're trying to
access past hv_vp_index's last element. This can (and will) happen
when 'cpus' mask is empty and cpumask_last() will return '>=nr_cpus'.
Commit ad0a6ba ("x86/hyperv: check cpu mask after interrupt has
been disabled") tried to deal with empty cpumask situation but
apparently didn't fully fix the issue.
'cpus' cpumask which is passed to hyperv_flush_tlb_multi() is
'mm_cpumask(mm)' (which is '&mm->cpu_bitmap'). This mask changes every
time the particular mm is scheduled/unscheduled on some CPU (see
switch_mm_irqs_off()), disabling IRQs on the CPU which is performing remote
TLB flush has zero influence on whether the particular process can get
scheduled/unscheduled on _other_ CPUs so e.g. in the case where the mm was
scheduled on one other CPU and got unscheduled during
hyperv_flush_tlb_multi()'s execution will lead to cpumask becoming empty.
It doesn't seem that there's a good way to protect 'mm_cpumask(mm)'
from changing during hyperv_flush_tlb_multi()'s execution. It would be
possible to copy it in the very beginning of the function but this is a
waste. It seems we can deal with changing cpumask just fine.
When 'cpus' cpumask changes during hyperv_flush_tlb_multi()'s
execution, there are two possible issues:
- 'Under-flushing': we will not flush TLB on a CPU which got added to
the mask while hyperv_flush_tlb_multi() was already running. This is
not a problem as this is equal to mm getting scheduled on that CPU
right after TLB flush.
- 'Over-flushing': we may flush TLB on a CPU which is already cleared
from the mask. First, extra TLB flush preserves correctness. Second,
Hyper-V's TLB flush hypercall takes 'mm->pgd' argument so Hyper-V may
avoid the flush if CR3 doesn't match.
Fix the immediate issue with cpumask_last()/hv_cpu_number_to_vp_number()
and remove the pointless cpumask_empty() check from the beginning of the
function as it really doesn't protect anything. Also, avoid the hypercall
altogether when 'flush->processor_mask' ends up being empty.
Fixes: ad0a6ba ("x86/hyperv: check cpu mask after interrupt has been disabled")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20220106094611.1404218-1-vkuznets@redhat.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
0 commit comments