Commit be45bc4

KVM: SVM: Set RFLAGS.IF=1 in C code, to get VMRUN out of the STI shadow

Enable/disable local IRQs, i.e. set/clear RFLAGS.IF, in the common svm_vcpu_enter_exit() just after/before guest_state_{enter,exit}_irqoff() so that VMRUN is not executed in an STI shadow.

AMD CPUs have a quirk (some would say "bug"), where the STI shadow bleeds into the guest's intr_state field if a #VMEXIT occurs during injection of an event, i.e. if the VMRUN doesn't complete before the subsequent #VMEXIT.

The spurious "interrupts masked" state is relatively benign, as it only occurs during event injection and is transient. Because KVM is already injecting an event, the guest can't be in HLT, and if KVM is querying IRQ blocking for injection, then KVM would need to force an immediate exit anyways since injecting multiple events is impossible.

However, because KVM copies int_state verbatim from vmcb02 to vmcb12, the spurious STI shadow is visible to L1 when running a nested VM, which can trip sanity checks, e.g. in VMware's VMM.

Hoist the STI+CLI all the way to C code, as the aforementioned calls to guest_state_{enter,exit}_irqoff() already inform lockdep that IRQs are enabled/disabled, and taking a fault on VMRUN with RFLAGS.IF=1 is already possible. I.e. if there's kernel code that is confused by running with RFLAGS.IF=1, then it's already a problem. In practice, since GIF=0 also blocks NMIs, the only change in exposure to non-KVM code (relative to surrounding VMRUN with STI+CLI) is exception handling code, and except for the kvm_rebooting=1 case, all exceptions in the core VM-Enter/VM-Exit path are fatal.

Use the "raw" variants to enable/disable IRQs to avoid tracing in the "no instrumentation" code; the guest state helpers also take care of tracing IRQ state.

Opportunistically document why KVM needs to do STI in the first place.

Reported-by: Doug Covelli <doug.covelli@broadcom.com>
Closes: https://lore.kernel.org/all/CADH9ctBs1YPmE4aCfGPNBwA10cA8RuAk2gO7542DjMZgs4uzJQ@mail.gmail.com
Fixes: f14eec0 ("KVM: SVM: move more vmentry code to assembly")
Cc: stable@vger.kernel.org
Reviewed-by: Jim Mattson <jmattson@google.com>
Link: https://lore.kernel.org/r/20250224165442.2338294-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>

1 parent c2fee09 commit be45bc4

2 files changed: +15 -9 lines changed

arch/x86/kvm/svm/svm.c

Lines changed: 14 additions & 0 deletions

@@ -4189,6 +4189,18 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu, bool spec_ctrl_in
 
 	guest_state_enter_irqoff();
 
+	/*
+	 * Set RFLAGS.IF prior to VMRUN, as the host's RFLAGS.IF at the time of
+	 * VMRUN controls whether or not physical IRQs are masked (KVM always
+	 * runs with V_INTR_MASKING_MASK).  Toggle RFLAGS.IF here to avoid the
+	 * temptation to do STI+VMRUN+CLI, as AMD CPUs bleed the STI shadow
+	 * into guest state if delivery of an event during VMRUN triggers a
+	 * #VMEXIT, and the guest_state transitions already tell lockdep that
+	 * IRQs are being enabled/disabled.  Note!  GIF=0 for the entirety of
+	 * this path, so IRQs aren't actually unmasked while running host code.
+	 */
+	raw_local_irq_enable();
+
 	amd_clear_divider();
 
 	if (sev_es_guest(vcpu->kvm))
@@ -4197,6 +4209,8 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu, bool spec_ctrl_in
 	else
 		__svm_vcpu_run(svm, spec_ctrl_intercepted);
 
+	raw_local_irq_disable();
+
 	guest_state_exit_irqoff();
 }

arch/x86/kvm/svm/vmenter.S

Lines changed: 1 addition & 9 deletions

@@ -170,12 +170,8 @@ SYM_FUNC_START(__svm_vcpu_run)
 	mov VCPU_RDI(%_ASM_DI), %_ASM_DI
 
 	/* Enter guest mode */
-	sti
-
 3:	vmrun %_ASM_AX
 4:
-	cli
-
 	/* Pop @svm to RAX while it's the only available register. */
 	pop %_ASM_AX

@@ -340,12 +336,8 @@ SYM_FUNC_START(__svm_sev_es_vcpu_run)
 	mov KVM_VMCB_pa(%rax), %rax
 
 	/* Enter guest mode */
-	sti
-
 1:	vmrun %rax
-
-2:	cli
-
+2:
 	/* IMPORTANT: Stuff the RSB immediately after VM-Exit, before RET! */
 	FILL_RETURN_BUFFER %rax, RSB_CLEAR_LOOPS, X86_FEATURE_RSB_VMEXIT
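
The ordering argument in the commit message can be illustrated with a small user-space toy model. This is only a sketch under stated assumptions: the `toy_*` names and `struct toy_cpu` are hypothetical and are not KVM or hardware interfaces, and the model reduces the quirk to one rule taken from the message, namely that a VMRUN executed inside the STI shadow leaks that shadow into guest state when a #VMEXIT occurs during event injection.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical names, not KVM or hardware interfaces. */
struct toy_cpu {
	bool rflags_if;   /* host RFLAGS.IF */
	bool sti_shadow;  /* set by STI, cleared when the next instruction retires */
};

/* STI: set IF; the shadow starts only if IF was clear beforehand. */
static void toy_sti(struct toy_cpu *cpu)
{
	if (!cpu->rflags_if)
		cpu->sti_shadow = true;
	cpu->rflags_if = true;
}

/* CLI: clear IF; retiring it also ends any shadow. */
static void toy_cli(struct toy_cpu *cpu)
{
	cpu->rflags_if = false;
	cpu->sti_shadow = false;
}

/* Any other instruction: retiring it ends the STI shadow. */
static void toy_retire_other_insn(struct toy_cpu *cpu)
{
	cpu->sti_shadow = false;
}

/*
 * Assumed scenario: VMRUN takes a #VMEXIT while injecting an event.
 * Returns whether a spurious interrupt shadow lands in guest state.
 */
static bool toy_vmrun_exit_during_injection(struct toy_cpu *cpu)
{
	assert(cpu->rflags_if);  /* KVM runs VMRUN with host IF=1 */
	bool leaked = cpu->sti_shadow;
	cpu->sti_shadow = false;
	return leaked;
}

/* Old ordering in vmenter.S: sti; vmrun; cli */
static bool toy_old_ordering(void)
{
	struct toy_cpu cpu = { .rflags_if = false, .sti_shadow = false };

	toy_sti(&cpu);
	bool leaked = toy_vmrun_exit_during_injection(&cpu);
	toy_cli(&cpu);
	return leaked;
}

/* New ordering: IF is set in C code, other instructions run, then vmrun. */
static bool toy_new_ordering(void)
{
	struct toy_cpu cpu = { .rflags_if = false, .sti_shadow = false };

	toy_sti(&cpu);                /* stands in for raw_local_irq_enable() */
	toy_retire_other_insn(&cpu);  /* e.g. code between the C call and VMRUN */
	bool leaked = toy_vmrun_exit_during_injection(&cpu);
	toy_cli(&cpu);                /* stands in for raw_local_irq_disable() */
	return leaked;
}
```

In this model, `toy_old_ordering()` reports a leaked shadow and `toy_new_ordering()` does not, because at least one instruction retires between the STI and the VMRUN. Real hardware behavior is more involved; the sketch only mirrors the reasoning given in the commit message.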
