
Commit 8cfe148

mrutland-arm authored and bonzini committed
kvm/arm64: rework guest entry logic
In kvm_arch_vcpu_ioctl_run() we enter an RCU extended quiescent state (EQS) by calling guest_enter_irqoff(), and unmask IRQs prior to exiting the EQS by calling guest_exit(). As the IRQ entry code will not wake RCU in this case, we may run the core IRQ code and IRQ handler without RCU watching, leading to various potential problems.

Additionally, we do not inform lockdep or tracing that interrupts will be enabled during guest execution, which can lead to misleading traces and warnings that interrupts have been enabled for overly-long periods.

This patch fixes these issues by using the new timing and context entry/exit helpers to ensure that interrupts are handled during guest vtime but with RCU watching, with a sequence:

	guest_timing_enter_irqoff();

	guest_state_enter_irqoff();
	< run the vcpu >
	guest_state_exit_irqoff();

	< take any pending IRQs >

	guest_timing_exit_irqoff();

Since instrumentation may make use of RCU, we must also ensure that no instrumented code is run during the EQS. I've split out the critical section into a new kvm_arm_vcpu_enter_exit() helper which is marked noinstr.

Fixes: 1b3d546 ("arm/arm64: KVM: Properly account for guest CPU time")
Reported-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Message-Id: <20220201132926.3301912-3-mark.rutland@arm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
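[Editorial note] For readers unfamiliar with the new helpers, the sketch below shows how they are intended to nest around a vCPU run, annotated with why each step matters (RCU watching vs. guest vtime accounting). It is illustrative only, not part of the patch: example_vcpu_run_once() and run_vcpu_hw() are hypothetical names standing in for the arch-specific entry point (kvm_call_hyp_ret(__kvm_vcpu_run, vcpu) on arm64); the real arm64 code is in the diff below.

/*
 * Illustrative sketch only. Assumes kernel context: the guest_*_irqoff()
 * helpers come from <linux/kvm_host.h> (added by the companion patch in this
 * series), isb() from the arm64 <asm/barrier.h>.
 */
static int example_vcpu_run_once(struct kvm_vcpu *vcpu)
{
	int ret;

	local_irq_disable();

	guest_timing_enter_irqoff();	/* guest vtime accounting starts */

	guest_state_enter_irqoff();	/* enter RCU EQS; must stay uninstrumented */
	ret = run_vcpu_hw(vcpu);	/* < run the vcpu > (hypothetical stand-in) */
	guest_state_exit_irqoff();	/* leave EQS; RCU is watching again */

	local_irq_enable();		/* < take any pending IRQs > as guest time */
	isb();				/* ensure pending IRQs are actually taken */
	local_irq_disable();

	guest_timing_exit_irqoff();	/* guest vtime accounting ends */

	local_irq_enable();
	return ret;
}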
1 parent b2d2af7 commit 8cfe148

1 file changed: +33 −18 lines

arch/arm64/kvm/arm.c

@@ -790,6 +790,24 @@ static bool kvm_vcpu_exit_request(struct kvm_vcpu *vcpu, int *ret)
 		xfer_to_guest_mode_work_pending();
 }
 
+/*
+ * Actually run the vCPU, entering an RCU extended quiescent state (EQS) while
+ * the vCPU is running.
+ *
+ * This must be noinstr as instrumentation may make use of RCU, and this is not
+ * safe during the EQS.
+ */
+static int noinstr kvm_arm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
+{
+	int ret;
+
+	guest_state_enter_irqoff();
+	ret = kvm_call_hyp_ret(__kvm_vcpu_run, vcpu);
+	guest_state_exit_irqoff();
+
+	return ret;
+}
+
 /**
  * kvm_arch_vcpu_ioctl_run - the main VCPU run function to execute guest code
  * @vcpu:	The VCPU pointer
@@ -874,9 +892,9 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
 		 * Enter the guest
 		 */
 		trace_kvm_entry(*vcpu_pc(vcpu));
-		guest_enter_irqoff();
+		guest_timing_enter_irqoff();
 
-		ret = kvm_call_hyp_ret(__kvm_vcpu_run, vcpu);
+		ret = kvm_arm_vcpu_enter_exit(vcpu);
 
 		vcpu->mode = OUTSIDE_GUEST_MODE;
 		vcpu->stat.exits++;
@@ -911,26 +929,23 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
 		kvm_arch_vcpu_ctxsync_fp(vcpu);
 
 		/*
-		 * We may have taken a host interrupt in HYP mode (ie
-		 * while executing the guest). This interrupt is still
-		 * pending, as we haven't serviced it yet!
+		 * We must ensure that any pending interrupts are taken before
+		 * we exit guest timing so that timer ticks are accounted as
+		 * guest time. Transiently unmask interrupts so that any
+		 * pending interrupts are taken.
 		 *
-		 * We're now back in SVC mode, with interrupts
-		 * disabled. Enabling the interrupts now will have
-		 * the effect of taking the interrupt again, in SVC
-		 * mode this time.
+		 * Per ARM DDI 0487G.b section D1.13.4, an ISB (or other
+		 * context synchronization event) is necessary to ensure that
+		 * pending interrupts are taken.
 		 */
 		local_irq_enable();
+		isb();
+		local_irq_disable();
+
+		guest_timing_exit_irqoff();
+
+		local_irq_enable();
 
-		/*
-		 * We do local_irq_enable() before calling guest_exit() so
-		 * that if a timer interrupt hits while running the guest we
-		 * account that tick as being spent in the guest. We enable
-		 * preemption after calling guest_exit() so that if we get
-		 * preempted we make sure ticks after that is not counted as
-		 * guest time.
-		 */
-		guest_exit();
 		trace_kvm_exit(ret, kvm_vcpu_trap_get_class(vcpu), *vcpu_pc(vcpu));
 
 		/* Exit types that need handling before we can be preempted */
