Commit aaa2f14

broonie authored and Marc Zyngier committed
KVM: arm64: Clarify host SME state management
Normally when running a guest we do not touch the floating point register state until first use of floating point by the guest, saving the current state and loading the guest state at that point. This has been found to offer a performance benefit in common cases. However, currently if SME is active when switching to a guest then we exit streaming mode, disable ZA and invalidate the floating point register state prior to starting the guest.

The exit from streaming mode is required for correct guest operation: if we leave streaming mode enabled then many non-SME operations can generate SME traps (e.g., SVE operations will become streaming SVE operations). If EL1 leaves CPACR_EL1.SMEN disabled then the host is unable to intercept these traps. This will mean that an SME-unaware guest will see SME exceptions which will confuse it. Disabling streaming mode also avoids creating spurious indications of usage of the SME hardware which could impact system performance, especially with shared SME implementations. Document the requirement to exit streaming mode clearly.

There is no issue with guest operation caused by PSTATE.ZA, so we can defer handling for that until first floating point usage. Do so if the register state is not that of the current task and hence has already been saved. We could also do this for the case where the register state is that of the current task; however, this is very unlikely to happen and would require disproportionate effort, so continue to save the state in that case.

Saving this state on first use would require that we map and unmap storage for the host version of these registers for use by the hypervisor, taking care to deal with protected KVM and the fact that the host can free or reallocate the backing storage. Given that the strong recommendation is that applications should only keep PSTATE.ZA enabled when the state it enables is in active use, it is difficult to see a case where a VMM would wish to do this: it would need to not only be using SME but also running the guest in the middle of SME usage. This can be revisited in the future if a use case does arise; in the interim such tasks will work but experience a performance overhead.

This brings our handling of SME more into line with our handling of other floating point state and documents more clearly the constraints we have, especially around streaming mode.

Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221214-kvm-arm64-sme-context-switch-v2-3-57ba0082e9ff@kernel.org
1 parent d071cef commit aaa2f14

1 file changed, +12 -9 lines changed


arch/arm64/kvm/fpsimd.c

Lines changed: 12 additions & 9 deletions
@@ -92,20 +92,23 @@ void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu)
 	if (read_sysreg(cpacr_el1) & CPACR_EL1_ZEN_EL0EN)
 		vcpu_set_flag(vcpu, HOST_SVE_ENABLED);
 
-	/*
-	 * We don't currently support SME guests but if we leave
-	 * things in streaming mode then when the guest starts running
-	 * FPSIMD or SVE code it may generate SME traps so as a
-	 * special case if we are in streaming mode we force the host
-	 * state to be saved now and exit streaming mode so that we
-	 * don't have to handle any SME traps for valid guest
-	 * operations. Do this for ZA as well for now for simplicity.
-	 */
 	if (system_supports_sme()) {
 		vcpu_clear_flag(vcpu, HOST_SME_ENABLED);
 		if (read_sysreg(cpacr_el1) & CPACR_EL1_SMEN_EL0EN)
 			vcpu_set_flag(vcpu, HOST_SME_ENABLED);
 
+		/*
+		 * If PSTATE.SM is enabled then save any pending FP
+		 * state and disable PSTATE.SM. If we leave PSTATE.SM
+		 * enabled and the guest does not enable SME via
+		 * CPACR_EL1.SMEN then operations that should be valid
+		 * may generate SME traps from EL1 to EL1 which we
+		 * can't intercept and which would confuse the guest.
+		 *
+		 * Do the same for PSTATE.ZA in the case where there
+		 * is state in the registers which has not already
+		 * been saved, this is very unlikely to happen.
+		 */
 		if (read_sysreg_s(SYS_SVCR) & (SVCR_SM_MASK | SVCR_ZA_MASK)) {
 			vcpu->arch.fp_state = FP_STATE_FREE;
 			fpsimd_save_and_flush_cpu_state();
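
For readers who want to experiment with the condition in this hunk outside the kernel, here is a minimal userspace sketch of the SVCR test. It is not kernel code: `needs_eager_save()` is a hypothetical helper introduced only for illustration, and the mask values are written out on the assumption that SVCR.SM is bit 0 and SVCR.ZA is bit 1, as in the Arm architecture. It models only the mask check visible in the diff, not what `fpsimd_save_and_flush_cpu_state()` does afterwards.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed SVCR layout (Arm architecture): SM is bit 0, ZA is bit 1. */
#define SVCR_SM_MASK (UINT64_C(1) << 0)
#define SVCR_ZA_MASK (UINT64_C(1) << 1)

/*
 * Hypothetical helper, not a kernel function. It mirrors the condition
 * in the hunk: if either PSTATE.SM or PSTATE.ZA is set in the SVCR
 * value, the host state is handled at vcpu load time instead of being
 * left for the lazy first-use path.
 */
static bool needs_eager_save(uint64_t svcr)
{
	return (svcr & (SVCR_SM_MASK | SVCR_ZA_MASK)) != 0;
}
```

Calling `needs_eager_save()` with neither bit set returns false, which corresponds to the common case where the floating point state is left untouched until the guest first uses it.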
