Commit 386d69f
KVM: x86/mmu: Treat TDP MMU faults as spurious if access is already allowed
Treat slow-path TDP MMU faults as spurious if the access is allowed given the existing SPTE, to fix a benign warning (benign other than the WARN itself) due to replacing a writable SPTE with a read-only SPTE, and to avoid the unnecessary LOCK CMPXCHG and subsequent TLB flush.

If a read fault races with a write fault, fast GUP fails for any reason when trying to "promote" the read fault to a writable mapping, and KVM resolves the write fault first, then KVM will end up trying to install a read-only SPTE (for a !map_writable fault) overtop a writable SPTE.

Note, it's not entirely clear why fast GUP fails, or if that's even how KVM ends up with a !map_writable fault with a writable SPTE. If something else is going awry, e.g. due to a bug in mmu_notifiers, then treating read faults as spurious in this scenario could effectively mask the underlying problem. However, retrying the faulting access instead of overwriting an existing SPTE is functionally correct and desirable irrespective of the WARN, and fast GUP _can_ legitimately fail with a writable VMA, e.g. if the Accessed bit in the primary MMU's PTE is toggled and causes a PTE value mismatch.

The WARN was also recently added, specifically to track down scenarios where KVM unnecessarily overwrites SPTEs, i.e. treating the fault as spurious doesn't regress KVM's bug-finding capabilities in any way. In short, letting the WARN linger because there's a tiny chance it's due to a bug elsewhere would be excessively paranoid.

Fixes: 1a17508 ("KVM: x86/mmu: WARN and flush if resolving a TDP MMU fault clears MMU-writable")
Reported-by: Lei Yang <leiyang@redhat.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219588
Tested-by: Lei Yang <leiyang@redhat.com>
Link: https://lore.kernel.org/r/20241218213611.3181643-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
1 parent 4d5163c commit 386d69f

File tree

3 files changed: +22 additions, -12 deletions

arch/x86/kvm/mmu/mmu.c

Lines changed: 0 additions & 12 deletions

@@ -3364,18 +3364,6 @@ static bool fast_pf_fix_direct_spte(struct kvm_vcpu *vcpu,
 	return true;
 }
 
-static bool is_access_allowed(struct kvm_page_fault *fault, u64 spte)
-{
-	if (fault->exec)
-		return is_executable_pte(spte);
-
-	if (fault->write)
-		return is_writable_pte(spte);
-
-	/* Fault was on Read access */
-	return spte & PT_PRESENT_MASK;
-}
-
 /*
  * Returns the last level spte pointer of the shadow page walk for the given
  * gpa, and sets *spte to the spte value. This spte may be non-preset. If no
arch/x86/kvm/mmu/spte.h

Lines changed: 17 additions & 0 deletions

@@ -461,6 +461,23 @@ static inline bool is_mmu_writable_spte(u64 spte)
 	return spte & shadow_mmu_writable_mask;
 }
 
+/*
+ * Returns true if the access indicated by @fault is allowed by the existing
+ * SPTE protections. Note, the caller is responsible for checking that the
+ * SPTE is a shadow-present, leaf SPTE (either before or after).
+ */
+static inline bool is_access_allowed(struct kvm_page_fault *fault, u64 spte)
+{
+	if (fault->exec)
+		return is_executable_pte(spte);
+
+	if (fault->write)
+		return is_writable_pte(spte);
+
+	/* Fault was on Read access */
+	return spte & PT_PRESENT_MASK;
+}
+
 /*
  * If the MMU-writable flag is cleared, i.e. the SPTE is write-protected for
  * write-tracking, remote TLBs must be flushed, even if the SPTE was read-only,

arch/x86/kvm/mmu/tdp_mmu.c

Lines changed: 5 additions & 0 deletions

@@ -985,6 +985,11 @@ static int tdp_mmu_map_handle_target_level(struct kvm_vcpu *vcpu,
 	if (fault->prefetch && is_shadow_present_pte(iter->old_spte))
 		return RET_PF_SPURIOUS;
 
+	if (is_shadow_present_pte(iter->old_spte) &&
+	    is_access_allowed(fault, iter->old_spte) &&
+	    is_last_spte(iter->old_spte, iter->level))
+		return RET_PF_SPURIOUS;
+
 	if (unlikely(!fault->slot))
 		new_spte = make_mmio_spte(vcpu, iter->gfn, ACC_ALL);
 	else