
Commit 61d65f2

48ca authored and sean-jc committed
KVM: x86/mmu: Don't force atomic update if only the Accessed bit is volatile
Don't force SPTE modifications to be done atomically if the only volatile bit in the SPTE is the Accessed bit. KVM and the primary MMU tolerate stale aging state, and the probability of an Accessed bit A/D assist being clobbered *and* affecting again is likely far lower than the probability of consuming stale information due to not flushing TLBs when aging. Rename spte_has_volatile_bits() to spte_needs_atomic_update() to better capture the nature of the helper. Opportunstically do s/write/update on the TDP MMU wrapper, as it's not simply the "write" that needs to be done atomically, it's the entire update, i.e. the entire read-modify-write operation needs to be done atomically so that KVM has an accurate view of the old SPTE. Leave kvm_tdp_mmu_write_spte_atomic() as is. While the name is imperfect, it pairs with kvm_tdp_mmu_write_spte(), which in turn pairs with kvm_tdp_mmu_read_spte(). And renaming all of those isn't obviously a net positive, and would require significant churn. Signed-off-by: James Houghton <jthoughton@google.com> Link: https://lore.kernel.org/r/20250204004038.1680123-6-jthoughton@google.com Co-developed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com>
1 parent e29b749 commit 61d65f2
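
For readers skimming the diffs below, the following standalone sketch (not taken from the commit; the bit masks and helper names are simplified stand-ins for KVM's real SPTE accessors) condenses the rule the renamed helper now implements: an atomic update is forced only when information could be lost, and a racing Accessed-bit assist is deliberately tolerated.

```c
/*
 * Condensed, standalone sketch of the logic in the spte.c hunk below.
 * The masks and helpers are simplified stand-ins for KVM's real accessors
 * (is_writable_pte(), spte_ad_enabled(), etc.), not kernel code.
 */
#include <stdbool.h>
#include <stdint.h>

#define SPTE_WRITABLE      (1ull << 1)   /* stand-in for the hardware Writable bit */
#define SPTE_DIRTY         (1ull << 6)   /* stand-in for shadow_dirty_mask */
#define SPTE_MMU_WRITABLE  (1ull << 58)  /* stand-in for the MMU-writable software bit */
#define SPTE_AD_ENABLED    (1ull << 59)  /* stand-in for "A/D bits enabled" state */
#define SPTE_ACCESS_TRACK  (1ull << 60)  /* stand-in for access-tracking state */

static bool spte_needs_atomic_update(uint64_t spte)
{
	/* Fast page faults can restore the Writable bit outside of mmu_lock. */
	if (!(spte & SPTE_WRITABLE) && (spte & SPTE_MMU_WRITABLE))
		return true;

	/* Fast page faults can also restore access-tracked SPTEs. */
	if (spte & SPTE_ACCESS_TRACK)
		return true;

	/*
	 * Only a racing Dirty-bit assist must not be lost; clobbering a
	 * racing Accessed-bit assist is tolerated, so the Accessed bit no
	 * longer forces an atomic update.
	 */
	return (spte & SPTE_AD_ENABLED) && (spte & SPTE_WRITABLE) &&
	       !(spte & SPTE_DIRTY);
}
```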

5 files changed: +28 −30 lines changed


Documentation/virt/kvm/locking.rst

Lines changed: 2 additions & 2 deletions
@@ -196,7 +196,7 @@ writable between reading spte and updating spte. Like below case:
 The Dirty bit is lost in this case.
 
 In order to avoid this kind of issue, we always treat the spte as "volatile"
-if it can be updated out of mmu-lock [see spte_has_volatile_bits()]; it means
+if it can be updated out of mmu-lock [see spte_needs_atomic_update()]; it means
 the spte is always atomically updated in this case.
 
 3) flush tlbs due to spte updated
@@ -212,7 +212,7 @@ function to update spte (present -> present).
 
 Since the spte is "volatile" if it can be updated out of mmu-lock, we always
 atomically update the spte and the race caused by fast page fault can be avoided.
-See the comments in spte_has_volatile_bits() and mmu_spte_update().
+See the comments in spte_needs_atomic_update() and mmu_spte_update().
 
 Lockless Access Tracking:
 

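To make the point in the doc hunk above concrete, here is a minimal standalone illustration (not part of the commit, and not KVM code) of why an SPTE that hardware can modify concurrently must be updated with an atomic read-modify-write: a plain read-then-store can silently discard a Dirty bit set between the read and the store, whereas an atomic exchange hands back whatever value was actually present, conceptually similar to what the slow, atomic update path in the mmu.c hunk below provides.

```c
/*
 * Illustration only: how a non-atomic SPTE update can lose a concurrently
 * set Dirty bit, and how an atomic exchange avoids that.
 */
#include <stdatomic.h>
#include <stdint.h>

/* Racy variant: a Dirty bit set by the CPU after 'old' was read is lost. */
static uint64_t update_spte_racy(uint64_t *sptep, uint64_t new_spte)
{
	uint64_t old = *sptep;
	*sptep = new_spte;	/* may overwrite a just-set Dirty bit */
	return old;
}

/* Atomic variant: the exchange returns the value that was actually there. */
static uint64_t update_spte_atomic(_Atomic uint64_t *sptep, uint64_t new_spte)
{
	return atomic_exchange(sptep, new_spte);	/* old value, Dirty bit included */
}
```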
arch/x86/kvm/mmu/mmu.c

Lines changed: 2 additions & 2 deletions
@@ -501,7 +501,7 @@ static bool mmu_spte_update(u64 *sptep, u64 new_spte)
 		return false;
 	}
 
-	if (!spte_has_volatile_bits(old_spte))
+	if (!spte_needs_atomic_update(old_spte))
 		__update_clear_spte_fast(sptep, new_spte);
 	else
 		old_spte = __update_clear_spte_slow(sptep, new_spte);
@@ -524,7 +524,7 @@ static u64 mmu_spte_clear_track_bits(struct kvm *kvm, u64 *sptep)
 	int level = sptep_to_sp(sptep)->role.level;
 
 	if (!is_shadow_present_pte(old_spte) ||
-	    !spte_has_volatile_bits(old_spte))
+	    !spte_needs_atomic_update(old_spte))
 		__update_clear_spte_fast(sptep, SHADOW_NONPRESENT_VALUE);
 	else
 		old_spte = __update_clear_spte_slow(sptep, SHADOW_NONPRESENT_VALUE);

arch/x86/kvm/mmu/spte.c

Lines changed: 16 additions & 11 deletions
@@ -129,25 +129,30 @@ static bool kvm_is_mmio_pfn(kvm_pfn_t pfn)
 }
 
 /*
- * Returns true if the SPTE has bits that may be set without holding mmu_lock.
- * The caller is responsible for checking if the SPTE is shadow-present, and
- * for determining whether or not the caller cares about non-leaf SPTEs.
+ * Returns true if the SPTE needs to be updated atomically due to having bits
+ * that may be changed without holding mmu_lock, and for which KVM must not
+ * lose information.  E.g. KVM must not drop Dirty bit information.  The caller
+ * is responsible for checking if the SPTE is shadow-present, and for
+ * determining whether or not the caller cares about non-leaf SPTEs.
 */
-bool spte_has_volatile_bits(u64 spte)
+bool spte_needs_atomic_update(u64 spte)
 {
+	/* SPTEs can be made Writable bit by KVM's fast page fault handler. */
 	if (!is_writable_pte(spte) && is_mmu_writable_spte(spte))
 		return true;
 
+	/* Access-tracked SPTEs can be restored by KVM's fast page fault handler. */
 	if (is_access_track_spte(spte))
 		return true;
 
-	if (spte_ad_enabled(spte)) {
-		if (!(spte & shadow_accessed_mask) ||
-		    (is_writable_pte(spte) && !(spte & shadow_dirty_mask)))
-			return true;
-	}
-
-	return false;
+	/*
+	 * Dirty and Accessed bits can be set by the CPU.  Ignore the Accessed
+	 * bit, as KVM tolerates false negatives/positives, e.g. KVM doesn't
+	 * invalidate TLBs when aging SPTEs, and so it's safe to clobber the
+	 * Accessed bit (and rare in practice).
+	 */
+	return spte_ad_enabled(spte) && is_writable_pte(spte) &&
+	       !(spte & shadow_dirty_mask);
 }
 
 bool make_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,

arch/x86/kvm/mmu/spte.h

Lines changed: 1 addition & 1 deletion
@@ -519,7 +519,7 @@ static inline u64 get_mmio_spte_generation(u64 spte)
 	return gen;
 }
 
-bool spte_has_volatile_bits(u64 spte);
+bool spte_needs_atomic_update(u64 spte);
 
 bool make_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
 	       const struct kvm_memory_slot *slot,

arch/x86/kvm/mmu/tdp_iter.h

Lines changed: 7 additions & 14 deletions
@@ -39,28 +39,21 @@ static inline void __kvm_tdp_mmu_write_spte(tdp_ptep_t sptep, u64 new_spte)
 }
 
 /*
- * SPTEs must be modified atomically if they are shadow-present, leaf
- * SPTEs, and have volatile bits, i.e. has bits that can be set outside
- * of mmu_lock.  The Writable bit can be set by KVM's fast page fault
- * handler, and Accessed and Dirty bits can be set by the CPU.
- *
- * Note, non-leaf SPTEs do have Accessed bits and those bits are
- * technically volatile, but KVM doesn't consume the Accessed bit of
- * non-leaf SPTEs, i.e. KVM doesn't care if it clobbers the bit.  This
- * logic needs to be reassessed if KVM were to use non-leaf Accessed
- * bits, e.g. to skip stepping down into child SPTEs when aging SPTEs.
+ * SPTEs must be modified atomically if they are shadow-present, leaf SPTEs,
+ * and have volatile bits (bits that can be set outside of mmu_lock) that
+ * must not be clobbered.
 */
-static inline bool kvm_tdp_mmu_spte_need_atomic_write(u64 old_spte, int level)
+static inline bool kvm_tdp_mmu_spte_need_atomic_update(u64 old_spte, int level)
 {
 	return is_shadow_present_pte(old_spte) &&
 	       is_last_spte(old_spte, level) &&
-	       spte_has_volatile_bits(old_spte);
+	       spte_needs_atomic_update(old_spte);
 }
 
 static inline u64 kvm_tdp_mmu_write_spte(tdp_ptep_t sptep, u64 old_spte,
 					 u64 new_spte, int level)
 {
-	if (kvm_tdp_mmu_spte_need_atomic_write(old_spte, level))
+	if (kvm_tdp_mmu_spte_need_atomic_update(old_spte, level))
 		return kvm_tdp_mmu_write_spte_atomic(sptep, new_spte);
 
 	__kvm_tdp_mmu_write_spte(sptep, new_spte);
@@ -70,7 +63,7 @@ static inline u64 kvm_tdp_mmu_write_spte(tdp_ptep_t sptep, u64 old_spte,
 static inline u64 tdp_mmu_clear_spte_bits(tdp_ptep_t sptep, u64 old_spte,
 					  u64 mask, int level)
 {
-	if (kvm_tdp_mmu_spte_need_atomic_write(old_spte, level))
+	if (kvm_tdp_mmu_spte_need_atomic_update(old_spte, level))
 		return tdp_mmu_clear_spte_bits_atomic(sptep, mask);
 
 	__kvm_tdp_mmu_write_spte(sptep, old_spte & ~mask);
