Commit fadaf57

Merge tag 'kvm-x86-docs-6.7' of https://github.com/kvm-x86/linux into HEAD

KVM x86 Documentation updates for 6.7:

 - Fix various typos, notably a confusing reference to the non-existent
   "struct kvm_vcpu_event" (the actual structure is kvm_vcpu_events, plural).

 - Update x86's kvm_mmu_page documentation to bring it closer to the code
   (this raced with the removal of async zapping and so the documentation is
   already stale; my bad).

 - Document the behavior of x86 PMU filters on fixed counters.

2 parents f233646 + b35babd commit fadaf57

2 files changed: +61 -18 lines changed

Documentation/virt/kvm/api.rst

Lines changed: 27 additions & 9 deletions
@@ -547,7 +547,7 @@ ioctl is useful if the in-kernel PIC is not used.
 PPC:
 ^^^^
 
-Queues an external interrupt to be injected. This ioctl is overleaded
+Queues an external interrupt to be injected. This ioctl is overloaded
 with 3 different irq values:
 
 a) KVM_INTERRUPT_SET
@@ -998,7 +998,7 @@ be set in the flags field of this ioctl:
 The KVM_XEN_HVM_CONFIG_INTERCEPT_HCALL flag requests KVM to generate
 the contents of the hypercall page automatically; hypercalls will be
 intercepted and passed to userspace through KVM_EXIT_XEN. In this
-ase, all of the blob size and address fields must be zero.
+case, all of the blob size and address fields must be zero.
 
 The KVM_XEN_HVM_CONFIG_EVTCHN_SEND flag indicates to KVM that userspace
 will always use the KVM_XEN_HVM_EVTCHN_SEND ioctl to deliver event
@@ -1103,7 +1103,7 @@ Other flags returned by ``KVM_GET_CLOCK`` are accepted but ignored.
 :Extended by: KVM_CAP_INTR_SHADOW
 :Architectures: x86, arm64
 :Type: vcpu ioctl
-:Parameters: struct kvm_vcpu_event (out)
+:Parameters: struct kvm_vcpu_events (out)
 :Returns: 0 on success, -1 on error
 
 X86:
@@ -1226,7 +1226,7 @@ directly to the virtual CPU).
 :Extended by: KVM_CAP_INTR_SHADOW
 :Architectures: x86, arm64
 :Type: vcpu ioctl
-:Parameters: struct kvm_vcpu_event (in)
+:Parameters: struct kvm_vcpu_events (in)
 :Returns: 0 on success, -1 on error
 
 X86:
@@ -3115,7 +3115,7 @@ as follow::
 };
 
 An entry with a "page_shift" of 0 is unused. Because the array is
-organized in increasing order, a lookup can stop when encoutering
+organized in increasing order, a lookup can stop when encountering
 such an entry.
 
 The "slb_enc" field provides the encoding to use in the SLB for the
@@ -3507,7 +3507,7 @@ Possible features:
 - KVM_RUN and KVM_GET_REG_LIST are not available;
 
 - KVM_GET_ONE_REG and KVM_SET_ONE_REG cannot be used to access
-the scalable archietctural SVE registers
+the scalable architectural SVE registers
 KVM_REG_ARM64_SVE_ZREG(), KVM_REG_ARM64_SVE_PREG() or
 KVM_REG_ARM64_SVE_FFR;
 
@@ -4453,7 +4453,7 @@ This will have undefined effects on the guest if it has not already
 placed itself in a quiescent state where no vcpu will make MMU enabled
 memory accesses.
 
-On succsful completion, the pending HPT will become the guest's active
+On successful completion, the pending HPT will become the guest's active
 HPT and the previous HPT will be discarded.
 
 On failure, the guest will still be operating on its previous HPT.
@@ -5068,7 +5068,7 @@ before the vcpu is fully usable.
 
 Between KVM_ARM_VCPU_INIT and KVM_ARM_VCPU_FINALIZE, the feature may be
 configured by use of ioctls such as KVM_SET_ONE_REG. The exact configuration
-that should be performaned and how to do it are feature-dependent.
+that should be performed and how to do it are feature-dependent.
 
 Other calls that depend on a particular feature being finalized, such as
 KVM_RUN, KVM_GET_REG_LIST, KVM_GET_ONE_REG and KVM_SET_ONE_REG, will fail with
@@ -5176,6 +5176,24 @@ Valid values for 'action'::
 #define KVM_PMU_EVENT_ALLOW 0
 #define KVM_PMU_EVENT_DENY 1
 
+Via this API, KVM userspace can also control the behavior of the VM's fixed
+counters (if any) by configuring the "action" and "fixed_counter_bitmap" fields.
+
+Specifically, KVM follows the following pseudo-code when determining whether to
+allow the guest FixCtr[i] to count its pre-defined fixed event::
+
+FixCtr[i]_is_allowed = (action == ALLOW) && (bitmap & BIT(i)) ||
+(action == DENY) && !(bitmap & BIT(i));
+FixCtr[i]_is_denied = !FixCtr[i]_is_allowed;
+
+KVM always consumes fixed_counter_bitmap, it's userspace's responsibility to
+ensure fixed_counter_bitmap is set correctly, e.g. if userspace wants to define
+a filter that only affects general purpose counters.
+
+Note, the "events" field also applies to fixed counters' hardcoded event_select
+and unit_mask values. "fixed_counter_bitmap" has higher priority than "events"
+if there is a contradiction between the two.
+
 4.121 KVM_PPC_SVM_OFF
 ---------------------
 
@@ -5527,7 +5545,7 @@ KVM_XEN_ATTR_TYPE_EVTCHN
 from the guest. A given sending port number may be directed back to
 a specified vCPU (by APIC ID) / port / priority on the guest, or to
 trigger events on an eventfd. The vCPU and priority can be changed
-by setting KVM_XEN_EVTCHN_UPDATE in a subsequent call, but but other
+by setting KVM_XEN_EVTCHN_UPDATE in a subsequent call, but other
 fields cannot change for a given sending port. A port mapping is
 removed by using KVM_XEN_EVTCHN_DEASSIGN in the flags field. Passing
 KVM_XEN_EVTCHN_RESET in the flags field removes all interception of
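
Note on the fixed-counter text added to api.rst (the @@ -5176 hunk): the pseudo-code can be written out as a small C helper, which makes the edge cases easier to check. This is only a sketch. fixed_counter_is_allowed() is a hypothetical name, not a KVM function; the two action values are copied from the hunk's context lines; treating the bitmap as 32 bits wide is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Action values as shown in the context lines of the api.rst hunk. */
#define KVM_PMU_EVENT_ALLOW 0
#define KVM_PMU_EVENT_DENY  1

/*
 * Hypothetical helper (not part of KVM): mirrors the documented
 * pseudo-code for deciding whether fixed counter 'idx' may count.
 * Assumption: the bitmap is a 32-bit value.
 */
static bool fixed_counter_is_allowed(uint32_t action, uint32_t bitmap,
                                     unsigned int idx)
{
    bool in_bitmap;

    assert(idx < 32); /* stay within the assumed 32-bit bitmap */
    in_bitmap = (bitmap & (UINT32_C(1) << idx)) != 0;

    return (action == KVM_PMU_EVENT_ALLOW && in_bitmap) ||
           (action == KVM_PMU_EVENT_DENY && !in_bitmap);
}
```

Under this rule an ALLOW filter with fixed_counter_bitmap == 0 denies every fixed counter, while a DENY filter with fixed_counter_bitmap == 0 allows all of them. That is why the added text says userspace must set the bitmap deliberately even when it only cares about general purpose counters.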

Documentation/virt/kvm/x86/mmu.rst

Lines changed: 34 additions & 9 deletions
@@ -202,10 +202,22 @@ Shadow pages contain the following information:
 Is 1 if the MMU instance cannot use A/D bits. EPT did not have A/D
 bits before Haswell; shadow EPT page tables also cannot use A/D bits
 if the L1 hypervisor does not enable them.
+role.guest_mode:
+Indicates the shadow page is created for a nested guest.
 role.passthrough:
 The page is not backed by a guest page table, but its first entry
 points to one. This is set if NPT uses 5-level page tables (host
 CR4.LA57=1) and is shadowing L1's 4-level NPT (L1 CR4.LA57=0).
+mmu_valid_gen:
+The MMU generation of this page, used to fast zap of all MMU pages within a
+VM without blocking vCPUs too long. Specifically, KVM updates the per-VM
+valid MMU generation which causes the mismatch of mmu_valid_gen for each mmu
+page. This makes all existing MMU pages obsolete. Obsolete pages can't be
+used. Therefore, vCPUs must load a new, valid root before re-entering the
+guest. The MMU generation is only ever '0' or '1'. Note, the TDP MMU doesn't
+use this field as non-root TDP MMU pages are reachable only from their
+owning root. Thus it suffices for TDP MMU to use role.invalid in root pages
+to invalidate all MMU pages.
 gfn:
 Either the guest page table containing the translations shadowed by this
 page, or the base page frame for linear translations. See role.direct.
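
Note on the mmu_valid_gen text added in the hunk above: the generation scheme it describes can be modelled in a few lines of C. This is a sketch under stated assumptions, not the kernel implementation. The struct and function names (vm_sketch, shadow_page_sketch, page_is_obsolete, fast_zap_all) are hypothetical, and locking, role.invalid and the later freeing of obsolete pages are all left out.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the per-VM and per-page state. */
struct vm_sketch {
    unsigned int mmu_valid_gen; /* current valid generation: 0 or 1 */
};

struct shadow_page_sketch {
    unsigned int mmu_valid_gen; /* generation recorded at creation */
};

/* A page is obsolete once its generation no longer matches the VM's. */
static bool page_is_obsolete(const struct vm_sketch *vm,
                             const struct shadow_page_sketch *sp)
{
    return sp->mmu_valid_gen != vm->mmu_valid_gen;
}

/*
 * "Fast zap": flip the VM generation between 0 and 1. Every existing
 * page becomes obsolete in O(1), without touching the pages themselves.
 */
static void fast_zap_all(struct vm_sketch *vm)
{
    vm->mmu_valid_gen = vm->mmu_valid_gen ? 0 : 1;
}
```

Because only two generation values exist, this model relies on all obsolete pages being discarded before the generation is flipped again; otherwise a stale page would match the current generation once more.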
@@ -219,21 +231,30 @@ Shadow pages contain the following information:
 at __pa(sp2->spt). sp2 will point back at sp1 through parent_pte.
 The spt array forms a DAG structure with the shadow page as a node, and
 guest pages as leaves.
-gfns:
-An array of 512 guest frame numbers, one for each present pte. Used to
-perform a reverse map from a pte to a gfn. When role.direct is set, any
-element of this array can be calculated from the gfn field when used, in
-this case, the array of gfns is not allocated. See role.direct and gfn.
-root_count:
-A counter keeping track of how many hardware registers (guest cr3 or
-pdptrs) are now pointing at the page. While this counter is nonzero, the
-page cannot be destroyed. See role.invalid.
+shadowed_translation:
+An array of 512 shadow translation entries, one for each present pte. Used
+to perform a reverse map from a pte to a gfn as well as its access
+permission. When role.direct is set, the shadow_translation array is not
+allocated. This is because the gfn contained in any element of this array
+can be calculated from the gfn field when used. In addition, when
+role.direct is set, KVM does not track access permission for each of the
+gfn. See role.direct and gfn.
+root_count / tdp_mmu_root_count:
+root_count is a reference counter for root shadow pages in Shadow MMU.
+vCPUs elevate the refcount when getting a shadow page that will be used as
+a root page, i.e. page that will be loaded into hardware directly (CR3,
+PDPTRs, nCR3 EPTP). Root pages cannot be destroyed while their refcount is
+non-zero. See role.invalid. tdp_mmu_root_count is similar but exclusively
+used in TDP MMU as an atomic refcount.
 parent_ptes:
 The reverse mapping for the pte/ptes pointing at this page's spt. If
 parent_ptes bit 0 is zero, only one spte points at this page and
 parent_ptes points at this single spte, otherwise, there exists multiple
 sptes pointing at this page and (parent_ptes & ~0x1) points at a data
 structure with a list of parent sptes.
+ptep:
+The kernel virtual address of the SPTE that points at this shadow page.
+Used exclusively by the TDP MMU, this field is a union with parent_ptes.
 unsync:
 If true, then the translations in this page may not match the guest's
 translation. This is equivalent to the state of the tlb when a pte is
@@ -261,6 +282,10 @@ Shadow pages contain the following information:
 since the last time the page table was actually used; if emulation
 is triggered too frequently on this page, KVM will unmap the page
 to avoid emulation in the future.
+tdp_mmu_page:
+Is 1 if the shadow page is a TDP MMU page. This variable is used to
+bifurcate the control flows for KVM when walking any data structure that
+may contain pages from both TDP MMU and shadow MMU.
 
 Reverse map
 ===========
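
Note on the parent_ptes entry in mmu.rst, next to which this commit adds ptep: the bit-0 encoding it describes is a tagged pointer. The sketch below shows how such a value could be decoded. All names here are hypothetical and are not the kernel's. It assumes sptes are 64-bit values and that every pointer stored in the field is at least 2-byte aligned, so bit 0 is free to act as the tag.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 0 set: the value points at a list of parent sptes. */
static bool parent_ptes_is_list(uintptr_t parent_ptes)
{
    return (parent_ptes & 1) != 0;
}

/* Bit 0 clear: the value is itself a pointer to the single parent spte. */
static uint64_t *parent_ptes_single(uintptr_t parent_ptes)
{
    return (uint64_t *)parent_ptes;
}

/* Bit 0 set: mask the tag off to recover the list structure's address. */
static void *parent_ptes_list(uintptr_t parent_ptes)
{
    return (void *)(parent_ptes & ~(uintptr_t)1);
}
```

The point of the encoding is that the common case of a single parent needs no extra allocation; a separate list structure is only required once a second spte points at the page.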
