Commit 3890491

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:

 - Only do MSR filtering for MSRs accessed by rdmsr/wrmsr

 - Documentation improvements

 - Prevent module exit until all VMs are freed

 - PMU Virtualization fixes

 - Fix for kvm_irq_delivery_to_apic_fast() NULL-pointer dereferences

 - Other miscellaneous bugfixes

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (42 commits)
  KVM: x86: fix sending PV IPI
  KVM: x86/mmu: do compare-and-exchange of gPTE via the user address
  KVM: x86: Remove redundant vm_entry_controls_clearbit() call
  KVM: x86: cleanup enter_rmode()
  KVM: x86: SVM: fix tsc scaling when the host doesn't support it
  kvm: x86: SVM: remove unused defines
  KVM: x86: SVM: move tsc ratio definitions to svm.h
  KVM: x86: SVM: fix avic spec based definitions again
  KVM: MIPS: remove reference to trap&emulate virtualization
  KVM: x86: document limitations of MSR filtering
  KVM: x86: Only do MSR filtering when access MSR by rdmsr/wrmsr
  KVM: x86/emulator: Emulate RDPID only if it is enabled in guest
  KVM: x86/pmu: Fix and isolate TSX-specific performance event logic
  KVM: x86: mmu: trace kvm_mmu_set_spte after the new SPTE was set
  KVM: x86/svm: Clear reserved bits written to PerfEvtSeln MSRs
  KVM: x86: Trace all APICv inhibit changes and capture overall status
  KVM: x86: Add wrappers for setting/clearing APICv inhibits
  KVM: x86: Make APICv inhibit reasons an enum and cleanup naming
  KVM: X86: Handle implicit supervisor access with SMAP
  KVM: X86: Rename variable smap to not_smap in permission_fault()
  ...
2 parents 6f34f8c + c15e0ae commit 3890491

49 files changed, 617 additions and 414 deletions.

Documentation/virt/kvm/api.rst

Lines changed: 55 additions & 6 deletions
Original file line number | Diff line number | Diff line change
@@ -151,12 +151,6 @@ In order to create user controlled virtual machines on S390, check
151151
KVM_CAP_S390_UCONTROL and use the flag KVM_VM_S390_UCONTROL as
152152
privileged user (CAP_SYS_ADMIN).
153153

154-
To use hardware assisted virtualization on MIPS (VZ ASE) rather than
155-
the default trap & emulate implementation (which changes the virtual
156-
memory layout to fit in user mode), check KVM_CAP_MIPS_VZ and use the
157-
flag KVM_VM_MIPS_VZ.
158-
159-
160154
On arm64, the physical address size for a VM (IPA Size limit) is limited
161155
to 40bits by default. The limit can be configured if the host supports the
162156
extension KVM_CAP_ARM_VM_IPA_SIZE. When supported, use
@@ -4081,6 +4075,11 @@ x2APIC MSRs are always allowed, independent of the ``default_allow`` setting,
40814075
and their behavior depends on the ``X2APIC_ENABLE`` bit of the APIC base
40824076
register.
40834077

4078+
.. warning::
4079+
MSR accesses coming from nested vmentry/vmexit are not filtered.
4080+
This includes both writes to individual VMCS fields and reads/writes
4081+
through the MSR lists pointed to by the VMCS.
4082+
40844083
If a bit is within one of the defined ranges, read and write accesses are
40854084
guarded by the bitmap's value for the MSR index if the kind of access
40864085
is included in the ``struct kvm_msr_filter_range`` flags. If no range
@@ -5293,6 +5292,10 @@ type values:
52935292

52945293
KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO
52955294
Sets the guest physical address of the vcpu_info for a given vCPU.
5295+
As with the shared_info page for the VM, the corresponding page may be
5296+
dirtied at any time if event channel interrupt delivery is enabled, so
5297+
userspace should always assume that the page is dirty without relying
5298+
on dirty logging.
52965299

52975300
KVM_XEN_VCPU_ATTR_TYPE_VCPU_TIME_INFO
52985301
Sets the guest physical address of an additional pvclock structure
@@ -7719,3 +7722,49 @@ only be invoked on a VM prior to the creation of VCPUs.
77197722
At this time, KVM_PMU_CAP_DISABLE is the only capability. Setting
77207723
this capability will disable PMU virtualization for that VM. Usermode
77217724
should adjust CPUID leaf 0xA to reflect that the PMU is disabled.
7725+
7726+
9. Known KVM API problems
7727+
=========================
7728+
7729+
In some cases, KVM's API has some inconsistencies or common pitfalls
7730+
that userspace need to be aware of. This section details some of
7731+
these issues.
7732+
7733+
Most of them are architecture specific, so the section is split by
7734+
architecture.
7735+
7736+
9.1. x86
7737+
--------
7738+
7739+
``KVM_GET_SUPPORTED_CPUID`` issues
7740+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7741+
7742+
In general, ``KVM_GET_SUPPORTED_CPUID`` is designed so that it is possible
7743+
to take its result and pass it directly to ``KVM_SET_CPUID2``. This section
7744+
documents some cases in which that requires some care.
7745+
7746+
Local APIC features
7747+
~~~~~~~~~~~~~~~~~~~
7748+
7749+
CPU[EAX=1]:ECX[21] (X2APIC) is reported by ``KVM_GET_SUPPORTED_CPUID``,
7750+
but it can only be enabled if ``KVM_CREATE_IRQCHIP`` or
7751+
``KVM_ENABLE_CAP(KVM_CAP_IRQCHIP_SPLIT)`` are used to enable in-kernel emulation of
7752+
the local APIC.
7753+
7754+
The same is true for the ``KVM_FEATURE_PV_UNHALT`` paravirtualized feature.
7755+
7756+
CPU[EAX=1]:ECX[24] (TSC_DEADLINE) is not reported by ``KVM_GET_SUPPORTED_CPUID``.
7757+
It can be enabled if ``KVM_CAP_TSC_DEADLINE_TIMER`` is present and the kernel
7758+
has enabled in-kernel emulation of the local APIC.
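
Reviewer sketch (not part of the patch): the Local APIC notes above suggest that userspace may want to adjust CPUID leaf 1 between ``KVM_GET_SUPPORTED_CPUID`` and ``KVM_SET_CPUID2``. The following is a minimal illustration under stated assumptions. ``struct cpuid_entry`` is a simplified stand-in for ``struct kvm_cpuid_entry2``, and ``fixup_lapic_cpuid()`` and its two boolean parameters are hypothetical names. The caller is assumed to already know whether it enabled the in-kernel local APIC and whether ``KVM_CAP_TSC_DEADLINE_TIMER`` is present. ``KVM_FEATURE_PV_UNHALT`` is not handled here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct kvm_cpuid_entry2 from <linux/kvm.h>;
 * only the fields needed for this illustration are mirrored. */
struct cpuid_entry {
    uint32_t function;          /* CPUID leaf (EAX input) */
    uint32_t index;             /* CPUID subleaf (ECX input) */
    uint32_t eax, ebx, ecx, edx;
};

#define CPUID_1_ECX_X2APIC       (1u << 21)  /* CPUID[EAX=1]:ECX[21] */
#define CPUID_1_ECX_TSC_DEADLINE (1u << 24)  /* CPUID[EAX=1]:ECX[24] */

/*
 * Hypothetical helper: patch leaf 1 of a CPUID table obtained from
 * KVM_GET_SUPPORTED_CPUID before handing it to KVM_SET_CPUID2.
 *
 * in_kernel_lapic:      the VMM used KVM_CREATE_IRQCHIP or
 *                       KVM_CAP_IRQCHIP_SPLIT (assumed known by the caller).
 * has_tsc_deadline_cap: KVM_CAP_TSC_DEADLINE_TIMER was reported as present
 *                       (assumed known by the caller).
 */
static void fixup_lapic_cpuid(struct cpuid_entry *entries, size_t count,
                              bool in_kernel_lapic, bool has_tsc_deadline_cap)
{
    for (size_t i = 0; i < count; i++) {
        if (entries[i].function != 1)
            continue;

        if (!in_kernel_lapic) {
            /* X2APIC is reported but needs the in-kernel local APIC. */
            entries[i].ecx &= ~(CPUID_1_ECX_X2APIC | CPUID_1_ECX_TSC_DEADLINE);
        } else if (has_tsc_deadline_cap) {
            /* TSC_DEADLINE is never reported, so it is set by hand. */
            entries[i].ecx |= CPUID_1_ECX_TSC_DEADLINE;
        }
    }
}
```

A real VMM would run the same logic over the ``entries`` array of ``struct kvm_cpuid2``. The stand-in struct only keeps the example free of kernel headers.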
7759+
7760+
Obsolete ioctls and capabilities
7761+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7762+
7763+
KVM_CAP_DISABLE_QUIRKS does not let userspace know which quirks are actually
7764+
available. Use ``KVM_CHECK_EXTENSION(KVM_CAP_DISABLE_QUIRKS2)`` instead if
7765+
available.
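
Reviewer sketch (not part of the patch): one way userspace might use the value returned by ``KVM_CHECK_EXTENSION(KVM_CAP_DISABLE_QUIRKS2)``, assuming it is a bitmask of the quirks that can be disabled and that a non-positive value means the capability is unavailable. ``quirks_to_disable()`` is a hypothetical helper; the actual ioctl calls are left out so the example stays free of kernel headers.

```c
#include <stdint.h>

/*
 * Hypothetical helper: restrict the quirks a VMM would like to disable to
 * those the kernel says it can disable.
 *
 * supported: return value of KVM_CHECK_EXTENSION(KVM_CAP_DISABLE_QUIRKS2),
 *            assumed to be a bitmask of disableable quirks (<= 0 is treated
 *            as "capability not available").
 * wanted:    bitmask of quirks the VMM would like to disable.
 */
static uint64_t quirks_to_disable(int supported, uint64_t wanted)
{
    if (supported <= 0)
        return 0;

    return wanted & (uint64_t)supported;
}
```

The result would then be passed as the argument of ``KVM_ENABLE_CAP`` for the same capability. Masking first avoids requesting quirks that this kernel does not know about.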
7766+
7767+
Ordering of KVM_GET_*/KVM_SET_* ioctls
7768+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7769+
7770+
TBD

Documentation/virt/kvm/index.rst

Lines changed: 7 additions & 19 deletions
Original file line numberDiff line numberDiff line change
@@ -8,25 +8,13 @@ KVM
88
:maxdepth: 2
99

1010
api
11-
amd-memory-encryption
12-
cpuid
13-
halt-polling
14-
hypercalls
15-
locking
16-
mmu
17-
msr
18-
nested-vmx
19-
ppc-pv
20-
s390-diag
21-
s390-pv
22-
s390-pv-boot
23-
timekeeping
24-
vcpu-requests
25-
26-
review-checklist
11+
devices/index
2712

2813
arm/index
14+
s390/index
15+
ppc-pv
16+
x86/index
2917

30-
devices/index
31-
32-
running-nested-guests
18+
locking
19+
vcpu-requests
20+
review-checklist

Documentation/virt/kvm/locking.rst

Lines changed: 34 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -210,32 +210,47 @@ time it will be set using the Dirty tracking mechanism described above.
210210
3. Reference
211211
------------
212212

213-
:Name: kvm_lock
213+
``kvm_lock``
214+
^^^^^^^^^^^^
215+
214216
:Type: mutex
215217
:Arch: any
216218
:Protects: - vm_list
217219

218-
:Name: kvm_count_lock
220+
``kvm_count_lock``
221+
^^^^^^^^^^^^^^^^^^
222+
219223
:Type: raw_spinlock_t
220224
:Arch: any
221225
:Protects: - hardware virtualization enable/disable
222226
:Comment: 'raw' because hardware enabling/disabling must be atomic /wrt
223227
migration.
224228

225-
:Name: kvm_arch::tsc_write_lock
226-
:Type: raw_spinlock
229+
``kvm->mn_invalidate_lock``
230+
^^^^^^^^^^^^^^^^^^^^^^^^^^^
231+
232+
:Type: spinlock_t
233+
:Arch: any
234+
:Protects: mn_active_invalidate_count, mn_memslots_update_rcuwait
235+
236+
``kvm_arch::tsc_write_lock``
237+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
238+
239+
:Type: raw_spinlock_t
227240
:Arch: x86
228241
:Protects: - kvm_arch::{last_tsc_write,last_tsc_nsec,last_tsc_offset}
229242
- tsc offset in vmcb
230243
:Comment: 'raw' because updating the tsc offsets must not be preempted.
231244

232-
:Name: kvm->mmu_lock
233-
:Type: spinlock_t
245+
``kvm->mmu_lock``
246+
^^^^^^^^^^^^^^^^^
247+
:Type: spinlock_t or rwlock_t
234248
:Arch: any
235249
:Protects: -shadow page/shadow tlb entry
236250
:Comment: it is a spinlock since it is used in mmu notifier.
237251

238-
:Name: kvm->srcu
252+
``kvm->srcu``
253+
^^^^^^^^^^^^^
239254
:Type: srcu lock
240255
:Arch: any
241256
:Protects: - kvm->memslots
@@ -246,10 +261,20 @@ time it will be set using the Dirty tracking mechanism described above.
246261
The srcu index can be stored in kvm_vcpu->srcu_idx per vcpu
247262
if it is needed by multiple functions.
248263

249-
:Name: blocked_vcpu_on_cpu_lock
264+
``kvm->slots_arch_lock``
265+
^^^^^^^^^^^^^^^^^^^^^^^^
266+
:Type: mutex
267+
:Arch: any (only needed on x86 though)
268+
:Protects: any arch-specific fields of memslots that have to be modified
269+
in a ``kvm->srcu`` read-side critical section.
270+
:Comment: must be held before reading the pointer to the current memslots,
271+
until after all changes to the memslots are complete
272+
273+
``wakeup_vcpus_on_cpu_lock``
274+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
250275
:Type: spinlock_t
251276
:Arch: x86
252-
:Protects: blocked_vcpu_on_cpu
277+
:Protects: wakeup_vcpus_on_cpu
253278
:Comment: This is a per-CPU lock and it is used for VT-d posted-interrupts.
254279
When VT-d posted-interrupts is supported and the VM has assigned
255280
devices, we put the blocked vCPU on the list blocked_vcpu_on_cpu

Documentation/virt/kvm/s390/index.rst

Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,12 @@
1+
.. SPDX-License-Identifier: GPL-2.0
2+
3+
====================
4+
KVM for s390 systems
5+
====================
6+
7+
.. toctree::
8+
:maxdepth: 2
9+
10+
s390-diag
11+
s390-pv
12+
s390-pv-boot

Documentation/virt/kvm/vcpu-requests.rst

Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -135,6 +135,16 @@ KVM_REQ_UNHALT
135135
such as a pending signal, which does not indicate the VCPU's halt
136136
emulation should stop, and therefore does not make the request.
137137

138+
KVM_REQ_OUTSIDE_GUEST_MODE
139+
140+
This "request" ensures the target vCPU has exited guest mode prior to the
141+
sender of the request continuing on. No action needs be taken by the target,
142+
and so no request is actually logged for the target. This request is similar
143+
to a "kick", but unlike a kick it guarantees the vCPU has actually exited
144+
guest mode. A kick only guarantees the vCPU will exit at some point in the
145+
future, e.g. a previous kick may have started the process, but there's no
146+
guarantee the to-be-kicked vCPU has fully exited guest mode.
147+
138148
KVM_REQUEST_MASK
139149
----------------
140150
