
Commit 4855215

Merge tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd
Pull iommufd updates from Jason Gunthorpe:
 "Two significant new items:

   - Allow reporting IOMMU HW events to userspace when the events are
     clearly linked to a device.

     This is linked to the VIOMMU object and is intended to be used by a
     VMM to forward HW events to the virtual machine as part of
     emulating a vIOMMU. ARM SMMUv3 is the first driver to use this
     mechanism. Like the existing fault events the data is delivered
     through a simple FD returning event records on read().

   - PASID support in VFIO.

     The "Process Address Space ID" is a PCI feature that allows the
     device to tag all PCI DMA operations with an ID. The IOMMU will
     then use the ID to select a unique translation for those DMAs. This
     is part of Intel's vIOMMU support as VT-d HW requires the
     hypervisor to manage each PASID entry.

     The support is generic so any VFIO user could attach any
     translation to a PASID, and the support should work on ARM SMMUv3
     as well. AMD requires additional driver work.

  Some minor updates, along with fixes:

   - Prevent using nested parents with faults, no driver support today

   - Put a single "cookie_type" value in the iommu_domain to indicate
     what owns the various opaque owner fields"

* tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd: (49 commits)
  iommufd: Test attach before detaching pasid
  iommufd: Fix iommu_vevent_header tables markup
  iommu: Convert unreachable() to BUG()
  iommufd: Balance veventq->num_events inc/dec
  iommufd: Initialize the flags of vevent in iommufd_viommu_report_event()
  iommufd/selftest: Add coverage for reporting max_pasid_log2 via IOMMU_HW_INFO
  iommufd: Extend IOMMU_GET_HW_INFO to report PASID capability
  vfio: VFIO_DEVICE_[AT|DE]TACH_IOMMUFD_PT support pasid
  vfio-iommufd: Support pasid [at|de]tach for physical VFIO devices
  ida: Add ida_find_first_range()
  iommufd/selftest: Add coverage for iommufd pasid attach/detach
  iommufd/selftest: Add test ops to test pasid attach/detach
  iommufd/selftest: Add a helper to get test device
  iommufd/selftest: Add set_dev_pasid in mock iommu
  iommufd: Allow allocating PASID-compatible domain
  iommu/vt-d: Add IOMMU_HWPT_ALLOC_PASID support
  iommufd: Enforce PASID-compatible domain for RID
  iommufd: Support pasid attach/replace
  iommufd: Enforce PASID-compatible domain in PASID path
  iommufd/device: Add pasid_attach array to track per-PASID attach
  ...
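
The message above says HW events reach userspace as "event records on read()". As a rough illustration of what a VMM-side consumer might do with one read() buffer, here is a minimal sketch. Everything in it is an assumption made for illustration: the names `vevent_header`, `VEVENT_FLAG_LOST_EVENTS`, `VEVENT_PAYLOAD_BYTES` and `parse_vevents` are hypothetical local stand-ins, the header is assumed to be two 32-bit fields (flags, sequence), a record flagged as "lost events" is assumed to carry no payload, and the payload size of 32 bytes simply mirrors four 64-bit event words. Check the real uAPI header before relying on any of this.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of a per-record header (assumed layout). */
struct vevent_header {
	uint32_t flags;
	uint32_t sequence;
};

#define VEVENT_FLAG_LOST_EVENTS	(1U << 0)	/* assumed: header only, no payload */
#define VEVENT_PAYLOAD_BYTES	32		/* assumed: four 64-bit event words */

struct vevent_stats {
	size_t events;		/* records that carried a payload */
	size_t lost_markers;	/* header-only "events were lost" records */
};

/*
 * Walk a buffer as it might be returned by a single read() on the event FD.
 * Returns 0 on success, -1 if the buffer ends in the middle of a record.
 */
static int parse_vevents(const uint8_t *buf, size_t len, struct vevent_stats *st)
{
	size_t off = 0;

	st->events = 0;
	st->lost_markers = 0;

	while (off < len) {
		struct vevent_header hdr;

		if (len - off < sizeof(hdr))
			return -1;
		memcpy(&hdr, buf + off, sizeof(hdr));
		off += sizeof(hdr);

		if (hdr.flags & VEVENT_FLAG_LOST_EVENTS) {
			st->lost_markers++;
			continue;
		}

		if (len - off < VEVENT_PAYLOAD_BYTES)
			return -1;
		/* A real VMM would decode and forward the payload here. */
		off += VEVENT_PAYLOAD_BYTES;
		st->events++;
	}
	return 0;
}
```

A caller would typically poll the FD, read() into a buffer, and hand the returned byte count to `parse_vevents()`; a gap in the `sequence` field or a lost-events marker would tell the VMM that the guest missed events.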
2 parents 792b830 + 7be11d3 commit 4855215

39 files changed: +3147 -829 lines changed

Documentation/userspace-api/iommufd.rst

Lines changed: 17 additions & 0 deletions
@@ -63,6 +63,13 @@ Following IOMMUFD objects are exposed to userspace:
   space usually has mappings from guest-level I/O virtual addresses to guest-
   level physical addresses.

+- IOMMUFD_FAULT, representing a software queue for an HWPT reporting IO page
+  faults using the IOMMU HW's PRI (Page Request Interface). This queue object
+  provides user space an FD to poll the page fault events and also to respond
+  to those events. A FAULT object must be created first to get a fault_id that
+  could be then used to allocate a fault-enabled HWPT via the IOMMU_HWPT_ALLOC
+  command by setting the IOMMU_HWPT_FAULT_ID_VALID bit in its flags field.
+
 - IOMMUFD_OBJ_VIOMMU, representing a slice of the physical IOMMU instance,
   passed to or shared with a VM. It may be some HW-accelerated virtualization
   features and some SW resources used by the VM. For examples:
@@ -109,6 +116,14 @@ Following IOMMUFD objects are exposed to userspace:
   vIOMMU, which is a separate ioctl call from attaching the same device to an
   HWPT_PAGING that the vIOMMU holds.

+- IOMMUFD_OBJ_VEVENTQ, representing a software queue for a vIOMMU to report its
+  events such as translation faults occurred to a nested stage-1 (excluding I/O
+  page faults that should go through IOMMUFD_OBJ_FAULT) and HW-specific events.
+  This queue object provides user space an FD to poll/read the vIOMMU events. A
+  vIOMMU object must be created first to get its viommu_id, which could be then
+  used to allocate a vEVENTQ. Each vIOMMU can support multiple types of vEVENTS,
+  but is confined to one vEVENTQ per vEVENTQ type.
+
 All user-visible objects are destroyed via the IOMMU_DESTROY uAPI.

 The diagrams below show relationships between user-visible objects and kernel
@@ -251,8 +266,10 @@ User visible objects are backed by following datastructures:
 - iommufd_device for IOMMUFD_OBJ_DEVICE.
 - iommufd_hwpt_paging for IOMMUFD_OBJ_HWPT_PAGING.
 - iommufd_hwpt_nested for IOMMUFD_OBJ_HWPT_NESTED.
+- iommufd_fault for IOMMUFD_OBJ_FAULT.
 - iommufd_viommu for IOMMUFD_OBJ_VIOMMU.
 - iommufd_vdevice for IOMMUFD_OBJ_VDEVICE.
+- iommufd_veventq for IOMMUFD_OBJ_VEVENTQ.

 Several terminologies when looking at these datastructures:

drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c

Lines changed: 60 additions & 0 deletions
@@ -43,6 +43,8 @@ static void arm_smmu_make_nested_cd_table_ste(
 	target->data[0] |= nested_domain->ste[0] &
 			   ~cpu_to_le64(STRTAB_STE_0_CFG);
 	target->data[1] |= nested_domain->ste[1];
+	/* Merge events for DoS mitigations on eventq */
+	target->data[1] |= cpu_to_le64(STRTAB_STE_1_MEV);
 }

 /*
@@ -85,6 +87,47 @@ static void arm_smmu_make_nested_domain_ste(
 	}
 }

+int arm_smmu_attach_prepare_vmaster(struct arm_smmu_attach_state *state,
+				    struct arm_smmu_nested_domain *nested_domain)
+{
+	struct arm_smmu_vmaster *vmaster;
+	unsigned long vsid;
+	int ret;
+
+	iommu_group_mutex_assert(state->master->dev);
+
+	ret = iommufd_viommu_get_vdev_id(&nested_domain->vsmmu->core,
+					 state->master->dev, &vsid);
+	if (ret)
+		return ret;
+
+	vmaster = kzalloc(sizeof(*vmaster), GFP_KERNEL);
+	if (!vmaster)
+		return -ENOMEM;
+	vmaster->vsmmu = nested_domain->vsmmu;
+	vmaster->vsid = vsid;
+	state->vmaster = vmaster;
+
+	return 0;
+}
+
+void arm_smmu_attach_commit_vmaster(struct arm_smmu_attach_state *state)
+{
+	struct arm_smmu_master *master = state->master;
+
+	mutex_lock(&master->smmu->streams_mutex);
+	kfree(master->vmaster);
+	master->vmaster = state->vmaster;
+	mutex_unlock(&master->smmu->streams_mutex);
+}
+
+void arm_smmu_master_clear_vmaster(struct arm_smmu_master *master)
+{
+	struct arm_smmu_attach_state state = { .master = master };
+
+	arm_smmu_attach_commit_vmaster(&state);
+}
+
 static int arm_smmu_attach_dev_nested(struct iommu_domain *domain,
 				      struct device *dev)
 {
@@ -392,4 +435,21 @@ struct iommufd_viommu *arm_vsmmu_alloc(struct device *dev,
 	return &vsmmu->core;
 }

+int arm_vmaster_report_event(struct arm_smmu_vmaster *vmaster, u64 *evt)
+{
+	struct iommu_vevent_arm_smmuv3 vevt;
+	int i;
+
+	lockdep_assert_held(&vmaster->vsmmu->smmu->streams_mutex);
+
+	vevt.evt[0] = cpu_to_le64((evt[0] & ~EVTQ_0_SID) |
+				  FIELD_PREP(EVTQ_0_SID, vmaster->vsid));
+	for (i = 1; i < EVTQ_ENT_DWORDS; i++)
+		vevt.evt[i] = cpu_to_le64(evt[i]);
+
+	return iommufd_viommu_report_event(&vmaster->vsmmu->core,
+					   IOMMU_VEVENTQ_TYPE_ARM_SMMUV3, &vevt,
+					   sizeof(vevt));
+}
+
 MODULE_IMPORT_NS("IOMMUFD");
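
In `arm_vmaster_report_event()` above, the physical StreamID in the first event word is swapped for the virtual one before the record is handed to userspace. The standalone sketch below shows that mask-and-replace step in plain C. It assumes the StreamID field occupies bits [63:32] of the first event word; the names `EVT0_SID_SHIFT`, `EVT0_SID_MASK` and `evt0_replace_sid` are hypothetical and stand in for the kernel's `EVTQ_0_SID` and `FIELD_PREP()`.

```c
#include <stdint.h>

/* Assumption: the StreamID field sits in bits [63:32] of event word 0. */
#define EVT0_SID_SHIFT	32
#define EVT0_SID_MASK	(0xffffffffULL << EVT0_SID_SHIFT)

/*
 * Clear the physical StreamID field and insert the virtual StreamID,
 * leaving every other bit of the event word untouched.
 */
static uint64_t evt0_replace_sid(uint64_t evt0, uint32_t vsid)
{
	uint64_t field = ((uint64_t)vsid << EVT0_SID_SHIFT) & EVT0_SID_MASK;

	return (evt0 & ~EVT0_SID_MASK) | field;
}
```

For example, `evt0_replace_sid(0x0000001200000010ULL, 0x5)` keeps the low word `0x00000010` and replaces the StreamID `0x12` with `0x5`. Only word 0 needs this treatment; in the patch the remaining words are copied through unchanged.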

drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c

Lines changed: 52 additions & 28 deletions
@@ -1052,7 +1052,7 @@ void arm_smmu_get_ste_used(const __le64 *ent, __le64 *used_bits)
 			cpu_to_le64(STRTAB_STE_1_S1DSS | STRTAB_STE_1_S1CIR |
 				    STRTAB_STE_1_S1COR | STRTAB_STE_1_S1CSH |
 				    STRTAB_STE_1_S1STALLD | STRTAB_STE_1_STRW |
-				    STRTAB_STE_1_EATS);
+				    STRTAB_STE_1_EATS | STRTAB_STE_1_MEV);
 		used_bits[2] |= cpu_to_le64(STRTAB_STE_2_S2VMID);

 		/*
@@ -1068,7 +1068,7 @@ void arm_smmu_get_ste_used(const __le64 *ent, __le64 *used_bits)
 	if (cfg & BIT(1)) {
 		used_bits[1] |=
 			cpu_to_le64(STRTAB_STE_1_S2FWB | STRTAB_STE_1_EATS |
-				    STRTAB_STE_1_SHCFG);
+				    STRTAB_STE_1_SHCFG | STRTAB_STE_1_MEV);
 		used_bits[2] |=
 			cpu_to_le64(STRTAB_STE_2_S2VMID | STRTAB_STE_2_VTCR |
 				    STRTAB_STE_2_S2AA64 | STRTAB_STE_2_S2ENDI |
@@ -1813,8 +1813,8 @@ static void arm_smmu_decode_event(struct arm_smmu_device *smmu, u64 *raw,
 	mutex_unlock(&smmu->streams_mutex);
 }

-static int arm_smmu_handle_event(struct arm_smmu_device *smmu,
-			       struct arm_smmu_event *event)
+static int arm_smmu_handle_event(struct arm_smmu_device *smmu, u64 *evt,
+				 struct arm_smmu_event *event)
 {
 	int ret = 0;
 	u32 perm = 0;
@@ -1823,6 +1823,10 @@ static int arm_smmu_handle_event(struct arm_smmu_device *smmu,
 	struct iommu_fault *flt = &fault_evt.fault;

 	switch (event->id) {
+	case EVT_ID_BAD_STE_CONFIG:
+	case EVT_ID_STREAM_DISABLED_FAULT:
+	case EVT_ID_BAD_SUBSTREAMID_CONFIG:
+	case EVT_ID_BAD_CD_CONFIG:
 	case EVT_ID_TRANSLATION_FAULT:
 	case EVT_ID_ADDR_SIZE_FAULT:
 	case EVT_ID_ACCESS_FAULT:
@@ -1832,31 +1836,30 @@ static int arm_smmu_handle_event(struct arm_smmu_device *smmu,
 		return -EOPNOTSUPP;
 	}

-	if (!event->stall)
-		return -EOPNOTSUPP;
-
-	if (event->read)
-		perm |= IOMMU_FAULT_PERM_READ;
-	else
-		perm |= IOMMU_FAULT_PERM_WRITE;
+	if (event->stall) {
+		if (event->read)
+			perm |= IOMMU_FAULT_PERM_READ;
+		else
+			perm |= IOMMU_FAULT_PERM_WRITE;

-	if (event->instruction)
-		perm |= IOMMU_FAULT_PERM_EXEC;
+		if (event->instruction)
+			perm |= IOMMU_FAULT_PERM_EXEC;

-	if (event->privileged)
-		perm |= IOMMU_FAULT_PERM_PRIV;
+		if (event->privileged)
+			perm |= IOMMU_FAULT_PERM_PRIV;

-	flt->type = IOMMU_FAULT_PAGE_REQ;
-	flt->prm = (struct iommu_fault_page_request) {
-		.flags = IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE,
-		.grpid = event->stag,
-		.perm = perm,
-		.addr = event->iova,
-	};
+		flt->type = IOMMU_FAULT_PAGE_REQ;
+		flt->prm = (struct iommu_fault_page_request){
+			.flags = IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE,
+			.grpid = event->stag,
+			.perm = perm,
+			.addr = event->iova,
+		};

-	if (event->ssv) {
-		flt->prm.flags |= IOMMU_FAULT_PAGE_REQUEST_PASID_VALID;
-		flt->prm.pasid = event->ssid;
+		if (event->ssv) {
+			flt->prm.flags |= IOMMU_FAULT_PAGE_REQUEST_PASID_VALID;
+			flt->prm.pasid = event->ssid;
+		}
 	}

 	mutex_lock(&smmu->streams_mutex);
@@ -1866,7 +1869,12 @@ static int arm_smmu_handle_event(struct arm_smmu_device *smmu,
 		goto out_unlock;
 	}

-	ret = iommu_report_device_fault(master->dev, &fault_evt);
+	if (event->stall)
+		ret = iommu_report_device_fault(master->dev, &fault_evt);
+	else if (master->vmaster && !event->s2)
+		ret = arm_vmaster_report_event(master->vmaster, evt);
+	else
+		ret = -EOPNOTSUPP; /* Unhandled events should be pinned */
 out_unlock:
 	mutex_unlock(&smmu->streams_mutex);
 	return ret;
@@ -1944,7 +1952,7 @@ static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev)
 	do {
 		while (!queue_remove_raw(q, evt)) {
 			arm_smmu_decode_event(smmu, evt, &event);
-			if (arm_smmu_handle_event(smmu, &event))
+			if (arm_smmu_handle_event(smmu, evt, &event))
 				arm_smmu_dump_event(smmu, evt, &event, &rs);

 			put_device(event.dev);
@@ -2803,6 +2811,7 @@ int arm_smmu_attach_prepare(struct arm_smmu_attach_state *state,
 	struct arm_smmu_domain *smmu_domain =
 		to_smmu_domain_devices(new_domain);
 	unsigned long flags;
+	int ret;

 	/*
 	 * arm_smmu_share_asid() must not see two domains pointing to the same
@@ -2832,9 +2841,18 @@ int arm_smmu_attach_prepare(struct arm_smmu_attach_state *state,
 	}

 	if (smmu_domain) {
+		if (new_domain->type == IOMMU_DOMAIN_NESTED) {
+			ret = arm_smmu_attach_prepare_vmaster(
+				state, to_smmu_nested_domain(new_domain));
+			if (ret)
+				return ret;
+		}
+
 		master_domain = kzalloc(sizeof(*master_domain), GFP_KERNEL);
-		if (!master_domain)
+		if (!master_domain) {
+			kfree(state->vmaster);
 			return -ENOMEM;
+		}
 		master_domain->master = master;
 		master_domain->ssid = state->ssid;
 		if (new_domain->type == IOMMU_DOMAIN_NESTED)
@@ -2861,6 +2879,7 @@ int arm_smmu_attach_prepare(struct arm_smmu_attach_state *state,
 			spin_unlock_irqrestore(&smmu_domain->devices_lock,
 					       flags);
 			kfree(master_domain);
+			kfree(state->vmaster);
 			return -EINVAL;
 		}

@@ -2893,6 +2912,8 @@ void arm_smmu_attach_commit(struct arm_smmu_attach_state *state)

 	lockdep_assert_held(&arm_smmu_asid_lock);

+	arm_smmu_attach_commit_vmaster(state);
+
 	if (state->ats_enabled && !master->ats_enabled) {
 		arm_smmu_enable_ats(master);
 	} else if (state->ats_enabled && master->ats_enabled) {
@@ -3162,6 +3183,7 @@ static int arm_smmu_attach_dev_identity(struct iommu_domain *domain,
 	struct arm_smmu_ste ste;
 	struct arm_smmu_master *master = dev_iommu_priv_get(dev);

+	arm_smmu_master_clear_vmaster(master);
 	arm_smmu_make_bypass_ste(master->smmu, &ste);
 	arm_smmu_attach_dev_ste(domain, dev, &ste, STRTAB_STE_1_S1DSS_BYPASS);
 	return 0;
@@ -3180,7 +3202,9 @@ static int arm_smmu_attach_dev_blocked(struct iommu_domain *domain,
 				       struct device *dev)
 {
 	struct arm_smmu_ste ste;
+	struct arm_smmu_master *master = dev_iommu_priv_get(dev);

+	arm_smmu_master_clear_vmaster(master);
 	arm_smmu_make_abort_ste(&ste);
 	arm_smmu_attach_dev_ste(domain, dev, &ste,
 				STRTAB_STE_1_S1DSS_TERMINATE);
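
The reworked tail of `arm_smmu_handle_event()` in this file picks one of three outcomes per event: stalled events go to the page-fault path, non-stalled stage-1 events on a master with a vmaster go to the vEVENTQ, and everything else returns an error so the caller dumps the event. The sketch below restates only that decision as a pure function so it can be read in isolation. The type and function names (`evt_info`, `evt_route`, `route_event`) are hypothetical, and the real function also builds the fault record and takes `streams_mutex`, which is left out here.

```c
#include <stdbool.h>

/* Hypothetical summary of the fields the routing decision looks at. */
struct evt_info {
	bool stall;		/* event stalled the transaction */
	bool s2;		/* event was raised by stage-2 translation */
	bool has_vmaster;	/* master is attached through a vIOMMU */
};

enum evt_route {
	ROUTE_FAULT,		/* report as an I/O page fault */
	ROUTE_VEVENT,		/* forward to the vIOMMU event queue */
	ROUTE_UNHANDLED,	/* error return; caller dumps the event */
};

/* Mirror of the if / else-if / else chain in the hunk above. */
static enum evt_route route_event(const struct evt_info *e)
{
	if (e->stall)
		return ROUTE_FAULT;
	if (e->has_vmaster && !e->s2)
		return ROUTE_VEVENT;
	return ROUTE_UNHANDLED;
}
```

The ordering matters: a stalled event is always treated as a page fault even when a vmaster is present, and a stage-2 event is never forwarded to the guest because stage 2 belongs to the hypervisor.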

drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h

Lines changed: 36 additions & 0 deletions
@@ -266,6 +266,7 @@ static inline u32 arm_smmu_strtab_l2_idx(u32 sid)
 #define STRTAB_STE_1_S1COR GENMASK_ULL(5, 4)
 #define STRTAB_STE_1_S1CSH GENMASK_ULL(7, 6)

+#define STRTAB_STE_1_MEV (1UL << 19)
 #define STRTAB_STE_1_S2FWB (1UL << 25)
 #define STRTAB_STE_1_S1STALLD (1UL << 27)

@@ -799,6 +800,11 @@ struct arm_smmu_stream {
 	struct rb_node node;
 };

+struct arm_smmu_vmaster {
+	struct arm_vsmmu *vsmmu;
+	unsigned long vsid;
+};
+
 struct arm_smmu_event {
 	u8 stall : 1,
 	ssv : 1,
@@ -824,6 +830,7 @@ struct arm_smmu_master {
 	struct arm_smmu_device *smmu;
 	struct device *dev;
 	struct arm_smmu_stream *streams;
+	struct arm_smmu_vmaster *vmaster; /* use smmu->streams_mutex */
 	/* Locked by the iommu core using the group mutex */
 	struct arm_smmu_ctx_desc_cfg cd_table;
 	unsigned int num_streams;
@@ -972,6 +979,7 @@ struct arm_smmu_attach_state {
 	bool disable_ats;
 	ioasid_t ssid;
 	/* Resulting state */
+	struct arm_smmu_vmaster *vmaster;
 	bool ats_enabled;
 };

@@ -1055,9 +1063,37 @@ struct iommufd_viommu *arm_vsmmu_alloc(struct device *dev,
 				       struct iommu_domain *parent,
 				       struct iommufd_ctx *ictx,
 				       unsigned int viommu_type);
+int arm_smmu_attach_prepare_vmaster(struct arm_smmu_attach_state *state,
+				    struct arm_smmu_nested_domain *nested_domain);
+void arm_smmu_attach_commit_vmaster(struct arm_smmu_attach_state *state);
+void arm_smmu_master_clear_vmaster(struct arm_smmu_master *master);
+int arm_vmaster_report_event(struct arm_smmu_vmaster *vmaster, u64 *evt);
 #else
 #define arm_smmu_hw_info NULL
 #define arm_vsmmu_alloc NULL
+
+static inline int
+arm_smmu_attach_prepare_vmaster(struct arm_smmu_attach_state *state,
+				struct arm_smmu_nested_domain *nested_domain)
+{
+	return 0;
+}
+
+static inline void
+arm_smmu_attach_commit_vmaster(struct arm_smmu_attach_state *state)
+{
+}
+
+static inline void
+arm_smmu_master_clear_vmaster(struct arm_smmu_master *master)
+{
+}
+
+static inline int arm_vmaster_report_event(struct arm_smmu_vmaster *vmaster,
+					   u64 *evt)
+{
+	return -EOPNOTSUPP;
+}
 #endif /* CONFIG_ARM_SMMU_V3_IOMMUFD */

 #endif /* _ARM_SMMU_V3_H */
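
The three vmaster helpers declared above follow a prepare/commit split: the step that can fail (allocation) happens in prepare, the commit step only swaps a pointer and frees the old object, and "clear" is a commit with nothing prepared. The userspace sketch below illustrates that shape under simplifying assumptions. All names are hypothetical stand-ins for the kernel structures, `calloc`/`free` replace `kzalloc`/`kfree`, and the locking is omitted: the kernel version holds `streams_mutex` around the swap, while this sketch is single-threaded.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures in this header. */
struct vmaster {
	unsigned long vsid;
};

struct master {
	struct vmaster *vmaster;	/* currently committed object, or NULL */
};

struct attach_state {
	struct master *master;
	struct vmaster *vmaster;	/* prepared object, not yet visible */
};

/* Prepare: do everything that can fail, but publish nothing yet. */
static int attach_prepare(struct attach_state *st, unsigned long vsid)
{
	struct vmaster *vm = calloc(1, sizeof(*vm));

	if (!vm)
		return -1;
	vm->vsid = vsid;
	st->vmaster = vm;
	return 0;
}

/* Commit: cannot fail; free the old object and publish the new one. */
static void attach_commit(struct attach_state *st)
{
	free(st->master->vmaster);
	st->master->vmaster = st->vmaster;
}

/* Clear: a commit whose prepared object is NULL. */
static void master_clear(struct master *m)
{
	struct attach_state st = { .master = m };

	attach_commit(&st);
}
```

Keeping the failure point in prepare means the commit path needs no error handling, which is why the error paths in `arm_smmu_attach_prepare()` only have to `kfree(state->vmaster)` when a later step fails.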
