
Commit 5da39dc

Merge tag 'drm-xe-next-fixes-2025-03-12' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next
Core Changes:
- Fix kernel-doc for gpusvm (Lucas)

Driver Changes:
- Drop duplicated pc_start call (Rodrigo)
- Drop sentinels from rtp (Lucas)
- Fix MOCS debugfs missing forcewake (Tvrtko)
- Ring flush invalidation (Tvrtko)
- Fix type for width alignment (Tvrtko)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/fsztfqcddrarwjlxjwm2k4wvc6u5vntceh6b7nsnxjmwzgtunj@sbkshjow65rf
2 parents: 64fc5dc + 7b7b07c

13 files changed, 122 insertions(+), 121 deletions(-)


Documentation/gpu/rfc/gpusvm.rst

Lines changed: 10 additions & 5 deletions
@@ -67,14 +67,19 @@ Agreed upon design principles
 Overview of baseline design
 ===========================
 
-Baseline design is simple as possible to get a working basline in which can be
-built upon.
-
-.. kernel-doc:: drivers/gpu/drm/xe/drm_gpusvm.c
+.. kernel-doc:: drivers/gpu/drm/drm_gpusvm.c
    :doc: Overview
+
+.. kernel-doc:: drivers/gpu/drm/drm_gpusvm.c
    :doc: Locking
-   :doc: Migrataion
+
+.. kernel-doc:: drivers/gpu/drm/drm_gpusvm.c
+   :doc: Migration
+
+.. kernel-doc:: drivers/gpu/drm/drm_gpusvm.c
    :doc: Partial Unmapping of Ranges
+
+.. kernel-doc:: drivers/gpu/drm/drm_gpusvm.c
    :doc: Examples
 
 Possible future design features

drivers/gpu/drm/drm_gpusvm.c

Lines changed: 69 additions & 55 deletions
@@ -23,37 +23,42 @@
  * DOC: Overview
  *
  * GPU Shared Virtual Memory (GPU SVM) layer for the Direct Rendering Manager (DRM)
- *
- * The GPU SVM layer is a component of the DRM framework designed to manage shared
- * virtual memory between the CPU and GPU. It enables efficient data exchange and
- * processing for GPU-accelerated applications by allowing memory sharing and
+ * is a component of the DRM framework designed to manage shared virtual memory
+ * between the CPU and GPU. It enables efficient data exchange and processing
+ * for GPU-accelerated applications by allowing memory sharing and
  * synchronization between the CPU's and GPU's virtual address spaces.
  *
  * Key GPU SVM Components:
- * - Notifiers: Notifiers: Used for tracking memory intervals and notifying the
- *              GPU of changes, notifiers are sized based on a GPU SVM
- *              initialization parameter, with a recommendation of 512M or
- *              larger. They maintain a Red-BlacK tree and a list of ranges that
- *              fall within the notifier interval. Notifiers are tracked within
- *              a GPU SVM Red-BlacK tree and list and are dynamically inserted
- *              or removed as ranges within the interval are created or
- *              destroyed.
- * - Ranges: Represent memory ranges mapped in a DRM device and managed
- *           by GPU SVM. They are sized based on an array of chunk sizes, which
- *           is a GPU SVM initialization parameter, and the CPU address space.
- *           Upon GPU fault, the largest aligned chunk that fits within the
- *           faulting CPU address space is chosen for the range size. Ranges are
- *           expected to be dynamically allocated on GPU fault and removed on an
- *           MMU notifier UNMAP event. As mentioned above, ranges are tracked in
- *           a notifier's Red-Black tree.
- * - Operations: Define the interface for driver-specific GPU SVM operations
- *               such as range allocation, notifier allocation, and
- *               invalidations.
- * - Device Memory Allocations: Embedded structure containing enough information
- *                              for GPU SVM to migrate to / from device memory.
- * - Device Memory Operations: Define the interface for driver-specific device
- *                             memory operations release memory, populate pfns,
- *                             and copy to / from device memory.
+ *
+ * - Notifiers:
+ *     Used for tracking memory intervals and notifying the GPU of changes,
+ *     notifiers are sized based on a GPU SVM initialization parameter, with a
+ *     recommendation of 512M or larger. They maintain a Red-BlacK tree and a
+ *     list of ranges that fall within the notifier interval. Notifiers are
+ *     tracked within a GPU SVM Red-BlacK tree and list and are dynamically
+ *     inserted or removed as ranges within the interval are created or
+ *     destroyed.
+ * - Ranges:
+ *     Represent memory ranges mapped in a DRM device and managed by GPU SVM.
+ *     They are sized based on an array of chunk sizes, which is a GPU SVM
+ *     initialization parameter, and the CPU address space. Upon GPU fault,
+ *     the largest aligned chunk that fits within the faulting CPU address
+ *     space is chosen for the range size. Ranges are expected to be
+ *     dynamically allocated on GPU fault and removed on an MMU notifier UNMAP
+ *     event. As mentioned above, ranges are tracked in a notifier's Red-Black
+ *     tree.
+ *
+ * - Operations:
+ *     Define the interface for driver-specific GPU SVM operations such as
+ *     range allocation, notifier allocation, and invalidations.
+ *
+ * - Device Memory Allocations:
+ *     Embedded structure containing enough information for GPU SVM to migrate
+ *     to / from device memory.
+ *
+ * - Device Memory Operations:
+ *     Define the interface for driver-specific device memory operations
+ *     release memory, populate pfns, and copy to / from device memory.
  *
  * This layer provides interfaces for allocating, mapping, migrating, and
  * releasing memory ranges between the CPU and GPU. It handles all core memory
@@ -63,14 +68,18 @@
  * below.
  *
  * Expected Driver Components:
- * - GPU page fault handler: Used to create ranges and notifiers based on the
- *                           fault address, optionally migrate the range to
- *                           device memory, and create GPU bindings.
- * - Garbage collector: Used to unmap and destroy GPU bindings for ranges.
- *                      Ranges are expected to be added to the garbage collector
- *                      upon a MMU_NOTIFY_UNMAP event in notifier callback.
- * - Notifier callback: Used to invalidate and DMA unmap GPU bindings for
- *                      ranges.
+ *
+ * - GPU page fault handler:
+ *     Used to create ranges and notifiers based on the fault address,
+ *     optionally migrate the range to device memory, and create GPU bindings.
+ *
+ * - Garbage collector:
+ *     Used to unmap and destroy GPU bindings for ranges. Ranges are expected
+ *     to be added to the garbage collector upon a MMU_NOTIFY_UNMAP event in
+ *     notifier callback.
+ *
+ * - Notifier callback:
+ *     Used to invalidate and DMA unmap GPU bindings for ranges.
  */
 
 /**
@@ -83,9 +92,9 @@
  * range RB tree and list, as well as the range's DMA mappings and sequence
  * number. GPU SVM manages all necessary locking and unlocking operations,
  * except for the recheck range's pages being valid
- * (drm_gpusvm_range_pages_valid) when the driver is committing GPU bindings. This
- * lock corresponds to the 'driver->update' lock mentioned in the HMM
- * documentation (TODO: Link). Future revisions may transition from a GPU SVM
+ * (drm_gpusvm_range_pages_valid) when the driver is committing GPU bindings.
+ * This lock corresponds to the ``driver->update`` lock mentioned in
+ * Documentation/mm/hmm.rst. Future revisions may transition from a GPU SVM
  * global lock to a per-notifier lock if finer-grained locking is deemed
  * necessary.
  *
@@ -102,11 +111,11 @@
  * DOC: Migration
  *
  * The migration support is quite simple, allowing migration between RAM and
- * device memory at the range granularity. For example, GPU SVM currently does not
- * support mixing RAM and device memory pages within a range. This means that upon GPU
- * fault, the entire range can be migrated to device memory, and upon CPU fault, the
- * entire range is migrated to RAM. Mixed RAM and device memory storage within a range
- * could be added in the future if required.
+ * device memory at the range granularity. For example, GPU SVM currently does
+ * not support mixing RAM and device memory pages within a range. This means
+ * that upon GPU fault, the entire range can be migrated to device memory, and
+ * upon CPU fault, the entire range is migrated to RAM. Mixed RAM and device
+ * memory storage within a range could be added in the future if required.
  *
  * The reasoning for only supporting range granularity is as follows: it
  * simplifies the implementation, and range sizes are driver-defined and should
@@ -119,11 +128,11 @@
  * Partial unmapping of ranges (e.g., 1M out of 2M is unmapped by CPU resulting
  * in MMU_NOTIFY_UNMAP event) presents several challenges, with the main one
  * being that a subset of the range still has CPU and GPU mappings. If the
- * backing store for the range is in device memory, a subset of the backing store has
- * references. One option would be to split the range and device memory backing store,
- * but the implementation for this would be quite complicated. Given that
- * partial unmappings are rare and driver-defined range sizes are relatively
- * small, GPU SVM does not support splitting of ranges.
+ * backing store for the range is in device memory, a subset of the backing
+ * store has references. One option would be to split the range and device
+ * memory backing store, but the implementation for this would be quite
+ * complicated. Given that partial unmappings are rare and driver-defined range
+ * sizes are relatively small, GPU SVM does not support splitting of ranges.
  *
  * With no support for range splitting, upon partial unmapping of a range, the
  * driver is expected to invalidate and destroy the entire range. If the range
@@ -144,6 +153,8 @@
  *
  * 1) GPU page fault handler
  *
+ * .. code-block:: c
+ *
  *     int driver_bind_range(struct drm_gpusvm *gpusvm, struct drm_gpusvm_range *range)
  *     {
  *             int err = 0;
@@ -208,7 +219,9 @@
  *             return err;
  *     }
  *
- * 2) Garbage Collector.
+ * 2) Garbage Collector
+ *
+ * .. code-block:: c
  *
  *     void __driver_garbage_collector(struct drm_gpusvm *gpusvm,
  *                                     struct drm_gpusvm_range *range)
@@ -231,7 +244,9 @@
  *                     __driver_garbage_collector(gpusvm, range);
  *     }
  *
- * 3) Notifier callback.
+ * 3) Notifier callback
+ *
+ * .. code-block:: c
  *
  *     void driver_invalidation(struct drm_gpusvm *gpusvm,
  *                              struct drm_gpusvm_notifier *notifier,
@@ -499,7 +514,7 @@ drm_gpusvm_notifier_invalidate(struct mmu_interval_notifier *mni,
 	return true;
 }
 
-/**
+/*
  * drm_gpusvm_notifier_ops - MMU interval notifier operations for GPU SVM
  */
 static const struct mmu_interval_notifier_ops drm_gpusvm_notifier_ops = {
@@ -2055,7 +2070,6 @@ static int __drm_gpusvm_migrate_to_ram(struct vm_area_struct *vas,
 
 /**
  * drm_gpusvm_range_evict - Evict GPU SVM range
- * @pagemap: Pointer to the GPU SVM structure
  * @range: Pointer to the GPU SVM range to be removed
  *
  * This function evicts the specified GPU SVM range. This function will not
@@ -2146,8 +2160,8 @@ static vm_fault_t drm_gpusvm_migrate_to_ram(struct vm_fault *vmf)
 	return err ? VM_FAULT_SIGBUS : 0;
 }
 
-/**
- * drm_gpusvm_pagemap_ops() - Device page map operations for GPU SVM
+/*
+ * drm_gpusvm_pagemap_ops - Device page map operations for GPU SVM
  */
 static const struct dev_pagemap_ops drm_gpusvm_pagemap_ops = {
 	.page_free = drm_gpusvm_page_free,

drivers/gpu/drm/xe/display/xe_fb_pin.c

Lines changed: 10 additions & 10 deletions
@@ -82,7 +82,7 @@ write_dpt_remapped(struct xe_bo *bo, struct iosys_map *map, u32 *dpt_ofs,
 static int __xe_pin_fb_vma_dpt(const struct intel_framebuffer *fb,
 			       const struct i915_gtt_view *view,
 			       struct i915_vma *vma,
-			       u64 physical_alignment)
+			       unsigned int alignment)
 {
 	struct xe_device *xe = to_xe_device(fb->base.dev);
 	struct xe_tile *tile0 = xe_device_get_root_tile(xe);
@@ -108,23 +108,23 @@ static int __xe_pin_fb_vma_dpt(const struct intel_framebuffer *fb,
 					 XE_BO_FLAG_VRAM0 |
 					 XE_BO_FLAG_GGTT |
 					 XE_BO_FLAG_PAGETABLE,
-					 physical_alignment);
+					 alignment);
 	else
 		dpt = xe_bo_create_pin_map_at_aligned(xe, tile0, NULL,
 						      dpt_size, ~0ull,
 						      ttm_bo_type_kernel,
 						      XE_BO_FLAG_STOLEN |
 						      XE_BO_FLAG_GGTT |
 						      XE_BO_FLAG_PAGETABLE,
-						      physical_alignment);
+						      alignment);
 	if (IS_ERR(dpt))
 		dpt = xe_bo_create_pin_map_at_aligned(xe, tile0, NULL,
 						      dpt_size, ~0ull,
 						      ttm_bo_type_kernel,
 						      XE_BO_FLAG_SYSTEM |
 						      XE_BO_FLAG_GGTT |
 						      XE_BO_FLAG_PAGETABLE,
-						      physical_alignment);
+						      alignment);
 	if (IS_ERR(dpt))
 		return PTR_ERR(dpt);
 
@@ -194,7 +194,7 @@ write_ggtt_rotated(struct xe_bo *bo, struct xe_ggtt *ggtt, u32 *ggtt_ofs, u32 bo
 static int __xe_pin_fb_vma_ggtt(const struct intel_framebuffer *fb,
 				const struct i915_gtt_view *view,
 				struct i915_vma *vma,
-				u64 physical_alignment)
+				unsigned int alignment)
 {
 	struct drm_gem_object *obj = intel_fb_bo(&fb->base);
 	struct xe_bo *bo = gem_to_xe_bo(obj);
@@ -277,7 +277,7 @@ static int __xe_pin_fb_vma_ggtt(const struct intel_framebuffer *fb,
 
 static struct i915_vma *__xe_pin_fb_vma(const struct intel_framebuffer *fb,
 					const struct i915_gtt_view *view,
-					u64 physical_alignment)
+					unsigned int alignment)
 {
 	struct drm_device *dev = fb->base.dev;
 	struct xe_device *xe = to_xe_device(dev);
@@ -327,9 +327,9 @@ static struct i915_vma *__xe_pin_fb_vma(const struct intel_framebuffer *fb,
 
 	vma->bo = bo;
 	if (intel_fb_uses_dpt(&fb->base))
-		ret = __xe_pin_fb_vma_dpt(fb, view, vma, physical_alignment);
+		ret = __xe_pin_fb_vma_dpt(fb, view, vma, alignment);
 	else
-		ret = __xe_pin_fb_vma_ggtt(fb, view, vma, physical_alignment);
+		ret = __xe_pin_fb_vma_ggtt(fb, view, vma, alignment);
 	if (ret)
 		goto err_unpin;
 
@@ -422,15 +422,15 @@ int intel_plane_pin_fb(struct intel_plane_state *new_plane_state,
 	struct i915_vma *vma;
 	struct intel_framebuffer *intel_fb = to_intel_framebuffer(fb);
 	struct intel_plane *plane = to_intel_plane(new_plane_state->uapi.plane);
-	u64 phys_alignment = plane->min_alignment(plane, fb, 0);
+	unsigned int alignment = plane->min_alignment(plane, fb, 0);
 
 	if (reuse_vma(new_plane_state, old_plane_state))
 		return 0;
 
 	/* We reject creating !SCANOUT fb's, so this is weird.. */
 	drm_WARN_ON(bo->ttm.base.dev, !(bo->flags & XE_BO_FLAG_SCANOUT));
 
-	vma = __xe_pin_fb_vma(intel_fb, &new_plane_state->view.gtt, phys_alignment);
+	vma = __xe_pin_fb_vma(intel_fb, &new_plane_state->view.gtt, alignment);
 
 	if (IS_ERR(vma))
 		return PTR_ERR(vma);

drivers/gpu/drm/xe/tests/xe_rtp_test.c

Lines changed: 1 addition & 1 deletion
@@ -320,7 +320,7 @@ static void xe_rtp_process_to_sr_tests(struct kunit *test)
 		count_rtp_entries++;
 
 	xe_rtp_process_ctx_enable_active_tracking(&ctx, &active, count_rtp_entries);
-	xe_rtp_process_to_sr(&ctx, param->entries, reg_sr);
+	xe_rtp_process_to_sr(&ctx, param->entries, count_rtp_entries, reg_sr);
 
 	xa_for_each(&reg_sr->xa, idx, sre) {
 		if (idx == param->expected_reg.addr)

drivers/gpu/drm/xe/xe_guc.c

Lines changed: 0 additions & 8 deletions
@@ -1496,14 +1496,6 @@ void xe_guc_stop(struct xe_guc *guc)
 
 int xe_guc_start(struct xe_guc *guc)
 {
-	if (!IS_SRIOV_VF(guc_to_xe(guc))) {
-		int err;
-
-		err = xe_guc_pc_start(&guc->pc);
-		xe_gt_WARN(guc_to_gt(guc), err, "Failed to start GuC PC: %pe\n",
-			   ERR_PTR(err));
-	}
-
 	return xe_guc_submit_start(guc);
 }
 

drivers/gpu/drm/xe/xe_hw_engine.c

Lines changed: 2 additions & 4 deletions
@@ -400,10 +400,9 @@ xe_hw_engine_setup_default_lrc_state(struct xe_hw_engine *hwe)
 					   PREEMPT_GPGPU_THREAD_GROUP_LEVEL)),
 		  XE_RTP_ENTRY_FLAG(FOREACH_ENGINE)
 		},
-		{}
 	};
 
-	xe_rtp_process_to_sr(&ctx, lrc_setup, &hwe->reg_lrc);
+	xe_rtp_process_to_sr(&ctx, lrc_setup, ARRAY_SIZE(lrc_setup), &hwe->reg_lrc);
 }
 
 static void
@@ -459,10 +458,9 @@ hw_engine_setup_default_state(struct xe_hw_engine *hwe)
 		  XE_RTP_ACTIONS(SET(CSFE_CHICKEN1(0), CS_PRIORITY_MEM_READ,
 				     XE_RTP_ACTION_FLAG(ENGINE_BASE)))
 		},
-		{}
 	};
 
-	xe_rtp_process_to_sr(&ctx, engine_entries, &hwe->reg_sr);
+	xe_rtp_process_to_sr(&ctx, engine_entries, ARRAY_SIZE(engine_entries), &hwe->reg_sr);
 }
 
 static const struct engine_info *find_engine_info(enum xe_engine_class class, int instance)
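The xe_rtp_test.c hunk above and the xe_hw_engine.c hunks here (as well as the xe_reg_whitelist.c hunk below) all follow the same "drop sentinels from rtp" pattern: xe_rtp_process_to_sr() now takes the number of table entries explicitly, so RTP tables no longer need a trailing empty {} sentinel. A minimal sketch of the new calling convention, using a hypothetical table name (example_entries) with the rule/action contents elided:

static const struct xe_rtp_entry_sr example_entries[] = {
	{ /* XE_RTP name, rules and actions, as in the hunks above */ },
	/* no trailing {} sentinel required any more */
};

static void example_process(struct xe_hw_engine *hwe)
{
	struct xe_rtp_process_ctx ctx = XE_RTP_PROCESS_CTX_INITIALIZER(hwe);

	/* The table length is passed explicitly instead of being discovered
	 * by walking the table until an empty sentinel entry is hit.
	 */
	xe_rtp_process_to_sr(&ctx, example_entries, ARRAY_SIZE(example_entries),
			     &hwe->reg_sr);
}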

drivers/gpu/drm/xe/xe_mocs.c

Lines changed: 3 additions & 1 deletion
@@ -781,7 +781,9 @@ void xe_mocs_dump(struct xe_gt *gt, struct drm_printer *p)
 	flags = get_mocs_settings(xe, &table);
 
 	xe_pm_runtime_get_noresume(xe);
-	fw_ref = xe_force_wake_get(gt_to_fw(gt), XE_FW_GT);
+	fw_ref = xe_force_wake_get(gt_to_fw(gt),
+				   flags & HAS_LNCF_MOCS ?
+				   XE_FORCEWAKE_ALL : XE_FW_GT);
 	if (!fw_ref)
 		goto err_fw;
 
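This hunk widens the forcewake domain taken while dumping the MOCS tables through debugfs: on platforms with LNCF MOCS registers (flags & HAS_LNCF_MOCS), XE_FORCEWAKE_ALL is requested instead of the GT-only XE_FW_GT domain, presumably because the LNCF registers are not covered by GT forcewake alone. A rough sketch of the resulting acquire/dump/release flow, assuming xe_force_wake_put() as the matching release call (it is not part of this hunk):

	fw_ref = xe_force_wake_get(gt_to_fw(gt),
				   flags & HAS_LNCF_MOCS ?
				   XE_FORCEWAKE_ALL : XE_FW_GT);
	if (!fw_ref)
		goto err_fw;	/* could not wake the required domains */

	/* ... read and print the global and, if present, LNCF MOCS tables ... */

	xe_force_wake_put(gt_to_fw(gt), fw_ref);	/* assumed counterpart */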

drivers/gpu/drm/xe/xe_reg_whitelist.c

Lines changed: 2 additions & 2 deletions
@@ -88,7 +88,6 @@ static const struct xe_rtp_entry_sr register_whitelist[] = {
 					     RING_FORCE_TO_NONPRIV_ACCESS_RD |
 					     RING_FORCE_TO_NONPRIV_RANGE_4))
 	},
-	{}
 };
 
 static void whitelist_apply_to_hwe(struct xe_hw_engine *hwe)
@@ -137,7 +136,8 @@ void xe_reg_whitelist_process_engine(struct xe_hw_engine *hwe)
 {
 	struct xe_rtp_process_ctx ctx = XE_RTP_PROCESS_CTX_INITIALIZER(hwe);
 
-	xe_rtp_process_to_sr(&ctx, register_whitelist, &hwe->reg_whitelist);
+	xe_rtp_process_to_sr(&ctx, register_whitelist, ARRAY_SIZE(register_whitelist),
+			     &hwe->reg_whitelist);
 	whitelist_apply_to_hwe(hwe);
 }
 
