Commit baf4afc

Author: Philipp Stanner
drm/sched: Improve teardown documentation
If jobs are still enqueued in struct drm_gpu_scheduler.pending_list when
drm_sched_fini() gets called, those jobs will be leaked, since that function
stops both job submission and (automatic) job cleanup. It is thus up to the
driver to take care of preventing leaks.

The related function drm_sched_wqueue_stop() also prevents automatic job
cleanup. Those pitfalls are currently not reflected in the documentation.

Explicitly inform about the leak problem in the docstring of drm_sched_fini().
Additionally, detail the purpose of drm_sched_wqueue_{start,stop} and hint at
the consequences for automatic cleanup.

Signed-off-by: Philipp Stanner <pstanner@redhat.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241105143137.71893-2-pstanner@redhat.com
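As an illustration of option a) from the new docstring, a driver can refuse to
tear the scheduler down until its free_job() callback has run for every
submitted job. The following is a minimal sketch, not part of the commit:
struct my_device, jobs_in_flight, teardown_wq, my_free_job() and
my_device_fini() are invented names for this example; only the drm_sched_*()
and generic kernel calls are real API.

#include <linux/atomic.h>
#include <linux/container_of.h>
#include <linux/slab.h>
#include <linux/wait.h>

#include <drm/gpu_scheduler.h>

struct my_device {
	struct drm_gpu_scheduler sched;
	atomic_t jobs_in_flight;	/* incremented on job submission */
	wait_queue_head_t teardown_wq;	/* woken when the last job is freed */
};

/* Hypothetical drm_sched_backend_ops.free_job() callback. */
static void my_free_job(struct drm_sched_job *job)
{
	struct my_device *mdev = container_of(job->sched,
					      struct my_device, sched);

	drm_sched_job_cleanup(job);
	kfree(job);	/* assumes the job was kmalloc()ed directly */

	if (atomic_dec_and_test(&mdev->jobs_in_flight))
		wake_up(&mdev->teardown_wq);
}

static void my_device_fini(struct my_device *mdev)
{
	/* Option a): only call drm_sched_fini() once all jobs are freed. */
	wait_event(mdev->teardown_wq,
		   atomic_read(&mdev->jobs_in_flight) == 0);

	/* pending_list is now empty, so nothing can leak. */
	drm_sched_fini(&mdev->sched);
}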
1 parent: 21c23e4

File tree: 1 file changed, +21 -2 lines

drivers/gpu/drm/scheduler/sched_main.c

Lines changed: 21 additions & 2 deletions

@@ -1350,6 +1350,19 @@ EXPORT_SYMBOL(drm_sched_init);
  * @sched: scheduler instance
  *
  * Tears down and cleans up the scheduler.
+ *
+ * This stops submission of new jobs to the hardware through
+ * drm_sched_backend_ops.run_job(). Consequently, drm_sched_backend_ops.free_job()
+ * will not be called for all jobs still in drm_gpu_scheduler.pending_list.
+ * There is no solution for this currently. Thus, it is up to the driver to make
+ * sure that
+ *  a) drm_sched_fini() is only called after for all submitted jobs
+ *     drm_sched_backend_ops.free_job() has been called or that
+ *  b) the jobs for which drm_sched_backend_ops.free_job() has not been called
+ *     after drm_sched_fini() ran are freed manually.
+ *
+ * FIXME: Take care of the above problem and prevent this function from leaking
+ * the jobs in drm_gpu_scheduler.pending_list under any circumstances.
  */
 void drm_sched_fini(struct drm_gpu_scheduler *sched)
 {
@@ -1445,8 +1458,10 @@ EXPORT_SYMBOL(drm_sched_wqueue_ready);
 
 /**
  * drm_sched_wqueue_stop - stop scheduler submission
- *
  * @sched: scheduler instance
+ *
+ * Stops the scheduler from pulling new jobs from entities. It also stops
+ * freeing jobs automatically through drm_sched_backend_ops.free_job().
  */
 void drm_sched_wqueue_stop(struct drm_gpu_scheduler *sched)
 {
@@ -1458,8 +1473,12 @@ EXPORT_SYMBOL(drm_sched_wqueue_stop);
 
 /**
  * drm_sched_wqueue_start - start scheduler submission
- *
  * @sched: scheduler instance
+ *
+ * Restarts the scheduler after drm_sched_wqueue_stop() has stopped it.
+ *
+ * This function is not necessary for 'conventional' startup. The scheduler is
+ * fully operational after drm_sched_init() succeeded.
  */
 void drm_sched_wqueue_start(struct drm_gpu_scheduler *sched)
 {
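The wqueue helpers documented above are typically wrapped around a device
reset. A sketch in the same hypothetical driver as before (my_hw_reset() is a
stand-in for driver-specific reset code, not scheduler API): note that while
the scheduler is stopped, finished jobs accumulate on pending_list and are
only reclaimed through free_job() after drm_sched_wqueue_start().

static int my_device_reset(struct my_device *mdev)
{
	int ret;

	/*
	 * Pause the scheduler: no new jobs are pulled from entities and,
	 * per the docstring above, automatic free_job() cleanup stops too.
	 */
	drm_sched_wqueue_stop(&mdev->sched);

	ret = my_hw_reset(mdev);

	/*
	 * Resume submission and automatic cleanup. This is only needed
	 * after a stop; the scheduler is fully operational right after
	 * drm_sched_init().
	 */
	drm_sched_wqueue_start(&mdev->sched);

	return ret;
}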
