
Commit ccfc508

fix queue-wide barriers with multiple active command queues
This is another attempt to fix an issue where the L0 adapter crashes in urEnqueueEventsWaitWithBarrier in the presence of multiple active command queues with batched command lists.

The core issue is that the current queue implementation allows only two command lists to be active open batches: one for copy and one for compute. If that assumption doesn't hold, getAvailableCommandList, when executed multiple times for a command queue of the same type, overrides the active open command batch, so only the last retrieved command list can actually be batched. When the code then attempts to execute all the command lists it collected with batching enabled, we hit an assert because the active open command list doesn't match the one being executed.

The proper fix is to allow an open command batch for each command queue, but that's a fairly risky change to make this late in the release cycle. My previous attempt at a fix simply disabled batching for queue-wide barriers (#1555). That introduced regressions in tests that assumed batching happens, and it might also have been a performance regression.

Instead, this patch fixes getAvailableCommandList for the case where batching is enabled and a specific command queue is required, and disables batching only when the open batch command list differs from the one we are executing.
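The overwrite described in the commit message can be illustrated with a toy model. This is only a sketch: `ToyCmdList`, `ToyBatchSlot`, and `retrieve` are hypothetical names that stand in for the adapter's real types, and the retrieval logic is reduced to the single assignment that causes the problem.

```cpp
#include <cassert>

// Hypothetical toy types; they only model the single open-batch slot
// per engine type and are not part of the Level Zero adapter.
struct ToyCmdList {
  int ZeQueueId; // which (pretend) command queue this list belongs to
};

struct ToyBatchSlot {
  const ToyCmdList *Open = nullptr; // the one open batch for this engine type

  // Simplified pre-patch behaviour: every retrieval replaces the open batch,
  // even when the list belongs to a different command queue.
  const ToyCmdList *retrieve(const ToyCmdList &List) {
    Open = &List;
    return Open;
  }
};
```

Retrieving two lists for two different queues leaves only the second one registered as open, so executing the first with batching enabled would trip a check that the executed list is the open one.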
1 parent 5a23b18 commit ccfc508

2 files changed: +14 -1 lines changed


source/adapters/level_zero/context.cpp

Lines changed: 5 additions & 0 deletions
@@ -679,6 +679,11 @@ ur_result_t ur_context_handle_t_::getAvailableCommandList(
   if (Queue->hasOpenCommandList(UseCopyEngine)) {
     if (AllowBatching) {
       bool batchingAllowed = true;
+      if (ForcedCmdQueue &&
+          CommandBatch.OpenCommandList->second.ZeQueue != *ForcedCmdQueue) {
+        // Current open batch doesn't match the forced command queue
+        batchingAllowed = false;
+      }
       if (!UrL0OutOfOrderIntegratedSignalEvent &&
           Queue->Device->isIntegrated()) {
         batchingAllowed = eventCanBeBatched(Queue, UseCopyEngine,
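The added condition in this hunk boils down to a small predicate. The following is a minimal sketch under stated assumptions: queues are represented by plain integer ids, the forced queue is modelled as a `std::optional` rather than whatever type the adapter actually uses, and `canBatchOnOpenList` is a hypothetical helper name.

```cpp
#include <optional>

// Hypothetical simplification: command queues are identified by integer ids.
// Returns true when the currently open batch may be reused.
bool canBatchOnOpenList(int OpenListQueue, std::optional<int> ForcedQueue) {
  // No forced queue: the open batch is acceptable as far as this check goes.
  // Forced queue: the open batch must already target that queue.
  return !ForcedQueue.has_value() || *ForcedQueue == OpenListQueue;
}
```

In the real function this result is only one input; the integrated-device check that follows in the hunk can still change `batchingAllowed`.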

source/adapters/level_zero/event.cpp

Lines changed: 9 additions & 1 deletion
@@ -368,8 +368,16 @@ UR_APIEXPORT ur_result_t UR_APICALL urEnqueueEventsWaitWithBarrier(
   }
 
   // Execute each command list so the barriers can be encountered.
-  for (ur_command_list_ptr_t &CmdList : CmdLists)
+  for (ur_command_list_ptr_t &CmdList : CmdLists) {
+    bool IsCopy =
+        CmdList->second.isCopy(reinterpret_cast<ur_queue_handle_t>(Queue));
+    const auto &CommandBatch =
+        (IsCopy) ? Queue->CopyCommandBatch : Queue->ComputeCommandBatch;
+    // Only batch if the matching CmdList is already open.
+    OkToBatch = CommandBatch.OpenCommandList == CmdList;
+
     UR_CALL(Queue->executeCommandList(CmdList, false, OkToBatch));
+  }
 
   UR_CALL(Queue->ActiveBarriers.clear());
   auto UREvent = reinterpret_cast<ur_event_handle_t>(*Event);
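The per-list decision made in the loop above can be sketched in isolation. `ToyList`, `ToyQueueState`, and `batchDecisions` are hypothetical names used only for illustration; the real code compares command list iterators and then calls `executeCommandList`, which this sketch leaves out.

```cpp
#include <vector>

// Hypothetical stand-ins for a command list and the queue's two open-batch
// slots (one for copy, one for compute).
struct ToyList {
  bool IsCopy;
};

struct ToyQueueState {
  const ToyList *OpenCopy = nullptr;
  const ToyList *OpenCompute = nullptr;
};

// For each collected list, decide whether it may be batched: only when it is
// the list currently open for its engine type.
std::vector<bool> batchDecisions(const ToyQueueState &Q,
                                 const std::vector<const ToyList *> &Lists) {
  std::vector<bool> Decisions;
  for (const ToyList *L : Lists) {
    const ToyList *Open = L->IsCopy ? Q.OpenCopy : Q.OpenCompute;
    Decisions.push_back(Open == L);
  }
  return Decisions;
}
```

With two compute lists collected but only one of them open, the sketch batches just that one, while a copy list that matches the open copy batch can still be batched.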
