This repository was archived by the owner on Apr 28, 2023. It is now read-only.

Commit 497d3c4

ftynse authored and Sven Verdoolaege committed
complete thread mapping in a single band
Since mapping nested bands to threads is invalid, we may complete the mapping by assigning unused thread dimensions to zero immediately after mapping coincident and reduction members to threads. Note that it is still necessary to insert synchronizations if some band outside the one mapped to threads has non-coincident members. The mapping also needs to be completed before inserting synchronizations between children of a sequence node, some of which may not be mapped to threads at all.
1 parent 88764d3 commit 497d3c4
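
The completion step described in the commit message can be illustrated with a small standalone sketch. This is not Tensor Comprehensions code: `ThreadBinding` and `completeMapping` are hypothetical names invented for illustration, and the order in which loops are bound to thread dimensions is an assumption. The sketch only shows the idea of binding the mapped loops to the first thread dimensions and pinning every remaining thread dimension to zero.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a per-band thread mapping (not a TC type).
// Entry i describes what thread dimension i is bound to.
struct ThreadBinding {
  bool fixedToZero;      // true if this thread dimension is pinned to 0
  std::string loopName;  // loop bound to this dimension; empty if pinned
};

// Bind the first mappedLoops.size() thread dimensions to the given loops,
// then complete the mapping by pinning all remaining dimensions to 0.
std::vector<ThreadBinding> completeMapping(
    const std::vector<std::string>& mappedLoops,
    std::size_t numThreadDims) {
  assert(mappedLoops.size() <= numThreadDims);
  std::vector<ThreadBinding> result;
  for (std::size_t i = 0; i < numThreadDims; ++i) {
    if (i < mappedLoops.size()) {
      result.push_back({false, mappedLoops[i]});
    } else {
      result.push_back({true, ""});
    }
  }
  return result;
}
```

For example, calling `completeMapping({"i"}, 3)` yields three bindings: the first refers to loop `i`, and the other two are pinned to zero, so no later pass needs to revisit the band to finish the mapping.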

1 file changed: 10 additions, 6 deletions

tc/core/polyhedral/cuda/mapped_scop.cc

Lines changed: 10 additions & 6 deletions
@@ -467,20 +467,24 @@ size_t MappedScop::mapInnermostBandsToThreads(detail::ScheduleTree* st) {
   if (auto band = st->elemAs<detail::ScheduleTreeElemBand>()) {
     if (n == 0) {
       // If children were not mapped to threads, the current band can be mapped.
+      // First, map the coincidence and reduction dimension to threads.
+      // Then, if some threads were mapped, fix unused thread dimensions to 0
+      // because we cannot map parent bands anyway.
       auto nMapped = mapToThreads(st);
-      markUnroll(scop_->scheduleRoot(), st, unroll);
-      return nMapped;
+      if (nMapped > 0) {
+        mapRemaining<mapping::ThreadId>(st, nMapped, numThreads.view.size());
+        markUnroll(scop_->scheduleRoot(), st, unroll);
+        return numThreads.view.size();
+      }
     } else if (anyNonCoincidentMember(band)) {
       // If children were mapped to threads, and this band has a non-coincident
       // member, insert a synchronization after its last child.
-      // This also implies the mapping must be completed first.
       // The node must have children if some of them were mapped to threads,
       // double-check. Note that a band node has at most one child.
       CHECK_EQ(st->numChildren(), 1);
-      mapRemaining<mapping::ThreadId>(
-          st->child({0}), n, numThreads.view.size());
+      // The mapping should be always complete, double-check.
+      CHECK_EQ(n, numThreads.view.size());
       scop_->insertSyncAfter(st->child({0}));
-      return numThreads.view.size();
     }
   }
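
The control flow after this change can be mimicked on a toy tree to show the invariant the new `CHECK_EQ` relies on: a subtree reports either zero mapped thread dimensions or all of them. The sketch below is an illustration only. `ToyNode` and `mapInnermost` are hypothetical, the way `n` is aggregated over children (a maximum) is an assumption since that part of the function is not shown in the hunk, and the final `return n` is likewise assumed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical toy schedule tree node (not a TC type).
struct ToyNode {
  bool isBand = false;
  std::size_t mappableMembers = 0;  // coincident or reduction members
  bool hasNonCoincident = false;
  bool syncAfterChild = false;      // set when a sync would be inserted
  std::size_t fixedToZero = 0;      // thread dimensions pinned to 0 here
  std::vector<std::unique_ptr<ToyNode>> children;
};

// Returns the number of thread dimensions mapped in the subtree:
// with the mapping completed in a single band, this is 0 or numThreadDims.
std::size_t mapInnermost(ToyNode& node, std::size_t numThreadDims) {
  std::size_t n = 0;
  for (auto& child : node.children) {
    // Assumption: the result over children is their maximum.
    n = std::max(n, mapInnermost(*child, numThreadDims));
  }
  if (!node.isBand) {
    return n;
  }
  if (n == 0) {
    // No child was mapped: map this band, then complete the mapping at once.
    std::size_t nMapped = std::min(node.mappableMembers, numThreadDims);
    if (nMapped > 0) {
      node.fixedToZero = numThreadDims - nMapped;
      return numThreadDims;
    }
  } else if (node.hasNonCoincident) {
    // Children were mapped: the mapping must already be complete.
    assert(node.children.size() == 1);
    assert(n == numThreadDims);
    node.syncAfterChild = true;
  }
  return n;
}
```

With a non-coincident outer band wrapping an inner band that has one mappable member and two thread dimensions available, the inner band pins one dimension to zero and reports two mapped dimensions, so the outer band only has to record the synchronization.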
