promoteToSharedBelow: take into account the mapping to blocks

ftynse · ftynse · commit 7c89338a8628 · 2018-07-12T13:11:43.000+02:00
Currently, promotion to shared memory is only performed below the loops
mapped to blocks.  Thus, tensor reference groups implicitly account for
blocks.  The scope of mapping in the tree will be changed in upcoming
commits.  Explicitly include the block mapping in the partial schedule
within which the tensor reference groups are computed.
diff --git a/tc/core/polyhedral/cuda/memory_promotion_heuristic.cc b/tc/core/polyhedral/cuda/memory_promotion_heuristic.cc
@@ -440,9 +440,10 @@ void promoteToSharedBelow(
     size_t& remainingMemory) {
   auto root = scop.scheduleRoot();
   auto partialSched = partialSchedule(root, bandNode);
+  auto mapping = collectMappingsTo<mapping::BlockId>(scop);
 
   auto groupMap = TensorReferenceGroup::accessedWithin(
-      partialSched, scop.reads, scop.writes);
+      partialSched.intersect_domain(mapping), scop.reads, scop.writes);
   // Pure affine schedule without (mapping) filters.
   auto partialSchedMupa = partialScheduleMupa(root, bandNode);