-
Notifications
You must be signed in to change notification settings - Fork 14.4k
[mlir][scf] fix getSuccessorRegions func in scf.forall #147491
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
In accordance with the semantics of forall, its body is executed in parallel by multiple threads. We should not expect to branch back into the forall body after the region's execution is complete.
@llvm/pr-subscribers-mlir-scf @llvm/pr-subscribers-mlir Author: donald chen (cxy-1993) ChangesIn accordance with the semantics of forall, its body is executed in parallel by multiple threads. We should not expect to branch back into the forall body after the region's execution is complete. Full diff: https://github.com/llvm/llvm-project/pull/147491.diff 1 Files Affected:
diff --git a/mlir/lib/Dialect/SCF/IR/SCF.cpp b/mlir/lib/Dialect/SCF/IR/SCF.cpp
index 5a3bd984530db..2e0a7b85af756 100644
--- a/mlir/lib/Dialect/SCF/IR/SCF.cpp
+++ b/mlir/lib/Dialect/SCF/IR/SCF.cpp
@@ -1927,11 +1927,13 @@ void ForallOp::getCanonicalizationPatterns(RewritePatternSet &results,
/// not a constant.
void ForallOp::getSuccessorRegions(RegionBranchPoint point,
SmallVectorImpl<RegionSuccessor> ®ions) {
- // Both the operation itself and the region may be branching into the body or
- // back into the operation itself. It is possible for loop not to enter the
- // body.
- regions.push_back(RegionSuccessor(&getRegion()));
- regions.push_back(RegionSuccessor());
+ // In accordance with the semantics of forall, its body is executed in
+ // parallel by multiple threads. We should not expect to branch back into
+ // the forall body after the region's execution is complete.
+ if (point.isParent())
+ regions.push_back(RegionSuccessor(&getRegion()));
+ else
+ regions.push_back(RegionSuccessor());
}
//===----------------------------------------------------------------------===//
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this stemming from some real use case or bug?
The control flow interfaces are not modeling parallelism currently, and this is a bit tricky. In particular, control flow interfaces are used by dataflow analyses to understand whether a block may be executed before or after another block, including a different "instance" of the same block. This may be the case here. I'd like to better think through this before we decide one way or another.
We're using dataflow analysis on a very large program, and the compilation time is excessively long because scf.forall assumes the body may execute after body. The current implementation doesn't cause functional issues, but it's conservative, increasing compilation time and potentially making the analysis overly cautious. The forall semantic is meant to indicate parallel execution across multiple threads, so for a single thread, it's more reasonable for the forall body to exit after execution (similar to an outlined gpu program). |
There can't be any "data" flowing from one instance of a parallel block to another. I don't quite get why this was modeled this way in the first place though, I may be missing something here. |
For reference, I found these old discussions / PRs:
The last PR may have accidentally changed the behavior. |
Ping, looks like we've reached a consensus? |
In accordance with the semantics of forall, its body is executed in parallel by multiple threads. We should not expect to branch back into the forall body after the region's execution is complete.