[Coalesce]: Enhance the Intel coalescing pass to support while loops. #4290
…ad/descriptor_store operation that uses it Signed-off-by: Tiotto, Ettore <[email protected]>
Pull Request Overview
Enhance the Intel GPU coalescing and layout-propagation passes to handle scf::WhileOp
and generalize descriptor-to-pointer lowering.
- Extend `findDefiningMakeTensorPtrOp`, propagation, and templated loops in `Coalesce.cpp` to support `scf::WhileOp` and `scf::ConditionOp`.
- Add `updateAdvanceOpChain` in `RemoveLayoutConversions.cpp` for chained `AdvanceOp`s and insert module verification asserts.
- Refactor `TensorDescToBlockPointer.cpp` to simplify descriptor rewriting, unify pointer creation, and remove legacy helper functions.
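To illustrate the first bullet, here is a minimal, self-contained sketch of how a value can be traced back through `scf.while` loop-carried arguments to the operation that created the pointer. It is a toy model, not the PR's code and not the MLIR API: `Value`, `WhileLoop`, `Kind`, and `findDefiningMakeTensorPtr` are hypothetical stand-ins. It relies only on the `scf.while` convention that "before" region arguments correspond to the loop's init operands, while "after" region arguments and loop results correspond to the values forwarded by `scf.condition`. Like a simple init-following walk, it does not inspect the values yielded by the "after" region.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-ins for IR values; these are NOT MLIR classes.
// `Other` is listed first so a brace-initialized Value{} defaults to it.
enum class Kind {
  Other,
  MakeTensorPtr,
  Advance,
  WhileBeforeArg,
  WhileAfterArg,
  WhileResult
};

struct WhileLoop;

struct Value {
  Kind kind;              // what produced this value
  const Value *operand;   // Advance: the pointer being advanced
  const WhileLoop *loop;  // loop-carried kinds: the owning loop
  std::size_t index;      // loop-carried kinds: argument/result position
};

struct WhileLoop {
  std::vector<const Value *> inits;          // operands of the while op
  std::vector<const Value *> conditionArgs;  // values forwarded by the condition terminator
};

// Walk backwards from `v` until a MakeTensorPtr-like value is found.
// Returns nullptr if the chain ends at an unrelated value.
const Value *findDefiningMakeTensorPtr(const Value *v) {
  // The step bound keeps a malformed (cyclic) toy graph from looping forever.
  for (int steps = 0; v != nullptr && steps < 1024; ++steps) {
    switch (v->kind) {
    case Kind::MakeTensorPtr:
      return v;
    case Kind::Advance:
      v = v->operand;  // an advance keeps the same underlying pointer
      break;
    case Kind::WhileBeforeArg:
      // "Before" region arguments start out as the loop's init operands.
      assert(v->loop && v->index < v->loop->inits.size());
      v = v->loop->inits[v->index];
      break;
    case Kind::WhileAfterArg:
    case Kind::WhileResult:
      // "After" region arguments and loop results both receive the
      // values forwarded by the condition terminator.
      assert(v->loop && v->index < v->loop->conditionArgs.size());
      v = v->loop->conditionArgs[v->index];
      break;
    case Kind::Other:
      return nullptr;
    }
  }
  return nullptr;
}
```

In this model, a pointer advanced inside the loop body resolves through the "after" argument, the forwarded condition value, and the "before" argument to the init operand defined outside the loop.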
Reviewed Changes
Copilot reviewed 3 out of 7 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| third_party/intel/lib/TritonIntelGPUTransforms/RemoveLayoutConversions.cpp | Add `updateAdvanceOpChain` and rewrite store logic with verification asserts |
| third_party/intel/lib/TritonIntelGPUTransforms/Coalesce.cpp | Templatize propagation, add `WhileOp`/`ConditionOp` support, refactor debug logging |
| third_party/intel/lib/Dialect/Triton/Transforms/TensorDescToBlockPointer.cpp | Overhaul descriptor-to-block-pointer pass, unify `MakeTensorPtrOp` creation |
Files not reviewed (4)
- test/Triton/Intel/TensorDescToBlockPointer/basic.mlir: Language not supported
- test/Triton/Intel/TensorDescToBlockPointer/loop.mlir: Language not supported
- test/TritonIntelGPU/backward_combine_dpas_dot_layout.mlir: Language not supported
- test/TritonIntelGPU/coalesce.mlir: Language not supported
Comments suppressed due to low confidence (2)
third_party/intel/lib/TritonIntelGPUTransforms/RemoveLayoutConversions.cpp:793
- The variable `value` is undefined here; it should reference `storeOp.getValue()` or another valid operand value.
Value dataToStore = getValueAs(value, encoding);
third_party/intel/lib/Dialect/Triton/Transforms/TensorDescToBlockPointer.cpp:149
- Always pushing a zero offset discards the original `op.getIndices()`. Use the descriptor's indices for offsets instead of a constant zero.
offsets.push_back(zero);
Please create an issue to track upstreaming while-loop support in the coalesce pass.
Ettore agreed to revert to the old debug print lines if we can reuse functions in `lib/Dialect/TritonGPU/Transforms/Coalesce.cpp`.
Created #4307
Enhance the Intel GPU coalescing pass to handle `scf::WhileOp`.