How to Customize and Dynamically Adapt Tiling Strategy in IREE Based on Target Core Count #20884
Replies: 2 comments 22 replies
I know what you are trying to achieve. Could you share the IRs? I can suggest places in the codebase to look at.
-
I have one such dump: https://gist.github.com/pashu123/eacb04e6acb53af0b4c0f6af265dd4a3 . The SelectLoweringStrategy pass selects a lowering configuration and attaches it to the linalg.mmt4d operation. This configuration essentially consists of tile sizes in the form [[workgroup], [parallel], [reduction]]. Later passes, such as TileRootFuseProducerConsumer, examine these tile sizes to determine whether a level is parallel or reduction, and then apply tiling and fusion accordingly.
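To make the three-level layout concrete, here is a minimal Python sketch. It is not IREE code: `LoweringConfig`, `classify_loops`, and the sample tile sizes are hypothetical names and values chosen for illustration, and the sketch assumes that a tile size of 0 means "this level does not tile that loop".

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LoweringConfig:
    """Hypothetical stand-in for the tile sizes attached to an op."""
    workgroup: List[int]
    parallel: List[int]
    reduction: List[int]


def classify_loops(config: LoweringConfig) -> List[str]:
    """Label each loop by the kind of tiling it receives.

    Assumption: a tile size of 0 at a level means the loop is not tiled there.
    """
    labels = []
    for wg, par, red in zip(config.workgroup, config.parallel, config.reduction):
        if red != 0:
            labels.append("reduction")
        elif par != 0 or wg != 0:
            labels.append("parallel")
        else:
            labels.append("untiled")
    return labels


# Illustrative values for a matmul-like op with loops (M, N, K).
example = LoweringConfig(
    workgroup=[64, 64, 0],
    parallel=[8, 8, 0],
    reduction=[0, 0, 16],
)
print(classify_loops(example))  # ['parallel', 'parallel', 'reduction']
```

A later tiling pass would then treat the M and N loops as distributable and the K loop as a sequential reduction, which is roughly the distinction described above.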
Hi IREE team,
I'm working on optimizing dispatch operations, particularly linalg.matmul, and would like to control the tiling strategy based on the hardware's parallelism, such as the number of available cores.
Specifically, I'm trying to:
Customize the Tiling Configuration
- Set or override tile sizes, loop levels, and other tiling parameters for specific dispatch operations like linalg.matmul.
- Understand where and how to plug in these tiling configurations for a specific op.
Dynamically Adapt Based on Target Hardware
Adjust tile sizes or the number of tile levels at compile time, depending on hardware features like:
- Number of physical cores (e.g., for CPU)
- Backend type (CPU vs GPU)
- Cache size, vector width (if applicable)
I'm aware IREE has mechanisms for target-specific configuration. Can these parameters be used to influence tiling?
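As an illustration of the kind of core-count adaptation meant here, the following Python sketch picks workgroup tile sizes for an M x N output so that there are at least as many workgroups as cores. This is a hypothetical heuristic, not IREE's actual logic: `pick_workgroup_tiles`, `num_workgroups`, the default tile bounds, and the halving policy are all assumptions.

```python
import math


def num_workgroups(m: int, n: int, tile_m: int, tile_n: int) -> int:
    """Number of workgroups when an M x N output is split into tile_m x tile_n tiles."""
    return math.ceil(m / tile_m) * math.ceil(n / tile_n)


def pick_workgroup_tiles(m, n, num_cores, max_tile=64, min_tile=8):
    """Hypothetical heuristic: start from large tiles and halve the larger one
    until there are at least num_cores workgroups or both tiles reach min_tile."""
    tile_m = min(max_tile, m)
    tile_n = min(max_tile, n)
    while num_workgroups(m, n, tile_m, tile_n) < num_cores:
        if tile_m >= tile_n and tile_m > min_tile:
            tile_m = max(min_tile, tile_m // 2)
        elif tile_n > min_tile:
            tile_n = max(min_tile, tile_n // 2)
        elif tile_m > min_tile:
            tile_m = max(min_tile, tile_m // 2)
        else:
            break  # Cannot shrink further; accept fewer workgroups than cores.
    return tile_m, tile_n


# A 128 x 128 output on 4 cores keeps the large tiles; on 16 cores it shrinks them.
print(pick_workgroup_tiles(128, 128, num_cores=4))   # (64, 64)
print(pick_workgroup_tiles(128, 128, num_cores=16))  # (32, 32)
```

In a real compiler the core count would come from the target description rather than a function argument, and cache size or vector width would further constrain the choice.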
Understand Key Codebase Locations
Could you point me to the parts of the code responsible for:
- Fetching and applying tiling configurations (getTileAndDistributeConfig?)
- Emitting workgroup/tile distributions and loop nests
- Injecting or utilizing hardware-specific properties, such as querying core count or target backend
Any pointers to examples, documentation, or guidance on best practices would be greatly appreciated. I’m open to extending passes or creating a custom pipeline if necessary.