PTX: Demote GEP indices to 32 bits, if possible.

In https://github.com/JuliaGPU/CUDA.jl/pull/1895, I made the size tuple of `CuDeviceArray` 32 bits so that we can emit better code (lowering register pressure, making it possible to execute compute and indexing instructions in parallel, etc.). However, the NVPTX back-end defaults to using 64 bits for indexing pointers, resulting in 64-bit GEPs being introduced by the front-end and by optimization passes. I tried to change that by specifying a 32-bit pointer index size in the data layout (https://github.com/JuliaGPU/GPUCompiler.jl/pull/444), but that breaks 64-bit indices, which can still get reintroduced by optimization (see e.g. https://github.com/JuliaGPU/GPUCompiler.jl/issues/461).

Either we try this again on LLVM 17 (where a bug that was introducing 64-bit GEP offsets has been fixed), or we instead create an optimization pass that demotes GEP indices to 32 bits where possible (e.g., if they are constants, or come from the size field of a device array).
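
To make the "if possible" condition concrete, here is a minimal sketch of the kind of check such a pass might perform. It is a toy model in Python, not LLVM or GPUCompiler.jl code: the `Const`, `SExt`, and `Opaque` classes and the `can_demote` function are hypothetical names invented for illustration. It also assumes the only safe cases are the two mentioned above, namely a constant that fits in a signed 32-bit integer, or a value that was sign-extended from 32 bits or fewer (as an index derived directly from a 32-bit size field would be).

```python
from dataclasses import dataclass
from typing import Union

# Range of a signed 32-bit integer.
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


@dataclass(frozen=True)
class Const:
    """Hypothetical stand-in for a constant 64-bit GEP index."""
    value: int


@dataclass(frozen=True)
class SExt:
    """Hypothetical stand-in for an index sign-extended to 64 bits."""
    source_bits: int  # width of the value before extension


@dataclass(frozen=True)
class Opaque:
    """Any other 64-bit index whose range is unknown."""


Index = Union[Const, SExt, Opaque]


def can_demote(index: Index) -> bool:
    """Return True if the index can be represented in 32 bits without loss."""
    if isinstance(index, Const):
        # A constant is safe when it fits in a signed 32-bit integer.
        return INT32_MIN <= index.value <= INT32_MAX
    if isinstance(index, SExt):
        # A value widened from <= 32 bits carries no extra information.
        return index.source_bits <= 32
    # Be conservative: unknown 64-bit values keep their width.
    return False


print(can_demote(Const(1024)))   # small constant
print(can_demote(Const(2**40)))  # constant too large for 32 bits
print(can_demote(SExt(32)))      # e.g. derived from a 32-bit size field
print(can_demote(Opaque()))      # unknown provenance
```

The sketch is deliberately conservative: arithmetic on extended values (which could overflow 32 bits) is treated as opaque, so a real pass would need additional range analysis to handle those cases.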

PTX: Demote GEP indices to 32 bits, if possible. #463
