[DirectX] Simplify and correct the flattening of GEPs in DXILFlattenArrays #146173

Open
wants to merge 12 commits into
base: main
281 changes: 141 additions & 140 deletions llvm/lib/Target/DirectX/DXILFlattenArrays.cpp
@@ -20,6 +20,7 @@
#include "llvm/IR/InstVisitor.h"
#include "llvm/IR/ReplaceConstant.h"
#include "llvm/Support/Casting.h"
#include "llvm/Support/MathExtras.h"
#include "llvm/Transforms/Utils/Local.h"
#include <cassert>
#include <cstddef>
@@ -40,18 +41,19 @@ class DXILFlattenArraysLegacy : public ModulePass {
static char ID; // Pass identification.
};

struct GEPData {
ArrayType *ParentArrayType;
Value *ParentOperand;
SmallVector<Value *> Indices;
SmallVector<uint64_t> Dims;
bool AllIndicesAreConstInt;
struct GEPInfo {
ArrayType *RootFlattenedArrayType;
Value *RootPointerOperand;
SmallMapVector<Value *, APInt, 4> VariableOffsets;
APInt ConstantOffset;
};

class DXILFlattenArraysVisitor
: public InstVisitor<DXILFlattenArraysVisitor, bool> {
public:
DXILFlattenArraysVisitor() {}
DXILFlattenArraysVisitor(
DenseMap<GlobalVariable *, GlobalVariable *> &GlobalMap)
@farzonl (Member), Jul 8, 2025:

It isn't clear why this was necessary. Why does the DXILFlattenArraysVisitor need a reference to the GlobalMap?

Contributor Author:

It's necessary because otherwise the DXILFlattenArraysVisitor would have no visibility of the GlobalMap, which is a stack-allocated variable here:

static bool flattenArrays(Module &M) {
bool MadeChange = false;
DenseMap<GlobalVariable *, GlobalVariable *> GlobalMap;
flattenGlobalArrays(M, GlobalMap);
DXILFlattenArraysVisitor Impl(GlobalMap);

The GlobalMap is needed to determine the root type of the GEP:

// We should try to determine the type of the root from the pointer rather
// than the GEP's source element type because this could be a scalar GEP
// into an array-typed pointer from an Alloca or Global Variable.
Type *RootTy = GEP.getSourceElementType();
if (auto *GlobalVar = dyn_cast<GlobalVariable>(PtrOperand)) {
if (GlobalMap.contains(GlobalVar))
GlobalVar = GlobalMap[GlobalVar];
Info.RootPointerOperand = GlobalVar;
RootTy = GlobalVar->getValueType();
} else if (auto *Alloca = dyn_cast<AllocaInst>(PtrOperand)) {
RootTy = Alloca->getAllocatedType();
}
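
The design point in this thread, a visitor that borrows a caller-owned remap table and consults it when resolving a root pointer, can be illustrated outside LLVM. The following is a minimal sketch under stated assumptions: `Global`, `GlobalRemap`, and `RootResolver` are hypothetical stand-ins, not the pass's types, and `std::unordered_map` takes the place of `DenseMap`.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for llvm::GlobalVariable: only its identity and a
// printable value type matter for this sketch.
struct Global {
  std::string ValueType;
};

// Hypothetical stand-in for DenseMap<GlobalVariable *, GlobalVariable *>.
using GlobalRemap = std::unordered_map<const Global *, const Global *>;

// Hypothetical visitor-like class that borrows the caller's table by
// reference rather than owning a copy.
class RootResolver {
public:
  explicit RootResolver(const GlobalRemap &Map) : Map(Map) {}

  // Return the flattened replacement if one was recorded for G, otherwise
  // return G itself.
  const Global *resolve(const Global *G) const {
    auto It = Map.find(G);
    return It == Map.end() ? G : It->second;
  }

private:
  const GlobalRemap &Map; // The caller's table must outlive this object.
};
```

Because the member is a reference, entries added to the caller's table after the resolver is constructed are still visible to it. The cost is a lifetime requirement: the table has to outlive the resolver, which holds in `flattenArrays` since both are locals of the same function.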

: GlobalMap(GlobalMap) {}
bool visit(Function &F);
// InstVisitor methods. They return true if the instruction was scalarized,
// false if nothing changed.
@@ -78,35 +80,20 @@ class DXILFlattenArraysVisitor

private:
SmallVector<WeakTrackingVH> PotentiallyDeadInstrs;
DenseMap<GetElementPtrInst *, GEPData> GEPChainMap;
DenseMap<GEPOperator *, GEPInfo> GEPChainInfoMap;
DenseMap<GlobalVariable *, GlobalVariable *> &GlobalMap;
bool finish();
ConstantInt *genConstFlattenIndices(ArrayRef<Value *> Indices,
ArrayRef<uint64_t> Dims,
IRBuilder<> &Builder);
Value *genInstructionFlattenIndices(ArrayRef<Value *> Indices,
ArrayRef<uint64_t> Dims,
IRBuilder<> &Builder);

// Helper function to collect indices and dimensions from a GEP instruction
void collectIndicesAndDimsFromGEP(GetElementPtrInst &GEP,
SmallVectorImpl<Value *> &Indices,
SmallVectorImpl<uint64_t> &Dims,
bool &AllIndicesAreConstInt);

void
recursivelyCollectGEPs(GetElementPtrInst &CurrGEP,
ArrayType *FlattenedArrayType, Value *PtrOperand,
unsigned &GEPChainUseCount,
SmallVector<Value *> Indices = SmallVector<Value *>(),
SmallVector<uint64_t> Dims = SmallVector<uint64_t>(),
bool AllIndicesAreConstInt = true);
bool visitGetElementPtrInstInGEPChain(GetElementPtrInst &GEP);
bool visitGetElementPtrInstInGEPChainBase(GEPData &GEPInfo,
GetElementPtrInst &GEP);
};
} // namespace

bool DXILFlattenArraysVisitor::finish() {
GEPChainInfoMap.clear();
@farzonl (Member), Jul 8, 2025:

This is fine. Not sure if it's necessary though. The GEPChainInfoMap will get recreated on every construction of the DXILFlattenArraysVisitor, which should be the same as clearing it. That said, the idea makes sense: the longer we go without clearing, the more invalid GEP chains accumulate, because we are flattening as we walk the chains. So I think it makes sense, but we should probably be clearing the GEPChainInfoMap after each function, whereas I think this clears it after each module.

@Icohedron (Contributor Author), Jul 8, 2025:

> This is fine. Not sure if it's necessary though.

I added this change later, after finding that tests with multiple functions had incorrectly flattened GEPs: GEPOperator* keys left in the GEPChainInfoMap by previous functions happened to collide with GEPOperator* values in the function currently being processed.

> The GEPChainInfoMap will get recreated on every construction of the DXILFlattenArraysVisitor, which should be the same as clearing it.

The DXILFlattenArraysVisitor is created only once, for the whole module.

static bool flattenArrays(Module &M) {
bool MadeChange = false;
DenseMap<GlobalVariable *, GlobalVariable *> GlobalMap;
flattenGlobalArrays(M, GlobalMap);
DXILFlattenArraysVisitor Impl(GlobalMap);
for (auto &F : make_early_inc_range(M.functions())) {
if (F.isDeclaration())
continue;
MadeChange |= Impl.visit(F);
}
for (auto &[Old, New] : GlobalMap) {
Old->replaceAllUsesWith(New);
Old->eraseFromParent();
MadeChange = true;
}
return MadeChange;
}
PreservedAnalyses DXILFlattenArrays::run(Module &M, ModuleAnalysisManager &) {
bool MadeChanges = flattenArrays(M);
if (!MadeChanges)
return PreservedAnalyses::all();
PreservedAnalyses PA;
return PA;
}
bool DXILFlattenArraysLegacy::runOnModule(Module &M) {
return flattenArrays(M);
}

> we should probably be clearing the GEPChainInfoMap after each function

The finish() function is called at the end of DXILFlattenArraysVisitor::visit(Function &F), so the GEPChainInfoMap is indeed cleared after each function.

bool DXILFlattenArraysVisitor::visit(Function &F) {
bool MadeChange = false;
ReversePostOrderTraversal<Function *> RPOT(&F);
for (BasicBlock *BB : make_early_inc_range(RPOT)) {
for (Instruction &I : make_early_inc_range(*BB))
MadeChange |= InstVisitor::visit(I);
}
finish();
return MadeChange;
}
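
The collision described in this thread is a general hazard of maps keyed by raw pointers: once an object is destroyed, a later object may be created at the same address and match the stale key. The sketch below shows one plausible way this happens, under stated assumptions: `Inst` and `OneSlotArena` are hypothetical, and the one-slot arena forces address reuse so the effect is deterministic, whereas a real allocator may or may not reuse an address.

```cpp
#include <cassert>
#include <new>
#include <unordered_map>

// Hypothetical stand-in for an IR instruction; not an LLVM type.
struct Inst {
  int Id;
};

// One-slot arena, so every object it creates lives at the same address.
// This makes the address reuse deterministic for the sketch.
class OneSlotArena {
public:
  Inst *create(int Id) { return new (Storage) Inst{Id}; }
  void destroy(Inst *I) { I->~Inst(); }

private:
  alignas(Inst) unsigned char Storage[sizeof(Inst)];
};

// Simulates visiting two "functions" whose instructions end up at the same
// address. Returns true if the second one finds an entry that was recorded
// for the first.
bool secondFunctionSeesStaleEntry(bool ClearBetweenFunctions) {
  OneSlotArena Arena;
  std::unordered_map<const Inst *, int> InfoMap;

  // First function: record info for an instruction, then erase it.
  Inst *A = Arena.create(1);
  InfoMap[A] = A->Id;
  Arena.destroy(A);

  // Analogous to clearing the map at the end of each visited function.
  if (ClearBetweenFunctions)
    InfoMap.clear();

  // Second function: a new instruction is created at the reused address.
  Inst *B = Arena.create(2);
  bool Stale = InfoMap.count(B) > 0;
  Arena.destroy(B);
  return Stale;
}
```

Calling `secondFunctionSeesStaleEntry(false)` reports a stale hit, while `secondFunctionSeesStaleEntry(true)` does not. That matches the reasoning for clearing the map once per function.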

RecursivelyDeleteTriviallyDeadInstructionsPermissive(PotentiallyDeadInstrs);
return true;
}
@@ -225,131 +212,145 @@ bool DXILFlattenArraysVisitor::visitAllocaInst(AllocaInst &AI) {
return true;
}

void DXILFlattenArraysVisitor::collectIndicesAndDimsFromGEP(
GetElementPtrInst &GEP, SmallVectorImpl<Value *> &Indices,
SmallVectorImpl<uint64_t> &Dims, bool &AllIndicesAreConstInt) {

Type *CurrentType = GEP.getSourceElementType();
bool DXILFlattenArraysVisitor::visitGetElementPtrInst(GetElementPtrInst &GEP) {
// Do not visit GEPs more than once
if (GEPChainInfoMap.contains(cast<GEPOperator>(&GEP)))
return false;

// Note index 0 is the ptr index.
for (Value *Index : llvm::drop_begin(GEP.indices(), 1)) {
Indices.push_back(Index);
AllIndicesAreConstInt &= isa<ConstantInt>(Index);
Value *PtrOperand = GEP.getPointerOperand();

if (auto *ArrayTy = dyn_cast<ArrayType>(CurrentType)) {
Dims.push_back(ArrayTy->getNumElements());
CurrentType = ArrayTy->getElementType();
} else {
assert(false && "Expected array type in GEP chain");
}
// Replace a GEP ConstantExpr pointer operand with a GEP instruction so that
// it can be visited
if (auto *PtrOpGEPCE = dyn_cast<ConstantExpr>(PtrOperand);
PtrOpGEPCE && PtrOpGEPCE->getOpcode() == Instruction::GetElementPtr) {
GetElementPtrInst *OldGEPI =
cast<GetElementPtrInst>(PtrOpGEPCE->getAsInstruction());
OldGEPI->insertBefore(GEP.getIterator());

IRBuilder<> Builder(&GEP);
SmallVector<Value *> Indices(GEP.indices());
Value *NewGEP =
Builder.CreateGEP(GEP.getSourceElementType(), OldGEPI, Indices,
GEP.getName(), GEP.getNoWrapFlags());
assert(isa<GetElementPtrInst>(NewGEP) &&
"Expected newly-created GEP to be an instruction");
GetElementPtrInst *NewGEPI = cast<GetElementPtrInst>(NewGEP);

GEP.replaceAllUsesWith(NewGEPI);
GEP.eraseFromParent();
visitGetElementPtrInst(*OldGEPI);
visitGetElementPtrInst(*NewGEPI);
return true;
}
}

void DXILFlattenArraysVisitor::recursivelyCollectGEPs(
GetElementPtrInst &CurrGEP, ArrayType *FlattenedArrayType,
Value *PtrOperand, unsigned &GEPChainUseCount, SmallVector<Value *> Indices,
SmallVector<uint64_t> Dims, bool AllIndicesAreConstInt) {
// Check if this GEP is already in the map to avoid circular references
if (GEPChainMap.count(&CurrGEP) > 0)
return;

// Collect indices and dimensions from the current GEP
collectIndicesAndDimsFromGEP(CurrGEP, Indices, Dims, AllIndicesAreConstInt);
bool IsMultiDimArr = isMultiDimensionalArray(CurrGEP.getSourceElementType());
if (!IsMultiDimArr) {
assert(GEPChainUseCount < FlattenedArrayType->getNumElements());
GEPChainMap.insert(
{&CurrGEP,
{std::move(FlattenedArrayType), PtrOperand, std::move(Indices),
std::move(Dims), AllIndicesAreConstInt}});
return;
}
bool GepUses = false;
for (auto *User : CurrGEP.users()) {
if (GetElementPtrInst *NestedGEP = dyn_cast<GetElementPtrInst>(User)) {
recursivelyCollectGEPs(*NestedGEP, FlattenedArrayType, PtrOperand,
++GEPChainUseCount, Indices, Dims,
AllIndicesAreConstInt);
GepUses = true;
// Construct GEPInfo for this GEP
GEPInfo Info;

// Obtain the variable and constant byte offsets computed by this GEP
const DataLayout &DL = GEP.getDataLayout();
unsigned BitWidth = DL.getIndexTypeSizeInBits(GEP.getType());
Info.ConstantOffset = {BitWidth, 0};
[[maybe_unused]] bool Success = GEP.collectOffset(
DL, BitWidth, Info.VariableOffsets, Info.ConstantOffset);
assert(Success && "Failed to collect offsets for GEP");

// If there is a parent GEP, inherit the root array type and pointer, and
// merge the byte offsets. Otherwise, this GEP is itself the root of a GEP
// chain and we need to determine the root array type
if (auto *PtrOpGEP = dyn_cast<GEPOperator>(PtrOperand)) {
assert(GEPChainInfoMap.contains(PtrOpGEP) &&
"Expected parent GEP to be visited before this GEP");
GEPInfo &PGEPInfo = GEPChainInfoMap[PtrOpGEP];
Info.RootFlattenedArrayType = PGEPInfo.RootFlattenedArrayType;
Info.RootPointerOperand = PGEPInfo.RootPointerOperand;
for (auto &VariableOffset : PGEPInfo.VariableOffsets)
Info.VariableOffsets.insert(VariableOffset);
Info.ConstantOffset += PGEPInfo.ConstantOffset;
} else {
Info.RootPointerOperand = PtrOperand;

// We should try to determine the type of the root from the pointer rather
// than the GEP's source element type because this could be a scalar GEP
// into an array-typed pointer from an Alloca or Global Variable.
Type *RootTy = GEP.getSourceElementType();
if (auto *GlobalVar = dyn_cast<GlobalVariable>(PtrOperand)) {
if (GlobalMap.contains(GlobalVar))
GlobalVar = GlobalMap[GlobalVar];
Info.RootPointerOperand = GlobalVar;
RootTy = GlobalVar->getValueType();
} else if (auto *Alloca = dyn_cast<AllocaInst>(PtrOperand)) {
RootTy = Alloca->getAllocatedType();
}
}
// This case is just in case the GEP chain doesn't end with a 1D array.
if (IsMultiDimArr && GEPChainUseCount > 0 && !GepUses) {
GEPChainMap.insert(
{&CurrGEP,
{std::move(FlattenedArrayType), PtrOperand, std::move(Indices),
std::move(Dims), AllIndicesAreConstInt}});
}
}

bool DXILFlattenArraysVisitor::visitGetElementPtrInstInGEPChain(
GetElementPtrInst &GEP) {
GEPData GEPInfo = GEPChainMap.at(&GEP);
return visitGetElementPtrInstInGEPChainBase(GEPInfo, GEP);
}
bool DXILFlattenArraysVisitor::visitGetElementPtrInstInGEPChainBase(
GEPData &GEPInfo, GetElementPtrInst &GEP) {
IRBuilder<> Builder(&GEP);
Value *FlatIndex;
if (GEPInfo.AllIndicesAreConstInt)
FlatIndex = genConstFlattenIndices(GEPInfo.Indices, GEPInfo.Dims, Builder);
else
FlatIndex =
genInstructionFlattenIndices(GEPInfo.Indices, GEPInfo.Dims, Builder);

ArrayType *FlattenedArrayType = GEPInfo.ParentArrayType;

// Don't append '.flat' to an empty string. If the SSA name isn't available
// it could conflict with the ParentOperand's name.
std::string FlatName = GEP.hasName() ? GEP.getName().str() + ".flat" : "";

Value *FlatGEP = Builder.CreateGEP(FlattenedArrayType, GEPInfo.ParentOperand,
{Builder.getInt32(0), FlatIndex}, FlatName,
GEP.getNoWrapFlags());

// Note: Old gep will become an invalid instruction after replaceAllUsesWith.
// Erase the old GEP in the map before to avoid invalid instructions
// and circular references.
GEPChainMap.erase(&GEP);

GEP.replaceAllUsesWith(FlatGEP);
GEP.eraseFromParent();
return true;
}
assert(!isMultiDimensionalArray(RootTy) &&
"Expected root array type to be flattened");

bool DXILFlattenArraysVisitor::visitGetElementPtrInst(GetElementPtrInst &GEP) {
auto It = GEPChainMap.find(&GEP);
if (It != GEPChainMap.end())
return visitGetElementPtrInstInGEPChain(GEP);
if (!isMultiDimensionalArray(GEP.getSourceElementType()))
return false;
// If the root type is not an array, we don't need to do any flattening
if (!isa<ArrayType>(RootTy))
return false;

ArrayType *ArrType = cast<ArrayType>(GEP.getSourceElementType());
IRBuilder<> Builder(&GEP);
auto [TotalElements, BaseType] = getElementCountAndType(ArrType);
ArrayType *FlattenedArrayType = ArrayType::get(BaseType, TotalElements);
Info.RootFlattenedArrayType = cast<ArrayType>(RootTy);
}

Value *PtrOperand = GEP.getPointerOperand();
// GEPs without users or GEPs with non-GEP users should be replaced such that
// the chain of GEPs they are a part of are collapsed to a single GEP into a
// flattened array.
bool ReplaceThisGEP = GEP.users().empty();
Member:

This seems unnecessary. Why can't you just do the following?

Suggested change
- bool ReplaceThisGEP = GEP.users().empty();
+ bool ReplaceThisGEP = false;

@Icohedron (Contributor Author), Jul 8, 2025:

This preserves the behavior of the previous implementation in the llvm/test/CodeGen/DirectX tests. Many of those tests create GEPs that are never used, so I replace unused GEPs to keep the tests working.

Contributor Author:

I can set bool ReplaceThisGEP = false; and then update the tests so that all the GEPs are used.

Member:

I suppose we can leave this as is for now.

for (Value *User : GEP.users())
if (!isa<GetElementPtrInst>(User))
ReplaceThisGEP = true;

if (ReplaceThisGEP) {
unsigned BytesPerElem =
DL.getTypeAllocSize(Info.RootFlattenedArrayType->getArrayElementType());
assert(isPowerOf2_32(BytesPerElem) &&
"Bytes per element should be a power of 2");

// Compute the 32-bit index for this flattened GEP from the constant and
// variable byte offsets in the GEPInfo
IRBuilder<> Builder(&GEP);
Value *ZeroIndex = Builder.getInt32(0);
uint64_t ConstantOffset =
Info.ConstantOffset.udiv(BytesPerElem).getZExtValue();
assert(ConstantOffset < UINT32_MAX &&
"Constant byte offset for flat GEP index must fit within 32 bits");
Value *FlattenedIndex = Builder.getInt32(ConstantOffset);
for (auto [VarIndex, Multiplier] : Info.VariableOffsets) {
assert(Multiplier.getActiveBits() <= 32 &&
Member:

Isn't there a 32-bit constant we can use?

Contributor Author:

IMO, Multiplier.getActiveBits() <= 32 already makes it clear enough that it tests whether the value fits in an unsigned 32-bit integer.

"The multiplier for a flat GEP index must fit within 32 bits");
assert(VarIndex->getType()->isIntegerTy(32) &&
"Expected i32-typed GEP indices");
Value *VI;
if (Multiplier.getZExtValue() % BytesPerElem != 0) {
// This can happen, e.g., with i8 GEPs. To handle this we just divide
// by BytesPerElem using an instruction after multiplying VarIndex by
// Multiplier.
VI = Builder.CreateMul(VarIndex,
Builder.getInt32(Multiplier.getZExtValue()));
VI = Builder.CreateLShr(VI, Builder.getInt32(Log2_32(BytesPerElem)));
} else
VI = Builder.CreateMul(
VarIndex,
Builder.getInt32(Multiplier.getZExtValue() / BytesPerElem));
FlattenedIndex = Builder.CreateAdd(FlattenedIndex, VI);
}

unsigned GEPChainUseCount = 0;
recursivelyCollectGEPs(GEP, FlattenedArrayType, PtrOperand, GEPChainUseCount);

// NOTE: hasNUses(0) is not the same as GEPChainUseCount == 0.
// Here recursion is used to get the length of the GEP chain.
// Handle zero uses here because there won't be an update via
// a child in the chain later.
if (GEPChainUseCount == 0) {
SmallVector<Value *> Indices;
SmallVector<uint64_t> Dims;
bool AllIndicesAreConstInt = true;

// Collect indices and dimensions from the GEP
collectIndicesAndDimsFromGEP(GEP, Indices, Dims, AllIndicesAreConstInt);
GEPData GEPInfo{std::move(FlattenedArrayType), PtrOperand,
std::move(Indices), std::move(Dims), AllIndicesAreConstInt};
return visitGetElementPtrInstInGEPChainBase(GEPInfo, GEP);
// Construct a new GEP for the flattened array to replace the current GEP
Value *NewGEP = Builder.CreateGEP(
Info.RootFlattenedArrayType, Info.RootPointerOperand,
{ZeroIndex, FlattenedIndex}, GEP.getName(), GEP.getNoWrapFlags());

// Replace the current GEP with the new GEP. Store GEPInfo into the map
// for later use in case this GEP was not the end of the chain
GEPChainInfoMap.insert({cast<GEPOperator>(NewGEP), std::move(Info)});
GEP.replaceAllUsesWith(NewGEP);
GEP.eraseFromParent();
return true;
}

// This GEP is potentially dead at the end of the pass since it may not have
// any users anymore after GEP chains have been collapsed. We still store its
// GEPInfo so that GEPs down the chain can use it to compute their indices.
GEPChainInfoMap.insert({cast<GEPOperator>(&GEP), std::move(Info)});
PotentiallyDeadInstrs.emplace_back(&GEP);
return false;
}
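
The arithmetic in the new `visitGetElementPtrInst` can be followed with plain integers: a GEP's byte offsets are merged with its parent's, and the total is divided by the element size to get one index into the flattened array. This is only a sketch under stated assumptions. `ByteOffsets`, `mergeWithParent`, and `flatIndex` are hypothetical names, the variable indices are concrete numbers rather than IR values, and 32-bit overflow is ignored.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical mirror of the constant and variable byte offsets kept per GEP,
// using plain integers in place of APInt and Value *.
struct ByteOffsets {
  uint32_t Constant = 0;
  // Pairs of (runtime index value, byte multiplier).
  std::vector<std::pair<uint32_t, uint32_t>> Variable;
};

inline bool isPowerOf2(uint32_t X) { return X != 0 && (X & (X - 1)) == 0; }

// Floor of log2 for a non-zero value.
inline unsigned log2Floor(uint32_t X) {
  unsigned L = 0;
  while (X >>= 1)
    ++L;
  return L;
}

// Collapse one link of a chain: the child's offsets plus its parent's.
inline ByteOffsets mergeWithParent(ByteOffsets Child,
                                   const ByteOffsets &Parent) {
  Child.Constant += Parent.Constant;
  Child.Variable.insert(Child.Variable.end(), Parent.Variable.begin(),
                        Parent.Variable.end());
  return Child;
}

// Convert accumulated byte offsets into an element index of the flat array.
inline uint32_t flatIndex(const ByteOffsets &Off, uint32_t BytesPerElem) {
  assert(isPowerOf2(BytesPerElem) && "element size must be a power of 2");
  uint32_t Index = Off.Constant / BytesPerElem;
  for (const auto &P : Off.Variable) {
    uint32_t Var = P.first, Mult = P.second;
    if (Mult % BytesPerElem != 0)
      // For example a byte-granular (i8) GEP: multiply first, then shift.
      Index += (Var * Mult) >> log2Floor(BytesPerElem);
    else
      Index += Var * (Mult / BytesPerElem);
  }
  return Index;
}
```

For a `[2 x [3 x i32]]` array, a parent GEP that selects row 1 contributes a variable offset with multiplier 12, and a child GEP that selects column 2 contributes a constant offset of 8. Merging the two and dividing by the 4-byte element size gives flat index 1 * 3 + 2 = 5.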
@@ -456,9 +457,9 @@ flattenGlobalArrays(Module &M,

static bool flattenArrays(Module &M) {
bool MadeChange = false;
DXILFlattenArraysVisitor Impl;
DenseMap<GlobalVariable *, GlobalVariable *> GlobalMap;
flattenGlobalArrays(M, GlobalMap);
DXILFlattenArraysVisitor Impl(GlobalMap);
for (auto &F : make_early_inc_range(M.functions())) {
if (F.isDeclaration())
continue;