Encode type-/layout- vs. region-based alias info separately in LLVM IR #54395

@topolarity

Currently we redundantly encode region information in both !tbaa and !alias.scope metadata for LLVM.

We'd like to separate these so that TBAA is only used to encode the layout-/type-based non-aliasing information, and !alias.scope is used just for the region-based information.
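To make the intended split concrete, here is a minimal, self-contained sketch. It is a toy model and does not use LLVM's API: `TypeTree`, `Access`, and `Region` are hypothetical names, and it assumes the regions are mutually disjoint. The point it illustrates is that the type-based check and the region-based check are independent, and either one alone can prove that two accesses do not alias.

```cpp
#include <cassert>
#include <vector>

// Toy model, not LLVM's API: a type node is an index into a parent table
// (standing in for a !tbaa node), and a region is an enum value
// (standing in for an !alias.scope).
struct TypeTree {
    std::vector<int> parent;  // parent[i] == -1 marks the root

    int add(int parent_id) {
        assert(parent_id >= -1 && parent_id < static_cast<int>(parent.size()));
        parent.push_back(parent_id);
        return static_cast<int>(parent.size()) - 1;
    }

    // True if `anc` is `node` itself or one of its ancestors.
    bool is_ancestor(int anc, int node) const {
        for (int n = node; n != -1; n = parent[n])
            if (n == anc) return true;
        return false;
    }

    // Type-based query: two accesses may alias only if one node is an
    // ancestor of (or equal to) the other.
    bool may_alias(int a, int b) const {
        return is_ancestor(a, b) || is_ancestor(b, a);
    }
};

// Assumption: these four regions are pairwise disjoint.
enum class Region { GCFrame, Stack, Data, Const };

struct Access {
    int type_node;  // layout-/type-based information only
    Region region;  // region-based information only
};

// The two facts are consulted independently; either can rule out aliasing.
bool may_alias(const TypeTree &types, const Access &x, const Access &y) {
    if (x.region != y.region) return false;            // region-based
    return types.may_alias(x.type_node, y.type_node);  // type-based
}
```

With this split, two accesses that share a type node but live in different regions are still distinguished, without the type tree needing region nodes such as a "stack" or "const" entry.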

For reference, here's the existing TBAA hierarchy:

julia/src/codegen.cpp, lines 351 to 375 in 5f7bfc0:

struct jl_tbaacache_t {
    // type-based alias analysis nodes. Indentation of comments indicates hierarchy.
    MDNode *tbaa_root;     // Everything
    MDNode *tbaa_gcframe;    // GC frame
    // LLVM should have enough info for alias analysis of non-gcframe stack slot
    // this is mainly a place holder for `jl_cgval_t::tbaa`
    MDNode *tbaa_stack;      // stack slot
    MDNode *tbaa_unionselbyte;   // a selector byte in isbits Union struct fields
    MDNode *tbaa_data;       // Any user data that `pointerset/ref` are allowed to alias
    MDNode *tbaa_binding;        // jl_binding_t::value
    MDNode *tbaa_value;          // jl_value_t, that is not jl_array_t or jl_genericmemory_t
    MDNode *tbaa_mutab;              // mutable type
    MDNode *tbaa_datatype;               // datatype
    MDNode *tbaa_immut;              // immutable type
    MDNode *tbaa_ptrarraybuf;    // Data in an array of boxed values
    MDNode *tbaa_arraybuf;       // Data in an array of POD
    MDNode *tbaa_array;      // jl_array_t or jl_genericmemory_t
    MDNode *tbaa_arrayptr;       // The pointer inside a jl_array_t (to memoryref)
    MDNode *tbaa_arraysize;      // A size in a jl_array_t
    MDNode *tbaa_arrayselbyte;   // a selector byte in a isbits Union jl_genericmemory_t
    MDNode *tbaa_memoryptr;      // The pointer inside a jl_genericmemory_t
    MDNode *tbaa_memorylen;      // The length in a jl_genericmemory_t
    MDNode *tbaa_memoryown;      // The owner in a foreign jl_genericmemory_t
    MDNode *tbaa_const;      // Memory that is immutable by the time LLVM can see it
    bool initialized;

The first steps are probably to (1) refactor the code to stop using ::fromTBAA, so that the region information is passed around as a separate piece of aliasing-related information rather than derived from TBAA. Most likely that means creating a jl_aliasinfo_t much earlier and updating jl_cgval_t to carry it instead of just TBAA metadata. Then, (2) remove the region-like nodes tbaa_gcframe, tbaa_stack, tbaa_const, and tbaa_data from the TBAA hierarchy.
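As a rough illustration of step (1), the sketch below shows a value descriptor that carries its region alongside its type tag from the moment it is created, so nothing has to be recovered from TBAA later. This is a self-contained toy: `AliasInfo`, `CgVal`, `MemOpMetadata`, `make_stack_slot`, `make_heap_field`, and `decorate` are hypothetical stand-ins, not the real `jl_aliasinfo_t` or `jl_cgval_t`, which hold LLVM metadata nodes.

```cpp
#include <cassert>
#include <string>

// Assumption: a small fixed set of regions, named after the nodes in the issue.
enum class Region { GCFrame, Stack, Data, Const };

// Hypothetical replacement for carrying a bare TBAA node on each value.
struct AliasInfo {
    std::string tbaa;  // type/layout tag only, e.g. "tbaa_immut"
    Region region;     // region, chosen where the value is produced
};

// Hypothetical value descriptor carrying the full alias info.
struct CgVal {
    AliasInfo alias;
};

// Stand-in for the metadata attached to an emitted load or store.
struct MemOpMetadata {
    std::string tbaa;
    Region scope;
};

// Producers decide the region up front instead of encoding it in the type tag.
CgVal make_stack_slot(const std::string &type_tag) {
    assert(!type_tag.empty());
    return CgVal{AliasInfo{type_tag, Region::Stack}};
}

CgVal make_heap_field(const std::string &type_tag) {
    assert(!type_tag.empty());
    return CgVal{AliasInfo{type_tag, Region::Data}};
}

// Consumers copy both pieces through; no region is inferred from the type tag.
MemOpMetadata decorate(const CgVal &v) {
    return MemOpMetadata{v.alias.tbaa, v.alias.region};
}
```

In this model a stack slot and a heap field of the same immutable type share one type tag and differ only in region, which is the separation step (2) relies on before the region-like nodes can be dropped from the type tree.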

Afterwards, an excellent follow-up would be to expand the TBAA hierarchy to encode a much broader set of types, ideally including user-defined structs.
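One possible shape for that follow-up is sketched below as a self-contained toy, not as the actual design. It assumes each user-defined struct gets its own node, created lazily and parented under the mutable or immutable node; the `TbaaCache` class, `node_for_struct`, and the `"tbaa_" + name` naming scheme are all hypothetical. Whether sibling struct nodes are actually sound in the presence of unsafe pointer operations is a question this sketch does not address.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

struct TbaaNode {
    std::string name;
    int parent;  // index into the node table, -1 for the root
};

class TbaaCache {
public:
    TbaaCache() {
        // A slice of the hierarchy from the issue: root > data > value > {mutab, immut}.
        root_ = add("tbaa_root", -1);
        data_ = add("tbaa_data", root_);
        value_ = add("tbaa_value", data_);
        mutab_ = add("tbaa_mutab", value_);
        immut_ = add("tbaa_immut", value_);
    }

    // Returns the node for a user-defined struct, creating it on first use.
    int node_for_struct(const std::string &type_name, bool is_mutable) {
        assert(!type_name.empty());
        auto it = by_type_.find(type_name);
        if (it != by_type_.end()) return it->second;
        int id = add("tbaa_" + type_name, is_mutable ? mutab_ : immut_);
        by_type_[type_name] = id;
        return id;
    }

    // Two nodes may alias only if one is an ancestor of (or equal to) the other.
    bool may_alias(int a, int b) const {
        return is_ancestor(a, b) || is_ancestor(b, a);
    }

    int mutab() const { return mutab_; }

private:
    int add(const std::string &name, int parent) {
        nodes_.push_back(TbaaNode{name, parent});
        return static_cast<int>(nodes_.size()) - 1;
    }

    bool is_ancestor(int anc, int node) const {
        for (int n = node; n != -1; n = nodes_[n].parent)
            if (n == anc) return true;
        return false;
    }

    std::vector<TbaaNode> nodes_;
    std::map<std::string, int> by_type_;
    int root_, data_, value_, mutab_, immut_;
};
```

Under these assumptions, accesses to two distinct mutable struct types land on sibling nodes and are reported as non-aliasing, while an access tagged with the generic mutable node may still alias either of them.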

Labels: compiler:codegen (Generation of LLVM IR and native code), compiler:llvm (For issues that relate to LLVM), help wanted (Indicates that a maintainer wants help on an issue or pull request)
