[SHT_LLVM_BB_ADDR_MAP] Emit callsite offsets in the SHT_LLVM_BB_ADDR_MAP section. #146563

Open: wants to merge 8 commits into base: main
60 changes: 51 additions & 9 deletions llvm/docs/Extensions.rst
@@ -405,31 +405,73 @@ This section is emitted with ``-basic-block-address-map`` and will contain
a BB address map table for every function.

The ``SHT_LLVM_BB_ADDR_MAP`` type provides backward compatibility to allow
reading older versions of the BB address map generated by older compilers. Each
function entry starts with a version byte which specifies the encoding version
to use. The following versioning schemes are currently supported.
reading older versions of the BB address map generated by older compilers (up to
two years old). Each function entry starts with a version byte which specifies
the encoding version to use. This is followed by a feature byte which specifies
the features specific to this particular entry. The function base address is
stored as a full address. Other addresses in the entry (block begin and end
addresses and callsite addresses) are stored in a running-offset fashion, as
offsets relative to prior addresses.

Version 1 (newest): basic block address offsets are computed relative to the end
of previous blocks.
The following versioning schemes are currently supported (newer versions support
features of the older versions).

Version 3 (newest): Capable of encoding callsite offsets. Enabled by the 6th bit
Review comment (Contributor): This is the fifth bit? Three for the PGO Analysis Map Features, one for split functions, and then one for this. That should leave three unused?

of the feature byte.

Example:

.. code-block:: gas

.section ".llvm_bb_addr_map","",@llvm_bb_addr_map
.byte 1 # version number
.byte 0 # feature byte (reserved for future use)
.byte 3 # version number
.byte 32 # feature byte
.quad .Lfunc_begin0 # address of the function
.byte 2 # number of basic blocks
# BB record for BB_0
.uleb128 .Lfunc_beign0-.Lfunc_begin0 # BB_0 offset relative to function entry (always zero)
.byte 0 # BB_0 ID
.uleb128 .Lfunc_begin0-.Lfunc_begin0 # BB_0 offset relative to function entry (always zero)
.byte 0 # number of callsites in this block
.uleb128 .LBB_END0_0-.Lfunc_begin0 # BB_0 size
.byte x # BB_0 metadata
# BB record for BB_1
.byte 1 # BB_1 ID
.uleb128 .LBB0_1-.LBB_END0_0 # BB_1 offset relative to the end of last block (BB_0).
.uleb128 .LBB_END0_1-.LBB0_1 # BB_1 size
.byte 2 # number of callsites in this block
.uleb128 .LBB0_1_CS0-.LBB0_1 # offset of callsite relative to the previous offset (.LBB0_1)
.uleb128 .LBB0_1_CS1-.LBB0_1_CS0 # offset of callsite relative to the previous offset (.LBB0_1_CS0)
.uleb128 .LBB_END0_1-.LBB0_1_CS1 # BB_1 size offset (Offset of the block end relative to the previous offset).
.byte y # BB_1 metadata
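
The running-offset layout of a version 3 block record can be illustrated from the reader's side. The following is a minimal sketch, not LLVM's decoder: `DecodedBlock`, `readULEB128` and `decodeBlock` are hypothetical names, and the sketch assumes that every field is read as ULEB128 (values below 128 occupy a single byte, so they match the `.byte` fields in the example above), that the callsite feature is enabled, and that the input is well formed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, illustrative type; not LLVM's object::BBAddrMap API.
struct DecodedBlock {
  uint64_t ID = 0;
  uint64_t Offset = 0;                   // gap from the previous block's end
  std::vector<uint64_t> CallsiteOffsets; // relative to the block start
  uint64_t Size = 0;                     // total block size
  uint64_t Metadata = 0;
};

// Reads one unsigned LEB128 value from Data at Pos and advances Pos.
// Assumes well-formed input; a real reader would report truncation.
inline uint64_t readULEB128(const std::vector<uint8_t> &Data, size_t &Pos) {
  uint64_t Value = 0;
  unsigned Shift = 0;
  while (Pos < Data.size()) {
    uint8_t Byte = Data[Pos++];
    if (Shift < 64)
      Value |= static_cast<uint64_t>(Byte & 0x7f) << Shift;
    if ((Byte & 0x80) == 0)
      break;
    Shift += 7;
  }
  return Value;
}

// Decodes one block record laid out as: ID, offset, callsite count,
// one delta per callsite, delta to the block end, metadata.
inline DecodedBlock decodeBlock(const std::vector<uint8_t> &Data, size_t &Pos) {
  DecodedBlock BB;
  BB.ID = readULEB128(Data, Pos);
  BB.Offset = readULEB128(Data, Pos);
  uint64_t NumCallsites = readULEB128(Data, Pos);
  uint64_t Running = 0; // distance from the block start to the last label
  for (uint64_t I = 0; I < NumCallsites; ++I) {
    Running += readULEB128(Data, Pos); // each delta is relative to the last
    BB.CallsiteOffsets.push_back(Running);
  }
  // The final delta runs from the last callsite (or the block start) to the
  // block end, so adding it to the running total gives the block size.
  BB.Size = Running + readULEB128(Data, Pos);
  BB.Metadata = readULEB128(Data, Pos);
  return BB;
}
```

For a record with two callsites at deltas 4 and 6 and a final delta of 3, the sketch yields callsite offsets 4 and 10 and a block size of 13.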

Version 2: Capable of encoding split functions. Enabled by the 4th bit of the
feature byte. The base address of each split range is stored as a full address.
The first range corresponds to the function entry.

Example:

.. code-block:: gas

.section ".llvm_bb_addr_map","",@llvm_bb_addr_map
.byte 2 # version number
.byte 8 # feature byte
.byte 2 # number of basic block ranges
# 1st BB range (corresponding to the function entry)
.quad .Lfunc_begin0 # base address
.byte 1 # number of basic blocks in this range
# BB record for BB_0
.byte 0 # BB_0 ID
.uleb128 .Lfunc_begin0-.Lfunc_begin0 # BB_0 offset relative to function entry (always zero)
.uleb128 .LBB_END0_0-.Lfunc_begin0 # BB_0 size
.byte x # BB_0 metadata
# 2nd BB range
.quad func.part.1
.byte 1 # number of basic blocks in this range
# BB record for BB_1
.byte 1 # BB_1 ID
.uleb128 func.part.1-func.part.1 # BB_1 offset relative to the range begin (always zero)
.uleb128 .LBB_END0_1-func.part.1 # BB_1 size
.byte 1 # BB_1 metadata

PGO Analysis Map
""""""""""""""""

11 changes: 11 additions & 0 deletions llvm/include/llvm/CodeGen/AsmPrinter.h
@@ -135,6 +135,13 @@ class LLVM_ABI AsmPrinter : public MachineFunctionPass {
/// default, this is equal to CurrentFnSym.
MCSymbol *CurrentFnSymForSize = nullptr;

/// Vector of symbols marking the position of callsites in the current
/// function, keyed by their containing basic block.
/// The callsite symbols of each block are stored in the order they appear
/// in that block.
DenseMap<const MachineBasicBlock *, SmallVector<MCSymbol *, 1>>
CurrentFnCallsiteSymbols;

/// Provides the profile information for constants.
const StaticDataProfileInfo *SDPI = nullptr;

@@ -295,6 +302,10 @@ class LLVM_ABI AsmPrinter : public MachineFunctionPass {
/// to emit them as well, return the whole set.
ArrayRef<MCSymbol *> getAddrLabelSymbolToEmit(const BasicBlock *BB);

/// Creates a new symbol to be used for the beginning of a callsite at the
/// specified basic block.
MCSymbol *createCallsiteSymbol(const MachineBasicBlock &MBB);

/// If the specified function has had any references to address-taken blocks
/// generated, but the block got deleted, return the symbol now so we can
/// emit it. This prevents emitting a reference to a symbol that has no
4 changes: 2 additions & 2 deletions llvm/include/llvm/MC/MCContext.h
@@ -175,8 +175,8 @@ class MCContext {
/// for the LocalLabelVal and adds it to the map if needed.
unsigned GetInstance(unsigned LocalLabelVal);

/// LLVM_BB_ADDR_MAP version to emit.
uint8_t BBAddrMapVersion = 2;
/// SHT_LLVM_BB_ADDR_MAP version to emit.
uint8_t BBAddrMapVersion = 3;

/// The file name of the log file from the environment variable
/// AS_SECURE_LOG_FILE. Which must be set before the .secure_log_unique
4 changes: 2 additions & 2 deletions llvm/include/llvm/Object/ELFTypes.h
@@ -917,8 +917,8 @@ struct BBAddrMap {
uint32_t Size = 0; // Size of the basic block.
Metadata MD = {false, false, false, false,
false}; // Metdata for this basic block.
// Offsets of callsites (end of call instructions), relative to the basic
// block start.
// Offsets of callsites (beginning of call instructions), relative to the
// basic block start.
SmallVector<uint32_t, 1> CallsiteOffsets;

BBEntry(uint32_t ID, uint32_t Offset, uint32_t Size, Metadata MD,
38 changes: 31 additions & 7 deletions llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
@@ -1391,7 +1391,8 @@ static uint32_t getBBAddrMapMetadata(const MachineBasicBlock &MBB) {
}

static llvm::object::BBAddrMap::Features
getBBAddrMapFeature(const MachineFunction &MF, int NumMBBSectionRanges) {
getBBAddrMapFeature(const MachineFunction &MF, int NumMBBSectionRanges,
bool HasCalls) {
// Ensure that the user has not passed in additional options while also
// specifying all or none.
if ((PgoAnalysisMapFeatures.isSet(PGOMapFeaturesEnum::None) ||
@@ -1424,13 +1425,14 @@ getBBAddrMapFeature(const MachineFunction &MF, int NumMBBSectionRanges) {
BrProbEnabled,
MF.hasBBSections() && NumMBBSectionRanges > 1,
static_cast<bool>(BBAddrMapSkipEmitBBEntries),
false};
HasCalls};
}

void AsmPrinter::emitBBAddrMapSection(const MachineFunction &MF) {
MCSection *BBAddrMapSection =
getObjFileLowering().getBBAddrMapSection(*MF.getSection());
assert(BBAddrMapSection && ".llvm_bb_addr_map section is not initialized.");
bool HasCalls = !CurrentFnCallsiteSymbols.empty();

const MCSymbol *FunctionSymbol = getFunctionBegin();

@@ -1440,7 +1442,7 @@ void AsmPrinter::emitBBAddrMapSection(const MachineFunction &MF) {
uint8_t BBAddrMapVersion = OutStreamer->getContext().getBBAddrMapVersion();
OutStreamer->emitInt8(BBAddrMapVersion);
OutStreamer->AddComment("feature");
auto Features = getBBAddrMapFeature(MF, MBBSectionRanges.size());
auto Features = getBBAddrMapFeature(MF, MBBSectionRanges.size(), HasCalls);
OutStreamer->emitInt8(Features.encode());
// Emit BB Information for each basic block in the function.
if (Features.MultiBBRange) {
@@ -1493,13 +1495,24 @@ void AsmPrinter::emitBBAddrMapSection(const MachineFunction &MF) {
// Emit the basic block offset relative to the end of the previous block.
// This is zero unless the block is padded due to alignment.
emitLabelDifferenceAsULEB128(MBBSymbol, PrevMBBEndSymbol);
// Emit the basic block size. When BBs have alignments, their size cannot
// always be computed from their offsets.
emitLabelDifferenceAsULEB128(MBB.getEndSymbol(), MBBSymbol);
const MCSymbol *CurrentLabel = MBBSymbol;
if (HasCalls) {
const SmallVectorImpl<MCSymbol *> &CallsiteSymbols =
CurrentFnCallsiteSymbols.lookup(&MBB);
OutStreamer->AddComment("number of callsites");
OutStreamer->emitULEB128IntValue(CallsiteSymbols.size());
for (const MCSymbol *CallsiteSymbol : CallsiteSymbols) {
// Emit the callsite offset.
emitLabelDifferenceAsULEB128(CallsiteSymbol, CurrentLabel);
CurrentLabel = CallsiteSymbol;
}
}
// Emit the offset to the end of the block, which can be used to compute
// the total block size.
emitLabelDifferenceAsULEB128(MBB.getEndSymbol(), CurrentLabel);
// Emit the Metadata.
OutStreamer->emitULEB128IntValue(getBBAddrMapMetadata(MBB));
}

PrevMBBEndSymbol = MBB.getEndSymbol();
}
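
The emission loop in this hunk writes every label as a distance from the previously emitted label, starting at the block's begin label and finishing with the distance to the block's end label. The following is a minimal sketch of that scheme using plain integer addresses in place of `MCSymbol` labels; `appendULEB128` and `encodeCallsitesAndEnd` are hypothetical helpers and are not part of LLVM.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Appends Value to Out as unsigned LEB128.
inline void appendULEB128(uint64_t Value, std::vector<uint8_t> &Out) {
  do {
    uint8_t Byte = static_cast<uint8_t>(Value & 0x7f);
    Value >>= 7;
    if (Value != 0)
      Byte |= 0x80; // more bytes follow
    Out.push_back(Byte);
  } while (Value != 0);
}

// Hypothetical helper: encodes the callsite count, one delta per callsite and
// the final delta to the block end. Callsites must be sorted and must lie
// within [BlockBegin, BlockEnd].
inline std::vector<uint8_t>
encodeCallsitesAndEnd(uint64_t BlockBegin,
                      const std::vector<uint64_t> &Callsites,
                      uint64_t BlockEnd) {
  std::vector<uint8_t> Out;
  appendULEB128(Callsites.size(), Out); // number of callsites
  uint64_t Current = BlockBegin;        // plays the role of CurrentLabel
  for (uint64_t Callsite : Callsites) {
    appendULEB128(Callsite - Current, Out); // delta from the previous label
    Current = Callsite;
  }
  appendULEB128(BlockEnd - Current, Out); // delta to the block end
  return Out;
}
```

With no callsites the final delta equals the block size, which matches the pre-existing encoding apart from the added zero count.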

@@ -1828,6 +1841,8 @@ void AsmPrinter::emitFunctionBody() {
!MI.isDebugInstr()) {
HasAnyRealCode = true;
}
if (MI.isCall() && MF->getTarget().Options.BBAddrMap)
OutStreamer->emitLabel(createCallsiteSymbol(MBB));

// If there is a pre-instruction symbol, emit a label for it here.
if (MCSymbol *S = MI.getPreInstrSymbol())
@@ -2775,6 +2790,14 @@ MCSymbol *AsmPrinter::getMBBExceptionSym(const MachineBasicBlock &MBB) {
return Res.first->second;
}

MCSymbol *AsmPrinter::createCallsiteSymbol(const MachineBasicBlock &MBB) {
MCContext &Ctx = MF->getContext();
MCSymbol *Sym = Ctx.createTempSymbol("BB" + Twine(MF->getFunctionNumber()) +
"_" + Twine(MBB.getNumber()) + "_CS");
CurrentFnCallsiteSymbols[&MBB].push_back(Sym);
return Sym;
}
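
The name pattern passed to `createTempSymbol` explains labels such as `.LBB0_1_CS0` and `.LBB0_1_CS1` in the test expectations. The following sketch imitates that naming only; `TempSymbolNamer` and `createCallsiteName` are hypothetical, and the `.L` prefix and per-name counter are assumptions inferred from the labels in the tests rather than a description of `MCContext` internals.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a context that hands out unique temporary symbol
// names. Assumption: each distinct base name has its own counter, and ".L"
// is the assembler-local prefix.
class TempSymbolNamer {
  std::map<std::string, unsigned> NextID;

public:
  std::string create(const std::string &Base) {
    return ".L" + Base + std::to_string(NextID[Base]++);
  }
};

// Builds a callsite label name following the pattern used in
// createCallsiteSymbol: "BB<function number>_<block number>_CS".
inline std::string createCallsiteName(TempSymbolNamer &Namer,
                                      unsigned FunctionNumber,
                                      unsigned BlockNumber) {
  return Namer.create("BB" + std::to_string(FunctionNumber) + "_" +
                      std::to_string(BlockNumber) + "_CS");
}
```

Under these assumptions, two calls in block 1 of function 0 produce `.LBB0_1_CS0` and `.LBB0_1_CS1`, and the first call in block 2 starts again at `.LBB0_2_CS0`.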

void AsmPrinter::SetupMachineFunction(MachineFunction &MF) {
this->MF = &MF;
const Function &F = MF.getFunction();
@@ -2809,6 +2832,7 @@ void AsmPrinter::SetupMachineFunction(MachineFunction &MF) {
CurrentFnBegin = nullptr;
CurrentFnBeginLocal = nullptr;
CurrentSectionBeginSym = nullptr;
CurrentFnCallsiteSymbols.clear();
MBBSectionRanges.clear();
MBBSectionExceptionSyms.clear();
bool NeedsLocalForSize = MAI->needsLocalForSize();
@@ -19,7 +19,7 @@ entry:
; CHECK: func:
; CHECK: .Lfunc_begin1:
; CHECK: .section .llvm_bb_addr_map,"o",@llvm_bb_addr_map,.text{{$}}
; CHECK-NEXT: .byte 2 # version
; CHECK-NEXT: .byte 3 # version
; BASIC-NEXT: .byte 0 # feature
; PGO-NEXT: .byte 3 # feature
; CHECK-NEXT: .quad .Lfunc_begin1 # function address
@@ -10,7 +10,7 @@ define dso_local i32 @_Z3barv() {
; CHECK-LABEL: _Z3barv:
; CHECK-NEXT: [[BAR_BEGIN:.Lfunc_begin[0-9]+]]:
; CHECK: .section .llvm_bb_addr_map,"o",@llvm_bb_addr_map,.text._Z3barv{{$}}
; CHECK-NEXT: .byte 2 # version
; CHECK-NEXT: .byte 3 # version
; CHECK-NEXT: .byte 0 # feature
; CHECK-NEXT: .quad [[BAR_BEGIN]] # function address

@@ -23,8 +23,8 @@ define dso_local i32 @_Z3foov() {
; CHECK-LABEL: _Z3foov:
; CHECK-NEXT: [[FOO_BEGIN:.Lfunc_begin[0-9]+]]:
; CHECK: .section .llvm_bb_addr_map,"o",@llvm_bb_addr_map,.text._Z3foov{{$}}
; CHECK-NEXT: .byte 2 # version
; CHECK-NEXT: .byte 0 # feature
; CHECK-NEXT: .byte 3 # version
; CHECK-NEXT: .byte 32 # feature
; CHECK-NEXT: .quad [[FOO_BEGIN]] # function address


@@ -36,6 +36,6 @@ define linkonce_odr dso_local i32 @_Z4fooTIiET_v() comdat {
; CHECK-LABEL: _Z4fooTIiET_v:
; CHECK-NEXT: [[FOOCOMDAT_BEGIN:.Lfunc_begin[0-9]+]]:
; CHECK: .section .llvm_bb_addr_map,"oG",@llvm_bb_addr_map,.text._Z4fooTIiET_v,_Z4fooTIiET_v,comdat{{$}}
; CHECK-NEXT: .byte 2 # version
; CHECK-NEXT: .byte 3 # version
; CHECK-NEXT: .byte 0 # feature
; CHECK-NEXT: .quad [[FOOCOMDAT_BEGIN]] # function address
26 changes: 17 additions & 9 deletions llvm/test/CodeGen/X86/basic-block-address-map-pgo-features.ll
@@ -69,36 +69,44 @@ declare i32 @__gxx_personality_v0(...)
; CHECK-LABEL: .Lfunc_end0:

; CHECK: .section .llvm_bb_addr_map,"o",@llvm_bb_addr_map,.text._Z3bazb{{$}}
; CHECK-NEXT: .byte 2 # version
; BASIC-NEXT: .byte 0 # feature
; PGO-ALL-NEXT: .byte 7 # feature
; FEC-ONLY-NEXT:.byte 1 # feature
; BBF-ONLY-NEXT:.byte 2 # feature
; BRP-ONLY-NEXT:.byte 4 # feature
; CHECK-NEXT: .byte 3 # version
; BASIC-NEXT: .byte 32 # feature
; PGO-ALL-NEXT: .byte 39 # feature
; FEC-ONLY-NEXT:.byte 33 # feature
; BBF-ONLY-NEXT:.byte 34 # feature
; BRP-ONLY-NEXT:.byte 36 # feature
; CHECK-NEXT: .quad .Lfunc_begin0 # function address
; CHECK-NEXT: .byte 6 # number of basic blocks
; CHECK-NEXT: .byte 0 # BB id
; CHECK-NEXT: .uleb128 .Lfunc_begin0-.Lfunc_begin0
; CHECK-NEXT: .byte 0 # number of callsites
; CHECK-NEXT: .uleb128 .LBB_END0_0-.Lfunc_begin0
; CHECK-NEXT: .byte 8
; CHECK-NEXT: .byte 1 # BB id
; CHECK-NEXT: .uleb128 .LBB0_1-.LBB_END0_0
; CHECK-NEXT: .uleb128 .LBB_END0_1-.LBB0_1
; CHECK-NEXT: .byte 1 # number of callsites
; CHECK-NEXT: .uleb128 .LBB0_1_CS0-.LBB0_1
; CHECK-NEXT: .uleb128 .LBB_END0_1-.LBB0_1_CS0
; CHECK-NEXT: .byte 8
; CHECK-NEXT: .byte 3 # BB id
; CHECK-NEXT: .uleb128 .LBB0_2-.LBB_END0_1
; CHECK-NEXT: .uleb128 .LBB_END0_2-.LBB0_2
; CHECK-NEXT: .byte 1 # number of callsites
; CHECK-NEXT: .uleb128 .LBB0_2_CS0-.LBB0_2
; CHECK-NEXT: .uleb128 .LBB_END0_2-.LBB0_2_CS0
; CHECK-NEXT: .byte 8
; CHECK-NEXT: .byte 5 # BB id
; CHECK-NEXT: .uleb128 .LBB0_3-.LBB_END0_2
; CHECK-NEXT: .byte 0 # number of callsites
; CHECK-NEXT: .uleb128 .LBB_END0_3-.LBB0_3
; CHECK-NEXT: .byte 1
; CHECK-NEXT: .byte 4 # BB id
; CHECK-NEXT: .uleb128 .LBB0_4-.LBB_END0_3
; CHECK-NEXT: .byte 0 # number of callsites
; CHECK-NEXT: .uleb128 .LBB_END0_4-.LBB0_4
; CHECK-NEXT: .byte 16
; CHECK-NEXT: .byte 2 # BB id
; CHECK-NEXT: .uleb128 .LBB0_5-.LBB_END0_4
; CHECK-NEXT: .byte 0 # number of callsites
; CHECK-NEXT: .uleb128 .LBB_END0_5-.LBB0_5
; CHECK-NEXT: .byte 4

@@ -138,7 +146,7 @@ declare i32 @__gxx_personality_v0(...)
; PGO-BRP-NEXT: .byte 5 # successor BB ID
; PGO-BRP-NEXT: .ascii "\200\200\200\200\b" # successor branch probability

; SKIP-BB-ENTRIES: .byte 17 # feature
; SKIP-BB-ENTRIES: .byte 49 # feature
; SKIP-BB-ENTRIES-NEXT: .quad .Lfunc_begin0 # function address
; SKIP-BB-ENTRIES-NEXT: .byte 6 # number of basic blocks
; SKIP-BB-ENTRIES-NEXT: .byte 100 # function entry count
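
The feature bytes in these expectations are sums of single-bit flags: 1, 2 and 4 for the three PGO analysis options, 8 for multiple basic block ranges, 16 for skipped BB entries, and 32 for callsite offsets. The value 32 is `1 << 5`, which is bit index 5 when counting from zero and the sixth bit when counting from one. The following is a minimal sketch of that mapping; the struct and function names are illustrative, and the bit values are inferred from the test values shown here rather than copied from LLVM's headers.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative flag set; field names are hypothetical.
struct FeatureFlags {
  bool FuncEntryCount = false;  // value 1
  bool BBFreq = false;          // value 2
  bool BrProb = false;          // value 4
  bool MultiBBRange = false;    // value 8
  bool SkipBBEntries = false;   // value 16
  bool CallsiteOffsets = false; // value 32
};

// Splits a feature byte into its individual flags.
inline FeatureFlags decodeFeatureByte(uint8_t Byte) {
  FeatureFlags F;
  F.FuncEntryCount = (Byte & (1u << 0)) != 0;
  F.BBFreq = (Byte & (1u << 1)) != 0;
  F.BrProb = (Byte & (1u << 2)) != 0;
  F.MultiBBRange = (Byte & (1u << 3)) != 0;
  F.SkipBBEntries = (Byte & (1u << 4)) != 0;
  F.CallsiteOffsets = (Byte & (1u << 5)) != 0;
  return F;
}

// Packs the flags back into a single feature byte.
inline uint8_t encodeFeatureByte(const FeatureFlags &F) {
  unsigned Byte = 0;
  Byte |= F.FuncEntryCount ? (1u << 0) : 0u;
  Byte |= F.BBFreq ? (1u << 1) : 0u;
  Byte |= F.BrProb ? (1u << 2) : 0u;
  Byte |= F.MultiBBRange ? (1u << 3) : 0u;
  Byte |= F.SkipBBEntries ? (1u << 4) : 0u;
  Byte |= F.CallsiteOffsets ? (1u << 5) : 0u;
  return static_cast<uint8_t>(Byte);
}
```

Read this way, the change from 17 to 49 in the `SKIP-BB-ENTRIES` check is the same flag set with the callsite bit (32) added.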
@@ -39,33 +39,41 @@ declare i32 @__gxx_personality_v0(...)
; CHECK-LABEL: .LBB_END0_1:
; CHECK: .section .text.split._Z3bazb,"ax",@progbits
; CHECK-LABEL: _Z3bazb.cold:
; CHECK-LABEL: .LBB0_2_CS0:
; CHECK-LABEL: .LBB_END0_2:
; CHECK-LABEL: .LBB0_3:
; CHECK-LABEL: .LBB0_3_CS0:
; CHECK-LABEL: .LBB_END0_3:
; CHECK-LABEL: .Lfunc_end0:

; CHECK: .section .llvm_bb_addr_map,"o",@llvm_bb_addr_map,.text.hot._Z3bazb
; CHECK-NEXT: .byte 2 # version
; CHECK-NEXT: .byte 8 # feature
; CHECK-NEXT: .byte 2 # number of basic block ranges
; CHECK-NEXT: .quad .Lfunc_begin0 # base address
; CHECK-NEXT: .byte 2 # number of basic blocks
; CHECK-NEXT: .byte 0 # BB id
; CHECK-NEXT: .uleb128 .Lfunc_begin0-.Lfunc_begin0
; CHECK-NEXT: .uleb128 .LBB_END0_0-.Lfunc_begin0
; CHECK-NEXT: .byte 0
; CHECK-NEXT: .byte 2 # BB id
; CHECK-NEXT: .uleb128 .LBB0_1-.LBB_END0_0
; CHECK-NEXT: .uleb128 .LBB_END0_1-.LBB0_1
; CHECK-NEXT: .byte 5
; CHECK-NEXT: .quad _Z3bazb.cold # base address
; CHECK-NEXT: .byte 2 # number of basic blocks
; CHECK-NEXT: .byte 1 # BB id
; CHECK-NEXT: .uleb128 _Z3bazb.cold-_Z3bazb.cold
; CHECK-NEXT: .uleb128 .LBB_END0_2-_Z3bazb.cold
; CHECK-NEXT: .byte 8
; CHECK-NEXT: .byte 3 # BB id
; CHECK-NEXT: .uleb128 .LBB0_3-.LBB_END0_2
; CHECK-NEXT: .uleb128 .LBB_END0_3-.LBB0_3
; CHECK-NEXT: .byte 3 # version
; CHECK-NEXT: .byte 40 # feature
; CHECK-NEXT: .byte 2 # number of basic block ranges
; CHECK-NEXT: .quad .Lfunc_begin0 # base address
; CHECK-NEXT: .byte 2 # number of basic blocks
; CHECK-NEXT: .byte 0 # BB id
; CHECK-NEXT: .uleb128 .Lfunc_begin0-.Lfunc_begin0
; CHECK-NEXT: .byte 0 # number of callsites
; CHECK-NEXT: .uleb128 .LBB_END0_0-.Lfunc_begin0
; CHECK-NEXT: .byte 0
; CHECK-NEXT: .byte 2 # BB id
; CHECK-NEXT: .uleb128 .LBB0_1-.LBB_END0_0
; CHECK-NEXT: .byte 0 # number of callsites
; CHECK-NEXT: .uleb128 .LBB_END0_1-.LBB0_1
; CHECK-NEXT: .byte 5
; CHECK-NEXT: .quad _Z3bazb.cold # base address
; CHECK-NEXT: .byte 2 # number of basic blocks
; CHECK-NEXT: .byte 1 # BB id
; CHECK-NEXT: .uleb128 _Z3bazb.cold-_Z3bazb.cold
; CHECK-NEXT: .byte 1 # number of callsites
; CHECK-NEXT: .uleb128 .LBB0_2_CS0-_Z3bazb.cold
; CHECK-NEXT: .uleb128 .LBB_END0_2-.LBB0_2_CS0
; CHECK-NEXT: .byte 8
; CHECK-NEXT: .byte 3 # BB id
; CHECK-NEXT: .uleb128 .LBB0_3-.LBB_END0_2
; CHECK-NEXT: .byte 1 # number of callsites
; CHECK-NEXT: .uleb128 .LBB0_3_CS0-.LBB0_3
; CHECK-NEXT: .uleb128 .LBB_END0_3-.LBB0_3_CS0
; CHECK-NEXT: .byte 1
