Commit abdbaff

[DWARFLinker] Adjust DW_AT_LLVM_stmt_sequence for rewritten line tables (llvm#128953)
**Summary:** This update adds handling for `DW_AT_LLVM_stmt_sequence` attributes in the DWARF linker. These attributes point to rows in the line table, which is rewritten during linking. Because the row positions change, the offsets stored in these attributes must be updated to match the new layout of the output `.debug_line` section. The changes add new data structures and adjust existing functions to track and fix up these attributes.

**Background:** In llvm#110192 we added support to clang for generating the `DW_AT_LLVM_stmt_sequence` attribute on `DW_TAG_subprogram` DIEs. Corresponding RFC: [New DWARF Attribute for Symbolication of Merged Functions](https://discourse.llvm.org/t/rfc-new-dwarf-attribute-for-symbolication-of-merged-functions/79434). This attribute holds a label pointing to the offset in the line table where the function's line entries begin.

**Implementation details:** Here is what changed in the code:

- **New Tracking in `CompileUnit`:** A `StmtSeqListAttributes` vector is added to the `CompileUnit` class. It stores the locations where `DW_AT_LLVM_stmt_sequence` attributes need to be patched, recorded when cloning DIEs (debug info entries).
- **Updated `emitLineTableForUnit` Function:** This function now takes an optional `RowOffsets` parameter. When provided, it collects the byte offset of each row in the output line table. This is only needed if `DW_AT_LLVM_stmt_sequence` attributes are present in the unit.
- **Row Tracking with `TrackedRow`:** A `TrackedRow` struct keeps track of each input row's original index and whether it starts a sequence in the output table. This links old rows to their new positions in the rewritten line table. Several implementations were considered and prototyped here, but so far this has proven the simplest and cleanest approach.
- **Patching Step:** After the line table is written, the linker uses the `TrackedRow` objects and the `RowOffsets` array to update the `DW_AT_LLVM_stmt_sequence` attributes with the correct offsets.
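
The patching step described above amounts to a three-stage lookup: original section offset, then original row index, then new row index, then new section offset. The following is a minimal, self-contained sketch of that chain using standard containers rather than LLVM's `DenseMap`. The function name `remapStmtSequence` and the constant `InvalidOffset` are hypothetical and do not appear in the commit. One deliberate difference: the commit asserts that the original offset is present in the sequence map, while this sketch returns the sentinel in that case as well.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical name for the "invalid offset" marker (the commit uses UINT64_MAX).
constexpr uint64_t InvalidOffset = UINT64_MAX;

// Translate an attribute's offset in the input .debug_line section into the
// offset of the same row in the rewritten output section.
uint64_t remapStmtSequence(
    uint64_t OrigOffset,
    const std::unordered_map<uint64_t, size_t> &SeqOffToOrigRow,
    const std::unordered_map<size_t, size_t> &OrigRowToNewRow,
    const std::vector<uint64_t> &OutputRowOffsets) {
  // 1. Original offset -> index of the first row of that input sequence.
  auto OrigRow = SeqOffToOrigRow.find(OrigOffset);
  if (OrigRow == SeqOffToOrigRow.end())
    return InvalidOffset; // Assumption: the commit asserts here instead.

  // 2. Original row index -> row index in the output table. A miss means the
  //    row was dropped during linking (for example, a dead-stripped function).
  auto NewRow = OrigRowToNewRow.find(OrigRow->second);
  if (NewRow == OrigRowToNewRow.end())
    return InvalidOffset;

  // 3. Output row index -> byte offset recorded while emitting the table.
  return OutputRowOffsets.at(NewRow->second);
}
```

A row that survives linking gets its new byte offset; a row that was removed maps to the sentinel, so consumers can tell that the attribute no longer refers to any line entries.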
1 parent 139add5 commit abdbaff

9 files changed: +249 -46 lines changed


llvm/include/llvm/DWARFLinker/Classic/DWARFLinker.h

Lines changed: 7 additions & 4 deletions
@@ -122,10 +122,13 @@ class DwarfEmitter {
                                 const AddressRanges &LinkedRanges) = 0;

   /// Emit specified \p LineTable into .debug_line table.
-  virtual void emitLineTableForUnit(const DWARFDebugLine::LineTable &LineTable,
-                                    const CompileUnit &Unit,
-                                    OffsetsStringPool &DebugStrPool,
-                                    OffsetsStringPool &DebugLineStrPool) = 0;
+  /// The optional parameter RowOffsets, if provided, will be populated with the
+  /// offsets of each line table row in the output .debug_line section.
+  virtual void
+  emitLineTableForUnit(const DWARFDebugLine::LineTable &LineTable,
+                       const CompileUnit &Unit, OffsetsStringPool &DebugStrPool,
+                       OffsetsStringPool &DebugLineStrPool,
+                       std::vector<uint64_t> *RowOffsets = nullptr) = 0;

   /// Emit the .debug_pubnames contribution for \p Unit.
   virtual void emitPubNamesForUnit(const CompileUnit &Unit) = 0;
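
The new `RowOffsets` parameter is an opt-in out-parameter: existing callers compile unchanged, and only callers that need offsets pass a vector. The default is repeated on the override in `DwarfStreamer` because C++ resolves default arguments from the static type of the call expression, not from the dynamic type. A reduced sketch of the pattern follows; `Emitter`, `ByteCountingEmitter`, and `emitRows` are hypothetical names for illustration and are not LLVM APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, heavily reduced emitter interface.
struct Emitter {
  virtual ~Emitter() = default;
  // Returns the number of rows emitted. RowOffsets is filled only if non-null.
  virtual size_t emitRows(const std::vector<uint64_t> &RowSizes,
                          std::vector<uint64_t> *RowOffsets = nullptr) = 0;
};

// Hypothetical implementation that only counts bytes instead of writing them.
struct ByteCountingEmitter : Emitter {
  uint64_t SectionSize = 0;

  // The default is repeated so calls through either type behave the same.
  size_t emitRows(const std::vector<uint64_t> &RowSizes,
                  std::vector<uint64_t> *RowOffsets = nullptr) override {
    for (uint64_t Size : RowSizes) {
      if (RowOffsets)
        RowOffsets->push_back(SectionSize); // Offset where this row starts.
      SectionSize += Size;
    }
    return RowSizes.size();
  }
};
```

Callers that do not care about offsets pay only for a null check per row, which matches the commit's choice to pass `nullptr` when a unit has no `DW_AT_LLVM_stmt_sequence` attributes.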

llvm/include/llvm/DWARFLinker/Classic/DWARFLinkerCompileUnit.h

Lines changed: 17 additions & 0 deletions
@@ -57,6 +57,7 @@ struct PatchLocation {

 using RngListAttributesTy = SmallVector<PatchLocation>;
 using LocListAttributesTy = SmallVector<PatchLocation>;
+using StmtSeqListAttributesTy = SmallVector<PatchLocation>;

 /// Stores all information relating to a compile unit, be it in its original
 /// instance in the object file to its brand new cloned and generated DIE tree.
@@ -175,6 +176,12 @@ class CompileUnit {
     return LocationAttributes;
   }

+  // Provide access to the list of DW_AT_LLVM_stmt_sequence attributes that may
+  // need to be patched.
+  const StmtSeqListAttributesTy &getStmtSeqListAttributes() const {
+    return StmtSeqListAttributes;
+  }
+
   /// Mark every DIE in this unit as kept. This function also
   /// marks variables as InDebugMap so that they appear in the
   /// reconstructed accelerator tables.
@@ -210,6 +217,10 @@ class CompileUnit {
   /// debug_loc section.
   void noteLocationAttribute(PatchLocation Attr);

+  // Record that the given DW_AT_LLVM_stmt_sequence attribute may need to be
+  // patched later.
+  void noteStmtSeqListAttribute(PatchLocation Attr);
+
   /// Add a name accelerator entry for \a Die with \a Name.
   void addNamespaceAccelerator(const DIE *Die, DwarfStringPoolEntryRef Name);

@@ -309,6 +320,12 @@ class CompileUnit {
   /// location expression.
   LocListAttributesTy LocationAttributes;

+  // List of DW_AT_LLVM_stmt_sequence attributes that may need to be patched
+  // after the dwarf linker rewrites the line table. During line table rewrite
+  // the line table format might change, so we have to patch any offsets that
+  // reference its contents.
+  StmtSeqListAttributesTy StmtSeqListAttributes;
+
   /// Accelerator entries for the unit, both for the pub*
   /// sections and the apple* ones.
   /// @{
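
These additions follow the same record-now, patch-later pattern already used for range-list and location-list attributes: the attribute is emitted with its old value at clone time, a handle to it is stored, and the value is overwritten once the final offset is known. Below is a toy model of that idea. `AttrValue`, `Patch`, `Unit`, and `cloneStmtSeq` are hypothetical stand-ins and do not reflect how LLVM's `PatchLocation` is implemented internally; only the `get()`/`set()` usage mirrors the calls seen in this commit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an already emitted attribute value.
struct AttrValue {
  uint64_t Value;
};

// Hypothetical patch handle: remembers which attribute to rewrite later.
// It stores an index, not an element pointer, so growth of the vector
// between recording and patching does not invalidate the handle.
class Patch {
  std::vector<AttrValue> *Storage;
  size_t Index;

public:
  Patch(std::vector<AttrValue> &S, size_t I) : Storage(&S), Index(I) {}
  uint64_t get() const { return (*Storage)[Index].Value; }
  // const, because patching changes the referenced value, not the handle.
  void set(uint64_t V) const { (*Storage)[Index].Value = V; }
};

// Hypothetical unit. It must not be copied or moved after patches are noted,
// since each Patch points at this object's Attrs vector.
struct Unit {
  std::vector<AttrValue> Attrs;
  std::vector<Patch> StmtSeqPatches;

  // Clone time: emit the attribute with its old offset and remember where.
  void cloneStmtSeq(uint64_t OldOffset) {
    Attrs.push_back({OldOffset});
    StmtSeqPatches.emplace_back(Attrs, Attrs.size() - 1);
  }
};
```

Keeping the old offset in place until patch time is what lets the later lookup start from the value returned by `get()`.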

llvm/include/llvm/DWARFLinker/Classic/DWARFStreamer.h

Lines changed: 9 additions & 5 deletions
@@ -149,10 +149,13 @@ class DwarfStreamer : public DwarfEmitter {
   }

   /// Emit .debug_line table entry for specified \p LineTable
-  void emitLineTableForUnit(const DWARFDebugLine::LineTable &LineTable,
-                            const CompileUnit &Unit,
-                            OffsetsStringPool &DebugStrPool,
-                            OffsetsStringPool &DebugLineStrPool) override;
+  /// The optional parameter RowOffsets, if provided, will be populated with the
+  /// offsets of each line table row in the output .debug_line section.
+  void
+  emitLineTableForUnit(const DWARFDebugLine::LineTable &LineTable,
+                       const CompileUnit &Unit, OffsetsStringPool &DebugStrPool,
+                       OffsetsStringPool &DebugLineStrPool,
+                       std::vector<uint64_t> *RowOffsets = nullptr) override;

   uint64_t getLineSectionSize() const override { return LineSectionSize; }

@@ -266,7 +269,8 @@ class DwarfStreamer : public DwarfEmitter {
       const DWARFDebugLine::Prologue &P, OffsetsStringPool &DebugStrPool,
       OffsetsStringPool &DebugLineStrPool);
   void emitLineTableRows(const DWARFDebugLine::LineTable &LineTable,
-                         MCSymbol *LineEndSym, unsigned AddressByteSize);
+                         MCSymbol *LineEndSym, unsigned AddressByteSize,
+                         std::vector<uint64_t> *RowOffsets = nullptr);
   void emitIntOffset(uint64_t Offset, dwarf::DwarfFormat Format,
                      uint64_t &SectionSize);
   void emitLabelDifference(const MCSymbol *Hi, const MCSymbol *Lo,

llvm/lib/DWARFLinker/Classic/DWARFLinker.cpp

Lines changed: 129 additions & 34 deletions
@@ -1447,6 +1447,18 @@ unsigned DWARFLinker::DIECloner::cloneScalarAttribute(
                  ->sizeOf(Unit.getOrigUnit().getFormParams());
   }

+  if (AttrSpec.Attr == dwarf::DW_AT_LLVM_stmt_sequence) {
+    // If needed, we'll patch this sec_offset later with the correct offset.
+    auto Patch = Die.addValue(DIEAlloc, dwarf::Attribute(AttrSpec.Attr),
+                              dwarf::DW_FORM_sec_offset,
+                              DIEInteger(*Val.getAsSectionOffset()));
+
+    // Record this patch location so that it can be fixed up later.
+    Unit.noteStmtSeqListAttribute(Patch);
+
+    return Unit.getOrigUnit().getFormParams().getDwarfOffsetByteSize();
+  }
+
   if (LLVM_UNLIKELY(Linker.Options.Update)) {
     if (auto OptionalValue = Val.getAsUnsignedConstant())
       Value = *OptionalValue;
@@ -2081,29 +2093,43 @@ void DWARFLinker::DIECloner::emitDebugAddrSection(
   Emitter->emitDwarfDebugAddrsFooter(Unit, EndLabel);
 }

+/// A helper struct to help keep track of the association between the input and
+/// output rows during line table rewriting. This is used to patch
+/// DW_AT_LLVM_stmt_sequence attributes, which reference a particular line table
+/// row.
+struct TrackedRow {
+  DWARFDebugLine::Row Row;
+  size_t OriginalRowIndex;
+  bool isStartSeqInOutput;
+};
+
 /// Insert the new line info sequence \p Seq into the current
 /// set of already linked line info \p Rows.
-static void insertLineSequence(std::vector<DWARFDebugLine::Row> &Seq,
-                               std::vector<DWARFDebugLine::Row> &Rows) {
+static void insertLineSequence(std::vector<TrackedRow> &Seq,
+                               std::vector<TrackedRow> &Rows) {
   if (Seq.empty())
     return;

-  if (!Rows.empty() && Rows.back().Address < Seq.front().Address) {
+  // Mark the first row in Seq to indicate it is the start of a sequence
+  // in the output line table.
+  Seq.front().isStartSeqInOutput = true;
+
+  if (!Rows.empty() && Rows.back().Row.Address < Seq.front().Row.Address) {
     llvm::append_range(Rows, Seq);
     Seq.clear();
     return;
   }

-  object::SectionedAddress Front = Seq.front().Address;
+  object::SectionedAddress Front = Seq.front().Row.Address;
   auto InsertPoint = partition_point(
-      Rows, [=](const DWARFDebugLine::Row &O) { return O.Address < Front; });
+      Rows, [=](const TrackedRow &O) { return O.Row.Address < Front; });

   // FIXME: this only removes the unneeded end_sequence if the
   // sequences have been inserted in order. Using a global sort like
-  // described in generateLineTableForUnit() and delaying the end_sequene
+  // described in generateLineTableForUnit() and delaying the end_sequence
   // elimination to emitLineTableForUnit() we can get rid of all of them.
-  if (InsertPoint != Rows.end() && InsertPoint->Address == Front &&
-      InsertPoint->EndSequence) {
+  if (InsertPoint != Rows.end() && InsertPoint->Row.Address == Front &&
+      InsertPoint->Row.EndSequence) {
     *InsertPoint = Seq.front();
     Rows.insert(InsertPoint + 1, Seq.begin() + 1, Seq.end());
   } else {
@@ -2171,75 +2197,144 @@ void DWARFLinker::DIECloner::generateLineTableForUnit(CompileUnit &Unit) {
         LineTable.Rows.clear();

       LineTable.Sequences = LT->Sequences;
+
+      Emitter->emitLineTableForUnit(LineTable, Unit, DebugStrPool,
+                                    DebugLineStrPool);
     } else {
-      // This vector is the output line table.
-      std::vector<DWARFDebugLine::Row> NewRows;
-      NewRows.reserve(LT->Rows.size());
+      // Create TrackedRow objects for all input rows.
+      std::vector<TrackedRow> InputRows;
+      InputRows.reserve(LT->Rows.size());
+      for (size_t i = 0; i < LT->Rows.size(); i++)
+        InputRows.emplace_back(TrackedRow{LT->Rows[i], i, false});
+
+      // This vector is the output line table (still in TrackedRow form).
+      std::vector<TrackedRow> OutputRows;
+      OutputRows.reserve(InputRows.size());

       // Current sequence of rows being extracted, before being inserted
-      // in NewRows.
-      std::vector<DWARFDebugLine::Row> Seq;
+      // in OutputRows.
+      std::vector<TrackedRow> Seq;
+      Seq.reserve(InputRows.size());

       const auto &FunctionRanges = Unit.getFunctionRanges();
       std::optional<AddressRangeValuePair> CurrRange;

       // FIXME: This logic is meant to generate exactly the same output as
       // Darwin's classic dsymutil. There is a nicer way to implement this
-      // by simply putting all the relocated line info in NewRows and simply
-      // sorting NewRows before passing it to emitLineTableForUnit. This
+      // by simply putting all the relocated line info in OutputRows and simply
+      // sorting OutputRows before passing it to emitLineTableForUnit. This
       // should be correct as sequences for a function should stay
       // together in the sorted output. There are a few corner cases that
       // look suspicious though, and that required to implement the logic
       // this way. Revisit that once initial validation is finished.

       // Iterate over the object file line info and extract the sequences
       // that correspond to linked functions.
-      for (DWARFDebugLine::Row Row : LT->Rows) {
+      for (size_t i = 0; i < InputRows.size(); i++) {
+        TrackedRow TR = InputRows[i];
+
         // Check whether we stepped out of the range. The range is
-        // half-open, but consider accept the end address of the range if
+        // half-open, but consider accepting the end address of the range if
         // it is marked as end_sequence in the input (because in that
         // case, the relocation offset is accurate and that entry won't
         // serve as the start of another function).
-        if (!CurrRange || !CurrRange->Range.contains(Row.Address.Address)) {
-          // We just stepped out of a known range. Insert a end_sequence
+        if (!CurrRange || !CurrRange->Range.contains(TR.Row.Address.Address)) {
+          // We just stepped out of a known range. Insert an end_sequence
           // corresponding to the end of the range.
           uint64_t StopAddress =
               CurrRange ? CurrRange->Range.end() + CurrRange->Value : -1ULL;
-          CurrRange = FunctionRanges.getRangeThatContains(Row.Address.Address);
+          CurrRange =
+              FunctionRanges.getRangeThatContains(TR.Row.Address.Address);
           if (StopAddress != -1ULL && !Seq.empty()) {
             // Insert end sequence row with the computed end address, but
             // the same line as the previous one.
             auto NextLine = Seq.back();
-            NextLine.Address.Address = StopAddress;
-            NextLine.EndSequence = 1;
-            NextLine.PrologueEnd = 0;
-            NextLine.BasicBlock = 0;
-            NextLine.EpilogueBegin = 0;
+            NextLine.Row.Address.Address = StopAddress;
+            NextLine.Row.EndSequence = 1;
+            NextLine.Row.PrologueEnd = 0;
+            NextLine.Row.BasicBlock = 0;
+            NextLine.Row.EpilogueBegin = 0;
             Seq.push_back(NextLine);
-            insertLineSequence(Seq, NewRows);
+            insertLineSequence(Seq, OutputRows);
           }

           if (!CurrRange)
             continue;
         }

         // Ignore empty sequences.
-        if (Row.EndSequence && Seq.empty())
+        if (TR.Row.EndSequence && Seq.empty())
           continue;

         // Relocate row address and add it to the current sequence.
-        Row.Address.Address += CurrRange->Value;
-        Seq.emplace_back(Row);
+        TR.Row.Address.Address += CurrRange->Value;
+        Seq.push_back(TR);

-        if (Row.EndSequence)
-          insertLineSequence(Seq, NewRows);
+        if (TR.Row.EndSequence)
+          insertLineSequence(Seq, OutputRows);
       }

-      LineTable.Rows = std::move(NewRows);
+      // Materialize the tracked rows into final DWARFDebugLine::Row objects.
+      LineTable.Rows.clear();
+      LineTable.Rows.reserve(OutputRows.size());
+      for (auto &TR : OutputRows)
+        LineTable.Rows.push_back(TR.Row);
+
+      // Use OutputRowOffsets to store the offsets of each line table row in the
+      // output .debug_line section.
+      std::vector<uint64_t> OutputRowOffsets;
+
+      // The unit might not have any DW_AT_LLVM_stmt_sequence attributes, so use
+      // hasStmtSeq to skip the patching logic.
+      bool hasStmtSeq = Unit.getStmtSeqListAttributes().size() > 0;
+      Emitter->emitLineTableForUnit(LineTable, Unit, DebugStrPool,
+                                    DebugLineStrPool,
+                                    hasStmtSeq ? &OutputRowOffsets : nullptr);
+
+      if (hasStmtSeq) {
+        assert(OutputRowOffsets.size() == OutputRows.size() &&
+               "must have an offset for each row");
+
+        // Create a map of stmt sequence offsets to original row indices.
+        DenseMap<uint64_t, unsigned> SeqOffToOrigRow;
+        for (const DWARFDebugLine::Sequence &Seq : LT->Sequences)
+          SeqOffToOrigRow[Seq.StmtSeqOffset] = Seq.FirstRowIndex;
+
+        // Create a map of original row indices to new row indices.
+        DenseMap<size_t, size_t> OrigRowToNewRow;
+        for (size_t i = 0; i < OutputRows.size(); ++i)
+          OrigRowToNewRow[OutputRows[i].OriginalRowIndex] = i;
+
+        // Patch DW_AT_LLVM_stmt_sequence attributes in the compile unit DIE
+        // with the correct offset into the .debug_line section.
+        for (const auto &StmtSeq : Unit.getStmtSeqListAttributes()) {
+          uint64_t OrigStmtSeq = StmtSeq.get();
+          // 1. Get the original row index from the stmt list offset.
+          auto OrigRowIter = SeqOffToOrigRow.find(OrigStmtSeq);
+          assert(OrigRowIter != SeqOffToOrigRow.end() &&
+                 "Stmt list offset not found in sequence offsets map");
+          size_t OrigRowIndex = OrigRowIter->second;
+
+          // 2. Get the new row index from the original row index.
+          auto NewRowIter = OrigRowToNewRow.find(OrigRowIndex);
+          if (NewRowIter == OrigRowToNewRow.end()) {
+            // If the original row index is not found in the map, update the
+            // stmt_sequence attribute to the 'invalid offset' magic value.
+            StmtSeq.set(UINT64_MAX);
+            continue;
+          }
+
+          // 3. Get the offset of the new row in the output .debug_line section.
+          assert(NewRowIter->second < OutputRowOffsets.size() &&
+                 "New row index out of bounds");
+          uint64_t NewStmtSeqOffset = OutputRowOffsets[NewRowIter->second];
+
+          // 4. Patch the stmt_list attribute with the new offset.
+          StmtSeq.set(NewStmtSeqOffset);
+        }
+      }
     }

-    Emitter->emitLineTableForUnit(LineTable, Unit, DebugStrPool,
-                                  DebugLineStrPool);
   } else
     Linker.reportWarning("Cann't load line table.", ObjFile);
 }
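
To see how `TrackedRow` carries an input row's identity through sequence reordering, here is a reduced, standalone version of `insertLineSequence`. It is a sketch under stated assumptions: rows are modeled with a plain 64-bit address instead of `DWARFDebugLine::Row`, standard-library calls replace LLVM's range helpers, and the body of the final `else` branch (not visible in the hunk above) is assumed to insert the whole sequence at the insertion point.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Reduced row: the real TrackedRow wraps a full DWARFDebugLine::Row.
struct TrackedRow {
  uint64_t Address;
  bool EndSequence;
  size_t OriginalRowIndex;
  bool IsStartSeqInOutput;
};

// Insert Seq into Rows, keeping Rows ordered by the address of each
// sequence's first row. Seq is emptied on return.
void insertLineSequence(std::vector<TrackedRow> &Seq,
                        std::vector<TrackedRow> &Rows) {
  if (Seq.empty())
    return;

  // The first row of every inserted sequence starts a sequence in the output.
  Seq.front().IsStartSeqInOutput = true;

  // Fast path: the sequence sorts after everything linked so far.
  if (!Rows.empty() && Rows.back().Address < Seq.front().Address) {
    Rows.insert(Rows.end(), Seq.begin(), Seq.end());
    Seq.clear();
    return;
  }

  uint64_t Front = Seq.front().Address;
  auto InsertPoint = std::partition_point(
      Rows.begin(), Rows.end(),
      [=](const TrackedRow &O) { return O.Address < Front; });

  if (InsertPoint != Rows.end() && InsertPoint->Address == Front &&
      InsertPoint->EndSequence) {
    // An end_sequence at the same address is redundant: overwrite it.
    *InsertPoint = Seq.front();
    Rows.insert(InsertPoint + 1, Seq.begin() + 1, Seq.end());
  } else {
    // Assumption: this branch is outside the hunk shown in the diff.
    Rows.insert(InsertPoint, Seq.begin(), Seq.end());
  }
  Seq.clear();
}
```

Because each row keeps its `OriginalRowIndex` no matter where it lands, the output position of any input row can be recovered afterwards with a single pass over the output vector.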

llvm/lib/DWARFLinker/Classic/DWARFLinkerCompileUnit.cpp

Lines changed: 4 additions & 0 deletions
@@ -185,6 +185,10 @@ void CompileUnit::noteLocationAttribute(PatchLocation Attr) {
   LocationAttributes.emplace_back(Attr);
 }

+void CompileUnit::noteStmtSeqListAttribute(PatchLocation Attr) {
+  StmtSeqListAttributes.emplace_back(Attr);
+}
+
 void CompileUnit::addNamespaceAccelerator(const DIE *Die,
                                           DwarfStringPoolEntryRef Name) {
   Namespaces.emplace_back(Name, Die);

llvm/lib/DWARFLinker/Classic/DWARFStreamer.cpp

Lines changed: 9 additions & 3 deletions
@@ -809,7 +809,8 @@ void DwarfStreamer::emitDwarfDebugLocListsTableFragment(

 void DwarfStreamer::emitLineTableForUnit(
     const DWARFDebugLine::LineTable &LineTable, const CompileUnit &Unit,
-    OffsetsStringPool &DebugStrPool, OffsetsStringPool &DebugLineStrPool) {
+    OffsetsStringPool &DebugStrPool, OffsetsStringPool &DebugLineStrPool,
+    std::vector<uint64_t> *RowOffsets) {
   // Switch to the section where the table will be emitted into.
   MS->switchSection(MC->getObjectFileInfo()->getDwarfLineSection());

@@ -830,7 +831,7 @@ void DwarfStreamer::emitLineTableForUnit(

   // Emit rows.
   emitLineTableRows(LineTable, LineEndSym,
-                    Unit.getOrigUnit().getAddressByteSize());
+                    Unit.getOrigUnit().getAddressByteSize(), RowOffsets);
 }

 void DwarfStreamer::emitLineTablePrologue(const DWARFDebugLine::Prologue &P,
@@ -1036,7 +1037,7 @@ void DwarfStreamer::emitLineTableProloguePayload(

 void DwarfStreamer::emitLineTableRows(
     const DWARFDebugLine::LineTable &LineTable, MCSymbol *LineEndSym,
-    unsigned AddressByteSize) {
+    unsigned AddressByteSize, std::vector<uint64_t> *RowOffsets) {

   MCDwarfLineTableParams Params;
   Params.DWARF2LineOpcodeBase = LineTable.Prologue.OpcodeBase;
@@ -1068,6 +1069,11 @@ void DwarfStreamer::emitLineTableRows(
   unsigned RowsSinceLastSequence = 0;

   for (const DWARFDebugLine::Row &Row : LineTable.Rows) {
+    // If we're tracking row offsets, record the current section size as the
+    // offset of this row.
+    if (RowOffsets)
+      RowOffsets->push_back(LineSectionSize);
+
     int64_t AddressDelta;
     if (Address == -1ULL) {
       MS->emitIntValue(dwarf::DW_LNS_extended_op, 1);
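
Row offsets have to be captured while emitting because line-program rows are variable-length: operands such as address advances are LEB128-encoded, so a row's position cannot be computed from its index alone. The toy encoder below illustrates this. `encodeRows` is a hypothetical function that writes one `DW_LNS_advance_pc` opcode (0x02) with a ULEB128 operand per row, which is far simpler than what `emitLineTableRows` really emits, and its offsets are relative to its own buffer rather than to the whole `.debug_line` section.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Append V to Out as unsigned LEB128 (7 payload bits per byte).
void appendULEB128(uint64_t V, std::vector<uint8_t> &Out) {
  do {
    uint8_t Byte = V & 0x7f;
    V >>= 7;
    if (V != 0)
      Byte |= 0x80; // More bytes follow.
    Out.push_back(Byte);
  } while (V != 0);
}

// Hypothetical toy encoder: each row becomes an opcode byte plus a ULEB128
// address delta. Addresses are assumed to be non-decreasing.
std::vector<uint8_t> encodeRows(const std::vector<uint64_t> &Addresses,
                                std::vector<uint64_t> *RowOffsets = nullptr) {
  std::vector<uint8_t> Section;
  uint64_t Prev = 0;
  for (uint64_t Addr : Addresses) {
    // Record where this row starts before any of its bytes are written.
    if (RowOffsets)
      RowOffsets->push_back(Section.size());
    Section.push_back(0x02); // DW_LNS_advance_pc
    appendULEB128(Addr - Prev, Section);
    Prev = Addr;
  }
  return Section;
}
```

With addresses `0x10, 0x4010, 0x4020`, the second row needs a three-byte operand, so the recorded offsets are 0, 2, and 6 rather than evenly spaced.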
