Skip to content

Commit 268089b

Browse files
committed
Fix the encoding and decoding of UniqueCStringMap<T> objects when saved to cache files.
UniqueCStringMap<T> objects are a std::vector<UniqueCStringMap::Entry> objects where the Entry object contains a ConstString + T. The values in the vector are sorted first by ConstString and then by the T value. ConstString objects are simply uniqued "const char *" values and when we compare we use the actual string pointer as the value we sort by. This caused a problem when we saved the symbol table name indexes and debug info indexes to disk in one process when they were sorted, and then loaded them into another process when decoding them from the cache files. Why? Because the order in which the ConstString objects were created are now completely different and the string pointers will no longer be sorted in the new process the cache was loaded into. The unit tests created for the initial patch didn't catch the encoding and decoding issues of UniqueCStringMap<T> because they were happening in the same process and encoding and decoding would end up createing sorted UniqueCStringMap<T> objects due to the constant string pool being exactly the same. This patch does the sort and also reserves the right amount of entries in the UniqueCStringMap::m_map prior to adding them all to avoid doing multiple allocations. Added a unit test that loads an object file from yaml, and then I created a cache file for the original file and removed the cache file's signature mod time check since we will generate an object file from the YAML, and use that as the object file for the Symtab object. Then we load the cache data from the array of symtab cache bytes so that the ConstString "const char *" values will not match the current process, and verify we can lookup the 4 names from the object file in the symbol table. Differential Revision: https://reviews.llvm.org/D124572
1 parent 8bdfc73 commit 268089b

File tree

3 files changed

+436
-0
lines changed

3 files changed

+436
-0
lines changed

lldb/source/Plugins/SymbolFile/DWARF/NameToDIE.cpp

Lines changed: 11 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -100,6 +100,7 @@ bool NameToDIE::Decode(const DataExtractor &data, lldb::offset_t *offset_ptr,
100100
if (identifier != kIdentifierNameToDIE)
101101
return false;
102102
const uint32_t count = data.GetU32(offset_ptr);
103+
m_map.Reserve(count);
103104
for (uint32_t i = 0; i < count; ++i) {
104105
llvm::StringRef str(strtab.Get(data.GetU32(offset_ptr)));
105106
// No empty strings allowed in the name to DIE maps.
@@ -110,6 +111,16 @@ bool NameToDIE::Decode(const DataExtractor &data, lldb::offset_t *offset_ptr,
110111
else
111112
return false;
112113
}
114+
// We must sort the UniqueCStringMap after decoding it since it is a vector
115+
// of UniqueCStringMap::Entry objects which contain a ConstString and type T.
116+
// ConstString objects are sorted by "const char *" and then type T and
117+
// the "const char *" are point values that will depend on the order in which
118+
// ConstString objects are created and in which of the 256 string pools they
119+
// are created in. So after we decode all of the entries, we must sort the
120+
// name map to ensure name lookups succeed. If we encode and decode within
121+
// the same process we wouldn't need to sort, so unit testing didn't catch
122+
// this issue when first checked in.
123+
m_map.Sort(std::less<DIERef>());
113124
return true;
114125
}
115126

lldb/source/Symbol/Symtab.cpp

Lines changed: 11 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1204,6 +1204,7 @@ bool DecodeCStrMap(const DataExtractor &data, lldb::offset_t *offset_ptr,
12041204
if (identifier != kIdentifierCStrMap)
12051205
return false;
12061206
const uint32_t count = data.GetU32(offset_ptr);
1207+
cstr_map.Reserve(count);
12071208
for (uint32_t i=0; i<count; ++i)
12081209
{
12091210
llvm::StringRef str(strtab.Get(data.GetU32(offset_ptr)));
@@ -1213,6 +1214,16 @@ bool DecodeCStrMap(const DataExtractor &data, lldb::offset_t *offset_ptr,
12131214
return false;
12141215
cstr_map.Append(ConstString(str), value);
12151216
}
1217+
// We must sort the UniqueCStringMap after decoding it since it is a vector
1218+
// of UniqueCStringMap::Entry objects which contain a ConstString and type T.
1219+
// ConstString objects are sorted by "const char *" and then type T and
1220+
// the "const char *" are point values that will depend on the order in which
1221+
// ConstString objects are created and in which of the 256 string pools they
1222+
// are created in. So after we decode all of the entries, we must sort the
1223+
// name map to ensure name lookups succeed. If we encode and decode within
1224+
// the same process we wouldn't need to sort, so unit testing didn't catch
1225+
// this issue when first checked in.
1226+
cstr_map.Sort();
12161227
return true;
12171228
}
12181229

0 commit comments

Comments
 (0)