Support compressed ELF sections #252

Draft: GHF wants to merge 3 commits into main

@GHF GHF commented May 28, 2025

Note

Work in progress

For extremely lightweight debug symbolication (lighter than MiniDebugInfo), we've been using debug-stripped binaries with intact .symtab (generated with an objcopy --keep-symbols invocation). However, .strtab + .symtab contents still comprise the majority of the file footprint.

I realized that .symtab and .strtab should be extremely compressible and there's no rule that non-.debug ELF sections can't be SHF_COMPRESSED, so I gave it a try:

> elfutils-elfcompress --force --permissive --verbose --type=zstd --name=.strtab --name=.symtab a.out
processing: a.out
[47] .symtab compressed (11185272 => 2504475 22.39%)
[48] .strtab compressed (66983971 => 4424907 6.61%)

This produces a binary that is actually still symbolicable by lldb (but not gdb: BFD: BFD (GNU Binutils) 2.42.50 internal error, aborting at /usr/src/debug/gdb-cross-canadian-x86-64/15.1/bfd/bfd.c:1236 in _bfd_doprnt).

I wrote up some patches to load from compressed tables but I haven't written tests or a CMake feature flag. Initial tests with my input on a few IO-constrained embedded devices show that the zstd-compressed symbols actually load much faster while zlib is right around break even.
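For readers unfamiliar with the mechanism, detecting a compressed table comes down to a flag test on `sh_flags` plus reading the small header that prefixes the section data. The following is a minimal sketch, not the code from these patches: the constant values and header layout follow the ELF gABI (they match `<elf.h>`), but `chdr64`, `is_compressed`, and `read_chdr64` are hypothetical names, and the sketch assumes a 64-bit ELF whose byte order matches the host.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Values from the ELF gABI, redefined here so the sketch is standalone.
constexpr std::uint64_t kShfCompressed = 0x800;  // SHF_COMPRESSED
constexpr std::uint32_t kElfCompressZlib = 1;    // ELFCOMPRESS_ZLIB
constexpr std::uint32_t kElfCompressZstd = 2;    // ELFCOMPRESS_ZSTD

// Same layout as Elf64_Chdr, which sits at the start of a compressed section.
struct chdr64 {
    std::uint32_t ch_type;       // compression algorithm
    std::uint32_t ch_reserved;
    std::uint64_t ch_size;       // size of the data once decompressed
    std::uint64_t ch_addralign;  // alignment of the decompressed data
};
static_assert(sizeof(chdr64) == 24, "Elf64_Chdr is 24 bytes");

// True if the section header flags mark the section as compressed.
constexpr bool is_compressed(std::uint64_t sh_flags) {
    return (sh_flags & kShfCompressed) != 0;
}

// Copies the compression header out of the raw section bytes.
// Returns false if the section is too short to contain one.
// Assumption: the file's endianness matches the host's (no byte swapping).
bool read_chdr64(const unsigned char* data, std::size_t size, chdr64& out) {
    assert(data != nullptr || size == 0);
    if (size < sizeof(chdr64)) {
        return false;
    }
    std::memcpy(&out, data, sizeof(out));
    return true;
}
```

After a successful read, `ch_type` selects the decompressor and `ch_size` gives the buffer size to allocate; the compressed payload starts immediately after the 24-byte header.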

What do you think of the concept? AFAIK, while this is legal per the ELF spec, there's no compiler support for generating compressed table sections. On the flip side, ecosystem support has to start somewhere, and I happen to have a decent use case for it right here. I don't think there's much cost to adding and maintaining this feature, but it's your call.

GHF added 3 commits May 23, 2025 11:25
Add parsing for sh_flags ELF section header field, detecting whether
SHF_COMPRESSED is set on any table sections detected.

@jeremy-rifkin (Owner)

Thanks for taking the time to put this together and contribute to the project!

I have some initial thoughts / considerations:

  • Firstly, I'm in general happy to support anything that's useful to people for debugging/diagnostics. I do have complexity and testing in mind as well, which it sounds like you do too
  • This adds some notable complexity and it's something I'd want to be able to thoroughly test, which might be tricky given tools don't currently take advantage of compressed strtabs
  • This adds a dependency on zlib and zstd to cpptrace directly, which might not be a massive lift given that libdwarf needs those libraries, but it does complicate cpptrace's CMake a bit more
  • Cpptrace might be configured to not use libdwarf as a back-end and that complicates things a bit with regards to zlib/zstd
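To make the dependency concern concrete, here is a rough sketch of how optional codec support could be gated at compile time so that a build without zlib or zstd still handles uncompressed tables and fails gracefully on compressed ones. This is only an illustration under stated assumptions: `EXAMPLE_HAS_ZLIB`, `EXAMPLE_HAS_ZSTD`, `table_support`, and `classify_table` are hypothetical names, not existing cpptrace options or APIs. Only the `ch_type` values come from the ELF gABI.

```cpp
#include <cstdint>

// Hypothetical macros that a build-system feature flag could define.
// These are illustrative names, not real cpptrace configuration options.
#ifdef EXAMPLE_HAS_ZLIB
constexpr bool kHaveZlib = true;
#else
constexpr bool kHaveZlib = false;
#endif
#ifdef EXAMPLE_HAS_ZSTD
constexpr bool kHaveZstd = true;
#else
constexpr bool kHaveZstd = false;
#endif

// ch_type values from the ELF gABI.
constexpr std::uint32_t kElfCompressZlib = 1;  // ELFCOMPRESS_ZLIB
constexpr std::uint32_t kElfCompressZstd = 2;  // ELFCOMPRESS_ZSTD

enum class table_support {
    plain,            // section is not compressed, read it directly
    decodable,        // compressed with a codec this build includes
    codec_not_built,  // known codec, but support was not compiled in
    unknown_codec     // ch_type is not one this sketch recognizes
};

// Decides how a symbol or string table can be handled in this build.
constexpr table_support classify_table(bool shf_compressed, std::uint32_t ch_type) {
    return !shf_compressed ? table_support::plain
         : ch_type == kElfCompressZlib
               ? (kHaveZlib ? table_support::decodable : table_support::codec_not_built)
         : ch_type == kElfCompressZstd
               ? (kHaveZstd ? table_support::decodable : table_support::codec_not_built)
         : table_support::unknown_codec;
}
```

With this shape, a build configured without either library reports `codec_not_built` for compressed tables instead of misreading compressed bytes as symbol entries, and the uncompressed path is untouched.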

I think overall I'm open to this; I just have concerns about added complexity. I'd be much more eager to accept the complexity if tools were actually using this (and I take gdb erroring on it as a sign that this is pretty niche). Overall I'm impressed that this PR (even though it's a draft) isn't as huge as I might have expected. Putting it behind a CMake feature flag would make it easier to justify.
