stage2-generated binaries are slower and more bloated than stage1-generated binaries #11498

Closed
@andrewrk

Here I build the self-hosted compiler with -OReleaseFast, once using stage1 and once using stage2, then compare the sizes of the resulting binaries (zig2 is generated by stage1, zig3 by stage2):

./zig build -p stage2-release -Denable-llvm -Dskip-install-lib-files -Drelease
./stage3/bin/zig build -p stage3-release -Denable-llvm -Dskip-install-lib-files -Drelease
cp stage2-release/bin/zig ~/tmp/zig2
cp stage3-release/bin/zig ~/tmp/zig3

Without stripping, we have:

$ ls -hl ~/tmp/zig{2,3}
-rwxr-xr-x 1 andy users 190M Apr 22 09:27 /home/andy/tmp/zig2
-rwxr-xr-x 1 andy users 205M Apr 22 09:27 /home/andy/tmp/zig3

However, that may be due to better debug info in stage2-generated binaries, so let's look at the stripped versions:

$ strip ~/tmp/zig2
$ strip ~/tmp/zig3
$ ls -hl ~/tmp/zig{2,3}
-rwxr-xr-x 1 andy users 138M Apr 22 09:30 /home/andy/tmp/zig2
-rwxr-xr-x 1 andy users 140M Apr 22 09:30 /home/andy/tmp/zig3

So even after stripping, there is an extra 2 MiB unaccounted for. I would also have expected a smaller binary, because in many ways stage2 generates more efficient LLVM IR than stage1.

Most of the growth is indeed in the .text section:

$ bloaty zig3 -- zig2
    FILE SIZE        VM SIZE    
 --------------  -------------- 
  +3.0% +2.63Mi  +3.0% +2.63Mi    .text
  +0.5%  +143Ki  +0.5%  +143Ki    .rodata
  +0.5% +39.0Ki  +0.5% +39.0Ki    .eh_frame
  +0.1%    +672  +0.1%    +672    .eh_frame_hdr
  +0.5%    +192  +0.5%    +192    .data
  [NEW]     +80  [NEW]     +16    .tdata
  [ = ]       0   +50%      +8    [LOAD #5 [RW]]
  +2.2%      +7  [ = ]       0    .shstrtab
  +4.2%      +1  [ = ]       0    [Unmapped]
  -0.3%      -2  -0.3%      -2    .gnu.version
  -0.2%      -6  -0.2%      -6    .dynstr
  -0.3%      -8  -0.3%      -8    .got.plt
  -0.3%      -8  -0.3%      -8    .hash
  -0.3%     -16  -0.3%     -16    .plt
  -0.3%     -24  -0.3%     -24    .dynsym
  -0.3%     -24  -0.3%     -24    .rela.plt
  [ = ]       0 -38.5%     -40    .tbss
  [ = ]       0  -0.0%    -160    .bss
  -0.8% -71.2Ki  -0.8% -71.3Ki    .data.rel.ro
  +2.0% +2.74Mi  +2.0% +2.74Mi    TOTAL
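
To put the diff in proportion, here is a minimal sketch that computes how much of the total growth comes from .text. The two figures are copied from the bloaty output above; the variable names are only illustrative.

```python
# Figures copied from the bloaty diff above, in MiB.
text_growth_mib = 2.63   # .text row
total_growth_mib = 2.74  # TOTAL row

# Fraction of the net size increase attributable to .text.
# Note that TOTAL is a net figure: some sections shrank (e.g. .data.rel.ro).
text_share = text_growth_mib / total_growth_mib
print(f".text share of net growth: {text_share:.1%}")
```

By this estimate, nearly all of the net increase is machine code rather than data.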

I also noticed that stage2-generated binaries are slightly worse in terms of runtime performance, possibly for the same reason:

$ time ./stage2-release/bin/zig build -p stage3 -Dskip-install-lib-files -Denable-llvm

real	0m44.388s
user	0m43.291s
sys	0m1.522s

$ time ./stage3-release/bin/zig build -p stage3 -Dskip-install-lib-files -Denable-llvm

real	0m46.431s
user	0m45.316s
sys	0m1.748s
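
As a rough sketch, the relative slowdown can be computed from the two `time` outputs above. The numbers are copied from those runs; `slowdown` is a hypothetical helper name, and because each binary was timed only once, the result should be read as indicative rather than as a precise measurement.

```python
def slowdown(baseline_s: float, candidate_s: float) -> float:
    """Relative increase of candidate over baseline (0.05 means 5% slower)."""
    return (candidate_s - baseline_s) / baseline_s

# (stage1-generated, stage2-generated) timings in seconds, from the runs above.
timings = {
    "real": (44.388, 46.431),
    "user": (43.291, 45.316),
    "sys": (1.522, 1.748),
}

for name, (stage1_built, stage2_built) in timings.items():
    print(f"{name}: {slowdown(stage1_built, stage2_built):+.1%}")
```

Under these single-run numbers, wall-clock and user time both differ by a little under 5%.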

Metadata

Labels

backend-llvm: The LLVM backend outputs an LLVM IR Module.
enhancement: Solving this issue will likely involve adding new logic or components to the codebase.
frontend: Tokenization, parsing, AstGen, Sema, and Liveness.
optimization