Skip to content

sqfstar: improve tar parser stability #309

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
May 14, 2025
Merged

Conversation

ferivoz
Copy link
Contributor

@ferivoz ferivoz commented Mar 21, 2025

These commits fix different issues in sqfstar's tar parser and partially
depend on each other to mitigate the discussed issues fully:

  1. Simple undefined C behaviour

Signed integer overflows are undefined behaviour and should be avoided.
From a C perspective, I do not care too much about them, but they can
lead to crashes later on in the code, e.g. because sqfstar does not
expect to encouter actually negative numbers for sizes later on.

Commits:

  • sqfstar: prevent tarfile size overflows
  • sqfstar: prevent st_rdev signed integer overflows
  • sqfstar: terminate pax extended header string

Proof of Concepts:

This crashes sqfstar because of negative sizes:

cat > poc.tar.zst.b64 << EOF
KLUv/WQABe0BAPh1d3UAMDAwNjQ0IAAwgP/+MCAyMgAgMAB1c3RhcgAwCwCQ4H4AGsCGAL9BMgCX
JdW6AGKcxpUNoD09ASJ2tBTh
EOF
base64 -d poc.tar.zst.b64 | zstd -d > poc.tar
sqfstar poc.sqfs < poc.tar

The crash occurs because the parsed number -2 leads to a read of 0 bytes
and a fragment added into the fragment processor. The processor itself
tries to create a checksum and eventually reads past the end of
allocated memory.

This triggers a signed integer overflow (compile with UBSAN):

cat > poc.tar.zst.b64 << EOF
KLUv/WQABd0BABQCdXd1ADAwMDY0NCAAMCA2MTA1ACAzAHVzdGFyADAwAID/CiAwg3qk+DAKtpiX
DmDKJ3nBFsN1euJEqAiTSA==
EOF
base64 -d poc.tar.zst.b64 | zstd -d > poc.tar
sqfstar poc.sqfs < poc.tar

This triggers an out ouf boundary read:

cat > poc.tar.zst.b64 << EOF
KLUv/WQAEc0BAAQCdXd1ADAwMDA2NDQAMDE2NjA3ACB4AHVzdGFyADAwADAIAPwHvpgD6gBc4hc3
bnK30gMYTk+ACCQC//M=
EOF
base64 -d poc.tar.zst.b64 | zstd -d > poc.tar
sqfstar poc.sqfs < poc.tar

It is not covered by ASAN and hard to see in reality as a crash.
In fact, it took an OpenBSD system with its guard pages to prove that
out of boundary access can occur.

  1. Compatibility with other tar implementations

The sqfstar code has some unique traits in it which allows different
views on the same tar archive with different tools, e.g. GNU tar,
libarchive, docker (Go), and so on. It is possible to create a tar
archive which sneaks in special information into a squash filesystem
which otherwise is not detected. Sylab's singularity utilizes sqfstar
for conversions from OCI containers to sqfs. By making sqfstar behave
more like the other tar parsers and stay closer to the POSIX standard,
such attacks will be harder to be performed.

Commits:

  • sqfstar: check isextended on bit level
  • sqfstar: expect exactly one blank in pax header

Proof of Concepts:

This file can be parsed by GNU tar and libarchive, not by sqfstar:

cat > poc.tar.zst.b64 << EOF
KLUv/aQACAIA9AQA8oUQE8AlHWCT9gcB6gaZZJKJw6t6RAbkTaS73tnP38rIbPuqxbJWezIRXTiX
ZhiDjBAkMAjKAMS4p3I4bd2KY58kpwxKMaBwmEopwxhwgyZJDmj7GYdFd5iEOF9pPOuwa9lnahAD
uHhrKKgrUlSqVKtcrwReWVk5KC5rlSmD8gDRanvxwQys7dc+LoT3D0xPB5yHCdwkUZiVE9s9wRYD
QAAAcLdDWw==
EOF
base64 -d poc.tar.zst.b64 | zstd -d > poc.tar
sqfstar poc.sqfs < poc.tar

This file sneaks in a different link into sqfs from tar:

cat > poc.tar.zst.b64 << EOF
KLUv/WQADQ0FACJGFRpwiXONJDQBG6zXin+fK31U962UNAD+z76XAsd2l/N8NRJVlVkS6gLKGTFy
nXWR7CqCAksIw2FhTtG/V2JN83NO1SPmfx8H8v9uTkPNUklSgycRECAbIAADMSko6AGkkA+D+n5x
iAEHZTQAAri5lNbij4QBmg12LHgA2W8C7ChCwDAOo+qZzVIf4GGtU8N8URKQU/a1XjgHBcMJ01sC
o2bKaw==
EOF
base64 -d poc.tar.zst.b64 | zstd -d > poc.tar
sqfstar poc.sqfs < poc.tar
tar -tvf poc.tar
unsquashfs -d poc poc.sqfs
ls -l poc

You can see that the poc.sqfs contains a link to a different target than
the tar file implies with GNU tar or libarchive's bsdtar.

  1. Attacker controlled out of boundary write

The sparse map parser allows a targeted out of boundary write with
attacker controlled data. Together with heap spraying due to leaking
memory if pax keywords are supplied multiple times, this might allow
an attacker to perform elevated attacks. Together with number 2, this
can be targeted specifically against sqfstar without triggering issues
with other tar implementations.

Commit:

  • sqfstar: use stricter pax number checks

Proof of Concept (compile with ASAN):

cat > poc.tar.zst.b64 << EOF
KLUv/WQAA7UDACLGFRtwa3NAk2AzqkPP0YNBwH1NI5KSytRA+l9dngx3Dvhi2IJ4VCWaAbWm0zHB
9K9NGBiKwtIWmedLeSAWUbZhzCKEgQAdz76GNbbdAZ1SSRXevky17WuOFgoApTj14Cnuxcm1OaAO
wCXeicNLQSsIgOH0BIi3SxxV
EOF
base64 -d poc.tar.zst.b64 | zstd -d > poc.tar
sqfstar poc.sqfs < poc.tar

This allows an attacker to write multiples of 16 bytes beyond a multiple
of 16 bytes of allocated memory.

  1. Tighter checks

Some values might overflow system data types, e.g. times. These commits
are merely cleanups.

Commits:

  • sqfstar: check st_mtime assignment
  • sqfstar: make 2 GB pax header limit more obvious

ferivoz added 8 commits March 27, 2025 11:54
The binary representation of sizes can overflow long long. Check for
such values and stop further processing.

Signed-off-by: Samanta Navarro <ferivoz@riseup.net>
GNU tar and libarchive check the isextended byte on bit level, not as
real number. The code currently only works because isextended being
-1 is still considered true and 0 (false) occurs because the octal
parsing ends at '\0' and takes the already parsed value.

This means that 0x00 becomes 0, everything else -1 and thus true.

The compatibility ends if the ASCII character '0' is inserted, which
sqfstar interprets as 0 (false), while GNU tar and libarchive treat
it as != 0x00 and thus true.

Signed-off-by: Samanta Navarro <ferivoz@riseup.net>
Reading major and minor numbers of character and block devices within a
tar archive could lead to subsequent left shifts triggering signed
integer overflows.

Prevent these overflows and check afterwards if st_rdev calculation is
suitable for the architecture or not.

Signed-off-by: Samanta Navarro <ferivoz@riseup.net>
Fail with an error if read mtime cannot be represented with current
system.

Signed-off-by: Samanta Navarro <ferivoz@riseup.net>
The pax extended header could be perfectly aligned with 512 bytes and
thus contain no ending \0 character.

Add it explicitly to make sure that string parsing functions never run
out of boundary.

Signed-off-by: Samanta Navarro <ferivoz@riseup.net>
Pax headers are supposed to have exactly one blank between length and
keyword. Skipping multiple blanks or even tabs can lead to diverging
views on a tar file across multiple implementations.

Signed-off-by: Samanta Navarro <ferivoz@riseup.net>
The code cannot read more than INT_MAX bytes, which fails because the
length value is an int and eventually compared with the original size
kept in a long long.

Print a proper error message if this happens.

Signed-off-by: Samanta Navarro <ferivoz@riseup.net>
Do not use sscanf but a custom function which guarantees positive
numbers and counts the parsed bytes to mimic sscanf for easier
replacement.

Signed-off-by: Samanta Navarro <ferivoz@riseup.net>
@plougher plougher merged commit 466380f into plougher:master May 14, 2025
@plougher
Copy link
Owner

This fixes a couple of issues, thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants