Commit 0e61bb4

Merge pull request #731 from wcampbell0x2a/rel-0-23-0
2 parents 01b7a72 + 6ca770c commit 0e61bb4

7 files changed

+445
-369
lines changed
BENCHMARK.md

Lines changed: 99 additions & 84 deletions
@@ -18,44 +18,58 @@ These benchmarks are created from `bench.bash`, on the following CPU running arc
1818

1919
```
2020
$ lscpu
21-
Architecture: x86_64
22-
CPU op-mode(s): 32-bit, 64-bit
23-
Address sizes: 39 bits physical, 48 bits virtual
24-
Byte Order: Little Endian
25-
CPU(s): 8
26-
On-line CPU(s) list: 0-7
27-
Vendor ID: GenuineIntel
28-
Model name: Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz
29-
CPU family: 6
30-
Model: 158
31-
Thread(s) per core: 2
32-
Core(s) per socket: 4
33-
Socket(s): 1
34-
Stepping: 9
35-
CPU(s) scaling MHz: 76%
36-
CPU max MHz: 4500.0000
37-
CPU min MHz: 800.0000
38-
BogoMIPS: 8403.00
39-
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush
40-
dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_ts
41-
c art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf
42-
pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm p
43-
cid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdra
44-
nd lahf_lm abm 3dnowprefetch cpuid_fault pti ssbd ibrs ibpb stibp tpr_shadow fle
45-
xpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid m
46-
px rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida
47-
arat pln pts hwp hwp_notify hwp_act_window hwp_epp vnmi md_clear flush_l1d arch
48-
_capabilities
21+
Architecture: x86_64
22+
CPU op-mode(s): 32-bit, 64-bit
23+
Address sizes: 48 bits physical, 48 bits virtual
24+
Byte Order: Little Endian
25+
CPU(s): 16
26+
On-line CPU(s) list: 0-15
27+
Vendor ID: AuthenticAMD
28+
Model name: AMD Ryzen 7 9800X3D 8-Core Processor
29+
CPU family: 26
30+
Model: 68
31+
Thread(s) per core: 2
32+
Core(s) per socket: 8
33+
Socket(s): 1
34+
Stepping: 0
35+
Frequency boost: enabled
36+
CPU(s) scaling MHz: 72%
37+
CPU max MHz: 5271.6221
38+
CPU min MHz: 603.3790
39+
BogoMIPS: 9399.97
40+
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good amd_lbr_v2 nopl xtopology nonstop_tsc cpuid extd_apicid aperfmperf rapl pn
41+
i pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mw
42+
aitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba perfmon_v2 ibrs ibpb stibp ibrs_enhanced vmmcall fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx
43+
512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local user_shstk avx_vnni avx512_bf16 clzero irperf xsaveerptr rdpru wbnoinvd cppc arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassist
44+
s pausefilter pfthreshold avic v_vmsave_vmload vgif x2avic v_spec_ctrl vnmi avx512vbmi umip pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid bus_lock_detect movdiri movdir64b overflow_recov succor s
45+
mca fsrm avx512_vp2intersect flush_l1d amd_lbr_pmc_freeze
4946
Virtualization features:
50-
Virtualization: VT-x
47+
Virtualization: AMD-V
5148
Caches (sum of all):
52-
L1d: 128 KiB (4 instances)
53-
L1i: 128 KiB (4 instances)
54-
L2: 1 MiB (4 instances)
55-
L3: 8 MiB (1 instance)
49+
L1d: 384 KiB (8 instances)
50+
L1i: 256 KiB (8 instances)
51+
L2: 8 MiB (8 instances)
52+
L3: 96 MiB (1 instance)
5653
NUMA:
57-
NUMA node(s): 1
58-
NUMA node0 CPU(s): 0-7
54+
NUMA node(s): 1
55+
NUMA node0 CPU(s): 0-15
56+
Vulnerabilities:
57+
Gather data sampling: Not affected
58+
Ghostwrite: Not affected
59+
Indirect target selection: Not affected
60+
Itlb multihit: Not affected
61+
L1tf: Not affected
62+
Mds: Not affected
63+
Meltdown: Not affected
64+
Mmio stale data: Not affected
65+
Reg file data sampling: Not affected
66+
Retbleed: Not affected
67+
Spec rstack overflow: Mitigation; IBPB on VMEXIT only
68+
Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
69+
Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
70+
Spectre v2: Mitigation; Enhanced / Automatic IBRS; IBPB conditional; STIBP always-on; PBRSB-eIBRS Not affected; BHI Not affected
71+
Srbds: Not affected
72+
Tsx async abort: Not affected
5973
```
6074

6175
</details>
@@ -65,77 +79,78 @@ $ ./bench.bash
6579
```
6680

6781
## Wall time: `backhand/unsquashfs` vs `squashfs-tools/unsquashfs-4.6.1`
82+
### `openwrt-22.03.2-ath79-generic-tplink_archer-a7-v5-squashfs-factory.bin`
6883
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
6984
|:---|---:|---:|---:|---:|
70-
| `backhand-dist-v0.22.0-musl` | 84.7 ± 5.2 | 76.7 | 94.5 | 1.55 ± 0.12 |
71-
| `backhand-dist-musl` | 61.7 ± 2.4 | 57.1 | 66.5 | 1.13 ± 0.07 |
72-
| `backhand-dist-musl-native` | 62.3 ± 3.8 | 58.6 | 75.9 | 1.14 ± 0.09 |
73-
| `backhand-dist-gnu` | 56.1 ± 2.7 | 50.8 | 62.1 | 1.03 ± 0.07 |
74-
| `backhand-dist-gnu-native` | 54.6 ± 2.4 | 51.1 | 61.1 | 1.00 |
75-
| `squashfs-tools` | 67.9 ± 9.2 | 54.0 | 92.3 | 1.24 ± 0.18 |
85+
| `backhand-dist-v0.22.0-musl` | 41.6 ± 2.7 | 36.6 | 48.1 | 1.78 ± 0.15 |
86+
| `backhand-dist-musl` | 28.5 ± 1.7 | 25.1 | 34.1 | 1.22 ± 0.10 |
87+
| `backhand-dist-musl-native` | 28.5 ± 1.7 | 25.1 | 32.0 | 1.22 ± 0.10 |
88+
| `backhand-dist-gnu` | 23.4 ± 1.4 | 20.7 | 27.1 | 1.00 ± 0.08 |
89+
| `backhand-dist-gnu-native` | 23.4 ± 1.2 | 20.3 | 26.0 | 1.00 |
90+
| `squashfs-tools` | 71.9 ± 8.4 | 50.5 | 86.5 | 3.07 ± 0.39 |
7691
### `openwrt-22.03.2-ipq40xx-generic-netgear_ex6100v2-squashfs-factory.img`
7792
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
7893
|:---|---:|---:|---:|---:|
79-
| `backhand-dist-v0.22.0-musl` | 87.0 ± 5.4 | 78.0 | 99.5 | 1.56 ± 0.12 |
80-
| `backhand-dist-musl` | 62.3 ± 2.2 | 57.9 | 66.6 | 1.12 ± 0.07 |
81-
| `backhand-dist-musl-native` | 63.1 ± 2.0 | 58.6 | 65.9 | 1.13 ± 0.07 |
82-
| `backhand-dist-gnu` | 55.7 ± 2.7 | 51.2 | 60.8 | 1.00 |
83-
| `backhand-dist-gnu-native` | 56.1 ± 2.5 | 52.2 | 62.9 | 1.01 ± 0.07 |
84-
| `squashfs-tools` | 68.1 ± 7.3 | 55.9 | 81.5 | 1.22 ± 0.14 |
94+
| `backhand-dist-v0.22.0-musl` | 43.1 ± 2.9 | 36.4 | 48.9 | 1.82 ± 0.15 |
95+
| `backhand-dist-musl` | 29.1 ± 1.5 | 26.5 | 32.5 | 1.23 ± 0.09 |
96+
| `backhand-dist-musl-native` | 28.7 ± 1.5 | 25.8 | 34.7 | 1.21 ± 0.09 |
97+
| `backhand-dist-gnu` | 23.8 ± 1.4 | 21.2 | 27.6 | 1.01 ± 0.07 |
98+
| `backhand-dist-gnu-native` | 23.7 ± 1.1 | 20.8 | 25.6 | 1.00 |
99+
| `squashfs-tools` | 64.3 ± 10.6 | 34.5 | 83.3 | 2.72 ± 0.46 |
85100
### `870D97.squashfs`
86101
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
87102
|:---|---:|---:|---:|---:|
88-
| `backhand-dist-v0.22.0-musl` | 446.2 ± 23.7 | 408.1 | 471.3 | 2.64 ± 0.15 |
89-
| `backhand-dist-musl` | 219.6 ± 3.0 | 216.1 | 226.8 | 1.30 ± 0.03 |
90-
| `backhand-dist-musl-native` | 218.7 ± 1.6 | 216.9 | 221.2 | 1.29 ± 0.03 |
91-
| `backhand-dist-gnu` | 185.0 ± 2.7 | 181.3 | 189.6 | 1.09 ± 0.03 |
92-
| `backhand-dist-gnu-native` | 185.2 ± 3.2 | 181.7 | 191.3 | 1.09 ± 0.03 |
93-
| `squashfs-tools` | 169.3 ± 3.1 | 165.7 | 175.3 | 1.00 |
103+
| `backhand-dist-v0.22.0-musl` | 229.2 ± 5.4 | 219.7 | 236.3 | 3.38 ± 0.11 |
104+
| `backhand-dist-musl` | 84.6 ± 2.1 | 80.4 | 89.8 | 1.25 ± 0.04 |
105+
| `backhand-dist-musl-native` | 83.4 ± 1.8 | 79.3 | 87.8 | 1.23 ± 0.04 |
106+
| `backhand-dist-gnu` | 67.9 ± 1.6 | 65.1 | 71.5 | 1.00 ± 0.03 |
107+
| `backhand-dist-gnu-native` | 67.9 ± 1.7 | 64.9 | 71.2 | 1.00 |
108+
| `squashfs-tools` | 88.5 ± 12.0 | 67.8 | 111.3 | 1.30 ± 0.18 |
94109
### `img-1571203182_vol-ubi_rootfs.ubifs`
95110
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
96111
|:---|---:|---:|---:|---:|
97-
| `backhand-dist-v0.22.0-musl` | 375.1 ± 10.6 | 360.2 | 389.1 | 1.57 ± 0.06 |
98-
| `backhand-dist-musl` | 281.9 ± 7.4 | 270.0 | 291.9 | 1.18 ± 0.04 |
99-
| `backhand-dist-musl-native` | 277.9 ± 6.1 | 268.3 | 289.3 | 1.17 ± 0.04 |
100-
| `backhand-dist-gnu` | 238.3 ± 6.1 | 229.5 | 249.9 | 1.00 |
101-
| `backhand-dist-gnu-native` | 238.5 ± 3.5 | 233.5 | 244.3 | 1.00 ± 0.03 |
102-
| `squashfs-tools` | 261.8 ± 10.6 | 248.6 | 277.1 | 1.10 ± 0.05 |
112+
| `backhand-dist-v0.22.0-musl` | 144.9 ± 5.4 | 137.0 | 157.9 | 1.77 ± 0.12 |
113+
| `backhand-dist-musl` | 98.4 ± 2.6 | 92.6 | 102.4 | 1.20 ± 0.08 |
114+
| `backhand-dist-musl-native` | 97.0 ± 3.7 | 91.1 | 105.1 | 1.19 ± 0.08 |
115+
| `backhand-dist-gnu` | 81.7 ± 4.7 | 76.6 | 95.0 | 1.00 |
116+
| `backhand-dist-gnu-native` | 81.7 ± 3.9 | 77.0 | 91.6 | 1.00 ± 0.08 |
117+
| `squashfs-tools` | 113.5 ± 6.6 | 95.6 | 126.6 | 1.39 ± 0.11 |
103118
### `2611E3.squashfs`
104119
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
105120
|:---|---:|---:|---:|---:|
106-
| `backhand-dist-v0.22.0-musl` | 183.7 ± 6.1 | 175.6 | 195.4 | 1.62 ± 0.08 |
107-
| `backhand-dist-musl` | 132.2 ± 4.0 | 126.1 | 138.9 | 1.17 ± 0.06 |
108-
| `backhand-dist-musl-native` | 132.8 ± 4.6 | 125.9 | 141.6 | 1.17 ± 0.06 |
109-
| `backhand-dist-gnu` | 113.0 ± 4.5 | 108.8 | 122.7 | 1.00 |
110-
| `backhand-dist-gnu-native` | 113.5 ± 4.4 | 106.8 | 123.9 | 1.00 ± 0.06 |
111-
| `squashfs-tools` | 150.9 ± 11.2 | 131.7 | 161.5 | 1.34 ± 0.11 |
121+
| `backhand-dist-v0.22.0-musl` | 76.7 ± 5.0 | 68.9 | 89.2 | 1.83 ± 0.16 |
122+
| `backhand-dist-musl` | 52.0 ± 2.5 | 47.3 | 60.0 | 1.24 ± 0.09 |
123+
| `backhand-dist-musl-native` | 51.6 ± 3.4 | 47.4 | 60.0 | 1.23 ± 0.11 |
124+
| `backhand-dist-gnu` | 42.0 ± 2.5 | 38.1 | 47.7 | 1.00 |
125+
| `backhand-dist-gnu-native` | 42.7 ± 2.8 | 37.8 | 49.7 | 1.02 ± 0.09 |
126+
| `squashfs-tools` | 109.6 ± 9.8 | 88.1 | 123.5 | 2.61 ± 0.28 |
112127
### `Plexamp-4.6.1.AppImage`
113128
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
114129
|:---|---:|---:|---:|---:|
115-
| `backhand-dist-v0.22.0-musl` | 559.5 ± 7.4 | 551.4 | 574.2 | 2.72 ± 0.05 |
116-
| `backhand-dist-musl` | 313.2 ± 2.0 | 311.1 | 317.4 | 1.52 ± 0.02 |
117-
| `backhand-dist-musl-native` | 300.1 ± 2.6 | 295.9 | 304.0 | 1.46 ± 0.02 |
118-
| `backhand-dist-gnu` | 296.5 ± 1.9 | 293.6 | 299.2 | 1.44 ± 0.02 |
119-
| `backhand-dist-gnu-native` | 285.8 ± 2.5 | 280.9 | 288.4 | 1.39 ± 0.02 |
120-
| `squashfs-tools` | 205.4 ± 2.5 | 201.0 | 208.2 | 1.00 |
130+
| `backhand-dist-v0.22.0-musl` | 288.4 ± 1.6 | 286.2 | 291.7 | 3.74 ± 0.37 |
131+
| `backhand-dist-musl` | 123.3 ± 1.7 | 120.9 | 127.4 | 1.60 ± 0.16 |
132+
| `backhand-dist-musl-native` | 122.7 ± 1.3 | 120.5 | 125.1 | 1.59 ± 0.16 |
133+
| `backhand-dist-gnu` | 113.0 ± 2.7 | 109.3 | 117.7 | 1.47 ± 0.15 |
134+
| `backhand-dist-gnu-native` | 109.9 ± 2.3 | 106.6 | 115.8 | 1.43 ± 0.15 |
135+
| `squashfs-tools` | 77.1 ± 7.7 | 66.0 | 88.9 | 1.00 |
121136
### `crates-io.squashfs`
122-
| Command | Mean [s] | Min [s] | Max [s] | Relative |
137+
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
123138
|:---|---:|---:|---:|---:|
124-
| `backhand-dist-v0.22.0-musl` | 1.225 ± 0.009 | 1.207 | 1.238 | 1.00 |
125-
| `backhand-dist-musl` | 1.243 ± 0.008 | 1.232 | 1.256 | 1.01 ± 0.01 |
126-
| `backhand-dist-musl-native` | 1.245 ± 0.008 | 1.234 | 1.259 | 1.02 ± 0.01 |
127-
| `backhand-dist-gnu` | 1.454 ± 0.013 | 1.422 | 1.465 | 1.19 ± 0.01 |
128-
| `backhand-dist-gnu-native` | 1.463 ± 0.009 | 1.448 | 1.480 | 1.19 ± 0.01 |
129-
| `squashfs-tools` | 1.816 ± 0.025 | 1.790 | 1.860 | 1.48 ± 0.02 |
139+
| `backhand-dist-v0.22.0-musl` | 351.7 ± 1.6 | 349.3 | 354.4 | 1.00 ± 0.01 |
140+
| `backhand-dist-musl` | 350.6 ± 2.1 | 347.9 | 354.6 | 1.00 |
141+
| `backhand-dist-musl-native` | 355.3 ± 4.0 | 350.8 | 364.4 | 1.01 ± 0.01 |
142+
| `backhand-dist-gnu` | 403.7 ± 4.0 | 398.0 | 408.7 | 1.15 ± 0.01 |
143+
| `backhand-dist-gnu-native` | 412.5 ± 4.1 | 404.6 | 418.9 | 1.18 ± 0.01 |
144+
| `squashfs-tools` | 754.3 ± 12.6 | 734.8 | 771.6 | 2.15 ± 0.04 |
130145
### `airootfs.sfs`
131146
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
132147
|:---|---:|---:|---:|---:|
133-
| `backhand-dist-v0.22.0-musl` | 5.7 ± 0.3 | 4.8 | 6.4 | 1.00 |
134-
| `backhand-dist-musl` | 6.4 ± 0.3 | 5.1 | 6.9 | 1.13 ± 0.07 |
135-
| `backhand-dist-musl-native` | 6.1 ± 0.3 | 4.9 | 6.5 | 1.07 ± 0.07 |
136-
| `backhand-dist-gnu` | 6.0 ± 0.4 | 4.7 | 6.5 | 1.05 ± 0.08 |
137-
| `backhand-dist-gnu-native` | 5.9 ± 0.2 | 5.6 | 6.3 | 1.04 ± 0.06 |
138-
| `squashfs-tools` | 6.7 ± 0.3 | 5.9 | 7.2 | 1.18 ± 0.07 |
148+
| `backhand-dist-v0.22.0-musl` | 2.8 ± 0.2 | 2.0 | 3.6 | 1.02 ± 0.13 |
149+
| `backhand-dist-musl` | 3.2 ± 0.3 | 2.0 | 4.1 | 1.17 ± 0.17 |
150+
| `backhand-dist-musl-native` | 3.1 ± 0.2 | 1.9 | 3.5 | 1.15 ± 0.14 |
151+
| `backhand-dist-gnu` | 2.7 ± 0.3 | 1.7 | 3.6 | 1.00 |
152+
| `backhand-dist-gnu-native` | 3.2 ± 0.3 | 1.8 | 3.8 | 1.20 ± 0.17 |
153+
| `squashfs-tools` | 3.4 ± 0.2 | 2.0 | 3.9 | 1.25 ± 0.16 |
139154

140155
## Heap Usage: `backhand/unsquashfs` vs `squashfs-tools/unsquashfs-4.6.1`
141156
```
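The `Relative` column in the wall-time tables above can be reproduced from the `Mean` column. The sketch below assumes (this is not stated in the diff) that each entry is the command's mean divided by the fastest mean in the same table, and that the `±` part comes from first-order propagation of uncertainty for a quotient. The helper name `relative` is hypothetical. The inputs are the rounded values printed in the new `openwrt-22.03.2-ath79-generic-tplink_archer-a7-v5-squashfs-factory.bin` table, so other rows may differ in the last digit.

```python
import math


def relative(mean, stddev, ref_mean, ref_stddev):
    """Return (ratio, uncertainty) of a command's mean time vs. the reference.

    Assumption: the two measurements are independent, so the relative
    errors add in quadrature (first-order propagation for a quotient).
    """
    ratio = mean / ref_mean
    uncertainty = ratio * math.sqrt(
        (stddev / mean) ** 2 + (ref_stddev / ref_mean) ** 2
    )
    return ratio, uncertainty


# (mean ms, stddev ms) copied from the new archer-a7-v5 table.
reference = (23.4, 1.2)  # backhand-dist-gnu-native, the fastest row
rows = {
    "backhand-dist-v0.22.0-musl": (41.6, 2.7),
    "squashfs-tools": (71.9, 8.4),
}

for name, (mean, stddev) in rows.items():
    ratio, unc = relative(mean, stddev, *reference)
    print(f"{name}: {ratio:.2f} ± {unc:.2f}")
```

For these two rows the sketch gives 1.78 ± 0.15 and 3.07 ± 0.39, matching the table, which supports the assumption without confirming how `bench.bash` actually computes the column.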

CHANGELOG.md

Lines changed: 7 additions & 0 deletions
@@ -6,12 +6,19 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
66
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
77

88
## [Unreleased]
9+
10+
## [v0.23.0] - 2025-06-19
911
### `backhand`
1012
- Add feature `parallel`, which enables internal parallelization when de-compressing data. When `parallel` is not used, the old behavior of reading without parallelization is used. ([#716](https://github.com/wcampbell0x2a/backhand/pull/716))
13+
- This substantially increases the speed of backhand-unsquashfs, removing about half of the wall time! See the new benchmarks for details.
1114
- Fix misaligned pointer loads when using Deku. thanks @bdash! ([#713](https://github.com/wcampbell0x2a/backhand/pull/713))
15+
- Fix incorrect assertion about file size ([#730](https://github.com/wcampbell0x2a/backhand/pull/730))
1216
- Use rust library `liblzma` instead of `xz2`. This bumps the version of XZ used to 5.8.1. ([#712](https://github.com/wcampbell0x2a/backhand/pull/712))
1317
- This also removes the need for `HAVE_DECODER` defines/CFLAGS when building xz, as `liblzma` enables them when building by default.
1418

19+
### `backhand-cli`
20+
- unsquashfs: Properly flush the file writer
21+
1522
### `backhand-cli`
1623
- Use `backhand` features `parallel` by default (and in release builds). Exposed by using `backhand-parallel`.
1724
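The changelog entry for the `parallel` feature says it removes "about half of the wall time". A rough way to check that claim against the new BENCHMARK.md tables is to compare the `backhand-dist-v0.22.0-musl` mean with the `backhand-dist-musl` mean for each image. The sketch below does that with the rounded means from the tables; the helper name `reduction` is hypothetical, and the comparison ignores the reported standard deviations.

```python
# (v0.22.0 musl mean, new musl mean) in ms, copied from the new tables.
means_ms = {
    "openwrt-22.03.2-ath79-generic-tplink_archer-a7-v5-squashfs-factory.bin": (41.6, 28.5),
    "openwrt-22.03.2-ipq40xx-generic-netgear_ex6100v2-squashfs-factory.img": (43.1, 29.1),
    "870D97.squashfs": (229.2, 84.6),
    "img-1571203182_vol-ubi_rootfs.ubifs": (144.9, 98.4),
    "2611E3.squashfs": (76.7, 52.0),
    "Plexamp-4.6.1.AppImage": (288.4, 123.3),
    "crates-io.squashfs": (351.7, 350.6),
    "airootfs.sfs": (2.8, 3.2),
}


def reduction(old_ms, new_ms):
    """Fraction of wall time removed going from old_ms to new_ms."""
    return 1.0 - new_ms / old_ms


for image, (old_ms, new_ms) in means_ms.items():
    # A negative value means the new build was slower on that image.
    print(f"{image}: {reduction(old_ms, new_ms):+.0%}")
```

On these rounded means the reduction is roughly 31% to 63% for six of the images, while `crates-io.squashfs` and `airootfs.sfs` show little or no improvement, so "about half" describes the best cases rather than every image.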
