fix: Add some crate features for performance #2477
Conversation
Let's see if they do. Also, @mxinden, I was wondering why we went with a multi-threaded `tokio` client and server. Are the thread-management overheads worth it compared to using just the `rt` scheduler?
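For concreteness, here is a minimal sketch of what "just the `rt` scheduler" would look like, assuming the binaries construct their runtime explicitly (neqo's actual setup may differ). The current-thread runtime needs only tokio's `rt` crate feature, whereas the default multi-threaded runtime also requires `rt-multi-thread`:

```rust
use tokio::runtime::Builder;

fn main() -> std::io::Result<()> {
    // Current-thread scheduler: a single event loop on the calling thread.
    // Only tokio's `rt` feature is required (plus `net`/`time` for the I/O
    // and timer drivers that `enable_all` turns on).
    let rt = Builder::new_current_thread().enable_all().build()?;

    // The whole client (or server) is one top-level future, so under this
    // scheduler it never crosses a thread boundary.
    rt.block_on(async {
        // ... drive the client or server future here ...
    });
    Ok(())
}
```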
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

@@ Coverage Diff @@
##             main    #2477   +/-   ##
=======================================
  Coverage   94.91%   94.91%
=======================================
  Files         115      115
  Lines       34286    34286
  Branches    34286    34286
=======================================
  Hits        32543    32543
  Misses       1734     1734
  Partials        9        9
Failed Interop Tests
QUIC Interop Runner, client vs. server; differences relative to 37c3aee.
- neqo-latest as client
- neqo-latest as server

All results

Succeeded Interop Tests
QUIC Interop Runner, client vs. server.
- neqo-latest as client
- neqo-latest as server

Unsupported Interop Tests
QUIC Interop Runner, client vs. server.
- neqo-latest as client
- neqo-latest as server
Benchmark results
Performance differences relative to a341259.

- 1-conn/1-100mb-resp/mtu-1504 (aka. Download)/client: 💚 Performance has improved. time: [198.58 ms 198.94 ms 199.30 ms] thrpt: [501.76 MiB/s 502.67 MiB/s 503.58 MiB/s] change: time: [−2.0523% −1.7659% −1.4888%] (p = 0.00 < 0.05) thrpt: [+1.5113% +1.7976% +2.0953%]
- 1-conn/10_000-parallel-1b-resp/mtu-1504 (aka. RPS)/client: No change in performance detected. time: [302.98 ms 304.36 ms 305.74 ms] thrpt: [32.708 Kelem/s 32.856 Kelem/s 33.006 Kelem/s] change: time: [−0.0873% +0.5531% +1.2132%] (p = 0.09 > 0.05) thrpt: [−1.1987% −0.5500% +0.0874%]
- 1-conn/1-1b-resp/mtu-1504 (aka. HPS)/client: No change in performance detected. time: [27.349 ms 27.439 ms 27.558 ms] thrpt: [36.287 B/s 36.444 B/s 36.564 B/s] change: time: [−0.7745% −0.2244% +0.3429%] (p = 0.45 > 0.05) thrpt: [−0.3417% +0.2249% +0.7806%]
- 1-conn/1-100mb-req/mtu-1504 (aka. Upload)/client: 💚 Performance has improved. time: [622.90 ms 627.59 ms 632.27 ms] thrpt: [158.16 MiB/s 159.34 MiB/s 160.54 MiB/s] change: time: [−5.0735% −4.1248% −3.2188%] (p = 0.00 < 0.05) thrpt: [+3.3259% +4.3022% +5.3446%]
- decode 4096 bytes, mask ff: Change within noise threshold. time: [11.629 µs 11.672 µs 11.721 µs] change: [−1.4598% −1.0726% −0.5453%] (p = 0.00 < 0.05)
- decode 1048576 bytes, mask ff: Change within noise threshold. time: [3.0583 ms 3.0679 ms 3.0805 ms] change: [+0.7403% +1.2257% +1.7377%] (p = 0.00 < 0.05)
- decode 4096 bytes, mask 7f: 💚 Performance has improved. time: [19.363 µs 19.446 µs 19.562 µs] change: [−4.0416% −3.0524% −2.3825%] (p = 0.00 < 0.05)
- decode 1048576 bytes, mask 7f: Change within noise threshold. time: [5.0845 ms 5.0972 ms 5.1105 ms] change: [+0.4872% +0.8734% +1.2405%] (p = 0.00 < 0.05)
- decode 4096 bytes, mask 3f: 💚 Performance has improved. time: [5.5305 µs 5.5588 µs 5.5930 µs] change: [−33.157% −32.863% −32.534%] (p = 0.00 < 0.05)
- decode 1048576 bytes, mask 3f: 💔 Performance has regressed. time: [1.7873 ms 1.7997 ms 1.8123 ms] change: [+12.244% +13.138% +13.972%] (p = 0.00 < 0.05)
- 1000 streams of 1 bytes/multistream: 💔 Performance has regressed. time: [47.764 ns 47.945 ns 48.127 ns] change: [+29.708% +31.271% +32.821%] (p = 0.00 < 0.05)
- 1000 streams of 1000 bytes/multistream: 💔 Performance has regressed. time: [47.002 ns 47.177 ns 47.353 ns] change: [+22.520% +23.830% +25.204%] (p = 0.00 < 0.05)
- coalesce_acked_from_zero 1+1 entries: No change in performance detected. time: [88.065 ns 88.418 ns 88.774 ns] change: [−0.4540% −0.0593% +0.3303%] (p = 0.77 > 0.05)
- coalesce_acked_from_zero 3+1 entries: No change in performance detected. time: [105.56 ns 106.12 ns 106.83 ns] change: [−0.4586% +0.0028% +0.4900%] (p = 0.99 > 0.05)
- coalesce_acked_from_zero 10+1 entries: No change in performance detected. time: [104.67 ns 105.03 ns 105.46 ns] change: [−0.3186% +0.1905% +0.9788%] (p = 0.63 > 0.05)
- coalesce_acked_from_zero 1000+1 entries: No change in performance detected. time: [88.603 ns 88.775 ns 88.973 ns] change: [−1.1413% −0.3373% +0.4524%] (p = 0.44 > 0.05)
- RxStreamOrderer::inbound_frame(): No change in performance detected. time: [107.89 ms 107.96 ms 108.03 ms] change: [−0.2639% −0.0098% +0.1676%] (p = 0.94 > 0.05)
- sent::Packets::take_ranges: No change in performance detected. time: [8.0893 µs 8.3207 µs 8.5443 µs] change: [−3.3996% +3.7280% +13.696%] (p = 0.52 > 0.05)
- transfer/pacing-false/varying-seeds: Change within noise threshold. time: [37.216 ms 37.293 ms 37.371 ms] change: [+0.5064% +0.8545% +1.2038%] (p = 0.00 < 0.05)
- transfer/pacing-true/varying-seeds: Change within noise threshold. time: [37.913 ms 38.031 ms 38.155 ms] change: [+0.7331% +1.1975% +1.6586%] (p = 0.00 < 0.05)
- transfer/pacing-false/same-seed: Change within noise threshold. time: [36.575 ms 36.650 ms 36.735 ms] change: [−0.6683% −0.3588% −0.0384%] (p = 0.02 < 0.05)
- transfer/pacing-true/same-seed: Change within noise threshold. time: [38.728 ms 38.817 ms 38.911 ms] change: [+1.6254% +1.9650% +2.2815%] (p = 0.00 < 0.05)

Client/server transfer results
Performance differences relative to a341259. Transfer of 33554432 bytes over loopback, min. 100 runs. All unit-less numbers are in milliseconds.
I chose multi-threaded as it is the de-facto default. No other reason.
👍 Worth experimenting. Intuitively, given that it is only a single future, there is no cross-thread communication and thus no significant overhead.
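One way to run that experiment, sketched under made-up assumptions (the `workload` future below is a hypothetical stand-in for a transfer, not neqo's benchmark harness):

```rust
use std::time::Instant;

use tokio::runtime::Builder;

// Hypothetical stand-in for one client/server run: a single top-level
// future that yields back to the scheduler many times.
async fn workload() {
    for _ in 0..1_000_000 {
        tokio::task::yield_now().await;
    }
}

fn main() -> std::io::Result<()> {
    // Default multi-threaded scheduler (requires the `rt-multi-thread` feature).
    let multi = Builder::new_multi_thread().enable_all().build()?;
    let start = Instant::now();
    multi.block_on(workload());
    println!("multi_thread:   {:?}", start.elapsed());

    // Current-thread scheduler (`rt` feature only): no worker threads, so a
    // single future pays no cross-thread wakeup or work-stealing costs.
    let current = Builder::new_current_thread().enable_all().build()?;
    let start = Instant::now();
    current.block_on(workload());
    println!("current_thread: {:?}", start.elapsed());
    Ok(())
}
```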
I am fine merging here. That said, I would prefer individual pull requests per feature, to ensure that each change, and not just all changes as a whole, has a positive performance impact. In addition, I don't think we should merge this before we have reliable benchmarks, i.e., not before #2657 is fixed.
Branch: fix-features
Testbed: t-linux64-ms-280

🚨 1 Alert

Benchmark | Measure (Units) | View | Benchmark Result (Result Δ%) | Upper Boundary (Limit %)
---|---|---|---|---
decode 1048576 bytes, mask ff | Latency milliseconds (ms) | 📈 plot 🚷 threshold 🚨 alert (🔔) | 3.07 ms (+1.15%), baseline 3.04 ms | 3.07 ms (100.04%)
Click to view all benchmark results

Benchmark | Latency | Benchmark Result nanoseconds (ns) (Result Δ%) | Upper Boundary nanoseconds (ns) (Limit %)
---|---|---|---
1-conn/1-100mb-req/mtu-1504 (aka. Upload)/client | 📈 view plot 🚷 view threshold | 643,680,000.00 ns (-3.09%), baseline 664,191,369.86 ns | 732,128,729.16 ns (87.92%)
1-conn/1-100mb-resp/mtu-1504 (aka. Download)/client | 📈 view plot 🚷 view threshold | 636,170,000.00 ns (+0.77%), baseline 631,338,493.15 ns | 833,045,917.44 ns (76.37%)
1-conn/1-1b-resp/mtu-1504 (aka. HPS)/client | 📈 view plot 🚷 view threshold | 27,135,000.00 ns (-0.21%), baseline 27,193,041.10 ns | 27,664,766.42 ns (98.09%)
1-conn/10_000-parallel-1b-resp/mtu-1504 (aka. RPS)/client | 📈 view plot 🚷 view threshold | 303,130,000.00 ns (-0.56%), baseline 304,823,150.68 ns | 316,111,262.80 ns (95.89%)
1000 streams of 1 bytes/multistream | 📈 view plot 🚷 view threshold | 35.80 ns (-4.67%), baseline 37.55 ns | 54.48 ns (65.71%)
1000 streams of 1000 bytes/multistream | 📈 view plot 🚷 view threshold | 33.94 ns (-8.45%), baseline 37.07 ns | 54.08 ns (62.76%)
RxStreamOrderer::inbound_frame() | 📈 view plot 🚷 view threshold | 111,300,000.00 ns (+0.73%), baseline 110,495,356.16 ns | 114,591,968.70 ns (97.13%)
coalesce_acked_from_zero 1+1 entries | 📈 view plot 🚷 view threshold | 88.48 ns (-0.23%), baseline 88.69 ns | 89.30 ns (99.08%)
coalesce_acked_from_zero 10+1 entries | 📈 view plot 🚷 view threshold | 106.24 ns (+0.28%), baseline 105.94 ns | 106.89 ns (99.39%)
coalesce_acked_from_zero 1000+1 entries | 📈 view plot 🚷 view threshold | 89.08 ns (-0.24%), baseline 89.29 ns | 91.62 ns (97.23%)
coalesce_acked_from_zero 3+1 entries | 📈 view plot 🚷 view threshold | 106.45 ns (-0.08%), baseline 106.53 ns | 107.40 ns (99.11%)
decode 1048576 bytes, mask 3f | 📈 view plot 🚷 view threshold | 1,760,800.00 ns (+8.76%), baseline 1,618,995.89 ns | 1,772,698.80 ns (99.33%)
decode 1048576 bytes, mask 7f | 📈 view plot 🚷 view threshold | 5,090,000.00 ns (+0.51%), baseline 5,064,032.88 ns | 5,092,127.33 ns (99.96%)
decode 1048576 bytes, mask ff | 📈 view plot 🚷 view threshold 🚨 view alert (🔔) | 3,071,200.00 ns (+1.15%), baseline 3,036,191.78 ns | 3,069,876.22 ns (100.04%)
decode 4096 bytes, mask 3f | 📈 view plot 🚷 view threshold | 5,534.50 ns (-29.78%), baseline 7,881.99 ns | 10,226.42 ns (54.12%)
decode 4096 bytes, mask 7f | 📈 view plot 🚷 view threshold | 19,422.00 ns (-2.45%), baseline 19,910.25 ns | 20,417.73 ns (95.12%)
decode 4096 bytes, mask ff | 📈 view plot 🚷 view threshold | 11,627.00 ns (-1.57%), baseline 11,812.75 ns | 11,980.92 ns (97.05%)
sent::Packets::take_ranges | 📈 view plot 🚷 view threshold | 8,283.40 ns (-1.34%), baseline 8,396.24 ns | 8,610.25 ns (96.20%)
transfer/pacing-false/same-seed | 📈 view plot 🚷 view threshold | 35,787,000.00 ns (+2.48%), baseline 34,920,095.89 ns | 36,602,828.37 ns (97.77%)
transfer/pacing-false/varying-seeds | 📈 view plot 🚷 view threshold | 35,447,000.00 ns (+1.09%), baseline 35,063,780.82 ns | 36,793,170.04 ns (96.34%)
transfer/pacing-true/same-seed | 📈 view plot 🚷 view threshold | 37,384,000.00 ns (+2.34%), baseline 36,530,616.44 ns | 38,137,907.42 ns (98.02%)
transfer/pacing-true/varying-seeds | 📈 view plot 🚷 view threshold | 36,577,000.00 ns (+1.88%), baseline 35,901,520.55 ns | 37,524,596.72 ns (97.47%)
Branch: fix-features
Testbed: t-linux64-ms-279

Click to view all benchmark results

Benchmark | Latency | Benchmark Result nanoseconds (ns) (Result Δ%) | Upper Boundary nanoseconds (ns) (Limit %)
---|---|---|---
1-conn/1-100mb-req/mtu-1504 (aka. Upload)/client | 📈 view plot 🚷 view threshold | 627,590,000.00 ns (-16.09%), baseline 747,942,500.00 ns | 1,240,194,385.25 ns (50.60%)
1-conn/1-100mb-resp/mtu-1504 (aka. Download)/client | 📈 view plot 🚷 view threshold | 198,940,000.00 ns (-52.41%), baseline 418,047,500.00 ns | 1,407,118,298.59 ns (14.14%)
1-conn/1-1b-resp/mtu-1504 (aka. HPS)/client | 📈 view plot 🚷 view threshold | 27,439,000.00 ns (+0.77%), baseline 27,228,750.00 ns | 28,333,645.74 ns (96.84%)
1-conn/10_000-parallel-1b-resp/mtu-1504 (aka. RPS)/client | 📈 view plot 🚷 view threshold | 304,360,000.00 ns (+1.17%), baseline 300,840,000.00 ns | 313,432,123.06 ns (97.11%)
1000 streams of 1 bytes/multistream | 📈 view plot 🚷 view threshold | 47.95 ns (+33.04%), baseline 36.04 ns | 69.67 ns (68.81%)
1000 streams of 1000 bytes/multistream | 📈 view plot 🚷 view threshold | 47.18 ns (+22.32%), baseline 38.57 ns | 62.10 ns (75.97%)
RxStreamOrderer::inbound_frame() | 📈 view plot 🚷 view threshold | 107,960,000.00 ns (-0.88%), baseline 108,920,000.00 ns | 116,055,769.08 ns (93.02%)
coalesce_acked_from_zero 1+1 entries | 📈 view plot 🚷 view threshold | 88.42 ns (+0.04%), baseline 88.39 ns | 89.08 ns (99.25%)
coalesce_acked_from_zero 10+1 entries | 📈 view plot 🚷 view threshold | 105.03 ns (-0.12%), baseline 105.16 ns | 106.22 ns (98.88%)
coalesce_acked_from_zero 1000+1 entries | 📈 view plot 🚷 view threshold | 88.78 ns (-2.06%), baseline 90.64 ns | 99.03 ns (89.65%)
coalesce_acked_from_zero 3+1 entries | 📈 view plot 🚷 view threshold | 106.12 ns (+0.17%), baseline 105.94 ns | 106.90 ns (99.27%)
decode 1048576 bytes, mask 3f | 📈 view plot 🚷 view threshold | 1,799,700.00 ns (+9.55%), baseline 1,642,750.00 ns | 2,054,214.67 ns (87.61%)
decode 1048576 bytes, mask 7f | 📈 view plot 🚷 view threshold | 5,097,200.00 ns (+0.65%), baseline 5,064,475.00 ns | 5,154,319.19 ns (98.89%)
decode 1048576 bytes, mask ff | 📈 view plot 🚷 view threshold | 3,067,900.00 ns (+0.93%), baseline 3,039,775.00 ns | 3,113,852.76 ns (98.52%)
decode 4096 bytes, mask 3f | 📈 view plot 🚷 view threshold | 5,558.80 ns (-26.89%), baseline 7,603.60 ns | 12,964.41 ns (42.88%)
decode 4096 bytes, mask 7f | 📈 view plot 🚷 view threshold | 19,446.00 ns (-2.06%), baseline 19,854.25 ns | 20,924.94 ns (92.93%)
decode 4096 bytes, mask ff | 📈 view plot 🚷 view threshold | 11,672.00 ns (-1.22%), baseline 11,816.50 ns | 12,226.78 ns (95.46%)
sent::Packets::take_ranges | 📈 view plot 🚷 view threshold | 8,320.70 ns (+1.78%), baseline 8,175.35 ns | 8,875.44 ns (93.75%)
transfer/pacing-false/same-seed | 📈 view plot 🚷 view threshold | 36,650,000.00 ns (+1.92%), baseline 35,961,000.00 ns | 39,418,403.09 ns (92.98%)
transfer/pacing-false/varying-seeds | 📈 view plot 🚷 view threshold | 37,293,000.00 ns (+2.89%), baseline 36,245,500.00 ns | 40,338,863.04 ns (92.45%)
transfer/pacing-true/same-seed | 📈 view plot 🚷 view threshold | 38,817,000.00 ns (+3.64%), baseline 37,452,500.00 ns | 42,197,319.94 ns (91.99%)
transfer/pacing-true/varying-seeds | 📈 view plot 🚷 view threshold | 38,031,000.00 ns (+3.16%), baseline 36,865,250.00 ns | 41,255,039.36 ns (92.19%)
Branch: fix-features
Testbed: t-linux64-ms-279

Click to view all benchmark results

Benchmark | Latency | Benchmark Result milliseconds (ms) (Result Δ%) | Upper Boundary milliseconds (ms) (Limit %)
---|---|---|---
s2n vs. neqo (cubic, paced) | 📈 view plot 🚷 view threshold | 170.83 ms (-28.03%), baseline 237.36 ms | 538.39 ms (31.73%)
Branch: fix-features
Testbed: t-linux64-ms-278

Click to view all benchmark results

Benchmark | Latency | Benchmark Result nanoseconds (ns) (Result Δ%) | Upper Boundary nanoseconds (ns) (Limit %)
---|---|---|---
1-conn/1-100mb-req/mtu-1504 (aka. Upload)/client | 📈 view plot 🚷 view threshold | 644,780,000.00 ns (-22.62%), baseline 833,280,000.00 ns | 1,135,572,397.07 ns (56.78%)
1-conn/1-100mb-resp/mtu-1504 (aka. Download)/client | 📈 view plot 🚷 view threshold | 655,620,000.00 ns (+0.28%), baseline 653,791,666.67 ns | 675,797,849.36 ns (97.01%)
1-conn/1-1b-resp/mtu-1504 (aka. HPS)/client | 📈 view plot 🚷 view threshold | 27,119,000.00 ns (-0.06%), baseline 27,134,166.67 ns | 27,330,233.80 ns (99.23%)
1-conn/10_000-parallel-1b-resp/mtu-1504 (aka. RPS)/client | 📈 view plot 🚷 view threshold | 298,380,000.00 ns (-0.07%), baseline 298,586,666.67 ns | 301,752,185.80 ns (98.88%)
1000 streams of 1 bytes/multistream | 📈 view plot 🚷 view threshold | 36.50 ns (+13.87%), baseline 32.05 ns | 44.97 ns (81.16%)
1000 streams of 1000 bytes/multistream | 📈 view plot 🚷 view threshold | 48.14 ns (+40.65%), baseline 34.23 ns | 57.58 ns (83.60%)
RxStreamOrderer::inbound_frame() | 📈 view plot 🚷 view threshold | 108,040,000.00 ns (+0.44%), baseline 107,566,666.67 ns | 108,789,772.31 ns (99.31%)
coalesce_acked_from_zero 1+1 entries | 📈 view plot 🚷 view threshold | 88.51 ns (-0.42%), baseline 88.88 ns | 90.79 ns (97.49%)
coalesce_acked_from_zero 10+1 entries | 📈 view plot 🚷 view threshold | 105.27 ns (-0.52%), baseline 105.82 ns | 107.01 ns (98.37%)
coalesce_acked_from_zero 1000+1 entries | 📈 view plot 🚷 view threshold | 90.32 ns (+1.01%), baseline 89.42 ns | 91.19 ns (99.05%)
coalesce_acked_from_zero 3+1 entries | 📈 view plot 🚷 view threshold | 105.76 ns (-0.53%), baseline 106.32 ns | 107.76 ns (98.14%)
decode 1048576 bytes, mask 3f | 📈 view plot 🚷 view threshold | 1,784,300.00 ns (+9.65%), baseline 1,627,216.67 ns | 1,863,844.66 ns (95.73%)
decode 1048576 bytes, mask 7f | 📈 view plot 🚷 view threshold | 5,098,800.00 ns (+0.77%), baseline 5,059,716.67 ns | 5,121,583.33 ns (99.56%)
decode 1048576 bytes, mask ff | 📈 view plot 🚷 view threshold | 3,073,400.00 ns (+1.09%), baseline 3,040,133.33 ns | 3,090,833.22 ns (99.44%)
decode 4096 bytes, mask 3f | 📈 view plot 🚷 view threshold | 5,558.10 ns (-29.14%), baseline 7,844.00 ns | 11,284.48 ns (49.25%)
decode 4096 bytes, mask 7f | 📈 view plot 🚷 view threshold | 19,372.00 ns (-2.70%), baseline 19,909.00 ns | 20,724.64 ns (93.47%)
decode 4096 bytes, mask ff | 📈 view plot 🚷 view threshold | 11,642.00 ns (-1.47%), baseline 11,816.00 ns | 12,085.52 ns (96.33%)
sent::Packets::take_ranges | 📈 view plot 🚷 view threshold | 8,293.30 ns (+1.08%), baseline 8,204.87 ns | 8,385.40 ns (98.90%)
transfer/pacing-false/same-seed | 📈 view plot 🚷 view threshold | 35,240,000.00 ns (+1.40%), baseline 34,753,000.00 ns | 35,696,133.31 ns (98.72%)
transfer/pacing-false/varying-seeds | 📈 view plot 🚷 view threshold | 35,181,000.00 ns (+1.13%), baseline 34,789,000.00 ns | 35,833,247.53 ns (98.18%)
transfer/pacing-true/same-seed | 📈 view plot 🚷 view threshold | 36,688,000.00 ns (+1.16%), baseline 36,266,166.67 ns | 37,342,750.44 ns (98.25%)
transfer/pacing-true/varying-seeds | 📈 view plot 🚷 view threshold | 36,264,000.00 ns (+1.93%), baseline 35,575,833.33 ns | 36,866,018.78 ns (98.37%)
Branch: fix-features
Testbed: t-linux64-ms-278

Click to view all benchmark results

Benchmark | Latency | Benchmark Result milliseconds (ms) (Result Δ%) | Upper Boundary milliseconds (ms) (Limit %)
---|---|---|---
s2n vs. neqo (cubic, paced) | 📈 view plot 🚷 view threshold | 300.26 ms (-1.12%), baseline 303.67 ms | 315.23 ms (95.25%)
Benchmark results
Performance differences relative to 5387454.

- 1-conn/1-100mb-resp/mtu-1504 (aka. Download)/client: 💚 Performance has improved. time: [199.06 ms 199.35 ms 199.65 ms] thrpt: [500.87 MiB/s 501.62 MiB/s 502.36 MiB/s] change: time: [−2.2498% −1.8915% −1.5688%] (p = 0.00 < 0.05) thrpt: [+1.5938% +1.9280% +2.3015%]
- 1-conn/10_000-parallel-1b-resp/mtu-1504 (aka. RPS)/client: No change in performance detected. time: [300.97 ms 302.53 ms 304.09 ms] thrpt: [32.885 Kelem/s 33.054 Kelem/s 33.226 Kelem/s] change: time: [−1.1631% −0.4865% +0.1677%] (p = 0.17 > 0.05) thrpt: [−0.1674% +0.4889% +1.1768%]
- 1-conn/1-1b-resp/mtu-1504 (aka. HPS)/client: No change in performance detected. time: [27.466 ms 27.552 ms 27.652 ms] thrpt: [36.164 B/s 36.295 B/s 36.408 B/s] change: time: [−0.7318% −0.1418% +0.3798%] (p = 0.63 > 0.05) thrpt: [−0.3784% +0.1420% +0.7372%]
- 1-conn/1-100mb-req/mtu-1504 (aka. Upload)/client: 💚 Performance has improved. time: [632.99 ms 636.81 ms 640.61 ms] thrpt: [156.10 MiB/s 157.03 MiB/s 157.98 MiB/s] change: time: [−3.5012% −2.5727% −1.5710%] (p = 0.00 < 0.05) thrpt: [+1.5961% +2.6406% +3.6282%]
- decode 4096 bytes, mask ff: 💚 Performance has improved. time: [11.617 µs 11.651 µs 11.693 µs] change: [−1.8753% −1.5051% −1.1287%] (p = 0.00 < 0.05)
- decode 1048576 bytes, mask ff: Change within noise threshold. time: [3.0609 ms 3.0704 ms 3.0816 ms] change: [+0.8765% +1.3615% +1.8152%] (p = 0.00 < 0.05)
- decode 4096 bytes, mask 7f: 💚 Performance has improved. time: [19.380 µs 19.433 µs 19.490 µs] change: [−3.0589% −2.6482% −2.2598%] (p = 0.00 < 0.05)
- decode 1048576 bytes, mask 7f: Change within noise threshold. time: [5.0899 ms 5.1123 ms 5.1425 ms] change: [+0.2416% +0.9940% +1.8198%] (p = 0.01 < 0.05)
- decode 4096 bytes, mask 3f: 💚 Performance has improved. time: [5.5223 µs 5.5392 µs 5.5632 µs] change: [−33.607% −33.133% −32.618%] (p = 0.00 < 0.05)
- decode 1048576 bytes, mask 3f: 💔 Performance has regressed. time: [1.7577 ms 1.7579 ms 1.7580 ms] change: [+9.9304% +10.412% +10.805%] (p = 0.00 < 0.05)
- 1000 streams of 1 bytes/multistream: 💔 Performance has regressed. time: [46.961 ns 47.141 ns 47.319 ns] change: [+25.604% +27.038% +28.507%] (p = 0.00 < 0.05)
- 1000 streams of 1000 bytes/multistream: 💔 Performance has regressed. time: [46.879 ns 47.087 ns 47.298 ns] change: [+31.174% +32.753% +34.387%] (p = 0.00 < 0.05)
- coalesce_acked_from_zero 1+1 entries: No change in performance detected. time: [88.056 ns 88.395 ns 88.731 ns] change: [−0.7580% −0.1239% +0.6309%] (p = 0.74 > 0.05)
- coalesce_acked_from_zero 3+1 entries: Change within noise threshold. time: [105.52 ns 105.89 ns 106.27 ns] change: [−1.1268% −0.6448% −0.1497%] (p = 0.01 < 0.05)
- coalesce_acked_from_zero 10+1 entries: No change in performance detected. time: [104.94 ns 105.35 ns 105.87 ns] change: [−0.6239% +0.1725% +1.2132%] (p = 0.77 > 0.05)
- coalesce_acked_from_zero 1000+1 entries: No change in performance detected. time: [88.740 ns 88.837 ns 88.954 ns] change: [−1.4282% −0.4340% +0.6132%] (p = 0.43 > 0.05)
- RxStreamOrderer::inbound_frame(): No change in performance detected. time: [107.94 ms 108.07 ms 108.22 ms] change: [−0.4014% −0.1339% +0.0813%] (p = 0.32 > 0.05)
- sent::Packets::take_ranges: No change in performance detected. time: [8.0711 µs 8.2591 µs 8.4299 µs] change: [−1.8308% +4.6497% +16.282%] (p = 0.39 > 0.05)
- transfer/pacing-false/varying-seeds: Change within noise threshold. time: [36.908 ms 37.009 ms 37.125 ms] change: [+0.5332% +0.9068% +1.2643%] (p = 0.00 < 0.05)
- transfer/pacing-true/varying-seeds: Change within noise threshold. time: [37.866 ms 37.985 ms 38.106 ms] change: [+1.2718% +1.7401% +2.1818%] (p = 0.00 < 0.05)
- transfer/pacing-false/same-seed: Change within noise threshold. time: [36.809 ms 36.895 ms 36.995 ms] change: [+1.4011% +1.7223% +2.0394%] (p = 0.00 < 0.05)
- transfer/pacing-true/same-seed: Change within noise threshold. time: [38.809 ms 38.888 ms 38.970 ms] change: [+1.9028% +2.2730% +2.6189%] (p = 0.00 < 0.05)
Client/server transfer results
Performance differences relative to 5387454. Transfer of 33554432 bytes over loopback, min. 100 runs. All unit-less numbers are in milliseconds.
I think there is a weak signal that there is some performance benefit here.
Client/server transfer results
Performance differences relative to 76a8a60. Transfer of 33554432 bytes over loopback, min. 100 runs. All unit-less numbers are in milliseconds.
Benchmark results
Performance differences relative to 76a8a60.

- 1-conn/1-100mb-resp/mtu-1504 (aka. Download)/client: Change within noise threshold. time: [200.62 ms 200.95 ms 201.27 ms] thrpt: [496.84 MiB/s 497.64 MiB/s 498.45 MiB/s] change: time: [+0.3888% +0.6600% +0.9123%] (p = 0.00 < 0.05) thrpt: [−0.9041% −0.6556% −0.3872%]
- 1-conn/10_000-parallel-1b-resp/mtu-1504 (aka. RPS)/client: No change in performance detected. time: [303.63 ms 305.10 ms 306.58 ms] thrpt: [32.618 Kelem/s 32.776 Kelem/s 32.935 Kelem/s] change: time: [−1.1869% −0.5339% +0.1572%] (p = 0.12 > 0.05) thrpt: [−0.1570% +0.5368% +1.2011%]
- 1-conn/1-1b-resp/mtu-1504 (aka. HPS)/client: Change within noise threshold. time: [27.593 ms 27.711 ms 27.851 ms] thrpt: [35.906 B/s 36.087 B/s 36.241 B/s] change: time: [+0.0765% +0.6617% +1.2188%] (p = 0.02 < 0.05) thrpt: [−1.2042% −0.6573% −0.0765%]
- 1-conn/1-100mb-req/mtu-1504 (aka. Upload)/client: No change in performance detected. time: [632.30 ms 636.66 ms 640.95 ms] thrpt: [156.02 MiB/s 157.07 MiB/s 158.15 MiB/s] change: time: [−0.5513% +0.3484% +1.1860%] (p = 0.44 > 0.05) thrpt: [−1.1721% −0.3472% +0.5543%]
- decode 4096 bytes, mask ff: Change within noise threshold. time: [11.608 µs 11.748 µs 12.010 µs] change: [−2.9510% −1.9178% −0.7981%] (p = 0.00 < 0.05)
- decode 1048576 bytes, mask ff: Change within noise threshold. time: [3.0624 ms 3.0717 ms 3.0826 ms] change: [+0.9793% +1.4487% +1.8947%] (p = 0.00 < 0.05)
- decode 4096 bytes, mask 7f: 💚 Performance has improved. time: [19.333 µs 19.372 µs 19.417 µs] change: [−4.3908% −3.7307% −3.2190%] (p = 0.00 < 0.05)
- decode 1048576 bytes, mask 7f: Change within noise threshold. time: [5.0855 ms 5.0988 ms 5.1138 ms] change: [+0.4164% +0.8193% +1.1970%] (p = 0.00 < 0.05)
- decode 4096 bytes, mask 3f: 💚 Performance has improved. time: [5.5231 µs 5.5830 µs 5.6974 µs] change: [−33.422% −32.886% −32.152%] (p = 0.00 < 0.05)
- decode 1048576 bytes, mask 3f: 💔 Performance has regressed. time: [1.7579 ms 1.7608 ms 1.7651 ms] change: [+9.4512% +10.305% +10.985%] (p = 0.00 < 0.05)
- 1000 streams of 1 bytes/multistream: 💔 Performance has regressed. time: [36.759 ns 37.233 ns 37.706 ns] change: [+30.034% +31.950% +33.949%] (p = 0.00 < 0.05)
- 1000 streams of 1000 bytes/multistream: 💔 Performance has regressed. time: [36.784 ns 41.017 ns 49.089 ns] change: [+29.043% +44.089% +72.664%] (p = 0.00 < 0.05)
- coalesce_acked_from_zero 1+1 entries: No change in performance detected. time: [87.836 ns 88.148 ns 88.457 ns] change: [−0.8316% −0.2213% +0.3933%] (p = 0.49 > 0.05)
- coalesce_acked_from_zero 3+1 entries: No change in performance detected. time: [105.48 ns 105.80 ns 106.13 ns] change: [−0.1882% +0.1444% +0.5082%] (p = 0.42 > 0.05)
- coalesce_acked_from_zero 10+1 entries: Change within noise threshold. time: [104.66 ns 104.90 ns 105.25 ns] change: [−2.0151% −1.1299% −0.4133%] (p = 0.00 < 0.05)
- coalesce_acked_from_zero 1000+1 entries: No change in performance detected. time: [88.792 ns 89.062 ns 89.469 ns] change: [−0.9132% +0.0697% +1.0494%] (p = 0.89 > 0.05)
- RxStreamOrderer::inbound_frame(): No change in performance detected. time: [108.18 ms 108.29 ms 108.44 ms] change: [−0.0366% +0.0835% +0.2346%] (p = 0.26 > 0.05)
- sent::Packets::take_ranges: No change in performance detected. time: [7.9960 µs 8.2006 µs 8.3887 µs] change: [−1.8530% +4.6956% +15.160%] (p = 0.39 > 0.05)
- transfer/pacing-false/varying-seeds: Change within noise threshold. time: [36.552 ms 36.618 ms 36.684 ms] change: [−2.4071% −2.0832% −1.7592%] (p = 0.00 < 0.05)
- transfer/pacing-true/varying-seeds: Change within noise threshold. time: [37.331 ms 37.444 ms 37.563 ms] change: [−3.0125% −2.5383% −2.1055%] (p = 0.00 < 0.05)
- transfer/pacing-false/same-seed: Change within noise threshold. time: [36.476 ms 36.537 ms 36.599 ms] change: [−2.4903% −2.2703% −2.0461%] (p = 0.00 < 0.05)
- transfer/pacing-true/same-seed: Change within noise threshold. time: [38.046 ms 38.131 ms 38.215 ms] change: [−3.0493% −2.7758% −2.4806%] (p = 0.00 < 0.05)