Bart Massey
This code is originally by /u/bruce3434 on this Reddit
thread.
The fundamental issue was that dropping a BufWriter on top
of StdoutLocked sped the code up by a factor of 2× even
though the writes contained no newlines. This Reddit
comment
explains what is going on; this codebase is the underlying
code being measured.
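A minimal sketch of the three standard-library output strategies being compared may help here. It is not the benchmark code itself: the `emit` helper and its workload (short space-separated integers with no newlines) are hypothetical, chosen only to show how the writers are stacked. The type the text calls StdoutLocked is `std::io::StdoutLock` in the standard library.

```rust
use std::io::{self, BufWriter, Write};

// Hypothetical workload (an assumption, not the benchmark's real output):
// write `n` short records containing no newlines.
fn emit<W: Write>(out: &mut W, n: u32) -> io::Result<()> {
    for i in 0..n {
        write!(out, "{} ", i)?;
    }
    Ok(())
}

fn main() -> io::Result<()> {
    let n = 1000;
    let stdout = io::stdout();

    {
        // Unlocked `Stdout`: every write call acquires the lock again.
        let mut unlocked = io::stdout();
        emit(&mut unlocked, n)?;
    }
    {
        // `StdoutLock`: the lock is taken once and held for all writes.
        let mut locked = stdout.lock();
        emit(&mut locked, n)?;
    }
    {
        // `BufWriter` atop `StdoutLock`: writes are collected in a
        // user-space buffer and passed down in larger chunks.
        let mut buffered = BufWriter::new(stdout.lock());
        emit(&mut buffered, n)?;
        buffered.flush()?; // make sure the tail of the buffer is written
    }
    Ok(())
}
```

Because `emit` is generic over `Write`, all three variants produce identical bytes; only the path those bytes take to the file descriptor differs.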
- `glacial.rs` uses unlocked `Stdout`. This is really slow due to all the locking.
- `slow.rs` uses `StdoutLocked`. This is still pretty slow, for reasons explained in the comment above.
- `fast.rs` uses a `BufWriter` atop `StdoutLocked`. This is the version that is 2× faster than the slow version.
- `speedy.rs` uses a `BufWriter` atop a raw UNIX `File`. It is a little faster than the fast version, but is portable only to UNIX systems and has an `unsafe` in it.
- `turbo.c` is the original inspiration and about the fastest, a C implementation authored by DEC05EBA. Its speedup tricks are used by the other fast versions here.
- `turbo.rs` is a fairly straightforward port of `turbo.c`, which avoids standard library routines in favor of hand-calculation. `turbo.rs` is about 30% slower than `turbo.c`.
- `lightning.cpp` is a port of `turbo.rs` authored by DEC05EBA and contributed by Hossain Adnan. It uses a manual buffer. It is comparable in performance to `turbo.c`.
- `lightning.rs` is a port of `lightning.cpp` contributed by Hossain Adnan. It uses a manual buffer currently backed by `std::vec::Vec<u8>` along with POSIX `write()`. It's about 30% slower than `turbo.rs`.
- `ludicrous.rs` is a version by DEC05EBA that uses a handmade buffer. It is about 10% slower than `turbo.c`.
- `serious.rs` (not actually serious) is a C-like Rust implementation with tons of `unsafe` employing all the tricks. It is the same speed as `turbo.c` (currently insignificantly faster, actually), which is reasonable given that it's even uglier and no safer. "You can write FORTRAN in any language."
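The "manual buffer" and "hand-calculation" ideas mentioned in the list can be sketched roughly as follows. This is an illustration under stated assumptions, not code from any of the files above: `push_decimal`, `emit_manual`, the 64 KiB threshold, and the integer workload are all hypothetical, and the sketch flushes through `Write::write_all` rather than a direct POSIX `write()` so that it stays safe and portable.

```rust
use std::io::{self, Write};

/// Append the decimal digits of `n` to `buf` by hand,
/// without going through the `fmt` machinery.
fn push_decimal(buf: &mut Vec<u8>, mut n: u64) {
    let mut digits = [0u8; 20]; // u64::MAX has 20 decimal digits
    let mut start = digits.len();
    loop {
        start -= 1;
        digits[start] = b'0' + (n % 10) as u8;
        n /= 10;
        if n == 0 {
            break;
        }
    }
    buf.extend_from_slice(&digits[start..]);
}

/// Hypothetical workload: write `count` space-separated integers,
/// handing the buffer to `out` only once it has grown large.
fn emit_manual<W: Write>(out: &mut W, count: u64) -> io::Result<()> {
    const THRESHOLD: usize = 64 * 1024; // assumed size, not from the source
    let mut buf: Vec<u8> = Vec::with_capacity(THRESHOLD + 32);
    for i in 0..count {
        push_decimal(&mut buf, i);
        buf.push(b' ');
        if buf.len() >= THRESHOLD {
            out.write_all(&buf)?;
            buf.clear();
        }
    }
    // Write whatever is left over after the loop.
    out.write_all(&buf)
}

fn main() -> io::Result<()> {
    let stdout = io::stdout();
    let mut out = stdout.lock();
    emit_manual(&mut out, 1000)?;
    out.flush()
}
```

The point of the design is that the hot loop touches only a byte buffer the program owns, so the cost of formatting and of each system call is paid as rarely as possible.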
Many of these will run only on a POSIX system. I have tried them only on Linux.
Compiler choice matters for the faster C / C++ benchmarks
here. clang / gcc and clang++ / g++ will give
different results. By default clang and clang++ are used
to increase comparability with Rust's LLVM toolchain.
- To run the benchmarks:

  - Install Hyperfine with `cargo install hyperfine`
  - Build the Rust benchmarks with `cargo build --release`
  - Say `make bench`

  The results will be available in `BENCH.md`. Here are my results from 2022-11-29 on an AMD Ryzen 9 3900X with `rustc` 1.64.0 and `clang` / `clang++` 14.0.6. They are not significantly different than when run several years ago on older hardware.

- To check that the benchmarks produce the same output say `make check`. The `md5sum`s should match.