UUIDv7 is a small, high-performance Java library for generating UUID version 7 identifiers, combining a 48-bit millisecond timestamp with 74 random bits. Unlike the standard `java.util.UUID.randomUUID()`, this implementation:

- Uses `ThreadLocalRandom` (a fast, non-blocking PRNG) instead of `SecureRandom`.
- Avoids intermediate `byte[16]` allocations and `ByteBuffer` overhead.
- Produces roughly 50× the throughput of a naïve implementation while still conforming to the RFC 9562 layout.
- Minimizes per-UUID garbage (≈32 B per call vs. ≈176 B in a typical implementation).

UUIDv7 is ideal for distributed systems, microservices, databases, and high-throughput applications that need time-sortable unique identifiers without sacrificing performance.
- Fully RFC 9562-compliant UUIDv7 layout: 48-bit timestamp, 4-bit version, 2-bit variant, 74 random bits (see the layout sketch below).
- Extremely low overhead: only one `UUID` object allocation per call (≈32 bytes).
- High throughput: benchmarks show ~200 million UUIDs/s on modern hardware.
- Thread-safe: uses `ThreadLocalRandom` internally.
- Zero external dependencies beyond the JDK.
- Java 8+ compatible (tested up through Java 17/21).
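
For reference, the layout in the first feature maps onto the two 64-bit halves of `java.util.UUID` roughly as follows. This is an illustrative sketch of the RFC 9562 bit layout, not this library's internal code:

```java
import java.util.UUID;
import java.util.concurrent.ThreadLocalRandom;

// Illustrative sketch of the RFC 9562 UUIDv7 bit layout; not this library's internal code.
final class UuidV7LayoutSketch {

    static UUID next() {
        long timestampMillis = System.currentTimeMillis();   // 48-bit Unix epoch milliseconds
        ThreadLocalRandom random = ThreadLocalRandom.current();

        // Most significant 64 bits: 48-bit timestamp | 4-bit version (0b0111) | 12 random bits.
        long msb = (timestampMillis << 16)
                | 0x7000L
                | (random.nextLong() & 0x0FFFL);

        // Least significant 64 bits: 2-bit variant (0b10) | 62 random bits.
        long lsb = 0x8000000000000000L
                | (random.nextLong() & 0x3FFFFFFFFFFFFFFFL);

        return new UUID(msb, lsb);   // the only per-call allocation
    }
}
```

Because the timestamp occupies the most significant bits, the resulting identifiers sort by creation time.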
```xml
<dependency>
    <groupId>io.github.robsonkades</groupId>
    <artifactId>uuidv7</artifactId>
    <version>1.0.1</version>
</dependency>
```

After adding the dependency, run:

```bash
mvn clean install
```
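
A minimal usage sketch follows. The package, class, and factory-method names (`UUIDv7.randomUUID()`) are assumptions based on the artifact name, so check the project's Javadoc for the exact API:

```java
import java.util.UUID;

import io.github.robsonkades.uuidv7.UUIDv7; // assumed package and class name; verify against the Javadoc

public class QuickStart {
    public static void main(String[] args) {
        // Assumed factory method returning a standard java.util.UUID.
        UUID id = UUIDv7.randomUUID();

        System.out.println(id);           // time-ordered identifier, e.g. for database keys
        System.out.println(id.version()); // prints 7 for a UUIDv7
    }
}
```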
Below is a summary of benchmark results comparing the naïve UUIDv7 implementation (`SecureRandom` + `ByteBuffer`) with this optimized implementation (`ThreadLocalRandom` + bitwise assembly). All measurements were taken on a modern 8-core CPU (Intel/AMD), Java 17, Linux, SSD, with JMH settings: 5 warmup iterations, 5 measurement iterations, 2 forks, single-threaded throughput mode.
| Implementation | Throughput (ops/ms) | Bytes Allocated per UUID (B/op) | GC Alloc Rate (MB/s) |
|---|---|---|---|
| `SecureRandom` + `ByteBuffer` | ~4 725 (≈4.7 M/s) | ~176 B | ~793 MB/s |
| Optimized (`ThreadLocalRandom`) | ~227 174 (≈227 M/s) | ~32 B | ~6 931 MB/s |
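
For context, a JMH harness with the settings quoted above looks roughly like the sketch below. It is not the project's actual benchmark class, and `UUIDv7.randomUUID()` is the same assumed entry point as in the usage example:

```java
import java.util.UUID;
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Warmup;

// Sketch of a JMH benchmark matching the quoted settings:
// 5 warmup iterations, 5 measurement iterations, 2 forks, single-threaded throughput mode.
@BenchmarkMode(Mode.Throughput)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@Warmup(iterations = 5)
@Measurement(iterations = 5)
@Fork(2)
public class UuidV7Benchmark {

    @Benchmark
    public UUID optimized() {
        // Assumed entry point of this library; returning the value prevents dead-code elimination.
        return UUIDv7.randomUUID();
    }

    @Benchmark
    public UUID jdkRandomUuid() {
        // Reference point only: the JDK's SecureRandom-backed version 4 UUID.
        // The "naïve" row in the table above used a hand-rolled SecureRandom + ByteBuffer
        // UUIDv7 generator, similar to the sketch further down this README.
        return UUID.randomUUID();
    }
}
```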
- **Throughput**
  - Naïve: ~4 725 ops/ms → ≈4.7 million UUIDs per second.
  - Optimized: ~227 174 ops/ms → ≈227 million UUIDs per second.
  - Result: ≈50× higher throughput in the optimized version.
- **Bytes Allocated per UUID**
  - Naïve: ~176 bytes of garbage per call (creates a `byte[16]`, a `ByteBuffer`, and one `UUID`).
  - Optimized: ~32 bytes of garbage per call (only one `UUID` object).
  - Result: ≈82% fewer bytes allocated per call.
- **GC Allocation Rate (MB/s)**
  - Naïve: ~793 MB allocated per second.
  - Optimized: ~6 931 MB allocated per second (because it generates far more UUIDs in the same time).
  - Although the optimized version allocates more in absolute MB/s, per-UUID allocation does not increase: total throughput is massively higher, and GC pauses, while more frequent, remain a small fraction of total runtime.
- **Throughput Gain**: The optimized code leverages the non-blocking `ThreadLocalRandom` and direct 64-bit bitwise assembly (as in the layout sketch near the top of this README), eliminating array and buffer overhead. The result is order-of-magnitude faster UUID generation, making it suitable for high-throughput, low-latency systems.
- **Garbage Generation**: By reducing each call to a single ≈32-byte `UUID` allocation, the optimized approach minimizes per-call garbage, keeping pause times short even when producing hundreds of millions of UUIDs per second. (For contrast, a naïve `SecureRandom` + `ByteBuffer` generator is sketched after this list.)
- **GC Behavior**
  - The naïve version triggers ~54 collections, spending ~35 ms total in GC during the measured period.
  - The optimized version triggers ~268 collections, spending ~215 ms total in GC.
  - In both cases, GC overhead is negligible relative to total execution time, and the optimized version still wins because it produces far more UUIDs in the same wall-clock time.
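
For contrast with the optimized assembly sketched near the top of this README, a naïve baseline looks roughly like the code below (a sketch, not the exact benchmark code). Each call allocates a `byte[16]`, a `ByteBuffer` wrapper, and the final `UUID`, which is where the ~176 B/op figure comes from:

```java
import java.nio.ByteBuffer;
import java.security.SecureRandom;
import java.util.UUID;

// Sketch of a naïve UUIDv7 generator: SecureRandom + ByteBuffer.
final class NaiveUuidV7 {

    private static final SecureRandom RANDOM = new SecureRandom();

    static UUID next() {
        byte[] bytes = new byte[16];              // allocation #1
        RANDOM.nextBytes(bytes);                  // slower, potentially contended randomness

        long timestamp = System.currentTimeMillis();
        bytes[0] = (byte) (timestamp >>> 40);     // write the 48-bit timestamp into the first 6 bytes
        bytes[1] = (byte) (timestamp >>> 32);
        bytes[2] = (byte) (timestamp >>> 24);
        bytes[3] = (byte) (timestamp >>> 16);
        bytes[4] = (byte) (timestamp >>> 8);
        bytes[5] = (byte) timestamp;

        bytes[6] = (byte) ((bytes[6] & 0x0F) | 0x70); // version 7
        bytes[8] = (byte) ((bytes[8] & 0x3F) | 0x80); // variant 10

        ByteBuffer buffer = ByteBuffer.wrap(bytes);   // allocation #2
        return new UUID(buffer.getLong(), buffer.getLong()); // allocation #3
    }
}
```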
This project is licensed under the MIT License. See LICENSE for details.
Contributions, bug reports, and feature requests are always welcome! Please see CONTRIBUTING.md for more details.