ColinLeeo/SIMD_TS2DIFF


SIMD_TS2DIFF

Ts_2Diff is a key encoding algorithm in TsFile, designed for compressing timestamp data. It is based on second-order delta encoding, which reduces storage space and achieves high compression ratios, and it is especially effective for high-frequency, sequential timestamp series. It is a core component of TsFile’s time-series compression and decompression pipeline.
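To make the idea concrete, here is a minimal scalar sketch of a delta-based block encoder. It assumes a simplified scheme: store the first value, take consecutive differences, subtract the block's minimum difference so every residual is non-negative, and record the bit width needed for the largest residual. The names `DeltaBlock`, `encode_block`, and `decode_block` are hypothetical, and the actual TsFile block layout, header fields, and bit-packing are not reproduced here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical in-memory form of one encoded block (not the TsFile on-disk layout).
struct DeltaBlock {
    std::size_t count = 0;            // number of original values
    int64_t first = 0;                // first value, stored verbatim
    int64_t min_delta = 0;            // smallest consecutive difference in the block
    std::vector<uint64_t> residuals;  // (delta - min_delta), always non-negative
    unsigned bit_width = 0;           // bits needed to hold the largest residual
};

// Encode one block. Assumes the differences do not overflow int64_t.
inline DeltaBlock encode_block(const std::vector<int64_t>& values) {
    DeltaBlock block;
    block.count = values.size();
    if (values.empty()) return block;
    block.first = values[0];
    if (values.size() == 1) return block;

    // First-order differences between neighbours.
    std::vector<int64_t> deltas(values.size() - 1);
    for (std::size_t i = 1; i < values.size(); ++i) {
        deltas[i - 1] = values[i] - values[i - 1];
    }
    block.min_delta = *std::min_element(deltas.begin(), deltas.end());

    // Residuals relative to the minimum difference, plus the widest one.
    uint64_t max_residual = 0;
    block.residuals.reserve(deltas.size());
    for (int64_t d : deltas) {
        const uint64_t r = static_cast<uint64_t>(d - block.min_delta);
        block.residuals.push_back(r);
        max_residual = std::max(max_residual, r);
    }
    while (max_residual != 0) {
        ++block.bit_width;
        max_residual >>= 1;
    }
    return block;
}

// Rebuild the original values with a running sum.
inline std::vector<int64_t> decode_block(const DeltaBlock& block) {
    std::vector<int64_t> values;
    if (block.count == 0) return values;
    assert(block.residuals.size() + 1 == block.count);
    values.reserve(block.count);
    int64_t current = block.first;
    values.push_back(current);
    for (uint64_t r : block.residuals) {
        current += block.min_delta + static_cast<int64_t>(r);
        values.push_back(current);
    }
    return values;
}
```

For a nearly regular series such as 1000, 1010, 1020, 1035, 1040, the differences are 10, 10, 15, 5 and the residuals are 5, 5, 10, 0, so each one fits in 4 bits instead of a full machine word. The running sum in `decode_block` is the sequential dependency that a SIMD decoder has to work around.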

As TsFile is increasingly adopted in machine learning workloads (for example, as a backend for data loading during model training), it must support access patterns typical of ML scenarios, such as efficient random access and high-throughput batch loading. Two promising directions for meeting these demands are parallelizing the Ts_2Diff encoding algorithm and enabling predicate filtering without full decoding. Both are part of the TsFile for AI initiative, which aims to build an efficient and intelligent data infrastructure tailored to AI workloads.
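One generic way to filter without decoding everything is to keep a small min/max summary per block and decode only the blocks whose range can contain the target. The sketch below illustrates that idea for an equality predicate. It is an assumption for illustration, not a description of how this repository's SIMD filter works, and `SummarizedBlock`, `make_block`, and `count_equal` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical block: first value plus consecutive differences, with a
// min/max summary computed at encode time. Not the TsFile layout.
struct SummarizedBlock {
    int64_t first;
    std::vector<int64_t> deltas;
    int64_t min_value;
    int64_t max_value;
};

// Build a block from raw values. Precondition: values is non-empty.
inline SummarizedBlock make_block(const std::vector<int64_t>& values) {
    SummarizedBlock b{values.front(), {}, values.front(), values.front()};
    for (std::size_t i = 1; i < values.size(); ++i) {
        b.deltas.push_back(values[i] - values[i - 1]);
        if (values[i] < b.min_value) b.min_value = values[i];
        if (values[i] > b.max_value) b.max_value = values[i];
    }
    return b;
}

struct FilterResult {
    std::size_t matches = 0;         // values equal to the target
    std::size_t blocks_decoded = 0;  // blocks that had to be walked
};

// Count values equal to target, skipping blocks whose range excludes it.
inline FilterResult count_equal(const std::vector<SummarizedBlock>& blocks,
                                int64_t target) {
    FilterResult result;
    for (const SummarizedBlock& b : blocks) {
        if (target < b.min_value || target > b.max_value) {
            continue;  // summary rules this block out; no decoding needed
        }
        ++result.blocks_decoded;
        int64_t current = b.first;
        if (current == target) ++result.matches;
        for (int64_t d : b.deltas) {
            current += d;
            if (current == target) ++result.matches;
        }
    }
    return result;
}
```

With blocks built from {1, 2, 3, 4}, {10, 12, 11, 15}, and {20, 21, 22}, a query for 11 walks only the second block. The benefit shrinks as a predicate becomes less selective, because more blocks overlap the queried range and must be decoded anyway.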

SIMD Benchmark Result

/Users/colin/dev/SIMD_TS2DIFF/cmake-build-debug/src/ts_2diff

==== STABLE   (diff in [1,100])  N=5000000 ====
[Encode] compressed_bytes=4961245 raw_bytes=20000000 ratio(raw/comp)=4.03125
[Scalar] best=6.34846 ms  1.26969 ns/val  787.593 Mvals/s  745.285 MB/s (input)
[SIMD ] best=4.79567 ms  0.959133 ns/val  1042.61 Mvals/s  986.601 MB/s (input)
[Check] equal=true

==== UNSTABLE (diff in [-50,100])  N=5000000 ====
[Encode] compressed_bytes=5581400 raw_bytes=20000000 ratio(raw/comp)=3.58333
[Scalar] best=6.84112 ms  1.36823 ns/val  730.874 Mvals/s  778.065 MB/s (input)
[SIMD ] best=4.75662 ms  0.951325 ns/val  1051.17 Mvals/s  1119.04 MB/s (input)
[Check] equal=true

==== STABLE   random filter  N=5000000 ====
[Encode] compressed_bytes=4961245 raw_bytes=20000000 ratio(raw/comp)=4.03125
[Sample] front=263503179 mid=340546177 back=499418560

[Query 0] type=0 value=263503179 rvalue=0
[Scalar] 11.1274 ms, out=1
[SIMD  ] 0.138708 ms, out=1 equal=true

[Query 1] type=0 value=340546177 rvalue=0
[Scalar] 11.3026 ms, out=1
[SIMD  ] 0.140375 ms, out=1 equal=true

[Query 2] type=0 value=499418560 rvalue=0
[Scalar] 11.3098 ms, out=1
[SIMD  ] 0.139666 ms, out=1 equal=true

[Query 3] type=1 value=340546177 rvalue=0
[Scalar] 12.2432 ms, out=3256048
[SIMD  ] 8.94217 ms, out=3256048 equal=true

[Query 4] type=1 value=499418560 rvalue=0
[Scalar] 11.2122 ms, out=107522
[SIMD  ] 0.428291 ms, out=107522 equal=true

[Query 5] type=5 value=340546177 rvalue=340547177
[Scalar] 11.1447 ms, out=21
[SIMD  ] 4.392 ms, out=21 equal=true

==== UNSTABLE random filter  N=5000000 ====
[Encode] compressed_bytes=5581400 raw_bytes=20000000 ratio(raw/comp)=3.58333
[Sample] front=130454756 mid=168634590 back=247288488

[Query 0] type=0 value=130454756 rvalue=0
[Scalar] 11.0137 ms, out=1
[SIMD  ] 0.162333 ms, out=1 equal=true

[Query 1] type=0 value=168634590 rvalue=0
[Scalar] 11.0585 ms, out=1
[SIMD  ] 0.191792 ms, out=1 equal=true

[Query 2] type=0 value=247288488 rvalue=0
[Scalar] 11.0591 ms, out=1
[SIMD  ] 0.1975 ms, out=1 equal=true

[Query 3] type=1 value=168634590 rvalue=0
[Scalar] 11.5686 ms, out=3256049
[SIMD  ] 9.09975 ms, out=3256049 equal=true

[Query 4] type=1 value=247288488 rvalue=0
[Scalar] 10.985 ms, out=107523
[SIMD  ] 0.498416 ms, out=107523 equal=true

[Query 5] type=5 value=168634590 rvalue=168635590
[Scalar] 11.3651 ms, out=43
[SIMD  ] 4.44271 ms, out=43 equal=true

[STABLE SUMMARY]
  Encoded: 4961245 bytes, Raw: 20000000 bytes, Ratio(raw/comp): 4.03125
  Scalar: 6.34846 ms, 1.26969 ns/val, 787.593 Mvals/s, 745.285 MB/s (input)
  SIMD  : 4.79567 ms, 0.959133 ns/val, 1042.61 Mvals/s, 986.601 MB/s (input)
  Equal : true

[UNSTABLE SUMMARY]
  Encoded: 5581400 bytes, Raw: 20000000 bytes, Ratio(raw/comp): 3.58333
  Scalar: 6.84112 ms, 1.36823 ns/val, 730.874 Mvals/s, 778.065 MB/s (input)
  SIMD  : 4.75662 ms, 0.951325 ns/val, 1051.17 Mvals/s, 1119.04 MB/s (input)
  Equal : true

Process finished with exit code 1
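
The derived columns in the decode lines can be reproduced from the raw timings. The sketch below shows the arithmetic. It assumes, as an inference from the numbers rather than something stated in this README, that "MB/s (input)" is the compressed byte count divided by 2^20 per second. The helper names are hypothetical.

```cpp
#include <cstdint>

// Nanoseconds spent per decoded value.
inline double ns_per_value(double best_ms, std::uint64_t n) {
    return best_ms * 1e6 / static_cast<double>(n);
}

// Millions of values decoded per second.
inline double mvals_per_second(double best_ms, std::uint64_t n) {
    return static_cast<double>(n) / (best_ms * 1e-3) / 1e6;
}

// Input throughput, assuming "MB" means 2^20 bytes of compressed input.
inline double input_mib_per_second(double best_ms, std::uint64_t compressed_bytes) {
    return static_cast<double>(compressed_bytes) / (best_ms * 1e-3) / 1048576.0;
}

// Ratio of scalar time to SIMD time for the same workload.
inline double speedup(double scalar_ms, double simd_ms) {
    return scalar_ms / simd_ms;
}
```

Applied to the STABLE decode run (6.34846 ms scalar, 4.79567 ms SIMD, 5000000 values, 4961245 compressed bytes), these give about 1.27 ns/val, 787.6 Mvals/s, and 745.3 MB/s for the scalar path, and a SIMD speedup of roughly 1.32x.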
