A high-performance statistical calculations library implemented in Rust with bindings for Python, Java, and Go.
Author: Steven Zhou @ March, 2025
- Moving average calculation
- Maximum value
- Minimum value
- Standard deviation
- Percentile calculation
- Outlier detection
- Forecasting
- Arithmetic operations (abs, log2, log10)
- Rate calculations
- Exponential smoothing (EWMA)
- Time shifting and alignment
- Interpolation
- Exclusion operations
- Ranking operations
- Regression analysis
- Rollup operations
- Rust (latest stable version)
- Cargo
- Navigate to the library directory:
cd stats_lib
- Build the library:
cargo build
- Run the tests:
cargo test -- --nocapture
Expected test output should show all tests passing:
running 14 tests
test tests::test_max ... ok
test tests::test_moving_average ... ok
test tests::test_min ... ok
test tests::test_percentile ... ok
test timeseries::tests::test_abs ... ok
test timeseries::tests::test_align_timestamp ... ok
test tests::test_stddev ... ok
test timeseries::tests::test_detect_outliers ... ok
test timeseries::tests::test_ewma ... ok
test timeseries::tests::test_invalid_inputs ... ok
test timeseries::tests::test_forecast ... ok
test timeseries::tests::test_rate ... ok
test timeseries::tests::test_log2 ... ok
test timeseries::tests::test_timeshift ... ok
test result: ok. 14 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
- Run specific tests:
# Run tests with a specific name pattern
cargo test test_moving_average
# Run tests in a specific module
cargo test timeseries::tests::
# Run tests with output
cargo test -- --nocapture
- Build for release:
cargo build --release
The release build will create optimized library files:
- macOS:
target/release/libstats_lib.dylib
- Linux:
target/release/libstats_lib.so
- Windows:
target/release/stats_lib.dll
- Navigate to the Rust example directory:
cd examples/rust_example
- Run the example:
cargo run
Expected output:
Basic Statistics Example:
------------------------
Dataset: [1.0, 2.0, 3.0, 4.0, 5.0]
Maximum: 5.00
Minimum: 1.00
Standard Deviation: 1.41
Moving Average (window size = 3): [2.0, 3.0, 4.0]
Time Series Analysis Example:
---------------------------
Outliers detected: [TimeSeriesPoint { timestamp: 3, value: 10.0 }]
Absolute values: [TimeSeriesPoint { timestamp: 0, value: 1.0 }, TimeSeriesPoint { timestamp: 1, value: 2.0 }, TimeSeriesPoint { timestamp: 2, value: 3.0 }, TimeSeriesPoint { timestamp: 3, value: 10.0 }, TimeSeriesPoint { timestamp: 4, value: 4.0 }, TimeSeriesPoint { timestamp: 5, value: 5.0 }]
Rates of change: [TimeSeriesPoint { timestamp: 1, value: 1.0 }, TimeSeriesPoint { timestamp: 2, value: 1.0 }, TimeSeriesPoint { timestamp: 3, value: 7.0 }, TimeSeriesPoint { timestamp: 4, value: -6.0 }, TimeSeriesPoint { timestamp: 5, value: 1.0 }]
EWMA smoothed values: [TimeSeriesPoint { timestamp: 0, value: 1.0 }, TimeSeriesPoint { timestamp: 1, value: 1.2999999999999998 }, TimeSeriesPoint { timestamp: 2, value: 1.8099999999999996 }, TimeSeriesPoint { timestamp: 3, value: 4.266999999999999 }, TimeSeriesPoint { timestamp: 4, value: 4.1869 }, TimeSeriesPoint { timestamp: 5, value: 4.430829999999999 }]
Error generating forecast: InvalidInput("Need at least two complete seasons of data")
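The rate and EWMA figures in the output above can be reproduced with a few lines of Rust. The following is a minimal sketch, not the library's own code: the struct only mirrors the `TimeSeriesPoint` shape printed above, the function names `rate` and `ewma` are hypothetical, a smoothing factor of 0.3 is inferred from the printed values, and dividing by the time delta in `rate` is an assumption (with unit spacing it gives the same numbers as a plain difference).

```rust
/// A (timestamp, value) pair mirroring the `TimeSeriesPoint` shape printed above.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct TimeSeriesPoint {
    pub timestamp: i64,
    pub value: f64,
}

/// Rate of change between consecutive points: (v[i] - v[i-1]) / (t[i] - t[i-1]).
/// Pairs with a zero time delta are skipped (an assumption of this sketch).
pub fn rate(points: &[TimeSeriesPoint]) -> Vec<TimeSeriesPoint> {
    points
        .windows(2)
        .filter(|w| w[1].timestamp != w[0].timestamp)
        .map(|w| TimeSeriesPoint {
            timestamp: w[1].timestamp,
            value: (w[1].value - w[0].value) / (w[1].timestamp - w[0].timestamp) as f64,
        })
        .collect()
}

/// EWMA: s[0] = x[0], then s[i] = alpha * x[i] + (1 - alpha) * s[i-1].
pub fn ewma(points: &[TimeSeriesPoint], alpha: f64) -> Vec<TimeSeriesPoint> {
    let mut out = Vec::with_capacity(points.len());
    let mut prev: Option<f64> = None;
    for p in points {
        let smoothed = match prev {
            None => p.value,
            Some(s_prev) => alpha * p.value + (1.0 - alpha) * s_prev,
        };
        out.push(TimeSeriesPoint { timestamp: p.timestamp, value: smoothed });
        prev = Some(smoothed);
    }
    out
}
```

Fed the values 1, 2, 3, 10, 4, 5 at timestamps 0 through 5, `rate` yields 1, 1, 7, -6, 1 and `ewma` with `alpha = 0.3` yields 1.0, 1.3, 1.81, 4.267, 4.1869, 4.43083 up to floating-point rounding, matching the lines above.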
- Run the Rust example tests:
cd examples/rust_example
cargo test
Expected test output:
running 4 tests
test tests::test_moving_average ... ok
test tests::test_basic_stats ... ok
test tests::test_invalid_window ... ok
test tests::test_time_series ... ok
test result: ok. 4 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
- Navigate to the Python examples directory:
cd examples/python
- Ensure you have the required Python packages:
pip install numpy pytest
- Build the Rust library in release mode:
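# From the stats_lib directory (Apple Silicon target shown; elsewhere, plain `cargo build --release` is enough)
# Note: with an explicit target, Cargo writes to target/aarch64-apple-darwin/release/ rather than target/release/
CARGO_BUILD_TARGET=aarch64-apple-darwin cargo build --release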
- Run the Python tests:
# Run all tests with detailed output
python3 -m pytest test_stats_lib.py -v
# Run a specific test
python3 -m pytest test_stats_lib.py -v -k "test_moving_average"
# Run the simple test script
python3 test_stats.py
- Run the example script:
python3 example.py
Expected test output should show all tests passing:
test_stats_lib.py::TestBasicStats::test_max PASSED
test_stats_lib.py::TestBasicStats::test_min PASSED
test_stats_lib.py::TestBasicStats::test_moving_average PASSED
test_stats_lib.py::TestBasicStats::test_stddev PASSED
test_stats_lib.py::TestTimeSeries::test_creation PASSED
test_stats_lib.py::TestTimeSeries::test_properties PASSED
If you see an error about not finding the library:
- Ensure you've built the Rust library in release mode
- Check that the library file exists in ../../target/release/
- Verify the library architecture matches your system
Common Python-specific issues:
- Missing NumPy: Install with pip install numpy
- Missing pytest: Install with pip install pytest
- Library load errors: Check the library path in stats_lib.py
- First, build the Rust library in release mode:
# From the stats_lib directory
cargo build --release
- Make sure the library is built correctly:
# Check if the library exists
ls -l target/release/libstats_lib.dylib
- Navigate to the Go examples directory:
cd examples/go
- Set the library path environment variables:
# For macOS
export DYLD_LIBRARY_PATH="$(pwd)/../../target/release:$DYLD_LIBRARY_PATH"
export LIBRARY_PATH="$(pwd)/../../target/release:$LIBRARY_PATH"
- Run the example program:
# Run with CGO enabled for Apple Silicon
GOARCH=arm64 CGO_ENABLED=1 go run cmd/example/main.go
If you encounter errors about undefined symbols or missing functions, you may need to check that:
- The Rust library is built correctly for your architecture
- The library path is set correctly
- The Go code is using the correct function signatures
For troubleshooting:
# Check the library architecture
file ../../target/release/libstats_lib.dylib
# Check the library symbols
nm -g ../../target/release/libstats_lib.dylib
- Run the Go tests:
# Navigate to the stats package directory
cd pkg/stats
# Run tests with verbose output
GOARCH=arm64 CGO_ENABLED=1 go test -v
Expected test output:
=== RUN TestNewTimeSeries
=== RUN TestNewTimeSeries/valid_series
=== RUN TestNewTimeSeries/mismatched_lengths
--- PASS: TestNewTimeSeries
=== RUN TestMovingAverage
=== RUN TestMovingAverage/valid_window
=== RUN TestMovingAverage/window_too_large
=== RUN TestMovingAverage/window_zero
--- PASS: TestMovingAverage
=== RUN TestMax
=== RUN TestMax/valid_data
=== RUN TestMax/empty_data
--- PASS: TestMax
=== RUN TestMin
=== RUN TestMin/valid_data
=== RUN TestMin/empty_data
--- PASS: TestMin
=== RUN TestStdDev
=== RUN TestStdDev/valid_data
=== RUN TestStdDev/single_point
=== RUN TestStdDev/empty_data
--- PASS: TestStdDev
PASS
Common Go-specific issues:
- CGO not enabled: Make sure to set CGO_ENABLED=1
- Architecture mismatch: Set GOARCH=arm64 for Apple Silicon
- Library not found: Ensure the Rust library is built in release mode
- Linker warnings about LC_DYSYMTAB: These can be safely ignored
If tests fail:
- Check that the Rust library is built correctly
- Verify Go environment variables are set properly
- Ensure you're in the correct directory for running tests
- The Rust library handles memory allocation and deallocation
- Python, Java, and Go bindings properly manage memory through their respective FFI mechanisms
- Memory leaks are prevented by proper cleanup in each language binding
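One common way to realize "Rust allocates, Rust deallocates" across an FFI boundary is to pair every function that returns a buffer with a matching free function. The sketch below illustrates that pattern under stated assumptions: the names `stats_moving_average` and `stats_free_f64` and their signatures are hypothetical and are not taken from this library's `src/ffi.rs`.

```rust
use std::ptr;
use std::slice;

/// Hypothetical export: computes a moving average and hands ownership of the
/// result buffer to the caller. Returns null on invalid input.
/// In a real cdylib this would also carry a no-mangle attribute so the symbol
/// name is stable for the foreign caller.
pub unsafe extern "C" fn stats_moving_average(
    data: *const f64,
    len: usize,
    window: usize,
    out_len: *mut usize,
) -> *mut f64 {
    if data.is_null() || out_len.is_null() || window == 0 || window > len {
        return ptr::null_mut();
    }
    // Borrow the caller's buffer for the duration of the call only.
    let input = unsafe { slice::from_raw_parts(data, len) };
    let result: Box<[f64]> = input
        .windows(window)
        .map(|w| w.iter().sum::<f64>() / window as f64)
        .collect();
    unsafe { *out_len = result.len() };
    // Leak the box; the caller must return it through `stats_free_f64`.
    Box::into_raw(result) as *mut f64
}

/// Hypothetical export: releases a buffer previously returned by
/// `stats_moving_average`. `len` must be the length reported in `out_len`.
pub unsafe extern "C" fn stats_free_f64(buf: *mut f64, len: usize) {
    if !buf.is_null() {
        // Rebuild the boxed slice so Rust's allocator frees it.
        drop(unsafe { Box::from_raw(ptr::slice_from_raw_parts_mut(buf, len)) });
    }
}
```

The design choice here is that the foreign side never calls its own `free` on Rust memory; a Python, Java, or Go wrapper would copy the values out and then call the free function, typically in a `finally` block, finalizer, or `defer`.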
- The core statistical functions are thread-safe
- Each language binding handles concurrent access appropriately
- No global mutable state is maintained
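To illustrate why functions without global mutable state are safe to call concurrently, here is a small sketch using scoped threads from the standard library. The `max` function is a stand-in written for this example, not the library's implementation, and `max_per_series` is a hypothetical helper.

```rust
use std::thread;

/// Pure function: reads its input and touches no shared or global state.
pub fn max(data: &[f64]) -> Option<f64> {
    data.iter().copied().fold(None, |acc, x| match acc {
        None => Some(x),
        Some(m) => Some(if x > m { x } else { m }),
    })
}

/// Computes the maximum of each series on its own thread.
/// The inputs are shared by reference; no locks are needed because
/// nothing is mutated.
pub fn max_per_series(series: &[Vec<f64>]) -> Vec<Option<f64>> {
    thread::scope(|s| {
        let handles: Vec<_> = series
            .iter()
            .map(|v| s.spawn(move || max(v)))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("worker thread panicked"))
            .collect::<Vec<Option<f64>>>()
    })
}
```

Because each call only reads its own slice, the compiler accepts sharing the data across threads without any synchronization primitives.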
- FFI calls have overhead; batch operations when possible
- Large datasets should be processed in chunks
- Consider using the native Rust interface for performance-critical applications
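The advice to process large datasets in chunks can be sketched as a running summary that is updated once per chunk, so a binding crosses the FFI boundary once per chunk rather than once per value. The `Summary` type and `summarize_in_chunks` function below are hypothetical and exist only for this illustration.

```rust
/// Running summary that can be fed one chunk at a time.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Summary {
    pub count: usize,
    pub min: f64,
    pub max: f64,
    pub sum: f64,
}

impl Summary {
    pub fn new() -> Self {
        Summary { count: 0, min: f64::INFINITY, max: f64::NEG_INFINITY, sum: 0.0 }
    }

    /// Folds one chunk into the summary: one call per chunk, not per value.
    pub fn update(&mut self, chunk: &[f64]) {
        for &x in chunk {
            self.min = self.min.min(x);
            self.max = self.max.max(x);
            self.sum += x;
        }
        self.count += chunk.len();
    }

    /// Mean of everything seen so far, or None if nothing was fed in.
    pub fn mean(&self) -> Option<f64> {
        if self.count == 0 {
            None
        } else {
            Some(self.sum / self.count as f64)
        }
    }
}

/// Walks `data` in fixed-size chunks (a chunk size of 0 is treated as 1).
pub fn summarize_in_chunks(data: &[f64], chunk_size: usize) -> Summary {
    let mut summary = Summary::new();
    for chunk in data.chunks(chunk_size.max(1)) {
        summary.update(chunk);
    }
    summary
}
```

The same shape works when the chunks arrive from a file or a network stream instead of an in-memory slice, since only the small `Summary` value is kept between calls.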
[Your License Here]
[Contributing Guidelines]
If you're using an Apple Silicon (M1/M2) Mac, follow these steps:
- Build the library for ARM64:
# Check your architecture
uname -m # Should show 'arm64'
# Clean and rebuild
cd ../..
cargo clean
CARGO_BUILD_TARGET=aarch64-apple-darwin cargo build --release
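# Verify the library architecture (should show 'arm64')
# With an explicit target, Cargo writes to target/aarch64-apple-darwin/release/;
# on a native arm64 toolchain, plain `cargo build --release` writes to target/release/
file target/aarch64-apple-darwin/release/libstats_lib.dylib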
- Set the correct library path and architecture:
# Set both library paths
export DYLD_LIBRARY_PATH="../../target/release:$DYLD_LIBRARY_PATH"
export DYLD_FALLBACK_LIBRARY_PATH="../../target/release:$DYLD_FALLBACK_LIBRARY_PATH"
# Run with architecture-specific options
java -cp .:jna-5.12.1.jar \
-Djna.library.path="$(pwd)/../../target/release" \
-Djna.platform.library.path="$(pwd)/../../target/release" \
-Djna.debug_load=true \
TestStats
Common Apple Silicon issues:
- If you see darwin-x86-64 in error messages, JNA is trying to load an x86 library
- If you see aarch64 or arm64 in messages, it's correctly detecting Apple Silicon
- Use otool -L libstats_lib.dylib to verify library dependencies
# Run all tests
cargo test
# Run tests with output
cargo test -- --nocapture
# Run specific test categories
cargo test test_basic # Run basic statistical tests
cargo test test_timeseries # Run time series tests
cargo test test_invalid # Run invalid input tests
# Run tests with coverage (requires cargo-tarpaulin)
cargo install cargo-tarpaulin
cargo tarpaulin
# Run benchmarks (requires nightly Rust)
cargo bench
Basic Statistical Tests
- Moving average calculation
- Maximum/minimum value detection
- Standard deviation computation
- Percentile calculation
Time Series Tests
- Outlier detection
- Forecasting
- Arithmetic operations
- Rate calculations
- EWMA smoothing
- Time shifting
- Timestamp alignment
Invalid Input Tests
- Empty series handling
- Invalid parameter validation
- Error message verification
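As an illustration of what the invalid-input category exercises, the sketch below validates its arguments before computing a moving average and reports problems through a `Result`. The `InvalidInput` variant name follows the error printed by the Rust example; the enum name `StatsError`, the function signature, and the message strings are assumptions made for this sketch.

```rust
use std::fmt;

/// Hypothetical error type; only the `InvalidInput` variant name is taken
/// from the example output in this document.
#[derive(Debug, Clone, PartialEq)]
pub enum StatsError {
    InvalidInput(String),
}

impl fmt::Display for StatsError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            StatsError::InvalidInput(msg) => write!(f, "invalid input: {}", msg),
        }
    }
}

impl std::error::Error for StatsError {}

/// Simple moving average over a sliding window, with input validation.
pub fn moving_average(data: &[f64], window: usize) -> Result<Vec<f64>, StatsError> {
    if data.is_empty() {
        return Err(StatsError::InvalidInput("series is empty".to_string()));
    }
    if window == 0 || window > data.len() {
        return Err(StatsError::InvalidInput(format!(
            "window must be between 1 and {}, got {}",
            data.len(),
            window
        )));
    }
    Ok(data
        .windows(window)
        .map(|w| w.iter().sum::<f64>() / window as f64)
        .collect())
}
```

A test in this category would then assert on the error variant and message for an empty series, a zero window, and a window larger than the series.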
When running cargo test -- --nocapture, you should see output similar to:
running 14 tests
test tests::test_moving_average ... ok
test tests::test_max ... ok
test tests::test_min ... ok
test tests::test_stddev ... ok
test tests::test_percentile ... ok
test timeseries::tests::test_detect_outliers ... ok
test timeseries::tests::test_forecast ... ok
test timeseries::tests::test_abs ... ok
test timeseries::tests::test_log2 ... ok
test timeseries::tests::test_rate ... ok
test timeseries::tests::test_ewma ... ok
test timeseries::tests::test_timeshift ... ok
test timeseries::tests::test_align_timestamp ... ok
test timeseries::tests::test_invalid_inputs ... ok
test result: ok. 14 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
To check test coverage:
cargo tarpaulin --out Html
# Writes tarpaulin-report.html; open it in your browser
Expected coverage metrics:
- Basic Statistics: >95% coverage
- Time Series Operations: >90% coverage
- Error Handling: >85% coverage
# Run all benchmarks
cargo bench
# Run specific benchmark
cargo bench --bench timeseries_benchmarks
Python Requirements:
- Python 3.6 or later
- NumPy library
python3 --version # Should be 3.6 or higher
pip3 install numpy
Build the Rust Library:
# From the stats_lib directory
cargo build --release
stats_lib/
├── src/ # Rust source code
├── target/
│ └── release/ # Contains compiled library
│ └── libstats_lib.dylib # macOS
│ # or libstats_lib.so # Linux
│ # or stats_lib.dll # Windows
└── examples/
└── python/
├── stats_lib.py # Python wrapper
├── test_stats_lib.py # Test suite
└── example.py # Usage example
Set up the library path:
# For macOS:
export DYLD_LIBRARY_PATH="$(pwd)/target/release:$DYLD_LIBRARY_PATH"
# For Linux:
export LD_LIBRARY_PATH="$(pwd)/target/release:$LD_LIBRARY_PATH"
# For Windows, add the directory to PATH
# set PATH=%PATH%;%CD%\target\release
Run the example program:
cd examples/python
python3 example.py
Expected output:
Statistical Calculations Library Example
======================================
Basic Statistics:
----------------
Moving average (window=3): [2. 3. 4. 5. 6. 7. 8. 9.]
Maximum value: 10.0
Minimum value: 1.0
Standard deviation: 3.0276503540974917
Median (50th percentile): 6.0
Time Series Analysis:
-------------------
Time Series Data:
t=0: 0.000
t=1: 0.588
t=2: 0.951
t=3: 0.951
t=4: 0.588
t=5: 0.000
t=6: -0.588
t=7: -0.951
t=8: -0.951
t=9: -0.588
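The standard deviation printed above (3.0276...) is what the sample formula (divisor n - 1) gives for the values 1 through 10, assuming that is the dataset, as the moving-average line suggests. The 1.41 in the Rust example output for [1.0, 2.0, 3.0, 4.0, 5.0] instead matches the population formula (divisor n). Which divisor each binding actually uses is not stated in this document; the sketch below, with hypothetical function names, simply shows both so the figures can be checked.

```rust
/// Sum of squared deviations from the mean. Callers ensure `data` is non-empty.
fn sum_sq_dev(data: &[f64]) -> f64 {
    let mean = data.iter().sum::<f64>() / data.len() as f64;
    data.iter().map(|x| (x - mean).powi(2)).sum()
}

/// Population standard deviation (divisor n). None for an empty slice.
pub fn stddev_population(data: &[f64]) -> Option<f64> {
    if data.is_empty() {
        return None;
    }
    Some((sum_sq_dev(data) / data.len() as f64).sqrt())
}

/// Sample standard deviation (divisor n - 1). None for fewer than two points.
pub fn stddev_sample(data: &[f64]) -> Option<f64> {
    if data.len() < 2 {
        return None;
    }
    Some((sum_sq_dev(data) / (data.len() - 1) as f64).sqrt())
}
```

For 1 through 10 the sum of squared deviations is 82.5, so the sample value is sqrt(82.5 / 9) ≈ 3.0277; for 1 through 5 it is 10, so the population value is sqrt(10 / 5) ≈ 1.4142.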
Run all tests with detailed output:
cd examples/python
python3 -m unittest test_stats_lib.py -v
Run specific test classes:
# Run only basic stats tests
python3 -m unittest test_stats_lib.TestBasicStats -v
# Run only time series tests
python3 -m unittest test_stats_lib.TestTimeSeries -v
Run individual test methods:
# Run specific test method
python3 -m unittest test_stats_lib.TestBasicStats.test_stddev -v
Expected test output:
test_max (test_stats_lib.TestBasicStats)
Test maximum value calculation. ... ok
test_min (test_stats_lib.TestBasicStats)
Test minimum value calculation. ... ok
test_moving_average (test_stats_lib.TestBasicStats)
Test moving average calculation. ... ok
test_percentile (test_stats_lib.TestBasicStats)
Test percentile calculation. ... ok
test_stddev (test_stats_lib.TestBasicStats)
Test standard deviation calculation. ... ok
test_creation (test_stats_lib.TestTimeSeries)
Test time series creation. ... ok
test_properties (test_stats_lib.TestTimeSeries)
Test time series properties. ... ok
----------------------------------------------------------------------
Ran 7 tests in 0.004s
OK
The test suite covers:
Basic Statistical Functions:
- Moving Average
  - Regular calculation with window size 3
  - Input validation for window sizes
- Maximum/Minimum Values
  - Basic number sequences
  - Negative numbers
  - NumPy array support
- Standard Deviation
  - Simple sequences with known stddev
  - Complex sequences
  - Input validation (minimum 2 points)
- Percentile Calculation
  - Median (50th percentile)
  - Min/Max (0th/100th percentiles)
  - Quartiles (25th/75th percentiles)
  - Input validation for percentile range
Time Series Functionality:
- Creation and Validation
  - Basic initialization
  - Length validation
  - Error handling for mismatched lengths
- Property Access
  - Timestamp array access
  - Value array access
  - Type verification (NumPy arrays)
  - Shape consistency
Library Not Found:
# Verify library exists
ls -l target/release/libstats_lib*
# Check library dependencies
# macOS:
otool -L target/release/libstats_lib.dylib
# Linux:
ldd target/release/libstats_lib.so
Python Import Errors:
- Ensure NumPy is installed: pip3 list | grep numpy
- Verify Python version: python3 --version
- Check library path is set correctly
Test Failures:
- Run tests with increased verbosity: python3 -m unittest -v test_stats_lib.py
- Check library is built in release mode
- Verify library path environment variables
Common Issues:
- "Library not found" - Check library path and build status
- "ImportError" - Verify NumPy installation and Python version
- "TypeError" - Ensure correct data types in function calls
- "ValueError" - Check input validation requirements
- Go 1.16 or later
stats_lib/examples/go/
├── cmd/
│ └── example/
│ ├── main.go # Example usage program with full API implementation
│ └── main_test.go # Comprehensive test suite
└── pkg/
└── stats/
├── stats.go # Go wrapper for Rust library (incomplete)
└── stats_test.go # Test suite
The Go example now implements all the API functions defined in the Rust library in pure Go, so it provides the same functionality without requiring FFI bindings.
To run the example program:
cd stats_lib/examples/go/cmd/example
go run main.go
Expected output:
Basic Statistical Calculations:
Data: [1 2 3 4 5 6 7 8 9 10]
Moving Average (window=3): [2 3 4 5 6 7 8 9]
Maximum value: 10.00
Minimum value: 1.00
Standard deviation: 3.03
Time Series Analysis:
Original Time Series:
t=1: 1.000
t=2: 2.000
t=3: 3.000
t=4: 4.000
t=5: 5.000
...
Outlier Detection:
Found 1 outliers
t=3: 10.000
Absolute Values:
t=1: 1.000
t=2: 2.000
t=3: 3.000
t=4: 4.000
t=5: 5.000
Log2 Values:
t=1: 0.000
t=2: 1.000
t=3: 1.585
t=4: 2.000
t=5: 2.322
...
Rate Values:
t=2: 1.000
t=3: 1.000
t=4: 1.000
t=5: 1.000
...
EWMA Values (alpha=0.3):
t=1: 1.000
t=2: 1.300
t=3: 1.810
t=4: 2.467
t=5: 3.127
...
Timeshifted Values (offset=3600):
t=3601: 1.000
t=3602: 2.000
t=3603: 3.000
t=3604: 4.000
t=3605: 5.000
...
Aligned Timestamps (interval=2):
t=0: 1.000
t=2: 2.500
t=4: 4.500
t=6: 6.500
t=8: 8.500
...
Forecast Values (horizon=12):
t=100: 0.975
t=101: 1.070
t=102: 1.096
t=103: 1.050
t=104: 0.935
...
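The timeshift and alignment figures above are consistent with two simple transformations: adding a fixed offset to every timestamp, and snapping timestamps down to a multiple of the interval while averaging the values that share a bucket. The Rust sketch below shows both under those assumptions; the `Point` struct and the function names are hypothetical, and bucket averaging is inferred from the printed values rather than confirmed by the source.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Point {
    pub timestamp: i64,
    pub value: f64,
}

/// Shifts every timestamp by `offset`, leaving values untouched.
pub fn timeshift(points: &[Point], offset: i64) -> Vec<Point> {
    points
        .iter()
        .map(|p| Point { timestamp: p.timestamp + offset, value: p.value })
        .collect()
}

/// Snaps each timestamp down to a multiple of `interval` and averages the
/// values that land in the same bucket. Returns an empty vector when
/// `interval <= 0` (a choice made for this sketch).
pub fn align(points: &[Point], interval: i64) -> Vec<Point> {
    if interval <= 0 {
        return Vec::new();
    }
    // BTreeMap keeps buckets sorted by timestamp.
    let mut buckets: BTreeMap<i64, (f64, usize)> = BTreeMap::new();
    for p in points {
        let key = p.timestamp.div_euclid(interval) * interval;
        let entry = buckets.entry(key).or_insert((0.0, 0));
        entry.0 += p.value;
        entry.1 += 1;
    }
    buckets
        .into_iter()
        .map(|(timestamp, (sum, n))| Point { timestamp, value: sum / n as f64 })
        .collect()
}
```

With points t=1..10 carrying values 1..10, `timeshift` with an offset of 3600 starts at t=3601, and `align` with an interval of 2 produces t=0: 1.0, t=2: 2.5, t=4: 4.5, t=6: 6.5, t=8: 8.5, in line with the output above.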
The Go example includes comprehensive tests for all implemented functions. To run the tests:
cd stats_lib/examples/go/cmd/example
go test -v
Expected test output:
=== RUN TestNewTimeSeries
=== RUN TestNewTimeSeries/valid_series
=== RUN TestNewTimeSeries/mismatched_lengths
--- PASS: TestNewTimeSeries (0.00s)
=== RUN TestMovingAverage
=== RUN TestMovingAverage/valid_window
=== RUN TestMovingAverage/window_too_large
=== RUN TestMovingAverage/window_zero
--- PASS: TestMovingAverage (0.00s)
=== RUN TestMax
=== RUN TestMax/valid_data
=== RUN TestMax/negative_values
=== RUN TestMax/empty_data
--- PASS: TestMax (0.00s)
...
PASS
ok stats_lib/examples/go/cmd/example 0.336s
To run tests with coverage:
cd stats_lib/examples/go/cmd/example
go test -cover
The Go implementation includes all the API functions defined in the Rust library:
Basic Statistics:
- Moving Average
- Maximum/Minimum Values
- Standard Deviation
Time Series Analysis:
- Outlier Detection
- Forecasting
- Absolute Values
- Log2 Transformation
- Rate Calculation
- Exponentially Weighted Moving Average (EWMA)
- Timeshift
- Timestamp Alignment
The original Rust FFI layer for Go bindings is currently incomplete. The new implementation provides all the functionality in pure Go, making it easier to use and extend.
If you want to use the Rust FFI bindings in the future:
- Complete the implementation of the functions in src/ffi.rs
- Update the Go code in pkg/stats/stats.go to use the implemented functions
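For orientation, a completed FFI function on the Rust side might look roughly like the sketch below: a C-ABI function that takes a pointer and length, writes its result through an out-parameter, and returns a status code the Go wrapper can turn into an `error`. The name `stats_max`, the status constants, and the signature are all hypothetical; the real contents of src/ffi.rs are not shown in this document.

```rust
use std::slice;

/// Hypothetical status codes returned across the C ABI.
pub const STATS_OK: i32 = 0;
pub const STATS_INVALID_INPUT: i32 = 1;

/// Hypothetical export: writes the maximum of `data[0..len]` to `out`.
/// Returns STATS_INVALID_INPUT for null pointers or an empty series.
/// In a real cdylib this would also carry a no-mangle attribute so that
/// cgo can find the symbol by name.
pub unsafe extern "C" fn stats_max(data: *const f64, len: usize, out: *mut f64) -> i32 {
    if data.is_null() || out.is_null() || len == 0 {
        return STATS_INVALID_INPUT;
    }
    // Borrow the caller's buffer for the duration of the call only.
    let values = unsafe { slice::from_raw_parts(data, len) };
    let max = values.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    unsafe { *out = max };
    STATS_OK
}
```

Returning a plain integer status keeps the boundary simple: no Rust types cross into Go, and nothing allocated in Rust needs to be freed by the caller.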
If you encounter any issues with the Go example:
- Make sure you're using Go 1.16 or later
- Check that you're running the commands from the correct directory
- Verify that the math package is available
To run the Rust example:
cd examples/rust_example
cargo run
To run the tests:
cargo test
- Java Requirements:
- Java 11 or later
java --version # Should be 11 or higher
stats_lib/
├── src/ # Rust source code
├── target/
│ └── release/ # Contains compiled library
│ └── libstats_lib.dylib # macOS
│ # or libstats_lib.so # Linux
│ # or stats_lib.dll # Windows
└── examples/
└── java/
├── src/
│ ├── main/java/com/statslib/
│ │ ├── Example.java # Usage example
│ │ └── MockStatsLib.java # Java implementation
│ └── test/java/com/statslib/
│ ├── StatsLibTest.java # Test suite
│ └── TestRunner.java # Test runner
├── build.sh # Build script
└── lib/ # Dependencies
Navigate to the Java example directory:
cd examples/java
Run the build script:
./build.sh
Expected output:
Statistical Calculations Library Example
======================================
Basic Statistics:
----------------
Moving average (window=3): [2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
Maximum value: 10.00
Minimum value: 1.00
Standard deviation: 3.03
Time Series Analysis:
-------------------
Original Time Series:
t=0: 1.00
t=1: 2.00
t=2: 3.00
t=3: 10.00
t=4: 4.00
t=5: 5.00
t=6: 6.00
t=7: 7.00
t=8: 8.00
t=9: 9.00
Outliers (threshold=2.0):
t=3: 10.00
The build script also runs the tests automatically. You can run them separately with:
cd examples/java
javac -d target/classes src/main/java/com/statslib/*.java
javac -d target/test-classes -cp "target/classes:lib/*" src/test/java/com/statslib/*.java
java -cp "target/classes:target/test-classes:lib/*" com.statslib.TestRunner
Expected test output:
Running tests...
Test: Moving Average
Test: Moving Average Invalid Window
Test: Max
Test: Max Negative
Test: Max Empty
Test: Min
Test: Min Negative
Test: StdDev
Test: StdDev Insufficient Data
Test: Detect Outliers
All tests passed!
The Java example includes a pure Java implementation of the statistical functions:
Basic Statistical Functions:
- Moving Average
  - Regular calculation with window size 3
  - Input validation for window sizes
- Maximum/Minimum Values
  - Basic number sequences
  - Negative numbers
  - Empty array validation
- Standard Deviation
  - Simple sequences with known stddev
  - Complex sequences
  - Input validation (minimum 2 points)
Time Series Functionality:
- Outlier Detection
  - Z-score based outlier detection
  - Threshold configuration
  - Timestamp and value pairing
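A z-score detector of the kind listed above can be sketched in a few lines. The version below is written in Rust, the library's core language, and makes several assumptions: it scores each point against the mean and population standard deviation of the whole series, and the `Point` struct and `detect_outliers` name are hypothetical. It is not meant to reproduce the exact Java output shown earlier, since how the Java implementation computes its scores is not shown here.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Point {
    pub timestamp: i64,
    pub value: f64,
}

/// Returns the points whose absolute z-score exceeds `threshold`.
/// Uses the population standard deviation of the whole series
/// (an assumption of this sketch). Series with fewer than two points
/// or zero spread yield no outliers.
pub fn detect_outliers(points: &[Point], threshold: f64) -> Vec<Point> {
    if points.len() < 2 {
        return Vec::new();
    }
    let n = points.len() as f64;
    let mean = points.iter().map(|p| p.value).sum::<f64>() / n;
    let variance = points.iter().map(|p| (p.value - mean).powi(2)).sum::<f64>() / n;
    let sd = variance.sqrt();
    if sd == 0.0 {
        return Vec::new();
    }
    points
        .iter()
        .copied()
        .filter(|p| ((p.value - mean) / sd).abs() > threshold)
        .collect()
}
```

For the six-point series 1, 2, 3, 10, 4, 5 at timestamps 0 through 5, the mean is about 4.17 and the population standard deviation about 2.91, so the value 10 has a z-score of roughly 2.004 and is the only point flagged at a threshold of 2.0. Each returned point keeps its timestamp, which is what pairs the outlier value with its position in the series.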
Java Version Issues:
- Ensure you're using Java 11 or later: java --version
- If using an older version, update Java or modify the code to be compatible
Test Failures:
- Check that the test data matches the expected values
- Verify the implementation of the statistical functions
Common Issues:
- "ClassNotFoundException" - Check your classpath and directory structure
- "NoClassDefFoundError" - Ensure all dependencies are downloaded correctly