libhmm - Modern C++17 Hidden Markov Model Library


A modern, high-performance C++17 implementation of Hidden Markov Models with advanced statistical distributions, SIMD optimization, and parallel processing capabilities.

🚀 Latest Release v2.9.1: Cross-platform architecture support with Apple Silicon ecosystem compatibility and zero build warnings. Features ARM64 porting of the HMM library ecosystem (HMMLib, GHMM, StochHMM, HTK, LAMP HMM, JAHMM), Intel SSE to ARM NEON SIMD optimization, and an architecture-aware build system with automatic Homebrew path detection for Apple Silicon vs. Intel Macs.

Major Achievements

✅ Complete Boost dependency elimination - Replaced all Boost.uBLAS and Boost.Serialization dependencies with custom C++17 implementations
✅ Custom Matrix/Vector implementations - SIMD-friendly contiguous memory layout with full API compatibility (see the sketch after this list)
✅ Comprehensive 5-library benchmarking suite - 100% numerical agreement across HMMLib, GHMM, StochHMM, HTK, and libhmm
✅ Zero external dependencies - C++17 standard library only, simplified build and deployment
✅ Performance baseline established - Validated numerical accuracy with comprehensive performance characterization across HMM libraries
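
The layout idea behind the custom Matrix type is easiest to see in code. The sketch below is not libhmm's actual class; it is a minimal, hypothetical illustration of row-major storage in a single contiguous buffer, which is what lets whole rows be handed to SIMD kernels without gather/scatter.

#include <cstddef>
#include <vector>

// Hypothetical sketch, not the libhmm implementation: a row-major matrix
// backed by one contiguous std::vector, so element (i, j) is plain index
// arithmetic and each row is a cache-friendly, vectorizable span.
class ContiguousMatrix {
public:
    ContiguousMatrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols, 0.0) {}

    double& operator()(std::size_t i, std::size_t j)       { return data_[i * cols_ + j]; }
    double  operator()(std::size_t i, std::size_t j) const { return data_[i * cols_ + j]; }

    const double* rowData(std::size_t i) const { return data_.data() + i * cols_; }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<double> data_;  // single contiguous allocation
};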

Features

🎯 Training Algorithms

  • Viterbi Training - Segmented k-means with clustering
  • Segmented K-Means - Alternative k-means implementation
  • Baum-Welch - Standard expectation-maximization
  • Scaled Baum-Welch - Numerically stable implementation
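
All trainers share the same calling pattern: construct with a model and training data, then call train(). The sketch below mirrors the BaumWelchTrainer call in Basic Usage further down; the ViterbiTrainer and ScaledBaumWelchTrainer class names are assumed from that naming pattern and should be checked against the training/ headers.

#include <libhmm/libhmm.h>

// Hypothetical sketch -- trainer class names are assumed, not verified.
auto hmm = std::make_unique<Hmm>(3, 4);  // 3 states, 4 observation symbols
ObservationLists trainingData = { /* observation sequences */ };

// Viterbi (segmented k-means) training
ViterbiTrainer viterbiTrainer(hmm.get(), trainingData);
viterbiTrainer.train();

// Scaled Baum-Welch: numerically stable expectation-maximization
ScaledBaumWelchTrainer emTrainer(hmm.get(), trainingData);
emTrainer.train();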

📊 Probability Distributions

Discrete Distributions:

  • Discrete Distribution (categorical)
  • Binomial Distribution (success/failure trials)
  • Negative Binomial Distribution (overdispersed count data)
  • Poisson Distribution (count data and rare events)

Continuous Distributions:

  • Gaussian (Normal) Distribution (symmetric, bell-curve)
  • Beta Distribution (probabilities and proportions on [0,1])
  • Gamma Distribution (positive continuous variables)
  • Exponential Distribution (waiting times and reliability)
  • Log-Normal Distribution (multiplicative processes)
  • Pareto Distribution (power-law phenomena)
  • Uniform Distribution (continuous uniform random variables)
  • Weibull Distribution (reliability analysis and survival modeling)
  • Student's t-Distribution (robust modeling with heavy tails)
  • Chi-squared Distribution (goodness-of-fit and categorical analysis)

Quality Standards: All distributions are being progressively updated to meet the Gold Standard Checklist for consistency, robustness, and maintainability.
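
Each distribution is a small value-like object constructed from its natural parameters. The snippet below is illustrative only: the parameter orderings and the getProbability evaluation method are assumptions about the API, not verified signatures.

#include <libhmm/libhmm.h>

// Hypothetical sketch -- parameter order and method names are assumed.
GaussianDistribution gaussian(0.0, 1.0);   // mean, standard deviation
PoissonDistribution  poisson(3.5);         // rate (expected count per interval)

double density = gaussian.getProbability(0.25);   // density at x = 0.25
double mass    = poisson.getProbability(2);       // P(X = 2)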

🧮 Calculators

Forward-Backward Algorithms:

  • Standard Forward-Backward - Classic algorithm for probability computation
  • Scaled SIMD Forward-Backward - Numerically stable with SIMD optimization and automatic CPU fallback
  • Log SIMD Forward-Backward - Log-space computation with SIMD optimization and automatic CPU fallback

Viterbi Algorithms:

  • Standard Viterbi - Most likely state sequence decoding
  • Scaled SIMD Viterbi - Numerically stable Viterbi with SIMD optimization and automatic CPU fallback
  • Log SIMD Viterbi - Log-space Viterbi with SIMD optimization and automatic CPU fallback

Automatic Calculator Selection:

  • AutoCalculator - Intelligent algorithm selection based on problem characteristics
  • Performance Prediction - CPU feature detection and optimal calculator selection
  • Traits-Based Selection - Automatic fallback from SIMD to scalar implementations
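
Calculators follow the same construct-then-query pattern as the ViterbiCalculator shown under Basic Usage: build one with a model and an observation sequence, then read off the quantity you need, with the automatic selection layer choosing the SIMD, scaled, or log-space variant internally. The ForwardBackwardCalculator name and its accessor below are assumptions based on the list above, not verified signatures.

#include <libhmm/libhmm.h>

// Hypothetical sketch -- class and accessor names are assumed, not verified.
// `hmm` and `observations` are as in the Basic Usage example below.
ForwardBackwardCalculator fb(hmm.get(), observations);
double sequenceProbability = fb.probability();   // P(observations | model)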

⚡ Performance Optimizations

  • SIMD Support: AVX, SSE2, and ARM NEON vectorization
  • Thread Pool: Modern C++17 work-stealing thread pool
  • Automatic Optimization: CPU feature detection and algorithm selection
  • Memory Efficiency: Aligned allocators and memory pools
  • Cache Optimization: Blocked algorithms for large matrices

💾 I/O Support

  • XML file reading/writing
  • Extensible file I/O manager
  • Model serialization
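
A typical use is writing a trained model to XML and reading it back later. The sketch below is hypothetical: the XMLFileWriter/XMLFileReader names and their methods are assumptions based on the feature list, so consult the io/ headers for the actual interface.

#include <libhmm/libhmm.h>

// Hypothetical sketch -- I/O class names and methods are assumed, not verified.
// `hmm` is a trained model as in the Basic Usage example below.
XMLFileWriter writer;
writer.write(*hmm, "model.xml");             // save the trained model

XMLFileReader reader;
auto restored = reader.read("model.xml");    // load it back later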

🧪 Testing Infrastructure

  • Distribution Tests: 14 comprehensive distribution tests in tests/distributions/
  • Calculator Tests: 10 SIMD calculator and performance tests in tests/calculators/
  • Integration Tests: Core HMM functionality testing
  • 100% Distribution Test Coverage: all distributions have complete functionality tests
  • CMake/CTest Integration: Automated testing framework
  • Continuous Validation: Parameter fitting, edge cases, and error handling
  • Code Quality Enforcement: clang-tidy integration for automated style guide compliance

📈 Benchmarking Suite

  • Multi-Library Validation: Integration with HMMLib, GHMM, StochHMM, HTK
  • Numerical Accuracy: 100% agreement at machine precision across libraries
  • Performance Baseline: Comprehensive performance characterization
  • Compatibility Documentation: Complete integration guides and fixes

Quick Start

Building with CMake

# Clone and build
git clone <repository-url>
cd libhmm
mkdir build && cd build
cmake ..
make -j$(nproc)

# Run tests
ctest

# Install
sudo make install

Cross-Platform Support: For detailed cross-platform building instructions including macOS, Linux, and configuration options, see Cross-Platform Build Guide.

Building with Make (Legacy)

make

Basic Usage

#include <libhmm/libhmm.h>

// Create HMM with 2 states and discrete observations
auto hmm = std::make_unique<Hmm>(2, 6); // 2 states, 6 observation symbols

// Set up training data
ObservationLists trainingData = { /* your observation sequences */ };

// Train using Baum-Welch
BaumWelchTrainer trainer(hmm.get(), trainingData);
trainer.train();

// Decode the most likely state sequence for a single observation
// sequence (declared elsewhere as `observations`)
ViterbiCalculator viterbi(hmm.get(), observations);
StateSequence states = viterbi.decode();

Examples

See the examples/ directory for comprehensive usage examples.

Project Structure

libhmm/
├── include/libhmm/          # Public headers
│   ├── calculators/         # Algorithm implementations
│   ├── distributions/       # Probability distributions
│   ├── training/            # Training algorithms
│   ├── io/                  # File I/O support
│   └── common/              # Utilities and types
├── src/                     # Implementation files
├── tests/                   # Test suite
├── examples/                # Usage examples
├── docs/                    # Documentation
└── CMakeLists.txt           # Modern build system

Requirements

  • C++17 compatible compiler (GCC 7+, Clang 6+, MSVC 2017+)
  • CMake 3.15+ (for CMake builds)
  • Make (for legacy Makefile builds)

Zero External Dependencies - libhmm now requires only the C++17 standard library!

Documentation

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes with tests
  4. Follow the Gold Standard Checklist for distribution implementations
  5. Ensure code complies with the project style guide (enforced via clang-tidy)
  6. Submit a pull request

Acknowledgments

  • Original implementation modernized to C++17 standards
  • Inspired by JAHMM and other HMM libraries
  • Mathematical foundations from Rabiner & Juang tutorials
