
libCacheSim Python Binding

Python bindings for libCacheSim, a high-performance cache simulator and analysis library.

Installation

Binary installers for the latest released version are available at the Python Package Index (PyPI).

pip install libcachesim

Installation from sources

If there are no wheels suitable for your environment, consider building from source.

bash scripts/install.sh

Run all tests to ensure the package works.

python -m pytest tests/

🚀 Features

  • High-Performance Cache Simulation: Built on the proven libCacheSim C++ library
  • Multiple Cache Algorithms: LRU, LFU, FIFO, ARC, S3FIFO, Sieve, TinyLFU, and more
  • Trace Processing: Support for various trace formats (CSV, binary, Oracle, etc.)
  • Synthetic Workload Generation: Zipf, uniform, and custom distributions (see the sketch after this list)
  • Trace Analysis: Comprehensive workload analysis and visualization tools
  • Custom Cache Policies: Implement new algorithms using Python hooks
  • Multi-language Documentation: English and Chinese documentation with examples
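
The synthetic workload generator can drive any cache directly, without a trace file. A minimal sketch, assuming the binding exposes a SyntheticReader with the parameters shown (the class and argument names here are assumptions; check the documentation for the exact API):

import libcachesim as lcs

# Hedged sketch: SyntheticReader and its parameters are assumptions based on
# the feature list above; consult the documentation for the exact API.
reader = lcs.SyntheticReader(
    num_of_req=100_000,  # number of requests to generate
    num_objects=1_000,   # size of the object universe
    obj_size=100,        # fixed object size in bytes
    dist="zipf",         # distribution: "zipf" or "uniform"
    alpha=1.0,           # Zipf skew parameter
    seed=42,             # reproducible generation
)

cache = lcs.LRU(cache_size=1024 * 1024)
obj_miss_ratio, byte_miss_ratio = cache.process_trace(reader)
print(f"Zipf workload miss ratio: {obj_miss_ratio:.4f}")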

Quick Start

Basic Usage

import libcachesim as lcs

# Create a cache
cache = lcs.LRU(cache_size=1024*1024)  # 1MB cache

# Process requests
req = lcs.Request()
req.obj_id = 1
req.obj_size = 100

print(cache.get(req))  # False (first access)
print(cache.get(req))  # True (second access)

Trace Processing

To simulate with traces, we need to read trace requests correctly. open_trace is a unified interface for trace reading that accepts three parameters:

  • trace_path: path to the trace file; it can be relative or absolute.
  • type (optional): if not given, the trace type is inferred from the trace file's suffix.
  • params (optional): if not given, default parameters are applied.

import libcachesim as lcs

# Open trace and process efficiently
reader = lcs.open_trace(
    trace_path = "./data/cloudPhysicsIO.oracleGeneral.bin",
    type = lcs.TraceType.ORACLE_GENERAL_TRACE,
    params = lcs.ReaderInitParam(ignore_obj_size=True)
)
cache = lcs.S3FIFO(cache_size=1024*1024)

# Process entire trace efficiently (C++ backend)
obj_miss_ratio, byte_miss_ratio = cache.process_trace(reader)
print(f"Object miss ratio: {obj_miss_ratio:.4f}, Byte miss ratio: {byte_miss_ratio:.4f}")

cache = lcs.S3FIFO(cache_size=1024*1024)
# Process with limits and time ranges
obj_miss_ratio, byte_miss_ratio = cache.process_trace(
    reader,
    start_req=0,
    max_req=1000
)
print(f"Object miss ratio: {obj_miss_ratio:.4f}, Byte miss ratio: {byte_miss_ratio:.4f}")

Custom Cache Policies

Implement custom cache replacement algorithms using pure Python functions, with no C/C++ compilation required.

Python Hook Cache Overview

The PluginCache allows you to define custom caching behavior through Python callback functions. This is perfect for:

  • Prototyping new cache algorithms
  • Educational purposes and learning
  • Research and experimentation
  • Custom business logic implementation

Hook Functions

You need to implement these callback functions:

  • init_hook(cache_size: int) -> Any: Initialize your data structure
  • hit_hook(data: Any, obj_id: int, obj_size: int) -> None: Handle cache hits
  • miss_hook(data: Any, obj_id: int, obj_size: int) -> None: Handle cache misses
  • eviction_hook(data: Any, obj_id: int, obj_size: int) -> int: Return object ID to evict
  • remove_hook(data: Any, obj_id: int) -> None: Clean up when an object is removed
  • free_hook(data: Any) -> None: [Optional] Final cleanup

Example: Custom LRU Implementation

import libcachesim as lcs
from collections import OrderedDict

# Create a Python hook-based cache
cache = lcs.PluginCache(cache_size=1024*1024, cache_name="MyLRU")

# Define LRU policy hooks
def init_hook(cache_size):
    return OrderedDict()  # Track access order

def hit_hook(lru_dict, obj_id, obj_size):
    lru_dict.move_to_end(obj_id)  # Move to most recent

def miss_hook(lru_dict, obj_id, obj_size):
    lru_dict[obj_id] = True  # Add to end

def eviction_hook(lru_dict, obj_id, obj_size):
    return next(iter(lru_dict))  # Return least recent

def remove_hook(lru_dict, obj_id):
    lru_dict.pop(obj_id, None)

# Set the hooks
cache.set_hooks(init_hook, hit_hook, miss_hook, eviction_hook, remove_hook)

# Use it like any other cache
req = lcs.Request()
req.obj_id = 1
req.obj_size = 100
hit = cache.get(req)
print(f"Cache hit: {hit}")  # Should be False (miss)

Example: Custom FIFO Implementation

import libcachesim as lcs
from collections import deque
from contextlib import suppress

cache = lcs.PluginCache(cache_size=1024, cache_name="CustomFIFO")

def init_hook(cache_size):
    return deque()  # Use deque for FIFO order

def hit_hook(fifo_queue, obj_id, obj_size):
    pass  # FIFO doesn't reorder on hit

def miss_hook(fifo_queue, obj_id, obj_size):
    fifo_queue.append(obj_id)  # Add to end of queue

def eviction_hook(fifo_queue, obj_id, obj_size):
    return fifo_queue[0]  # Return first item (oldest)

def remove_hook(fifo_queue, obj_id):
    with suppress(ValueError):
        fifo_queue.remove(obj_id)

# Set the hooks and test
cache.set_hooks(init_hook, hit_hook, miss_hook, eviction_hook, remove_hook)

req = lcs.Request(obj_id=1, obj_size=100)
hit = cache.get(req)
print(f"Cache hit: {hit}")  # Should be False (miss)

Available Algorithms

Built-in Cache Algorithms

Basic Algorithms

  • FIFO: First-In-First-Out
  • LRU: Least Recently Used
  • LFU: Least Frequently Used
  • LFUDA: LFU with Dynamic Aging
  • Clock: Clock/Second-chance algorithm

Advanced Algorithms

  • QDLP: Quick Demotion with Lazy Promotion
  • S3FIFO: Simple, scalable caching with three static FIFO queues (recommended for most workloads)
  • Sieve: High-performance eviction algorithm
  • ARC: Adaptive Replacement Cache
  • TwoQ: Two-Queue algorithm
  • SLRU: Segmented LRU
  • TinyLFU: Frequency-based admission policy
  • WTinyLFU: TinyLFU with an admission window

Research/ML Algorithms

  • LeCaR: Learning Cache Replacement (adaptive)
  • Cacheus: Adaptive, learning-based replacement policy
  • LRB: Learning Relaxed Belady (if enabled)
  • GLCache: Machine learning-based cache
  • ThreeLCache: Three-level cache hierarchy (if enabled)

Optimal Algorithms (for analysis)

  • Belady: Optimal offline algorithm
  • BeladySize: Size-aware optimal algorithm

import libcachesim as lcs

# All algorithms use the same unified interface
cache_size = 1024 * 1024  # 1MB

lru_cache = lcs.LRU(cache_size)
s3fifo_cache = lcs.S3FIFO(cache_size)
sieve_cache = lcs.Sieve(cache_size)
arc_cache = lcs.ARC(cache_size)

# All caches work identically
req = lcs.Request()
req.obj_id = 1
req.obj_size = 100
hit = lru_cache.get(req)
print(hit)

Examples and Testing

Algorithm Comparison

import libcachesim as lcs

def compare_algorithms(trace_path):
    algorithms = ['LRU', 'S3FIFO', 'Sieve', 'ARC']
    for algo_name in algorithms:
        # Reopen the trace so every algorithm reads it from the beginning
        reader = lcs.open_trace(trace_path, lcs.TraceType.VSCSI_TRACE)
        cache = getattr(lcs, algo_name)(cache_size=1024*1024)
        obj_miss_ratio, byte_miss_ratio = cache.process_trace(reader)
        print(f"{algo_name}\t\tObj: {obj_miss_ratio:.4f}, Byte: {byte_miss_ratio:.4f}")

compare_algorithms("./data/cloudPhysicsIO.vscsi")

Performance Benchmarking

import time

import libcachesim as lcs

def benchmark_cache(cache, num_requests=100000):
    """Benchmark cache performance"""
    start_time = time.time()
    for i in range(num_requests):
        req = lcs.Request()
        req.obj_id = i % 1000  # Working set of 1000 objects
        req.obj_size = 100
        cache.get(req)
    end_time = time.time()
    throughput = num_requests / (end_time - start_time)
    print(f"Processed {num_requests} requests in {end_time - start_time:.2f}s")
    print(f"Throughput: {throughput:.0f} requests/sec")

# Compare performance
lru_cache = lcs.LRU(cache_size=1024*1024)
s3fifo_cache = lcs.S3FIFO(cache_size=1024*1024)

print("LRU Performance:")
benchmark_cache(lru_cache)

print("\nS3FIFO Performance:")
benchmark_cache(s3fifo_cache)

Advanced Usage

Multi-Format Trace Processing

import libcachesim as lcs

# Supported trace types
trace_types = {
    "oracle": lcs.TraceType.ORACLE_GENERAL_TRACE,
    "csv": lcs.TraceType.CSV_TRACE,
    "vscsi": lcs.TraceType.VSCSI_TRACE,
    "txt": lcs.TraceType.PLAIN_TXT_TRACE
}

# Trace files in different formats
oracle_path = "./data/cloudPhysicsIO.oracleGeneral.bin"
txt_path = "./data/cloudPhysicsIO.txt"

# Process the traces with different caches; reopen the readers for each
# cache so every run reads the full trace from the beginning
for name in ["LRU", "S3FIFO", "Sieve"]:
    cache = getattr(lcs, name)(cache_size=1024*1024)
    oracle_reader = lcs.open_trace(oracle_path, trace_types["oracle"])
    txt_reader = lcs.open_trace(txt_path, trace_types["txt"])
    miss_ratio_oracle = cache.process_trace(oracle_reader)[0]
    miss_ratio_txt = cache.process_trace(txt_reader)[0]
    print(f"{name} miss ratio: {miss_ratio_oracle:.4f}, {miss_ratio_txt:.4f}")

Troubleshooting

Common Issues

Import Error: Make sure libCacheSim C++ library is built first:

cmake -G Ninja -B build && ninja -C build

Performance Issues: For large workloads, use process_trace() instead of individual get() calls; the whole trace is then processed in the C++ backend.

Memory Usage: Monitor cache statistics (cache.occupied_byte) and ensure proper cache size limits for your system.
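
For example, checking cache.occupied_byte (referenced above) after a request; only that attribute is assumed here:

import libcachesim as lcs

cache_size = 1024 * 1024
cache = lcs.LRU(cache_size=cache_size)
req = lcs.Request(obj_id=1, obj_size=100)
cache.get(req)

# Compare bytes currently held against the configured limit
print(f"occupied: {cache.occupied_byte} / {cache_size} bytes")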

Custom Cache Issues: Validate your custom implementation against built-in algorithms using the test functions above.
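
For example, you can replay identical requests through a custom cache and a built-in reference and compare their decisions. A minimal sketch using only APIs shown above (my_plugin_lru is a placeholder for the PluginCache built earlier):

import libcachesim as lcs

def validate_against_builtin(custom_cache, reference_cache, num_requests=10000):
    """Count hit/miss disagreements between two caches on the same workload."""
    mismatches = 0
    for i in range(num_requests):
        req = lcs.Request(obj_id=i % 500, obj_size=100)
        if custom_cache.get(req) != reference_cache.get(req):
            mismatches += 1
    print(f"{mismatches} mismatches out of {num_requests} requests")

# validate_against_builtin(my_plugin_lru, lcs.LRU(cache_size=1024*1024))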

Install with uv: Building from source under uv fails because the published source distribution is incomplete, so force a binary install: uv pip install libcachesim --only-binary=:all:.
