A Python library for discovering causal networks from time series data using Optimal Causation Entropy (oCSE).
CausationEntropy implements state-of-the-art information-theoretic methods for causal discovery from multivariate time series. The library provides robust algorithms that can identify causal relationships while controlling for confounding variables and false discoveries.
Given time series data, CausationEntropy finds which variables cause changes in other variables by:
- Predictive Testing: Checking whether knowing variable X at time t helps predict variable Y at time t+1
- Information Theory: Using conditional mutual information to measure predictive relationships
- Statistical Control: Rigorous statistical testing to avoid false discoveries
- Multiple Methods: Supporting various information estimators and discovery algorithms
pip install causationentropy
git clone https://github.com/Center-For-Complex-Systems-Science/causationentropy.git
cd causationentropy
pip install -e .
import numpy as np
import pandas as pd
from causationentropy import discover_network
# Load your time series data (variables as columns, time as rows)
data = pd.read_csv('data.csv')
# Discover causal network
network = discover_network(data, method='standard', max_lag=5)
# Examine results
print(f"Found {network.number_of_edges()} causal relationships")
for source, sink, attrs in network.edges(data=True):
    print(f"{source} -> {sink}: {attrs}")
# Configure discovery parameters
network = discover_network(
    data,
    method='standard',       # 'standard', 'alternative', 'information_lasso', or 'lasso'
    information='gaussian',  # 'gaussian', 'knn', 'kde', 'geometric_knn', or 'poisson'
    max_lag=5,               # Maximum time lag to consider
    alpha_forward=0.05,      # Forward selection significance
    alpha_backward=0.05,     # Backward elimination significance
    n_shuffles=200           # Permutation test iterations
)
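To illustrate what `max_lag` controls: each column of the input, shifted back by 1 through `max_lag` steps, becomes a candidate predictor. The sketch below is an illustration of this idea only (the helper `build_lagged_candidates` is invented here, not part of the library's API):

```python
import numpy as np
import pandas as pd

def build_lagged_candidates(df, max_lag):
    """Build lag-1..max_lag copies of every column as candidate predictors.
    Illustrative only -- not the library's internal representation."""
    frames = {}
    for col in df.columns:
        for lag in range(1, max_lag + 1):
            frames[f"{col}(t-{lag})"] = df[col].shift(lag)
    # Rows without a full set of lags are dropped
    return pd.DataFrame(frames).dropna()

df = pd.DataFrame({"X": np.arange(10.0), "Y": np.arange(10.0) ** 2})
cands = build_lagged_candidates(df, max_lag=3)
# 2 variables x 3 lags = 6 candidate columns; the 3 leading rows are dropped
```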
from causationentropy.datasets import synthetic
# Generate synthetic causal time series
data, true_network = synthetic.linear_stochastic_gaussian_process(
    n_variables=5,
    n_samples=1000,
    sparsity=0.3
)
# Discover network
discovered = discover_network(data)
# Compare with ground truth
print(f"True edges: {true_network.number_of_edges()}")
print(f"Discovered edges: {discovered.number_of_edges()}")
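Beyond edge counts, the comparison above can be scored with edge-level precision and recall. A minimal sketch, assuming both networks are networkx `DiGraph` objects (which the `edges()` calls above suggest); `edge_precision_recall` is a helper defined here, not a library function:

```python
import networkx as nx

def edge_precision_recall(true_net, discovered_net):
    """Edge-level precision and recall between two directed graphs."""
    true_edges = set(true_net.edges())
    found_edges = set(discovered_net.edges())
    tp = len(true_edges & found_edges)  # correctly recovered edges
    precision = tp / len(found_edges) if found_edges else 0.0
    recall = tp / len(true_edges) if true_edges else 0.0
    return precision, recall

# Toy example with hand-built graphs
truth = nx.DiGraph([(0, 1), (1, 2), (2, 0)])
found = nx.DiGraph([(0, 1), (1, 2), (0, 2)])
p, r = edge_precision_recall(truth, found)  # 2 of 3 edges agree: p = r = 2/3
```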
- Multiple Algorithms: Standard, alternative, information lasso, and lasso variants of oCSE
- Flexible Information Estimators: Gaussian, k-NN, KDE, geometric k-NN, and Poisson methods
- Statistical Rigor: Permutation-based significance testing with comprehensive test coverage
- Synthetic Data: Built-in generators for testing and validation
- Visualization: Network plotting and analysis tools
- Performance: Optimized implementations with parallel processing support
The algorithm uses conditional mutual information (CMI) to quantify causal relationships:

I(X; Y | Z) = H(Y | Z) - H(Y | X, Z)

This measures how much variable X tells us about variable Y beyond what we already know from the conditioning set Z.
Causal Discovery Rule: Variable X causes Y if knowing X(t) significantly improves prediction of Y(t+1), even when controlling for all other relevant variables.
The algorithm implements a two-phase approach:
- Forward Selection: Iteratively adds predictors that maximize conditional mutual information
- Backward Elimination: Removes predictors that lose significance when conditioned on others
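The two-phase procedure can be sketched in plain Python with a Gaussian CMI estimator (via partial correlation) and a shuffle test. This is a schematic illustration under a joint-Gaussian assumption, not the library's implementation; `gaussian_cmi`, `shuffle_pvalue`, and `forward_backward` are names invented here:

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_cmi(x, y, z=None):
    # I(X; Y | Z) under a joint-Gaussian assumption, via partial correlation
    if z is None:
        r = np.corrcoef(x, y)[0, 1]
    else:
        zz = np.column_stack([z, np.ones(len(z))])  # conditioning set + intercept
        rx = x - zz @ np.linalg.lstsq(zz, x, rcond=None)[0]
        ry = y - zz @ np.linalg.lstsq(zz, y, rcond=None)[0]
        r = np.corrcoef(rx, ry)[0, 1]
    return -0.5 * np.log(max(1.0 - r * r, 1e-12))

def shuffle_pvalue(x, y, z=None, n_shuffles=100):
    # Permutation test: how often does shuffled X look as informative as X?
    stat = gaussian_cmi(x, y, z)
    null = [gaussian_cmi(rng.permutation(x), y, z) for _ in range(n_shuffles)]
    return float(np.mean([s >= stat for s in null]))

def forward_backward(candidates, y, alpha=0.05):
    # Phase 1 -- forward selection: greedily add the candidate with the
    # largest CMI, conditioned on everything selected so far.
    selected = []
    while True:
        remaining = [k for k in candidates if k not in selected]
        if not remaining:
            break
        z = np.column_stack([candidates[k] for k in selected]) if selected else None
        best = max(remaining, key=lambda k: gaussian_cmi(candidates[k], y, z))
        if shuffle_pvalue(candidates[best], y, z) >= alpha:
            break
        selected.append(best)
    # Phase 2 -- backward elimination: drop predictors that lose
    # significance when conditioned on the rest of the selected set.
    for k in list(selected):
        rest = [j for j in selected if j != k]
        z = np.column_stack([candidates[j] for j in rest]) if rest else None
        if shuffle_pvalue(candidates[k], y, z) >= alpha:
            selected.remove(k)
    return selected

# Toy system: y(t+1) is driven by x1(t); x2 is an independent distractor
T = 500
x1, x2 = rng.normal(size=T), rng.normal(size=T)
y_next = 0.8 * x1[:-1] + 0.4 * rng.normal(size=T - 1)
parents = forward_backward({"x1": x1[:-1], "x2": x2[:-1]}, y_next)
```

The same skeleton generalizes to other estimators by swapping out `gaussian_cmi`, which is how the `information` options above differ in spirit.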
📚 Read the full documentation on ReadTheDocs
- API Reference: Complete function and class documentation
- User Guide: Detailed tutorials and examples
- Theory: Mathematical background and algorithms
- Examples: Check the examples/ and notebooks/ directories
- Research Papers: See the papers/ directory for theoretical foundations
Build documentation locally:
cd docs/
make html
# Open docs/_build/html/index.html
We welcome contributions! Please see CONTRIBUTING.md for guidelines.
If you use this library in your research, please cite:
@misc{slote2025causationentropy,
  author = {Slote, Kevin and Fish, Jeremie and Bollt, Erik},
  title  = {CausationEntropy: A Python Library for Causal Discovery},
  year   = {2025},
  url    = {https://github.com/Center-For-Complex-Systems-Science/causationentropy},
  doi    = {10.5281/zenodo.17047565}
}
This project is licensed under the MIT License - see the LICENSE file for details.
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Email: kslote1@gmail.com
This work builds upon fundamental research in information theory, causal inference, and time series analysis. Special thanks to the open-source scientific Python community.
Generative AI was used to help with doc strings, documentation, and unit tests.