GitHub - architdatar/ml_uncertainty: Get prediction intervals, confidence intervals, and parameter uncertainties for various machine learning models

ML Uncertainty is a Python package that provides a scikit-learn-like interface for obtaining prediction intervals and model parameter error estimates for machine learning models.

All in fewer than 4 lines of code.

Getting started

Install from PyPI with

pip install ml-uncertainty

Examples

View: View all examples.

Run: The examples produce plots, so running them requires a few additional packages. Install these using:

pip install matplotlib seaborn jupyter scikit-fda

First example: Linear regression

Consider a linear regression model fit with scikit-learn. The uncertainty estimation can be done as follows:

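from sklearn.linear_model import LinearRegression
# Import path assumed here; check the package docs if it differs.
from ml_uncertainty.model_inference import ParametricModelInference

# Fit model with sklearn. X_expt and y_expt are the training features and targets.
regr = LinearRegression(fit_intercept=True)
regr.fit(X_expt, y_expt)

# Set up error estimation with ML Uncertainty.
inf = ParametricModelInference()
inf.set_up_model_inference(X_expt, y_expt, regr)

# Obtain parameter error estimates with ML Uncertainty.
df_feature_imp = inf.get_parameter_errors()

# Obtain prediction intervals with ML Uncertainty.
df_int = inf.get_intervals(X_expt, confidence_level=95.0, distribution="t")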

The result looks like:

Find the full example here.
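For intuition about what these outputs represent, here is a minimal sketch of the classical ordinary-least-squares formulas for parameter standard errors and t-based prediction intervals, written with NumPy and SciPy only. It is not the package's implementation; the function name `ols_intervals` and the synthetic data are hypothetical and chosen purely for illustration.

```python
import numpy as np
from scipy import stats


def ols_intervals(X, y, X_new, confidence_level=95.0):
    """Textbook OLS standard errors and t-based prediction intervals.

    Hypothetical helper for illustration; not part of ml_uncertainty.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    X_new = np.asarray(X_new, dtype=float)

    # Prepend a column of ones so the first parameter is the intercept.
    A = np.column_stack([np.ones(len(X)), X])
    A_new = np.column_stack([np.ones(len(X_new)), X_new])
    n, p = A.shape

    # Least-squares fit and residual variance estimate.
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    dof = n - p
    sigma2 = resid @ resid / dof

    # Parameter covariance and standard errors.
    cov_beta = sigma2 * np.linalg.inv(A.T @ A)
    std_err = np.sqrt(np.diag(cov_beta))

    # Two-sided critical value of the t distribution.
    t_crit = stats.t.ppf(0.5 + confidence_level / 200.0, dof)

    # Variance of the fitted mean at each new point, plus the noise
    # variance, gives the variance of a new observation.
    y_hat = A_new @ beta
    var_mean = np.einsum("ij,jk,ik->i", A_new, cov_beta, A_new)
    half_width = t_crit * np.sqrt(var_mean + sigma2)
    return beta, std_err, y_hat - half_width, y_hat + half_width


# Synthetic data: y = 2 + 3 x + Gaussian noise.
rng = np.random.default_rng(0)
X_demo = rng.uniform(0, 10, size=(50, 1))
y_demo = 2.0 + 3.0 * X_demo[:, 0] + rng.normal(0, 1.0, size=50)

beta, std_err, lower, upper = ols_intervals(X_demo, y_demo, X_demo)
print("parameters:", beta)
print("standard errors:", std_err)
```

Dropping the `+ sigma2` term inside the square root would give a confidence interval for the mean response instead of a prediction interval for a new observation.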

Other examples

Intended audience

This package is intended to benefit data scientists and ML enthusiasts.

Motivation

Too often in machine learning, we fit complex models but cannot quantify their precision via prediction intervals or feature significance.
This is especially true of the scikit-learn environment, which is extremely easy to use but does not offer these functionalities.
However, in many use cases, especially where we have small and fat datasets, these insights are critical to producing reliable models.
Enter ML Uncertainty! It provides an easy API to get all these insights from models.
It takes fitted scikit-learn models as inputs and uses appropriate statistics to quantify the uncertainties in ML models.

Computing these statistics is as easy as:

# Set up the model inference.
inf = ParametricModelInference()
inf.set_up_model_inference(X_train=X, y_train=y, estimator=regr)

# Get parameter importance estimates.
df_imp = inf.get_parameter_errors()

# Get prediction intervals.
df_int = inf.get_intervals(X)

Features

Model parameter significance testing: Tests whether the given model parameters are statistically significant.

For ensemble models, it can indicate whether given features are truly important or only appear so because of the instability of the model.
Prediction intervals: Can produce prediction and confidence intervals for parametric and non-parametric ML models.
Error propagation: Propagates error from input / model parameters to the outputs.
Non-Linear regression: Scikit-learn-style API to fit non-linear models.
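To illustrate the idea behind error propagation, the sketch below applies first-order (delta-method) propagation of parameter uncertainty to a model output, using a finite-difference Jacobian. It is an illustrative stand-in under stated assumptions, not the package's ErrorPropagation class; the names `propagate_parameter_error` and `line` and the example covariance matrix are hypothetical.

```python
import numpy as np


def propagate_parameter_error(func, params, cov_params, x, eps=1e-6):
    """First-order standard deviation of func(x, params) caused by
    uncertainty in params (hypothetical helper, for illustration only)."""
    params = np.asarray(params, dtype=float)
    cov_params = np.asarray(cov_params, dtype=float)
    base = np.ravel(func(x, params))

    # Central-difference Jacobian of the outputs w.r.t. each parameter.
    jac = np.empty((base.size, params.size))
    for j in range(params.size):
        step = np.zeros_like(params)
        step[j] = eps * max(1.0, abs(params[j]))
        upper = np.ravel(func(x, params + step))
        lower = np.ravel(func(x, params - step))
        jac[:, j] = (upper - lower) / (2.0 * step[j])

    # Output variance for each point: J @ cov @ J^T, diagonal only.
    var = np.einsum("ij,jk,ik->i", jac, cov_params, jac)
    return base, np.sqrt(var)


def line(x, params):
    """Simple model: intercept + slope * x."""
    return params[0] + params[1] * x


x_demo = np.array([0.0, 1.0, 2.0])
cov_demo = np.diag([0.04, 0.01])  # assumed parameter covariance
y_pred, y_std = propagate_parameter_error(line, [1.0, 2.0], cov_demo, x_demo)
print(y_pred, y_std)
```

For this linear model the propagated variance is 0.04 + 0.01 x^2, so the uncertainty grows with distance from x = 0, where only the intercept contributes.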

Installation

Dependencies

Python versions: See badges above.
Packages: See requirements.txt.

User installation

See ./docs/installation.md.

Theoretical foundations

Discussion about the theory used can be found here:

Benchmarking

The NonLinearRegression, ParametricModelInference, and ErrorPropagation classes have been benchmarked against the Python statsmodels package. The benchmarking code can be found here.

To run the benchmarks, please install statsmodels using:

pip install statsmodels==0.14.0

To the best of my knowledge, there is no existing implementation to benchmark EnsembleModelInference against. However, the code follows the ideas developed in the work by Zhang et al. (2020). The test applied is that a $(1-\alpha)\times100$ % prediction interval must contain a proportion $(1-\alpha)$ of the training data. See the benchmarking code here.
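The coverage criterion described above can be sketched as follows. This is a minimal illustration with synthetic data and intervals built from a known Gaussian noise model, standing in for intervals produced by an ensemble model; the helper name `empirical_coverage` and all data are hypothetical and not taken from the package's benchmarks.

```python
import numpy as np
from scipy import stats


def empirical_coverage(y_true, lower, upper):
    """Fraction of observations that fall inside their interval."""
    y_true = np.asarray(y_true)
    inside = (y_true >= np.asarray(lower)) & (y_true <= np.asarray(upper))
    return float(np.mean(inside))


# Synthetic data with known mean and noise level (assumed for the demo).
rng = np.random.default_rng(42)
alpha = 0.05
noise_sd = 0.5
x = np.linspace(0.0, 1.0, 2000)
y_mean = np.sin(2.0 * np.pi * x)
y_obs = y_mean + rng.normal(0.0, noise_sd, size=x.size)

# Ideal (1 - alpha) intervals under the known Gaussian noise model.
z = stats.norm.ppf(1.0 - alpha / 2.0)
lower = y_mean - z * noise_sd
upper = y_mean + z * noise_sd

coverage = empirical_coverage(y_obs, lower, upper)
print(f"target coverage: {1 - alpha:.2f}, empirical coverage: {coverage:.3f}")
```

With well-calibrated intervals, the empirical coverage should be close to the target; a value far below it suggests the intervals are too narrow, and a value far above it suggests they are too wide.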

Author

Archit Datar (architdatar@gmail.com)

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.

Some functions in ParametricModelInference are adopted from a GitHub repo by sriki18.
