Scikit-longitudinal (Sklong) is a machine learning library for analysing longitudinal data (currently focused on classification tasks). It offers tools and models for processing, analysing, and predicting longitudinal data, with a user-friendly interface that integrates with the Scikit-learn ecosystem.
For more details, visit the official documentation.
Note
Want to use Jupyter Notebook, Marimo, Google Colab, or JupyterLab? Head to the Getting Started section of the documentation, where we explain it all! 🎉
To install Scikit-longitudinal:
✅ Install the latest version:
pip install Scikit-longitudinal
To install a specific version:
pip install Scikit-longitudinal==0.1.0
Caution
Scikit-longitudinal is currently compatible with Python 3.9 only. Ensure you have this version installed before proceeding with the installation.
We understand that this is a limitation, but for the time being we are tied to it by Deep Forest, a dependency of Scikit-longitudinal that is not compatible with Python versions greater than 3.9. Deep Forest provides the Deep Forest algorithm, to which we have made some modifications to accommodate Lexicographical Deep Forest.
To follow up on this discussion, please refer to this GitHub issue.
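Since the version constraint is strict, it can be worth checking the interpreter before running pip. The snippet below is a minimal sketch that uses only the standard library; `is_supported` is a hypothetical helper written for this illustration and is not part of Scikit-longitudinal.

```python
import sys

# The only version the Caution note above lists as supported.
REQUIRED = (3, 9)


def is_supported(version_info=sys.version_info, required=REQUIRED):
    """Return True when the major.minor version matches `required`."""
    return tuple(version_info[:2]) == required


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if is_supported():
        print(f"Python {current}: OK for Scikit-longitudinal")
    else:
        print(f"Python {current}: not supported, use Python 3.9")
```

Run it with the interpreter you plan to install into, for example inside the virtual environment you created for the project.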
If you encounter any errors, see the installation section of Getting Started in the documentation.
If it still doesn't work, please open an issue on GitHub.
Here's how to analyse longitudinal data with Scikit-longitudinal:
from sklearn.metrics import classification_report

from scikit_longitudinal.data_preparation import LongitudinalDataset
from scikit_longitudinal.estimators.ensemble.lexicographical.lexico_gradient_boosting import LexicoGradientBoostingClassifier

dataset = LongitudinalDataset('./stroke.csv')  # Note this is a fictional dataset. Use yours!
dataset.load_data_target_train_test_split(
    target_column="class_stroke_wave_4",
)

# Pre-set or manually set your temporal dependencies
dataset.setup_features_group(input_data="elsa")

model = LexicoGradientBoostingClassifier(
    features_group=dataset.feature_groups(),
    threshold_gain=0.00015  # Refer to the API for more hyper-parameters and their meaning
)

model.fit(dataset.X_train, dataset.y_train)
y_pred = model.predict(dataset.X_test)

# Classification report (compare predictions against the held-out test labels)
print(classification_report(dataset.y_test, y_pred))
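To picture what "temporal dependencies" could look like when set manually, here is a minimal sketch. It assumes a features group can be represented as a list of lists of column indices, one inner list per longitudinal attribute, ordered from the oldest to the most recent wave. Both that representation and the `_wave_N` column naming (borrowed from the target column above) are assumptions made for illustration. `group_by_wave_suffix` is a hypothetical helper, not part of Scikit-longitudinal, so please check the API documentation for the exact format the library expects.

```python
import re
from collections import defaultdict

# Assumed naming scheme: "<attribute>_wave_<number>", e.g. "smoke_wave_2".
WAVE_PATTERN = re.compile(r"^(?P<name>.+)_wave_(?P<wave>\d+)$")


def group_by_wave_suffix(columns):
    """Group column indices by attribute name, ordered by wave number.

    Returns (groups, non_longitudinal), where `groups` holds one list of
    column indices per longitudinal attribute and `non_longitudinal` holds
    the indices of columns without a wave suffix.
    """
    waves = defaultdict(list)
    non_longitudinal = []
    for index, column in enumerate(columns):
        match = WAVE_PATTERN.match(column)
        if match:
            waves[match["name"]].append((int(match["wave"]), index))
        else:
            non_longitudinal.append(index)
    # Sort each attribute's (wave, index) pairs so waves run oldest to newest.
    groups = [[index for _, index in sorted(pairs)] for pairs in waves.values()]
    return groups, non_longitudinal


# Fictional column names, for illustration only.
columns = ["age", "smoke_wave_1", "bmi_wave_1", "smoke_wave_2", "bmi_wave_2"]
groups, static = group_by_wave_suffix(columns)
print(groups)  # [[1, 3], [2, 4]]
print(static)  # [0]
```

Here `smoke` and `bmi` each form one group spanning two waves, while `age` is treated as a non-longitudinal column.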
We are currently preparing a JOSS submission, so stay tuned! Meanwhile, click Cite This Repository in the top-right corner of this page to get a BibTeX entry.
Scikit-longitudinal is licensed under the MIT License.