First release

Latest
@AnFreTh AnFreTh released this 12 Apr 20:55
· 4 commits to main since this release

📦 pretab v0.0.1 – Initial Release

We’re excited to introduce pretab, a flexible and extensible preprocessing library for tabular data, built on top of scikit-learn.

🚀 Features

  • 🔢 Numerical preprocessing via:

    • Discretization with custom binning
    • Tree-based transformations (PLE, CustomBin)
    • Spline expansions: Cubic, Natural, P-splines, Thin Plate Splines, Tensor Product Splines
    • Neural-inspired bases: RBF, ReLU, Sigmoid, Tanh
    • All available sklearn transformations, including:
      • Standard, MinMax, Quantile, and Robust scaling
      • Polynomial features, Box-Cox, and Yeo-Johnson transforms
  • 🧠 Categorical preprocessing via:

    • Ordinal, one-hot, and float encodings
    • Language embeddings (optional)
  • 🧱 Integrated with sklearn's ColumnTransformer and Pipeline

  • 🔍 Detailed feature metadata via get_feature_info()

  • 📦 Easy integration of embeddings alongside transformed features

💬 Community

This is an open project – contributions, feature requests, and bug reports are welcome via GitHub!