📦 pretab v0.0.1 – Initial Release
We’re excited to introduce pretab, a flexible and extensible preprocessing library for tabular data, built on top of scikit-learn.
🚀 Features
- 🔢 Numerical preprocessing via:
  - Discretization with custom binning
  - Tree-based transformations (PLE, CustomBin)
  - Spline expansions: Cubic, Natural, P-splines, Thin Plate Splines, Tensor Product Splines
  - Neural-inspired bases: RBF, ReLU, Sigmoid, Tanh
  - All available sklearn transformations:
    - Standard, MinMax, Quantile, and Robust scaling
    - Polynomial features, Box-Cox, and Yeo-Johnson transforms
- 🧠 Categorical preprocessing via:
  - Ordinal, one-hot, and float encodings
  - Language embeddings (optional)
- 🧱 Integrated with sklearn's ColumnTransformer and Pipeline
- 🔍 Detailed feature metadata via get_feature_info()
- 📦 Easy integration of embeddings alongside transformed features
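To give a feel for what a piecewise linear encoding (PLE) of a numerical feature looks like, here is a minimal pure-Python sketch. It is not pretab's implementation or API: the function name `piecewise_linear_encode` and the example bin edges are hypothetical, the edges are passed in by hand rather than derived from a tree, and values outside the edge range are simply clipped to 0 or 1.

```python
from typing import List, Sequence


def piecewise_linear_encode(x: float, edges: Sequence[float]) -> List[float]:
    """Encode a scalar against sorted bin edges (illustrative helper, not pretab API).

    Each bin [lo, hi) contributes one output value:
      1.0 if x lies at or above the bin's upper edge,
      0.0 if x lies at or below the bin's lower edge,
      otherwise the fraction of the bin that x has covered.
    """
    if len(edges) < 2:
        raise ValueError("need at least two bin edges")
    if any(hi <= lo for lo, hi in zip(edges, edges[1:])):
        raise ValueError("bin edges must be strictly increasing")

    encoded = []
    for lo, hi in zip(edges, edges[1:]):
        if x >= hi:
            encoded.append(1.0)  # bin fully passed
        elif x <= lo:
            encoded.append(0.0)  # bin not reached yet
        else:
            encoded.append((x - lo) / (hi - lo))  # partially inside this bin
    return encoded


# Hypothetical bin edges, chosen by hand for illustration.
EDGES = [0.0, 10.0, 20.0, 40.0]

print(piecewise_linear_encode(15.0, EDGES))  # [1.0, 0.5, 0.0]
print(piecewise_linear_encode(25.0, EDGES))  # [1.0, 1.0, 0.25]
```

Unlike one-hot binning, this encoding keeps the ordering of values and their position within a bin, which is why it is often used as an input representation for neural models on tabular data.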
💬 Community
This is an open project – contributions, feature requests, and bug reports are welcome via GitHub!