Support for more BinningProcess.binning_transform_params parameters #201

@WonderfulAnastasia

Hello!

I've run into an issue while converting an OptBinning model to PMML. I used the BinningProcess class and set binning_transform_params with metric_missing='empirical'. In the resulting binning table, the Weight-of-Evidence (WoE) value for missing entries is non-zero (-0.182322). In the exported PMML file, however, the mapMissingTo attribute is set to 0 (Discretize field="age" mapMissingTo="0.0"), which contradicts the missing-value treatment I configured.
Could you please clarify whether mapMissingTo can be configured to reflect the empirical WoE value for missing entries instead of 0?

Thanks!

import pandas as pd
import numpy as np
from optbinning import BinningProcess
from sklearn2pmml import sklearn2pmml
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
df = pd.DataFrame({
    'age':[np.nan, 80, 54, 31, 32, 79, 43, np.nan, 22, 48, 62, 66, 33, 76, 60, 68, 47, 72, 20, 51, 44, 38, 25, 64, 63, 39, 52,
            65, 59, 53, 73, 78, 45, 27, 57, 21, 34, 24, 42, np.nan],
    'y':[1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0,
       1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 0, 1, 1, 1,
       0, 0, 1, 1, 0, 0]
})
pipeline = Pipeline([
    # WoE transform for 'age'; missing values should get their empirical WoE
    ('binning', BinningProcess(variable_names=['age'],
                               binning_transform_params={
                                   'age': {'metric': 'woe',
                                           'metric_missing': 'empirical'}
                               })),
    ('logistic_regression', LogisticRegression(random_state=42))
])
pipeline.fit(df[['age']], df['y'])
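
The non-zero WoE for missing entries can be reproduced by hand from the data above. This is a minimal sketch, assuming the WoE convention log((non-events in bin / all non-events) / (events in bin / all events)); under that assumption the three rows with a missing age give the same -0.182322 shown in the binning table. The variable names are mine and are not part of OptBinning.

```python
import math

nan = float("nan")
# Same data as in the example above.
age = [nan, 80, 54, 31, 32, 79, 43, nan, 22, 48, 62, 66, 33, 76, 60, 68, 47,
       72, 20, 51, 44, 38, 25, 64, 63, 39, 52, 65, 59, 53, 73, 78, 45, 27,
       57, 21, 34, 24, 42, nan]
y = [1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0,
     1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 0, 1, 1, 1,
     0, 0, 1, 1, 0, 0]

# Totals over the whole sample.
n_event = sum(1 for t in y if t == 1)
n_nonevent = sum(1 for t in y if t == 0)

# Counts restricted to rows where age is missing.
missing_targets = [t for a, t in zip(age, y) if math.isnan(a)]
n_event_missing = sum(1 for t in missing_targets if t == 1)
n_nonevent_missing = sum(1 for t in missing_targets if t == 0)

# Assumed convention: log of non-event share over event share.
woe_missing = math.log(
    (n_nonevent_missing / n_nonevent) / (n_event_missing / n_event)
)
print(round(woe_missing, 6))  # -0.182322
```

This is the value I would expect to see in mapMissingTo for the age field.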

P.S. The same problem occurs with metric_special.
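
To check what an exported file carries, the mapMissingTo attribute can be read back with the standard library. This is only a sketch: the PMML fragment below is hand-written for illustration, and only the Discretize attributes (field="age", mapMissingTo="0.0") come from my actual export. The DerivedField name, the namespace version, and the helper map_missing_values are assumptions.

```python
import xml.etree.ElementTree as ET

# Hand-written fragment; everything except the Discretize attributes is assumed.
PMML_FRAGMENT = """
<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">
  <TransformationDictionary>
    <DerivedField name="age_woe" optype="continuous" dataType="double">
      <Discretize field="age" mapMissingTo="0.0"/>
    </DerivedField>
  </TransformationDictionary>
</PMML>
"""


def map_missing_values(pmml_text):
    """Return {field: mapMissingTo} for every Discretize element."""
    root = ET.fromstring(pmml_text.strip())
    result = {}
    for elem in root.iter():
        # Drop the '{namespace}' prefix so the check works for any PMML version.
        tag = elem.tag.rsplit("}", 1)[-1]
        if tag == "Discretize":
            result[elem.get("field")] = elem.get("mapMissingTo")
    return result


found = map_missing_values(PMML_FRAGMENT)
print(found)  # {'age': '0.0'}
```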
