Output Probabilities
icsiboost does not natively produce anything close to calibrated probabilities. The question is studied in the paper Obtaining calibrated probabilities from boosting by Niculescu-Mizil and Caruana, which suggests three solutions:
- Logistic Correction
- Platt Calibration
- Isotonic Regression
The first one, Logistic Correction, consists in transforming the scores with the formula `1/(1+exp(-2*n*score))`, where `n` is the number of weak learners.
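For illustration, here is a minimal Python sketch of that correction applied to raw classifier scores; the scores and the number of boosting rounds below are hypothetical placeholders, not actual icsiboost output.

```python
import math

def logistic_correction(score, n_rounds):
    """Map a raw boosting score to a pseudo-probability via 1/(1+exp(-2*n*score))."""
    return 1.0 / (1.0 + math.exp(-2.0 * n_rounds * score))

# Hypothetical values: raw scores from a model trained for 100 rounds.
n_rounds = 100
raw_scores = [0.031, -0.012, 0.154]
print([logistic_correction(s, n_rounds) for s in raw_scores])
```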
This correction is implemented in icsiboost through the `--posteriors` option. For instance, on the adult dataset, it results in:
```
icsiboost -S adult -C --posteriors < adult.test | head
0 1 0.000676516588 0.999323483412
0 1 0.142914079015 0.857085920985
1 0 0.346835918704 0.653164081296
1 0 0.996016904305 0.003983095695
0 1 0.000004176785 0.999995823215
0 1 0.003001997215 0.996998002785
0 1 0.014896068044 0.985103931956
1 0 0.788795652673 0.211204347327
0 1 0.003583447587 0.996416552413
0 1 0.060653451950 0.939346548050
```
Note that while these scores are between 0 and 1, they are not guaranteed to sum to 1 over all classes when there are more than 2 classes, so you should normalize them for each example (see the sketch below).
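A minimal normalization sketch in Python, assuming you have already stripped the reference-label columns and kept only the per-class posteriors of one example:

```python
def normalize_posteriors(values):
    """Rescale one example's per-class posteriors so that they sum to 1."""
    total = sum(values)
    return [v / total for v in values] if total > 0 else values

# Three hypothetical class posteriors from a single output line.
print(normalize_posteriors([0.80, 0.15, 0.30]))  # -> [0.64, 0.12, 0.24]
```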
Platt Calibration and Isotonic Regression work better in some cases (skewed label prior...). It is also possible to get good results by simply moving the decision boundary using a development set (for instance with the `--max-fmeasure <label>` and `--optimal-iterations` options, or with the `optimal_threshold.pl` script).
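If you want to try Platt Calibration yourself, one common practical approach is to fit a one-dimensional logistic regression on development-set scores; the sketch below uses scikit-learn and hypothetical score/label arrays, not actual icsiboost output.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical raw scores and gold labels from a development set.
dev_scores = np.array([-1.2, -0.4, 0.1, 0.8, 1.5]).reshape(-1, 1)
dev_labels = np.array([0, 0, 1, 1, 1])

# Platt scaling: fit a sigmoid a*score + b on the development data.
calibrator = LogisticRegression()
calibrator.fit(dev_scores, dev_labels)

# Calibrated probabilities for new (hypothetical) test scores.
test_scores = np.array([-0.9, 0.3, 1.1]).reshape(-1, 1)
print(calibrator.predict_proba(test_scores)[:, 1])
```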