
Small difference in predicted score when loading a trained Booster object into XGBClassifier #11403


Open
protco opened this issue Apr 10, 2025 · 2 comments


@protco

protco commented Apr 10, 2025

Hi team,
We have a Booster trained with xgboost.train() and need to use it inside XGBClassifier() for score calibration. Here is the code. Without explicitly setting n_classes_ and classes_, CalibratedClassifierCV complains that the model is not pre-fitted; with these attributes set, the predicted score from XGBClassifier is slightly different from the booster's own. Do you know any possible reasons, and how can we make the scores match exactly?
```python
xgb_classifier = XGBClassifier()
xgb_classifier._Booster = xgb_booster
xgb_classifier.n_classes_ = 2
xgb_classifier.classes_ = np.array([0, 1])
```
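A self-contained sketch of this setup, with sklearn's breast cancer data standing in for our real data (which I can't share) and illustrative training parameters:

```python
import numpy as np
import xgboost
from xgboost import XGBClassifier
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import load_breast_cancer

X, y = load_breast_cancer(return_X_y=True)  # stand-in data
dtrain = xgboost.DMatrix(X, label=y)

# Pre-trained booster from xgboost.train(), as in our pipeline.
xgb_booster = xgboost.train(
    {"objective": "binary:logistic"}, dtrain, num_boost_round=50
)

xgb_classifier = XGBClassifier()
xgb_classifier._Booster = xgb_booster
xgb_classifier.n_classes_ = 2   # without these two attributes,
xgb_classifier.classes_ = np.array([0, 1])  # CalibratedClassifierCV raises "not fitted"

calibrated = CalibratedClassifierCV(xgb_classifier, cv="prefit")
calibrated.fit(X, y)
```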
XGBoost version: 1.5.0
Thank you!

@trivialfis
Member

Could you please:

  • Update to the latest XGBoost.
  • Provide a reproducible example if the issue persists with the latest version.

@protco
Author

protco commented Apr 14, 2025

Update: it turns out that with our pre-trained booster loaded into XGBClassifier, XGBClassifier._Booster.predict() returns scores identical to booster.predict(). So the difference we saw is actually between XGBClassifier._Booster.predict() and XGBClassifier.predict_proba(). The question is why the two produce different results.
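Concretely, the comparison is (reusing xgb_classifier and X from the snippet in my first comment; column 1 of predict_proba is P(y=1) for a binary model):

```python
import numpy as np
import xgboost

# Booster path: identical to xgb_booster.predict().
raw = xgb_classifier._Booster.predict(xgboost.DMatrix(X))

# Wrapper path: this is the one that disagrees on our models.
wrapped = xgb_classifier.predict_proba(X)[:, 1]

print("max abs diff:", np.abs(raw - wrapped).max())
```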

Thank you, Jiaming!
I tried xgboost 2.1.4 to load our pre-trained boosters, and the prediction difference between the booster and XGBClassifier remains the same (on average the XGBClassifier prediction is ~1-5% lower across 3 different models). Unfortunately I can't share the booster files. I have tried both 1.5.0 and 2.1.4 on a toy model using the breast cancer dataset, and the predictions match perfectly for both versions. So I am not sure why this difference only happens with our pre-trained models. Are there any possible reasons I can test?
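One difference I plan to rule out is early stopping metadata: as far as I understand, the sklearn wrapper slices predictions at best_iteration when the booster carries one, while Booster.predict() defaults to all boosted rounds. A sketch of that check, reusing xgb_classifier and X from above:

```python
import numpy as np
import xgboost

booster = xgb_classifier.get_booster()

# 1) Early stopping metadata: present only if the booster was trained
#    with early stopping; the wrapper honors it, the raw predict() does not.
print("best_iteration:", getattr(booster, "best_iteration", None))
print("boosted rounds:", booster.num_boosted_rounds())

# 2) If best_iteration is set, restrict the raw booster to the same trees
#    the wrapper uses and re-compare.
best = getattr(booster, "best_iteration", None)
if best is not None:
    sliced = booster.predict(xgboost.DMatrix(X), iteration_range=(0, best + 1))
    wrapped = xgb_classifier.predict_proba(X)[:, 1]
    print("max abs diff (sliced):", np.abs(sliced - wrapped).max())
```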
