Brier score for Binomial data? #886
-
Thanks a lot for the great package, it's super helpful! Right now it only works for Bernoulli trials, but when working with lots of data, I need to aggregate the trials into a Binomial distribution. I'd like to evaluate the distribution of forecasted Binomial rates à la Brier, but I can't tell whether, or how, that's doable right now. Thanks a lot in advance for your help!
-
Hi Alex, thanks for your question. To check that I understand it, I'm going to rephrase your problem. Please let me know whether I have it right, or correct me where I am misunderstanding. Let's take a particular forecast case. The setup is that you have (say) n trials, and the observation is the number of trials that resulted in a success. What needs to be forecast is the number of successes from the n trials. The forecast could take various forms. For example, you could issue a single real-valued forecast for the expected number of successes from the n trials. Or, for each k=0,1,...,n, you could forecast the probability of getting k successes, which is essentially the same as issuing a full predictive distribution. I think your forecast problem is the latter: your forecasts are predictive distributions on the space {0, 1, 2, ..., n} of possible outcomes, and you want to know how to score such a forecast against the observation. Is that correct?
-
Hi @rob-taggart, and thanks for your quick reply! Yes, exactly, I'm in the latter case (I have a hierarchical Bayesian model on the Binomial probabilities), so I wanna know how the posterior of the latent rate (the
Hi @AlexAndorra, I think what you are after is the equation in Appendix A.5 of this paper: https://www.jstatsoft.org/article/view/v090i12
We have plans to add this to scores via a scores-scoringrules interface. Scoringrules is a package with numpy (and jax, pytorch, tensorflow) backends, whereas our primary focus is on people working with xarray.
If you want to work with numpy (or those ML backends), then you can use this function https://frazane.github.io/scoringrules/api/crps/#scoringrules.crps_binomial
If you'd like an xarray implementation, then I can help you out!
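In the meantime, here is a minimal pure-Python sketch of the idea, not the scores or scoringrules implementation. It relies on the fact that, for a forecast on {0, 1, ..., n}, the CRPS is the sum over k of the squared difference between the forecast CDF F(k) and the step function 1{y <= k}. The function names below are hypothetical, and `crps_posterior_mixture` assumes you want to score the posterior predictive distribution, that is, the average of the Binomial CDFs over your posterior draws of the rate.

```python
from math import comb


def binomial_cdf(n, p):
    """Return [F(0), ..., F(n)] for a Binomial(n, p) forecast."""
    cdf, total = [], 0.0
    for k in range(n + 1):
        total += comb(n, k) * p**k * (1 - p) ** (n - k)
        cdf.append(total)
    return cdf


def crps_from_cdf(cdf, y):
    """CRPS of a forecast on {0, ..., n}, given its CDF values and the observed count y."""
    # Sum of squared gaps between the forecast CDF and the observation's step function.
    return sum((F - (1.0 if y <= k else 0.0)) ** 2 for k, F in enumerate(cdf))


def crps_binomial_sum(y, n, p):
    """CRPS of a single Binomial(n, p) forecast against y successes (hypothetical helper)."""
    return crps_from_cdf(binomial_cdf(n, p), y)


def crps_posterior_mixture(y, n, rate_draws):
    """CRPS of the posterior predictive: an equal-weight mixture of Binomial(n, p) over draws."""
    cdfs = [binomial_cdf(n, p) for p in rate_draws]
    mixture_cdf = [sum(column) / len(cdfs) for column in zip(*cdfs)]
    return crps_from_cdf(mixture_cdf, y)
```

With n=1 this reduces to the Brier score: `crps_binomial_sum(1, 1, p)` equals `(1 - p)**2`. For a single Binomial forecast the result should agree numerically with a closed-form implementation such as `scoringrules.crps_binomial`, though I have not benchmarked that here.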