Interpreting the probability of a customer being alive #1639

stochastic1 · 2025-04-22T18:20:29Z

I'm using BG/NBD, Pareto/NBD and Modified BG/NBD models to forecast a customer being alive in the future. My goal is to "validate" the accuracy of the forecast to the extent possible in a non-contractual setting where a customer may not know that they've churned.

Relevant time period
The models are trained on data aggregated to the week using rfm_summary(time_unit='W', observation_period_end='2024-05-31')
When I use model.expected_probability_alive() for some future point in time, can I interpret that prediction as being so many periods (weeks in this case) after the end of the training period given in rfm_summary? If the final date is May 31, 2024, does a 4-week prediction run through June 28, 2024? Or does the 4-week period begin at an internal date within the training data?
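The calendar arithmetic in the question can be checked with a minimal sketch. It assumes the horizon is counted in whole weeks from observation_period_end; the helper name horizon_end is hypothetical and not part of any library.

```python
from datetime import date, timedelta

# Last day of the training window, as passed to rfm_summary in the post.
observation_period_end = date(2024, 5, 31)


def horizon_end(start: date, weeks: int) -> date:
    """Return the calendar date `weeks` whole weeks after `start`."""
    return start + timedelta(weeks=weeks)


four_weeks_out = horizon_end(observation_period_end, 4)
print(four_weeks_out.isoformat())  # 2024-06-28
```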

Calibrating outcome to actual data
The output for probability of being alive is probabilistic. I've seen elsewhere that a probability of 0.3 is a good threshold: a customer is most likely churned when the probability of being alive at time t falls below 0.3. However, if I want to validate that a customer is alive, what is the best approach?

I've started by saying that a customer is alive at week 4 if they have transacted between week 1 and week 4. This makes sense given that the probability of a customer being alive is monotonically decreasing with respect to time. At the same time, if I use the metric of "has transacted by week N" to say a customer is alive, then I gain evidence of a customer being alive as N increases, even while the probability of a customer being alive at time N decreases. Also, if the probability of a customer being alive is also the probability that a customer has not churned, then a transaction after week N would also indicate that the customer is alive. Would it then make more sense to validate at time N against a customer who transacts by time N, or even by N+4 if 4 weeks is a relevant value to my business?
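The two labelling rules being compared can be sketched as follows. The holdout data, customer ids, and function names are all made up for illustration. Under the models' assumption that churn is permanent, a purchase after week N shows the customer was still alive at week N, whereas a purchase by week N only shows they were alive at the time of that purchase.

```python
from typing import Dict, List

# Hypothetical holdout data: customer id -> weeks (counted from the end of
# training) in which the customer transacted. All values are made up.
holdout_weeks: Dict[str, List[int]] = {
    "a": [2],        # buys early in the holdout, then goes quiet
    "b": [7],        # silent through week 4, buys later
    "c": [],         # never seen again
    "d": [1, 3, 6],  # buys throughout
}


def transacted_by(weeks: List[int], n: int) -> bool:
    """Rule 1: any purchase in weeks 1..n."""
    return any(1 <= w <= n for w in weeks)


def transacted_after(weeks: List[int], n: int, lookahead: int) -> bool:
    """Rule 2: any purchase in weeks n+1..n+lookahead."""
    return any(n < w <= n + lookahead for w in weeks)


n, lookahead = 4, 4
by_n = {c: transacted_by(w, n) for c, w in holdout_weeks.items()}
after_n = {c: transacted_after(w, n, lookahead) for c, w in holdout_weeks.items()}

print(by_n)     # customers "a" and "d" count as alive under rule 1
print(after_n)  # customers "b" and "d" count as alive under rule 2
```

Customers "a" and "b" are labelled differently by the two rules, which is the ambiguity the question is about.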

stochastic1 · 2025-04-22T21:35:30Z

Author

Here's a little color to add. I've calculated calibration curves and Brier scores to help me understand where this is fitting well or poorly.
I would expect that as the predicted probability increases, the percentage of positive (alive) customers should also increase.
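For readers who want to reproduce this kind of check, here is a minimal calibration-curve sketch in plain Python. The predictions and labels are made up, and the equal-width binning is one common choice, not necessarily the one used for the plots in this thread.

```python
from typing import List, Tuple


def calibration_curve(
    y_true: List[int], y_prob: List[float], n_bins: int = 5
) -> List[Tuple[float, float, int]]:
    """Return (mean predicted prob, observed alive rate, count) per non-empty bin."""
    sums = [[0.0, 0, 0] for _ in range(n_bins)]  # [sum of p, sum of y, count]
    for y, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        sums[idx][0] += p
        sums[idx][1] += y
        sums[idx][2] += 1
    return [(sp / n, sy / n, n) for sp, sy, n in sums if n > 0]


# Made-up predictions and alive labels, for illustration only.
probs = [0.05, 0.15, 0.35, 0.45, 0.55, 0.65, 0.85, 0.95]
alive = [0, 0, 0, 1, 0, 1, 1, 1]

curve = calibration_curve(alive, probs, n_bins=4)
for mean_p, alive_rate, count in curve:
    print(f"mean predicted={mean_p:.2f}  observed alive={alive_rate:.2f}  n={count}")
```

A well-calibrated model shows the observed alive rate rising with, and roughly equal to, the mean predicted probability in each bin.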

Here's a view of two mockup models: Model 1 is totally random with a Brier Score of 0.33, Model 2 is strongly correlated (by design) with a Brier Score of 0.15
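As a reference point for these numbers, the Brier score is the mean squared difference between the predicted probability and the 0/1 outcome, so lower is better. The sketch below uses simulated data (not the data from this thread) to show two baselines: uniformly random predictions, whose expected score is 1/3, and a constant prediction of 0.5, which scores exactly 0.25.

```python
import random
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


rng = random.Random(0)  # fixed seed so the simulation is repeatable
n = 20_000
outcomes = [rng.randint(0, 1) for _ in range(n)]  # simulated alive labels

random_probs = [rng.random() for _ in range(n)]   # unrelated to the outcomes
constant_probs = [0.5] * n                        # always predicts 50%

brier_random = brier_score(outcomes, random_probs)
brier_constant = brier_score(outcomes, constant_probs)

print(f"random predictions: {brier_random:.3f}")    # close to 1/3
print(f"constant 0.5:       {brier_constant:.3f}")  # 0.250
```

Any score above 0.25 is worse than always predicting 0.5, which is a useful yardstick when reading Brier scores for real data.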

Here are two of the outcomes for my data and these are typical, with Brier Scores around 0.45:


ColtAllen · 2025-04-22T23:01:50Z

Maintainer

When I use model.expected_probability_alive() for some future point in time, can I interpret that prediction as being so many periods (weeks in this case) after the end of the training period given in rfm_summary?

Prediction is based on the final date of the training data, unless you specify future_t periods into the future. Pareto/NBD has a parameter for this, and for the other models you can just augment the T variable.
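To illustrate what "augmenting T" does, here is a minimal sketch using the standard closed-form BG/NBD expression for P(alive), written in plain Python. It is not the library's implementation, and the parameter values and customer history are made up. Extending T by future_t amounts to treating the customer as observed for future_t more weeks with no further purchases.

```python
def bgnbd_p_alive(
    x: int, t_x: float, T: float, r: float, alpha: float, a: float, b: float
) -> float:
    """Closed-form BG/NBD P(alive) given frequency x, recency t_x and age T."""
    if x == 0:
        return 1.0  # BG/NBD only allows dropout immediately after a purchase
    ratio = ((alpha + T) / (alpha + t_x)) ** (r + x)
    return 1.0 / (1.0 + (a / (b + x - 1)) * ratio)


# Made-up parameter values and customer history, for illustration only.
params = dict(r=0.25, alpha=4.5, a=0.8, b=2.5)
x, t_x, T = 3, 20.0, 30.0  # 3 repeat purchases, last at week 20, observed 30 weeks

for future_t in (0, 4, 8):
    # "Augmenting T": add future_t weeks of observation with no new purchases.
    p = bgnbd_p_alive(x, t_x, T + future_t, **params)
    print(f"future_t={future_t}: P(alive)={p:.3f}")
```

With recency and frequency held fixed, the printed probability falls as future_t grows.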

Would it then make more sense to validate at time N against a customer who transacts by time N or even N+4 if 4 weeks is a relevant value to my business?

Pareto/NBD has a predictive method specifically for this. Note that the probability alive is monotonically decreasing over time, whereas expected purchases are monotonically increasing (albeit at a very low rate for an "inactive" customer).

