Median survival time prediction is larger than the true duration value #1414
I am using CoxPHFitter, WeibullAFTFitter, LogLogisticAFTFitter and LogNormalAFTFitter. All of these models have the predict_median() method to get the predicted median survival time (MST). What I have noticed is that the predicted MST is almost always larger than the true duration. Shouldn't the MST be nearly half of the true duration? When I compare the mean absolute error (MAE) between the predicted MST and the true durations, I am way off on average. I read an article saying that the MAE (as a performance measure) should only be computed on records where the event occurs, but even after filtering out the censored records, the MAE is still way off.

Is there any way to "correct" or "calibrate" the MST predictions so that they are closer to the true durations? I understand the MST is the time point at which a record has a 50/50 chance of surviving, so perhaps I need to calibrate the MST to 0.5 of the true duration.

Additionally, the performance as measured by concordance is phenomenal (99% on training and validation). How can the concordance be nearly perfect while the absolute values of the predictions (e.g. the MST) are so far off?

Any help is appreciated.
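For reference, a minimal sketch of the uncensored-only MAE computation described above, using the Rossi recidivism dataset that ships with lifelines (the column names week and arrest are specific to that dataset, not to the poster's data):

```python
import numpy as np
from lifelines import WeibullAFTFitter
from lifelines.datasets import load_rossi

rossi = load_rossi()
aft = WeibullAFTFitter().fit(rossi, duration_col="week", event_col="arrest")

pred_mst = aft.predict_median(rossi)   # predicted median survival time per row
uncensored = rossi["arrest"] == 1      # True where the event was observed

# MAE restricted to uncensored rows: a censored duration is only a lower
# bound on the true lifetime, so including it inflates the apparent error.
mae = np.mean(np.abs(pred_mst[uncensored] - rossi.loc[uncensored, "week"]))
print(f"MAE on uncensored records: {mae:.2f} weeks")
```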
-
No, I would expect it to be larger often (though not always). This is because censored individuals often have a longer duration than their uncensored counterparts (for example, in a five-year mortality study, uncensored subjects die in less than 5 years, but every censored subject lives longer than 5 years). So a regression model summarizes the data such that censored subjects get a slightly lower predicted median lifetime than their actual one, and uncensored subjects get a slightly higher predicted lifetime than their actual one (think of this as "regression to the mean"). The toy simulation below illustrates the effect.
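A hedged illustration of that point (not from the thread; the distribution parameters and the censoring time t = 5 are arbitrary): simulate Weibull lifetimes with administrative censoring and compare the fitted median against the durations actually observed for events.

```python
import numpy as np
from lifelines import WeibullFitter

rng = np.random.default_rng(0)
latent = rng.weibull(1.5, size=5_000) * 6.0   # true (possibly unseen) lifetimes
durations = np.minimum(latent, 5.0)           # study ends at t = 5
events = latent <= 5.0                        # False => censored at t = 5

wf = WeibullFitter().fit(durations, event_observed=events)
print("fitted median survival time:", wf.median_survival_time_)
print("mean duration among uncensored:", durations[events].mean())
# The fitted median exceeds the typical uncensored duration because the model
# accounts for the subjects still alive when the study ended.
```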
-
It's just more biased (in some sense), because you are only evaluating on a (biased) subset of your data. An alternative to RMST and the c-index is a log-likelihood measure such as AIC, or the model's score. Those don't have easy interpretations, though.
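A sketch of that suggestion, again assuming the bundled Rossi dataset and an arbitrary train/test split: lifelines exposes AIC_ on its parametric regression fitters and a score() method that can return the average out-of-sample log-likelihood, and neither requires discarding the censored rows.

```python
from lifelines import LogNormalAFTFitter, WeibullAFTFitter
from lifelines.datasets import load_rossi

rossi = load_rossi()
train, test = rossi.iloc[:300], rossi.iloc[300:]

# Lower AIC and higher average test log-likelihood indicate a better fit.
for Fitter in (WeibullAFTFitter, LogNormalAFTFitter):
    model = Fitter().fit(train, duration_col="week", event_col="arrest")
    print(
        Fitter.__name__,
        "AIC:", round(model.AIC_, 1),
        "avg test log-likelihood:", round(model.score(test, scoring_method="log_likelihood"), 3),
    )
```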