Description
I am experiencing an issue with the ggsurvplot function from the survminer package in R. When I plot survival curves with ggsurvplot from a survfit object created with the newdata argument, the function treats the number of observations as the number of levels and fails with an error. When the survfit object is created without the newdata argument, the function does not recognize the levels of the factor variable and fails with a different error.
Steps to Reproduce:
Data Preparation:
Clean and prepare the dataset ensuring the factor variable is correctly defined.
d.dat.tot.clean <- d.dat.tot[d.dat.tot$DEAD_OR_ALIVE < 2, ]
d.dat.tot.clean$GENOTYPE.111758446 <- as.factor(d.dat.tot.clean$GENOTYPE.111758446)
Cox Model Fitting:
Fit a Cox proportional hazards model.
cox_model_single_111758446 <- coxph(Surv(DONOR_SURVIVAL_TIME, DEAD_OR_ALIVE) ~ GENOTYPE.111758446, data = d.dat.tot.clean)
Create Survival Curves:
Create survival curves using the survfit function.
surv_fit <- survfit(cox_model_single_111758446)
Plot Survival Curves:
Attempt to plot the survival curves using ggsurvplot.
cox_plot_single_111758446 <- ggsurvplot(
  surv_fit,
  data = d.dat.tot.clean,
  pval = TRUE,
  conf.int = TRUE,
  risk.table = TRUE,
  legend.title = "Genotype",
  legend.labs = levels(d.dat.tot.clean$GENOTYPE.111758446),
  xlab = "Time to Last Follow-up, mo",
  ylab = "Cumulative Survival, %",
  ggtheme = theme_minimal(),
  palette = "set2"
)
print(cox_plot_single_111758446)
Observed Behavior:
With newdata Argument:
cox_plot_single_111758446 <- ggsurvplot(
  survfit(cox_model_single_111758446, newdata = d.dat.tot.clean),
  data = d.dat.tot.clean,
  pval = TRUE,
  conf.int = TRUE,
  risk.table = TRUE,
  legend.title = "Genotype",
  legend.labs = levels(d.dat.tot.clean$GENOTYPE.111758446),
  xlab = "Time to Last Follow-up, mo",
  ylab = "Cumulative Survival, %",
  ggtheme = theme_minimal(),
  palette = "set2"
)
This results in the error:
Error in ggsurvplot_df(d, fun = fun, color = color, palette = palette, :
The length of legend.labs should be 236
(236 is the number of cases I have.)
Without newdata Argument:
The function fails to recognize the levels of the factor variable correctly and returns:
Error in ggsurvplot_df(d, fun = fun, color = color, palette = palette, :
The length of legend.labs should be 1
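A possible explanation for the two numbers, offered as a sketch rather than a confirmed diagnosis: survfit() on a coxph object returns a single curve when newdata is omitted and one curve per row of newdata otherwise, and ggsurvplot seems to expect one legend label per curve. The check below uses only the survival package and simulated data; the names d, fit, and n_curves are hypothetical and not part of the original report.

```r
library(survival)

# Simulated data in the shape of the report (column names are illustrative)
set.seed(123)
d <- data.frame(
  time     = rexp(100, 0.1),
  status   = sample(0:1, 100, replace = TRUE),
  genotype = factor(sample(c("A/A", "G/A", "G/G"), 100, replace = TRUE))
)

fit <- coxph(Surv(time, status) ~ genotype, data = d)

# Count the curves in a survfit object: $surv is a plain vector for a
# single curve and a matrix with one column per curve otherwise
n_curves <- function(sf) NCOL(sf$surv)

n_default <- n_curves(survfit(fit))               # no newdata: one curve
n_all     <- n_curves(survfit(fit, newdata = d))  # one curve per row of d

print(c(default = n_default, all_rows = n_all, n_obs = nrow(d)))
```

If this reading is right, the "should be 1" and "should be 236" messages reflect the number of curves in the survfit object rather than a problem with the factor levels themselves.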
Expected Behavior:
The function should correctly interpret the levels of the factor variable and plot the survival curves without error.
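For comparison, here is a minimal sketch of two ways to obtain one curve per genotype, assuming that is the intended plot. The first passes survfit() a newdata frame with one row per factor level; the second fits Kaplan-Meier curves stratified by genotype, which is the kind of object pval and risk.table are usually used with. The names d, fit, nd, sf_levels, and km_fit are hypothetical, the data are simulated, and the plotting call is guarded so the block still runs without survminer installed.

```r
library(survival)

# Simulated data in the shape of the report (column names are illustrative)
set.seed(123)
d <- data.frame(
  time     = rexp(100, 0.1),
  status   = sample(0:1, 100, replace = TRUE),
  genotype = factor(sample(c("A/A", "G/A", "G/G"), 100, replace = TRUE))
)

fit <- coxph(Surv(time, status) ~ genotype, data = d)

# Option 1: model-based curves, one row of newdata per genotype level
nd <- data.frame(
  genotype = factor(levels(d$genotype), levels = levels(d$genotype))
)
sf_levels <- survfit(fit, newdata = nd)

# Option 2: Kaplan-Meier curves stratified by genotype
km_fit <- survfit(Surv(time, status) ~ genotype, data = d)

# Plot the Kaplan-Meier fit only if survminer is available
if (requireNamespace("survminer", quietly = TRUE)) {
  km_plot <- survminer::ggsurvplot(
    km_fit,
    data = d,
    pval = TRUE,
    risk.table = TRUE,
    legend.title = "Genotype",
    legend.labs = levels(d$genotype)
  )
}
```

Both objects contain three curves here, so a legend.labs vector of length three matches. Whether pval and risk.table are supported for the Cox-based object (sf_levels) is not something this sketch verifies.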
Environment:
R version: R 4.2.3 GUI 1.79 High Sierra build (8198)
survminer version: 0.4.9.999
survival package version: 3.7.0
Operating system: macOS 10.15.7
Any help or guidance on how to resolve this issue would be greatly appreciated.
Thank you so much!!!!
Reproducible example:
# Load required packages
library(survival)
library(survminer)

# Sample reproducible data
set.seed(123)
d.dat.tot <- data.frame(
  DONOR_SURVIVAL_TIME = rexp(100, 0.1),
  DEAD_OR_ALIVE = sample(0:1, 100, replace = TRUE),
  GENOTYPE.111758446 = sample(c("A/A", "G/A", "G/G"), 100, replace = TRUE)
)
# Data preparation
d.dat.tot.clean <- d.dat.tot[d.dat.tot$DEAD_OR_ALIVE < 2, ]
d.dat.tot.clean$GENOTYPE.111758446 <- as.factor(d.dat.tot.clean$GENOTYPE.111758446)
# Cox model fitting
cox_model_single_111758446 <- coxph(Surv(DONOR_SURVIVAL_TIME, DEAD_OR_ALIVE) ~ GENOTYPE.111758446, data = d.dat.tot.clean)
# Survival curves creation
surv_fit <- survfit(cox_model_single_111758446)
# Plotting survival curves
cox_plot_single_111758446 <- ggsurvplot(
  surv_fit,
  data = d.dat.tot.clean,
  pval = TRUE,
  conf.int = TRUE,
  risk.table = TRUE,
  legend.title = "Genotype",
  legend.labs = levels(d.dat.tot.clean$GENOTYPE.111758446),
  xlab = "Time to Last Follow-up, mo",
  ylab = "Cumulative Survival, %",
  ggtheme = theme_minimal(),
  palette = "set2"
)
print(cox_plot_single_111758446)