Skip to content

refactor geom sd and fix custom stat order #1393

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
Feb 24, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,14 +2,17 @@

### Enhancements
* Refactored `count_abnormal()`, `count_abnormal_by_baseline()`, `count_abnormal_by_marked()`, `count_abnormal_by_worst_grade()`, and `count_abnormal_lab_worsen_by_baseline()` to work without `make_afun()`.
* Added `geom_sd` and `geom_mean_sd` to `s_summary()` default available statistics.

### Bug Fixes
* Fixed bug in `a_count_patients_with_flags()` preventing select custom label and indentation specification formats from being applied.
* Fixed bug in `tabulate_rsp_subgroups` and `tabulate_survival_subgroups` preventing the `pct` option from having an effect when adding a risk difference column.
* Fixed bug with the order of `.stats` when adding custom statistical functions.

### Miscellaneous
* Removed internal function `ungroup_stats()` and replaced its usage with the `get_*_from_stats()` functions.
* Began deprecation of the unused `table_names` argument to `count_abnormal_lab_worsen_by_baseline()`.
* Added warnings for `geom_mean` statistical output.

# tern 0.9.7

Expand Down
30 changes: 22 additions & 8 deletions R/analyze_variables.R
Original file line number Diff line number Diff line change
Expand Up @@ -234,10 +234,25 @@ s_summary.numeric <- function(x, control = control_analyze_vars(), ...) {

y$cv <- c("cv" = unname(y$sd) / unname(y$mean) * 100)

# Convert negative values to NA for log calculation.
# Geometric Mean - Convert negative values to NA for log calculation.
geom_verbose <- args_list[["geom_verbose"]] %||% FALSE # Additional info if requested
checkmate::assert_flag(geom_verbose)
x_no_negative_vals <- x
if (identical(x_no_negative_vals, numeric())) {
x_no_negative_vals <- NA
}
x_no_negative_vals[x_no_negative_vals <= 0] <- NA
if (geom_verbose) {
if (any(x <= 0)) {
warning("Negative values were converted to NA for calculation of the geometric mean.")
}
if (all(is.na(x_no_negative_vals))) {
warning("Since all values are negative or NA, the geometric mean is NA.")
}
}
y$geom_mean <- c("geom_mean" = exp(mean(log(x_no_negative_vals), na.rm = FALSE)))
y$geom_sd <- c("geom_sd" = geom_sd <- exp(sd(log(x_no_negative_vals), na.rm = FALSE)))
y$geom_mean_sd <- c(y$geom_mean, y$geom_sd)
geom_mean_ci <- stat_mean_ci(x, conf_level = control$conf_level, na.rm = FALSE, gg_helper = FALSE, geom_mean = TRUE)
y$geom_mean_ci <- formatters::with_label(geom_mean_ci, paste("Geometric Mean", f_conf_level(control$conf_level)))

Expand Down Expand Up @@ -549,7 +564,7 @@ a_summary <- function(x,

# Check if there are user-defined functions
default_and_custom_stats_list <- .split_std_from_custom_stats(.stats)
.stats <- default_and_custom_stats_list$default_stats
.stats <- default_and_custom_stats_list$all_stats # just the labels of stats
custom_stat_functions <- default_and_custom_stats_list$custom_stats

# Correction of the pval indication if it is numeric or counts
Expand Down Expand Up @@ -588,12 +603,11 @@ a_summary <- function(x,

# Fill in with stats defaults if needed
met_grp <- paste0(c("analyze_vars", type), collapse = "_")
.stats <- c(
get_stats(met_grp,
stats_in = .stats,
add_pval = dots_extra_args$compare_with_ref_group %||% FALSE
),
names(custom_stat_functions) # Additional stats from custom functions
.stats <- get_stats(
met_grp,
stats_in = .stats,
custom_stats_in = names(custom_stat_functions),
add_pval = dots_extra_args$compare_with_ref_group %||% FALSE
)

x_stats <- x_stats[.stats]
Expand Down
9 changes: 5 additions & 4 deletions R/count_values.R
Original file line number Diff line number Diff line change
Expand Up @@ -118,7 +118,7 @@ a_count_values <- function(x,

# Check for user-defined functions
default_and_custom_stats_list <- .split_std_from_custom_stats(.stats)
.stats <- default_and_custom_stats_list$default_stats
.stats <- default_and_custom_stats_list$all_stats # just the labels of stats
custom_stat_functions <- default_and_custom_stats_list$custom_stats

# Add extra parameters to the s_* function
Expand All @@ -139,9 +139,10 @@ a_count_values <- function(x,
)

# Fill in formatting defaults
.stats <- c(
get_stats("analyze_vars_counts", stats_in = .stats),
names(custom_stat_functions) # Additional stats from custom functions
.stats <- get_stats(
"analyze_vars_counts",
stats_in = .stats,
custom_stats_in = names(custom_stat_functions),
)
.formats <- get_formats_from_stats(.stats, .formats)
.labels <- get_labels_from_stats(.stats, .labels)
Expand Down
9 changes: 5 additions & 4 deletions R/summarize_change.R
Original file line number Diff line number Diff line change
Expand Up @@ -67,7 +67,7 @@ a_change_from_baseline <- function(df,

# Check if there are user-defined functions
default_and_custom_stats_list <- .split_std_from_custom_stats(.stats)
.stats <- default_and_custom_stats_list$default_stats
.stats <- default_and_custom_stats_list$all_stats # just the labels of stats
custom_stat_functions <- default_and_custom_stats_list$custom_stats

# Adding automatically extra parameters to the statistic function (see ?rtables::additional_fun_params)
Expand All @@ -88,9 +88,10 @@ a_change_from_baseline <- function(df,
)

# Fill in with formatting defaults if needed
.stats <- c(
get_stats("analyze_vars_numeric", stats_in = .stats),
names(custom_stat_functions) # Additional stats from custom functions
.stats <- get_stats(
"analyze_vars_numeric",
stats_in = .stats,
custom_stats_in = names(custom_stat_functions),
)
.formats <- get_formats_from_stats(.stats, .formats)
.labels <- get_labels_from_stats(.stats, .labels)
Expand Down
27 changes: 20 additions & 7 deletions R/utils_default_stats_formats_labels.R
Original file line number Diff line number Diff line change
Expand Up @@ -33,7 +33,9 @@ NULL
#' @param method_groups (`character`)\cr indicates the statistical method group (`tern` analyze function)
#' to retrieve default statistics for. A character vector can be used to specify more than one statistical
#' method group.
#' @param stats_in (`character`)\cr statistics to retrieve for the selected method group.
#' @param stats_in (`character`)\cr statistics to retrieve for the selected method group. If custom statistical
#' functions are used, `stats_in` needs to have them in too.
#' @param custom_stats_in (`character`)\cr custom statistics to add to the default statistics.
#' @param add_pval (`flag`)\cr should `"pval"` (or `"pval_counts"` if `method_groups` contains
#' `"analyze_vars_counts"`) be added to the statistical methods?
#'
Expand All @@ -57,9 +59,11 @@ NULL
#' get_stats(c("count_occurrences", "analyze_vars_counts"))
#'
#' @export
get_stats <- function(method_groups = "analyze_vars_numeric", stats_in = NULL, add_pval = FALSE) {
get_stats <- function(method_groups = "analyze_vars_numeric",
stats_in = NULL, custom_stats_in = NULL, add_pval = FALSE) {
checkmate::assert_character(method_groups)
checkmate::assert_character(stats_in, null.ok = TRUE)
checkmate::assert_character(custom_stats_in, null.ok = TRUE)
checkmate::assert_flag(add_pval)

# Default is still numeric
Expand All @@ -82,6 +86,9 @@ get_stats <- function(method_groups = "analyze_vars_numeric", stats_in = NULL, a
out <- unique(c(out, out_tmp))
}

# Add custom stats
out <- c(out, custom_stats_in)

# If you added pval to the stats_in you certainly want it
if (!is.null(stats_in) && any(grepl("^pval", stats_in))) {
stats_in_pval_value <- stats_in[grepl("^pval", stats_in)]
Expand Down Expand Up @@ -157,7 +164,7 @@ get_stat_names <- function(stat_results, stat_names_in = NULL) {
if (is.null(nm)) {
nm <- rep(NA_character_, length(si)) # no statistical names
}
return(nm)
nm
})

# Modify some with custom stat names
Expand All @@ -172,16 +179,18 @@ get_stat_names <- function(stat_results, stat_names_in = NULL) {

# Utility function used to separate custom stats (user-defined functions) from defaults
.split_std_from_custom_stats <- function(stats_in) {
out <- list(default_stats = NULL, custom_stats = NULL)
out <- list(default_stats = NULL, custom_stats = NULL, all_stats = NULL)
if (is.list(stats_in)) {
is_custom_fnc <- sapply(stats_in, is.function)
checkmate::assert_list(stats_in[is_custom_fnc], types = "function", names = "named")
out[["custom_stats"]] <- stats_in[is_custom_fnc]
out[["default_stats"]] <- unlist(stats_in[!is_custom_fnc])
all_stats <- names(stats_in) # to keep the order
all_stats[!is_custom_fnc] <- out[["default_stats"]]
out[["all_stats"]] <- all_stats
} else {
out[["default_stats"]] <- stats_in
out[["default_stats"]] <- out[["all_stats"]] <- stats_in
}

out
}

Expand Down Expand Up @@ -500,7 +509,7 @@ tern_default_stats <- list(
analyze_vars_numeric = c(
"n", "sum", "mean", "sd", "se", "mean_sd", "mean_se", "mean_ci", "mean_sei", "mean_sdi", "mean_pval",
"median", "mad", "median_ci", "quantiles", "iqr", "range", "min", "max", "median_range", "cv",
"geom_mean", "geom_mean_ci", "geom_cv",
"geom_mean", "geom_sd", "geom_mean_sd", "geom_mean_ci", "geom_cv",
"median_ci_3d",
"mean_ci_3d", "geom_mean_ci_3d"
),
Expand Down Expand Up @@ -580,6 +589,8 @@ tern_default_formats <- c(
median_range = "xx.x (xx.x - xx.x)",
cv = "xx.x",
geom_mean = "xx.x",
geom_sd = "xx.x",
geom_mean_sd = "xx.x (xx.x)",
geom_mean_ci = "(xx.xx, xx.xx)",
geom_mean_ci_3d = "xx.xx (xx.xx - xx.xx)",
geom_cv = "xx.x",
Expand Down Expand Up @@ -631,6 +642,8 @@ tern_default_labels <- c(
median_range = "Median (Min - Max)",
cv = "CV (%)",
geom_mean = "Geometric Mean",
geom_sd = "Geometric SD",
geom_mean_sd = "Geometric Mean (SD)",
geom_mean_ci = "Geometric Mean 95% CI",
geom_mean_ci_3d = "Geometric Mean (95% CI)",
geom_cv = "CV % Geometric Mean",
Expand Down
2 changes: 1 addition & 1 deletion man/analyze_variables.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion man/compare_variables.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

6 changes: 5 additions & 1 deletion man/default_stats_formats_labels.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion man/summarize_change.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

70 changes: 57 additions & 13 deletions tests/testthat/_snaps/analyze_variables.md
Original file line number Diff line number Diff line change
Expand Up @@ -117,7 +117,15 @@

$geom_mean
geom_mean
NaN
NA

$geom_sd
geom_sd
NA

$geom_mean_sd
geom_mean geom_sd
NA NA

$geom_mean_ci
mean_ci_lwr mean_ci_upr
Expand All @@ -131,7 +139,7 @@

$geom_mean_ci_3d
geom_mean mean_ci_lwr mean_ci_upr
NaN NA NA
NA NA NA
attr(,"label")
[1] "Geometric Mean (95% CI)"

Expand Down Expand Up @@ -257,6 +265,14 @@
geom_mean
1

$geom_sd
geom_sd
NA

$geom_mean_sd
geom_mean geom_sd
1 NA

$geom_mean_ci
mean_ci_lwr mean_ci_upr
NA NA
Expand Down Expand Up @@ -395,6 +411,14 @@
geom_mean
NA

$geom_sd
geom_sd
NA

$geom_mean_sd
geom_mean geom_sd
NA NA

$geom_mean_ci
mean_ci_lwr mean_ci_upr
NA NA
Expand Down Expand Up @@ -533,6 +557,14 @@
geom_mean
1.414214

$geom_sd
geom_sd
1.632527

$geom_mean_sd
geom_mean geom_sd
1.414214 1.632527

$geom_mean_ci
mean_ci_lwr mean_ci_upr
0.01729978 115.60839614
Expand Down Expand Up @@ -671,6 +703,14 @@
geom_mean
4.842534

$geom_sd
geom_sd
2.252326

$geom_mean_sd
geom_mean geom_sd
4.842534 2.252326

$geom_mean_ci
mean_ci_lwr mean_ci_upr
2.456211 9.547283
Expand Down Expand Up @@ -1388,11 +1428,13 @@
20 median_range 0.3 (-0.8 - 1.6) 0 Median (Min - Max)
21 cv 590.4 0 CV (%)
22 geom_mean NA 0 Geometric Mean
23 geom_mean_ci NA 0 Geometric Mean 95% CI
24 geom_cv NA 0 CV % Geometric Mean
25 median_ci_3d 0.26 (-0.82 - 0.74) 0 Median (95% CI)
26 mean_ci_3d 0.13 (-0.43 - 0.69) 0 Mean (95% CI)
27 geom_mean_ci_3d NA 0 Geometric Mean (95% CI)
23 geom_sd NA 0 Geometric SD
24 geom_mean_sd NA 0 Geometric Mean (SD)
25 geom_mean_ci NA 0 Geometric Mean 95% CI
26 geom_cv NA 0 CV % Geometric Mean
27 median_ci_3d 0.26 (-0.82 - 0.74) 0 Median (95% CI)
28 mean_ci_3d 0.13 (-0.43 - 0.69) 0 Mean (95% CI)
29 geom_mean_ci_3d NA 0 Geometric Mean (95% CI)

---

Expand Down Expand Up @@ -1523,12 +1565,14 @@
20 median_range 5.0 (3.0 - 5.9) 0 Median (Min - Max)
21 cv 19.6 0 CV (%)
22 geom_mean 4.8 0 Geometric Mean
23 geom_mean_ci (4.07, 5.58) 0 Geometric Mean 95% CI
24 geom_cv 22.3 0 CV % Geometric Mean
25 median_ci_3d 5.01 (3.53 - 5.78) 0 Median (95% CI)
26 mean_ci_3d 4.87 (4.18 - 5.55) 0 Mean (95% CI)
27 geom_mean_ci_3d 4.77 (4.07 - 5.58) 0 Geometric Mean (95% CI)
28 pval <0.0001 0 p-value (t-test)
23 geom_sd 1.2 0 Geometric SD
24 geom_mean_sd 4.8 (1.2) 0 Geometric Mean (SD)
25 geom_mean_ci (4.07, 5.58) 0 Geometric Mean 95% CI
26 geom_cv 22.3 0 CV % Geometric Mean
27 median_ci_3d 5.01 (3.53 - 5.78) 0 Median (95% CI)
28 mean_ci_3d 4.87 (4.18 - 5.55) 0 Mean (95% CI)
29 geom_mean_ci_3d 4.77 (4.07 - 5.58) 0 Geometric Mean (95% CI)
30 pval <0.0001 0 p-value (t-test)

---

Expand Down
3 changes: 2 additions & 1 deletion tests/testthat/_snaps/compare_variables.md
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,8 @@
[13] "median" "mad" "median_ci" "median_ci_3d"
[17] "quantiles" "iqr" "range" "min"
[21] "max" "median_range" "cv" "geom_mean"
[25] "geom_mean_ci" "geom_cv" "geom_mean_ci_3d" "pval"
[25] "geom_sd" "geom_mean_sd" "geom_mean_ci" "geom_cv"
[29] "geom_mean_ci_3d" "pval"

# s_compare for numeric does not give p-value when not at least 2 values in each group

Expand Down
Loading
Loading