Skip to content

Conversation

tgerke
Copy link
Contributor

@tgerke tgerke commented May 28, 2023

Generally making way through the code base. I'll open the PR and do this iteratively, so that you can review changes per file as we go.

@stopsack stopsack changed the base branch from master to dev June 2, 2023 12:07
@stopsack
Copy link
Owner

stopsack commented Jun 2, 2023

Changed base branch to dev, which is now up to speed with master again.

Copy link
Owner

@stopsack stopsack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hi @tgerke - thanks a lot for the initial rounds of edits! See comments inline.

library(risks) # provides riskratio(), riskdiff(), postestimation functions
library(dplyr) # For data handling
library(broom) # For tidy() model summaries
data(breastcancer)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we say somewhere what package breastcancer comes from?

```{r allmodels2}
tidy(fit_all) %>%
select(-statistic, -p.value) %>%
tidy(fit_all) |>
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure about introducing a dependency on R 4.1. I still see scientists use 4.0 and earlier.

data <- data |>
dplyr::mutate(.clusterid = dplyr::row_number()) |>
dplyr::rename(outc = dplyr::one_of(!!yvar)) |>
tidyr::uncount(outc + 1) |>
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The see the elegance in this approach! However, in the spirit of my comment in #10, it would be nice not to rely on tidyverse functions down the road. Let us focus on testing and bug fixes for now.

dplyr::mutate(outc = 0) %>%
dplyr::rename(!!yvar := "outc"))

data <- data |>
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see earlier comment about base R pipe

yvar <- as.character(all.vars(formula)[1])

#TODO the following assumes that outc is coded as 0/1. We should throw an
# error when this is not true
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

100%! How about here:

risks/R/estimate_risk.R

Lines 216 to 217 in f48f0a0

...) {
implausible <- 0.99999

...
)
#TODO seeing a pattern: let's make returning a non-converged object
# an internal function that can be used in all cases like this
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right - such a function just exists under different names because it returns slightly different object types depending on what model was supposed to be fit.

# Exception handlers
possibly_estimate_poisson <- ext_possibly(
.f = estimate_poisson,
otherwise = return_failure(family = list(family = "poisson"),
classname = "robpoisson"))
possibly_estimate_duplicate <- ext_possibly(
.f = estimate_duplicate,
otherwise = return_failure(family = list(family = "binomial"),
classname = "duplicate"))
possibly_estimate_glm <- ext_possibly(
.f = estimate_glm,
otherwise = return_failure(family = list(family = "binomial"),
classname = NULL))
possibly_estimate_glm_startp <- ext_possibly(
.f = estimate_glm,
otherwise = return_failure(family = list(family = "binomial"),
classname = "glm_startp"))
possibly_estimate_glm_startd <- ext_possibly(
.f = estimate_glm,
otherwise = return_failure(family = list(family = "binomial"),
classname = "glm_startd"))
possibly_estimate_logbin <- ext_possibly(
.f = estimate_logbin,
otherwise = return_failure(family = list(family = "binomial", link = "log"),
classname = "logbin"))
possibly_estimate_addreg <- ext_possibly(
.f = estimate_addreg,
otherwise = return_failure(family = list(family = "binomial", link = "identity"),
classname = "addreg"))
possibly_estimate_logistic <- ext_possibly(
.f = estimate_logistic,
otherwise = return_failure(family = list(family = "binomial", link = "logit"),
classname = "logistic"))
possibly_estimate_margstd_boot <- ext_possibly(
.f = estimate_margstd_boot,
otherwise = return_failure(family = list(family = "binomial", link = "logit"),
classname = "margstd_boot"))
possibly_estimate_margstd_delta <- ext_possibly(
.f = estimate_margstd_delta,
otherwise = return_failure(family = list(family = "binomial", link = "logit"),
classname = "margstd_delta"))

# TODO the below else is a great example of why they are best avoided
# in favor of explicit "if" conditions. I needed to jump way up to find
# that this is the "else" that happens when estimand is not equal to "rr".
# What else can it be? Let's just spell it clearly here instead of else.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Either "rr" or "rd".

)
}

#TODO I think duplicate is only valid for RRs? If so we should add a check
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happening here already:

risks/R/estimate_risk.R

Lines 182 to 195 in f48f0a0

riskdiff <- function(
formula,
data,
approach = c(
"auto",
"all",
"robpoisson",
"glm",
"glm_startp",
"glm_cem",
"glm_cem_startp",
"margstd_boot",
"margstd_delta",
"legacy"),

and

risks/R/estimate_risk.R

Lines 223 to 234 in f48f0a0

if(link == "log")
possible_approaches <- as.character(as.list(
args(risks::riskratio))$approach)[-1]
else
possible_approaches <- as.character(as.list(
args(risks::riskdiff))$approach)[-1]
if(!(approach[1] %in% possible_approaches))
stop(paste0(
"Approach '", approach[1], "' is not implemented. ",
"Available are: ",
paste(possible_approaches, sep = ", ", collapse = ", "),
"."))

#TODO I think duplicate is only valid for RRs? If so we should add a check
if(approach[1] == "duplicate") {
#TODO I think this is supposed to be assigning to the object "fit" or did I
# lose something in the refactor?
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

risks/R/estimate_risk.R

Lines 236 to 237 in f48f0a0

fit <- switch(
EXPR = approach[1],

if(approach[1] == "glm_cem") {
#TODO why not just require logbin and addreg as imports? They seem pretty
# fundamental to key functions in this package. Then these error messages could
# be removed
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants