Skip to content

Figure 13.10 & 14.3 (a) potentially misleading; given values of arsenic used #69

@shane-kercheval

Description

@shane-kercheval

Figures 13.10 and 14.3 (a) show the probability of switching wells as a function of distance and a given level of arsenic.

The values of arsenic used are 0.5 and 1.0.

The minimum value of arsenic in the data is 0.51, so perhaps 0.5 is close enough.

But the value of 1 is something like the 35% percentile in the data. I'm not sure this is a good representation, and seems arbitrary, but perhaps I'm missing something.

The median value of arsenic is 1.3.

Plotting the min/median/max values of arsenic (or even something like ~95% percent, which is ~3.79), I think provide better insight.

The values 0.5-1 of arsenic used in the graph is on the lower levels and so to say, in the book, "the interaction is small in the range of most of the data" seem misleading because we're only looking at lower levels of arsenic.

(graph b uses a distance of 0 to 50, which seems more reasonable because its covering the "closest" 65% of data (median dist is 36.8), and 50 meters is a nice conceptual value; although seems like a min/medium/max approach could benefit this graph as well)

Here's a screenshot at different levels.

Screen Shot 2020-12-23 at 12 24 38 PM

code below, along with the non-interaction model

library(rstan)
library(rstanarm)

set.seed(2)
model_interaction <- stan_glm(switch ~ dist100*arsenic, family=binomial(link="logit"), data=wells, refresh=0)
model_no_interaction <- stan_glm(switch ~ dist100+arsenic, family=binomial(link="logit"), data=wells, refresh=0)

jitter_binary <- function(a, jitt=.05){
  a + (1-2*a)*runif(length(a),0,jitt)
}
predict_switch <- function(.model, .dist100, .arsenic) {
    predict(.model,
            newdata = data.frame(dist100=.dist100, arsenic=.arsenic),
            type='response')
}

plot_prob_swiching_given_distance <- function(.model) {

    plot(wells$dist, jitter_binary(wells$switch), xlim=c(0, max(wells$dist)))
    for(arsenic_quantile in c(0, 0.5, 0.95, 1)) {
        arsenic_value <- quantile(wells$arsenic, arsenic_quantile)
        curve(predict_switch(.model, .dist100=x/100, .arsenic = arsenic_value), add=TRUE, col='red')
        
        text_y <- predict_switch(.model, .dist100=0, .arsenic = arsenic_value)
        text(.25, text_y, paste("if arsenic = ", arsenic_value), adj=0, cex=.8, col = 'red')
    }
    
}
plot_prob_swiching_given_distance(model_no_interaction)
plot_prob_swiching_given_distance(model_interaction)

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions