Open
Hi, it'd be great if, when one of the variables doesn't have enough observations, the stat calculation left an NA for it and continued with the other variables instead of stopping.
For example, this is my data:
> mydata <- structure(list(pops = structure(c(1L, 1L, 1L, 1L, 2L, 1L, 1L,
1L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L,
2L, 1L, 1L, 2L, 2L, 1L), levels = c("MEP", "MDSC"), class = "factor"),
condition = c("a", "a", "a", "a", "a", "a", "a", "b", "b",
"b", "b", "b", "b", "b", "b", "a", "a", "a", "a", "a", "a",
"a", "b", "b", "b", "b", "b", "b", "b"), timepoint = c("BL",
"ES", "ES", "BL", "ES", "ES", "BL", "ES", "BL", "ES", "BL",
"ES", "BL", "BL", "ES", "BL", "BL", "BL", "ES", "BL", "ES",
"ES", "BL", "ES", "ES", "BL", "BL", "ES", "ES"), value = c(0.00467886005954913,
0.000258531540847983, 0.00539083557951483, 0.00479616306954436,
0.000789265982636148, 0.0513022888713496, 0.00503959683225342,
0.464576962283384, 0.0803300043421624, 0.0421768707482993,
0.336828309305374, 0.000272108843537415, 0.00235910878112713,
0.00208768267223382, 0.00273224043715847, 0.0233545647558386,
0.0159453302961276, 0.00251151109250733, 0.000471698113207547,
0.00171969045571797, 0.0289855072463768, 0.00480769230769231,
0.612244897959184, 0.000175162024873008, 0.298125766333859,
0.157004830917874, 0.14975845410628, 0.435387673956262, 0.0387673956262425
)), row.names = c(12L, 48L, 77L, 107L, 111L, 130L, 159L,
171L, 200L, 229L, 233L, 241L, 249L, 279L, 291L, 336L, 381L, 434L,
463L, 492L, 529L, 545L, 603L, 610L, 623L, 652L, 660L, 672L, 697L
), class = "data.frame")
If I try to run any test (e.g., Wilcoxon), I get this error because MDSC has no observations for condition a at timepoint BL...
> mydata %>% group_by(pops, condition) %>% rstatix::wilcox_test(value ~ timepoint)
Error in `mutate()`:
ℹ In argument: `data = map(.data$data, .f, ...)`.
Caused by error in `map()`:
ℹ In index: 2.
Caused by error in `wilcox.test.default()`:
! not enough 'y' observations
Run `rlang::last_trace()` to see where the error occurred.
> mydata %>% group_by(pops, condition) %$% table(pops, timepoint, condition)
, , condition = a
      timepoint
pops   BL ES
  MEP   7  5
  MDSC  0  2
, , condition = b
      timepoint
pops   BL ES
  MEP   5  5
  MDSC  2  3
... so I have to manually remove MDSC, and then it works:
> mydata %>% subset(pops != "MDSC") %>% group_by(pops, condition) %>% rstatix::wilcox_test(value ~ timepoint)
# A tibble: 2 × 9
  pops  condition .y.   group1 group2    n1    n2 statistic     p
* <fct> <chr>     <chr> <chr>  <chr>  <int> <int>     <dbl> <dbl>
1 MEP   a         value BL     ES         7     5        16 0.876
2 MEP   b         value BL     ES         5     5        15 0.69
It'd be nice to get something like this instead of stopping the calculation:
Warning: Some variables do not have enough observations for calculation.
# A tibble: 4 × 9
  pops  condition .y.   group1 group2    n1    n2 statistic     p
* <fct> <chr>     <chr> <chr>  <chr>  <int> <int>     <dbl> <dbl>
1 MDSC  a         value BL     ES         0     2        NA    NA
2 MDSC  b         value BL     ES         2     3       ...   ...
3 MEP   a         value BL     ES         7     5        16 0.876
4 MEP   b         value BL     ES         5     5        15 0.69
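In the meantime, a possible workaround can be sketched as follows. This is only a sketch under assumptions: `wilcox_keep_going` is a hypothetical helper (not an rstatix function), the column names `pops`, `condition`, `timepoint` and `value` are hard-coded to match the data above, and a group is treated as testable only when it has observations for two distinct timepoints. It also assumes at least one group is testable.

```r
library(dplyr)
library(rstatix)

# Hypothetical helper (not part of rstatix): run the test only on groups in
# which both timepoints are observed, and report NA for the remaining groups.
wilcox_keep_going <- function(data) {
  # Flag each pops/condition group as testable when it contains two timepoints.
  status <- data %>%
    group_by(pops, condition) %>%
    summarise(testable = n_distinct(timepoint) == 2, .groups = "drop")

  # Groups that would make wilcox_test() fail.
  skipped <- status %>%
    filter(!testable) %>%
    select(pops, condition)

  if (nrow(skipped) > 0) {
    warning("Some groups do not have enough observations for calculation.")
  }

  # Same call as in the issue, restricted to the testable groups.
  tested <- data %>%
    semi_join(filter(status, testable), by = c("pops", "condition")) %>%
    group_by(pops, condition) %>%
    rstatix::wilcox_test(value ~ timepoint)

  # Columns that `skipped` lacks (.y., group1, n1, statistic, p, ...) become NA.
  bind_rows(as_tibble(tested), skipped) %>%
    arrange(pops, condition)
}
```

With the data above, `wilcox_keep_going(mydata)` should return one row per `pops`/`condition` group, with the MDSC/a row filled with NA. Unlike the mock-up, the skipped row also has NA in `.y.`, `group1`, `group2`, `n1` and `n2`, because those columns are produced only by the test itself.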
Just a suggestion. Thanks!