row_means() proportion of datapoints  #144

@florisvanvugt

row_means() has an argument n that specifies the proportion of values required per row to return a mean. As I understand it, n=.75 should return a mean only if at least 75% of the values in that row are non-NA. The following behaviour is therefore contrary to what I expected:

> df<-data.frame(q1=c(1,2),q2=c(2,NA),q3=c(1,1))
> df
  q1 q2 q3
1  1  2  1
2  2 NA  1
> sjmisc::row_means(df,n=.75)
  q1 q2 q3 rowmeans
1  1  2  1 1.333333
2  2 NA  1 1.500000

I had expected the second entry of the rowmeans column to be NA, because only 2 out of 3 values in that row are non-NA, i.e. 67%, which is less than 75%. I realize I might be missing something about the intended behaviour of this function.
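
To make the expectation concrete, here is a minimal base-R sketch of the behaviour I had in mind. The helper row_means_strict() and its min_prop argument are hypothetical names used only for illustration; they are not part of sjmisc and this is not a claim about how row_means() is implemented.

```r
# Hypothetical helper (not part of sjmisc): return the row mean only when the
# proportion of non-NA values in that row is at least `min_prop`.
row_means_strict <- function(data, min_prop) {
  # Proportion of non-missing values per row
  valid_prop <- rowMeans(!is.na(data))
  # Row means computed over the non-missing values
  means <- rowMeans(data, na.rm = TRUE)
  # Blank out rows that do not reach the required proportion
  means[valid_prop < min_prop] <- NA
  means
}

df <- data.frame(q1 = c(1, 2), q2 = c(2, NA), q3 = c(1, 1))
res <- row_means_strict(df, min_prop = 0.75)
print(res)
```

With this sketch the first row keeps its mean (3 of 3 values present), while the second row becomes NA because 2/3 is below 0.75.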

> packageVersion('sjmisc')
[1] ‘2.8.6’

> sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.1 LTS
