Great! Here is an alternative with group_map() #1

@maurolepore

Thanks, Julio, for today's presentation. I love purrr and I'm really glad you promoted it so well :)

Your examples reminded me of group_map(). I'm trying to learn this new approach, so I look for every possible excuse to put it into practice, and this looks like a good case.

This example is taken directly from ?group_map():

library(tidyverse)
library(broom)
library(datasets)

iris %>%
  group_by(Species) %>%
  group_map(~ broom::tidy(lm(Petal.Length ~ Sepal.Length, data = .x)))
#> # A tibble: 6 x 6
#> # Groups:   Species [3]
#>   Species    term         estimate std.error statistic  p.value
#>   <fct>      <chr>           <dbl>     <dbl>     <dbl>    <dbl>
#> 1 setosa     (Intercept)     0.803    0.344      2.34  2.38e- 2
#> 2 setosa     Sepal.Length    0.132    0.0685     1.92  6.07e- 2
#> 3 versicolor (Intercept)     0.185    0.514      0.360 7.20e- 1
#> 4 versicolor Sepal.Length    0.686    0.0863     7.95  2.59e-10
#> 5 virginica  (Intercept)     0.610    0.417      1.46  1.50e- 1
#> 6 virginica  Sepal.Length    0.750    0.0630    11.9   6.30e-16

iris %>%
  group_by(Species) %>%
  group_map(~ broom::glance(lm(Petal.Length ~ Sepal.Length, data = .x)))
#> # A tibble: 3 x 12
#> # Groups:   Species [3]
#>   Species r.squared adj.r.squared sigma statistic  p.value    df logLik
#>   <fct>       <dbl>         <dbl> <dbl>     <dbl>    <dbl> <int>  <dbl>
#> 1 setosa     0.0714        0.0520 0.169      3.69 6.07e- 2     2  18.9 
#> 2 versic~    0.569         0.560  0.312     63.3  2.59e-10     2 -11.7 
#> 3 virgin~    0.747         0.742  0.281    142.   6.30e-16     2  -6.37
#> # ... with 4 more variables: AIC <dbl>, BIC <dbl>, deviance <dbl>,
#> #   df.residual <int>

Created on 2019-04-25 by the reprex package (v0.2.1)
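
A note for anyone running this on a newer dplyr: as far as I know, from dplyr 0.8.1 onward group_map() returns a plain list, and the grouped-tibble output shown above is what group_modify() produces. A minimal sketch under that assumption, building the coefficient table by hand instead of calling broom::tidy() so it only needs dplyr:

```r
library(dplyr)

# Assumption: dplyr >= 0.8.1, where group_modify() is "data frame in,
# data frame out" and group_map() returns a list with one element per group.
fits <- iris %>%
  group_by(Species) %>%
  group_modify(~ {
    model <- lm(Petal.Length ~ Sepal.Length, data = .x)
    # One row per coefficient; the Species column is added back by dplyr
    data.frame(term = names(coef(model)), estimate = unname(coef(model)))
  })

# Same models, but collected as a list of coefficient vectors
fits_list <- iris %>%
  group_by(Species) %>%
  group_map(~ coef(lm(Petal.Length ~ Sepal.Length, data = .x)))
```

Here fits is a grouped tibble with two rows per species, while fits_list is an unnamed list with one coefficient vector per species.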

(cc @FvD because I know you like do(). So do I, but I think it is being phased out little by little.)

Use group_map() when summarize() is too limited, in terms of what you need to do and return for each group. group_map() is good for "data frame in, data frame out". If that is too limited, you need to use a nested or split workflow.
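
For the "split workflow" case, here is a rough sketch of what I understand by it, in base R only: keep one model object per group instead of forcing everything into a data frame. The names models and r_squared are just for illustration; purrr::map() could replace lapply() here.

```r
# Split workflow: one data frame per species, then one lm object per piece
models <- lapply(
  split(iris, iris$Species),
  function(d) lm(Petal.Length ~ Sepal.Length, data = d)
)

# The model objects stay available, so any per-group value can be pulled out
# later, for example R-squared per species
r_squared <- vapply(models, function(m) summary(m)$r.squared, numeric(1))
```

The trade-off is that the result is a named list rather than a grouped tibble, which is more flexible but needs an extra step to get back to a table.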

group_map() and group_walk() are an evolution of do(), if you have used that before.
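
To make the comparison with do() concrete, a small sketch, assuming a dplyr version that still ships do() and also has group_modify() (0.8.1 or later). The temporary directory out_dir and the file names are made up for the example.

```r
library(dplyr)

# Old style: do() with `.` standing for the current group
with_do <- iris %>%
  group_by(Species) %>%
  do(head(., 2))

# Newer style: a formula where .x is the group's data and .y its key
with_modify <- iris %>%
  group_by(Species) %>%
  group_modify(~ head(.x, 2))

# group_walk() is for side effects only: write one CSV per species
out_dir <- tempfile("iris_groups_")
dir.create(out_dir)

iris %>%
  group_by(Species) %>%
  group_walk(~ write.csv(.x, file.path(out_dir, paste0(.y$Species, ".csv")),
                         row.names = FALSE))

written <- list.files(out_dir)
```

Both with_do and with_modify hold the first two rows of each species; group_walk() returns its input invisibly, so it can sit in the middle of a pipeline.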
