
Anomalize fails on non-time series grouped data #55

@larry77

Dear all,
Hopefully the reprex below is self-explanatory.
I plan to use anomalize on non-time series data.
According to the documentation, anomalize() should work on such data without the time series decomposition step, and it does on ungrouped data. On grouped non-time series data, however, it fails with an error.
Any ideas?

library(tidyverse)

library(anomalize)
#> ══ Use anomalize to improve your Forecasts by 50%! ═════════════════════════════
#> Business Science offers a 1-hour course - Lab #18: Time Series Anomaly Detection!
#> </> Learn more at: https://university.business-science.io/p/learning-labs-pro </>

test1 <- tidyverse_cran_downloads %>%
    time_decompose(count) %>%
    anomalize(remainder)
#> Registered S3 method overwritten by 'quantmod':
#>   method            from
#>   as.zoo.data.frame zoo

print(test1)  ##and this works fine
#> # A time tibble: 6,375 x 9
#> # Index:  date
#> # Groups: package [15]
#>    package date       observed season trend remainder remainder_l1 remainder_l2
#>    <chr>   <date>        <dbl>  <dbl> <dbl>     <dbl>        <dbl>        <dbl>
#>  1 broom   2017-01-01    1053. -1007. 1708.    352.         -1725.        1704.
#>  2 broom   2017-01-02    1481    340. 1731.   -589.         -1725.        1704.
#>  3 broom   2017-01-03    1851    563. 1753.   -465.         -1725.        1704.
#>  4 broom   2017-01-04    1947    526. 1775.   -354.         -1725.        1704.
#>  5 broom   2017-01-05    1927    430. 1798.   -301.         -1725.        1704.
#>  6 broom   2017-01-06    1948    136. 1820.     -8.11       -1725.        1704.
#>  7 broom   2017-01-07    1542   -988. 1842.    688.         -1725.        1704.
#>  8 broom   2017-01-08    1479. -1007. 1864.    622.         -1725.        1704.
#>  9 broom   2017-01-09    2057    340. 1887.   -169.         -1725.        1704.
#> 10 broom   2017-01-10    2278    563. 1909.   -194.         -1725.        1704.
#> # … with 6,365 more rows, and 1 more variable: anomaly <chr>




test2 <- tidyverse_cran_downloads %>%
    group_by(package) %>% 
    time_decompose(count) %>%
    anomalize(remainder)

print(test2)  ##and also this works fine
#> # A time tibble: 6,375 x 9
#> # Index:  date
#> # Groups: package [15]
#>    package date       observed season trend remainder remainder_l1 remainder_l2
#>    <chr>   <date>        <dbl>  <dbl> <dbl>     <dbl>        <dbl>        <dbl>
#>  1 broom   2017-01-01    1053. -1007. 1708.    352.         -1725.        1704.
#>  2 broom   2017-01-02    1481    340. 1731.   -589.         -1725.        1704.
#>  3 broom   2017-01-03    1851    563. 1753.   -465.         -1725.        1704.
#>  4 broom   2017-01-04    1947    526. 1775.   -354.         -1725.        1704.
#>  5 broom   2017-01-05    1927    430. 1798.   -301.         -1725.        1704.
#>  6 broom   2017-01-06    1948    136. 1820.     -8.11       -1725.        1704.
#>  7 broom   2017-01-07    1542   -988. 1842.    688.         -1725.        1704.
#>  8 broom   2017-01-08    1479. -1007. 1864.    622.         -1725.        1704.
#>  9 broom   2017-01-09    2057    340. 1887.   -169.         -1725.        1704.
#> 10 broom   2017-01-10    2278    563. 1909.   -194.         -1725.        1704.
#> # … with 6,365 more rows, and 1 more variable: anomaly <chr>


## From the documentation:
## For non-time series data (data without trend), the anomalize() function
## can be used without time series decomposition.





test3 <- tidyverse_cran_downloads %>%
    select(-date) %>%
    filter(package=="broom") %>% 
    anomalize(count)


print(test3) ## OK!
#> # A tibble: 425 x 5
#>    count package count_l1 count_l2 anomaly
#>    <dbl> <chr>      <dbl>    <dbl> <chr>  
#>  1  1053 broom     -2535.    7965. No     
#>  2  1481 broom     -2535.    7965. No     
#>  3  1851 broom     -2535.    7965. No     
#>  4  1947 broom     -2535.    7965. No     
#>  5  1927 broom     -2535.    7965. No     
#>  6  1948 broom     -2535.    7965. No     
#>  7  1542 broom     -2535.    7965. No     
#>  8  1479 broom     -2535.    7965. No     
#>  9  2057 broom     -2535.    7965. No     
#> 10  2278 broom     -2535.    7965. No     
#> # … with 415 more rows



### now let us try this on grouped data






test4 <- tidyverse_cran_downloads %>%
    select(-date) %>% 
    group_by(package) %>% 
    anomalize(count)
#> Error in value[[3L]](cond): Error in prep_tbl_time(): No date or datetime column found.

print(test4)  ##and now an error ## what to do?
#> Error in print(test4): object 'test4' not found

Created on 2020-07-30 by the reprex package (v0.3.0)
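
A possible workaround, offered as a sketch and not as a confirmed fix: since `anomalize(count)` works on an ungrouped tibble (see `test3`), the data can be split by `package`, each piece passed to `anomalize()` on its own, and the results bound back together. The name `test4_workaround` is only illustrative, and this assumes `anomalize()` behaves on every per-package piece the way it does for `broom` above.

```r
library(dplyr)
library(anomalize)

# Drop the date column and any grouping, so each piece is a plain tibble
# like the one that worked in test3.
downloads <- tidyverse_cran_downloads %>%
    select(-date) %>%
    ungroup()

# Split by package, run anomalize() on each ungrouped piece,
# then stack the per-package results into one tibble.
test4_workaround <- downloads %>%
    split(.$package) %>%
    lapply(function(d) anomalize(d, count)) %>%
    bind_rows()

print(test4_workaround)
```

The limits (`count_l1`, `count_l2`) are then computed separately for each package, which is what the grouped call was meant to do. The result is an ungrouped tibble; add `group_by(package)` afterwards if later steps need the grouping.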
