non-logical datasets shouldn't start with "is_" #2

@alistaire47

Currently is_developed and is_independent return character vectors:

unique(passport:::countries$is_developed)
#> [1] NA           "Developed"  "Developing"

unique(passport:::countries$is_independent)
#>  [1] NA                       "Yes"                   
#>  [3] "Territory of GB"        "International"         
#>  [5] "Territory of US"        "Part of NL"            
#>  [7] "Part of FI"             "Part of FR"            
#>  [9] "Territory of NO"        "Territory of AU"       
#> [11] "Associated with NZ"     "In contention"         
#> [13] "Part of DK"             "Crown dependency of GB"
#> [15] "Part of CN"             "Commonwealth of US"    
#> [17] "Territory of FR"        "Territory of NZ"       
#> [19] "Territories of US"

If they're going to start with is_, they should really return logical vectors. To address the issue, they could:

  • drop information to actually return a logical
  • get renamed
  • be split in two, e.g. is_independent and dependency_status
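
As a rough illustration of the third option, here is a minimal sketch on a toy vector that reuses a few of the values printed above. It does not touch the real passport data, and treating every value other than "Yes" as not independent (including "In contention") is an assumption made only for the example; dependency_status is the hypothetical name suggested in the list.

```r
# Toy vector using values shown in the output above; not the real passport data.
status <- c(NA, "Yes", "Territory of GB", "In contention", "Part of NL")

# Hypothetical split: a logical flag plus a character column for the detail.
# Assumption: anything other than "Yes" counts as not independent.
is_independent <- status == "Yes"
dependency_status <- ifelse(is_independent, NA_character_, status)

typeof(is_independent)     # logical
typeof(dependency_status)  # character
```

Missing values stay NA in both columns, so no information is lost relative to the current character column.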

None of these options is ideal, since as_country_code and as_country_name are generally expected to return a character vector or factor. If made logical, they would not be the only exceptions:

code_types <- sapply(passport:::countries, typeof) 

code_types[code_types != 'character']
#>                        gaul              un_region_code 
#>                    "double"                   "integer" 
#>           un_subregion_code un_intermediate_region_code 
#>                   "integer"                   "integer" 
#>                         m49                         ldc 
#>                   "integer"                   "logical" 
#>                        lldc                        sids 
#>                   "logical"                   "logical"

Numeric country codes (gaul, un_*_code, m49) are a different issue. Perhaps they should be stored as strings, since arithmetic on them is meaningless, but converting them to factors is potentially very confusing and may merit a warning or message.
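
To show why factors of numeric codes can confuse, here is a small sketch in base R. The three codes are arbitrary illustrative integers, not values pulled from the package.

```r
# Illustrative numeric codes; arbitrary values, not taken from passport.
codes <- c(840L, 4L, 250L)

as_factor <- factor(codes)

# Converting the factor straight back to integer yields level indices,
# not the original codes.
level_index <- as.integer(as_factor)

# Going through character first recovers the original codes.
recovered <- as.integer(as.character(as_factor))

# Storing the codes as strings avoids the ambiguity altogether.
as_string <- as.character(codes)
```

Here level_index holds the positions of each code among the sorted levels, so anyone who calls as.integer() on the factor silently gets the wrong numbers.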

Country groupings (ldc, lldc, sids, un_*_code) will be addressed by #1 (though they face the same type issue).

These two columns, is_developed and is_independent (plus a lot more), should be split out into a separate dataset of country attributes (#3), but the type issue will still have to be addressed within that dataset.

This will be a breaking change, but integrating the change with #3 will minimize disruption.
