Using select with explicit header names requires all the column names to be specified #701

@miromarszal

Description

Consider the following example.

julia> using CSV

julia> csv = """1,2,3
       1,2,3
       1,2,3""";

julia> CSV.File(IOBuffer(csv), select=[2, 3], header=0)
3-element CSV.File{false}:
 CSV.Row: (Column2 = 2, Column3 = 3)
 CSV.Row: (Column2 = 2, Column3 = 3)
 CSV.Row: (Column2 = 2, Column3 = 3)

julia> # data.csv is assumed to hold the same three rows as `csv`
       CSV.File("data.csv", select=[2, 3], header=["a", "b", "c"])
3-element CSV.File{false}:
 CSV.Row: (b = 2, c = 3)
 CSV.Row: (b = 2, c = 3)
 CSV.Row: (b = 2, c = 3)

julia> CSV.File("data.csv", select=[2, 3], header=["b", "c"])
thread = 1 warning: parsed expected 2 columns, but didn't reach end of line around data row: 1. Ignoring any extra columns on this row
thread = 1 warning: parsed expected 2 columns, but didn't reach end of line around data row: 2. Ignoring any extra columns on this row
thread = 1 warning: parsed expected 2 columns, but didn't reach end of line around data row: 3. Ignoring any extra columns on this row

3-element CSV.File{false}:
 CSV.Row: (c = 2,)
 CSV.Row: (c = 2,)
 CSV.Row: (c = 2,)

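I find this rather surprising. Judging by the output of the last call, the names in header appear to be assigned positionally to all columns in the file rather than to the selected ones, so with two names the file is treated as having two columns and select=[2, 3] matches only the second. I can either specify all the column names, which is impractical for a file with a large number of columns, or use header=0 and rename the columns afterwards, which feels like an unnecessary step.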
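As a stopgap, the first option can be made less tedious by generating the full header programmatically. The following is only a sketch: `padded_header` is a hypothetical helper, not part of CSV.jl, and it assumes the total number of columns in the file is known in advance. Unnamed columns get placeholder names in the style of the `header=0` output above.

```julia
using CSV

# Hypothetical helper (not part of CSV.jl): build a header covering every
# column in the file, with placeholder names for columns we do not care about.
function padded_header(ncols::Integer, wanted::AbstractDict{<:Integer,<:AbstractString})
    return [get(wanted, i, "Column$i") for i in 1:ncols]
end

csv = """1,2,3
1,2,3
1,2,3"""

# Name only columns 2 and 3; column 1 gets the placeholder "Column1".
header = padded_header(3, Dict(2 => "b", 3 => "c"))

# Same shape as the call with header=["a", "b", "c"] above,
# so it should select columns b and c.
f = CSV.File(IOBuffer(csv); select=[2, 3], header=header)
```

This still requires the column count up front, so it does not remove the underlying inconvenience; it only avoids typing out names for columns that are dropped anyway.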

Labels: improvement (Improve an existing feature/functionality)
