Allow cut on non-existent columns #156

Open
tseemann opened this issue Aug 16, 2021 · 7 comments
Comments

@tseemann

% csvtk version
csvtk v0.23.

% cat foo.csv
A,B
1,2
3,4

% csvtk cut -f A,C foo.csv

# THIS HAPPENS
[ERRO] column "C" not existed in file: a.csv

# DESIRED OPTION 1
% csvtk cut --allow-missing-col -f A,C foo.csv
A
1
3

# DESIRED OPTION 2
% csvtk cut --blank-missing-col  -f A,C foo.csv
A,C
1,
3,

This would be useful for cleaning large datasets that I need to merge, or for extracting partial versions from different sources.
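To make the two requested behaviours concrete, here is a minimal Python sketch of the semantics described above. It is not csvtk's implementation; the function name `cut_columns` and its `blank_missing` parameter are hypothetical, chosen only for illustration.

```python
import csv
import io


def cut_columns(text, wanted, blank_missing=False):
    """Select columns by name, tolerating names absent from the header.

    Hypothetical helper: illustrates the requested semantics only.
    """
    reader = csv.DictReader(io.StringIO(text))
    present = set(reader.fieldnames or [])
    # Option 1 drops unknown names; option 2 keeps them as empty columns.
    keep = [c for c in wanted if blank_missing or c in present]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(keep)
    for row in reader:
        writer.writerow([row.get(c, "") for c in keep])
    return out.getvalue()


FOO = "A,B\n1,2\n3,4\n"  # same content as foo.csv above
print(cut_columns(FOO, ["A", "C"]))                      # desired option 1
print(cut_columns(FOO, ["A", "C"], blank_missing=True))  # desired option 2
```

The first call keeps only column A; the second keeps A and adds an empty column C.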

@shenwei356
Owner

New flags

  -m, --allow-missing-col   allow missing column
  -b, --blank-missing-col   blank missing column

Tests

$ cat foo.csv | csvtk cut -f D,A,C,B -m | csvtk pretty 
A   B
-   -
1   2
3   4

$ cat foo.csv | csvtk cut -f D,A,C,B -m -b | csvtk pretty 
D   A   C   B
-   -   -   -
    1       2
    3       4

$ cat foo.csv | csvtk cut -f 5,2,4,1 -m | csvtk pretty 
B   A
-   -
2   1
4   3

$ cat foo.csv | csvtk cut -f 5,2,4,1 -m -b  
,B,,A
,2,,1
,4,,3
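The index-based tests above differ from the name-based ones in one respect: a blanked out-of-range position has no header name either. A rough Python sketch of that behaviour follows, assuming 1-based positions as in the tests; `cut_by_index` is a hypothetical name, not part of csvtk.

```python
import csv
import io


def cut_by_index(text, indices, blank_missing=False):
    """Select columns by 1-based position, tolerating out-of-range positions.

    Hypothetical helper: mirrors the test output above, nothing more.
    """
    rows = list(csv.reader(io.StringIO(text)))
    width = len(rows[0]) if rows else 0
    # Without blanking, out-of-range positions are silently skipped.
    keep = [i for i in indices if blank_missing or 1 <= i <= width]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in rows:
        # The header row is treated like any other row, so a missing
        # position yields an empty header cell as well.
        writer.writerow([row[i - 1] if 1 <= i <= len(row) else "" for i in keep])
    return out.getvalue()


FOO = "A,B\n1,2\n3,4\n"
print(cut_by_index(FOO, [5, 2, 4, 1]))
print(cut_by_index(FOO, [5, 2, 4, 1], blank_missing=True))
```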

Binaries

@tseemann
Author

Awesome! Thanks @shenwei356 !

@clydeugene

Hi @shenwei356

I am trying to combine SRA run tables in such a way that the final output will have consistent columns with blank values where there is no information. So, I used the cut command with the -m -b flags, and I was not able to replicate what you did here:

$ cat foo.csv | csvtk cut -f D,A,C,B -m -b | csvtk pretty 
D   A   C   B
-   -   -   -
    1       2
    3       4

Here's my command:

$ csvtk cut -f Run,BioProject,sex,source_name,BioSample,Diet,treatment,cell_type,AGE,strain,tissue SraRunTable.txt -m -b | csvtk pretty
Run           BioProject    source_name         BioSample      treatment   tissue           
-----------   -----------   -----------------   ------------   ---------   -----------------
SRR24897950   PRJNA982793   Pancreatic islets   SAMN35715211   HAMS        Pancreatic islets
SRR24897951   PRJNA982793   Pancreatic islets   SAMN35715211   HAMS        Pancreatic islets
SRR24897952   PRJNA982793   Pancreatic islets   SAMN35715211   HAMS        Pancreatic islets
SRR24897953   PRJNA982793   Pancreatic islets   SAMN35715211   HAMS        Pancreatic islets
SRR24897954   PRJNA982793   Pancreatic islets   SAMN35715211   HAMS        Pancreatic islets
SRR24897955   PRJNA982793   Pancreatic islets   SAMN35715211   HAMS        Pancreatic islets
SRR24897956   PRJNA982793   Pancreatic islets   SAMN35715211   HAMS        Pancreatic islets
SRR24897957   PRJNA982793   Pancreatic islets   SAMN35715211   HAMS        Pancreatic islets

Note: although the file extension is .txt, this is in fact a CSV file. It has a .txt extension because another application in the workflow expects .txt rather than .csv.

From what I can tell the -m flag is working fine but the -b is not.
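For reference, the goal described above (several run tables combined under one consistent header, with blanks where a table lacks a column) can be sketched in Python with the standard `csv` module. This is an illustrative workaround under stated assumptions, not csvtk's behaviour: the function name `harmonize` and the two sample tables are made up, and only the column list is taken from the command above.

```python
import csv
import io

# Column list copied from the csvtk command above.
COLUMNS = ["Run", "BioProject", "sex", "source_name", "BioSample", "Diet",
           "treatment", "cell_type", "AGE", "strain", "tissue"]


def harmonize(tables, columns=COLUMNS):
    """Concatenate CSV texts under one fixed header, blanking absent columns."""
    out = io.StringIO()
    # restval fills columns a table lacks; extrasaction="ignore" drops
    # columns that were not requested.
    writer = csv.DictWriter(out, fieldnames=columns, restval="",
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for text in tables:
        for row in csv.DictReader(io.StringIO(text)):
            writer.writerow(row)
    return out.getvalue()


# Two made-up tables with different column sets (illustrative values only).
table_a = "Run,BioProject,tissue\nR1,P1,islets\n"
table_b = "Run,sex,AGE,extra\nR2,female,12,x\n"
print(harmonize([table_a, table_b]))
```

Every output row has the same eleven columns in the requested order, whichever table it came from.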

$ csvtk --version
csvtk v0.33.0

I would appreciate your help with this :)

@shenwei356
Owner

The API has changed since then, but -b was not adapted to it. I will check this on Sunday; I need to sleep now.

@shenwei356 shenwei356 reopened this Apr 13, 2025
shenwei356 added a commit that referenced this issue Apr 13, 2025
@shenwei356
Owner

Fixed.

$ cat foo.csv | csvtk cut -f D,A,C,B -m  | csvtk pretty 
A   B
-   -
1   2
3   4


$ cat foo.csv | csvtk cut -f D,A,C,B -m -b | csvtk pretty 
D   A   C   B
-   -   -   -
    1       2
    3       4

@clydeugene

Thank you @shenwei356

I tested the Linux binary and -b is now working as expected. Do you have an estimate of when the updated binaries will be available for installation with pixi? I would prefer to avoid installing via a pixi task, although that is possible now.

@shenwei356
Owner

I'll tag a new release this week.
