Useful Command Lines

vjrj edited this page Jul 2, 2021 · 3 revisions

Introduction

(...)

pipes

Something that makes terminal shell commands powerful is the ability to chain commands together with the pipe |. Let's see it in action:

Let's look at the content of a meta.xml file. First, move to some dr location:

cd /data/biocache-load/dr603

and we can show the contents with cat:

cat meta.xml 

But if we want to look at an occurrence.txt file and it is too long to print, it may be more useful to just count its lines:

cat occurrence.txt | wc
3540297 134473483 1548283140

The output shows the number of lines (about 3.5M), words, and bytes in that file. Here we send the output of cat to wc (the word count command) through the pipe |.
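When only the line count matters, wc -l prints just that number. The following is a minimal sketch on a small hypothetical sample file created with mktemp (it is not the real occurrence.txt):

```shell
# Hypothetical 3-line, tab-separated sample file.
sample=$(mktemp)
printf 'id\tname\n1\tfoo\n2\tbar\n' > "$sample"

# With no options, wc prints lines, words and bytes.
cat "$sample" | wc

# With -l, wc prints only the line count. Reading from standard
# input (<) keeps the file name out of the output.
line_count=$(wc -l < "$sample")
echo "$line_count"

rm -f "$sample"
```

Some wc implementations pad the number with leading spaces, so trim it before comparing it as a string.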

We can chain as many commands as we need until we get the result we want. For instance, this command extracts only the 10th column (the ids), sorts all the ids, and redirects the output to a file in the /tmp directory.

cat occurrence.txt | awk -F $'\t' '{print $10}' | sort > /tmp/dr-603-ids-load.txt

But let's explain this step by step.
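As a minimal sketch of that build-up, the example below grows the pipeline one stage at a time. It makes a few assumptions: it uses a small hypothetical sample file created with mktemp, the ids sit in column 2 rather than column 10, and the separator is written as -F '\t', which awk also reads as a TAB, so that the example runs in plain sh as well as bash.

```shell
# Hypothetical tab-separated sample; the id is in column 2 here,
# whereas the real occurrence.txt keeps it in column 10.
sample=$(mktemp)
out=$(mktemp)
printf 'name\toccurrenceID\tcountry\n' >  "$sample"
printf 'foo\t30\tES\n'                 >> "$sample"
printf 'bar\t10\tAU\n'                 >> "$sample"
printf 'baz\t20\tPT\n'                 >> "$sample"

# Step 1: cat writes the whole file to standard output.
cat "$sample"

# Step 2: awk splits each line on TABs (-F) and prints only field 2.
cat "$sample" | awk -F '\t' '{print $2}'

# Step 3: sort orders the lines, and > redirects the result to a file.
cat "$sample" | awk -F '\t' '{print $2}' | sort > "$out"

sorted_ids=$(cat "$out")
echo "$sorted_ids"

rm -f "$sample" "$out"
```

Note that sort treats the header line like any other line, so occurrenceID ends up in the sorted output along with the ids.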

Useful shell commands

head and tail

Instead of running cat on a whole file every time, it is often enough to look at only part of it. With a file of more than 3M lines, as in our previous example, this is quite useful. So if we do:

head -50 occurrence.txt 

we'll see the first 50 lines of that file. This is a quick way to inspect the header of a file.

The same with tail:

tail -50 occurrence.txt 

we'll see the last 50 lines of that file.
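The -50 form used above is a shorthand; the POSIX spelling is -n followed by the number of lines. A minimal sketch on a hypothetical six-line sample file:

```shell
# Hypothetical sample file with six numbered lines.
sample=$(mktemp)
for i in 1 2 3 4 5 6; do echo "line $i"; done > "$sample"

# First two lines of the file.
first_two=$(head -n 2 "$sample")
echo "$first_two"

# Last two lines of the file.
last_two=$(tail -n 2 "$sample")
echo "$last_two"

rm -f "$sample"
```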

This is also useful for testing commands on a small portion of a big file. In the previous long cat command, we can try out our command with head instead of cat:

head -5 occurrence.txt | awk -F $'\t' '{print $10}'

This prints the 10th column of the first 5 lines of occurrence.txt, treating TABs (\t) as the column separator (explanation in detail). The output looks something like:

occurrenceID
3084007342
1090938898
1090938908
3015196328

This lets us check that we are selecting the column we are interested in without having to process all of the more than 3M lines.

Once we are sure that this is what we want, we can keep extending the pipeline, sending the output through the pipe | to other commands.
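A minimal sketch of this test-then-scale workflow, again on a hypothetical sample file with the ids in column 2 (an assumption for brevity) and with -F '\t' as a portable way to pass a TAB separator to awk:

```shell
# Hypothetical sample: one header line plus five records.
sample=$(mktemp)
printf 'name\toccurrenceID\n' > "$sample"
printf 'a\t3\nb\t1\nc\t2\nd\t5\ne\t4\n' >> "$sample"

# Trial run: check the column selection on the first 3 lines only.
preview=$(head -n 3 "$sample" | awk -F '\t' '{print $2}')
echo "$preview"

# Full run: swap head for cat and append further stages to the pipe.
total=$(cat "$sample" | awk -F '\t' '{print $2}' | sort | wc -l)
echo "$total"

rm -f "$sample"
```

Only the first command of the pipeline changes between the trial and the full run; the awk stage stays exactly as tested.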
