Commit 527565a

committed
basic readme
1 parent 884026e commit 527565a

1 file changed

+119
-0
lines changed

README.md

Lines changed: 119 additions & 0 deletions
@@ -0,0 +1,119 @@
# blobtools
Application for the visualisation of draft genome assemblies and general QC

- Requirements

```
pip install matplotlib
pip install docopt
```

- blobtools : main executable
```
usage: blobtools <command> [<args>...] [--help]

commands:
    create        create a BlobDB
    view          print BlobDB
    plot          plot BlobDB as a blobplot

    -h --help     show this
```

- blobtools create : create a BlobDB JSON file
```
usage: blobtools create -i FASTA [-y FASTATYPE] [-o OUTFILE] [--title TITLE]
                        [-b BAM...] [-s SAM...] [-a CAS...] [-c COV...]
                        [--nodes <NODES>] [--names <NAMES>] [--db <NODESDB>]
                        [-t TAX...] [-r TAXRULE...]
                        [-h|--help]

Options:
    -h --help                     show this
    -i, --infile FASTA            FASTA file of assembly. Headers are split at whitespace.
    -y, --type FASTATYPE          Assembly program used to create FASTA. If specified,
                                  coverage will be parsed from FASTA header.
                                  (Parsing supported for 'spades', 'soap', 'velvet', 'abyss')
    -t, --taxfile TAX...          Taxonomy file in format (qseqid\ttaxid\tbitscore)
                                  (e.g. BLAST output "--outfmt '6 qseqid staxids bitscore'")
    -r, --taxrule <TAXRULE>...    Taxrule determines how taxonomy of blobs is computed [default: bestsum]
                                  "bestsum"       : sum bitscore across all hits for each taxonomic rank
                                  "bestsumorder"  : sum bitscore across all hits for each taxonomic rank.
                                      - If first <TAX> file supplies hits, bestsum is calculated.
                                      - If no hit is found, the next <TAX> file is used.
    --nodes <NODES>               NCBI nodes.dmp file. Not required if '--db'
    --names <NAMES>               NCBI names.dmp file. Not required if '--db'
    --db <NODESDB>                NodesDB file [default: data/nodesDB.txt].
    -b, --bam <BAM>...            BAM file (requires samtools in $PATH)
    -s, --sam <SAM>...            SAM file
    -a, --cas <CAS>...            CAS file (requires clc_mapping_info in $PATH)
    -c, --cov <COV>...            TAB separated. (seqID\tcoverage)
    -o, --out <OUT>               BlobDB output prefix
    --title TITLE                 Title of BlobDB [default: FASTA]
```
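
The two taxrules in the help text above can be read as follows. This is a minimal sketch of one interpretation of that text, not blobtools' own implementation: the function names `parse_tax_line`, `best_sum` and `best_sum_order` are hypothetical, and the hits are assumed to be already resolved to a taxon name at a single rank for a single sequence.

```python
from collections import defaultdict


def parse_tax_line(line):
    """Split one 'qseqid<TAB>taxid<TAB>bitscore' line into typed fields."""
    qseqid, taxid, bitscore = line.rstrip("\n").split("\t")[:3]
    return qseqid, taxid, float(bitscore)


def best_sum(hits):
    """Sum bitscores per taxon and return the taxon with the highest total.

    `hits` is a list of (taxon, bitscore) pairs (an assumed representation).
    Returns None when there are no hits.
    """
    totals = defaultdict(float)
    for taxon, bitscore in hits:
        totals[taxon] += bitscore
    if not totals:
        return None
    return max(totals, key=totals.get)


def best_sum_order(hits_per_file):
    """Apply best_sum to the first tax file (in the order given) that has hits."""
    for hits in hits_per_file:
        if hits:
            return best_sum(hits)
    return None


# Hypothetical hits for one sequence from two tax files.
file_a = [("Nematoda", 200.0), ("Proteobacteria", 150.0)]
file_b = [("Proteobacteria", 120.0)]

pooled = best_sum(file_a + file_b)            # "Proteobacteria" (270 vs 200)
ordered = best_sum_order([file_a, file_b])    # "Nematoda" (only file_a is used)
```

Under this reading, the order of the `-t` files only matters for "bestsumorder": a hit in an earlier file takes precedence over any hits in later files.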

- blobtools view : generate table output from a BlobDB file
```
usage: blobtools view -i <BLOBDB> [-r <TAXRULE>] [--rank <TAXRANK>...] [--hits]
                      [--list <LIST>] [--out <OUT>]
                      [-h|--help]

Options:
    -h --help                   show this
    -i, --input <BLOBDB>        BlobDB file (created with "blobtools create")
    -o, --out <OUT>             Output file [default: STDOUT]
    -l, --list <LIST>           List of sequence names (comma-separated or file).
                                If comma-separated, no whitespace allowed.
    -r, --taxrule <TAXRULE>     Taxrule used for computing taxonomy (supported: "bestsum", "bestsumorder")
                                [default: bestsum]
    --rank <TAXRANK>...         Taxonomic rank(s) at which output will be written.
                                (supported: 'species', 'genus', 'family', 'order',
                                'phylum', 'superkingdom', 'all') [default: phylum]
    -b, --hits                  Displays taxonomic hits from tax files
```
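
The `--list` option above accepts either a comma-separated string or a file. A minimal sketch of how such a value could be resolved is shown here; the helper name `read_sequence_list` is hypothetical, and the one-name-per-line file layout is an assumption, since the help text does not specify the file format.

```python
import os


def read_sequence_list(value):
    """Return sequence names from a file path or a comma-separated string.

    Assumption: a list file holds one sequence name per line.
    """
    if os.path.isfile(value):
        with open(value) as handle:
            return [line.strip() for line in handle if line.strip()]
    # The help text forbids whitespace in the comma-separated form.
    if any(ch.isspace() for ch in value):
        raise ValueError("comma-separated list must not contain whitespace")
    return [name for name in value.split(",") if name]
```

For example, `read_sequence_list("contig_1,contig_2")` yields `["contig_1", "contig_2"]`, provided no file with that literal name exists in the working directory.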

- blobtools plot : generate a blobplot from a BlobDB file
```
usage: blobtools plot -i BLOBDB [-p INT] [-l INT] [-c] [-n] [-s]
                      [-r RANK] [-x TAXRULE] [--label GROUPS...]
                      [-o PREFIX] [-m] [--sort ORDER] [--hist HIST] [--title]
                      [--colours FILE] [--include FILE] [--exclude FILE]
                      [--format FORMAT] [--noblobs] [--noreads] [--refcov FILE]
                      [-h|--help]

Options:
    -h --help                   show this
    -i, --infile BLOBDB         BlobDB file
    -p, --plotgroups INT        Number of (taxonomic) groups to plot, remaining
                                groups are placed in 'other' [default: 7]
    -l, --length INT            Minimum sequence length considered for plotting [default: 100]
    -c, --cindex                Colour blobs by 'c index' [default: False]
    -n, --nohit                 Hide sequences without taxonomic annotation [default: False]
    -s, --noscale               Do not scale sequences by length [default: False]
    -o, --out PREFIX            Output prefix
    -m, --multiplot             Multi-plot. Print plot after addition of each (taxonomic) group
                                [default: False]
    --sort <ORDER>              Sort order for plotting [default: span]
                                span  : plot with decreasing span
                                count : plot with decreasing count
    --hist <HIST>               Data for histograms [default: span]
                                span  : span-weighted histograms
                                count : count histograms
    --title                     Add title of BlobDB to plot [default: False]
    -r, --rank RANK             Taxonomic rank used for colouring of blobs [default: phylum]
                                (Supported: species, genus, family, order, phylum, superkingdom)
    -x, --taxrule TAXRULE       Taxrule which has been used for computing taxonomy
                                (Supported: bestsum, bestsumorder) [default: bestsum]
    --label GROUPS...           Relabel (taxonomic) groups (not 'all' or 'other'),
                                e.g. "Bacteria=Actinobacteria,Proteobacteria"
    --colours COLOURFILE        File containing colours for (taxonomic) groups
    --exclude GROUPS...         Place these (taxonomic) groups in 'other',
                                e.g. "Actinobacteria,Proteobacteria"
    --format FORMAT             Figure format for plot (png, pdf, eps, jpeg,
                                ps, svg, svgz, tiff) [default: png]
    --noblobs                   Omit blobplot [default: False]
    --noreads                   Omit plot of reads mapping [default: False]
    --refcov FILE               File containing number of "total" and "mapped" reads
                                per coverage file. (e.g.: bam0,900,100). If provided, info
                                will be used in read coverage plot(s).
```
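
Three of the plot options above take small structured values. The sketch below shows one way they could be interpreted; it is illustrative only and not taken from blobtools itself. The helper names are hypothetical, the `--refcov` field order (name, total, mapped) is assumed from the order in the help text, and whether 'other' counts towards the `--plotgroups` limit is also an assumption (here it does not).

```python
def parse_label(spec):
    """Parse 'Bacteria=Actinobacteria,Proteobacteria' into (new_label, members)."""
    label, _, members = spec.partition("=")
    return label, members.split(",")


def parse_refcov_line(line):
    """Parse 'bam0,900,100' into (coverage_name, total_reads, mapped_reads)."""
    name, total, mapped = line.strip().split(",")
    return name, int(total), int(mapped)


def top_groups(span_by_group, n=7):
    """Keep the n groups with the largest span and pool the rest as 'other'."""
    ranked = sorted(span_by_group.items(), key=lambda item: item[1], reverse=True)
    kept = dict(ranked[:n])
    remainder = sum(span for _, span in ranked[n:])
    if remainder:
        kept["other"] = remainder
    return kept
```

With `n=2`, a mapping such as `{"a": 5, "b": 3, "c": 1, "d": 1}` keeps `a` and `b` and pools the remaining span of 2 under 'other', which mirrors the default `--sort span` ordering.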
