-5) To perform STREAM analysis in Jupyter Notebook as shown in **Tutorial**, run the following commands within `myenv`:
+4) To perform STREAM analysis in Jupyter Notebook as shown in **Tutorial**, type `jupyter notebook` within `myenv`:
 
 ```sh
-$ conda install jupyter
 $ jupyter notebook
 ```
@@ -129,17 +129,7 @@ perform log2 transformation
 --norm
 normalize data based on library size
 --atac
-indicate scATAC-seq data
---atac_counts
-scATAC-seq counts file name in .tsv or .tsv.gz format. Counts file is a compressed sparse matrix that contains three columns including region indices, sample indices and the number of reads (default: None)
---atac_regions
-scATAC-seq regions file name in .tsv or .tsv.gz format. Regions file contains three columns including chromosome names, start and end positions of regions (default: None)
---atac_samples
-scATAC-seq samples file name in .tsv or .tsv.gz format. Samples file contains one column of cell names (default: None)
---atac_k
-specify k-mers length for scATAC-seq analysis (default: 7)
---atac_zscore
-Indicate precomputed atac zscore matrix file
+indicate scATAC-seq data
 --n_processes
 Specify the number of processes to use (default: all the available cores).
 --loess_frac
@@ -338,30 +328,55 @@ Please note that for large dataset analysis it'll be necessary to increase the d
 
 Here we take a single cell RNA-seq dataset as an example, including data_Nestorowa.tsv.gz, cell_label.tsv.gz and cell_label_color.tsv.gz (Nestorowa, S. et al., 2016). Assuming that **they are in the current folder**, users can perform trajectory inference analysis with a single command:
 If cell labels are not available or no customized cell label color file is available, **-l** or **-c** can also be omitted.
 
+*Using Bioconda:*
+```sh
+$ stream -m data_Nestorowa.tsv.gz
+```
+*Using Docker:*
 ```sh
 $ docker run -v ${PWD}:/data -w /data pinellolab/stream -m data_Nestorowa.tsv.gz
 ```
 
 To visualize genes of interest, users can provide a gene list file by adding **-g**, for example: gene_list.tsv.gz. Adding the flag **-p** makes STREAM reuse the precomputed file from the first run (STREAM imports the precomputed pkl file, skips the structure learning step, and only visualizes the genes):
 To explore potential marker genes, it is possible to add the flags **--DE**, **--TG**, or **--LG** to detect DE (differentially expressed) genes, transition genes, and leaf genes respectively:
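
The note in this hunk that **-l** and **-c** can be omitted could be sketched as a small dry-run helper. This is only an illustration: the function name `build_stream_cmd` is hypothetical and not part of STREAM, and it prints the command rather than running it. Only the flags **-m**, **-l** and **-c** come from the README text.

```sh
#!/bin/sh
# Hypothetical helper (not part of STREAM): print a stream command line,
# adding -l / -c only when the optional label and color files exist.
build_stream_cmd() {
    matrix=$1
    labels=$2
    colors=$3
    cmd="stream -m $matrix"
    if [ -n "$labels" ] && [ -f "$labels" ]; then
        cmd="$cmd -l $labels"
    fi
    if [ -n "$colors" ] && [ -f "$colors" ]; then
        cmd="$cmd -c $colors"
    fi
    printf '%s\n' "$cmd"
}

# Dry run: show the command that would be used for the example dataset.
build_stream_cmd data_Nestorowa.tsv.gz cell_label.tsv.gz cell_label_color.tsv.gz
```

Printing the command first makes it easy to check which optional files were picked up before starting a long analysis.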
@@ -388,24 +413,20 @@ After running this command, a folder named **'mapping_result'** will be created
 
 To perform scATAC-seq trajectory inference analysis, three files are necessary: a .tsv file of counts in compressed sparse format, a sample file in .tsv format and a region file in .bed format (Buenrostro, J.D. et al., 2018). We assume that **they are in the current folder**.
 
-Using these three files, users can run STREAM with the following command (note the flag **--atac**):
+Using these three files, users can run `stream_atac` with the following command to preprocess scATAC-seq data and get a z-score matrix file named **'zscore.tsv.gz'** (this step may take a couple of hours on a modest machine):
 **The above command may take a couple of hours on a modest machine because the conversion from counts to k-mer z-scores is time-consuming.** Therefore STREAM also provides the option to take a precomputed z-score file as input.
-
-First, the z-score file can be obtained with the following command (add **--atac_zscore**):
+Then, take the z-score file as input to infer trajectories using `stream`:
 The above command will generate a file named **'zscore.tsv'**. It’s a tab-delimited z-score matrix with k-mers in rows and cells in columns. Each entry is a scaled z-score of the accessibility of each k-mer across cells.
-
-Second, take the z-score file as input to infer trajectories:
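
Since the z-score file is described above as a tab-delimited matrix with k-mers in rows and cells in columns, a quick shape check before running trajectory inference may be useful. The sketch below is a hypothetical helper, not a STREAM command. It assumes, without confirmation from the README, that the file has one header row of cell names and one leading column of k-mer names; the toy file and its values are made up for illustration.

```sh
#!/bin/sh
# Hypothetical helper: print "<n_kmers> <n_cells>" for a tab-delimited matrix.
# Assumption (not confirmed by the README): one header row of cell names
# and one leading column of k-mer names.
matrix_shape() {
    case $1 in
        *.gz) gzip -dc "$1" ;;
        *)    cat "$1" ;;
    esac | awk -F '\t' 'NR == 1 { cells = NF - 1 } END { print NR - 1, cells }'
}

# Toy matrix with made-up values: three 7-mers (rows) by two cells (columns).
printf '\tcell_1\tcell_2\nAAAAAAA\t0.5\t-1.2\nAAAAAAC\t1.1\t0.3\nAAAAAAG\t-0.7\t0.9\n' > toy_zscore.tsv
matrix_shape toy_zscore.tsv   # prints "3 2"
```

The same function accepts a gzip-compressed file such as 'zscore.tsv.gz', because names ending in `.gz` are decompressed on the fly.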