Commit 567fcd3

Merge pull request #87 from nasa/master
Updating to merge Master branch updates
2 parents 0b756c9 + c6c3ead commit 567fcd3

8 files changed: +68 -13 lines changed
Amplicon/Illumina/Pipeline_GL-DPPD-7104_Versions/GL-DPPD-7104-B.md

Lines changed: 11 additions & 9 deletions
@@ -41,6 +41,8 @@ Amanda Saravia-Butler (GeneLab Data Processing Lead)
 - The ITS UNITE reference database used was updated to "UNITE_v2023_July2023.RData", from http://www2.decipher.codes/Classification/TrainingSets/
 - Several program versions were updated (all versions listed in [Software used](#software-used) below)

+---
+
 # Table of contents

 - [Software used](#software-used)
@@ -818,8 +820,8 @@ ASV_physeq <- phyloseq(count_tab_phy, tax_tab_phy, sample_info_tab_phy)
 richness_and_diversity_estimates_by_sample <- plot_richness(ASV_physeq, color = "groups", measures = c("Chao1", "Shannon"))
 richness_and_diversity_estimates_by_group <- plot_richness(ASV_physeq, x = "groups", color = "groups", measures = c("Chao1", "Shannon"))

-ggsave(paste0("richness_and_diversity_estimates_by_sample_GLAmpSeq", ".png"), plot = richness_and_diversity_estimates_by_sample)
-ggsave(paste0("richness_and_diversity_estimates_by_group_GLAmpSeq", ".png"), plot = richness_and_diversity_estimates_by_group)
+ggsave(filename = "richness_and_diversity_estimates_by_sample_GLAmpSeq.png", plot = richness_and_diversity_estimates_by_sample)
+ggsave(filename = "richness_and_diversity_estimates_by_group_GLAmpSeq.png", plot = richness_and_diversity_estimates_by_group)
 ```

 **Parameter Definitions:**
@@ -861,10 +863,10 @@ relative_classes <- plot_bar(proportions_physeq, x = "groups", fill = "class")
 samplewise_phyla <- plot_bar(proportions_physeq, fill = "phylum")
 samplewise_classes <- plot_bar(proportions_physeq, fill = "class")

-ggsave(filename = "relative_phyla_GLAmpSeq", ".png", plot = relative_phyla)
-ggsave(filename = "relative_classes_GLAmpSeq", ".png", plot = relative_classes)
-ggsave(filename = "samplewise_relative_phyla_GLAmpSeq", ".png", plot = samplewise_phyla)
-ggsave(filename = "samplewise_relative_classes_GLAmpSeq", ".png", plot = samplewise_classes)
+ggsave(filename = "relative_phyla_GLAmpSeq.png", plot = relative_phyla)
+ggsave(filename = "relative_classes_GLAmpSeq.png", plot = relative_classes)
+ggsave(filename = "samplewise_relative_phyla_GLAmpSeq.png", plot = samplewise_phyla)
+ggsave(filename = "samplewise_relative_classes_GLAmpSeq.png", plot = samplewise_classes)
 ```

 **Input Data:**
@@ -967,8 +969,8 @@ ordination_plot_u <- plot_ordination(vst_physeq, vst_pcoa, color = "groups") +
 annotate("text", x = Inf, y = -Inf, label = paste("R2:", toString(round(r2_value, 3))), hjust = 1.1, vjust = -2, size = 4)+
 annotate("text", x = Inf, y = -Inf, label = paste("Pr(>F)", toString(round(prf_value,4))), hjust = 1.1, vjust = -0.5, size = 4)+ ggtitle("PCoA")

-ggsave(filename=paste0(beta_diversity_out_dir, output_prefix, "PCoA_w_labels_GLAmpSeq", ".png"), plot=ordination_plot)
-ggsave(filename=paste0(beta_diversity_out_dir, output_prefix, "PCoA_without_labels_GLAmpSeq", ".png"), plot=ordination_plot_u)
+ggsave(filename="PCoA_w_labels_GLAmpSeq.png", plot=ordination_plot)
+ggsave(filename="PCoA_without_labels_GLAmpSeq.png", plot=ordination_plot_u)

 ```

@@ -1009,7 +1011,7 @@ Run the DESeq() function to normalize for sample read-depth and composition, tra
 ```R
 deseq_modeled <- DESeq(deseq_obj)

-write.table(counts(deseq_modeled, normalized=TRUE), file = paste0("normalized_counts_GLAmpSeq.tsv"), sep="\t", row.names=TRUE, quote=FALSE)
+write.table(counts(deseq_modeled, normalized=TRUE), file = "normalized_counts_GLAmpSeq.tsv", sep="\t", row.names=TRUE, quote=FALSE)
 ```

 **Input Data:**

Amplicon/Illumina/Workflow_Documentation/README.md

Lines changed: 1 addition & 1 deletion
@@ -6,7 +6,7 @@

 |Pipeline Version|Current Workflow Version (for respective pipeline version)|
 |:---------------|:---------------------------------------------------------|
-|*[GL-DPPD-7104-B.md](../Pipeline_GL-DPPD-7104_Versions/GL-DPPD-7104-B.md)|[1.2.1](SW_AmpIllumina-B)|
+|*[GL-DPPD-7104-B.md](../Pipeline_GL-DPPD-7104_Versions/GL-DPPD-7104-B.md)|[1.2.2](SW_AmpIllumina-B)|
 |[GL-DPPD-7104-A.md](../Pipeline_GL-DPPD-7104_Versions/GL-DPPD-7104-A.md)|[1.1.1](SW_AmpIllumina-A)|

 *Current GeneLab Pipeline/Workflow Implementation

Amplicon/Illumina/Workflow_Documentation/SW_AmpIllumina-B/CHANGELOG.md

Lines changed: 8 additions & 0 deletions
@@ -1,5 +1,13 @@
 # Workflow change log

+## [1.2.2](https://github.com/nasa/GeneLab_Data_Processing/tree/SW_AmpIllumina-B_1.2.2/Amplicon/Illumina/Workflow_Documentation/SW_AmpIllumina-B)
+- Visualizations are now optional with the default being off.
+- Enable with optional `run_workflow.py` argument `--visualizations TRUE` or setting `config.yaml` `enable_visualizations` to "TRUE"
+- Added new directory `workflow_code/visualizations/`
+- Moved visualization script and conda environment config file to `workflow_code/visualizations/`
+- Added `workflow_code/visualizations/README.md` for instructions on running the visualization script manually
+- Refactored Snakefile outputs
+
 ## [1.2.1](https://github.com/nasa/GeneLab_Data_Processing/tree/SW_AmpIllumina-B_1.2.1/Amplicon/Illumina/Workflow_Documentation/SW_AmpIllumina-B)
 - Moved SW_AmpIllumina-A_1.2.1 to SW_AmpIllumina-B_1.2.1
 - Workflow runs the [GL-DPPD-7104-B version](../../Pipeline_GL-DPPD-7104_Versions/GL-DPPD-7104-B.md) of the GeneLab standard pipeline, which includes data visualization outputs
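
The 1.2.2 changelog entry names a `config.yaml` key, `enable_visualizations`, that switches the visualizations on. Below is a minimal sketch of flipping that key from the shell. It assumes the key sits on its own top-level line in the form `enable_visualizations: "FALSE"`; the real layout of the workflow's `config.yaml` is not shown in this diff, so that format and the temporary stand-in file are assumptions for illustration only.

```bash
set -eu

# Stand-in for the workflow's config.yaml (assumed layout, not copied from the repository)
config=$(mktemp)
printf 'enable_visualizations: "FALSE"\n' > "$config"

# Rewrite the whole line so the key ends up set to "TRUE";
# -i.bak keeps a backup and works with both GNU and BSD sed
sed -i.bak 's/^enable_visualizations:.*/enable_visualizations: "TRUE"/' "$config"

# Read the setting back to confirm the change
enabled_line=$(grep '^enable_visualizations:' "$config")
echo "$enabled_line"
```

The changelog also lists the `--visualizations TRUE` argument to `run_workflow.py` as an alternative way to enable them.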

GeneLab_Reference_Annotations/Pipeline_GL-DPPD-7110_Versions/GL-DPPD-7110/GL-DPPD-7110_annotations.csv

Lines changed: 1 addition & 0 deletions
@@ -16,3 +16,4 @@ YEAST,Saccharomyces cerevisiae,S288C,107,ensembl,http://ftp.ensembl.org/pub/rele
 STAA8,Staphylococcus aureus,UAMS-1,54,ensembl_bacteria,coming soon,coming soon,,,,
 ,Streptococcus mutans,UA159,54,ensembl_bacteria,coming soon,coming soon,,,,
 BRADI,Brachypodium distachyon,,54,ensembl_plants,https://ftp.ensemblgenomes.ebi.ac.uk/pub/plants/release-54/fasta/brachypodium_distachyon/dna/Brachypodium_distachyon.Brachypodium_distachyon_v3.0.dna.toplevel.fa.gz,https://ftp.ensemblgenomes.ebi.ac.uk/pub/plants/release-54/gtf/brachypodium_distachyon/Brachypodium_distachyon.Brachypodium_distachyon_v3.0.54.gtf.gz,15368,,,
+ORYSJ,Oryza sativa,Japonica,54,ensembl_plants,https://ftp.ensemblgenomes.ebi.ac.uk/pub/plants/release-54/fasta/oryza_sativa/dna/Oryza_sativa.IRGSP-1.0.dna.toplevel.fa.gz,https://ftp.ensemblgenomes.ebi.ac.uk/pub/plants/release-54/gtf/oryza_sativa/Oryza_sativa.IRGSP-1.0.54.gtf.gz,4530,BSgenome.Osativa.MSU.MSU7,,

Metagenomics/Illumina/Pipeline_GL-DPPD-7107_Versions/GL-DPPD-7107.md

Lines changed: 15 additions & 3 deletions
@@ -772,6 +772,10 @@ bit-GL-combine-KO-and-tax-tables *-gene-coverage-annotation-and-tax.tsv -o Combi
 jgi_summarize_bam_contig_depths --outputDepth sample-1-metabat-assembly-depth.tsv --percentIdentity 97 --minContigLength 1000 --minContigDepth 1.0 --referenceFasta sample-1-assembly.fasta sample-1.bam

 metabat2 --inFile sample-1-assembly.fasta --outFile sample-1 --abdFile sample-1-metabat-assembly-depth.tsv -t 4
+
+mkdir sample-1-bins
+mv sample-1*bin*.fasta sample-1-bins
+zip -r sample-1-bins.zip sample-1-bins
 ```

 **Parameter Definitions:**
@@ -796,7 +800,8 @@ metabat2 --inFile sample-1-assembly.fasta --outFile sample-1 --abdFile sample-1
 **Output data:**

 * **sample-1-metabat-assembly-depth.tsv** (tab-delimited summary of coverages)
-* **sample-1-bin\*.fasta** (fasta files of recovered bins)
+* sample-1-bins/sample-1-bin\*.fasta (fasta files of recovered bins)
+* **sample-1-bins.zip** (zip file containing fasta files of recovered bins)

 #### 14b. Bin quality assessment
 Utilizes the default `checkm` database available [here](https://data.ace.uq.edu.au/public/CheckM_databases/checkm_data_2015_01_16.tar.gz), `checkm_data_2015_01_16.tar.gz`.
@@ -839,6 +844,13 @@ do
 MAG_ID=$(echo $ID | sed 's/bin./MAG-/')
 cp ${ID}.fasta MAGs/${MAG_ID}.fasta
 done
+
+for SAMPLE in $(cat MAG-bin-IDs.tmp | sed 's/-bin.*//' | sort -u);
+do
+mkdir ${SAMPLE}-MAGs
+mv ${SAMPLE}-*MAG*.fasta ${SAMPLE}-MAGs
+zip -r ${SAMPLE}-MAGs.zip ${SAMPLE}-MAGs
+done
 ```

 **Input data:**
@@ -848,8 +860,8 @@ done
 **Output data:**

 * checkm-MAGs-overview.tsv (tab-delimited file with quality estimates per MAG)
-* **MAGs/\*.fasta** (directory holding high-quality MAGs)
-
+* MAGs/\*.fasta (directory holding high-quality MAGs)
+* **\*-MAGs.zip** (zip files containing directories of high-quality MAGs)


 #### 14d. MAG taxonomic classification
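
The MAG-filtering hunk of this file relies on two `sed` expressions: `s/bin./MAG-/` to rename a bin ID to a MAG ID, and `s/-bin.*//` followed by `sort -u` to derive the unique sample names for the per-sample zip loop. The sketch below runs both on a small stand-in `MAG-bin-IDs.tmp`. The three bin IDs are hypothetical examples in the `sample-N-bin.M` style used elsewhere in the document, not real pipeline output.

```bash
set -eu

# Work in a scratch directory so nothing in the real analysis folder is touched
workdir=$(mktemp -d)

# Hypothetical bin IDs, one per line, standing in for MAG-bin-IDs.tmp
printf '%s\n' sample-1-bin.1 sample-1-bin.4 sample-2-bin.2 > "$workdir/MAG-bin-IDs.tmp"

# Same substitution as the cp step: "bin" plus the following character becomes "MAG-"
mag_ids=$(sed 's/bin./MAG-/' "$workdir/MAG-bin-IDs.tmp")

# Same derivation as the new for loop: strip from "-bin" onward, then de-duplicate
samples=$(sed 's/-bin.*//' "$workdir/MAG-bin-IDs.tmp" | sort -u)

echo "$mag_ids"
echo "$samples"
```

With these inputs, two bins from the same sample collapse to a single sample name, so the loop would create one `-MAGs` directory and one zip per sample, not one per bin.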
@@ -0,0 +1,10 @@
1+
speciesType,tissueType,cellName,geneSymbol
2+
Mouse,Heart,Fibroblast,"Thy1, Col1a2, Col3a1, Fbln2, Fstl1, Gsn, Mmp2, Sparc, Vim, Itgae, Il1r1, Pdgfra, Pdgfrb, Itgb1, Cd47, Cd81, Lrp1, 1810037I17Rik, Pclaf, Dmac1, ACAP2, SLF1, ANP32E, ARL6IP6, ARMCX3, Atad2, Birc5, Bmp5, Bub1, Bud31, Capn6, Ccdc34, Ccna2, Ccnb1, Ccnb2, Cdc7, Cdca3, Cenpa, Cenpe, Cenpf, Cep57, Cks2, Cnot6, Csrp2, Dctpp1, Ddah2, Dhx9, Enc1, Epha3, Evi5, Exosc8, Tcaf1, Fam173a, Pimreg, Fbln1, Fgf7, Fkbp7, Gnpat, Gps1, Gstm5, H2afv, H2afx, H2afz, Hist1h2ab, Hmgb1, Hmgb2, Hnrnpa3, Hoxb5, Htatsf1, Iigp1, Kif15, Kif20b, Knstrn, Mad2l1, Mcm4, Mdk, Mfap2, Mis18bp1, Mki67, Mrpl27, Mrpl39, Mrpl49, Mrto4, Mycbp, Naa50, Ncapg, Ndufs5, Nexn, Nrep, Nusap1, Pbk, Pmf1, Prim1, Psmd6, Ptch1, Rab34, Rfc4, Rhno1, Rn7sk, Rpa1, NA, Rrm2, Scnm1, Sdc2, Shcbp1, Sin3a, Ska2, Smc2, Smchd1, Stmn1, Tacc3, Taf12, Tbpl1, Tbx3, Tk1, Tmed1, Tnc, Top2a, Tpx2, Trp53bp1, Tshz1, Tuba1a, Tuba1b, Tubb5, Ube2c, Uqcr10, Utp11, Zranb1, Ckap4, Abi3, Ace, Adgre4, Ap1s2, Arl5c, Ccdc88a, Ccr2, Cd209a, Cd300a, Cd300e, Chil3, Clec10a, Clec12a, Clec4a1, Clec4a3, Cytip, Ear2, Eno3, F13a1, Fam49a, Fam49b, Fgr, Filip1l, Flna, Fn1, G6pdx, Gm2a, Gngt2, Gpr141, Hck, Hfe, Ifi27l2a, Ifi30, Ifitm3, Irf7, Itgal, Lgals3, Lst1, Ly6i, Lyn, Lyz2, Mob1a, Mpeg1, Ms4a4c, Ms4a6c, Ms4a6d, Msn, Naaa, Nadk, Napsa, Nr4a1, Pip4k2a, Pitpna, Plac8, Plbd1, Prkcd, Psap, Psma7, Ptpn1, Ptpn6, Rap1a, S100a4, Samhd1, Sat1, Serpinb10, Stk10, Tgfb1, Tgfbi, Tkt, Tmcc1, Tnfrsf1b, Tppp3, Treml4, Ucp2, Col5a2, Thy1"
3+
Mouse,Heart,Cardiomyocyte,"Tnnc1, Acta1, Actc1, Atp2a2, Myh6, Nppa, Ryr2, Tnnc1, NA, Tnnt2, Actc1, Actn2, Gja1, Hand2, Tnnt2, Vcam1, Gja1, Alcam, Itga1, Itga5, Itga6, Cdh2, Bmp4, Emcn, Fbn1, Gata4, Hand1, Mef2c, Myl4, Neb, Nid1, Tbx20, Tbx3, Vim"
4+
Mouse,Heart,Cardiac progenitor cell,"Hcn4, Alcam, Isl1, Kit"
5+
Mouse,Heart,Sinoatrial node cell,Isl1
6+
Mouse,Heart,Atrial cell,"Itga1, Itga5, Itga6"
7+
Mouse,Heart,Ventricular compact cell,"Itga1, Itga5, Itga6"
8+
Mouse,Heart,Ventricular trabecular cell,"Itga1, Itga5, Itga6"
9+
Mouse,Heart,Endothelial cell,"Eng, Itgam, Pecam1, Ptprc, Flt1, Kdr, Pecam1, Rgs5, Ednrb, Egfl7, Emcn, Epas1, Fabp4, Tie1, Egfl7, Cd34, Eng, Tek, Cdh5, Nos3, Plvap, Cdh5, Kdr, Fcgr2b, Lyve1, Icam2, Vcam1, Ifngr1, Il3ra, Anpep, Cxcr2, Tlr2, Cd34, Cd44, Itga2, Icam1, Sell, Sele, Selp, Cd93, Ly6c1, MHC class II, Pdpn, Flt4, Mcam, Itgb3, 2810025M15Rik, 4931406P16Rik, Jcad, Ablim1, NA, Ace, Acvrl1, Adam15, Agfg1, Ahr, Ankrd37, Aplnr, Arap2, Arap3, Arhgap18, Arhgap31, Arhgef15, BC028528, Bcl6b, Calcrl, Casz1, Cav1, Cav2, Cd200, Cd36, Adgre5, Cldn5, Clec14a, Clec1a, Clec9a, Cobll1, Col4a1, Copg2, Cpne8, Crip2, Ctla2a, Rtl8a, Rtl8b, Dpysl3, Ecscr, Edil3, Efna1, Elk3, Ephb4, Esam, Ets1, F11r, Fam198b, Fgd5, Fkbp1a, Fli1, Fxyd5, Gap43, Gata2, Gchfr, Gimap1, Gimap4, Gimap5, Gimap6, Gmfg, Gpihbp1, Adgrf5, Grap, Guk1, Hey1, Ica1, Icam2, Il2rg, Itga6, Jup, Kctd12b, Kit, Klhl4, Ldb2, Lmo2, Luzp1, Lxn, Ly6e, Mest, Mfng, Mid2, Afdn, Mmrn2, Myct1, Myzap, Nos3, Nostrin, Npdc1, Nup210, Pcdh12, Pcdh17, Pde4b, Piezo2, Plekha1, Ppm1f, Ppp1r16b, Prkch, Ptprb, Ptprm, Cavin1, Ramp2, Rasgrp3, Rasip1, Rgcc, Rhoj, Rnf122, S100a10, S100a16, Scarb1, Scarf1, Scn3b, Scn7a, Sgk1, Slc43a3, Slc9a3r2, Smad1, Smtn, Snrk, Srgn, Stab2, Stard4, Thsd1, Tjp1, Tm4sf1, Tmem2, Tmem88, Tnfaip8l1, Trp53i11, Tspan13, Txnip, Vim, Clic5, Cyyr1, Edn1, Tspan7, Acta2, Actg2, Igfbp7, Myl9, Sparcl1, Tagln, Aplnr, Aqp1, Fbln2, Pitp family, Cavin2, Sox17, Vwf, Fn1, Lyve1, Sox18, Abcb1a, Apcdd1, Ccdc141, Cgnl1, Adgrl4, Erg, Fzd6, Hmcn1, Hspg2, Lmntd1, Itga1, Itga4, Lama4, Ly75, Mecom, Mfsd2a, Rassf9, Rbpms, Rgs5, Slc16a1, Slc22a8, Slc40a1, Slc7a1, Slc7a5, Slco1a4, Slco1c1, Slco2b1, Sparc, St3gal6, Wwtr1, Zfp366, Bmp2, C1qtnf1, Dpp4, F8, Il1a, Myf6, Oit3, Ushbp1"
10+
Mouse,Heart,Macrophage,"Fcgr3, Mrc1, Fcer2a, Ly6c1, Ly6g, NA, Csf1r, Pdcd1lg2, macrophage galactose-type C-type lectin family, Adgre1, Dock2, Cd163, Cd68, Cd74, Adgre1, Itgam, Lgals3, Lyz1, Cd68, Itgax, Itgam, Mrc1, Cd40, Nos2, Cd200r1, Kit, Prom1, Pecam1, F8, S100 family, Kdr, Itgax, Arg1, Il10, VEGF family, Chil3, MHC class II, Fcgr1, Cd19, Cd3d, Cd3e, Cd3g, Klrb1c, Ptprc, Mertk, Axl, Lamp1, Lamp2, Csf1r, Fcgr3, Tlr2, Tlr4, Cdh1, Cd40, Tnfsf18, Gpnmb, MHC Class II, cd68, Cd14, C1qa, C1qb, C1qc, Ccl12, Cd209f, Clec4n, Coro1a, Ctss, Cxcl2, F13a1, Folr2, Gatm, Gmfg, Lsp1, Lyz2, Ms4a7, Pf4, Pld4, Rgs1, Srgn, Tyrobp, Unc93b1, Abhd12, Adap2, Adrb1, Aif1, Apobec1, Apoe, Asah1, Axl, Basp1, Bcl2a1b, C1qa, Cadm1, Cd72, Cd83, Cd86, Cdk6, Cfh, Clec4a2, Clec4b1, Col14a1, Creb5, Cst3, Ctsc, Ctsh, Cx3cr1, Cxcl16, Cyth4, Ebi3, Fcgr2b, Fcgr4, Fgd2, Fnbp1, Fyb, Gm4951, Gpr65, Gusb, H2-Aa, H2-Ab1, H2-DMa, H2-DMb1, H2-Eb1, Hpgds, Hspa1a, Ifnar1, Inpp5d, Itm2c, Lacc1, Lair1, Lat2, Lilra5, Lpar6, Ly86, Marcks, Mgl2, Mknk1, Mmp13, Ntpcr, P2ry6, Parp14, Pea15a, Pid1, Ppt1, Rab7b, Rbpj, Rgs10, Rtn4, Runx1, Scimp, Sdc3, Sft2d1, Slamf9, Slco2b1, Stab1, Stap1, Stx16, Taok3, Tgfbr1, Tmcc3, Tmem176b, Trf, Vcam1, Vsir, Zmynd15, C1qb, Basp1, Bst2, Cd5l, Cd79b, Gzma, S100a4, Lrmda, Abca9, Aoah, Arhgap15, Bank1, Cd84, Cmah, Colec12, Ctsb, Dab2, Dclre1c, Ddx60, Dock2, Dock8, Ehd4, Fli1, Gas6, Ikzf1, Lgmn, Lyn, Maf, P2rx7, P2ry12, Pik3ap1, Pnpla7, Rbm47, Rreb1, Sirpa, Slc9a9, Snx2, Stard8, Tbxas1, Tgfbr2, Tnfaip8, Tnfrsf11a, Wdfy4, Zfp710"