Dev gene lab reference annotations v gl dppd 7110 a #131

Merged
64 commits
3f731d5
Create GL-DPPD-7110-A.md
asaravia-butler Apr 25, 2024
98ff3e3
Add files via upload
asaravia-butler Apr 25, 2024
b6d408a
Updating to point to pipeline version A
asaravia-butler Apr 25, 2024
9123f89
[GL_RefAnnotTable] Added rat links and annotation table
torres-alexis May 24, 2024
8d6f239
[GL_RefAnnotTable] Updated Reference annotations CSV
torres-alexis Jun 2, 2024
e27b4c7
[GL_RefAnnotTable] GL_RefAnnotTable-A 1.1.0
torres-alexis Jul 12, 2024
e22158d
[GL_RefAnnotTable] Initial microbes updates
torres-alexis Aug 4, 2024
f6154f7
[GL_RefAnnotTable] GL_RefAnnotTable-A 1.1.0
torres-alexis Aug 12, 2024
9b2b943
[GL_RefAnnotTable] Added makeOrgPackageFromNCBI to DPPD doc
torres-alexis Sep 3, 2024
0020c19
[GL_RefAnnotTable] Adjust DPPD doc
torres-alexis Sep 3, 2024
3dc8f2c
[GL_RefAnnotTable] Change input arg to full name
torres-alexis Sep 4, 2024
97d5fef
Merge pull request #110 from torres-alexis/DEV_GeneLab_Reference_Anno…
asaravia-butler Sep 5, 2024
eb93b05
Adding missing updates
asaravia-butler Sep 5, 2024
bbf7a78
Updating install and run instructions.
asaravia-butler Sep 5, 2024
7c011a2
[GL_RefAnnotTable] Misc fixes
torres-alexis Sep 6, 2024
9fd9fb7
[GL_RefAnnotTable] Typo fixes
torres-alexis Sep 6, 2024
8050e32
[GL_RefAnnotTable] Add database versions, fix go.db version
torres-alexis Sep 6, 2024
d910e55
[GL_RefAnnotTable] Add go.db info
torres-alexis Sep 6, 2024
81d06dd
[GL_RefAnnotTable] Move panther note line
torres-alexis Sep 6, 2024
c72d4bb
Merge pull request #118 from torres-alexis/DEV_GeneLab_Reference_Anno…
asaravia-butler Sep 11, 2024
cc11ff9
Specify DB versions used
asaravia-butler Sep 11, 2024
6299719
Input output updates, remove unnecessary variables
asaravia-butler Sep 11, 2024
8bae3a5
Removed target_species_designation variable
asaravia-butler Sep 11, 2024
c3f621b
[GL_RefAnnotTable] Typo fixes
torres-alexis Sep 11, 2024
65f04bd
Merge pull request #122 from torres-alexis/DEV_GeneLab_Reference_Anno…
asaravia-butler Sep 11, 2024
7228880
Typo fix
asaravia-butler Sep 16, 2024
2749fe5
[GL_RefAnnotTable] Fix R packages, add docker instructions
torres-alexis Sep 16, 2024
c72d704
[GL_RefAnnotTable] Typo fixes
torres-alexis Sep 16, 2024
fab25b4
[GL_RefAnnotTable] Typo fixes
torres-alexis Sep 16, 2024
3e3dec6
[GL_RefAnnotTable] Add Docker/Singularity, fix R lib
torres-alexis Sep 16, 2024
f9c4f03
[GL_RefAnnotTable] Update docker image
torres-alexis Sep 17, 2024
ffec6ae
[GL_RefAnnotTable] Typo fixes
torres-alexis Sep 17, 2024
c4acfad
[GL_RefAnnotTable] Readd comment
torres-alexis Sep 17, 2024
51570c4
[GL_RefAnnotTable] Typo fix
torres-alexis Sep 17, 2024
40e3652
Merge pull request #123 from torres-alexis/DEV_GeneLab_Reference_Anno…
asaravia-butler Oct 1, 2024
4f181bf
Refactor instructions for singularity use
torres-alexis Oct 1, 2024
232e421
[GL_RefAnnotTable] add container + local instructions
torres-alexis Oct 2, 2024
99daa55
[GL_RefAnnotTable] Fix typos
torres-alexis Oct 2, 2024
12b0587
[GL_RefAnnotTable] Fix interactive install-org-db
torres-alexis Oct 2, 2024
86814bd
[GL_RefAnnotTable] switch from apptainer to singularity
torres-alexis Oct 10, 2024
d4b1c09
fix typo
torres-alexis Oct 10, 2024
f839222
[GL_RefAnnotTable] fix .img image name
torres-alexis Oct 10, 2024
0338c93
Merge pull request #125 from torres-alexis/DEV_GeneLab_Reference_Anno…
asaravia-butler Oct 22, 2024
bd917b4
Formatting updates
asaravia-butler Oct 22, 2024
499538d
Formatting updates
asaravia-butler Oct 22, 2024
75dd660
Typo and link fixes
asaravia-butler Oct 22, 2024
f54a529
Formatting updates
asaravia-butler Oct 23, 2024
f015225
remove lib path
torres-alexis Oct 23, 2024
fce6a73
Update GL-DPPD-7110-A_build-genome-annots-tab.R
torres-alexis Oct 23, 2024
3cc61cf
Add possible paths to install-org-db execution function
torres-alexis Oct 23, 2024
8619121
Update GL-DPPD-7110-A_build-genome-annots-tab.R
torres-alexis Oct 23, 2024
8bbf66d
Update GL-DPPD-7110-A.md
torres-alexis Oct 23, 2024
4bec193
Update GL-DPPD-7110-A_build-genome-annots-tab.R
torres-alexis Oct 24, 2024
6da57fa
Updating signature matrix
asaravia-butler Oct 24, 2024
e3dfb4b
Formatting updates
asaravia-butler Oct 24, 2024
d1ea649
remove custom org dbs from annotation table
torres-alexis Oct 29, 2024
f06cf14
move timeout to top of scripts, add to readme
torres-alexis Oct 30, 2024
5088539
add no-home + bind local path to same container path
torres-alexis Oct 30, 2024
6368381
add cols bioconductor_annotations, custom_annotations, change dppd va…
torres-alexis Oct 31, 2024
b39c63c
remove --no-home from readme
torres-alexis Oct 31, 2024
dcdf589
Add r_libs to scrips, readme, standardize notes
torres-alexis Oct 31, 2024
ef2958f
Merge pull request #128 from torres-alexis/DEV_GeneLab_Reference_Anno…
bnovak32 Nov 5, 2024
283c0ab
Update README.md
bnovak32 Nov 8, 2024
c4b100e
Update README.md
bnovak32 Nov 8, 2024

@@ -0,0 +1,24 @@
name,species,strain,ensemblVersion,ref_source,fasta,gtf,taxon,bioconductor_annotations,custom_annotations,genelab_annots_link,genelab_annots_info_link
ARABIDOPSIS,Arabidopsis thaliana,,59,ensembl_plants,https://ftp.ensemblgenomes.ebi.ac.uk/pub/plants/release-59/fasta/arabidopsis_thaliana/dna/Arabidopsis_thaliana.TAIR10.dna.toplevel.fa.gz,https://ftp.ensemblgenomes.ebi.ac.uk/pub/plants/release-59/gtf/arabidopsis_thaliana/Arabidopsis_thaliana.TAIR10.59.gtf.gz,3702,org.At.tair.db,,https://figshare.com/ndownloader/files/48354355,https://figshare.com/ndownloader/files/48354352
BACSU,Bacillus subtilis,subsp. subtilis 168,59,ensembl_bacteria,https://ftp.ensemblgenomes.ebi.ac.uk/pub/bacteria/release-59/fasta/bacteria_0_collection/bacillus_subtilis_subsp_subtilis_str_168_gca_000009045/dna/Bacillus_subtilis_subsp_subtilis_str_168_gca_000009045.ASM904v1.dna.toplevel.fa.gz,https://ftp.ensemblgenomes.ebi.ac.uk/pub/bacteria/release-59/gtf/bacteria_0_collection/bacillus_subtilis_subsp_subtilis_str_168_gca_000009045/Bacillus_subtilis_subsp_subtilis_str_168_gca_000009045.ASM904v1.59.gtf.gz,224308,,org.Bsubtilissubspsubtilis168.eg.db,https://figshare.com/ndownloader/files/48354346,https://figshare.com/ndownloader/files/48354349
BRADI,Brachypodium distachyon,,59,ensembl_plants,https://ftp.ensemblgenomes.ebi.ac.uk/pub/plants/release-59/fasta/brachypodium_distachyon/dna/Brachypodium_distachyon.Brachypodium_distachyon_v3.0.dna.toplevel.fa.gz,https://ftp.ensemblgenomes.ebi.ac.uk/pub/plants/release-59/gtf/brachypodium_distachyon/Brachypodium_distachyon.Brachypodium_distachyon_v3.0.59.gtf.gz,15368,,org.Bdistachyon.eg.db,https://figshare.com/ndownloader/files/48354370,https://figshare.com/ndownloader/files/48354361
BRARP,Brassica rapa,,59,ensembl_plants,http://ftp.ensemblgenomes.org/pub/plants/release-59/fasta/brassica_rapa/dna/Brassica_rapa.Brapa_1.0.dna.toplevel.fa.gz,http://ftp.ensemblgenomes.org/pub/plants/release-59/gtf/brassica_rapa/Brassica_rapa.Brapa_1.0.59.gtf.gz,,,,,
WORM,Caenorhabditis elegans,,112,ensembl,https://ftp.ensembl.org/pub/release-112/fasta/caenorhabditis_elegans/dna/Caenorhabditis_elegans.WBcel235.dna.toplevel.fa.gz,https://ftp.ensembl.org/pub/release-112/gtf/caenorhabditis_elegans/Caenorhabditis_elegans.WBcel235.112.gtf.gz,6239,org.Ce.eg.db,,https://figshare.com/ndownloader/files/48354373,https://figshare.com/ndownloader/files/48354364
ZEBRAFISH,Danio rerio,,112,ensembl,http://ftp.ensembl.org/pub/release-112/fasta/danio_rerio/dna/Danio_rerio.GRCz11.dna.primary_assembly.fa.gz,http://ftp.ensembl.org/pub/release-112/gtf/danio_rerio/Danio_rerio.GRCz11.112.gtf.gz,7955,org.Dr.eg.db,,https://figshare.com/ndownloader/files/48354388,https://figshare.com/ndownloader/files/48354367
FLY,Drosophila melanogaster,,112,ensembl,http://ftp.ensembl.org/pub/release-112/fasta/drosophila_melanogaster/dna/Drosophila_melanogaster.BDGP6.46.dna.toplevel.fa.gz,http://ftp.ensembl.org/pub/release-112/gtf/drosophila_melanogaster/Drosophila_melanogaster.BDGP6.46.112.gtf.gz,7227,org.Dm.eg.db,,https://figshare.com/ndownloader/files/48354382,https://figshare.com/ndownloader/files/48354376
ERCC,,,,ThermoFisher,https://assets.thermofisher.com/TFS-Assets/LSG/manuals/ERCC92.zip,https://assets.thermofisher.com/TFS-Assets/LSG/manuals/ERCC92.zip,,,,,
ECOLI,Escherichia coli,str. K-12 substr. MG1655,59,ensembl_bacteria,https://ftp.ensemblgenomes.ebi.ac.uk/pub/bacteria/release-59/fasta/bacteria_0_collection/escherichia_coli_str_k_12_substr_mg1655_gca_000005845/dna/Escherichia_coli_str_k_12_substr_mg1655_gca_000005845.ASM584v2.dna.toplevel.fa.gz,https://ftp.ensemblgenomes.ebi.ac.uk/pub/bacteria/release-59/gtf/bacteria_0_collection/escherichia_coli_str_k_12_substr_mg1655_gca_000005845/Escherichia_coli_str_k_12_substr_mg1655_gca_000005845.ASM584v2.59.gtf.gz,511145,,org.EcolistrK12substrMG1655.eg.db,https://figshare.com/ndownloader/files/48354379,https://figshare.com/ndownloader/files/48354394
HUMAN,Homo sapiens,,112,ensembl,https://ftp.ensembl.org/pub/release-112/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz,https://ftp.ensembl.org/pub/release-112/gtf/homo_sapiens/Homo_sapiens.GRCh38.112.gtf.gz,9606,org.Hs.eg.db,,https://figshare.com/ndownloader/files/48354445,https://figshare.com/ndownloader/files/48354448
,Lactobacillus acidophilus,NCFM,,ncbi,https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/011/985/GCF_000011985.1_ASM1198v1/GCF_000011985.1_ASM1198v1_genomic.fna.gz,https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/011/985/GCF_000011985.1_ASM1198v1/GCF_000011985.1_ASM1198v1_genomic.gtf.gz,272621,,,https://figshare.com/ndownloader/files/49061254,https://figshare.com/ndownloader/files/49061257
MOUSE,Mus musculus,,112,ensembl,https://ftp.ensembl.org/pub/release-112/fasta/mus_musculus/dna/Mus_musculus.GRCm39.dna.primary_assembly.fa.gz,https://ftp.ensembl.org/pub/release-112/gtf/mus_musculus/Mus_musculus.GRCm39.112.gtf.gz,10090,org.Mm.eg.db,,https://figshare.com/ndownloader/files/48354460,https://figshare.com/ndownloader/files/48354457
,Mycobacterium marinum,M,,ncbi,https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/018/345/GCF_000018345.1_ASM1834v1/GCF_000018345.1_ASM1834v1_genomic.fna.gz,https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/018/345/GCF_000018345.1_ASM1834v1/GCF_000018345.1_ASM1834v1_genomic.gtf.gz,216594,,,https://figshare.com/ndownloader/files/49061260,https://figshare.com/ndownloader/files/49061263
ORYSJ,Oryza sativa,Japonica,59,ensembl_plants,https://ftp.ensemblgenomes.ebi.ac.uk/pub/plants/release-59/fasta/oryza_sativa/dna/Oryza_sativa.IRGSP-1.0.dna.toplevel.fa.gz,https://ftp.ensemblgenomes.ebi.ac.uk/pub/plants/release-59/gtf/oryza_sativa/Oryza_sativa.IRGSP-1.0.59.gtf.gz,39947,,,https://figshare.com/ndownloader/files/48354451,https://figshare.com/ndownloader/files/48354454
ORYLA,Oryzias latipes,,112,ensembl,http://ftp.ensembl.org/pub/release-112/fasta/oryzias_latipes/dna/Oryzias_latipes.ASM223467v1.dna.toplevel.fa.gz,http://ftp.ensembl.org/pub/release-112/gtf/oryzias_latipes/Oryzias_latipes.ASM223467v1.112.gtf.gz,8090,,org.Olatipes.eg.db,https://figshare.com/ndownloader/files/48354463,https://figshare.com/ndownloader/files/48354466
,Pseudomonas aeruginosa,UCBPP-PA14,,ncbi,https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/014/625/GCF_000014625.1_ASM1462v1/GCF_000014625.1_ASM1462v1_genomic.fna.gz,https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/014/625/GCF_000014625.1_ASM1462v1/GCF_000014625.1_ASM1462v1_genomic.gtf.gz,208963,,,https://figshare.com/ndownloader/files/49061266,https://figshare.com/ndownloader/files/49061269
RAT,Rattus norvegicus,,112,ensembl,http://ftp.ensembl.org/pub/release-112/fasta/rattus_norvegicus/dna/Rattus_norvegicus.mRatBN7.2.dna.toplevel.fa.gz,http://ftp.ensembl.org/pub/release-112/gtf/rattus_norvegicus/Rattus_norvegicus.mRatBN7.2.112.gtf.gz,10116,org.Rn.eg.db,,https://figshare.com/ndownloader/files/48354472,https://figshare.com/ndownloader/files/48354475
YEAST,Saccharomyces cerevisiae,S288C,112,ensembl,http://ftp.ensembl.org/pub/release-112/fasta/saccharomyces_cerevisiae/dna/Saccharomyces_cerevisiae.R64-1-1.dna.toplevel.fa.gz,http://ftp.ensembl.org/pub/release-112/gtf/saccharomyces_cerevisiae/Saccharomyces_cerevisiae.R64-1-1.112.gtf.gz,559292,org.Sc.sgd.db,,https://figshare.com/ndownloader/files/48354469,https://figshare.com/ndownloader/files/48354478
SALTY,Salmonella enterica,serovar Typhimurium str. LT2,,ncbi,https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/006/945/GCF_000006945.2_ASM694v2/GCF_000006945.2_ASM694v2_genomic.fna.gz,https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/006/945/GCF_000006945.2_ASM694v2/GCF_000006945.2_ASM694v2_genomic.gtf.gz,99287,,org.SentericaserovarTyphimuriumstrLT2.eg.db,https://figshare.com/ndownloader/files/49061272,https://figshare.com/ndownloader/files/49061275
,Serratia liquefaciens,ATCC 27592,,ncbi,https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/422/085/GCF_000422085.1_ASM42208v1/GCF_000422085.1_ASM42208v1_genomic.fna.gz,https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/422/085/GCF_000422085.1_ASM42208v1/GCF_000422085.1_ASM42208v1_genomic.gtf.gz,1346614,,,https://figshare.com/ndownloader/files/49061278,https://figshare.com/ndownloader/files/49061281
,Staphylococcus aureus,MRSA252,,ncbi,https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/011/505/GCF_000011505.1_ASM1150v1/GCF_000011505.1_ASM1150v1_genomic.fna.gz,https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/011/505/GCF_000011505.1_ASM1150v1/GCF_000011505.1_ASM1150v1_genomic.gtf.gz,282458,,,https://figshare.com/ndownloader/files/49061284,https://figshare.com/ndownloader/files/49061287
,Streptococcus mutans,UA159,,ncbi,https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/007/465/GCF_000007465.2_ASM746v2/GCF_000007465.2_ASM746v2_genomic.fna.gz,https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/007/465/GCF_000007465.2_ASM746v2/GCF_000007465.2_ASM746v2_genomic.gtf.gz,210007,,,https://figshare.com/ndownloader/files/49061290,https://figshare.com/ndownloader/files/49061293
,Vibrio fischeri,ES114,,ncbi,https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/011/805/GCF_000011805.1_ASM1180v1/GCF_000011805.1_ASM1180v1_genomic.fna.gz,https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/011/805/GCF_000011805.1_ASM1180v1/GCF_000011805.1_ASM1180v1_genomic.gtf.gz,312309,,,https://figshare.com/ndownloader/files/49061296,https://figshare.com/ndownloader/files/49061299
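The table above separates annotation packages into `bioconductor_annotations` and `custom_annotations`. The sketch below shows one way a reader might classify each organism by where its org db comes from. It assumes that a non-empty `bioconductor_annotations` value means the package is installed from Bioconductor and a non-empty `custom_annotations` value means it is built locally; that reading is inferred from the column names and the changelog, not confirmed by the pipeline code. The excerpt omits the URL columns for brevity, and `annotation_source` and `classify` are hypothetical helper names.

```python
import csv
import io

# Three rows excerpted from the table above; URL columns are omitted for brevity.
TABLE_EXCERPT = """\
name,species,strain,taxon,bioconductor_annotations,custom_annotations
MOUSE,Mus musculus,,10090,org.Mm.eg.db,
ECOLI,Escherichia coli,str. K-12 substr. MG1655,511145,,org.EcolistrK12substrMG1655.eg.db
,Vibrio fischeri,ES114,312309,,
"""


def annotation_source(row):
    """Return (kind, package) for one table row.

    Assumption: Bioconductor takes precedence when both columns are filled.
    """
    if row["bioconductor_annotations"]:
        return ("bioconductor", row["bioconductor_annotations"])
    if row["custom_annotations"]:
        return ("custom", row["custom_annotations"])
    return ("none", None)


def classify(csv_text):
    """Map each species in the CSV text to its annotation source."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["species"]: annotation_source(row) for row in reader}


sources = classify(TABLE_EXCERPT)
for species, (kind, package) in sources.items():
    print(species, kind, package)
```

Rows with neither column filled (for example the NCBI-sourced microbes without an org db) fall into the `none` bucket.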
7 changes: 5 additions & 2 deletions GeneLab_Reference_Annotations/README.md
@@ -1,6 +1,6 @@
# GeneLab pipeline for generating reference annotation tables

-> **The document [`GL-DPPD-7110.md`](Pipeline_GL-DPPD-7110_Versions/GL-DPPD-7110/GL-DPPD-7110.md) holds an overview and example commands for how GeneLab generates reference annotation tables. See the [Repository Links](#repository-links) descriptions below for more information.**
+> **The document [`GL-DPPD-7110-A.md`](Pipeline_GL-DPPD-7110_Versions/GL-DPPD-7110-A/GL-DPPD-7110-A.md) holds an overview and example commands for how GeneLab generates reference annotation tables. See the [Repository Links](#repository-links) descriptions below for more information.**

---
## Repository Links
@@ -17,6 +17,9 @@

---

-**Developed and maintained by:**
+**Developed by:**
+Mike Lee
+
+**Maintained by:**
Alexis Torres
Crystal Han
@@ -0,0 +1,62 @@
# Changelog

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [1.1.0](https://github.com/nasa/GeneLab_Data_Processing/blob/DEV_GeneLab_Reference_Annotations_vGL-DPPD-7110-A/GeneLab_Reference_Annotations/Workflow_Documentation/GL_RefAnnotTable-A)

### Added

- Added software:
  - AnnotationForge version 1.46.0
  - biomaRt version 2.60.1
  - GO.db version 3.19.1
- Added support for:
  - Bacillus subtilis, subsp. subtilis 168
  - Brachypodium distachyon
  - Escherichia coli, str. K-12 substr. MG1655
  - Oryzias latipes
  - Lactobacillus acidophilus NCFM
  - Mycobacterium marinum M
  - Oryza sativa Japonica
  - Pseudomonas aeruginosa UCBPP-PA14
  - Salmonella enterica subsp. enterica serovar Typhimurium str. LT2
  - Serratia liquefaciens ATCC 27592
  - Staphylococcus aureus MRSA252
  - Streptococcus mutans UA159
  - Vibrio fischeri ES114
- Added AnnotationForge helper script install-org-db.R to create
  organism-specific annotation packages (org.*.eg.db) in R if not available on
  Bioconductor. Used for:
  - Bacillus subtilis, subsp. subtilis 168
  - Brachypodium distachyon
  - Escherichia coli, str. K-12 substr. MG1655
  - Oryzias latipes
  - Salmonella enterica subsp. enterica serovar Typhimurium str. LT2
- Added NCBI as a source for FASTA and GTF files
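
The five custom package names recorded in the `custom_annotations` column of the reference table appear to follow one pattern: `org.`, the genus initial, then the rest of the species name and the strain with all non-alphanumeric characters removed, then `.eg.db`. The sketch below reproduces that pattern. It is inferred only from those five table values and is not taken from install-org-db.R; `custom_org_db_name` is a hypothetical helper.

```python
import re


def custom_org_db_name(species, strain=""):
    """Derive a package name matching the pattern seen in the reference table.

    Assumption: genus initial + remaining species text + strain,
    stripped of every character that is not a letter or digit.
    """
    genus, _, rest = species.partition(" ")
    tail = re.sub(r"[^A-Za-z0-9]", "", rest + strain)
    return f"org.{genus[0].upper()}{tail}.eg.db"


print(custom_org_db_name("Escherichia coli", "str. K-12 substr. MG1655"))
print(custom_org_db_name("Oryzias latipes"))
```

Checking the derived name against the table value before installation would catch any organism for which the pattern does not hold.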

### Fixed

- Fixed processing for ECOLI

### Changed

- Updated Ensembl versions:
  - Animals: Ensembl release 112
  - Plants: Ensembl plants release 59
  - Bacteria: Ensembl bacteria release 59
- Updated software:
  - tidyverse version updated from 1.3.2 to 2.0.0
  - STRINGdb version updated from 2.8.4 to 2.16.4
  - PANTHER.db version updated from 1.0.11 to 1.0.12
  - rtracklayer version updated from 1.56.1 to 1.64.0
  - Bioconductor version updated from 3.15.1 to 3.19
- Replaced org.EcK12.eg.db, which is no longer available on Bioconductor, with a
  locally created annotations database
- Changed the first argument of GL-DPPD-7110-A_build-genome-annots-tab.R from
  the 'name' column value to the 'species' column value (e.g., 'Mus musculus' instead of 'MOUSE')
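
Callers that still hold old-style `name` values need to translate them before invoking the script. A minimal sketch of that translation follows, assuming the reference table is the source of truth for the mapping. The two-column excerpt and the helper names `build_name_to_species` and `to_species_argument` are illustrative and not part of the pipeline.

```python
import csv
import io

# Two columns excerpted from the reference annotation table.
TABLE_EXCERPT = """\
name,species
MOUSE,Mus musculus
HUMAN,Homo sapiens
ORYSJ,Oryza sativa
,Vibrio fischeri
"""


def build_name_to_species(csv_text):
    """Map each non-empty 'name' value to its 'species' value."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["name"]: row["species"] for row in reader if row["name"]}


def to_species_argument(value, name_to_species):
    """Return the species string to pass as the first argument.

    Old-style names are translated; values that are already a known
    species pass through unchanged; anything else is rejected.
    """
    if value in name_to_species:
        return name_to_species[value]
    if value in name_to_species.values():
        return value
    raise KeyError(f"not a known name or species: {value!r}")


mapping = build_name_to_species(TABLE_EXCERPT)
print(to_species_argument("MOUSE", mapping))
print(to_species_argument("Homo sapiens", mapping))
```

Rows with an empty `name` (several NCBI-sourced microbes in the table) have no old-style key, so in this sketch they can only be addressed through a species value that also appears in a named row; a fuller version would keep a separate set of all species values.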


## [1.0.0](https://github.com/nasa/GeneLab_Data_Processing/releases/tag/GL_RefAnnotTable_1.0.0)