Commit df3f216

Merge pull request #131 from nasa/DEV_GeneLab_Reference_Annotations_vGL-DPPD-7110-A
GeneLab Reference Annotations Pipeline version update from GL-DPPD-7110 to GL-DPPD-7110-A
2 parents d4e2d54 + c4b100e commit df3f216

10 files changed: 1924 additions and 4 deletions

GeneLab_Reference_Annotations/Pipeline_GL-DPPD-7110_Versions/GL-DPPD-7110-A/GL-DPPD-7110-A.md

Lines changed: 864 additions & 0 deletions
Large diffs are not rendered by default.
@@ -0,0 +1,24 @@
+name,species,strain,ensemblVersion,ref_source,fasta,gtf,taxon,bioconductor_annotations,custom_annotations,genelab_annots_link,genelab_annots_info_link
+ARABIDOPSIS,Arabidopsis thaliana,,59,ensembl_plants,https://ftp.ensemblgenomes.ebi.ac.uk/pub/plants/release-59/fasta/arabidopsis_thaliana/dna/Arabidopsis_thaliana.TAIR10.dna.toplevel.fa.gz,https://ftp.ensemblgenomes.ebi.ac.uk/pub/plants/release-59/gtf/arabidopsis_thaliana/Arabidopsis_thaliana.TAIR10.59.gtf.gz,3702,org.At.tair.db,,https://figshare.com/ndownloader/files/48354355,https://figshare.com/ndownloader/files/48354352
+BACSU,Bacillus subtilis,subsp. subtilis 168,59,ensembl_bacteria,https://ftp.ensemblgenomes.ebi.ac.uk/pub/bacteria/release-59/fasta/bacteria_0_collection/bacillus_subtilis_subsp_subtilis_str_168_gca_000009045/dna/Bacillus_subtilis_subsp_subtilis_str_168_gca_000009045.ASM904v1.dna.toplevel.fa.gz,https://ftp.ensemblgenomes.ebi.ac.uk/pub/bacteria/release-59/gtf/bacteria_0_collection/bacillus_subtilis_subsp_subtilis_str_168_gca_000009045/Bacillus_subtilis_subsp_subtilis_str_168_gca_000009045.ASM904v1.59.gtf.gz,224308,,org.Bsubtilissubspsubtilis168.eg.db,https://figshare.com/ndownloader/files/48354346,https://figshare.com/ndownloader/files/48354349
+BRADI,Brachypodium distachyon,,59,ensembl_plants,https://ftp.ensemblgenomes.ebi.ac.uk/pub/plants/release-59/fasta/brachypodium_distachyon/dna/Brachypodium_distachyon.Brachypodium_distachyon_v3.0.dna.toplevel.fa.gz,https://ftp.ensemblgenomes.ebi.ac.uk/pub/plants/release-59/gtf/brachypodium_distachyon/Brachypodium_distachyon.Brachypodium_distachyon_v3.0.59.gtf.gz,15368,,org.Bdistachyon.eg.db,https://figshare.com/ndownloader/files/48354370,https://figshare.com/ndownloader/files/48354361
+BRARP,Brassica rapa,,59,ensembl_plants,http://ftp.ensemblgenomes.org/pub/plants/release-59/fasta/brassica_rapa/dna/Brassica_rapa.Brapa_1.0.dna.toplevel.fa.gz,http://ftp.ensemblgenomes.org/pub/plants/release-59/gtf/brassica_rapa/Brassica_rapa.Brapa_1.0.59.gtf.gz,,,,,
+WORM,Caenorhabditis elegans,,112,ensembl,https://ftp.ensembl.org/pub/release-112/fasta/caenorhabditis_elegans/dna/Caenorhabditis_elegans.WBcel235.dna.toplevel.fa.gz,https://ftp.ensembl.org/pub/release-112/gtf/caenorhabditis_elegans/Caenorhabditis_elegans.WBcel235.112.gtf.gz,6239,org.Ce.eg.db,,https://figshare.com/ndownloader/files/48354373,https://figshare.com/ndownloader/files/48354364
+ZEBRAFISH,Danio rerio,,112,ensembl,http://ftp.ensembl.org/pub/release-112/fasta/danio_rerio/dna/Danio_rerio.GRCz11.dna.primary_assembly.fa.gz,http://ftp.ensembl.org/pub/release-112/gtf/danio_rerio/Danio_rerio.GRCz11.112.gtf.gz,7955,org.Dr.eg.db,,https://figshare.com/ndownloader/files/48354388,https://figshare.com/ndownloader/files/48354367
+FLY,Drosophila melanogaster,,112,ensembl,http://ftp.ensembl.org/pub/release-112/fasta/drosophila_melanogaster/dna/Drosophila_melanogaster.BDGP6.46.dna.toplevel.fa.gz,http://ftp.ensembl.org/pub/release-112/gtf/drosophila_melanogaster/Drosophila_melanogaster.BDGP6.46.112.gtf.gz,7227,org.Dm.eg.db,,https://figshare.com/ndownloader/files/48354382,https://figshare.com/ndownloader/files/48354376
+ERCC,,,,ThermoFisher,https://assets.thermofisher.com/TFS-Assets/LSG/manuals/ERCC92.zip,https://assets.thermofisher.com/TFS-Assets/LSG/manuals/ERCC92.zip,,,,,
+ECOLI,Escherichia coli,str. K-12 substr. MG1655,59,ensembl_bacteria,https://ftp.ensemblgenomes.ebi.ac.uk/pub/bacteria/release-59/fasta/bacteria_0_collection/escherichia_coli_str_k_12_substr_mg1655_gca_000005845/dna/Escherichia_coli_str_k_12_substr_mg1655_gca_000005845.ASM584v2.dna.toplevel.fa.gz,https://ftp.ensemblgenomes.ebi.ac.uk/pub/bacteria/release-59/gtf/bacteria_0_collection/escherichia_coli_str_k_12_substr_mg1655_gca_000005845/Escherichia_coli_str_k_12_substr_mg1655_gca_000005845.ASM584v2.59.gtf.gz,511145,,org.EcolistrK12substrMG1655.eg.db,https://figshare.com/ndownloader/files/48354379,https://figshare.com/ndownloader/files/48354394
+HUMAN,Homo sapiens,,112,ensembl,https://ftp.ensembl.org/pub/release-112/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz,https://ftp.ensembl.org/pub/release-112/gtf/homo_sapiens/Homo_sapiens.GRCh38.112.gtf.gz,9606,org.Hs.eg.db,,https://figshare.com/ndownloader/files/48354445,https://figshare.com/ndownloader/files/48354448
+,Lactobacillus acidophilus,NCFM,,ncbi,https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/011/985/GCF_000011985.1_ASM1198v1/GCF_000011985.1_ASM1198v1_genomic.fna.gz,https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/011/985/GCF_000011985.1_ASM1198v1/GCF_000011985.1_ASM1198v1_genomic.gtf.gz,272621,,,https://figshare.com/ndownloader/files/49061254,https://figshare.com/ndownloader/files/49061257
+MOUSE,Mus musculus,,112,ensembl,https://ftp.ensembl.org/pub/release-112/fasta/mus_musculus/dna/Mus_musculus.GRCm39.dna.primary_assembly.fa.gz,https://ftp.ensembl.org/pub/release-112/gtf/mus_musculus/Mus_musculus.GRCm39.112.gtf.gz,10090,org.Mm.eg.db,,https://figshare.com/ndownloader/files/48354460,https://figshare.com/ndownloader/files/48354457
+,Mycobacterium marinum,M,,ncbi,https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/018/345/GCF_000018345.1_ASM1834v1/GCF_000018345.1_ASM1834v1_genomic.gtf.gz,https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/018/345/GCF_000018345.1_ASM1834v1/GCF_000018345.1_ASM1834v1_genomic.gtf.gz,216594,,,https://figshare.com/ndownloader/files/49061260,https://figshare.com/ndownloader/files/49061263
+ORYSJ,Oryza sativa,Japonica,59,ensembl_plants,https://ftp.ensemblgenomes.ebi.ac.uk/pub/plants/release-59/fasta/oryza_sativa/dna/Oryza_sativa.IRGSP-1.0.dna.toplevel.fa.gz,https://ftp.ensemblgenomes.ebi.ac.uk/pub/plants/release-59/gtf/oryza_sativa/Oryza_sativa.IRGSP-1.0.59.gtf.gz,39947,,,https://figshare.com/ndownloader/files/48354451,https://figshare.com/ndownloader/files/48354454
+ORYLA,Oryzias latipes,,112,ensembl,http://ftp.ensembl.org/pub/release-112/fasta/oryzias_latipes/dna/Oryzias_latipes.ASM223467v1.dna.toplevel.fa.gz,http://ftp.ensembl.org/pub/release-112/gtf/oryzias_latipes/Oryzias_latipes.ASM223467v1.112.gtf.gz,8090,,org.Olatipes.eg.db,https://figshare.com/ndownloader/files/48354463,https://figshare.com/ndownloader/files/48354466
+,Pseudomonas aeruginosa,UCBPP-PA14,,ncbi,https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/014/625/GCF_000014625.1_ASM1462v1/GCF_000014625.1_ASM1462v1_genomic.fna.gz,https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/014/625/GCF_000014625.1_ASM1462v1/GCF_000014625.1_ASM1462v1_genomic.gtf.gz,208963,,,https://figshare.com/ndownloader/files/49061266,https://figshare.com/ndownloader/files/49061269
+RAT,Rattus norvegicus,,112,ensembl,http://ftp.ensembl.org/pub/release-112/fasta/rattus_norvegicus/dna/Rattus_norvegicus.mRatBN7.2.dna.toplevel.fa.gz,http://ftp.ensembl.org/pub/release-112/gtf/rattus_norvegicus/Rattus_norvegicus.mRatBN7.2.112.gtf.gz,10116,org.Rn.eg.db,,https://figshare.com/ndownloader/files/48354472,https://figshare.com/ndownloader/files/48354475
+YEAST,Saccharomyces cerevisiae,S288C,112,ensembl,http://ftp.ensembl.org/pub/release-112/fasta/saccharomyces_cerevisiae/dna/Saccharomyces_cerevisiae.R64-1-1.dna.toplevel.fa.gz,http://ftp.ensembl.org/pub/release-112/gtf/saccharomyces_cerevisiae/Saccharomyces_cerevisiae.R64-1-1.112.gtf.gz,559292,org.Sc.sgd.db,,https://figshare.com/ndownloader/files/48354469,https://figshare.com/ndownloader/files/48354478
+SALTY,Salmonella enterica,serovar Typhimurium str. LT2,,ncbi,https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/006/945/GCF_000006945.2_ASM694v2/GCF_000006945.2_ASM694v2_genomic.fna.gz,https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/006/945/GCF_000006945.2_ASM694v2/GCF_000006945.2_ASM694v2_genomic.gtf.gz,99287,,org.SentericaserovarTyphimuriumstrLT2.eg.db,https://figshare.com/ndownloader/files/49061272,https://figshare.com/ndownloader/files/49061275
+,Serratia liquefaciens,ATCC 27592,,ncbi,https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/422/085/GCF_000422085.1_ASM42208v1/GCF_000422085.1_ASM42208v1_genomic.fna.gz,https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/422/085/GCF_000422085.1_ASM42208v1/GCF_000422085.1_ASM42208v1_genomic.gtf.gz,1346614,,,https://figshare.com/ndownloader/files/49061278,https://figshare.com/ndownloader/files/49061281
+,Staphylococcus aureus,MRSA252,,ncbi,https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/011/505/GCF_000011505.1_ASM1150v1/GCF_000011505.1_ASM1150v1_genomic.fna.gz,https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/011/505/GCF_000011505.1_ASM1150v1/GCF_000011505.1_ASM1150v1_genomic.gtf.gz,282458,,,https://figshare.com/ndownloader/files/49061284,https://figshare.com/ndownloader/files/49061287
+,Streptococcus mutans,UA159,,ncbi,https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/007/465/GCF_000007465.2_ASM746v2/GCF_000007465.2_ASM746v2_genomic.fna.gz,https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/007/465/GCF_000007465.2_ASM746v2/GCF_000007465.2_ASM746v2_genomic.gtf.gz,210007,,,https://figshare.com/ndownloader/files/49061290,https://figshare.com/ndownloader/files/49061293
+,Vibrio fischeri,ES114,,ncbi,https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/011/805/GCF_000011805.1_ASM1180v1/GCF_000011805.1_ASM1180v1_genomic.fna.gz,https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/011/805/GCF_000011805.1_ASM1180v1/GCF_000011805.1_ASM1180v1_genomic.gtf.gz,312309,,,https://figshare.com/ndownloader/files/49061296,https://figshare.com/ndownloader/files/49061299
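
A reviewer might want to sanity-check a table like the one above before merging. The sketch below is not part of the commit; it is a minimal Python illustration of two checks that follow from the column layout: that the `release-N` segment in each Ensembl URL matches the `ensemblVersion` column, and that the `fasta` and `gtf` columns do not point at the same file. The function names `check_row` and `check_table` are hypothetical, and the three sample rows are copied from the table.

```python
import csv
import io
import re

# Header plus three rows copied verbatim from the reference table in this commit.
SAMPLE = """\
name,species,strain,ensemblVersion,ref_source,fasta,gtf,taxon,bioconductor_annotations,custom_annotations,genelab_annots_link,genelab_annots_info_link
MOUSE,Mus musculus,,112,ensembl,https://ftp.ensembl.org/pub/release-112/fasta/mus_musculus/dna/Mus_musculus.GRCm39.dna.primary_assembly.fa.gz,https://ftp.ensembl.org/pub/release-112/gtf/mus_musculus/Mus_musculus.GRCm39.112.gtf.gz,10090,org.Mm.eg.db,,https://figshare.com/ndownloader/files/48354460,https://figshare.com/ndownloader/files/48354457
ERCC,,,,ThermoFisher,https://assets.thermofisher.com/TFS-Assets/LSG/manuals/ERCC92.zip,https://assets.thermofisher.com/TFS-Assets/LSG/manuals/ERCC92.zip,,,,,
,Mycobacterium marinum,M,,ncbi,https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/018/345/GCF_000018345.1_ASM1834v1/GCF_000018345.1_ASM1834v1_genomic.gtf.gz,https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/018/345/GCF_000018345.1_ASM1834v1/GCF_000018345.1_ASM1834v1_genomic.gtf.gz,216594,,,https://figshare.com/ndownloader/files/49061260,https://figshare.com/ndownloader/files/49061263
"""

# Ensembl FTP paths in the table embed the release as ".../release-<N>/...".
RELEASE_RE = re.compile(r"/release-(\d+)/")


def check_row(row):
    """Return a list of human-readable problems found in one table row."""
    problems = []
    for col in ("fasta", "gtf"):
        match = RELEASE_RE.search(row[col])
        # Only compare when both the column and the URL carry a release number.
        if row["ensemblVersion"] and match and match.group(1) != row["ensemblVersion"]:
            problems.append(
                f"{col}: URL release {match.group(1)} != ensemblVersion {row['ensemblVersion']}"
            )
    if row["fasta"] and row["fasta"] == row["gtf"]:
        problems.append("fasta and gtf point to the same URL")
    return problems


def check_table(text):
    """Map each flagged row (keyed by name, or species if name is empty) to its problems."""
    flagged = {}
    for row in csv.DictReader(io.StringIO(text)):
        problems = check_row(row)
        if problems:
            flagged[row["name"] or row["species"]] = problems
    return flagged


if __name__ == "__main__":
    for key, problems in check_table(SAMPLE).items():
        print(key, "->", "; ".join(problems))
```

On these sample rows the check flags ERCC and Mycobacterium marinum. The ERCC row points both columns at the same ZIP archive, which may well be intentional; the Mycobacterium marinum row lists a `_genomic.gtf.gz` URL in the `fasta` column, unlike the other NCBI rows, which use `_genomic.fna.gz`, so it may be worth a second look.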

GeneLab_Reference_Annotations/README.md

Lines changed: 5 additions & 2 deletions
@@ -1,6 +1,6 @@
 # GeneLab pipeline for generating reference annotation tables
 
-> **The document [`GL-DPPD-7110.md`](Pipeline_GL-DPPD-7110_Versions/GL-DPPD-7110/GL-DPPD-7110.md) holds an overview and example commands for how GeneLab generates reference annotation tables. See the [Repository Links](#repository-links) descriptions below for more information.**
+> **The document [`GL-DPPD-7110-A.md`](Pipeline_GL-DPPD-7110_Versions/GL-DPPD-7110-A/GL-DPPD-7110-A.md) holds an overview and example commands for how GeneLab generates reference annotation tables. See the [Repository Links](#repository-links) descriptions below for more information.**
 
 ---
 ## Repository Links
@@ -17,6 +17,9 @@
 
 ---
 
-**Developed and maintained by:**
+**Developed by:**
 Mike Lee
 
+**Maintained by:**
+Alexis Torres
+Crystal Han
Lines changed: 62 additions & 0 deletions
@@ -0,0 +1,62 @@
+# Changelog
+
+All notable changes to this project will be documented in this file.
+
+The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
+and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+
+## [1.1.0](https://github.com/nasa/GeneLab_Data_Processing/blob/DEV_GeneLab_Reference_Annotations_vGL-DPPD-7110-A/GeneLab_Reference_Annotations/Workflow_Documentation/GL_RefAnnotTable-A)
+
+### Added
+
+- Added software:
+  - AnnotationForge version 1.46.0
+  - biomaRt version 2.60.1
+  - GO.db version 3.19.1
+- Added support for:
+  - Bacillus subtilis, subsp. subtilis 168
+  - Brachypodium distachyon
+  - Escherichia coli,str. K-12 substr. MG1655
+  - Oryzias latipes
+  - Lactobacillus acidophilus NCFM
+  - Mycobacterium marinum M
+  - Oryza sativa Japonica
+  - Pseudomonas aeruginosa UCBPP-PA14
+  - Salmonella enterica subsp. enterica serovar Typhimurium str. LT2
+  - Serratia liquefaciens ATCC 27592
+  - Staphylococcus aureus MRSA252
+  - Streptococcus mutans UA159
+  - Vibrio fischeri ES114
+- Added AnnotationForge helper script install-org-db.R to create
+organism-specific annotation packages (org.*.eg.db) in R if not available on
+Bioconductor. Used for:
+  - Bacillus subtilis, subsp. subtilis 168
+  - Brachypodium distachyon
+  - Escherichia coli,str. K-12 substr. MG1655
+  - Oryzias latipes
+  - Salmonella enterica subsp. enterica serovar Typhimurium str. LT2
+- Added NCBI as a source for FASTA and GTF files
+
+### Fixed
+
+- Fixed processing for ECOLI
+
+### Changed
+
+- Updated Ensembl versions:
+  - Animals: Ensembl release 112
+  - Plants: Ensembl plants release 59
+  - Bacteria: Ensembl bacteria release 59
+- Updated software:
+  - tidyverse version updated from 1.3.2 to 2.0.0
+  - STRINGdb version updated from 2.8.4 to 2.16.4
+  - PANTHER.db version updated from 1.0.11 to 1.0.12
+  - rtracklayer version updated from 1.56.1 to 1.64.0
+  - Bioconductor version updated from 3.15.1 to 3.19
+- Removed org.EcK12.eg.db and replaced it with a locally created annotations
+database, as it is no longer available on Bioconductor
+- Changed the first argument of GL-DPPD-7110-A_build-genome-annots-tab.R from
+the 'name' column value to the 'species' column value (e.g., 'Mus musculus' instead of 'MOUSE')
+
+
+## [1.0.0](https://github.com/nasa/GeneLab_Data_Processing/releases/tag/GL_RefAnnotTable_1.0.0)
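
The last "Changed" entry in the 1.1.0 changelog affects anyone who previously called the build script with a `name` value such as `MOUSE`. The sketch below is not from the commit; it is a minimal Python illustration of translating an old `name` value into the `species` value the script now expects, using a few name/species pairs copied from the reference table. The helpers `species_for` and `build_command` are hypothetical. Running the script through `Rscript` with the species as its only argument is an assumption made for illustration; the changelog does not describe the rest of the command line.

```python
import csv
import io

# A few 'name' and 'species' pairs copied from the reference table in this commit.
NAME_SPECIES_CSV = """\
name,species
ARABIDOPSIS,Arabidopsis thaliana
WORM,Caenorhabditis elegans
FLY,Drosophila melanogaster
HUMAN,Homo sapiens
MOUSE,Mus musculus
"""

# Script name as given in the changelog entry.
SCRIPT = "GL-DPPD-7110-A_build-genome-annots-tab.R"


def species_for(name, table_text=NAME_SPECIES_CSV):
    """Return the 'species' value for a row identified by its 'name' value."""
    for row in csv.DictReader(io.StringIO(table_text)):
        if row["name"] == name:
            return row["species"]
    raise KeyError(f"no row with name {name!r}")


def build_command(name):
    """Build an argument list that passes the species as the first script argument.

    Assumption: the script is launched with Rscript and needs no further
    arguments; adjust to the real invocation documented for the pipeline.
    """
    return ["Rscript", SCRIPT, species_for(name)]


if __name__ == "__main__":
    print(build_command("MOUSE"))
```

Passing the command as a list rather than a single shell string keeps a species value containing a space, such as `Mus musculus`, as one argument. Several rows in the table (for example Lactobacillus acidophilus) have an empty `name` column, so they can only be addressed by species.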
