Migrate to using rnc_taxonomy instead of rnc_accession columns #161

@blakesweeney

The rnc_taxonomy table is based on the NCBI taxonomy and is kept up-to-date and accurate. The species, common_name, and lineage columns in rnc_accessions are not. We should move to using rnc_taxonomy for everything. The web front end already uses it, but the pipeline still does work to parse taxonomy information. This is unneeded and should be removed.

  • Validate that all taxids in xref are present in rnc_taxonomy
  • Create any needed fake entries
  • Add fk constraint from xref.taxid to rnc_taxonomy.id
  • Modify pipeline to write empty strings for rnc_accessions.{species,common_name,lineage}
  • Update export steps to ignore the species, common_name, lineage columns
  • Update the rnc_accession update script to reflect the missing species, common_name, lineage columns
  • Validate that ENA can still be parsed and imported
  • Remove species, common_name, lineage properties from Entry object
  • Remove all attempts to set the taxonomy information in Entries
  • Validate the pipeline can still run
  • Remove the rnc_accessions.{species,common_name,lineage} columns
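
The first three steps (validate taxids, create placeholder entries, then add the foreign key) could be sketched roughly as below. This is only an illustration, not the pipeline's actual code: it uses an in-memory SQLite database so it runs anywhere, the `name` column on rnc_taxonomy and the helper names `missing_taxids`, `add_placeholder_entries`, and `build_demo` are assumptions, and the real tables have more columns than shown.

```python
import sqlite3


def missing_taxids(conn):
    """Return the sorted taxids used in xref that have no row in rnc_taxonomy."""
    rows = conn.execute(
        """
        SELECT DISTINCT xref.taxid
        FROM xref
        LEFT JOIN rnc_taxonomy ON rnc_taxonomy.id = xref.taxid
        WHERE rnc_taxonomy.id IS NULL
        ORDER BY xref.taxid
        """
    ).fetchall()
    return [row[0] for row in rows]


def add_placeholder_entries(conn, taxids):
    """Insert a stand-in rnc_taxonomy row for each taxid (the 'fake entries' step)."""
    # The `name` column and its placeholder value are hypothetical.
    conn.executemany(
        "INSERT INTO rnc_taxonomy (id, name) VALUES (?, ?)",
        [(taxid, f"placeholder taxon {taxid}") for taxid in taxids],
    )


def build_demo():
    """Build a tiny stand-in schema; the real tables have many more columns."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE rnc_taxonomy (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE xref (upi TEXT, taxid INTEGER);
        INSERT INTO rnc_taxonomy VALUES (9606, 'Homo sapiens');
        INSERT INTO xref VALUES ('URS1', 9606), ('URS2', 10090), ('URS3', 10090);
        """
    )
    return conn


if __name__ == "__main__":
    conn = build_demo()
    missing = missing_taxids(conn)
    print("taxids missing from rnc_taxonomy:", missing)
    add_placeholder_entries(conn, missing)
    print("still missing after placeholders:", missing_taxids(conn))
```

Once the query reports no missing taxids, the constraint itself would presumably be added on the production database with something like `ALTER TABLE xref ADD CONSTRAINT fk_xref_taxid FOREIGN KEY (taxid) REFERENCES rnc_taxonomy (id)`, where the constraint name is made up for this example.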
