Update and standardize Nextclade datasets for all lineages and gene segments

## Context

Only H3N2 and H1N1pdm currently have Nextclade datasets for all 8 gene segments; B/Vic has HA and NA datasets, and B/Yam has only HA. The default HA and NA datasets for H3N2 and H1N1pdm use more modern reference strains (e.g., A/Darwin/6/2021 for H3N2), while their other segments use older reference strains (e.g., A/NewYork/392/2004 for all other H3N2 genes).

Only HA and NA datasets include shortcut aliases like `flu_h3n2_ha` while other gene segments have a single longer name like `nextstrain/flu/h3n2/ns`.
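To illustrate how the two naming schemes could line up, here is a minimal sketch. It assumes an alias is the full name with the `nextstrain/` prefix dropped and slashes replaced by underscores. That matches the `flu_h3n2_ha` example only if the HA dataset's full name is `nextstrain/flu/h3n2/ha`, which is an assumption here. `alias_for` is a hypothetical helper, not part of Nextclade.

```python
def alias_for(dataset_name: str) -> str:
    """Derive a shortcut alias from a full dataset name.

    Assumed convention (not confirmed by Nextclade): strip the
    "nextstrain/" prefix and join the remaining parts with underscores.
    """
    prefix = "nextstrain/"
    if not dataset_name.startswith(prefix):
        raise ValueError(f"unexpected dataset name: {dataset_name!r}")
    return dataset_name[len(prefix):].replace("/", "_")


# The NS dataset named above would get an alias in the same style as HA/NA.
print(alias_for("nextstrain/flu/h3n2/ns"))  # flu_h3n2_ns
```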

## Description

We should update the Nextclade datasets for all lineages and genes to use the same modern reference strains and provide the same shortcut aliases.

This means getting additional gene sequences and coordinates for A/Darwin/6/2021 (H3N2), A/Wisconsin/588/2019 (H1N1pdm), and B/Brisbane/60/2008 (B/Vic). For completeness, we could add all remaining genes for B/Wisconsin/01/2010 (B/Yam). We wouldn't use the Yam dataset for surveillance analyses, but it could be helpful for historical analyses.
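The work this implies can be tallied with a short sketch. The reference strains come from this issue. The lowercase lineage keys and segment names (`mp` for the matrix segment in particular) are assumptions about naming, and the sketch assumes that every existing HA and NA dataset already uses its target reference, which the issue states explicitly only for H3N2.

```python
# The eight influenza gene segments (names are an assumed convention).
SEGMENTS = ["pb2", "pb1", "pa", "ha", "np", "na", "mp", "ns"]

# Target reference strain per lineage, as named in this issue.
TARGET_REFERENCE = {
    "h3n2": "A/Darwin/6/2021",
    "h1n1pdm": "A/Wisconsin/588/2019",
    "vic": "B/Brisbane/60/2008",
    "yam": "B/Wisconsin/01/2010",
}

# Segments assumed to already have a dataset on the target reference.
ALREADY_ON_TARGET = {
    "h3n2": {"ha", "na"},
    "h1n1pdm": {"ha", "na"},
    "vic": {"ha", "na"},
    "yam": {"ha"},
}


def segments_to_add(lineage: str) -> list[str]:
    """Return the segments that still need a sequence and coordinates."""
    done = ALREADY_ON_TARGET[lineage]
    return [segment for segment in SEGMENTS if segment not in done]


for lineage, strain in TARGET_REFERENCE.items():
    print(f"{strain} ({lineage}): {', '.join(segments_to_add(lineage))}")
```

Under these assumptions, each of H3N2, H1N1pdm, and B/Vic needs six additional segments, and B/Yam needs seven.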

If we want to move every subtype to a newer reference strain, this would also be a good time to make that change.

Update and standardize Nextclade datasets for all lineages and gene segments #186

Context

Description

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Update and standardize Nextclade datasets for all lineages and gene segments #186

Description

Context

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions