Dev #992

Merged
merged 3 commits into from
Nov 25, 2024

Conversation

bpblanken
Collaborator

No description provided.

bpblanken and others added 2 commits November 25, 2024 14:39
* begin reference dataset refactor

* hgmd

* basewritetask

* PR comments

* Reference data refactor feature branch

* remove utils for now

* cadd

* hgmd selects

* import

* minor things

* config enum attribute

* config out of enum, get_ht, for_reference_genome_dataset_type

* return table

* kwargs

* tiny changes

* frozenset

* cadd filtering

* changes to the cadd script that will be moot?

* add some gnomad datasets

* hacking on clinvar

* ruff

* add 38 dbnsfp config

* get cadd from dbnsfp

* get primate ai and mpc from dbnsfp

* Cleanup

* cleanup

* Update misc.py

* Update clinvar.py

* Update clinvar.py

* Update clinvar_test.py

* poach some files from bens pr

* Update definitions.py

* first pass enums

* use liftover for 37 data instead of old version

* remove cadd

* Add clinvar path (#961)

* Add clinvar path

* Fix missing requires bug

* remove dataset type from filter contigs

* Move filter_contigs to "get_ht" so it's generalizable

* gnomad_exomes unit tests

* all enum selects helper

* gnomad_genomes tests

* clean up

* Generalize enum annotation

* fix tempdir usage

* add topmed

* Benb/clinvar refactor (#960)

* hacking on clinvar

* ruff

* Cleanup

* cleanup

* Update misc.py

* Update clinvar.py

* Update clinvar.py

* Update clinvar_test.py

* Update definitions.py

* Add clinvar path (#961)

* Add clinvar path

* Fix missing requires bug

* remove dataset type from filter contigs

* Move filter_contigs to "get_ht" so it's generalizable

* Generalize enum annotation

* Add back enum select fields

* remove unnecessary line

* clean up

* ruff

* wip hgmd test

* ruff

* share enum transmute

* done

* notebook

* ruff

* linter for now

* first pass splice ai

* Mitimpact

* Add the enum 🤦

* bad typo

* gnomad_mito, gnomad_non_coding_constraint, local_constraint_mito, screen

* gnomad_qc typo

* module_file_name

* gnomad_genomes CONFIG deduplication

* zipfile helper

* MITIMPACT (#965)

* Mitimpact

* Add the enum 🤦

* bad typo

* use helper for zip download

* pr feedback

* ruff

* ruff

* ruff

* ruff

* unshare extracted filename

* clean up transmute

* ruff

* trailing comma

* maybe clearer gnomad

* fix property syntax

* gnomad_mito selects

* use hanas enum notation

* shared import vcf helper

* proper splice ai parsing

* valid paths

* ruff

* ruff

* mitomap

* add comment

* merge

* screenums

* explicit handling for already mapped enums

* add tests

* ruff

* ruff

* ruff

* min_partitions

* simplify mitomap

* jupyter

* hmtvar reference dataset (#971)

* hmtvar reference dataset

* ruff

* eigen reference dataset (#970)

* eigen reference dataset

* Fix typo

---------

Co-authored-by: Benjamin Blankenmeister <b.p.blankenmeister@gmail.com>

* Exac reference dataset (#969)

* add exac reference dataset

* use vcf

* remove comment

---------

Co-authored-by: Benjamin Blankenmeister <b.p.blankenmeister@gmail.com>

* helix mito (#972)

* split genomes and exomes again

* fix screen

* screen and gnomad non coding

* unzip local_constraint_mito

* Fix bugs related to nested fields/split_multi (#973)

* helix mito

* Fix split_multi and select bugs

* fixme

* ruff

* Add test for exac

* Add test for split multi check

* Add test for `UpdatedReferenceDataset` and `UpdatedReferenceDatasetQuery` (#974)

* helix mito

* Fix split_multi and select bugs

* fixme

* ruff

* get test working

* fix bugs

* bug fixes

* Bugfixes

* Refactor tests

* Add comment

* quixotic

* missed one

* Add test for exac

* Add test for split multi check

* fix zip write

* Benb/add missing queries (#977)

* Add missing datasets

* Fix reference

* Add test

* lint

* remove complete() (#979)

* remove complete()

* ruff

* Fix mock

* Benb/update gnomad qc crdq with updated format (#980)

* remove complete()

* ruff

* Fix mock

* Replace the gnomad_qc crdq

* Fix test

* format

* Remove ht and tests (#981)

* remove complete()

* ruff

* Fix mock

* Replace the gnomad_qc crdq

* Fix test

* format

* Remove ht and tests

* Updated `gnomad_coding_and_noncoding` test table. (#982)

* remove complete()

* ruff

* Fix mock

* Replace the gnomad_qc crdq

* Fix test

* format

* Remove ht and tests

* Change validation table reference

* Update README.txt

* remove crdq reference

* Update mock

* ruff

* Fix imports

* remove mock

* fixme

* Change rsync to new path (#983)

* Remove `version` from reference dataset query path (#984)

* Change rsync to new path

* Remove version from reference dataset query path

* Make rdq dataset type specific (#985)

* Make rdq dataset type specific

* Add test for mito

* Add pathogenicities to clinvar

* tweak

* update annotations with updated reference datasets refactor (#978)

* first pass update vat

* merge feature

* fix the diff for now

* include_queries

* interval ht

* tests

* exclude

* nicer

* fix interval test

* split fn

* eigen test

* clinvar wip

* hgmd

* clinvar

* gnomad genomes and exomes

* delete

* 38 snv_indel done

* mito tests

* done with tests?

* custom_select

* fields test

* disable write new samples tests for now

* working on tests

* update update vat with new samples tests

* extra file

* other skipped test

* make select and filter similar

* tweak

* rename path and locus/interval filtering

* make select and filter similar (#988)

* make select and filter similar

* tweak

* Cleanest set diff

* Finish off

* Tests passing!

* ruff

* ruff

* Change the params

* Fix params

* params

* More clinvar mocking

* hardcode these

---------

Co-authored-by: Benjamin Blankenmeister <bblanken@broadinstitute.org>
Co-authored-by: Benjamin Blankenmeister <b.p.blankenmeister@gmail.com>

* delete old reference data code 😝  (#990)

* first pass update vat

* merge feature

* fix the diff for now

* include_queries

* interval ht

* tests

* exclude

* nicer

* fix interval test

* split fn

* eigen test

* clinvar wip

* hgmd

* clinvar

* gnomad genomes and exomes

* delete

* 38 snv_indel done

* mito tests

* done with tests?

* custom_select

* fields test

* disable write new samples tests for now

* working on tests

* update update vat with new samples tests

* extra file

* other skipped test

* make select and filter similar

* tweak

* rename path and locus/interval filtering

* make select and filter similar (#988)

* make select and filter similar

* tweak

* Cleanest set diff

* Finish off

* Tests passing!

* ruff

* ruff

* Change the params

* Fix params

* params

* More clinvar mocking

* hardcode these

* delete a bunch of stuff

* ruff

* remove rdc and crdq

* delete v02

* remove comment references to deleted file

* last test

---------

Co-authored-by: Benjamin Blankenmeister <bblanken@broadinstitute.org>
Co-authored-by: Benjamin Blankenmeister <b.p.blankenmeister@gmail.com>

---------

Co-authored-by: Julia Klugherz <juliaklugherz@gmail.com>
Co-authored-by: Hana Snow <hsnow@broadinstitute.org>
@bpblanken bpblanken requested a review from a team as a code owner November 25, 2024 19:42
@bpblanken bpblanken merged commit 7357dfa into main Nov 25, 2024
4 checks passed