Skip to content

Reference Data Update Type Equality Check #789

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 14 commits into from
May 23, 2024
Merged

Conversation

bpblanken
Copy link
Collaborator

No description provided.

@bpblanken bpblanken changed the title Reference Data Update Validity Test Reference Data Update Type Validity May 20, 2024
@bpblanken bpblanken changed the title Reference Data Update Type Validity Reference Data Update Type Equality Check May 23, 2024
@bpblanken bpblanken marked this pull request as ready for review May 23, 2024 13:48
@bpblanken bpblanken requested a review from a team as a code owner May 23, 2024 13:48
)
return cls(paths, versions, enums, selects)


def validate_selects_types(
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea here is to just ensure that any fields shared between the existing table and an "update" are identically typed.

imo we're getting very close to needing a statically defined schema for these, but trying to squeeze that in was much messier.

@@ -45,7 +45,7 @@
hl.tstruct(
locus=hl.tlocus('GRCh38'),
alleles=hl.tarray(hl.tstr),
PHRED=hl.tint32,
PHRED=hl.tfloat32,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we had a few types in our tests/mock data that weren't right and doing this work caught them 🤩 !

@@ -183,11 +194,12 @@ def test_from_rdc_or_annotations_ht(self):
self.assertTrue(
rdc_globals.enums == {'screen': {'region_type': ['C', 'D']}},
)
print(rdc_globals.selects)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this is left in by mistake?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep 🤦 I'm going to add a lint rule for this...

@bpblanken bpblanken merged commit 474d0ca into dev May 23, 2024
3 checks passed
@bpblanken bpblanken deleted the benb/validity_checks branch May 23, 2024 14:51
bpblanken added a commit that referenced this pull request May 28, 2024
* Miscellaneous VEP tweaks

* Benb/validate with allele type (#785)

* Bump requirements

* add validation

* format

* Fix syntax (#787)

* Bump requirements

* add validation

* format

* Fix syntax

* allele registry (#759)

* add allele registry step in update vat with samples task

* shh

* existing tests pass

* fix test deps

* test

* annotation_dependencies

* ruff

* take out the zero check

* fix requirements new task name

* move vep into new variants task

* only annotate lookup from callset_ht

* clean up mocks

* r

* working

* working?

* not that

* minor changes and test cases

* most recent script

* working version

* fix the test

* implement ht chunking

* fix patches

* fix patches

* register now yields id map of returned caids

* r

* fix some tests

* return a hail table instead

* use __str__

* log to track variants we can't map back

* move to gcs with flag

* union ar_ht instead of a bunch of left joins to prevent CAID, CAID_1, CAID_2...

* cleaner

* it is all coming together now
'

* gnomad ids for 37'

* use genomicalleles and gnomad ids

* secrets

* secret

* move stuff out of environment file

* add more logging

* fix test

* fix the other test

* ruff

* test

* comments

* o

* Reference Data Update Type Equality Check (#789)

* Finish validity check test

* ruff

* update dbnsfp field

* More types

* more types

* ugh

* twiddle it back

* update type

* more tweaks

* lint

* fix floats

* decompose

* ruff formatg

* Update compare_globals_test.py

* print lint rule (#791)

* tiny ar bug (#792)

---------

Co-authored-by: Benjamin Blankenmeister <bblanken@broadinstitute.org>
Co-authored-by: Benjamin Blankenmeister <b.p.blankenmeister@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants