Commit b3995d8

jklugherz and bpblanken authored
main <- dev (#793)
* Miscellaneous VEP tweaks
* Benb/validate with allele type (#785)
* Bump requirements
* add validation
* format
* Fix syntax (#787)
* Bump requirements
* add validation
* format
* Fix syntax
* allele registry (#759)
* add allele registry step in update vat with samples task
* shh
* existing tests pass
* fix test deps
* test
* annotation_dependencies
* ruff
* take out the zero check
* fix requirements new task name
* move vep into new variants task
* only annotate lookup from callset_ht
* clean up mocks
* r
* working
* working?
* not that
* minor changes and test cases
* most recent script
* working version
* fix the test
* implement ht chunking
* fix patches
* fix patches
* register now yields id map of returned caids
* r
* fix some tests
* return a hail table instead
* use __str__
* log to track variants we can't map back
* move to gcs with flag
* union ar_ht instead of a bunch of left joins to prevent CAID, CAID_1, CAID_2...
* cleaner
* it is all coming together now '
* gnomad ids for 37'
* use genomicalleles and gnomad ids
* secrets
* secret
* move stuff out of environment file
* add more logging
* fix test
* fix the other test
* ruff
* test
* comments
* o
* Reference Data Update Type Equality Check (#789)
* Finish validity check test
* ruff
* update dbnsfp field
* More types
* more types
* ugh
* twiddle it back
* update type
* more tweaks
* lint
* fix floats
* decompose
* ruff formatg
* Update compare_globals_test.py
* print lint rule (#791)
* tiny ar bug (#792)

---------

Co-authored-by: Benjamin Blankenmeister <bblanken@broadinstitute.org>
Co-authored-by: Benjamin Blankenmeister <b.p.blankenmeister@gmail.com>
1 parent 7710b1d commit b3995d8

90 files changed: +835 -151 lines changed

.cloudbuild/vep-docker.cloudbuild.yaml

Lines changed: 2 additions & 2 deletions
@@ -1,11 +1,11 @@
 # Run locally with:
 #
-# gcloud builds submit --quiet --substitutions='_VEP_VERSION=110' --config .cloudbuild/vep-docker.cloudbuild.yaml v03_pipeline/
+# gcloud builds submit --quiet --substitutions='_VEP_VERSION=110' --config .cloudbuild/vep-docker.cloudbuild.yaml v03_pipeline/deploy
 steps:
 - name: 'gcr.io/kaniko-project/executor:v1.3.0'
   args:
   - --destination=gcr.io/seqr-project/vep-docker-image:${_VEP_VERSION}
-  - --dockerfile=deploy/Dockerfile.vep
+  - --dockerfile=Dockerfile.vep
   - --cache=true
   - --cache-ttl=168h

pyproject.toml

Lines changed: 0 additions & 1 deletion
@@ -41,7 +41,6 @@ ignore = [
     "FBT", # flake-boolean-trap... disallows boolean args to functions... fixing this code will require refactors.
     "ANN", # flake8-annotations is for typed code
     "DJ", # django specific
-    "T20", # forbids print, we print quite a bit
     "PYI", # pyi is typing stub files
     "PT", # pytest specific
     "PTH", # pathlib is preferred, but we're not using it yet

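Reviewer note: with "T20" dropped from the ignore list, ruff's flake8-print checks (T201 for `print`) now apply to the code base, which matches the "print lint rule (#791)" entry in the commit message. The call sites that were updated are not visible in this excerpt, so the following is only a minimal sketch of the usual replacement, routing messages through the standard `logging` module. The logger name and the function are hypothetical and are not taken from the repository.

```python
import logging

# Hypothetical logger name, chosen for illustration only.
logger = logging.getLogger("v03_pipeline_example")


def report_unmapped_variants(unmapped_ids):
    """Log (rather than print) each variant id that could not be mapped back.

    Returns the number of ids reported so callers can act on the count.
    """
    for variant_id in unmapped_ids:
        # Lazy %-style formatting: the message is only built if it is emitted.
        logger.warning("Could not map variant back: %s", variant_id)
    return len(unmapped_ids)
```

A caller would configure handlers once (for example with `logging.basicConfig(level=logging.INFO)`) and then call `report_unmapped_variants(ids)` wherever a `print` loop was used before.
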
requirements.in

Lines changed: 2 additions & 1 deletion
@@ -1,6 +1,7 @@
 elasticsearch==7.9.1
 google-api-python-client>=1.8.0
-hail==0.2.128
+hail==0.2.130
 luigi>=3.4.0
 gnomad==0.6.4
 google-cloud-storage>=2.14.0
+google-cloud-secret-manager>=2.20.0
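
Reviewer note: the new `google-cloud-secret-manager` requirement lines up with the "secrets" and "move stuff out of environment file" entries in the commit message. How the pipeline actually reads its secrets is not shown in this excerpt, so the sketch below only illustrates the client library's documented `access_secret_version` call. The function name, project id, and secret id are hypothetical; the optional `client` parameter is an assumption added so the sketch can be exercised without GCP credentials.

```python
def access_secret(project_id, secret_id, version="latest", client=None):
    """Return the payload of a Secret Manager secret version as text.

    project_id and secret_id are placeholders supplied by the caller; nothing
    here reflects the names used by the pipeline itself.
    """
    if client is None:
        # Imported lazily so the sketch can be run with a stub client.
        from google.cloud import secretmanager

        client = secretmanager.SecretManagerServiceClient()
    # Fully qualified resource name expected by the Secret Manager API.
    name = f"projects/{project_id}/secrets/{secret_id}/versions/{version}"
    response = client.access_secret_version(request={"name": name})
    # The payload is returned as bytes; decode it for use as a string.
    return response.payload.data.decode("utf-8")
```

For example, `access_secret("my-project", "my-api-credentials")` would fetch the latest version of a secret with that (hypothetical) id, provided the caller's credentials are allowed to access it.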

requirements.txt

Lines changed: 27 additions & 5 deletions
@@ -97,10 +97,11 @@ frozenlist==1.4.0
     #   hail
 gnomad==0.6.4
     # via -r requirements.in
-google-api-core==2.14.0
+google-api-core[grpc]==2.14.0
     # via
     #   google-api-python-client
     #   google-cloud-core
+    #   google-cloud-secret-manager
     #   google-cloud-storage
 google-api-python-client==2.108.0
     # via -r requirements.in
@@ -111,6 +112,7 @@ google-auth==2.23.4
     #   google-auth-httplib2
     #   google-auth-oauthlib
     #   google-cloud-core
+    #   google-cloud-secret-manager
     #   google-cloud-storage
     #   hail
 google-auth-httplib2==0.1.1
@@ -119,6 +121,8 @@ google-auth-oauthlib==0.8.0
     # via hail
 google-cloud-core==2.4.1
     # via google-cloud-storage
+google-cloud-secret-manager==2.20.0
+    # via -r requirements.in
 google-cloud-storage==2.14.0
     # via -r requirements.in
 google-crc32c==1.5.0
@@ -127,9 +131,22 @@ google-crc32c==1.5.0
     #   google-resumable-media
 google-resumable-media==2.7.0
     # via google-cloud-storage
-googleapis-common-protos==1.61.0
+googleapis-common-protos[grpc]==1.61.0
+    # via
+    #   google-api-core
+    #   grpc-google-iam-v1
+    #   grpcio-status
+grpc-google-iam-v1==0.13.0
+    # via google-cloud-secret-manager
+grpcio==1.63.0
+    # via
+    #   google-api-core
+    #   googleapis-common-protos
+    #   grpc-google-iam-v1
+    #   grpcio-status
+grpcio-status==1.48.2
     # via google-api-core
-hail==0.2.128
+hail==0.2.130
     # via -r requirements.in
 hdbscan==0.8.33
     # via gnomad
@@ -202,7 +219,7 @@ numpy==1.26.2
     #   scipy
 oauthlib==3.2.2
     # via requests-oauthlib
-orjson==3.9.11
+orjson==3.9.10
     # via hail
 packaging==23.2
     # via
@@ -226,11 +243,16 @@ portalocker==2.8.2
     # via msal-extensions
 prompt-toolkit==3.0.41
     # via ipython
+proto-plus==1.23.0
+    # via google-cloud-secret-manager
 protobuf==3.20.2
     # via
     #   google-api-core
+    #   google-cloud-secret-manager
     #   googleapis-common-protos
-    #   hail
+    #   grpc-google-iam-v1
+    #   grpcio-status
+    #   proto-plus
 ptyprocess==0.7.0
     # via pexpect
 pure-eval==0.2.2

v03_pipeline/bin/vep-110-GRCh38.sh

Lines changed: 0 additions & 6 deletions
@@ -42,12 +42,6 @@ gcloud storage cp --billing-project $PROJECT gs://seqr-reference-data/vep/110/ve
 gcloud storage cp --billing-project $PROJECT gs://seqr-reference-data/vep/110/uORF_5UTR_${ASSEMBLY}_PUBLIC.txt /vep_data/ &

 # Raw data files copied from the bucket (https://console.cloud.google.com/storage/browser/dm_alphamissense;tab=objects?prefix=&forceOnObjectsSortingFiltering=false)
-# Some investigation led us to want to combine the canonical and non-canonical transcript tsvs (run inside the VEP docker container):
-# cat AlphaMissense_hg38.tsv.gz | gunzip | grep -v '#' | awk 'BEGIN { OFS = "\t" };{$6=""; print $0}' > AlphaMissense_combined_hg38.tsv
-# cat AlphaMissense_isoforms_hg38.tsv.gz | gunzip | grep -v '#' >> AlphaMissense_combined_hg38.tsv
-# cat AlphaMissense_combined_hg38.tsv | sort --parallel=12 --buffer-size=20G -k1,1 -k2,2n > AlphaMissense_combined_sorted_hg38.tsv
-# cat AlphaMissense_combined_sorted_hg38.tsv | sed '1i #CHROM\tPOS\tREF\tALT\tgenome\ttranscript_id\tprotein_variant\tam_pathogenicity\tam_class' > AlphaMissense_hg38.tsv
-# bgzip AlphaMissense_hg38.tsv
 # tabix -s 1 -b 2 -e 2 -f -S 1 AlphaMissense_hg38.tsv.gz
 gcloud storage cp --billing-project $PROJECT 'gs://seqr-reference-data/vep/110/AlphaMissense_hg38.tsv.*' /vep_data/ &

v03_pipeline/bin/write_cached_reference_dataset_query_ht.py

Lines changed: 0 additions & 93 deletions
This file was deleted.
