From f89a6d363dfbd716804ae9066ee6410fbb45fdc7 Mon Sep 17 00:00:00 2001
From: Benjamin Blankenmeister
Date: Mon, 25 Nov 2024 14:39:10 -0500
Subject: [PATCH 1/2] Reference data refactor (#991)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* begin reference dataset refactor
* hgmd
* basewritetask
* PR commentes
* Reference data refactor feature branch
* remove utils for now
* cadd
* hgmd selects
* import
* minor things
* config enum attribute
* config out of enum, get_ht, for_reference_genome_dataset_type
* return table
* kwargs
* tiny changes
* frozenset
* cadd filtering
* changes to the cadd script that will be moot?
* add some gnomad datasets
* hacking on clinvar
* ruff
* add 38 dbnsfp config
* get cadd from dbnsfp
* get primate ai and mpc from dbnsfp
* Cleanup
* cleanup
* Update misc.py
* Update clinvar.py
* Update clinvar.py
* Update clinvar_test.py
* poach some files from bens pr
* Update definitions.py
* first pass enums
* use liftover for 37 data instead of old version
* remove cadd
* Add clinvar path (#961)
* Add clinvar path
* Fix missing requires bug
* remove dataset type from filter contigs
* Move filter_contigs to "get_ht" so its generalizable
* gnomad_exomes unit tests
* all enum selects helper
* gnomad_genomes tests
* clean up
* Generalize enum annotation
* fix tempdir usage
* add topmed
* Benb/clinvar refactor (#960)
* hacking on clinvar
* ruff
* Cleanup
* cleanup
* Update misc.py
* Update clinvar.py
* Update clinvar.py
* Update clinvar_test.py
* Update definitions.py
* Add clinvar path (#961)
* Add clinvar path
* Fix missing requires bug
* remove dataset type from filter contigs
* Move filter_contigs to "get_ht" so its generalizable
* Generalize enum annotation
* Add back enum select fields
* remove unnecessary line
* clean up
* ruff
* wip hgmd test
* ruff
* share enum transmute
* done
* notebook
* ruff
* linter for now
* first pass splice ai
* Mitimpact
* Add the enum :facepalm:
* bad typo
* gnomad_mito, gnomad_non_coding_constraint, local_constraint_mito, screen
* gnomad_qc typo
* module_file_name
* gnomad_genomes CONFIG deduplication
* zipfile helper
* MITIMPACT (#965)
* Mitimpact
* Add the enum :facepalm:
* bad typo
* use helper for zip download
* pr feedback
* ruff
* ruff
* ruff
* ruff
* unshare extracted filename
* clean up transmute
* ruff
* trailing comma
* maybe clearer gnomad
* fix property syntax
* gnomad_mito selects
* use hanas enum notation
* shared import vcf helper
* proper splice ai parsing
* valid paths
* ruff
* ruff
* mitomap
* add coment
* merge
* screenums
* explicit handling for already mapped enums
* add tests
* ruff
* ruff
* ruff
* min_partitions
* simplify mitomap
* jupyter
* hmtvar reference dataset (#971)
* hmtvar reference dataset
* ruff
* eigen reference dataset (#970)
* eigen reference dataset
* Fix typo

---------

Co-authored-by: Benjamin Blankenmeister

* Exac reference dataset (#969)
* add exac reference dataset
* use vcf
* remove comment

---------

Co-authored-by: Benjamin Blankenmeister

* helix mito (#972)
* split genomes and exomes again
* fix screen
* screen and gnomad non coding
* unzip local_constraint_mito
* Fix bugs related to nested fields/split_multi (#973)
* helix mito
* Fix split_multi and select bugs
* fixme
* ruff
* Add test for exac
* Add test for split multi check
* Add test for `UpdatedReferenceDataset` and `UpdatedReferenceDatasetQuery` (#974)
* helix mito
* Fix split_multi and select bugs
* fixme
* ruff
* get test working
* fix bugs
* bug fixes
* Bugfixes
* Refactor tests
* Add comment
* quixotic
* missed one
* Add test for exac
* Add test for split multi check
* fix zip write
* Benb/add missing queries (#977)
* Add missing datasets
* Fix reference
* Add test
* lint
* remove complete() (#979)
* remove complete()
* ruff
* Fix mock
* Benb/update gnomad qc crdq with updated format (#980)
* remove complete()
* ruff
* Fix mock
* Replace the gnomad_qc crdq
* Fix test
* format
* Remove ht and tests (#981)
* remove complete()
* ruff
* Fix mock
* Replace the gnomad_qc crdq
* Fix test
* format
* Remove ht and tests
* Updated `gnomad_coding_and_noncoding` test table. (#982)
* remove complete()
* ruff
* Fix mock
* Replace the gnomad_qc crdq
* Fix test
* format
* Remove ht and tests
* Change validation table reference
* Update README.txt
* remove crdq reference
* Update mock
* ruff
* Fix imports
* remove mock
* fixme
* Change rsync to new path (#983)
* Remove `version` from reference dataset query path (#984)
* Change rsync to new path
* Remove version from reference dataset query path
* Make rdq dataset type specific (#985)
* Make rdq dataset type specific
* Add test for mito
* Add pathogenicities to clinvar
* tweak
* update annotations with updated reference datasets refactor (#978)
* first pass update vat
* merge feature
* fix the diff for now
* include_queries
* interval ht
* tests
* exclude
* nicer
* fix inteval test
* split fn
* eigen test
* clinvar wip
* hgmd
* clinvar
* gnomad genomes and exomes
* delete
* 38 snv_indel done
* mito tests
* done with tests?
* custom_select
* fields test
* disable write new samples tests for now
* working on tests
* update update vat with new samples tests
* extra file
* other skipped test
* make select and filter similar
* tweak
* rename path and locus/interval filtering
* make select and filter similar (#988)
* make select and filter similar
* tweak
* Cleanest set diff
* Finish off
* Tests passing!
* ruff
* ruff
* Change the params
* Fix params
* params
* More clinvar mocking
* hardcode these

---------

Co-authored-by: Benjamin Blankenmeister
Co-authored-by: Benjamin Blankenmeister

* delete old reference data code 😝 (#990)
* first pass update vat
* merge feature
* fix the diff for now
* include_queries
* interval ht
* tests
* exclude
* nicer
* fix inteval test
* split fn
* eigen test
* clinvar wip
* hgmd
* clinvar
* gnomad genomes and exomes
* delete
* 38 snv_indel done
* mito tests
* done with tests?
* custom_select
* fields test
* disable write new samples tests for now
* working on tests
* update update vat with new samples tests
* extra file
* other skipped test
* make select and filter similar
* tweak
* rename path and locus/interval filtering
* make select and filter similar (#988)
* make select and filter similar
* tweak
* Cleanest set diff
* Finish off
* Tests passing!
* ruff
* ruff
* Change the params
* Fix params
* params
* More clinvar mocking
* hardcode these
* delete a bunch of stuff
* ruff
* remove rdc and crdq
* delete v02
* remove comment references to deleted file
* last test

---------

Co-authored-by: Benjamin Blankenmeister
Co-authored-by: Benjamin Blankenmeister

---------

Co-authored-by: Julia Klugherz
Co-authored-by: Hana Snow
---
 .../v02/create_ht__cadd.py | 8 -
 .../v02/create_ht__clinvar.py | 8 -
 .../v02/create_ht__combined_reference_data.py | 14 -
 .../v02/create_ht__eigen.py | 14 -
 .../v02/create_ht__mpc.py | 14 -
 .../v02/create_ht__primate_ai.py | 14 -
 .../v02/create_ht__topmed.py | 14 -
 .../v02/hail_scripts/write_1kg_ht.py | 71 -
 .../v02/hail_scripts/write_cadd_ht.py | 49 -
 .../v02/hail_scripts/write_ccREs_ht.py | 59 -
 .../v02/hail_scripts/write_clinvar_ht.py | 29 -
 .../write_combined_interval_ref_data.py | 43 -
 .../write_combined_reference_data_ht.py | 30 -
 .../v02/hail_scripts/write_dbnsfp_ht.py | 147 --
 .../v02/hail_scripts/write_splice_ai_ht.py | 94 -
 .../v02/mito/utils.py | 92 -
 .../write_combined_mito_reference_data_hts.py | 44 -
 .../v02/mito/write_mito_helix_ht.py | 19 -
 .../v02/mito/write_mito_hmtvar_ht.py | 19 -
 .../v02/mito/write_mito_mitimpact_ht.py | 18 -
 .../v02/mito/write_mito_mitomap_ht.py | 20 -
 requirements-dev.in | 1 +
 requirements-dev.txt | 30 +-
 v03_pipeline/bin/rsync_reference_data.bash | 2 +-
 v03_pipeline/lib/annotations/enums.py | 15 +
 v03_pipeline/lib/annotations/fields_test.py | 32 +-
 v03_pipeline/lib/annotations/mito.py | 8 -
 .../lib/annotations/rdc_dependencies.py | 29 -
 v03_pipeline/lib/annotations/snv_indel.py | 14 +-
 v03_pipeline/lib/misc/validation.py | 36 -
 v03_pipeline/lib/misc/validation_test.py | 5 +-
 v03_pipeline/lib/model/__init__.py | 8 -
 .../model/cached_reference_dataset_query.py | 65 -
 v03_pipeline/lib/model/dataset_type.py | 1 -
 v03_pipeline/lib/model/definitions.py | 7 +
 .../lib/model/reference_dataset_collection.py | 110 --
 v03_pipeline/lib/paths.py | 63 +-
 v03_pipeline/lib/paths_test.py | 34 -
 v03_pipeline/lib/reference_data/clinvar.py | 214 ---
 .../lib/reference_data/clinvar_test.py | 281 ---
 .../lib/reference_data/compare_globals.py | 137 --
 .../reference_data/compare_globals_test.py | 321 ----
 v03_pipeline/lib/reference_data/config.py | 549 ------
 .../dataset_table_operations.py | 218 ---
 .../dataset_table_operations_test.py | 585 -------
 v03_pipeline/lib/reference_data/hgmd.py | 18 -
 v03_pipeline/lib/reference_data/hgmd_test.py | 12 -
 v03_pipeline/lib/reference_data/mito.py | 16 -
 .../__init__.py | 0
 v03_pipeline/lib/reference_datasets/clinvar.py | 172 ++
 .../clinvar_path_variants.py | 38 +
 .../lib/reference_datasets/clinvar_test.py | 171 ++
 v03_pipeline/lib/reference_datasets/dbnsfp.py | 83 +
 v03_pipeline/lib/reference_datasets/eigen.py | 6 +
 v03_pipeline/lib/reference_datasets/exac.py | 22 +
 .../lib/reference_datasets/exac_test.py | 54 +
 .../gencode/__init__.py | 0
 .../gencode/mapping_gene_ids.py | 0
 .../gencode/mapping_gene_ids_tests.py | 2 +-
 .../gnomad_coding_and_noncoding.py | 59 +
 .../lib/reference_datasets/gnomad_exomes.py | 21 +
 .../reference_datasets/gnomad_exomes_test.py | 72 +
 .../lib/reference_datasets/gnomad_genomes.py | 21 +
 .../reference_datasets/gnomad_genomes_test.py | 72 +
 .../lib/reference_datasets/gnomad_mito.py | 14 +
 .../gnomad_non_coding_constraint.py | 23 +
 .../lib/reference_datasets/gnomad_qc.py | 9 +
 .../lib/reference_datasets/gnomad_utils.py | 55 +
 .../lib/reference_datasets/helix_mito.py | 53 +
 v03_pipeline/lib/reference_datasets/hgmd.py | 20 +
 .../lib/reference_datasets/hgmd_test.py | 41 +
.../reference_datasets/high_af_variants.py | 20 + v03_pipeline/lib/reference_datasets/hmtvar.py | 23 + .../local_constraint_mito.py | 24 + v03_pipeline/lib/reference_datasets/misc.py | 150 ++ .../lib/reference_datasets/misc_test.py | 76 + .../lib/reference_datasets/mitimpact.py | 26 + .../lib/reference_datasets/mitomap.py | 22 + .../lib/reference_datasets/mitomap_test.py | 51 + .../queries.py | 0 .../reference_datasets/reference_dataset.py | 412 +++++ v03_pipeline/lib/reference_datasets/screen.py | 38 + .../lib/reference_datasets/splice_ai.py | 33 + v03_pipeline/lib/reference_datasets/topmed.py | 20 + .../base_update_variant_annotations_table.py | 77 +- ...e_update_variant_annotations_table_test.py | 106 +- ...update_cached_reference_dataset_queries.py | 37 - ...e_cached_reference_dataset_queries_test.py | 124 -- ...ns_table_with_updated_reference_dataset.py | 129 +- ...ble_with_updated_reference_dataset_test.py | 1532 ++++------------- .../updated_cached_reference_dataset_query.py | 142 -- ...ted_cached_reference_dataset_query_test.py | 264 --- .../updated_reference_dataset.py | 27 + .../updated_reference_dataset_collection.py | 115 -- ...dated_reference_dataset_collection_test.py | 345 ---- .../updated_reference_dataset_query.py | 54 + .../updated_reference_dataset_query_test.py | 182 ++ ...annotations_table_with_new_samples_test.py | 1426 ++++++--------- v03_pipeline/lib/tasks/validate_callset.py | 39 +- .../lib/tasks/validate_callset_test.py | 22 +- .../lib/tasks/write_new_variants_table.py | 42 +- .../tasks/write_relatedness_check_table.py | 10 +- .../write_relatedness_check_table_test.py | 133 +- .../write_variant_annotations_vcf_test.py | 6 +- v03_pipeline/lib/test/mock_clinvar_urls.py | 37 + .../mocked_reference_datasets_testcase.py | 40 + .../gnomad_qc_crdq.ht/.README.txt.crc | Bin 12 -> 0 bytes .../gnomad_qc_crdq.ht/.metadata.json.gz.crc | Bin 12 -> 0 bytes .../gnomad_qc_crdq.ht/README.txt | 3 - .../globals/.metadata.json.gz.crc | Bin 12 -> 0 bytes 
.../globals/metadata.json.gz | Bin 293 -> 0 bytes .../globals/parts/.part-0.crc | Bin 12 -> 0 bytes .../gnomad_qc_crdq.ht/globals/parts/part-0 | Bin 113 -> 0 bytes .../.index.crc | Bin 12 -> 0 bytes .../index | Bin 73 -> 0 bytes .../gnomad_qc_crdq.ht/metadata.json.gz | Bin 312 -> 0 bytes .../rows/.metadata.json.gz.crc | Bin 16 -> 0 bytes .../gnomad_qc_crdq.ht/rows/metadata.json.gz | Bin 562 -> 0 bytes ...0-fc4518f0-e0cb-4157-b60d-b6ab4c5f4a75.crc | Bin 12 -> 0 bytes ...art-0-fc4518f0-e0cb-4157-b60d-b6ab4c5f4a75 | Bin 51 -> 0 bytes .../.README.txt.crc | Bin 12 -> 0 bytes .../.metadata.json.gz.crc | Bin 12 -> 0 bytes .../globals/.metadata.json.gz.crc | Bin 12 -> 0 bytes .../globals/metadata.json.gz | Bin 322 -> 0 bytes .../globals/parts/.part-0.crc | Bin 12 -> 0 bytes .../globals/parts/part-0 | Bin 400 -> 0 bytes .../metadata.json.gz | Bin 345 -> 0 bytes .../rows/.metadata.json.gz.crc | Bin 16 -> 0 bytes .../rows/metadata.json.gz | Bin 586 -> 0 bytes ...0-9e75273d-7113-40e4-a327-453f3451dc8c.crc | Bin 12 -> 0 bytes ...art-0-9e75273d-7113-40e4-a327-453f3451dc8c | Bin 51 -> 0 bytes .../test_combined_1.ht.ht/.README.txt.crc | Bin 12 -> 0 bytes .../.metadata.json.gz.crc | Bin 16 -> 0 bytes .../globals/.metadata.json.gz.crc | Bin 16 -> 0 bytes .../globals/metadata.json.gz | Bin 546 -> 0 bytes .../globals/parts/.part-0.crc | Bin 16 -> 0 bytes .../globals/parts/part-0 | Bin 774 -> 0 bytes .../test_combined_1.ht.ht/metadata.json.gz | Bin 725 -> 0 bytes .../rows/.metadata.json.gz.crc | Bin 20 -> 0 bytes .../rows/metadata.json.gz | Bin 1064 -> 0 bytes ...0-3569201c-d630-43c4-9056-cbace806fe8d.crc | Bin 12 -> 0 bytes ...art-0-3569201c-d630-43c4-9056-cbace806fe8d | Bin 106 -> 0 bytes .../test_combined_1.ht/.README.txt.crc | Bin 12 -> 0 bytes .../test_combined_1.ht/.metadata.json.gz.crc | Bin 16 -> 0 bytes .../test_combined_1.ht/README.txt | 3 - .../globals/.metadata.json.gz.crc | Bin 16 -> 0 bytes .../globals/metadata.json.gz | Bin 546 -> 0 bytes .../globals/parts/.part-0.crc | 
Bin 16 -> 0 bytes .../test_combined_1.ht/globals/parts/part-0 | Bin 774 -> 0 bytes .../test_combined_1.ht/metadata.json.gz | Bin 725 -> 0 bytes .../rows/.metadata.json.gz.crc | Bin 20 -> 0 bytes .../test_combined_1.ht/rows/metadata.json.gz | Bin 1062 -> 0 bytes ...0-1d126232-414b-4ffa-aa43-9ed52895fbf2.crc | Bin 12 -> 0 bytes ...art-0-1d126232-414b-4ffa-aa43-9ed52895fbf2 | Bin 106 -> 0 bytes .../test_combined_2.ht/.README.txt.crc | Bin 12 -> 0 bytes .../test_combined_2.ht/.metadata.json.gz.crc | Bin 12 -> 0 bytes .../test_combined_2.ht/README.txt | 3 - .../globals/.metadata.json.gz.crc | Bin 12 -> 0 bytes .../globals/metadata.json.gz | Bin 299 -> 0 bytes .../globals/parts/.part-0.crc | Bin 12 -> 0 bytes .../test_combined_2.ht/globals/parts/part-0 | Bin 147 -> 0 bytes .../test_combined_2.ht/metadata.json.gz | Bin 329 -> 0 bytes .../rows/.metadata.json.gz.crc | Bin 16 -> 0 bytes .../test_combined_2.ht/rows/metadata.json.gz | Bin 583 -> 0 bytes ...0-20336911-c437-4deb-9fa4-7c7fe61f0408.crc | Bin 12 -> 0 bytes ...0-7d0599cd-6874-47f8-b6de-a7db0b41817c.crc | Bin 12 -> 0 bytes ...art-0-20336911-c437-4deb-9fa4-7c7fe61f0408 | Bin 54 -> 0 bytes ...art-0-7d0599cd-6874-47f8-b6de-a7db0b41817c | Bin 55 -> 0 bytes .../test_combined_37.ht/.README.txt.crc | Bin 12 -> 0 bytes .../test_combined_37.ht/.metadata.json.gz.crc | Bin 16 -> 0 bytes .../globals/.metadata.json.gz.crc | Bin 16 -> 0 bytes .../globals/metadata.json.gz | Bin 547 -> 0 bytes .../globals/parts/.part-0.crc | Bin 16 -> 0 bytes .../test_combined_37.ht/globals/parts/part-0 | Bin 798 -> 0 bytes .../.index.crc | Bin 12 -> 0 bytes .../.metadata.json.gz.crc | Bin 12 -> 0 bytes .../index | Bin 140 -> 0 bytes .../metadata.json.gz | Bin 187 -> 0 bytes .../test_combined_37.ht/metadata.json.gz | Bin 702 -> 0 bytes .../rows/.metadata.json.gz.crc | Bin 20 -> 0 bytes .../test_combined_37.ht/rows/metadata.json.gz | Bin 1027 -> 0 bytes ...0-6353b1d7-bc23-4f3a-9fa2-dd9321ab97a2.crc | Bin 12 -> 0 bytes 
...art-0-6353b1d7-bc23-4f3a-9fa2-dd9321ab97a2 | Bin 210 -> 0 bytes .../test_combined_mito_1.ht/.README.txt.crc | Bin 12 -> 0 bytes .../.metadata.json.gz.crc | Bin 16 -> 0 bytes .../globals/.metadata.json.gz.crc | Bin 12 -> 0 bytes .../globals/metadata.json.gz | Bin 476 -> 0 bytes .../globals/parts/.part-0.crc | Bin 16 -> 0 bytes .../globals/parts/part-0 | Bin 708 -> 0 bytes .../.index.crc | Bin 12 -> 0 bytes .../.metadata.json.gz.crc | Bin 12 -> 0 bytes .../index | Bin 130 -> 0 bytes .../metadata.json.gz | Bin 186 -> 0 bytes .../test_combined_mito_1.ht/metadata.json.gz | Bin 588 -> 0 bytes .../rows/.metadata.json.gz.crc | Bin 16 -> 0 bytes .../rows/metadata.json.gz | Bin 880 -> 0 bytes ...0-f96f626e-c873-4613-a02b-88ee1e3f2923.crc | Bin 12 -> 0 bytes ...art-0-f96f626e-c873-4613-a02b-88ee1e3f2923 | Bin 192 -> 0 bytes .../.README.txt.crc | Bin 12 -> 0 bytes .../.metadata.json.gz.crc | Bin 12 -> 0 bytes .../globals/.metadata.json.gz.crc | Bin 12 -> 0 bytes .../globals/metadata.json.gz | Bin 294 -> 0 bytes .../globals/parts/.part-0.crc | Bin 12 -> 0 bytes .../globals/parts/part-0 | Bin 111 -> 0 bytes .../metadata.json.gz | Bin 316 -> 0 bytes .../rows/.metadata.json.gz.crc | Bin 16 -> 0 bytes .../rows/metadata.json.gz | Bin 587 -> 0 bytes .../test_hgmd_1.ht/.README.txt.crc | Bin 12 -> 0 bytes .../test_hgmd_1.ht/.metadata.json.gz.crc | Bin 12 -> 0 bytes .../reference_data/test_hgmd_1.ht/README.txt | 3 - .../globals/.metadata.json.gz.crc | Bin 12 -> 0 bytes .../test_hgmd_1.ht/globals/metadata.json.gz | Bin 312 -> 0 bytes .../test_hgmd_1.ht/globals/parts/.part-0.crc | Bin 12 -> 0 bytes .../test_hgmd_1.ht/globals/parts/part-0 | Bin 162 -> 0 bytes .../test_hgmd_1.ht/metadata.json.gz | Bin 338 -> 0 bytes .../test_hgmd_1.ht/rows/.metadata.json.gz.crc | Bin 16 -> 0 bytes .../test_hgmd_1.ht/rows/metadata.json.gz | Bin 587 -> 0 bytes .../test_hgmd_37.ht/.README.txt.crc | Bin 12 -> 0 bytes .../test_hgmd_37.ht/.metadata.json.gz.crc | Bin 12 -> 0 bytes 
.../reference_data/test_hgmd_37.ht/README.txt | 3 - .../globals/.metadata.json.gz.crc | Bin 12 -> 0 bytes .../test_hgmd_37.ht/globals/metadata.json.gz | Bin 312 -> 0 bytes .../test_hgmd_37.ht/globals/parts/.part-0.crc | Bin 12 -> 0 bytes .../test_hgmd_37.ht/globals/parts/part-0 | Bin 159 -> 0 bytes .../test_hgmd_37.ht/metadata.json.gz | Bin 338 -> 0 bytes .../rows/.metadata.json.gz.crc | Bin 16 -> 0 bytes .../test_hgmd_37.ht/rows/metadata.json.gz | Bin 585 -> 0 bytes .../test_interval_1.ht/.README.txt.crc | Bin 12 -> 0 bytes .../test_interval_1.ht/.metadata.json.gz.crc | Bin 12 -> 0 bytes .../globals/.metadata.json.gz.crc | Bin 12 -> 0 bytes .../globals/metadata.json.gz | Bin 357 -> 0 bytes .../globals/parts/.part-0.crc | Bin 12 -> 0 bytes .../test_interval_1.ht/globals/parts/part-0 | Bin 217 -> 0 bytes .../test_interval_1.ht/metadata.json.gz | Bin 372 -> 0 bytes .../rows/.metadata.json.gz.crc | Bin 16 -> 0 bytes .../test_interval_1.ht/rows/metadata.json.gz | Bin 643 -> 0 bytes ...0-1224c3b3-ab5b-49d7-8d6d-6084ccbbc683.crc | Bin 12 -> 0 bytes ...art-0-1224c3b3-ab5b-49d7-8d6d-6084ccbbc683 | Bin 60 -> 0 bytes .../test_interval_mito_1.ht/.README.txt.crc | Bin 12 -> 0 bytes .../.metadata.json.gz.crc | Bin 12 -> 0 bytes .../test_interval_mito_1.ht/README.txt | 3 - .../globals/.metadata.json.gz.crc | Bin 12 -> 0 bytes .../globals/metadata.json.gz | Bin 314 -> 0 bytes .../globals/parts/.part-0.crc | Bin 12 -> 0 bytes .../globals/parts/part-0 | Bin 153 -> 0 bytes .../.index.crc | Bin 12 -> 0 bytes .../.metadata.json.gz.crc | Bin 12 -> 0 bytes .../index | Bin 162 -> 0 bytes .../metadata.json.gz | Bin 178 -> 0 bytes .../test_interval_mito_1.ht/metadata.json.gz | Bin 323 -> 0 bytes .../rows/.metadata.json.gz.crc | Bin 16 -> 0 bytes .../rows/metadata.json.gz | Bin 605 -> 0 bytes ...0-271a7dfb-7fc1-4e43-ac16-af1cf05d0ae0.crc | Bin 12 -> 0 bytes ...art-0-271a7dfb-7fc1-4e43-ac16-af1cf05d0ae0 | Bin 113 -> 0 bytes .../clinvar/2024-11-11.ht/.README.txt.crc | Bin 0 -> 12 bytes 
.../clinvar/2024-11-11.ht}/._SUCCESS.crc | Bin .../2024-11-11.ht/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../GRCh37/clinvar/2024-11-11.ht}/README.txt | 2 +- .../GRCh37/clinvar/2024-11-11.ht}/_SUCCESS | 0 .../globals/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../2024-11-11.ht/globals/metadata.json.gz | Bin 0 -> 298 bytes .../2024-11-11.ht/globals/parts/.part-0.crc | Bin 0 -> 12 bytes .../2024-11-11.ht/globals/parts/part-0 | Bin 0 -> 360 bytes .../.index.crc | Bin 0 -> 12 bytes .../.metadata.json.gz.crc | Bin 0 -> 12 bytes .../index | Bin 0 -> 138 bytes .../metadata.json.gz | Bin 0 -> 186 bytes .../clinvar/2024-11-11.ht/metadata.json.gz | Bin 0 -> 380 bytes .../2024-11-11.ht/rows/.metadata.json.gz.crc | Bin 0 -> 16 bytes .../2024-11-11.ht/rows/metadata.json.gz | Bin 0 -> 668 bytes ...0-16d3574b-02c6-4ade-8054-836f2bbce002.crc | Bin 0 -> 12 bytes ...art-0-16d3574b-02c6-4ade-8054-836f2bbce002 | Bin 0 -> 100 bytes .../GRCh37/dbnsfp/1.0.ht/.README.txt.crc | Bin 0 -> 12 bytes .../GRCh37/dbnsfp/1.0.ht}/._SUCCESS.crc | Bin .../dbnsfp/1.0.ht/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../GRCh37/dbnsfp/1.0.ht/README.txt | 3 + .../GRCh37/dbnsfp/1.0.ht}/_SUCCESS | 0 .../1.0.ht/globals/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../dbnsfp/1.0.ht/globals/metadata.json.gz | Bin 0 -> 291 bytes .../dbnsfp/1.0.ht/globals/parts/.part-0.crc | Bin 0 -> 12 bytes .../GRCh37/dbnsfp/1.0.ht/globals/parts/part-0 | Bin 0 -> 51 bytes .../.index.crc | Bin 0 -> 12 bytes .../.metadata.json.gz.crc | Bin 0 -> 12 bytes .../index | Bin 0 -> 138 bytes .../metadata.json.gz | Bin 0 -> 186 bytes .../GRCh37/dbnsfp/1.0.ht/metadata.json.gz | Bin 0 -> 378 bytes .../dbnsfp/1.0.ht/rows/.metadata.json.gz.crc | Bin 0 -> 16 bytes .../dbnsfp/1.0.ht/rows/metadata.json.gz | Bin 0 -> 656 bytes ...0-67410585-d883-48cc-8d33-933fff287418.crc | Bin 0 -> 12 bytes ...art-0-67410585-d883-48cc-8d33-933fff287418 | Bin 0 -> 128 bytes .../GRCh37/eigen/1.0.ht/.README.txt.crc | Bin 0 -> 12 bytes 
.../GRCh37/eigen/1.0.ht}/._SUCCESS.crc | Bin .../GRCh37/eigen/1.0.ht/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../GRCh37/eigen/1.0.ht/README.txt | 3 + .../GRCh37/eigen/1.0.ht}/_SUCCESS | 0 .../1.0.ht/globals/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../eigen/1.0.ht/globals/metadata.json.gz | Bin 0 -> 263 bytes .../eigen/1.0.ht/globals/parts/.part-0.crc | Bin 0 -> 12 bytes .../GRCh37/eigen/1.0.ht/globals/parts/part-0 | Bin 0 -> 40 bytes .../.index.crc | Bin 0 -> 12 bytes .../.metadata.json.gz.crc | Bin 0 -> 12 bytes .../index | Bin 0 -> 138 bytes .../metadata.json.gz | Bin 0 -> 186 bytes .../GRCh37/eigen/1.0.ht/metadata.json.gz | Bin 0 -> 313 bytes .../eigen/1.0.ht/rows/.metadata.json.gz.crc | Bin 0 -> 16 bytes .../GRCh37/eigen/1.0.ht/rows/metadata.json.gz | Bin 0 -> 581 bytes ...0-04c0af8a-a562-4e97-a303-1047deca5f45.crc | Bin 0 -> 12 bytes ...art-0-04c0af8a-a562-4e97-a303-1047deca5f45 | Bin 0 -> 107 bytes .../GRCh37/exac/1.0.ht/.README.txt.crc | Bin 0 -> 12 bytes .../GRCh37/exac/1.0.ht}/._SUCCESS.crc | Bin .../GRCh37/exac/1.0.ht/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../GRCh37/exac/1.0.ht/README.txt | 3 + .../GRCh37/exac/1.0.ht}/_SUCCESS | 0 .../exac/1.0.ht/globals/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../exac/1.0.ht/globals/metadata.json.gz | Bin 0 -> 263 bytes .../exac/1.0.ht/globals/parts/.part-0.crc | Bin 0 -> 12 bytes .../GRCh37/exac/1.0.ht/globals/parts/part-0 | Bin 0 -> 40 bytes .../.index.crc | Bin 0 -> 12 bytes .../.metadata.json.gz.crc | Bin 0 -> 12 bytes .../index | Bin 0 -> 138 bytes .../metadata.json.gz | Bin 0 -> 186 bytes .../GRCh37/exac/1.0.ht/metadata.json.gz | Bin 0 -> 338 bytes .../exac/1.0.ht/rows/.metadata.json.gz.crc | Bin 0 -> 16 bytes .../GRCh37/exac/1.0.ht/rows/metadata.json.gz | Bin 0 -> 624 bytes ...0-dc3793f5-157b-42ff-8a87-4e367441c4b7.crc | Bin 0 -> 12 bytes ...art-0-dc3793f5-157b-42ff-8a87-4e367441c4b7 | Bin 0 -> 119 bytes .../1.0.ht/.README.txt.crc | Bin 0 -> 12 bytes .../1.0.ht}/._SUCCESS.crc | Bin 
.../1.0.ht/.metadata.json.gz.crc | Bin 0 -> 16 bytes .../1.0.ht}/README.txt | 2 +- .../1.0.ht}/_SUCCESS | 0 .../1.0.ht/globals/.metadata.json.gz.crc | Bin 0 -> 16 bytes .../1.0.ht/globals/metadata.json.gz | Bin 0 -> 776 bytes .../1.0.ht/globals/parts/.part-0.crc | Bin 0 -> 32 bytes .../1.0.ht/globals/parts/part-0 | Bin 0 -> 2599 bytes .../.index.crc | Bin .../.metadata.json.gz.crc | Bin .../index | Bin .../metadata.json.gz | Bin .../1.0.ht/metadata.json.gz | Bin 0 -> 604 bytes .../1.0.ht/rows/.metadata.json.gz.crc | Bin 0 -> 16 bytes .../1.0.ht/rows/metadata.json.gz | Bin 0 -> 596 bytes ...-690f60f1-5897-4a95-9d74-fce92d3e5de7.crc} | Bin ...rt-0-690f60f1-5897-4a95-9d74-fce92d3e5de7} | Bin .../gnomad_exomes/1.0.ht/.README.txt.crc | Bin 0 -> 12 bytes .../gnomad_exomes/1.0.ht}/._SUCCESS.crc | Bin .../1.0.ht/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../GRCh37/gnomad_exomes/1.0.ht/README.txt | 3 + .../GRCh37/gnomad_exomes/1.0.ht}/_SUCCESS | 0 .../1.0.ht/globals/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../1.0.ht/globals/metadata.json.gz | Bin 0 -> 263 bytes .../1.0.ht/globals/parts/.part-0.crc | Bin 0 -> 12 bytes .../gnomad_exomes/1.0.ht/globals/parts/part-0 | Bin 0 -> 40 bytes .../.index.crc | Bin 0 -> 12 bytes .../.metadata.json.gz.crc | Bin 0 -> 12 bytes .../index | Bin 0 -> 138 bytes .../metadata.json.gz | Bin 0 -> 186 bytes .../gnomad_exomes/1.0.ht/metadata.json.gz | Bin 0 -> 347 bytes .../1.0.ht/rows/.metadata.json.gz.crc | Bin 0 -> 16 bytes .../1.0.ht/rows/metadata.json.gz | Bin 0 -> 630 bytes ...0-5419bf36-548c-4524-b44c-cd77ed3f191e.crc | Bin 0 -> 12 bytes ...art-0-5419bf36-548c-4524-b44c-cd77ed3f191e | Bin 0 -> 123 bytes .../gnomad_genomes/1.0.ht/.README.txt.crc | Bin 0 -> 12 bytes .../gnomad_genomes/1.0.ht}/._SUCCESS.crc | Bin .../1.0.ht/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../GRCh37/gnomad_genomes/1.0.ht/README.txt | 3 + .../GRCh37/gnomad_genomes/1.0.ht}/_SUCCESS | 0 .../1.0.ht/globals/.metadata.json.gz.crc | Bin 0 -> 12 bytes 
.../1.0.ht/globals/metadata.json.gz | Bin 0 -> 263 bytes .../1.0.ht/globals/parts/.part-0.crc | Bin 0 -> 12 bytes .../1.0.ht/globals/parts/part-0 | Bin 0 -> 40 bytes .../.index.crc | Bin 0 -> 12 bytes .../.metadata.json.gz.crc | Bin 0 -> 12 bytes .../index | Bin 0 -> 138 bytes .../metadata.json.gz | Bin 0 -> 186 bytes .../gnomad_genomes/1.0.ht/metadata.json.gz | Bin 0 -> 347 bytes .../1.0.ht/rows/.metadata.json.gz.crc | Bin 0 -> 16 bytes .../1.0.ht/rows/metadata.json.gz | Bin 0 -> 630 bytes ...0-ef7f1a2e-5a3b-443d-992c-32cbd5d9ceb8.crc | Bin 0 -> 12 bytes ...art-0-ef7f1a2e-5a3b-443d-992c-32cbd5d9ceb8 | Bin 0 -> 100 bytes .../GRCh37/gnomad_qc/1.0.ht/.README.txt.crc | Bin 0 -> 12 bytes .../GRCh37/gnomad_qc/1.0.ht}/._SUCCESS.crc | Bin .../gnomad_qc/1.0.ht/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../GRCh37/gnomad_qc/1.0.ht}/README.txt | 2 +- .../GRCh37/gnomad_qc/1.0.ht}/_SUCCESS | 0 .../1.0.ht/globals/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../gnomad_qc/1.0.ht/globals/metadata.json.gz | Bin 0 -> 254 bytes .../1.0.ht/globals/parts/.part-0.crc | Bin 0 -> 12 bytes .../gnomad_qc/1.0.ht/globals/parts/part-0 | Bin 0 -> 40 bytes .../.index.crc | Bin 0 -> 12 bytes .../.metadata.json.gz.crc | Bin 0 -> 12 bytes .../index | Bin 0 -> 67 bytes .../metadata.json.gz | Bin 0 -> 185 bytes .../GRCh37/gnomad_qc/1.0.ht/metadata.json.gz | Bin 0 -> 473 bytes .../1.0.ht/rows/.metadata.json.gz.crc | Bin 0 -> 16 bytes .../gnomad_qc/1.0.ht/rows/metadata.json.gz | Bin 0 -> 813 bytes ...0-60dc0150-c0ed-4ee2-aa12-a4459d0ae33b.crc | Bin 0 -> 12 bytes ...art-0-60dc0150-c0ed-4ee2-aa12-a4459d0ae33b | Bin 0 -> 191 bytes .../GRCh37/hgmd/1.0.ht/.README.txt.crc | Bin 0 -> 12 bytes .../GRCh37/hgmd/1.0.ht}/._SUCCESS.crc | Bin .../GRCh37/hgmd/1.0.ht/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../GRCh37/hgmd/1.0.ht/README.txt | 3 + .../GRCh37/hgmd/1.0.ht}/_SUCCESS | 0 .../hgmd/1.0.ht/globals/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../hgmd/1.0.ht/globals/metadata.json.gz | Bin 0 -> 280 bytes 
.../hgmd/1.0.ht/globals/parts/.part-0.crc | Bin 0 -> 12 bytes .../GRCh37/hgmd/1.0.ht/globals/parts/part-0 | Bin 0 -> 60 bytes .../.index.crc | Bin 0 -> 12 bytes .../.metadata.json.gz.crc | Bin 0 -> 12 bytes .../index | Bin 0 -> 19 bytes .../metadata.json.gz | Bin 0 -> 182 bytes .../GRCh37/hgmd/1.0.ht/metadata.json.gz | Bin 0 -> 317 bytes .../hgmd/1.0.ht/rows/.metadata.json.gz.crc | Bin 0 -> 16 bytes .../GRCh37/hgmd/1.0.ht/rows/metadata.json.gz | Bin 0 -> 576 bytes ...0-182502ba-0456-4d1b-a8ac-1cdd20cfa893.crc | Bin 0 -> 12 bytes ...art-0-182502ba-0456-4d1b-a8ac-1cdd20cfa893 | Bin 0 -> 35 bytes .../GRCh37/splice_ai/1.0.ht/.README.txt.crc | Bin 0 -> 12 bytes .../GRCh37/splice_ai/1.0.ht}/._SUCCESS.crc | Bin .../splice_ai/1.0.ht/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../GRCh37/splice_ai/1.0.ht/README.txt | 3 + .../GRCh37/splice_ai/1.0.ht}/_SUCCESS | 0 .../1.0.ht/globals/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../splice_ai/1.0.ht/globals/metadata.json.gz | Bin 0 -> 290 bytes .../1.0.ht/globals/parts/.part-0.crc | Bin 0 -> 12 bytes .../splice_ai/1.0.ht/globals/parts/part-0 | Bin 0 -> 92 bytes .../.index.crc | Bin 0 -> 12 bytes .../.metadata.json.gz.crc | Bin 0 -> 12 bytes .../index | Bin 0 -> 138 bytes .../metadata.json.gz | Bin 0 -> 186 bytes .../GRCh37/splice_ai/1.0.ht/metadata.json.gz | Bin 0 -> 335 bytes .../1.0.ht/rows/.metadata.json.gz.crc | Bin 0 -> 16 bytes .../splice_ai/1.0.ht/rows/metadata.json.gz | Bin 0 -> 601 bytes ...0-592864e3-2b8f-4984-b6ac-79d57ab6be5e.crc | Bin 0 -> 12 bytes ...art-0-592864e3-2b8f-4984-b6ac-79d57ab6be5e | Bin 0 -> 123 bytes .../GRCh37/topmed/1.0.ht/.README.txt.crc | Bin 0 -> 12 bytes .../GRCh37/topmed/1.0.ht}/._SUCCESS.crc | Bin .../topmed/1.0.ht/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../GRCh37/topmed/1.0.ht/README.txt | 3 + .../GRCh37/topmed/1.0.ht}/_SUCCESS | 0 .../1.0.ht/globals/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../topmed/1.0.ht/globals/metadata.json.gz | Bin 0 -> 263 bytes 
.../topmed/1.0.ht/globals/parts/.part-0.crc | Bin 0 -> 12 bytes .../GRCh37/topmed/1.0.ht/globals/parts/part-0 | Bin 0 -> 40 bytes .../.index.crc | Bin 0 -> 12 bytes .../.metadata.json.gz.crc | Bin 0 -> 12 bytes .../index | Bin 0 -> 137 bytes .../metadata.json.gz | Bin 0 -> 186 bytes .../GRCh37/topmed/1.0.ht/metadata.json.gz | Bin 0 -> 321 bytes .../topmed/1.0.ht/rows/.metadata.json.gz.crc | Bin 0 -> 16 bytes .../topmed/1.0.ht/rows/metadata.json.gz | Bin 0 -> 600 bytes ...0-c09ec7db-1671-4dc3-95d4-6426532e00f1.crc | Bin 0 -> 12 bytes ...art-0-c09ec7db-1671-4dc3-95d4-6426532e00f1 | Bin 0 -> 99 bytes .../clinvar/2024-11-11.ht/.README.txt.crc | Bin 0 -> 12 bytes .../clinvar/2024-11-11.ht}/._SUCCESS.crc | Bin .../2024-11-11.ht/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../GRCh38/clinvar/2024-11-11.ht/README.txt | 3 + .../GRCh38/clinvar/2024-11-11.ht}/_SUCCESS | 0 .../globals/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../2024-11-11.ht/globals/metadata.json.gz | Bin 0 -> 298 bytes .../2024-11-11.ht/globals/parts/.part-0.crc | Bin 0 -> 12 bytes .../2024-11-11.ht/globals/parts/part-0 | Bin 0 -> 360 bytes .../.index.crc | Bin 0 -> 12 bytes .../.metadata.json.gz.crc | Bin .../index | Bin 0 -> 73 bytes .../metadata.json.gz | Bin .../.index.crc | Bin 0 -> 12 bytes .../.metadata.json.gz.crc | Bin 0 -> 12 bytes .../index | Bin 0 -> 129 bytes .../metadata.json.gz | Bin 0 -> 185 bytes .../clinvar/2024-11-11.ht/metadata.json.gz | Bin 0 -> 381 bytes .../2024-11-11.ht/rows/.metadata.json.gz.crc | Bin 0 -> 16 bytes .../2024-11-11.ht/rows/metadata.json.gz | Bin 0 -> 718 bytes ...0-a71ea1dc-61b1-4cba-985b-155a977bebff.crc | Bin 0 -> 12 bytes ...1-eeb8cbde-9d95-4ba8-bf3c-e7682fbf3168.crc | Bin 0 -> 12 bytes ...art-0-a71ea1dc-61b1-4cba-985b-155a977bebff | Bin 0 -> 52 bytes ...art-1-eeb8cbde-9d95-4ba8-bf3c-e7682fbf3168 | Bin 0 -> 132 bytes .../GRCh38/dbnsfp/1.0.ht/.README.txt.crc | Bin 0 -> 12 bytes .../GRCh38/dbnsfp/1.0.ht/._SUCCESS.crc | Bin 0 -> 8 bytes 
.../dbnsfp/1.0.ht/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../GRCh38/dbnsfp/1.0.ht/README.txt | 3 + .../GRCh38/dbnsfp/1.0.ht/_SUCCESS | 0 .../1.0.ht/globals/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../dbnsfp/1.0.ht/globals/metadata.json.gz | Bin 0 -> 291 bytes .../dbnsfp/1.0.ht/globals/parts/.part-0.crc | Bin 0 -> 12 bytes .../GRCh38/dbnsfp/1.0.ht/globals/parts/part-0 | Bin 0 -> 51 bytes .../.index.crc | Bin 0 -> 12 bytes .../.metadata.json.gz.crc | Bin 0 -> 12 bytes .../index | Bin 0 -> 73 bytes .../metadata.json.gz | Bin 0 -> 185 bytes .../.index.crc | Bin 0 -> 12 bytes .../.metadata.json.gz.crc | Bin 0 -> 12 bytes .../index | Bin 0 -> 129 bytes .../metadata.json.gz | Bin 0 -> 185 bytes .../GRCh38/dbnsfp/1.0.ht/metadata.json.gz | Bin 0 -> 405 bytes .../dbnsfp/1.0.ht/rows/.metadata.json.gz.crc | Bin 0 -> 16 bytes .../dbnsfp/1.0.ht/rows/metadata.json.gz | Bin 0 -> 741 bytes ...0-113d0935-f89b-4d20-9f25-225c16c2f941.crc | Bin 0 -> 12 bytes ...1-a918a0a7-ef41-490f-9d13-73a3e17beead.crc | Bin 0 -> 12 bytes ...art-0-113d0935-f89b-4d20-9f25-225c16c2f941 | Bin 0 -> 61 bytes ...art-1-a918a0a7-ef41-490f-9d13-73a3e17beead | Bin 0 -> 98 bytes .../GRCh38/eigen/1.0.ht/.README.txt.crc | Bin 0 -> 12 bytes .../GRCh38/eigen/1.0.ht/._SUCCESS.crc | Bin 0 -> 8 bytes .../GRCh38/eigen/1.0.ht/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../GRCh38/eigen/1.0.ht/README.txt | 3 + .../GRCh38/eigen/1.0.ht/_SUCCESS | 0 .../1.0.ht/globals/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../eigen/1.0.ht/globals/metadata.json.gz | Bin 0 -> 263 bytes .../eigen/1.0.ht/globals/parts/.part-0.crc | Bin 0 -> 12 bytes .../GRCh38/eigen/1.0.ht/globals/parts/part-0 | Bin 0 -> 40 bytes .../.index.crc | Bin .../.metadata.json.gz.crc | Bin .../index | Bin .../metadata.json.gz | Bin .../GRCh38/eigen/1.0.ht/metadata.json.gz | Bin 0 -> 312 bytes .../eigen/1.0.ht/rows/.metadata.json.gz.crc | Bin 0 -> 16 bytes .../GRCh38/eigen/1.0.ht/rows/metadata.json.gz | Bin 0 -> 575 bytes 
...0-24084335-917b-4b51-8a30-4fe509d64745.crc | Bin 0 -> 12 bytes ...art-0-24084335-917b-4b51-8a30-4fe509d64745 | Bin 0 -> 54 bytes .../GRCh38/exac/1.0.ht/.README.txt.crc | Bin 0 -> 12 bytes .../GRCh38/exac/1.0.ht/._SUCCESS.crc | Bin 0 -> 8 bytes .../GRCh38/exac/1.0.ht/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../GRCh38/exac/1.0.ht/README.txt | 3 + .../GRCh38/exac/1.0.ht/_SUCCESS | 0 .../exac/1.0.ht/globals/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../exac/1.0.ht/globals/metadata.json.gz | Bin 0 -> 263 bytes .../exac/1.0.ht/globals/parts/.part-0.crc | Bin 0 -> 12 bytes .../GRCh38/exac/1.0.ht/globals/parts/part-0 | Bin 0 -> 40 bytes .../.index.crc | Bin .../.metadata.json.gz.crc | Bin .../index | Bin .../metadata.json.gz | Bin .../GRCh38/exac/1.0.ht/metadata.json.gz | Bin 0 -> 337 bytes .../exac/1.0.ht/rows/.metadata.json.gz.crc | Bin 0 -> 16 bytes .../GRCh38/exac/1.0.ht/rows/metadata.json.gz | Bin 0 -> 615 bytes ...0-018c9528-a303-4d50-8cf8-eb42ad4d7486.crc | Bin 0 -> 12 bytes ...art-0-018c9528-a303-4d50-8cf8-eb42ad4d7486 | Bin 0 -> 64 bytes .../1.0.ht/.README.txt.crc | Bin 0 -> 12 bytes .../1.0.ht/._SUCCESS.crc | Bin 0 -> 8 bytes .../1.0.ht/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../1.0.ht/README.txt | 3 + .../1.0.ht/_SUCCESS | 0 .../1.0.ht/globals/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../1.0.ht/globals/metadata.json.gz | Bin 0 -> 254 bytes .../1.0.ht/globals/parts/.part-0.crc | Bin 0 -> 12 bytes .../1.0.ht/globals/parts/part-0 | Bin 0 -> 40 bytes .../.index.crc | Bin .../.metadata.json.gz.crc | Bin .../index | Bin .../metadata.json.gz | Bin .../.index.crc | Bin 0 -> 12 bytes .../.metadata.json.gz.crc | Bin 0 -> 12 bytes .../index | Bin 0 -> 19 bytes .../metadata.json.gz | Bin 0 -> 181 bytes .../.index.crc | Bin .../.metadata.json.gz.crc | Bin .../index | Bin .../metadata.json.gz | Bin .../1.0.ht/metadata.json.gz | Bin 0 -> 294 bytes .../1.0.ht/rows/.metadata.json.gz.crc | Bin 0 -> 16 bytes .../1.0.ht/rows/metadata.json.gz | Bin 0 -> 587 bytes 
...-345f1488-be53-4c4b-8207-b052e86084d6.crc} | Bin ...0-90a40f33-45f1-4319-b895-a6f9f6f3364c.crc | Bin 0 -> 12 bytes ...-f6cdce1a-0e07-4a8e-80de-c0b568f5fa07.crc} | Bin ...rt-0-345f1488-be53-4c4b-8207-b052e86084d6} | Bin ...art-0-90a40f33-45f1-4319-b895-a6f9f6f3364c | Bin 0 -> 35 bytes ...rt-1-f6cdce1a-0e07-4a8e-80de-c0b568f5fa07} | Bin .../gnomad_exomes/1.0.ht/.README.txt.crc | Bin 0 -> 12 bytes .../GRCh38/gnomad_exomes/1.0.ht/._SUCCESS.crc | Bin 0 -> 8 bytes .../1.0.ht/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../GRCh38/gnomad_exomes/1.0.ht/README.txt | 3 + .../GRCh38/gnomad_exomes/1.0.ht/_SUCCESS | 0 .../1.0.ht/globals/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../1.0.ht/globals/metadata.json.gz | Bin 0 -> 263 bytes .../1.0.ht/globals/parts/.part-0.crc | Bin 0 -> 12 bytes .../gnomad_exomes/1.0.ht/globals/parts/part-0 | Bin 0 -> 40 bytes .../.index.crc | Bin .../.metadata.json.gz.crc | Bin .../index | Bin .../metadata.json.gz | Bin .../gnomad_exomes/1.0.ht/metadata.json.gz | Bin 0 -> 346 bytes .../1.0.ht/rows/.metadata.json.gz.crc | Bin 0 -> 16 bytes .../1.0.ht/rows/metadata.json.gz | Bin 0 -> 623 bytes ...0-3ff9afe8-37ef-4f6d-a894-cfc7eb27f97d.crc | Bin 0 -> 12 bytes ...art-0-3ff9afe8-37ef-4f6d-a894-cfc7eb27f97d | Bin 0 -> 68 bytes .../gnomad_genomes/1.0.ht/.README.txt.crc | Bin 0 -> 12 bytes .../gnomad_genomes/1.0.ht/._SUCCESS.crc | Bin 0 -> 8 bytes .../1.0.ht/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../GRCh38/gnomad_genomes/1.0.ht/README.txt | 3 + .../GRCh38/gnomad_genomes/1.0.ht/_SUCCESS | 0 .../1.0.ht/globals/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../1.0.ht/globals/metadata.json.gz | Bin 0 -> 263 bytes .../1.0.ht/globals/parts/.part-0.crc | Bin 0 -> 12 bytes .../1.0.ht/globals/parts/part-0 | Bin 0 -> 40 bytes .../.index.crc | Bin .../.metadata.json.gz.crc | Bin .../index | Bin .../metadata.json.gz | Bin .../gnomad_genomes/1.0.ht/metadata.json.gz | Bin 0 -> 346 bytes .../1.0.ht/rows/.metadata.json.gz.crc | Bin 0 -> 16 bytes 
.../1.0.ht/rows/metadata.json.gz | Bin 0 -> 623 bytes ...0-7791073a-d4da-48f7-903f-59f1ac95d459.crc | Bin 0 -> 12 bytes ...art-0-7791073a-d4da-48f7-903f-59f1ac95d459 | Bin 0 -> 50 bytes .../GRCh38/gnomad_mito/1.0.ht/.README.txt.crc | Bin 0 -> 12 bytes .../GRCh38/gnomad_mito/1.0.ht/._SUCCESS.crc | Bin 0 -> 8 bytes .../gnomad_mito/1.0.ht/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../GRCh38/gnomad_mito/1.0.ht/README.txt | 3 + .../GRCh38/gnomad_mito/1.0.ht/_SUCCESS | 0 .../1.0.ht/globals/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../1.0.ht/globals/metadata.json.gz | Bin 0 -> 263 bytes .../1.0.ht/globals/parts/.part-0.crc | Bin 0 -> 12 bytes .../gnomad_mito/1.0.ht/globals/parts/part-0 | Bin 0 -> 40 bytes .../.index.crc | Bin 0 -> 12 bytes .../.metadata.json.gz.crc | Bin 0 -> 12 bytes .../index | Bin 0 -> 127 bytes .../metadata.json.gz | Bin 0 -> 186 bytes .../gnomad_mito/1.0.ht/metadata.json.gz | Bin 0 -> 334 bytes .../1.0.ht/rows/.metadata.json.gz.crc | Bin 0 -> 16 bytes .../gnomad_mito/1.0.ht/rows/metadata.json.gz | Bin 0 -> 613 bytes ...0-bccae774-994f-469e-9b30-01becb2109a0.crc | Bin 0 -> 12 bytes ...art-0-bccae774-994f-469e-9b30-01becb2109a0 | Bin 0 -> 107 bytes .../1.0.ht/.README.txt.crc | Bin 0 -> 12 bytes .../1.0.ht/._SUCCESS.crc | Bin 0 -> 8 bytes .../1.0.ht/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../1.0.ht/README.txt | 3 + .../1.0.ht/_SUCCESS | 0 .../1.0.ht/globals/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../1.0.ht/globals/metadata.json.gz | Bin 0 -> 263 bytes .../1.0.ht/globals/parts/.part-0.crc | Bin 0 -> 12 bytes .../1.0.ht/globals/parts/part-0 | Bin 0 -> 40 bytes .../.index.crc | Bin .../.metadata.json.gz.crc | Bin .../index | Bin .../metadata.json.gz | Bin .../1.0.ht/metadata.json.gz | Bin 0 -> 299 bytes .../1.0.ht/rows/.metadata.json.gz.crc | Bin 0 -> 16 bytes .../1.0.ht/rows/metadata.json.gz | Bin 0 -> 591 bytes ...0-17cf743a-b6dc-4c51-ae0c-c4ffa69513ba.crc | Bin 0 -> 12 bytes ...art-0-17cf743a-b6dc-4c51-ae0c-c4ffa69513ba | Bin 0 -> 58 bytes 
.../GRCh38/gnomad_qc/1.0.ht/.README.txt.crc | Bin 0 -> 12 bytes .../GRCh38/gnomad_qc/1.0.ht/._SUCCESS.crc | Bin 0 -> 8 bytes .../gnomad_qc/1.0.ht/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../GRCh38/gnomad_qc/1.0.ht}/README.txt | 2 +- .../GRCh38/gnomad_qc/1.0.ht/_SUCCESS | 0 .../1.0.ht/globals/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../gnomad_qc/1.0.ht/globals/metadata.json.gz | Bin 0 -> 254 bytes .../1.0.ht/globals/parts/.part-0.crc | Bin 0 -> 12 bytes .../gnomad_qc/1.0.ht/globals/parts/part-0 | Bin 0 -> 40 bytes .../.index.crc | Bin 0 -> 12 bytes .../.metadata.json.gz.crc | Bin 0 -> 12 bytes .../index | Bin 0 -> 73 bytes .../metadata.json.gz | Bin 0 -> 185 bytes .../GRCh38/gnomad_qc/1.0.ht/metadata.json.gz | Bin 0 -> 310 bytes .../1.0.ht/rows/.metadata.json.gz.crc | Bin 0 -> 16 bytes .../gnomad_qc/1.0.ht/rows/metadata.json.gz | Bin 0 -> 600 bytes ...0-46f30121-756f-4290-b7f1-e0f9993c9593.crc | Bin 0 -> 12 bytes ...art-0-46f30121-756f-4290-b7f1-e0f9993c9593 | Bin 0 -> 305 bytes .../GRCh38/helix_mito/1.0.ht/.README.txt.crc | Bin 0 -> 12 bytes .../GRCh38/helix_mito/1.0.ht/._SUCCESS.crc | Bin 0 -> 8 bytes .../helix_mito/1.0.ht/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../GRCh38/helix_mito/1.0.ht/README.txt | 3 + .../GRCh38/helix_mito/1.0.ht/_SUCCESS | 0 .../1.0.ht/globals/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../1.0.ht/globals/metadata.json.gz | Bin 0 -> 263 bytes .../1.0.ht/globals/parts/.part-0.crc | Bin 0 -> 12 bytes .../helix_mito/1.0.ht/globals/parts/part-0 | Bin 0 -> 40 bytes .../.index.crc | Bin 0 -> 12 bytes .../.metadata.json.gz.crc | Bin 0 -> 12 bytes .../index | Bin 0 -> 127 bytes .../metadata.json.gz | Bin 0 -> 186 bytes .../GRCh38/helix_mito/1.0.ht/metadata.json.gz | Bin 0 -> 332 bytes .../1.0.ht/rows/.metadata.json.gz.crc | Bin 0 -> 16 bytes .../helix_mito/1.0.ht/rows/metadata.json.gz | Bin 0 -> 611 bytes ...0-eceecf38-7b1a-46ab-98c2-147256aff633.crc | Bin 0 -> 12 bytes ...art-0-eceecf38-7b1a-46ab-98c2-147256aff633 | Bin 0 -> 107 bytes 
.../GRCh38/hgmd/1.0.ht/.README.txt.crc | Bin 0 -> 12 bytes .../GRCh38/hgmd/1.0.ht/._SUCCESS.crc | Bin 0 -> 8 bytes .../GRCh38/hgmd/1.0.ht/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../GRCh38/hgmd/1.0.ht/README.txt | 3 + .../GRCh38/hgmd/1.0.ht/_SUCCESS | 0 .../hgmd/1.0.ht/globals/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../hgmd/1.0.ht/globals/metadata.json.gz | Bin 0 -> 281 bytes .../hgmd/1.0.ht/globals/parts/.part-0.crc | Bin 0 -> 12 bytes .../GRCh38/hgmd/1.0.ht/globals/parts/part-0 | Bin 0 -> 63 bytes .../.index.crc | Bin .../.metadata.json.gz.crc | Bin .../index | Bin .../metadata.json.gz | Bin .../GRCh38/hgmd/1.0.ht/metadata.json.gz | Bin 0 -> 317 bytes .../hgmd/1.0.ht/rows/.metadata.json.gz.crc | Bin 0 -> 16 bytes .../GRCh38/hgmd/1.0.ht/rows/metadata.json.gz | Bin 0 -> 576 bytes ...-2accd7be-40d6-42bd-abc3-f6dc7b382f0a.crc} | Bin ...rt-0-2accd7be-40d6-42bd-abc3-f6dc7b382f0a} | Bin .../GRCh38/hmtvar/1.0.ht/.README.txt.crc | Bin 0 -> 12 bytes .../GRCh38/hmtvar/1.0.ht/._SUCCESS.crc | Bin 0 -> 8 bytes .../hmtvar/1.0.ht/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../GRCh38/hmtvar/1.0.ht/README.txt | 3 + .../GRCh38/hmtvar/1.0.ht/_SUCCESS | 0 .../1.0.ht/globals/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../hmtvar/1.0.ht/globals/metadata.json.gz | Bin 0 -> 263 bytes .../hmtvar/1.0.ht/globals/parts/.part-0.crc | Bin 0 -> 12 bytes .../GRCh38/hmtvar/1.0.ht/globals/parts/part-0 | Bin 0 -> 40 bytes .../.index.crc | Bin 0 -> 12 bytes .../.metadata.json.gz.crc | Bin 0 -> 12 bytes .../index | Bin 0 -> 129 bytes .../metadata.json.gz | Bin 0 -> 185 bytes .../GRCh38/hmtvar/1.0.ht/metadata.json.gz | Bin 0 -> 309 bytes .../hmtvar/1.0.ht/rows/.metadata.json.gz.crc | Bin 0 -> 16 bytes .../hmtvar/1.0.ht/rows/metadata.json.gz | Bin 0 -> 574 bytes ...0-c858683f-c7bf-4a88-baab-d7bdeb020fa5.crc | Bin 0 -> 12 bytes ...art-0-c858683f-c7bf-4a88-baab-d7bdeb020fa5 | Bin 0 -> 127 bytes .../1.0.ht/.README.txt.crc | Bin 0 -> 12 bytes .../1.0.ht/._SUCCESS.crc | Bin 0 -> 8 bytes 
.../1.0.ht/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../local_constraint_mito/1.0.ht/README.txt | 3 + .../local_constraint_mito/1.0.ht/_SUCCESS | 0 .../1.0.ht/globals/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../1.0.ht/globals/metadata.json.gz | Bin 0 -> 263 bytes .../1.0.ht/globals/parts/.part-0.crc | Bin 0 -> 12 bytes .../1.0.ht/globals/parts/part-0 | Bin 0 -> 40 bytes .../.index.crc | Bin 0 -> 12 bytes .../.metadata.json.gz.crc | Bin 0 -> 12 bytes .../index | Bin 0 -> 129 bytes .../metadata.json.gz | Bin 0 -> 185 bytes .../1.0.ht/metadata.json.gz | Bin 0 -> 309 bytes .../1.0.ht/rows/.metadata.json.gz.crc | Bin 0 -> 16 bytes .../1.0.ht/rows/metadata.json.gz | Bin 0 -> 576 bytes ...0-b707f718-6196-4c02-9d68-148cf0c9438e.crc | Bin 0 -> 12 bytes ...art-0-b707f718-6196-4c02-9d68-148cf0c9438e | Bin 0 -> 103 bytes .../GRCh38/mitimpact/1.0.ht/.README.txt.crc | Bin 0 -> 12 bytes .../GRCh38/mitimpact/1.0.ht/._SUCCESS.crc | Bin 0 -> 8 bytes .../mitimpact/1.0.ht/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../GRCh38/mitimpact/1.0.ht/README.txt | 3 + .../GRCh38/mitimpact/1.0.ht/_SUCCESS | 0 .../1.0.ht/globals/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../mitimpact/1.0.ht/globals/metadata.json.gz | Bin 0 -> 263 bytes .../1.0.ht/globals/parts/.part-0.crc | Bin 0 -> 12 bytes .../mitimpact/1.0.ht/globals/parts/part-0 | Bin 0 -> 40 bytes .../.index.crc | Bin 0 -> 12 bytes .../.metadata.json.gz.crc | Bin 0 -> 12 bytes .../index | Bin 0 -> 129 bytes .../metadata.json.gz | Bin 0 -> 185 bytes .../GRCh38/mitimpact/1.0.ht/metadata.json.gz | Bin 0 -> 309 bytes .../1.0.ht/rows/.metadata.json.gz.crc | Bin 0 -> 16 bytes .../mitimpact/1.0.ht/rows/metadata.json.gz | Bin 0 -> 576 bytes ...0-e16f2759-68b2-4794-978c-4bfcd2f29974.crc | Bin 0 -> 12 bytes ...art-0-e16f2759-68b2-4794-978c-4bfcd2f29974 | Bin 0 -> 103 bytes .../GRCh38/mitomap/1.0.ht/.README.txt.crc | Bin 0 -> 12 bytes .../GRCh38/mitomap/1.0.ht/._SUCCESS.crc | Bin 0 -> 8 bytes .../mitomap/1.0.ht/.metadata.json.gz.crc | Bin 0 -> 12 
bytes .../GRCh38/mitomap/1.0.ht/README.txt | 3 + .../GRCh38/mitomap/1.0.ht/_SUCCESS | 0 .../1.0.ht/globals/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../mitomap/1.0.ht/globals/metadata.json.gz | Bin 0 -> 263 bytes .../mitomap/1.0.ht/globals/parts/.part-0.crc | Bin 0 -> 12 bytes .../mitomap/1.0.ht/globals/parts/part-0 | Bin 0 -> 40 bytes .../.index.crc | Bin 0 -> 12 bytes .../.metadata.json.gz.crc | Bin 0 -> 12 bytes .../index | Bin 0 -> 127 bytes .../metadata.json.gz | Bin 0 -> 186 bytes .../GRCh38/mitomap/1.0.ht/metadata.json.gz | Bin 0 -> 311 bytes .../mitomap/1.0.ht/rows/.metadata.json.gz.crc | Bin 0 -> 16 bytes .../mitomap/1.0.ht/rows/metadata.json.gz | Bin 0 -> 577 bytes ...0-430d2a33-3c80-49e7-91ec-31484d8fc41b.crc | Bin 0 -> 12 bytes ...art-0-430d2a33-3c80-49e7-91ec-31484d8fc41b | Bin 0 -> 95 bytes .../GRCh38/screen/1.0.ht/.README.txt.crc | Bin 0 -> 12 bytes .../GRCh38/screen/1.0.ht/._SUCCESS.crc | Bin 0 -> 8 bytes .../screen/1.0.ht/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../GRCh38/screen/1.0.ht/README.txt | 3 + .../GRCh38/screen/1.0.ht/_SUCCESS | 0 .../1.0.ht/globals/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../screen/1.0.ht/globals/metadata.json.gz | Bin 0 -> 285 bytes .../screen/1.0.ht/globals/parts/.part-0.crc | Bin 0 -> 12 bytes .../GRCh38/screen/1.0.ht/globals/parts/part-0 | Bin 0 -> 111 bytes .../.index.crc | Bin 0 -> 12 bytes .../.metadata.json.gz.crc | Bin 0 -> 12 bytes .../index | Bin 0 -> 79 bytes .../metadata.json.gz | Bin 0 -> 176 bytes .../GRCh38/screen/1.0.ht/metadata.json.gz | Bin 0 -> 313 bytes .../screen/1.0.ht/rows/.metadata.json.gz.crc | Bin 0 -> 16 bytes .../screen/1.0.ht/rows/metadata.json.gz | Bin 0 -> 594 bytes ...0-6ac8d5d4-cb28-4030-9208-f0d0e0f595e1.crc | Bin 0 -> 12 bytes ...art-0-6ac8d5d4-cb28-4030-9208-f0d0e0f595e1 | Bin 0 -> 56 bytes .../GRCh38/splice_ai/1.0.ht/.README.txt.crc | Bin 0 -> 12 bytes .../GRCh38/splice_ai/1.0.ht/._SUCCESS.crc | Bin 0 -> 8 bytes .../splice_ai/1.0.ht/.metadata.json.gz.crc | Bin 0 -> 12 bytes 
.../GRCh38/splice_ai/1.0.ht/README.txt | 3 + .../GRCh38/splice_ai/1.0.ht/_SUCCESS | 0 .../1.0.ht/globals/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../splice_ai/1.0.ht/globals/metadata.json.gz | Bin 0 -> 290 bytes .../1.0.ht/globals/parts/.part-0.crc | Bin 0 -> 12 bytes .../splice_ai/1.0.ht/globals/parts/part-0 | Bin 0 -> 92 bytes .../.index.crc | Bin .../.metadata.json.gz.crc | Bin .../index | Bin .../metadata.json.gz | Bin .../GRCh38/splice_ai/1.0.ht/metadata.json.gz | Bin 0 -> 334 bytes .../1.0.ht/rows/.metadata.json.gz.crc | Bin 0 -> 16 bytes .../splice_ai/1.0.ht/rows/metadata.json.gz | Bin 0 -> 594 bytes ...0-6272a9a2-b08b-4926-9552-feb84ffa2308.crc | Bin 0 -> 12 bytes ...art-0-6272a9a2-b08b-4926-9552-feb84ffa2308 | Bin 0 -> 55 bytes .../GRCh38/topmed/1.0.ht/.README.txt.crc | Bin 0 -> 12 bytes .../GRCh38/topmed/1.0.ht/._SUCCESS.crc | Bin 0 -> 8 bytes .../topmed/1.0.ht/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../GRCh38/topmed/1.0.ht/README.txt | 3 + .../GRCh38/topmed/1.0.ht/_SUCCESS | 0 .../1.0.ht/globals/.metadata.json.gz.crc | Bin 0 -> 12 bytes .../topmed/1.0.ht/globals/metadata.json.gz | Bin 0 -> 263 bytes .../topmed/1.0.ht/globals/parts/.part-0.crc | Bin 0 -> 12 bytes .../GRCh38/topmed/1.0.ht/globals/parts/part-0 | Bin 0 -> 40 bytes .../.index.crc | Bin 0 -> 12 bytes .../.metadata.json.gz.crc | Bin 0 -> 12 bytes .../index | Bin 0 -> 69 bytes .../metadata.json.gz | Bin 0 -> 185 bytes .../GRCh38/topmed/1.0.ht/metadata.json.gz | Bin 0 -> 323 bytes .../topmed/1.0.ht/rows/.metadata.json.gz.crc | Bin 0 -> 16 bytes .../topmed/1.0.ht/rows/metadata.json.gz | Bin 0 -> 592 bytes ...0-795ab066-10c9-4aac-ad59-f29794a4b01f.crc | Bin 0 -> 12 bytes ...art-0-795ab066-10c9-4aac-ad59-f29794a4b01f | Bin 0 -> 50 bytes .../test/reference_datasets/raw/clinvar.vcf | 55 + .../var/test/reference_datasets/raw/exac.vcf | 202 +++ .../raw/gnomad_exomes_37.ht/.README.txt.crc | Bin 0 -> 12 bytes .../raw/gnomad_exomes_37.ht/._SUCCESS.crc | Bin 0 -> 8 bytes 
.../gnomad_exomes_37.ht/.metadata.json.gz.crc | Bin 0 -> 24 bytes .../raw/gnomad_exomes_37.ht}/README.txt | 2 +- .../raw/gnomad_exomes_37.ht/_SUCCESS | 0 .../globals/.metadata.json.gz.crc | Bin 0 -> 16 bytes .../globals/metadata.json.gz | Bin 0 -> 776 bytes .../globals/parts/.part-0.crc | Bin 0 -> 36 bytes .../gnomad_exomes_37.ht/globals/parts/part-0 | Bin 0 -> 3438 bytes .../.index.crc | Bin 0 -> 12 bytes .../.metadata.json.gz.crc | Bin 0 -> 12 bytes .../index | Bin 0 -> 65 bytes .../metadata.json.gz | Bin 0 -> 185 bytes .../raw/gnomad_exomes_37.ht/metadata.json.gz | Bin 0 -> 1606 bytes .../rows/.metadata.json.gz.crc | Bin 0 -> 28 bytes .../gnomad_exomes_37.ht/rows/metadata.json.gz | Bin 0 -> 2409 bytes ...0-ac88ea82-778e-4722-b4a5-67b02b78322d.crc | Bin 0 -> 28 bytes ...art-0-ac88ea82-778e-4722-b4a5-67b02b78322d | Bin 0 -> 2166 bytes .../raw/gnomad_exomes_38.ht/.README.txt.crc | Bin 0 -> 12 bytes .../raw/gnomad_exomes_38.ht/._SUCCESS.crc | Bin 0 -> 8 bytes .../gnomad_exomes_38.ht/.metadata.json.gz.crc | Bin 0 -> 24 bytes .../raw/gnomad_exomes_38.ht}/README.txt | 2 +- .../raw/gnomad_exomes_38.ht/_SUCCESS | 0 .../globals/.metadata.json.gz.crc | Bin 0 -> 16 bytes .../globals/metadata.json.gz | Bin 0 -> 898 bytes .../globals/parts/.part-0.crc | Bin 0 -> 60 bytes .../gnomad_exomes_38.ht/globals/parts/part-0 | Bin 0 -> 6499 bytes .../.index.crc | Bin 0 -> 12 bytes .../.metadata.json.gz.crc | Bin 0 -> 12 bytes .../index | Bin 0 -> 71 bytes .../metadata.json.gz | Bin 0 -> 185 bytes .../raw/gnomad_exomes_38.ht/metadata.json.gz | Bin 0 -> 1713 bytes .../rows/.metadata.json.gz.crc | Bin 0 -> 28 bytes .../gnomad_exomes_38.ht/rows/metadata.json.gz | Bin 0 -> 2318 bytes ...0-90a97bd8-3648-4074-89bd-3a64d58266e2.crc | Bin 0 -> 20 bytes ...art-0-90a97bd8-3648-4074-89bd-3a64d58266e2 | Bin 0 -> 1336 bytes .../raw/gnomad_genomes_37.ht/.README.txt.crc | Bin 0 -> 12 bytes .../raw/gnomad_genomes_37.ht/._SUCCESS.crc | Bin 0 -> 8 bytes .../.metadata.json.gz.crc | Bin 0 -> 24 bytes 
.../raw/gnomad_genomes_37.ht/README.txt | 3 + .../raw/gnomad_genomes_37.ht/_SUCCESS | 0 .../globals/.metadata.json.gz.crc | Bin 0 -> 16 bytes .../globals/metadata.json.gz | Bin 0 -> 776 bytes .../globals/parts/.part-0.crc | Bin 0 -> 32 bytes .../gnomad_genomes_37.ht/globals/parts/part-0 | Bin 0 -> 2599 bytes .../.index.crc | Bin 0 -> 12 bytes .../.metadata.json.gz.crc | Bin 0 -> 12 bytes .../index | Bin 0 -> 67 bytes .../metadata.json.gz | Bin 0 -> 185 bytes .../raw/gnomad_genomes_37.ht/metadata.json.gz | Bin 0 -> 1625 bytes .../rows/.metadata.json.gz.crc | Bin 0 -> 28 bytes .../rows/metadata.json.gz | Bin 0 -> 2451 bytes ...0-deb2219c-0ebe-4343-ba3c-143f95c4b24a.crc | Bin 0 -> 24 bytes ...art-0-deb2219c-0ebe-4343-ba3c-143f95c4b24a | Bin 0 -> 1777 bytes .../raw/gnomad_genomes_38.ht/.README.txt.crc | Bin 0 -> 12 bytes .../raw/gnomad_genomes_38.ht/._SUCCESS.crc | Bin 0 -> 8 bytes .../.metadata.json.gz.crc | Bin 0 -> 24 bytes .../raw/gnomad_genomes_38.ht/README.txt | 3 + .../raw/gnomad_genomes_38.ht/_SUCCESS | 0 .../globals/.metadata.json.gz.crc | Bin 0 -> 16 bytes .../globals/metadata.json.gz | Bin 0 -> 776 bytes .../globals/parts/.part-0.crc | Bin 0 -> 56 bytes .../gnomad_genomes_38.ht/globals/parts/part-0 | Bin 0 -> 6048 bytes .../.index.crc | Bin 0 -> 12 bytes .../.metadata.json.gz.crc | Bin 0 -> 12 bytes .../index | Bin 0 -> 71 bytes .../metadata.json.gz | Bin 0 -> 185 bytes .../raw/gnomad_genomes_38.ht/metadata.json.gz | Bin 0 -> 1570 bytes .../rows/.metadata.json.gz.crc | Bin 0 -> 28 bytes .../rows/metadata.json.gz | Bin 0 -> 2217 bytes ...0-a3c7b21c-f8dd-4d21-948b-3746f5229729.crc | Bin 0 -> 24 bytes ...art-0-a3c7b21c-f8dd-4d21-948b-3746f5229729 | Bin 0 -> 1700 bytes .../raw/submission_summary.txt | 100 ++ .../raw}/test_hgmd.vcf | 0 .../reference_datasets/raw/test_mitomap.csv | 4 + 895 files changed, 4008 insertions(+), 6890 deletions(-) delete mode 100755 download_and_create_reference_datasets/v02/create_ht__cadd.py delete mode 100755 
download_and_create_reference_datasets/v02/create_ht__clinvar.py delete mode 100644 download_and_create_reference_datasets/v02/create_ht__combined_reference_data.py delete mode 100644 download_and_create_reference_datasets/v02/create_ht__eigen.py delete mode 100755 download_and_create_reference_datasets/v02/create_ht__mpc.py delete mode 100644 download_and_create_reference_datasets/v02/create_ht__primate_ai.py delete mode 100755 download_and_create_reference_datasets/v02/create_ht__topmed.py delete mode 100644 download_and_create_reference_datasets/v02/hail_scripts/write_1kg_ht.py delete mode 100644 download_and_create_reference_datasets/v02/hail_scripts/write_cadd_ht.py delete mode 100644 download_and_create_reference_datasets/v02/hail_scripts/write_ccREs_ht.py delete mode 100644 download_and_create_reference_datasets/v02/hail_scripts/write_clinvar_ht.py delete mode 100644 download_and_create_reference_datasets/v02/hail_scripts/write_combined_interval_ref_data.py delete mode 100644 download_and_create_reference_datasets/v02/hail_scripts/write_combined_reference_data_ht.py delete mode 100644 download_and_create_reference_datasets/v02/hail_scripts/write_dbnsfp_ht.py delete mode 100644 download_and_create_reference_datasets/v02/hail_scripts/write_splice_ai_ht.py delete mode 100644 download_and_create_reference_datasets/v02/mito/utils.py delete mode 100644 download_and_create_reference_datasets/v02/mito/write_combined_mito_reference_data_hts.py delete mode 100644 download_and_create_reference_datasets/v02/mito/write_mito_helix_ht.py delete mode 100644 download_and_create_reference_datasets/v02/mito/write_mito_hmtvar_ht.py delete mode 100644 download_and_create_reference_datasets/v02/mito/write_mito_mitimpact_ht.py delete mode 100644 download_and_create_reference_datasets/v02/mito/write_mito_mitomap_ht.py delete mode 100644 v03_pipeline/lib/annotations/rdc_dependencies.py delete mode 100644 v03_pipeline/lib/model/cached_reference_dataset_query.py delete mode 100644 
v03_pipeline/lib/model/reference_dataset_collection.py delete mode 100644 v03_pipeline/lib/reference_data/clinvar.py delete mode 100644 v03_pipeline/lib/reference_data/clinvar_test.py delete mode 100644 v03_pipeline/lib/reference_data/compare_globals.py delete mode 100644 v03_pipeline/lib/reference_data/compare_globals_test.py delete mode 100644 v03_pipeline/lib/reference_data/config.py delete mode 100644 v03_pipeline/lib/reference_data/dataset_table_operations.py delete mode 100644 v03_pipeline/lib/reference_data/dataset_table_operations_test.py delete mode 100644 v03_pipeline/lib/reference_data/hgmd.py delete mode 100644 v03_pipeline/lib/reference_data/hgmd_test.py delete mode 100644 v03_pipeline/lib/reference_data/mito.py rename v03_pipeline/lib/{reference_data => reference_datasets}/__init__.py (100%) create mode 100644 v03_pipeline/lib/reference_datasets/clinvar.py create mode 100644 v03_pipeline/lib/reference_datasets/clinvar_path_variants.py create mode 100644 v03_pipeline/lib/reference_datasets/clinvar_test.py create mode 100644 v03_pipeline/lib/reference_datasets/dbnsfp.py create mode 100644 v03_pipeline/lib/reference_datasets/eigen.py create mode 100644 v03_pipeline/lib/reference_datasets/exac.py create mode 100644 v03_pipeline/lib/reference_datasets/exac_test.py rename v03_pipeline/lib/{reference_data => reference_datasets}/gencode/__init__.py (100%) rename v03_pipeline/lib/{reference_data => reference_datasets}/gencode/mapping_gene_ids.py (100%) rename v03_pipeline/lib/{reference_data => reference_datasets}/gencode/mapping_gene_ids_tests.py (97%) create mode 100644 v03_pipeline/lib/reference_datasets/gnomad_coding_and_noncoding.py create mode 100644 v03_pipeline/lib/reference_datasets/gnomad_exomes.py create mode 100644 v03_pipeline/lib/reference_datasets/gnomad_exomes_test.py create mode 100644 v03_pipeline/lib/reference_datasets/gnomad_genomes.py create mode 100644 v03_pipeline/lib/reference_datasets/gnomad_genomes_test.py create mode 100644 
v03_pipeline/lib/reference_datasets/gnomad_mito.py create mode 100644 v03_pipeline/lib/reference_datasets/gnomad_non_coding_constraint.py create mode 100644 v03_pipeline/lib/reference_datasets/gnomad_qc.py create mode 100644 v03_pipeline/lib/reference_datasets/gnomad_utils.py create mode 100644 v03_pipeline/lib/reference_datasets/helix_mito.py create mode 100644 v03_pipeline/lib/reference_datasets/hgmd.py create mode 100644 v03_pipeline/lib/reference_datasets/hgmd_test.py create mode 100644 v03_pipeline/lib/reference_datasets/high_af_variants.py create mode 100644 v03_pipeline/lib/reference_datasets/hmtvar.py create mode 100644 v03_pipeline/lib/reference_datasets/local_constraint_mito.py create mode 100644 v03_pipeline/lib/reference_datasets/misc.py create mode 100644 v03_pipeline/lib/reference_datasets/misc_test.py create mode 100644 v03_pipeline/lib/reference_datasets/mitimpact.py create mode 100644 v03_pipeline/lib/reference_datasets/mitomap.py create mode 100644 v03_pipeline/lib/reference_datasets/mitomap_test.py rename v03_pipeline/lib/{reference_data => reference_datasets}/queries.py (100%) create mode 100644 v03_pipeline/lib/reference_datasets/reference_dataset.py create mode 100644 v03_pipeline/lib/reference_datasets/screen.py create mode 100644 v03_pipeline/lib/reference_datasets/splice_ai.py create mode 100644 v03_pipeline/lib/reference_datasets/topmed.py delete mode 100644 v03_pipeline/lib/tasks/reference_data/update_cached_reference_dataset_queries.py delete mode 100644 v03_pipeline/lib/tasks/reference_data/update_cached_reference_dataset_queries_test.py delete mode 100644 v03_pipeline/lib/tasks/reference_data/updated_cached_reference_dataset_query.py delete mode 100644 v03_pipeline/lib/tasks/reference_data/updated_cached_reference_dataset_query_test.py create mode 100644 v03_pipeline/lib/tasks/reference_data/updated_reference_dataset.py delete mode 100644 v03_pipeline/lib/tasks/reference_data/updated_reference_dataset_collection.py delete mode 100644 
v03_pipeline/lib/tasks/reference_data/updated_reference_dataset_collection_test.py create mode 100644 v03_pipeline/lib/tasks/reference_data/updated_reference_dataset_query.py create mode 100644 v03_pipeline/lib/tasks/reference_data/updated_reference_dataset_query_test.py create mode 100644 v03_pipeline/lib/test/mock_clinvar_urls.py create mode 100644 v03_pipeline/lib/test/mocked_reference_datasets_testcase.py delete mode 100644 v03_pipeline/var/test/reference_data/gnomad_qc_crdq.ht/.README.txt.crc delete mode 100644 v03_pipeline/var/test/reference_data/gnomad_qc_crdq.ht/.metadata.json.gz.crc delete mode 100644 v03_pipeline/var/test/reference_data/gnomad_qc_crdq.ht/README.txt delete mode 100644 v03_pipeline/var/test/reference_data/gnomad_qc_crdq.ht/globals/.metadata.json.gz.crc delete mode 100644 v03_pipeline/var/test/reference_data/gnomad_qc_crdq.ht/globals/metadata.json.gz delete mode 100644 v03_pipeline/var/test/reference_data/gnomad_qc_crdq.ht/globals/parts/.part-0.crc delete mode 100644 v03_pipeline/var/test/reference_data/gnomad_qc_crdq.ht/globals/parts/part-0 delete mode 100644 v03_pipeline/var/test/reference_data/gnomad_qc_crdq.ht/index/part-0-fc4518f0-e0cb-4157-b60d-b6ab4c5f4a75.idx/.index.crc delete mode 100644 v03_pipeline/var/test/reference_data/gnomad_qc_crdq.ht/index/part-0-fc4518f0-e0cb-4157-b60d-b6ab4c5f4a75.idx/index delete mode 100644 v03_pipeline/var/test/reference_data/gnomad_qc_crdq.ht/metadata.json.gz delete mode 100644 v03_pipeline/var/test/reference_data/gnomad_qc_crdq.ht/rows/.metadata.json.gz.crc delete mode 100644 v03_pipeline/var/test/reference_data/gnomad_qc_crdq.ht/rows/metadata.json.gz delete mode 100644 v03_pipeline/var/test/reference_data/gnomad_qc_crdq.ht/rows/parts/.part-0-fc4518f0-e0cb-4157-b60d-b6ab4c5f4a75.crc delete mode 100644 v03_pipeline/var/test/reference_data/gnomad_qc_crdq.ht/rows/parts/part-0-fc4518f0-e0cb-4157-b60d-b6ab4c5f4a75 delete mode 100644 
v03_pipeline/var/test/reference_data/test_clinvar_path_variants_crdq.ht/.README.txt.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_clinvar_path_variants_crdq.ht/.metadata.json.gz.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_clinvar_path_variants_crdq.ht/globals/.metadata.json.gz.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_clinvar_path_variants_crdq.ht/globals/metadata.json.gz delete mode 100644 v03_pipeline/var/test/reference_data/test_clinvar_path_variants_crdq.ht/globals/parts/.part-0.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_clinvar_path_variants_crdq.ht/globals/parts/part-0 delete mode 100644 v03_pipeline/var/test/reference_data/test_clinvar_path_variants_crdq.ht/metadata.json.gz delete mode 100644 v03_pipeline/var/test/reference_data/test_clinvar_path_variants_crdq.ht/rows/.metadata.json.gz.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_clinvar_path_variants_crdq.ht/rows/metadata.json.gz delete mode 100644 v03_pipeline/var/test/reference_data/test_clinvar_path_variants_crdq.ht/rows/parts/.part-0-9e75273d-7113-40e4-a327-453f3451dc8c.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_clinvar_path_variants_crdq.ht/rows/parts/part-0-9e75273d-7113-40e4-a327-453f3451dc8c delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_1.ht.ht/.README.txt.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_1.ht.ht/.metadata.json.gz.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_1.ht.ht/globals/.metadata.json.gz.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_1.ht.ht/globals/metadata.json.gz delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_1.ht.ht/globals/parts/.part-0.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_1.ht.ht/globals/parts/part-0 delete mode 100644 
v03_pipeline/var/test/reference_data/test_combined_1.ht.ht/metadata.json.gz delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_1.ht.ht/rows/.metadata.json.gz.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_1.ht.ht/rows/metadata.json.gz delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_1.ht.ht/rows/parts/.part-0-3569201c-d630-43c4-9056-cbace806fe8d.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_1.ht.ht/rows/parts/part-0-3569201c-d630-43c4-9056-cbace806fe8d delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_1.ht/.README.txt.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_1.ht/.metadata.json.gz.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_1.ht/README.txt delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_1.ht/globals/.metadata.json.gz.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_1.ht/globals/metadata.json.gz delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_1.ht/globals/parts/.part-0.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_1.ht/globals/parts/part-0 delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_1.ht/metadata.json.gz delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_1.ht/rows/.metadata.json.gz.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_1.ht/rows/metadata.json.gz delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_1.ht/rows/parts/.part-0-1d126232-414b-4ffa-aa43-9ed52895fbf2.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_1.ht/rows/parts/part-0-1d126232-414b-4ffa-aa43-9ed52895fbf2 delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_2.ht/.README.txt.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_2.ht/.metadata.json.gz.crc delete mode 100644 
v03_pipeline/var/test/reference_data/test_combined_2.ht/README.txt delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_2.ht/globals/.metadata.json.gz.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_2.ht/globals/metadata.json.gz delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_2.ht/globals/parts/.part-0.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_2.ht/globals/parts/part-0 delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_2.ht/metadata.json.gz delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_2.ht/rows/.metadata.json.gz.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_2.ht/rows/metadata.json.gz delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_2.ht/rows/parts/.part-0-20336911-c437-4deb-9fa4-7c7fe61f0408.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_2.ht/rows/parts/.part-0-7d0599cd-6874-47f8-b6de-a7db0b41817c.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_2.ht/rows/parts/part-0-20336911-c437-4deb-9fa4-7c7fe61f0408 delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_2.ht/rows/parts/part-0-7d0599cd-6874-47f8-b6de-a7db0b41817c delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_37.ht/.README.txt.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_37.ht/.metadata.json.gz.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_37.ht/globals/.metadata.json.gz.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_37.ht/globals/metadata.json.gz delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_37.ht/globals/parts/.part-0.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_37.ht/globals/parts/part-0 delete mode 100644 
v03_pipeline/var/test/reference_data/test_combined_37.ht/index/part-0-6353b1d7-bc23-4f3a-9fa2-dd9321ab97a2.idx/.index.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_37.ht/index/part-0-6353b1d7-bc23-4f3a-9fa2-dd9321ab97a2.idx/.metadata.json.gz.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_37.ht/index/part-0-6353b1d7-bc23-4f3a-9fa2-dd9321ab97a2.idx/index delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_37.ht/index/part-0-6353b1d7-bc23-4f3a-9fa2-dd9321ab97a2.idx/metadata.json.gz delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_37.ht/metadata.json.gz delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_37.ht/rows/.metadata.json.gz.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_37.ht/rows/metadata.json.gz delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_37.ht/rows/parts/.part-0-6353b1d7-bc23-4f3a-9fa2-dd9321ab97a2.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_37.ht/rows/parts/part-0-6353b1d7-bc23-4f3a-9fa2-dd9321ab97a2 delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_mito_1.ht/.README.txt.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_mito_1.ht/.metadata.json.gz.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_mito_1.ht/globals/.metadata.json.gz.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_mito_1.ht/globals/metadata.json.gz delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_mito_1.ht/globals/parts/.part-0.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_mito_1.ht/globals/parts/part-0 delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_mito_1.ht/index/part-0-f96f626e-c873-4613-a02b-88ee1e3f2923.idx/.index.crc delete mode 100644 
v03_pipeline/var/test/reference_data/test_combined_mito_1.ht/index/part-0-f96f626e-c873-4613-a02b-88ee1e3f2923.idx/.metadata.json.gz.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_mito_1.ht/index/part-0-f96f626e-c873-4613-a02b-88ee1e3f2923.idx/index delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_mito_1.ht/index/part-0-f96f626e-c873-4613-a02b-88ee1e3f2923.idx/metadata.json.gz delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_mito_1.ht/metadata.json.gz delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_mito_1.ht/rows/.metadata.json.gz.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_mito_1.ht/rows/metadata.json.gz delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_mito_1.ht/rows/parts/.part-0-f96f626e-c873-4613-a02b-88ee1e3f2923.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_combined_mito_1.ht/rows/parts/part-0-f96f626e-c873-4613-a02b-88ee1e3f2923 delete mode 100644 v03_pipeline/var/test/reference_data/test_gnomad_coding_noncoding_crdq_1.ht/.README.txt.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_gnomad_coding_noncoding_crdq_1.ht/.metadata.json.gz.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_gnomad_coding_noncoding_crdq_1.ht/globals/.metadata.json.gz.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_gnomad_coding_noncoding_crdq_1.ht/globals/metadata.json.gz delete mode 100644 v03_pipeline/var/test/reference_data/test_gnomad_coding_noncoding_crdq_1.ht/globals/parts/.part-0.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_gnomad_coding_noncoding_crdq_1.ht/globals/parts/part-0 delete mode 100644 v03_pipeline/var/test/reference_data/test_gnomad_coding_noncoding_crdq_1.ht/metadata.json.gz delete mode 100644 v03_pipeline/var/test/reference_data/test_gnomad_coding_noncoding_crdq_1.ht/rows/.metadata.json.gz.crc delete mode 100644 
v03_pipeline/var/test/reference_data/test_gnomad_coding_noncoding_crdq_1.ht/rows/metadata.json.gz delete mode 100644 v03_pipeline/var/test/reference_data/test_hgmd_1.ht/.README.txt.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_hgmd_1.ht/.metadata.json.gz.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_hgmd_1.ht/README.txt delete mode 100644 v03_pipeline/var/test/reference_data/test_hgmd_1.ht/globals/.metadata.json.gz.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_hgmd_1.ht/globals/metadata.json.gz delete mode 100644 v03_pipeline/var/test/reference_data/test_hgmd_1.ht/globals/parts/.part-0.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_hgmd_1.ht/globals/parts/part-0 delete mode 100644 v03_pipeline/var/test/reference_data/test_hgmd_1.ht/metadata.json.gz delete mode 100644 v03_pipeline/var/test/reference_data/test_hgmd_1.ht/rows/.metadata.json.gz.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_hgmd_1.ht/rows/metadata.json.gz delete mode 100644 v03_pipeline/var/test/reference_data/test_hgmd_37.ht/.README.txt.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_hgmd_37.ht/.metadata.json.gz.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_hgmd_37.ht/README.txt delete mode 100644 v03_pipeline/var/test/reference_data/test_hgmd_37.ht/globals/.metadata.json.gz.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_hgmd_37.ht/globals/metadata.json.gz delete mode 100644 v03_pipeline/var/test/reference_data/test_hgmd_37.ht/globals/parts/.part-0.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_hgmd_37.ht/globals/parts/part-0 delete mode 100644 v03_pipeline/var/test/reference_data/test_hgmd_37.ht/metadata.json.gz delete mode 100644 v03_pipeline/var/test/reference_data/test_hgmd_37.ht/rows/.metadata.json.gz.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_hgmd_37.ht/rows/metadata.json.gz delete mode 100644 
v03_pipeline/var/test/reference_data/test_interval_1.ht/.README.txt.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_interval_1.ht/.metadata.json.gz.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_interval_1.ht/globals/.metadata.json.gz.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_interval_1.ht/globals/metadata.json.gz delete mode 100644 v03_pipeline/var/test/reference_data/test_interval_1.ht/globals/parts/.part-0.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_interval_1.ht/globals/parts/part-0 delete mode 100644 v03_pipeline/var/test/reference_data/test_interval_1.ht/metadata.json.gz delete mode 100644 v03_pipeline/var/test/reference_data/test_interval_1.ht/rows/.metadata.json.gz.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_interval_1.ht/rows/metadata.json.gz delete mode 100644 v03_pipeline/var/test/reference_data/test_interval_1.ht/rows/parts/.part-0-1224c3b3-ab5b-49d7-8d6d-6084ccbbc683.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_interval_1.ht/rows/parts/part-0-1224c3b3-ab5b-49d7-8d6d-6084ccbbc683 delete mode 100644 v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/.README.txt.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/.metadata.json.gz.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/README.txt delete mode 100644 v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/globals/.metadata.json.gz.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/globals/metadata.json.gz delete mode 100644 v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/globals/parts/.part-0.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/globals/parts/part-0 delete mode 100644 v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/index/part-0-271a7dfb-7fc1-4e43-ac16-af1cf05d0ae0.idx/.index.crc delete mode 
100644 v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/index/part-0-271a7dfb-7fc1-4e43-ac16-af1cf05d0ae0.idx/.metadata.json.gz.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/index/part-0-271a7dfb-7fc1-4e43-ac16-af1cf05d0ae0.idx/index delete mode 100644 v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/index/part-0-271a7dfb-7fc1-4e43-ac16-af1cf05d0ae0.idx/metadata.json.gz delete mode 100644 v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/metadata.json.gz delete mode 100644 v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/rows/.metadata.json.gz.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/rows/metadata.json.gz delete mode 100644 v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/rows/parts/.part-0-271a7dfb-7fc1-4e43-ac16-af1cf05d0ae0.crc delete mode 100644 v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/rows/parts/part-0-271a7dfb-7fc1-4e43-ac16-af1cf05d0ae0 create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/clinvar/2024-11-11.ht/.README.txt.crc rename v03_pipeline/var/test/{reference_data/gnomad_qc_crdq.ht => reference_datasets/GRCh37/clinvar/2024-11-11.ht}/._SUCCESS.crc (100%) create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/clinvar/2024-11-11.ht/.metadata.json.gz.crc rename v03_pipeline/var/test/{reference_data/test_combined_mito_1.ht => reference_datasets/GRCh37/clinvar/2024-11-11.ht}/README.txt (78%) rename v03_pipeline/var/test/{reference_data/gnomad_qc_crdq.ht => reference_datasets/GRCh37/clinvar/2024-11-11.ht}/_SUCCESS (100%) create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/clinvar/2024-11-11.ht/globals/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/clinvar/2024-11-11.ht/globals/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/clinvar/2024-11-11.ht/globals/parts/.part-0.crc create mode 
100644 v03_pipeline/var/test/reference_datasets/GRCh37/clinvar/2024-11-11.ht/globals/parts/part-0 create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/clinvar/2024-11-11.ht/index/part-0-16d3574b-02c6-4ade-8054-836f2bbce002.idx/.index.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/clinvar/2024-11-11.ht/index/part-0-16d3574b-02c6-4ade-8054-836f2bbce002.idx/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/clinvar/2024-11-11.ht/index/part-0-16d3574b-02c6-4ade-8054-836f2bbce002.idx/index create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/clinvar/2024-11-11.ht/index/part-0-16d3574b-02c6-4ade-8054-836f2bbce002.idx/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/clinvar/2024-11-11.ht/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/clinvar/2024-11-11.ht/rows/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/clinvar/2024-11-11.ht/rows/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/clinvar/2024-11-11.ht/rows/parts/.part-0-16d3574b-02c6-4ade-8054-836f2bbce002.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/clinvar/2024-11-11.ht/rows/parts/part-0-16d3574b-02c6-4ade-8054-836f2bbce002 create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/dbnsfp/1.0.ht/.README.txt.crc rename v03_pipeline/var/test/{reference_data/test_clinvar_path_variants_crdq.ht => reference_datasets/GRCh37/dbnsfp/1.0.ht}/._SUCCESS.crc (100%) create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/dbnsfp/1.0.ht/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/dbnsfp/1.0.ht/README.txt rename v03_pipeline/var/test/{reference_data/test_clinvar_path_variants_crdq.ht => reference_datasets/GRCh37/dbnsfp/1.0.ht}/_SUCCESS (100%) create mode 100644 
v03_pipeline/var/test/reference_datasets/GRCh37/dbnsfp/1.0.ht/globals/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/dbnsfp/1.0.ht/globals/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/dbnsfp/1.0.ht/globals/parts/.part-0.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/dbnsfp/1.0.ht/globals/parts/part-0 create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/dbnsfp/1.0.ht/index/part-0-67410585-d883-48cc-8d33-933fff287418.idx/.index.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/dbnsfp/1.0.ht/index/part-0-67410585-d883-48cc-8d33-933fff287418.idx/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/dbnsfp/1.0.ht/index/part-0-67410585-d883-48cc-8d33-933fff287418.idx/index create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/dbnsfp/1.0.ht/index/part-0-67410585-d883-48cc-8d33-933fff287418.idx/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/dbnsfp/1.0.ht/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/dbnsfp/1.0.ht/rows/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/dbnsfp/1.0.ht/rows/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/dbnsfp/1.0.ht/rows/parts/.part-0-67410585-d883-48cc-8d33-933fff287418.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/dbnsfp/1.0.ht/rows/parts/part-0-67410585-d883-48cc-8d33-933fff287418 create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/.README.txt.crc rename v03_pipeline/var/test/{reference_data/test_combined_1.ht.ht => reference_datasets/GRCh37/eigen/1.0.ht}/._SUCCESS.crc (100%) create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/.metadata.json.gz.crc create mode 100644 
v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/README.txt rename v03_pipeline/var/test/{reference_data/test_combined_1.ht.ht => reference_datasets/GRCh37/eigen/1.0.ht}/_SUCCESS (100%) create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/globals/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/globals/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/globals/parts/.part-0.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/globals/parts/part-0 create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/index/part-0-04c0af8a-a562-4e97-a303-1047deca5f45.idx/.index.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/index/part-0-04c0af8a-a562-4e97-a303-1047deca5f45.idx/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/index/part-0-04c0af8a-a562-4e97-a303-1047deca5f45.idx/index create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/index/part-0-04c0af8a-a562-4e97-a303-1047deca5f45.idx/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/rows/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/rows/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/rows/parts/.part-0-04c0af8a-a562-4e97-a303-1047deca5f45.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/rows/parts/part-0-04c0af8a-a562-4e97-a303-1047deca5f45 create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/exac/1.0.ht/.README.txt.crc rename v03_pipeline/var/test/{reference_data/test_combined_1.ht => 
reference_datasets/GRCh37/exac/1.0.ht}/._SUCCESS.crc (100%) create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/exac/1.0.ht/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/exac/1.0.ht/README.txt rename v03_pipeline/var/test/{reference_data/test_combined_1.ht => reference_datasets/GRCh37/exac/1.0.ht}/_SUCCESS (100%) create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/exac/1.0.ht/globals/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/exac/1.0.ht/globals/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/exac/1.0.ht/globals/parts/.part-0.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/exac/1.0.ht/globals/parts/part-0 create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/exac/1.0.ht/index/part-0-dc3793f5-157b-42ff-8a87-4e367441c4b7.idx/.index.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/exac/1.0.ht/index/part-0-dc3793f5-157b-42ff-8a87-4e367441c4b7.idx/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/exac/1.0.ht/index/part-0-dc3793f5-157b-42ff-8a87-4e367441c4b7.idx/index create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/exac/1.0.ht/index/part-0-dc3793f5-157b-42ff-8a87-4e367441c4b7.idx/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/exac/1.0.ht/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/exac/1.0.ht/rows/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/exac/1.0.ht/rows/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/exac/1.0.ht/rows/parts/.part-0-dc3793f5-157b-42ff-8a87-4e367441c4b7.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/exac/1.0.ht/rows/parts/part-0-dc3793f5-157b-42ff-8a87-4e367441c4b7 create mode 100644 
v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_coding_and_noncoding/1.0.ht/.README.txt.crc rename v03_pipeline/var/test/{reference_data/test_combined_2.ht => reference_datasets/GRCh37/gnomad_coding_and_noncoding/1.0.ht}/._SUCCESS.crc (100%) create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_coding_and_noncoding/1.0.ht/.metadata.json.gz.crc rename v03_pipeline/var/test/{reference_data/test_combined_37.ht => reference_datasets/GRCh37/gnomad_coding_and_noncoding/1.0.ht}/README.txt (78%) rename v03_pipeline/var/test/{reference_data/test_combined_2.ht => reference_datasets/GRCh37/gnomad_coding_and_noncoding/1.0.ht}/_SUCCESS (100%) create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_coding_and_noncoding/1.0.ht/globals/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_coding_and_noncoding/1.0.ht/globals/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_coding_and_noncoding/1.0.ht/globals/parts/.part-0.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_coding_and_noncoding/1.0.ht/globals/parts/part-0 rename v03_pipeline/var/test/{reference_data/test_hgmd_37.ht/index/part-0-595a2be1-bb68-41eb-8367-dc7333299edc.idx => reference_datasets/GRCh37/gnomad_coding_and_noncoding/1.0.ht/index/part-0-690f60f1-5897-4a95-9d74-fce92d3e5de7.idx}/.index.crc (100%) rename v03_pipeline/var/test/{reference_data/test_hgmd_37.ht/index/part-0-595a2be1-bb68-41eb-8367-dc7333299edc.idx => reference_datasets/GRCh37/gnomad_coding_and_noncoding/1.0.ht/index/part-0-690f60f1-5897-4a95-9d74-fce92d3e5de7.idx}/.metadata.json.gz.crc (100%) rename v03_pipeline/var/test/{reference_data/test_hgmd_37.ht/index/part-0-595a2be1-bb68-41eb-8367-dc7333299edc.idx => reference_datasets/GRCh37/gnomad_coding_and_noncoding/1.0.ht/index/part-0-690f60f1-5897-4a95-9d74-fce92d3e5de7.idx}/index (100%) rename 
v03_pipeline/var/test/{reference_data/test_hgmd_37.ht/index/part-0-595a2be1-bb68-41eb-8367-dc7333299edc.idx => reference_datasets/GRCh37/gnomad_coding_and_noncoding/1.0.ht/index/part-0-690f60f1-5897-4a95-9d74-fce92d3e5de7.idx}/metadata.json.gz (100%) create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_coding_and_noncoding/1.0.ht/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_coding_and_noncoding/1.0.ht/rows/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_coding_and_noncoding/1.0.ht/rows/metadata.json.gz rename v03_pipeline/var/test/{reference_data/test_hgmd_37.ht/rows/parts/.part-0-595a2be1-bb68-41eb-8367-dc7333299edc.crc => reference_datasets/GRCh37/gnomad_coding_and_noncoding/1.0.ht/rows/parts/.part-0-690f60f1-5897-4a95-9d74-fce92d3e5de7.crc} (100%) rename v03_pipeline/var/test/{reference_data/test_hgmd_37.ht/rows/parts/part-0-595a2be1-bb68-41eb-8367-dc7333299edc => reference_datasets/GRCh37/gnomad_coding_and_noncoding/1.0.ht/rows/parts/part-0-690f60f1-5897-4a95-9d74-fce92d3e5de7} (100%) create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_exomes/1.0.ht/.README.txt.crc rename v03_pipeline/var/test/{reference_data/test_combined_37.ht => reference_datasets/GRCh37/gnomad_exomes/1.0.ht}/._SUCCESS.crc (100%) create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_exomes/1.0.ht/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_exomes/1.0.ht/README.txt rename v03_pipeline/var/test/{reference_data/test_combined_37.ht => reference_datasets/GRCh37/gnomad_exomes/1.0.ht}/_SUCCESS (100%) create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_exomes/1.0.ht/globals/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_exomes/1.0.ht/globals/metadata.json.gz create mode 100644 
v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_exomes/1.0.ht/globals/parts/.part-0.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_exomes/1.0.ht/globals/parts/part-0 create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_exomes/1.0.ht/index/part-0-5419bf36-548c-4524-b44c-cd77ed3f191e.idx/.index.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_exomes/1.0.ht/index/part-0-5419bf36-548c-4524-b44c-cd77ed3f191e.idx/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_exomes/1.0.ht/index/part-0-5419bf36-548c-4524-b44c-cd77ed3f191e.idx/index create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_exomes/1.0.ht/index/part-0-5419bf36-548c-4524-b44c-cd77ed3f191e.idx/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_exomes/1.0.ht/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_exomes/1.0.ht/rows/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_exomes/1.0.ht/rows/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_exomes/1.0.ht/rows/parts/.part-0-5419bf36-548c-4524-b44c-cd77ed3f191e.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_exomes/1.0.ht/rows/parts/part-0-5419bf36-548c-4524-b44c-cd77ed3f191e create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_genomes/1.0.ht/.README.txt.crc rename v03_pipeline/var/test/{reference_data/test_combined_mito_1.ht => reference_datasets/GRCh37/gnomad_genomes/1.0.ht}/._SUCCESS.crc (100%) create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_genomes/1.0.ht/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_genomes/1.0.ht/README.txt rename v03_pipeline/var/test/{reference_data/test_combined_mito_1.ht => 
reference_datasets/GRCh37/gnomad_genomes/1.0.ht}/_SUCCESS (100%) create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_genomes/1.0.ht/globals/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_genomes/1.0.ht/globals/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_genomes/1.0.ht/globals/parts/.part-0.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_genomes/1.0.ht/globals/parts/part-0 create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_genomes/1.0.ht/index/part-0-ef7f1a2e-5a3b-443d-992c-32cbd5d9ceb8.idx/.index.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_genomes/1.0.ht/index/part-0-ef7f1a2e-5a3b-443d-992c-32cbd5d9ceb8.idx/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_genomes/1.0.ht/index/part-0-ef7f1a2e-5a3b-443d-992c-32cbd5d9ceb8.idx/index create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_genomes/1.0.ht/index/part-0-ef7f1a2e-5a3b-443d-992c-32cbd5d9ceb8.idx/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_genomes/1.0.ht/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_genomes/1.0.ht/rows/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_genomes/1.0.ht/rows/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_genomes/1.0.ht/rows/parts/.part-0-ef7f1a2e-5a3b-443d-992c-32cbd5d9ceb8.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_genomes/1.0.ht/rows/parts/part-0-ef7f1a2e-5a3b-443d-992c-32cbd5d9ceb8 create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_qc/1.0.ht/.README.txt.crc rename v03_pipeline/var/test/{reference_data/test_gnomad_coding_noncoding_crdq_1.ht => 
reference_datasets/GRCh37/gnomad_qc/1.0.ht}/._SUCCESS.crc (100%) create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_qc/1.0.ht/.metadata.json.gz.crc rename v03_pipeline/var/test/{reference_data/test_clinvar_path_variants_crdq.ht => reference_datasets/GRCh37/gnomad_qc/1.0.ht}/README.txt (78%) rename v03_pipeline/var/test/{reference_data/test_gnomad_coding_noncoding_crdq_1.ht => reference_datasets/GRCh37/gnomad_qc/1.0.ht}/_SUCCESS (100%) create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_qc/1.0.ht/globals/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_qc/1.0.ht/globals/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_qc/1.0.ht/globals/parts/.part-0.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_qc/1.0.ht/globals/parts/part-0 create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_qc/1.0.ht/index/part-0-60dc0150-c0ed-4ee2-aa12-a4459d0ae33b.idx/.index.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_qc/1.0.ht/index/part-0-60dc0150-c0ed-4ee2-aa12-a4459d0ae33b.idx/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_qc/1.0.ht/index/part-0-60dc0150-c0ed-4ee2-aa12-a4459d0ae33b.idx/index create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_qc/1.0.ht/index/part-0-60dc0150-c0ed-4ee2-aa12-a4459d0ae33b.idx/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_qc/1.0.ht/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_qc/1.0.ht/rows/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_qc/1.0.ht/rows/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_qc/1.0.ht/rows/parts/.part-0-60dc0150-c0ed-4ee2-aa12-a4459d0ae33b.crc create mode 100644 
v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_qc/1.0.ht/rows/parts/part-0-60dc0150-c0ed-4ee2-aa12-a4459d0ae33b create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/.README.txt.crc rename v03_pipeline/var/test/{reference_data/test_hgmd_1.ht => reference_datasets/GRCh37/hgmd/1.0.ht}/._SUCCESS.crc (100%) create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/README.txt rename v03_pipeline/var/test/{reference_data/test_hgmd_1.ht => reference_datasets/GRCh37/hgmd/1.0.ht}/_SUCCESS (100%) create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/globals/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/globals/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/globals/parts/.part-0.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/globals/parts/part-0 create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/index/part-0-182502ba-0456-4d1b-a8ac-1cdd20cfa893.idx/.index.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/index/part-0-182502ba-0456-4d1b-a8ac-1cdd20cfa893.idx/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/index/part-0-182502ba-0456-4d1b-a8ac-1cdd20cfa893.idx/index create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/index/part-0-182502ba-0456-4d1b-a8ac-1cdd20cfa893.idx/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/rows/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/rows/metadata.json.gz create mode 100644 
v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/rows/parts/.part-0-182502ba-0456-4d1b-a8ac-1cdd20cfa893.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/rows/parts/part-0-182502ba-0456-4d1b-a8ac-1cdd20cfa893 create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/splice_ai/1.0.ht/.README.txt.crc rename v03_pipeline/var/test/{reference_data/test_hgmd_37.ht => reference_datasets/GRCh37/splice_ai/1.0.ht}/._SUCCESS.crc (100%) create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/splice_ai/1.0.ht/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/splice_ai/1.0.ht/README.txt rename v03_pipeline/var/test/{reference_data/test_hgmd_37.ht => reference_datasets/GRCh37/splice_ai/1.0.ht}/_SUCCESS (100%) create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/splice_ai/1.0.ht/globals/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/splice_ai/1.0.ht/globals/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/splice_ai/1.0.ht/globals/parts/.part-0.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/splice_ai/1.0.ht/globals/parts/part-0 create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/splice_ai/1.0.ht/index/part-0-592864e3-2b8f-4984-b6ac-79d57ab6be5e.idx/.index.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/splice_ai/1.0.ht/index/part-0-592864e3-2b8f-4984-b6ac-79d57ab6be5e.idx/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/splice_ai/1.0.ht/index/part-0-592864e3-2b8f-4984-b6ac-79d57ab6be5e.idx/index create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/splice_ai/1.0.ht/index/part-0-592864e3-2b8f-4984-b6ac-79d57ab6be5e.idx/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/splice_ai/1.0.ht/metadata.json.gz create mode 100644 
v03_pipeline/var/test/reference_datasets/GRCh37/splice_ai/1.0.ht/rows/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/splice_ai/1.0.ht/rows/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/splice_ai/1.0.ht/rows/parts/.part-0-592864e3-2b8f-4984-b6ac-79d57ab6be5e.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/splice_ai/1.0.ht/rows/parts/part-0-592864e3-2b8f-4984-b6ac-79d57ab6be5e create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/topmed/1.0.ht/.README.txt.crc rename v03_pipeline/var/test/{reference_data/test_interval_1.ht => reference_datasets/GRCh37/topmed/1.0.ht}/._SUCCESS.crc (100%) create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/topmed/1.0.ht/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/topmed/1.0.ht/README.txt rename v03_pipeline/var/test/{reference_data/test_interval_1.ht => reference_datasets/GRCh37/topmed/1.0.ht}/_SUCCESS (100%) create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/topmed/1.0.ht/globals/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/topmed/1.0.ht/globals/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/topmed/1.0.ht/globals/parts/.part-0.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/topmed/1.0.ht/globals/parts/part-0 create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/topmed/1.0.ht/index/part-0-c09ec7db-1671-4dc3-95d4-6426532e00f1.idx/.index.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/topmed/1.0.ht/index/part-0-c09ec7db-1671-4dc3-95d4-6426532e00f1.idx/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/topmed/1.0.ht/index/part-0-c09ec7db-1671-4dc3-95d4-6426532e00f1.idx/index create mode 100644 
v03_pipeline/var/test/reference_datasets/GRCh37/topmed/1.0.ht/index/part-0-c09ec7db-1671-4dc3-95d4-6426532e00f1.idx/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/topmed/1.0.ht/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/topmed/1.0.ht/rows/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/topmed/1.0.ht/rows/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/topmed/1.0.ht/rows/parts/.part-0-c09ec7db-1671-4dc3-95d4-6426532e00f1.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh37/topmed/1.0.ht/rows/parts/part-0-c09ec7db-1671-4dc3-95d4-6426532e00f1 create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/.README.txt.crc rename v03_pipeline/var/test/{reference_data/test_interval_mito_1.ht => reference_datasets/GRCh38/clinvar/2024-11-11.ht}/._SUCCESS.crc (100%) create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/README.txt rename v03_pipeline/var/test/{reference_data/test_interval_mito_1.ht => reference_datasets/GRCh38/clinvar/2024-11-11.ht}/_SUCCESS (100%) create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/globals/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/globals/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/globals/parts/.part-0.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/globals/parts/part-0 create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/index/part-0-a71ea1dc-61b1-4cba-985b-155a977bebff.idx/.index.crc rename 
v03_pipeline/var/test/{reference_data/gnomad_qc_crdq.ht/index/part-0-fc4518f0-e0cb-4157-b60d-b6ab4c5f4a75.idx => reference_datasets/GRCh38/clinvar/2024-11-11.ht/index/part-0-a71ea1dc-61b1-4cba-985b-155a977bebff.idx}/.metadata.json.gz.crc (100%) create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/index/part-0-a71ea1dc-61b1-4cba-985b-155a977bebff.idx/index rename v03_pipeline/var/test/{reference_data/gnomad_qc_crdq.ht/index/part-0-fc4518f0-e0cb-4157-b60d-b6ab4c5f4a75.idx => reference_datasets/GRCh38/clinvar/2024-11-11.ht/index/part-0-a71ea1dc-61b1-4cba-985b-155a977bebff.idx}/metadata.json.gz (100%) create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/index/part-1-eeb8cbde-9d95-4ba8-bf3c-e7682fbf3168.idx/.index.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/index/part-1-eeb8cbde-9d95-4ba8-bf3c-e7682fbf3168.idx/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/index/part-1-eeb8cbde-9d95-4ba8-bf3c-e7682fbf3168.idx/index create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/index/part-1-eeb8cbde-9d95-4ba8-bf3c-e7682fbf3168.idx/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/rows/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/rows/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/rows/parts/.part-0-a71ea1dc-61b1-4cba-985b-155a977bebff.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/rows/parts/.part-1-eeb8cbde-9d95-4ba8-bf3c-e7682fbf3168.crc create mode 100644 
v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/rows/parts/part-0-a71ea1dc-61b1-4cba-985b-155a977bebff create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/rows/parts/part-1-eeb8cbde-9d95-4ba8-bf3c-e7682fbf3168 create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/dbnsfp/1.0.ht/.README.txt.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/dbnsfp/1.0.ht/._SUCCESS.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/dbnsfp/1.0.ht/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/dbnsfp/1.0.ht/README.txt create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/dbnsfp/1.0.ht/_SUCCESS create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/dbnsfp/1.0.ht/globals/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/dbnsfp/1.0.ht/globals/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/dbnsfp/1.0.ht/globals/parts/.part-0.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/dbnsfp/1.0.ht/globals/parts/part-0 create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/dbnsfp/1.0.ht/index/part-0-113d0935-f89b-4d20-9f25-225c16c2f941.idx/.index.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/dbnsfp/1.0.ht/index/part-0-113d0935-f89b-4d20-9f25-225c16c2f941.idx/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/dbnsfp/1.0.ht/index/part-0-113d0935-f89b-4d20-9f25-225c16c2f941.idx/index create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/dbnsfp/1.0.ht/index/part-0-113d0935-f89b-4d20-9f25-225c16c2f941.idx/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/dbnsfp/1.0.ht/index/part-1-a918a0a7-ef41-490f-9d13-73a3e17beead.idx/.index.crc create mode 100644 
v03_pipeline/var/test/reference_datasets/GRCh38/dbnsfp/1.0.ht/index/part-1-a918a0a7-ef41-490f-9d13-73a3e17beead.idx/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/dbnsfp/1.0.ht/index/part-1-a918a0a7-ef41-490f-9d13-73a3e17beead.idx/index create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/dbnsfp/1.0.ht/index/part-1-a918a0a7-ef41-490f-9d13-73a3e17beead.idx/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/dbnsfp/1.0.ht/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/dbnsfp/1.0.ht/rows/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/dbnsfp/1.0.ht/rows/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/dbnsfp/1.0.ht/rows/parts/.part-0-113d0935-f89b-4d20-9f25-225c16c2f941.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/dbnsfp/1.0.ht/rows/parts/.part-1-a918a0a7-ef41-490f-9d13-73a3e17beead.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/dbnsfp/1.0.ht/rows/parts/part-0-113d0935-f89b-4d20-9f25-225c16c2f941 create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/dbnsfp/1.0.ht/rows/parts/part-1-a918a0a7-ef41-490f-9d13-73a3e17beead create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/eigen/1.0.ht/.README.txt.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/eigen/1.0.ht/._SUCCESS.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/eigen/1.0.ht/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/eigen/1.0.ht/README.txt create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/eigen/1.0.ht/_SUCCESS create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/eigen/1.0.ht/globals/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/eigen/1.0.ht/globals/metadata.json.gz create mode 100644 
v03_pipeline/var/test/reference_datasets/GRCh38/eigen/1.0.ht/globals/parts/.part-0.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/eigen/1.0.ht/globals/parts/part-0 rename v03_pipeline/var/test/{reference_data/test_clinvar_path_variants_crdq.ht/index/part-0-9e75273d-7113-40e4-a327-453f3451dc8c.idx => reference_datasets/GRCh38/eigen/1.0.ht/index/part-0-24084335-917b-4b51-8a30-4fe509d64745.idx}/.index.crc (100%) rename v03_pipeline/var/test/{reference_data/test_clinvar_path_variants_crdq.ht/index/part-0-9e75273d-7113-40e4-a327-453f3451dc8c.idx => reference_datasets/GRCh38/eigen/1.0.ht/index/part-0-24084335-917b-4b51-8a30-4fe509d64745.idx}/.metadata.json.gz.crc (100%) rename v03_pipeline/var/test/{reference_data/test_clinvar_path_variants_crdq.ht/index/part-0-9e75273d-7113-40e4-a327-453f3451dc8c.idx => reference_datasets/GRCh38/eigen/1.0.ht/index/part-0-24084335-917b-4b51-8a30-4fe509d64745.idx}/index (100%) rename v03_pipeline/var/test/{reference_data/test_clinvar_path_variants_crdq.ht/index/part-0-9e75273d-7113-40e4-a327-453f3451dc8c.idx => reference_datasets/GRCh38/eigen/1.0.ht/index/part-0-24084335-917b-4b51-8a30-4fe509d64745.idx}/metadata.json.gz (100%) create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/eigen/1.0.ht/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/eigen/1.0.ht/rows/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/eigen/1.0.ht/rows/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/eigen/1.0.ht/rows/parts/.part-0-24084335-917b-4b51-8a30-4fe509d64745.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/eigen/1.0.ht/rows/parts/part-0-24084335-917b-4b51-8a30-4fe509d64745 create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/exac/1.0.ht/.README.txt.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/exac/1.0.ht/._SUCCESS.crc create mode 100644 
v03_pipeline/var/test/reference_datasets/GRCh38/exac/1.0.ht/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/exac/1.0.ht/README.txt create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/exac/1.0.ht/_SUCCESS create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/exac/1.0.ht/globals/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/exac/1.0.ht/globals/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/exac/1.0.ht/globals/parts/.part-0.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/exac/1.0.ht/globals/parts/part-0 rename v03_pipeline/var/test/{reference_data/test_combined_1.ht.ht/index/part-0-3569201c-d630-43c4-9056-cbace806fe8d.idx => reference_datasets/GRCh38/exac/1.0.ht/index/part-0-018c9528-a303-4d50-8cf8-eb42ad4d7486.idx}/.index.crc (100%) rename v03_pipeline/var/test/{reference_data/test_combined_1.ht.ht/index/part-0-3569201c-d630-43c4-9056-cbace806fe8d.idx => reference_datasets/GRCh38/exac/1.0.ht/index/part-0-018c9528-a303-4d50-8cf8-eb42ad4d7486.idx}/.metadata.json.gz.crc (100%) rename v03_pipeline/var/test/{reference_data/test_combined_1.ht.ht/index/part-0-3569201c-d630-43c4-9056-cbace806fe8d.idx => reference_datasets/GRCh38/exac/1.0.ht/index/part-0-018c9528-a303-4d50-8cf8-eb42ad4d7486.idx}/index (100%) rename v03_pipeline/var/test/{reference_data/test_combined_1.ht.ht/index/part-0-3569201c-d630-43c4-9056-cbace806fe8d.idx => reference_datasets/GRCh38/exac/1.0.ht/index/part-0-018c9528-a303-4d50-8cf8-eb42ad4d7486.idx}/metadata.json.gz (100%) create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/exac/1.0.ht/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/exac/1.0.ht/rows/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/exac/1.0.ht/rows/metadata.json.gz create mode 100644 
v03_pipeline/var/test/reference_datasets/GRCh38/exac/1.0.ht/rows/parts/.part-0-018c9528-a303-4d50-8cf8-eb42ad4d7486.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/exac/1.0.ht/rows/parts/part-0-018c9528-a303-4d50-8cf8-eb42ad4d7486 create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/.README.txt.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/._SUCCESS.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/README.txt create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/_SUCCESS create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/globals/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/globals/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/globals/parts/.part-0.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/globals/parts/part-0 rename v03_pipeline/var/test/{reference_data/test_gnomad_coding_noncoding_crdq_1.ht/index/part-0-86ec8a00-137f-41a6-a098-8ef6bea1cded.idx => reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/index/part-0-345f1488-be53-4c4b-8207-b052e86084d6.idx}/.index.crc (100%) rename v03_pipeline/var/test/{reference_data/test_gnomad_coding_noncoding_crdq_1.ht/index/part-0-86ec8a00-137f-41a6-a098-8ef6bea1cded.idx => reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/index/part-0-345f1488-be53-4c4b-8207-b052e86084d6.idx}/.metadata.json.gz.crc (100%) rename 
v03_pipeline/var/test/{reference_data/test_gnomad_coding_noncoding_crdq_1.ht/index/part-0-86ec8a00-137f-41a6-a098-8ef6bea1cded.idx => reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/index/part-0-345f1488-be53-4c4b-8207-b052e86084d6.idx}/index (100%) rename v03_pipeline/var/test/{reference_data/test_gnomad_coding_noncoding_crdq_1.ht/index/part-0-86ec8a00-137f-41a6-a098-8ef6bea1cded.idx => reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/index/part-0-345f1488-be53-4c4b-8207-b052e86084d6.idx}/metadata.json.gz (100%) create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/index/part-0-90a40f33-45f1-4319-b895-a6f9f6f3364c.idx/.index.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/index/part-0-90a40f33-45f1-4319-b895-a6f9f6f3364c.idx/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/index/part-0-90a40f33-45f1-4319-b895-a6f9f6f3364c.idx/index create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/index/part-0-90a40f33-45f1-4319-b895-a6f9f6f3364c.idx/metadata.json.gz rename v03_pipeline/var/test/{reference_data/test_gnomad_coding_noncoding_crdq_1.ht/index/part-1-76400422-6fd3-4b0f-9c37-42546b3e19ff.idx => reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/index/part-1-f6cdce1a-0e07-4a8e-80de-c0b568f5fa07.idx}/.index.crc (100%) rename v03_pipeline/var/test/{reference_data/test_gnomad_coding_noncoding_crdq_1.ht/index/part-1-76400422-6fd3-4b0f-9c37-42546b3e19ff.idx => reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/index/part-1-f6cdce1a-0e07-4a8e-80de-c0b568f5fa07.idx}/.metadata.json.gz.crc (100%) rename v03_pipeline/var/test/{reference_data/test_gnomad_coding_noncoding_crdq_1.ht/index/part-1-76400422-6fd3-4b0f-9c37-42546b3e19ff.idx => 
reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/index/part-1-f6cdce1a-0e07-4a8e-80de-c0b568f5fa07.idx}/index (100%) rename v03_pipeline/var/test/{reference_data/test_gnomad_coding_noncoding_crdq_1.ht/index/part-1-76400422-6fd3-4b0f-9c37-42546b3e19ff.idx => reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/index/part-1-f6cdce1a-0e07-4a8e-80de-c0b568f5fa07.idx}/metadata.json.gz (100%) create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/rows/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/rows/metadata.json.gz rename v03_pipeline/var/test/{reference_data/test_gnomad_coding_noncoding_crdq_1.ht/rows/parts/.part-0-86ec8a00-137f-41a6-a098-8ef6bea1cded.crc => reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/rows/parts/.part-0-345f1488-be53-4c4b-8207-b052e86084d6.crc} (100%) create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/rows/parts/.part-0-90a40f33-45f1-4319-b895-a6f9f6f3364c.crc rename v03_pipeline/var/test/{reference_data/test_gnomad_coding_noncoding_crdq_1.ht/rows/parts/.part-1-76400422-6fd3-4b0f-9c37-42546b3e19ff.crc => reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/rows/parts/.part-1-f6cdce1a-0e07-4a8e-80de-c0b568f5fa07.crc} (100%) rename v03_pipeline/var/test/{reference_data/test_gnomad_coding_noncoding_crdq_1.ht/rows/parts/part-0-86ec8a00-137f-41a6-a098-8ef6bea1cded => reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/rows/parts/part-0-345f1488-be53-4c4b-8207-b052e86084d6} (100%) create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/rows/parts/part-0-90a40f33-45f1-4319-b895-a6f9f6f3364c rename 
v03_pipeline/var/test/{reference_data/test_gnomad_coding_noncoding_crdq_1.ht/rows/parts/part-1-76400422-6fd3-4b0f-9c37-42546b3e19ff => reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/rows/parts/part-1-f6cdce1a-0e07-4a8e-80de-c0b568f5fa07} (100%) create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_exomes/1.0.ht/.README.txt.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_exomes/1.0.ht/._SUCCESS.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_exomes/1.0.ht/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_exomes/1.0.ht/README.txt create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_exomes/1.0.ht/_SUCCESS create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_exomes/1.0.ht/globals/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_exomes/1.0.ht/globals/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_exomes/1.0.ht/globals/parts/.part-0.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_exomes/1.0.ht/globals/parts/part-0 rename v03_pipeline/var/test/{reference_data/test_combined_1.ht/index/part-0-1d126232-414b-4ffa-aa43-9ed52895fbf2.idx => reference_datasets/GRCh38/gnomad_exomes/1.0.ht/index/part-0-3ff9afe8-37ef-4f6d-a894-cfc7eb27f97d.idx}/.index.crc (100%) rename v03_pipeline/var/test/{reference_data/test_combined_1.ht/index/part-0-1d126232-414b-4ffa-aa43-9ed52895fbf2.idx => reference_datasets/GRCh38/gnomad_exomes/1.0.ht/index/part-0-3ff9afe8-37ef-4f6d-a894-cfc7eb27f97d.idx}/.metadata.json.gz.crc (100%) rename v03_pipeline/var/test/{reference_data/test_combined_1.ht/index/part-0-1d126232-414b-4ffa-aa43-9ed52895fbf2.idx => reference_datasets/GRCh38/gnomad_exomes/1.0.ht/index/part-0-3ff9afe8-37ef-4f6d-a894-cfc7eb27f97d.idx}/index (100%) rename 
v03_pipeline/var/test/{reference_data/test_combined_1.ht/index/part-0-1d126232-414b-4ffa-aa43-9ed52895fbf2.idx => reference_datasets/GRCh38/gnomad_exomes/1.0.ht/index/part-0-3ff9afe8-37ef-4f6d-a894-cfc7eb27f97d.idx}/metadata.json.gz (100%) create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_exomes/1.0.ht/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_exomes/1.0.ht/rows/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_exomes/1.0.ht/rows/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_exomes/1.0.ht/rows/parts/.part-0-3ff9afe8-37ef-4f6d-a894-cfc7eb27f97d.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_exomes/1.0.ht/rows/parts/part-0-3ff9afe8-37ef-4f6d-a894-cfc7eb27f97d create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_genomes/1.0.ht/.README.txt.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_genomes/1.0.ht/._SUCCESS.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_genomes/1.0.ht/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_genomes/1.0.ht/README.txt create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_genomes/1.0.ht/_SUCCESS create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_genomes/1.0.ht/globals/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_genomes/1.0.ht/globals/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_genomes/1.0.ht/globals/parts/.part-0.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_genomes/1.0.ht/globals/parts/part-0 rename v03_pipeline/var/test/{reference_data/test_combined_2.ht/index/part-0-20336911-c437-4deb-9fa4-7c7fe61f0408.idx => 
reference_datasets/GRCh38/gnomad_genomes/1.0.ht/index/part-0-7791073a-d4da-48f7-903f-59f1ac95d459.idx}/.index.crc (100%) rename v03_pipeline/var/test/{reference_data/test_combined_2.ht/index/part-0-20336911-c437-4deb-9fa4-7c7fe61f0408.idx => reference_datasets/GRCh38/gnomad_genomes/1.0.ht/index/part-0-7791073a-d4da-48f7-903f-59f1ac95d459.idx}/.metadata.json.gz.crc (100%) rename v03_pipeline/var/test/{reference_data/test_combined_2.ht/index/part-0-20336911-c437-4deb-9fa4-7c7fe61f0408.idx => reference_datasets/GRCh38/gnomad_genomes/1.0.ht/index/part-0-7791073a-d4da-48f7-903f-59f1ac95d459.idx}/index (100%) rename v03_pipeline/var/test/{reference_data/test_combined_2.ht/index/part-0-20336911-c437-4deb-9fa4-7c7fe61f0408.idx => reference_datasets/GRCh38/gnomad_genomes/1.0.ht/index/part-0-7791073a-d4da-48f7-903f-59f1ac95d459.idx}/metadata.json.gz (100%) create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_genomes/1.0.ht/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_genomes/1.0.ht/rows/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_genomes/1.0.ht/rows/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_genomes/1.0.ht/rows/parts/.part-0-7791073a-d4da-48f7-903f-59f1ac95d459.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_genomes/1.0.ht/rows/parts/part-0-7791073a-d4da-48f7-903f-59f1ac95d459 create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/.README.txt.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/._SUCCESS.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/README.txt create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/_SUCCESS create mode 
100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/globals/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/globals/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/globals/parts/.part-0.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/globals/parts/part-0 create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/index/part-0-bccae774-994f-469e-9b30-01becb2109a0.idx/.index.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/index/part-0-bccae774-994f-469e-9b30-01becb2109a0.idx/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/index/part-0-bccae774-994f-469e-9b30-01becb2109a0.idx/index create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/index/part-0-bccae774-994f-469e-9b30-01becb2109a0.idx/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/rows/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/rows/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/rows/parts/.part-0-bccae774-994f-469e-9b30-01becb2109a0.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/rows/parts/part-0-bccae774-994f-469e-9b30-01becb2109a0 create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/.README.txt.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/._SUCCESS.crc create mode 100644 
v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/README.txt create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/_SUCCESS create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/globals/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/globals/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/globals/parts/.part-0.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/globals/parts/part-0 rename v03_pipeline/var/test/{reference_data/test_interval_1.ht/index/part-0-1224c3b3-ab5b-49d7-8d6d-6084ccbbc683.idx => reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/index/part-0-17cf743a-b6dc-4c51-ae0c-c4ffa69513ba.idx}/.index.crc (100%) rename v03_pipeline/var/test/{reference_data/test_interval_1.ht/index/part-0-1224c3b3-ab5b-49d7-8d6d-6084ccbbc683.idx => reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/index/part-0-17cf743a-b6dc-4c51-ae0c-c4ffa69513ba.idx}/.metadata.json.gz.crc (100%) rename v03_pipeline/var/test/{reference_data/test_interval_1.ht/index/part-0-1224c3b3-ab5b-49d7-8d6d-6084ccbbc683.idx => reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/index/part-0-17cf743a-b6dc-4c51-ae0c-c4ffa69513ba.idx}/index (100%) rename v03_pipeline/var/test/{reference_data/test_interval_1.ht/index/part-0-1224c3b3-ab5b-49d7-8d6d-6084ccbbc683.idx => reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/index/part-0-17cf743a-b6dc-4c51-ae0c-c4ffa69513ba.idx}/metadata.json.gz (100%) create mode 100644 
v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/rows/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/rows/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/rows/parts/.part-0-17cf743a-b6dc-4c51-ae0c-c4ffa69513ba.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/rows/parts/part-0-17cf743a-b6dc-4c51-ae0c-c4ffa69513ba create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_qc/1.0.ht/.README.txt.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_qc/1.0.ht/._SUCCESS.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_qc/1.0.ht/.metadata.json.gz.crc rename v03_pipeline/var/test/{reference_data/test_combined_1.ht.ht => reference_datasets/GRCh38/gnomad_qc/1.0.ht}/README.txt (78%) create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_qc/1.0.ht/_SUCCESS create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_qc/1.0.ht/globals/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_qc/1.0.ht/globals/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_qc/1.0.ht/globals/parts/.part-0.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_qc/1.0.ht/globals/parts/part-0 create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_qc/1.0.ht/index/part-0-46f30121-756f-4290-b7f1-e0f9993c9593.idx/.index.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_qc/1.0.ht/index/part-0-46f30121-756f-4290-b7f1-e0f9993c9593.idx/.metadata.json.gz.crc create mode 100644 
v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_qc/1.0.ht/index/part-0-46f30121-756f-4290-b7f1-e0f9993c9593.idx/index create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_qc/1.0.ht/index/part-0-46f30121-756f-4290-b7f1-e0f9993c9593.idx/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_qc/1.0.ht/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_qc/1.0.ht/rows/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_qc/1.0.ht/rows/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_qc/1.0.ht/rows/parts/.part-0-46f30121-756f-4290-b7f1-e0f9993c9593.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_qc/1.0.ht/rows/parts/part-0-46f30121-756f-4290-b7f1-e0f9993c9593 create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/helix_mito/1.0.ht/.README.txt.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/helix_mito/1.0.ht/._SUCCESS.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/helix_mito/1.0.ht/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/helix_mito/1.0.ht/README.txt create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/helix_mito/1.0.ht/_SUCCESS create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/helix_mito/1.0.ht/globals/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/helix_mito/1.0.ht/globals/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/helix_mito/1.0.ht/globals/parts/.part-0.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/helix_mito/1.0.ht/globals/parts/part-0 create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/helix_mito/1.0.ht/index/part-0-eceecf38-7b1a-46ab-98c2-147256aff633.idx/.index.crc create mode 100644 
v03_pipeline/var/test/reference_datasets/GRCh38/helix_mito/1.0.ht/index/part-0-eceecf38-7b1a-46ab-98c2-147256aff633.idx/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/helix_mito/1.0.ht/index/part-0-eceecf38-7b1a-46ab-98c2-147256aff633.idx/index create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/helix_mito/1.0.ht/index/part-0-eceecf38-7b1a-46ab-98c2-147256aff633.idx/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/helix_mito/1.0.ht/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/helix_mito/1.0.ht/rows/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/helix_mito/1.0.ht/rows/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/helix_mito/1.0.ht/rows/parts/.part-0-eceecf38-7b1a-46ab-98c2-147256aff633.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/helix_mito/1.0.ht/rows/parts/part-0-eceecf38-7b1a-46ab-98c2-147256aff633 create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/hgmd/1.0.ht/.README.txt.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/hgmd/1.0.ht/._SUCCESS.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/hgmd/1.0.ht/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/hgmd/1.0.ht/README.txt create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/hgmd/1.0.ht/_SUCCESS create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/hgmd/1.0.ht/globals/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/hgmd/1.0.ht/globals/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/hgmd/1.0.ht/globals/parts/.part-0.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/hgmd/1.0.ht/globals/parts/part-0 rename 
v03_pipeline/var/test/{reference_data/test_combined_2.ht/index/part-0-7d0599cd-6874-47f8-b6de-a7db0b41817c.idx => reference_datasets/GRCh38/hgmd/1.0.ht/index/part-0-2accd7be-40d6-42bd-abc3-f6dc7b382f0a.idx}/.index.crc (100%) rename v03_pipeline/var/test/{reference_data/test_combined_2.ht/index/part-0-7d0599cd-6874-47f8-b6de-a7db0b41817c.idx => reference_datasets/GRCh38/hgmd/1.0.ht/index/part-0-2accd7be-40d6-42bd-abc3-f6dc7b382f0a.idx}/.metadata.json.gz.crc (100%) rename v03_pipeline/var/test/{reference_data/test_combined_2.ht/index/part-0-7d0599cd-6874-47f8-b6de-a7db0b41817c.idx => reference_datasets/GRCh38/hgmd/1.0.ht/index/part-0-2accd7be-40d6-42bd-abc3-f6dc7b382f0a.idx}/index (100%) rename v03_pipeline/var/test/{reference_data/test_combined_2.ht/index/part-0-7d0599cd-6874-47f8-b6de-a7db0b41817c.idx => reference_datasets/GRCh38/hgmd/1.0.ht/index/part-0-2accd7be-40d6-42bd-abc3-f6dc7b382f0a.idx}/metadata.json.gz (100%) create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/hgmd/1.0.ht/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/hgmd/1.0.ht/rows/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/hgmd/1.0.ht/rows/metadata.json.gz rename v03_pipeline/var/test/{reference_data/test_hgmd_1.ht/rows/parts/.part-0-902230a8-2a45-4126-89b0-fdd919610d79.crc => reference_datasets/GRCh38/hgmd/1.0.ht/rows/parts/.part-0-2accd7be-40d6-42bd-abc3-f6dc7b382f0a.crc} (100%) rename v03_pipeline/var/test/{reference_data/test_hgmd_1.ht/rows/parts/part-0-902230a8-2a45-4126-89b0-fdd919610d79 => reference_datasets/GRCh38/hgmd/1.0.ht/rows/parts/part-0-2accd7be-40d6-42bd-abc3-f6dc7b382f0a} (100%) create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/hmtvar/1.0.ht/.README.txt.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/hmtvar/1.0.ht/._SUCCESS.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/hmtvar/1.0.ht/.metadata.json.gz.crc create mode 100644 
v03_pipeline/var/test/reference_datasets/GRCh38/hmtvar/1.0.ht/README.txt create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/hmtvar/1.0.ht/_SUCCESS create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/hmtvar/1.0.ht/globals/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/hmtvar/1.0.ht/globals/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/hmtvar/1.0.ht/globals/parts/.part-0.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/hmtvar/1.0.ht/globals/parts/part-0 create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/hmtvar/1.0.ht/index/part-0-c858683f-c7bf-4a88-baab-d7bdeb020fa5.idx/.index.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/hmtvar/1.0.ht/index/part-0-c858683f-c7bf-4a88-baab-d7bdeb020fa5.idx/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/hmtvar/1.0.ht/index/part-0-c858683f-c7bf-4a88-baab-d7bdeb020fa5.idx/index create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/hmtvar/1.0.ht/index/part-0-c858683f-c7bf-4a88-baab-d7bdeb020fa5.idx/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/hmtvar/1.0.ht/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/hmtvar/1.0.ht/rows/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/hmtvar/1.0.ht/rows/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/hmtvar/1.0.ht/rows/parts/.part-0-c858683f-c7bf-4a88-baab-d7bdeb020fa5.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/hmtvar/1.0.ht/rows/parts/part-0-c858683f-c7bf-4a88-baab-d7bdeb020fa5 create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/local_constraint_mito/1.0.ht/.README.txt.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/local_constraint_mito/1.0.ht/._SUCCESS.crc create 
mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/local_constraint_mito/1.0.ht/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/local_constraint_mito/1.0.ht/README.txt create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/local_constraint_mito/1.0.ht/_SUCCESS create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/local_constraint_mito/1.0.ht/globals/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/local_constraint_mito/1.0.ht/globals/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/local_constraint_mito/1.0.ht/globals/parts/.part-0.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/local_constraint_mito/1.0.ht/globals/parts/part-0 create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/local_constraint_mito/1.0.ht/index/part-0-b707f718-6196-4c02-9d68-148cf0c9438e.idx/.index.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/local_constraint_mito/1.0.ht/index/part-0-b707f718-6196-4c02-9d68-148cf0c9438e.idx/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/local_constraint_mito/1.0.ht/index/part-0-b707f718-6196-4c02-9d68-148cf0c9438e.idx/index create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/local_constraint_mito/1.0.ht/index/part-0-b707f718-6196-4c02-9d68-148cf0c9438e.idx/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/local_constraint_mito/1.0.ht/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/local_constraint_mito/1.0.ht/rows/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/local_constraint_mito/1.0.ht/rows/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/local_constraint_mito/1.0.ht/rows/parts/.part-0-b707f718-6196-4c02-9d68-148cf0c9438e.crc create mode 100644 
v03_pipeline/var/test/reference_datasets/GRCh38/local_constraint_mito/1.0.ht/rows/parts/part-0-b707f718-6196-4c02-9d68-148cf0c9438e create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/mitimpact/1.0.ht/.README.txt.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/mitimpact/1.0.ht/._SUCCESS.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/mitimpact/1.0.ht/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/mitimpact/1.0.ht/README.txt create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/mitimpact/1.0.ht/_SUCCESS create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/mitimpact/1.0.ht/globals/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/mitimpact/1.0.ht/globals/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/mitimpact/1.0.ht/globals/parts/.part-0.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/mitimpact/1.0.ht/globals/parts/part-0 create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/mitimpact/1.0.ht/index/part-0-e16f2759-68b2-4794-978c-4bfcd2f29974.idx/.index.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/mitimpact/1.0.ht/index/part-0-e16f2759-68b2-4794-978c-4bfcd2f29974.idx/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/mitimpact/1.0.ht/index/part-0-e16f2759-68b2-4794-978c-4bfcd2f29974.idx/index create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/mitimpact/1.0.ht/index/part-0-e16f2759-68b2-4794-978c-4bfcd2f29974.idx/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/mitimpact/1.0.ht/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/mitimpact/1.0.ht/rows/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/mitimpact/1.0.ht/rows/metadata.json.gz create mode 
100644 v03_pipeline/var/test/reference_datasets/GRCh38/mitimpact/1.0.ht/rows/parts/.part-0-e16f2759-68b2-4794-978c-4bfcd2f29974.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/mitimpact/1.0.ht/rows/parts/part-0-e16f2759-68b2-4794-978c-4bfcd2f29974 create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/mitomap/1.0.ht/.README.txt.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/mitomap/1.0.ht/._SUCCESS.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/mitomap/1.0.ht/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/mitomap/1.0.ht/README.txt create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/mitomap/1.0.ht/_SUCCESS create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/mitomap/1.0.ht/globals/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/mitomap/1.0.ht/globals/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/mitomap/1.0.ht/globals/parts/.part-0.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/mitomap/1.0.ht/globals/parts/part-0 create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/mitomap/1.0.ht/index/part-0-430d2a33-3c80-49e7-91ec-31484d8fc41b.idx/.index.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/mitomap/1.0.ht/index/part-0-430d2a33-3c80-49e7-91ec-31484d8fc41b.idx/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/mitomap/1.0.ht/index/part-0-430d2a33-3c80-49e7-91ec-31484d8fc41b.idx/index create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/mitomap/1.0.ht/index/part-0-430d2a33-3c80-49e7-91ec-31484d8fc41b.idx/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/mitomap/1.0.ht/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/mitomap/1.0.ht/rows/.metadata.json.gz.crc create 
mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/mitomap/1.0.ht/rows/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/mitomap/1.0.ht/rows/parts/.part-0-430d2a33-3c80-49e7-91ec-31484d8fc41b.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/mitomap/1.0.ht/rows/parts/part-0-430d2a33-3c80-49e7-91ec-31484d8fc41b create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/screen/1.0.ht/.README.txt.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/screen/1.0.ht/._SUCCESS.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/screen/1.0.ht/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/screen/1.0.ht/README.txt create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/screen/1.0.ht/_SUCCESS create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/screen/1.0.ht/globals/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/screen/1.0.ht/globals/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/screen/1.0.ht/globals/parts/.part-0.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/screen/1.0.ht/globals/parts/part-0 create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/screen/1.0.ht/index/part-0-6ac8d5d4-cb28-4030-9208-f0d0e0f595e1.idx/.index.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/screen/1.0.ht/index/part-0-6ac8d5d4-cb28-4030-9208-f0d0e0f595e1.idx/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/screen/1.0.ht/index/part-0-6ac8d5d4-cb28-4030-9208-f0d0e0f595e1.idx/index create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/screen/1.0.ht/index/part-0-6ac8d5d4-cb28-4030-9208-f0d0e0f595e1.idx/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/screen/1.0.ht/metadata.json.gz create mode 100644 
v03_pipeline/var/test/reference_datasets/GRCh38/screen/1.0.ht/rows/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/screen/1.0.ht/rows/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/screen/1.0.ht/rows/parts/.part-0-6ac8d5d4-cb28-4030-9208-f0d0e0f595e1.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/screen/1.0.ht/rows/parts/part-0-6ac8d5d4-cb28-4030-9208-f0d0e0f595e1 create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/splice_ai/1.0.ht/.README.txt.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/splice_ai/1.0.ht/._SUCCESS.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/splice_ai/1.0.ht/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/splice_ai/1.0.ht/README.txt create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/splice_ai/1.0.ht/_SUCCESS create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/splice_ai/1.0.ht/globals/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/splice_ai/1.0.ht/globals/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/splice_ai/1.0.ht/globals/parts/.part-0.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/splice_ai/1.0.ht/globals/parts/part-0 rename v03_pipeline/var/test/{reference_data/test_hgmd_1.ht/index/part-0-902230a8-2a45-4126-89b0-fdd919610d79.idx => reference_datasets/GRCh38/splice_ai/1.0.ht/index/part-0-6272a9a2-b08b-4926-9552-feb84ffa2308.idx}/.index.crc (100%) rename v03_pipeline/var/test/{reference_data/test_hgmd_1.ht/index/part-0-902230a8-2a45-4126-89b0-fdd919610d79.idx => reference_datasets/GRCh38/splice_ai/1.0.ht/index/part-0-6272a9a2-b08b-4926-9552-feb84ffa2308.idx}/.metadata.json.gz.crc (100%) rename v03_pipeline/var/test/{reference_data/test_hgmd_1.ht/index/part-0-902230a8-2a45-4126-89b0-fdd919610d79.idx => 
reference_datasets/GRCh38/splice_ai/1.0.ht/index/part-0-6272a9a2-b08b-4926-9552-feb84ffa2308.idx}/index (100%) rename v03_pipeline/var/test/{reference_data/test_hgmd_1.ht/index/part-0-902230a8-2a45-4126-89b0-fdd919610d79.idx => reference_datasets/GRCh38/splice_ai/1.0.ht/index/part-0-6272a9a2-b08b-4926-9552-feb84ffa2308.idx}/metadata.json.gz (100%) create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/splice_ai/1.0.ht/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/splice_ai/1.0.ht/rows/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/splice_ai/1.0.ht/rows/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/splice_ai/1.0.ht/rows/parts/.part-0-6272a9a2-b08b-4926-9552-feb84ffa2308.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/splice_ai/1.0.ht/rows/parts/part-0-6272a9a2-b08b-4926-9552-feb84ffa2308 create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/.README.txt.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/._SUCCESS.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/README.txt create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/_SUCCESS create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/globals/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/globals/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/globals/parts/.part-0.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/globals/parts/part-0 create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/index/part-0-795ab066-10c9-4aac-ad59-f29794a4b01f.idx/.index.crc 
create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/index/part-0-795ab066-10c9-4aac-ad59-f29794a4b01f.idx/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/index/part-0-795ab066-10c9-4aac-ad59-f29794a4b01f.idx/index create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/index/part-0-795ab066-10c9-4aac-ad59-f29794a4b01f.idx/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/rows/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/rows/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/rows/parts/.part-0-795ab066-10c9-4aac-ad59-f29794a4b01f.crc create mode 100644 v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/rows/parts/part-0-795ab066-10c9-4aac-ad59-f29794a4b01f create mode 100644 v03_pipeline/var/test/reference_datasets/raw/clinvar.vcf create mode 100644 v03_pipeline/var/test/reference_datasets/raw/exac.vcf create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_37.ht/.README.txt.crc create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_37.ht/._SUCCESS.crc create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_37.ht/.metadata.json.gz.crc rename v03_pipeline/var/test/{reference_data/test_gnomad_coding_noncoding_crdq_1.ht => reference_datasets/raw/gnomad_exomes_37.ht}/README.txt (78%) create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_37.ht/_SUCCESS create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_37.ht/globals/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_37.ht/globals/metadata.json.gz create mode 100644 
v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_37.ht/globals/parts/.part-0.crc create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_37.ht/globals/parts/part-0 create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_37.ht/index/part-0-ac88ea82-778e-4722-b4a5-67b02b78322d.idx/.index.crc create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_37.ht/index/part-0-ac88ea82-778e-4722-b4a5-67b02b78322d.idx/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_37.ht/index/part-0-ac88ea82-778e-4722-b4a5-67b02b78322d.idx/index create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_37.ht/index/part-0-ac88ea82-778e-4722-b4a5-67b02b78322d.idx/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_37.ht/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_37.ht/rows/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_37.ht/rows/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_37.ht/rows/parts/.part-0-ac88ea82-778e-4722-b4a5-67b02b78322d.crc create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_37.ht/rows/parts/part-0-ac88ea82-778e-4722-b4a5-67b02b78322d create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/.README.txt.crc create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/._SUCCESS.crc create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/.metadata.json.gz.crc rename v03_pipeline/var/test/{reference_data/test_interval_1.ht => reference_datasets/raw/gnomad_exomes_38.ht}/README.txt (78%) create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/_SUCCESS create mode 100644 
v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/globals/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/globals/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/globals/parts/.part-0.crc create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/globals/parts/part-0 create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/index/part-0-90a97bd8-3648-4074-89bd-3a64d58266e2.idx/.index.crc create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/index/part-0-90a97bd8-3648-4074-89bd-3a64d58266e2.idx/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/index/part-0-90a97bd8-3648-4074-89bd-3a64d58266e2.idx/index create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/index/part-0-90a97bd8-3648-4074-89bd-3a64d58266e2.idx/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/rows/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/rows/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/rows/parts/.part-0-90a97bd8-3648-4074-89bd-3a64d58266e2.crc create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/rows/parts/part-0-90a97bd8-3648-4074-89bd-3a64d58266e2 create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht/.README.txt.crc create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht/._SUCCESS.crc create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht/.metadata.json.gz.crc create mode 100644 
v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht/README.txt create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht/_SUCCESS create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht/globals/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht/globals/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht/globals/parts/.part-0.crc create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht/globals/parts/part-0 create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht/index/part-0-deb2219c-0ebe-4343-ba3c-143f95c4b24a.idx/.index.crc create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht/index/part-0-deb2219c-0ebe-4343-ba3c-143f95c4b24a.idx/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht/index/part-0-deb2219c-0ebe-4343-ba3c-143f95c4b24a.idx/index create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht/index/part-0-deb2219c-0ebe-4343-ba3c-143f95c4b24a.idx/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht/rows/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht/rows/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht/rows/parts/.part-0-deb2219c-0ebe-4343-ba3c-143f95c4b24a.crc create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht/rows/parts/part-0-deb2219c-0ebe-4343-ba3c-143f95c4b24a create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_38.ht/.README.txt.crc create mode 100644 
v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_38.ht/._SUCCESS.crc create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_38.ht/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_38.ht/README.txt create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_38.ht/_SUCCESS create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_38.ht/globals/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_38.ht/globals/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_38.ht/globals/parts/.part-0.crc create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_38.ht/globals/parts/part-0 create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_38.ht/index/part-0-a3c7b21c-f8dd-4d21-948b-3746f5229729.idx/.index.crc create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_38.ht/index/part-0-a3c7b21c-f8dd-4d21-948b-3746f5229729.idx/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_38.ht/index/part-0-a3c7b21c-f8dd-4d21-948b-3746f5229729.idx/index create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_38.ht/index/part-0-a3c7b21c-f8dd-4d21-948b-3746f5229729.idx/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_38.ht/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_38.ht/rows/.metadata.json.gz.crc create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_38.ht/rows/metadata.json.gz create mode 100644 v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_38.ht/rows/parts/.part-0-a3c7b21c-f8dd-4d21-948b-3746f5229729.crc create mode 100644 
v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_38.ht/rows/parts/part-0-a3c7b21c-f8dd-4d21-948b-3746f5229729 create mode 100644 v03_pipeline/var/test/reference_datasets/raw/submission_summary.txt rename v03_pipeline/var/test/{reference_data => reference_datasets/raw}/test_hgmd.vcf (100%) create mode 100644 v03_pipeline/var/test/reference_datasets/raw/test_mitomap.csv diff --git a/download_and_create_reference_datasets/v02/create_ht__cadd.py b/download_and_create_reference_datasets/v02/create_ht__cadd.py deleted file mode 100755 index 7b90c7c34..000000000 --- a/download_and_create_reference_datasets/v02/create_ht__cadd.py +++ /dev/null @@ -1,8 +0,0 @@ -#!/usr/bin/env python3 - -from kubernetes.shell_utils import simple_run as run - -run(( - "python3 gcloud_dataproc/v02/run_script.py " - "--cluster create-ht-cadd " - "download_and_create_reference_datasets/v02/hail_scripts/write_cadd_ht.py")) diff --git a/download_and_create_reference_datasets/v02/create_ht__clinvar.py b/download_and_create_reference_datasets/v02/create_ht__clinvar.py deleted file mode 100755 index cd5928121..000000000 --- a/download_and_create_reference_datasets/v02/create_ht__clinvar.py +++ /dev/null @@ -1,8 +0,0 @@ -#!/usr/bin/env python3 - -from kubernetes.shell_utils import simple_run as run - -run(( - "python3 gcloud_dataproc/v02/run_script.py " - "--cluster create-ht-clinvar " - "download_and_create_reference_datasets/v02/hail_scripts/write_clinvar_ht.py")) diff --git a/download_and_create_reference_datasets/v02/create_ht__combined_reference_data.py b/download_and_create_reference_datasets/v02/create_ht__combined_reference_data.py deleted file mode 100644 index 482e5298f..000000000 --- a/download_and_create_reference_datasets/v02/create_ht__combined_reference_data.py +++ /dev/null @@ -1,14 +0,0 @@ -#!/usr/bin/env python3 - -import argparse -from kubernetes.shell_utils import simple_run as run - -parser = argparse.ArgumentParser() -parser.add_argument('-b', '--build', 
help='Reference build, 37 or 38', choices=["37", "38"], required=True) -args = parser.parse_args() - -run(( - "python3 gcloud_dataproc/v02/run_script.py " - "--cluster create-ht-combined-reference-data " - "download_and_create_reference_datasets/v02/hail_scripts/write_combined_reference_data_ht.py " - f"--build {args.build}")) diff --git a/download_and_create_reference_datasets/v02/create_ht__eigen.py b/download_and_create_reference_datasets/v02/create_ht__eigen.py deleted file mode 100644 index 122a21631..000000000 --- a/download_and_create_reference_datasets/v02/create_ht__eigen.py +++ /dev/null @@ -1,14 +0,0 @@ -#!/usr/bin/env python3 - -from kubernetes.shell_utils import simple_run as run - -for genome_version, vcf_path in [ - ("37", "gs://seqr-reference-data/GRCh37/eigen/EIGEN_coding_noncoding.grch37.vcf.gz"), - ("38", "gs://seqr-reference-data/GRCh38/eigen/EIGEN_coding_noncoding.liftover_grch38.vcf.gz"), -]: - run(("python3 gcloud_dataproc/v02/run_script.py " - "--cluster create-ht-eigen " - "hail_scripts/v02/convert_vcf_to_hail.py " - "--output-sites-only-ht " - f"--genome-version {genome_version} " - f"{vcf_path}")) diff --git a/download_and_create_reference_datasets/v02/create_ht__mpc.py b/download_and_create_reference_datasets/v02/create_ht__mpc.py deleted file mode 100755 index 030b15613..000000000 --- a/download_and_create_reference_datasets/v02/create_ht__mpc.py +++ /dev/null @@ -1,14 +0,0 @@ -#!/usr/bin/env python3 - -from kubernetes.shell_utils import simple_run as run - -for genome_version, vcf_path in [ - ("37", "gs://seqr-reference-data/GRCh37/MPC/fordist_constraint_official_mpc_values.vcf.gz"), - ("38", "gs://seqr-reference-data/GRCh38/MPC/fordist_constraint_official_mpc_values.liftover.GRCh38.vcf.gz"), -]: - run(("python3 gcloud_dataproc/v02/run_script.py " - "--cluster create-ht-mpc " - "hail_scripts/v02/convert_vcf_to_hail.py " - "--output-sites-only-ht " - f"--genome-version {genome_version} " - f"{vcf_path}")) diff --git 
a/download_and_create_reference_datasets/v02/create_ht__primate_ai.py b/download_and_create_reference_datasets/v02/create_ht__primate_ai.py deleted file mode 100644 index 0c97a50d7..000000000 --- a/download_and_create_reference_datasets/v02/create_ht__primate_ai.py +++ /dev/null @@ -1,14 +0,0 @@ -#!/usr/bin/env python3 - -from kubernetes.shell_utils import simple_run as run - -for genome_version, vcf_path in [ - ("37", "gs://seqr-reference-data/GRCh37/primate_ai/PrimateAI_scores_v0.2.vcf.gz"), - ("38", "gs://seqr-reference-data/GRCh38/primate_ai/PrimateAI_scores_v0.2.liftover_grch38.vcf.gz"), -]: - run(("python3 gcloud_dataproc/v02/run_script.py " - "--cluster create-ht-primate-ai " - "hail_scripts/v02/convert_vcf_to_hail.py " - "--output-sites-only-ht " - f"--genome-version {genome_version} " - f"{vcf_path}")) diff --git a/download_and_create_reference_datasets/v02/create_ht__topmed.py b/download_and_create_reference_datasets/v02/create_ht__topmed.py deleted file mode 100755 index f06937d8e..000000000 --- a/download_and_create_reference_datasets/v02/create_ht__topmed.py +++ /dev/null @@ -1,14 +0,0 @@ -#!/usr/bin/env python3 - -from kubernetes.shell_utils import simple_run as run - -for genome_version, vcf_path in [ - ("37", "gs://seqr-reference-data/GRCh37/TopMed/bravo-dbsnp-all.removed_chr_prefix.liftunder_GRCh37.vcf.gz"), - ("38", "gs://seqr-reference-data/GRCh38/TopMed/bravo-dbsnp-all.vcf.gz"), -]: - run(("python3 gcloud_dataproc/v02/run_script.py " - "--cluster create-ht-topmed " - "hail_scripts/v02/convert_vcf_to_hail.py " - "--output-sites-only-ht " - f"--genome-version {genome_version} " - f"{vcf_path}")) diff --git a/download_and_create_reference_datasets/v02/hail_scripts/write_1kg_ht.py b/download_and_create_reference_datasets/v02/hail_scripts/write_1kg_ht.py deleted file mode 100644 index aa4a216af..000000000 --- a/download_and_create_reference_datasets/v02/hail_scripts/write_1kg_ht.py +++ /dev/null @@ -1,71 +0,0 @@ -import logging - -import hail as hl - 
-from hail_scripts.utils.hail_utils import import_vcf - -logger = logging.getLogger('v02.hail_scripts.create_1kg_ht') - -CONFIG= { - "37": "gs://seqr-reference-data/GRCh37/1kg/1kg.wgs.phase3.20130502.GRCh37_sites.vcf.gz", - "38": "gs://seqr-reference-data/GRCh38/1kg/1kg.wgs.phase3.20170504.GRCh38_sites.vcf.gz" -} - -def vcf_to_mt(path, genome_version): - ''' - Converts 1kg vcf to mt. The 1kg dataset has multi-allelic variants and duplicates. - This function independently filters the mutli-allelics to split, then unions with - the bi-allelics. - - :param path: vcf path - :param genome_version: genome version - :return: - ''' - # Import but do not split multis here. - mt = import_vcf(path, - genome_version=genome_version, - min_partitions=1000, - split_multi_alleles=False) - - multiallelic_mt = mt.filter_rows(hl.len(mt.alleles) > 2) - multiallelic_mt = hl.split_multi_hts(multiallelic_mt) - - # We annotate some rows manually to conform to the multiallelic_mt (after split). - # Calling split_multi_hts on biallelic to annotate the rows causes problems. - biallelic_mt = mt.filter_rows(hl.len(mt.alleles) == 2) - biallelic_mt = biallelic_mt.annotate_rows(a_index=1, was_split=False) - - all_mt = biallelic_mt.union_rows(multiallelic_mt) - all_mt = all_mt.key_rows_by(all_mt.locus, all_mt.alleles) - - # 37 is known to have some unneeded symbolic alleles, so we filter out. - all_mt = all_mt.filter_rows( - hl.allele_type(all_mt.alleles[0], all_mt.alleles[1]) == 'Symbolic', - keep=False - ) - - return all_mt - -def annotate_mt(mt): - # Annotate POPMAX_AF, which is max of respective fields using a_index for multi-allelics. 
-    return mt.annotate_rows(POPMAX_AF=hl.max(mt.info.AFR_AF[mt.a_index-1],
-                                             mt.info.AMR_AF[mt.a_index - 1],
-                                             mt.info.EAS_AF[mt.a_index - 1],
-                                             mt.info.EUR_AF[mt.a_index - 1],
-                                             mt.info.SAS_AF[mt.a_index - 1]))
-
-def run():
-    for genome_version, path in CONFIG.items():
-        logger.info('reading from input path: %s' % path)
-
-        mt = vcf_to_mt(path, genome_version)
-        mt = annotate_mt(mt)
-
-        mt.describe()
-
-        output_path = path.replace(".vcf", "").replace(".gz", "").replace(".bgz", "")\
-            .replace(".*", "").replace("*", "") + ".ht"
-        logger.info('writing to output path: %s' % output_path)
-        mt.rows().write(output_path)
-
-run()
diff --git a/download_and_create_reference_datasets/v02/hail_scripts/write_cadd_ht.py b/download_and_create_reference_datasets/v02/hail_scripts/write_cadd_ht.py
deleted file mode 100644
index 1e3585ffc..000000000
--- a/download_and_create_reference_datasets/v02/hail_scripts/write_cadd_ht.py
+++ /dev/null
@@ -1,49 +0,0 @@
-#!/usr/bin/env python3
-
-# combine the pre-computed CADD .tsvs from https://cadd.gs.washington.edu/download into 1 Table for each genome build
-
-import logging
-logging.basicConfig(format='%(asctime)s %(levelname)-8s %(message)s')
-logger = logging.getLogger()
-logger.setLevel(logging.INFO)
-
-
-import hail as hl
-from hail_scripts.utils.hail_utils import write_ht, import_table
-
-hl.init()
-
-
-def import_cadd_table(path: str, genome_version: str) -> hl.Table:
-    if genome_version not in ("37", "38"):
-        raise ValueError(f"Invalid genome version: {genome_version}")
-
-    column_names = {'f0': 'chrom', 'f1': 'pos', 'f2': 'ref', 'f3': 'alt', 'f4': 'RawScore', 'f5': 'PHRED'}
-    types = {'f0': hl.tstr, 'f1': hl.tint, 'f4': hl.tfloat32, 'f5': hl.tfloat32}
-
-    cadd_ht = hl.import_table(path, force_bgz=True, comment="#", no_header=True, types=types, min_partitions=10000)
-    cadd_ht = cadd_ht.rename(column_names)
-    chrom = hl.format("chr%s", cadd_ht.chrom) if genome_version == "38" else cadd_ht.chrom
-    locus = hl.locus(chrom, cadd_ht.pos, reference_genome=hl.get_reference(f"GRCh{genome_version}"))
-    alleles = hl.array([cadd_ht.ref, cadd_ht.alt])
-    cadd_ht = cadd_ht.transmute(locus=locus, alleles=alleles)
-
-    cadd_union_ht = cadd_ht.head(0)
-    for contigs in (range(1, 10), list(range(10, 23)) + ["X", "Y", "MT"]):
-        contigs = ["chr%s" % contig for contig in contigs] if genome_version == "38" else contigs
-        cadd_ht_subset = cadd_ht.filter(hl.array(list(map(str, contigs))).contains(cadd_ht.locus.contig))
-        cadd_union_ht = cadd_union_ht.union(cadd_ht_subset)
-
-    cadd_union_ht = cadd_union_ht.key_by("locus", "alleles")
-
-    cadd_union_ht.describe()
-
-    return cadd_union_ht
-
-for genome_version in ["37", "38"]:
-    snvs_ht = import_cadd_table(f"gs://seqr-reference-data/GRCh{genome_version}/CADD/CADD_snvs.v1.6.tsv.gz", genome_version)
-    indel_ht = import_cadd_table(f"gs://seqr-reference-data/GRCh{genome_version}/CADD/InDels_v1.6.tsv.gz", genome_version)
-
-    ht = snvs_ht.union(indel_ht)
-
-    ht.naive_coalesce(10000).write(f"gs://seqr-reference-data/GRCh{genome_version}/CADD/CADD_snvs_and_indels.v1.6.ht", overwrite=True)
diff --git a/download_and_create_reference_datasets/v02/hail_scripts/write_ccREs_ht.py b/download_and_create_reference_datasets/v02/hail_scripts/write_ccREs_ht.py
deleted file mode 100644
index c210c10af..000000000
--- a/download_and_create_reference_datasets/v02/hail_scripts/write_ccREs_ht.py
+++ /dev/null
@@ -1,59 +0,0 @@
-import logging
-
-import hail as hl
-
-logging.basicConfig(format="%(asctime)s %(levelname)-8s %(message)s")
-logger = logging.getLogger()
-logger.setLevel(logging.INFO)
-
-CONFIG = {"38": "gs://seqr-reference-data/GRCh38/ccREs/GRCh38-ccREs.bed"}
-
-
-def make_interval_bed_table(ht, reference_genome):
-    """
-    Remove the extra fields from the input ccREs file and mimic a bed import.
-
-    :param ht: ccREs bed file.
-    :return: Hail table that mimics basic bed file table.
-    """
-    ht = ht.select(
-        interval=hl.locus_interval(
-            ht["f0"],
-            ht["f1"]+1,
-            ht["f2"]+1,
-            reference_genome=f"GRCh{reference_genome}",
-            invalid_missing=True,
-        ),
-        target=ht["f5"],
-    )
-    ht = ht.transmute(target=ht.target.split(","))
-    return ht.key_by("interval")
-
-
-def run():
-    for genome_version, path in CONFIG.items():
-        logger.info("Reading from input path: %s", path)
-
-        ht = hl.import_table(
-            path,
-            no_header=True,
-            min_partitions=100,
-            types={
-                "f0": hl.tstr,
-                "f1": hl.tint32,
-                "f2": hl.tint32,
-                "f3": hl.tstr,
-                "f4": hl.tstr,
-                "f5": hl.tstr, # Hail throws a JSON parse error when using tarray(hl.tstr) so split string later in function
-            },
-        )
-        ht = make_interval_bed_table(ht, genome_version)
-
-        ht.describe()
-
-        output_path = path.replace(".bed", "") + ".ht"
-        logger.info("Writing to output path: %s", output_path)
-        ht.write(output_path, overwrite=True)
-
-
-run()
diff --git a/download_and_create_reference_datasets/v02/hail_scripts/write_clinvar_ht.py b/download_and_create_reference_datasets/v02/hail_scripts/write_clinvar_ht.py
deleted file mode 100644
index e0584112e..000000000
--- a/download_and_create_reference_datasets/v02/hail_scripts/write_clinvar_ht.py
+++ /dev/null
@@ -1,29 +0,0 @@
-import tempfile
-
-import hail as hl
-
-from v03_pipeline.lib.model import ReferenceGenome
-from v03_pipeline.lib.reference_data.clinvar import (
-    download_and_import_latest_clinvar_vcf,
-    CLINVAR_GOLD_STARS_LOOKUP,
-)
-from hail_scripts.utils.hail_utils import write_ht
-
-CLINVAR_PATH = 'ftp://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_{reference_genome}/clinvar.vcf.gz'
-CLINVAR_HT_PATH = 'gs://seqr-reference-data/{reference_genome}/clinvar/clinvar.{reference_genome}.ht'
-
-for reference_genome in ReferenceGenome:
-    clinvar_url = CLINVAR_PATH.format(reference_genome=reference_genome.value)
-    ht = download_and_import_latest_clinvar_vcf(clinvar_url, reference_genome)
-    timestamp = hl.eval(ht.version)
-    ht = ht.annotate(
-        gold_stars=CLINVAR_GOLD_STARS_LOOKUP.get(hl.delimit(ht.info.CLNREVSTAT))
-    )
-    ht.describe()
-    ht = ht.repartition(100)
-    write_ht(
-        ht,
-        CLINVAR_HT_PATH.format(reference_genome=reference_genome.value).replace(".ht", ".")
-        + timestamp
-        + ".ht",
-    )
diff --git a/download_and_create_reference_datasets/v02/hail_scripts/write_combined_interval_ref_data.py b/download_and_create_reference_datasets/v02/hail_scripts/write_combined_interval_ref_data.py
deleted file mode 100644
index 83e4c74b0..000000000
--- a/download_and_create_reference_datasets/v02/hail_scripts/write_combined_interval_ref_data.py
+++ /dev/null
@@ -1,43 +0,0 @@
-import argparse
-import logging
-
-import hail as hl
-
-from v03_pipeline.lib.reference_data.dataset_table_operations import join_hts
-
-VERSION = '2.0.5'
-OUTPUT_PATH = "gs://seqr-reference-data/GRCh38/combined_interval_reference_data/combined_interval_reference_data.ht"
-
-logging.basicConfig(format="%(asctime)s %(levelname)-8s %(message)s", level="INFO")
-logger = logging.getLogger(__name__)
-
-
-def run(args):
-    hl.init(default_reference="GRCh38")
-    logger.info("Joining the interval reference datasets")
-    joined_ht = join_hts(
-        ["gnomad_non_coding_constraint", "screen"], VERSION, reference_genome="38"
-    )
-
-    output_path = args.output_path if args.output_path else OUTPUT_PATH
-    logger.info("Writing to %s", output_path)
-    joined_ht.write(output_path, overwrite=args.force_write)
-    logger.info("Done")
-
-
-if __name__ == "__main__":
-    parser = argparse.ArgumentParser()
-    parser.add_argument(
-        "-f",
-        "--force-write",
-        help="Overwrite an existing output file",
-        action="store_true",
-    )
-    parser.add_argument(
-        "-o",
-        "--output-path",
-        help=f"Output path for the combined reference dataset. Default is {OUTPUT_PATH}",
-    )
-    args = parser.parse_args()
-
-    run(args)
diff --git a/download_and_create_reference_datasets/v02/hail_scripts/write_combined_reference_data_ht.py b/download_and_create_reference_datasets/v02/hail_scripts/write_combined_reference_data_ht.py
deleted file mode 100644
index ae456aab5..000000000
--- a/download_and_create_reference_datasets/v02/hail_scripts/write_combined_reference_data_ht.py
+++ /dev/null
@@ -1,30 +0,0 @@
-import argparse
-import os
-
-import hail as hl
-
-from v03_pipeline.lib.reference_data.dataset_table_operations import join_hts
-from v03_pipeline.lib.reference_data.config import CONFIG
-
-VERSION = '2.0.4'
-OUTPUT_TEMPLATE = 'gs://seqr-reference-data/GRCh{genome_version}/' \
-                  'all_reference_data/v2/combined_reference_data_grch{genome_version}-{version}.ht'
-
-def run(args):
-    hl._set_flags(no_whole_stage_codegen='1') # hail 0.2.78 hits an error on the join, this flag gets around it
-    joined_ht = join_hts(['cadd', 'mpc', 'eigen', 'dbnsfp', 'topmed', 'primate_ai', 'splice_ai', 'exac',
-                          'gnomad_genomes', 'gnomad_exomes', 'geno2mp', 'gnomad_genome_coverage', 'gnomad_exome_coverage'],
-                         VERSION,
-                         args.build,)
-    output_path = os.path.join(OUTPUT_TEMPLATE.format(genome_version=args.build, version=VERSION))
-    print('Writing to %s' % output_path)
-    joined_ht.write(os.path.join(output_path))
-
-
-if __name__ == "__main__":
-
-    parser = argparse.ArgumentParser()
-    parser.add_argument('-b', '--build', help='Reference build, 37 or 38', choices=["37", "38"], required=True)
-    args = parser.parse_args()
-
-    run(args)
diff --git a/download_and_create_reference_datasets/v02/hail_scripts/write_dbnsfp_ht.py b/download_and_create_reference_datasets/v02/hail_scripts/write_dbnsfp_ht.py
deleted file mode 100644
index ede40223f..000000000
--- a/download_and_create_reference_datasets/v02/hail_scripts/write_dbnsfp_ht.py
+++ /dev/null
@@ -1,147 +0,0 @@
-import hail as hl
-from hail.expr import tint, tfloat, tstr
-
-DBNSFP_INFO = {
-    '2.9.3': {
-        'reference_genome': '37',
-        'source_path': 'gs://seqr-reference-data/GRCh37/dbNSFP/v2.9.3/dbNSFP2.9.3_variant.chr*.gz',
-        'output_path': 'gs://seqr-reference-data/GRCh37/dbNSFP/v2.9.3/dbNSFP2.9.3_variant.with_new_scores.ht',
-    },
-    '4.2': {
-        'reference_genome': '38',
-        'source_path': 'gs://seqr-reference-data/GRCh38/dbNSFP/v4.2/dbNSFP4.2a_variant.chr*.gz',
-        'output_path': 'gs://seqr-reference-data/GRCh38/dbNSFP/v4.2/dbNSFP4.2a_variant.with_new_scores.ht',
-    },
-}
-
-# Fields from the dataset file.
-DBNSFP_SCHEMA = {
-    '2.9.3': {
-        '#chr': tstr,
-        'pos(1-coor)': tint,
-        'ref': tstr,
-        'alt': tstr,
-        'SIFT_score': tstr,
-        'Polyphen2_HDIV_pred': tstr,
-        'Polyphen2_HVAR_score': tstr,
-        'LRT_pred': tstr,
-        'MutationTaster_pred': tstr,
-        'MutationAssessor_pred': tstr,
-        'FATHMM_pred': tstr,
-        'MetaSVM_pred': tstr,
-        'MetaLR_pred': tstr,
-        'VEST3_score': tstr,
-        'VEST3_rankscore': tstr,
-        'PROVEAN_pred': tstr,
-        'M-CAP_pred': tstr,
-        'REVEL_score': tstr,
-        'REVEL_rankscore': tstr,
-        'MutPred_Top5features': tstr,
-        'Eigen-phred': tstr,
-        'Eigen-PC-phred': tstr,
-        'GERP++_RS': tstr,
-        'GERP++_RS_rankscore': tstr,
-        'phyloP46way_primate': tstr,
-        'phyloP46way_primate_rankscore': tstr,
-        'phyloP46way_placental': tstr,
-        'phyloP46way_placental_rankscore': tstr,
-        'phyloP100way_vertebrate': tstr,
-        'phyloP100way_vertebrate_rankscore': tstr,
-        'phastCons46way_primate': tstr,
-        'phastCons46way_primate_rankscore': tstr,
-        'phastCons46way_placental': tstr,
-        'phastCons46way_placental_rankscore': tstr,
-        'phastCons100way_vertebrate': tstr,
-        'phastCons100way_vertebrate_rankscore': tstr,
-        'SiPhy_29way_pi': tstr,
-        'SiPhy_29way_logOdds_rankscore': tstr,
-        'ESP6500_AA_AF': tfloat,
-        # This space is intentional and in the file.
-        'ESP6500_EA_AF ': tfloat,
-        'ARIC5606_AA_AC': tint,
-        'ARIC5606_AA_AF': tfloat,
-        'ARIC5606_EA_AC': tint,
-        'ARIC5606_EA_AF': tfloat,
-    },
-    '4.2': {
-        '#chr': tstr,
-        'pos(1-based)': tint,
-        'ref': tstr,
-        'alt': tstr,
-        'SIFT_score': tstr,
-        'Polyphen2_HVAR_score': tstr,
-        'MutationTaster_pred': tstr,
-        'FATHMM_pred': tstr,
-        'VEST4_score': tstr,
-        'MetaSVM_pred': tstr,
-        'REVEL_score': tstr,
-        'GERP++_RS': tstr,
-        'phastCons100way_vertebrate': tstr,
-        'fathmm-MKL_coding_score': tfloat,
-        'MutPred_score': tstr,
-    }
-}
-
-def generate_replacement_fields(ht, schema):
-    '''
-    Hail Tables need to have a fields remapping. This function generates a dict from
-    the new transformed field name (whitespace stripped, dash to underscore) to original
-    field name. The original field name references the exact attribute of ht, per
-    hail construct so we can feed it to the select query.
-
-    :param ht: Hail table to reference the original field attribute.
-    :param schema: schema mapping from original field name to type
-    :return: dict of new transformed name to old attr from ht
-    '''
-    def transform(field_name):
-        return field_name.strip(" `#").replace("(1-coor)", "")\
-            .replace("(1-based)", "").replace("-", "_").replace("+", "")
-    return {
-        transform(field_name): getattr(ht, field_name) for field_name in schema.keys()
-    }
-
-def dbnsfp_to_ht(source_path, output_path, reference_genome='37', dbnsfp_version="2.9.3"):
-    '''
-    Runs the conversion from importing the table from the source path, proessing the
-    fields, and outputing as a matrix table to the output path.
-
-    :param source_path: location of the dbnsfp data
-    :param output_path: location to put the matrix table
-    :param dbnsfp_version: version
-    :return:
-    '''
-    # Import the table using the schema to define the types.
-    ht = hl.import_table(source_path,
-                         types=DBNSFP_SCHEMA[dbnsfp_version],
-                         missing='.',
-                         force=True,
-                         min_partitions=10000)
-    # get a attribute map to run a select and remap fields.
-    replacement_fields = generate_replacement_fields(ht, DBNSFP_SCHEMA[dbnsfp_version])
-    ht = ht.select(**replacement_fields)
-    ht = ht.filter(ht.alt == ht.ref, keep=False) #Ask DBSNFP why ref = alt exists if cant find in docs
-
-    # key_by locus and allele needed for matrix table conversion to denote variant data.
-    chr = ht.chr if reference_genome == '37' else hl.str('chr' + ht.chr)
-    locus = hl.locus(chr, ht.pos, reference_genome='GRCh%s'%reference_genome)
-    # We have to upper because 37 is known to have some non uppercases :(
-    ht = ht.key_by(locus=locus, alleles=[ht.ref.upper(), ht.alt.upper()])
-
-
-    ht = ht.annotate_globals(
-        sourceFilePath=source_path,
-        version=dbnsfp_version,
-    )
-
-    ht.write(output_path, overwrite=True)
-    return ht
-
-def run():
-    for dbnsfp_version, config in DBNSFP_INFO.items():
-        ht = dbnsfp_to_ht(config["source_path"],
-                          config["output_path"],
-                          config['reference_genome'],
-                          dbnsfp_version)
-        ht.describe()
-
-run()
diff --git a/download_and_create_reference_datasets/v02/hail_scripts/write_splice_ai_ht.py b/download_and_create_reference_datasets/v02/hail_scripts/write_splice_ai_ht.py
deleted file mode 100644
index 051edc2a5..000000000
--- a/download_and_create_reference_datasets/v02/hail_scripts/write_splice_ai_ht.py
+++ /dev/null
@@ -1,94 +0,0 @@
-import logging
-import os
-
-import hail as hl
-
-from gnomad.resources.resource_utils import NO_CHR_TO_CHR_CONTIG_RECODING
-
-CONFIG = {
-    "37": (
-        "gs://seqr-reference-data/GRCh37/spliceai/new-version-2019-10-11/spliceai_scores.masked.snv.hg19.vcf.gz",
-        "gs://seqr-reference-data/GRCh37/spliceai/new-version-2019-10-11/spliceai_scores.masked.indel.hg19.vcf.gz",
-    ),
-    "38": (
-        "gs://seqr-reference-data/GRCh38/spliceai/new-version-2019-10-11/spliceai_scores.masked.snv.hg38.vcf.gz",
-        "gs://seqr-reference-data/GRCh38/spliceai/new-version-2019-10-11/spliceai_scores.masked.indel.hg38.vcf.gz",
-    ),
-}
-
-logging.basicConfig(level=logging.INFO)
-logger = logging.getLogger(__name__)
-
-
-def vcf_to_mt(splice_ai_snvs_path, splice_ai_indels_path, genome_version):
-    """
-    Loads the snv path and indels source path to a matrix table and returns the table.
-
-    :param splice_ai_snvs_path: source location
-    :param splice_ai_indels_path: source location
-    :return: matrix table
-    """
-
-    logger.info(
-        "==> reading in splice_ai vcfs: %s, %s"
-        % (splice_ai_snvs_path, splice_ai_indels_path)
-    )
-
-    # for 37, extract to MT, for 38, MT not included.
-    interval = "1-MT" if genome_version == "37" else "chr1-chrY"
-    contig_dict = None
-    if genome_version == "38":
-        contig_dict = NO_CHR_TO_CHR_CONTIG_RECODING
-
-    mt = hl.import_vcf(
-        [splice_ai_snvs_path, splice_ai_indels_path],
-        reference_genome=f"GRCh{genome_version}",
-        contig_recoding=contig_dict,
-        force_bgz=True,
-        min_partitions=10000,
-        skip_invalid_loci=True,
-    )
-    interval = [
-        hl.parse_locus_interval(interval, reference_genome=f"GRCh{genome_version}")
-    ]
-    mt = hl.filter_intervals(mt, interval)
-
-    # Split SpliceAI field by | delimiter. Capture delta score entries and map to floats
-    delta_scores = mt.info.SpliceAI[0].split(delim="\\|")[2:6]
-    splice_split = mt.info.annotate(
-        SpliceAI=hl.map(lambda x: hl.float32(x), delta_scores)
-    )
-    mt = mt.annotate_rows(info=splice_split)
-
-    # Annotate info.max_DS with the max of DS_AG, DS_AL, DS_DG, DS_DL in info.
-    # delta_score array is |DS_AG|DS_AL|DS_DG|DS_DL
-    consequences = hl.literal(
-        ["Acceptor gain", "Acceptor loss", "Donor gain", "Donor loss"]
-    )
-    mt = mt.annotate_rows(info=mt.info.annotate(max_DS=hl.max(mt.info.SpliceAI)))
-    mt = mt.annotate_rows(
-        info=mt.info.annotate(
-            splice_consequence=hl.if_else(
-                mt.info.max_DS > 0,
-                consequences[mt.info.SpliceAI.index(mt.info.max_DS)],
-                "No consequence",
-            )
-        )
-    )
-    return mt
-
-
-def run():
-    for version, config in CONFIG.items():
-        logger.info("===> Version %s" % version)
-        mt = vcf_to_mt(config[0], config[1], version)
-
-        # Write mt as a ht to the same directory as the snv source.
-        dest = os.path.join(os.path.dirname(CONFIG[version][0]), "spliceai_scores.ht")
-        logger.info("===> Writing to %s" % dest)
-        ht = mt.rows()
-        ht.write(dest)
-        ht.describe()
-
-
-run()
diff --git a/download_and_create_reference_datasets/v02/mito/utils.py b/download_and_create_reference_datasets/v02/mito/utils.py
deleted file mode 100644
index f7517cc88..000000000
--- a/download_and_create_reference_datasets/v02/mito/utils.py
+++ /dev/null
@@ -1,92 +0,0 @@
-import argparse
-import logging
-import json
-import tqdm
-import tempfile
-import os
-import zipfile
-import requests
-
-import hail as hl
-
-logging.basicConfig(format='%(asctime)s %(levelname)-8s %(message)s', level='INFO')
-logger = logging.getLogger(__name__)
-
-
-def _download_file(url, to_dir=tempfile.gettempdir(), skip_verify=False):
-    if not (url and url.startswith(("http://", "https://"))):
-        raise ValueError("Invalid url: {}".format(url))
-
-    local_file_path = os.path.join(to_dir, os.path.basename(url.rstrip('/')))
-
-    if not skip_verify:
-        response = requests.head(url)
-        size = int(response.headers.get('Content-Length', '0'))
-        if os.path.isfile(local_file_path) and os.path.getsize(local_file_path) == size:
-            logger.info("Re-using {} previously downloaded from {}".format(local_file_path, url))
-            return local_file_path
-
-    is_gz = url.endswith(".gz") or url.endswith(".zip")
-    response = requests.get(url, stream=is_gz, verify=not skip_verify)
-    input_iter = response if is_gz else response.iter_content()
-
-    logger.info("Downloading {} to {}".format(url, local_file_path))
-    input_iter = tqdm.tqdm(input_iter, unit=" data" if is_gz else " lines")
-
-    with open(local_file_path, 'wb') as f:
-        f.writelines(input_iter)
-
-    input_iter.close()
-
-    return local_file_path
-
-
-def _convert_json_to_tsv(json_path):
-    with open(json_path, 'r') as f:
-        data = json.load(f)
-    tsv_path = f'{json_path[:-5]}.tsv' if json_path.endswith('.json') else f'{json_path}.tsv'
-    with open(tsv_path, 'w') as f:
-        header = '\t'.join(data[0].keys())
-        f.write(header + '\n')
-        for row in data:
-            f.write('\t'.join([str(v) for v in row.values()]) + '\n')
-    return tsv_path
-
-
-def _load_mito_ht(config, force_write=True):
-    logger.info(f'Downloading dataset from {config["input_path"]}.')
-    dn_path = _download_file(config['input_path'], skip_verify=config.get('skip_verify_ssl'))
-
-    if dn_path.endswith('.zip'):
-        with zipfile.ZipFile(dn_path, 'r') as zip:
-            zip.extractall(path=os.path.dirname(dn_path))
-        dn_path = dn_path[:-4]
-
-    logger.info(f'Loading hail table from {dn_path}.')
-    types = config['field_types'] if config.get('field_types') else {}
-    if config['input_type'] == 'json':
-        tsv_path = _convert_json_to_tsv(dn_path)
-        ht = hl.import_table(tsv_path, types=types)
-    else:
-        ht = hl.import_table(dn_path, types=types)
-
-    if config.get('annotate'):
-        ht = ht.annotate(**{field: func(ht) for field, func in config['annotate'].items()})
-
-    ht = ht.filter(ht.locus.contig == 'chrM')
-
-    ht = ht.key_by('locus', 'alleles')
-
-    logger.info(f'Writing hail table to {config["output_path"]}.')
-    ht.write(config['output_path'], overwrite=force_write)
-    logger.info('Done')
-
-
-def load(config):
-    parser = argparse.ArgumentParser()
-    parser.add_argument('-f', '--force-write', help='Force write to an existing output file', action='store_true')
-    args = parser.parse_args()
-
-    hl.init(default_reference='GRCh38')
-
-    _load_mito_ht(config, args.force_write)
diff --git a/download_and_create_reference_datasets/v02/mito/write_combined_mito_reference_data_hts.py b/download_and_create_reference_datasets/v02/mito/write_combined_mito_reference_data_hts.py
deleted file mode 100644
index 185d352d6..000000000
--- a/download_and_create_reference_datasets/v02/mito/write_combined_mito_reference_data_hts.py
+++ /dev/null
@@ -1,44 +0,0 @@
-#!/usr/bin/env python3
-import argparse
-import logging
-
-import hail as hl
-
-from v03_pipeline.lib.reference_data.dataset_table_operations import join_hts
-
-VERSION = '2.0.4'
-OUTPUT_PATH = 'gs://seqr-reference-data/GRCh38/mitochondrial/all_mito_reference_data/combined_reference_data_chrM.ht'
-
-logging.basicConfig(format='%(asctime)s %(levelname)-8s %(message)s', level='INFO')
-logger = logging.getLogger(__name__)
-
-
-def run(args):
-    # If there are out-of-memory errors, such as "java.lang.OutOfMemoryError: GC overhead limit exceeded"
-    # then you may need to set the environment variable with the following command
-    # $ export PYSPARK_SUBMIT_ARGS="--driver-memory 4G pyspark-shell"
-    # "4G" in the environment variable can be bigger if your computer has a larger memory.
-    # See more information in https://discuss.hail.is/t/java-heap-space-out-of-memory/1575/6
-    hl.init(default_reference='GRCh38', min_block_size=128, master='local[32]')
-
-    logger.info('Joining the mitochondrial reference datasets')
-    joined_ht = join_hts(
-        ['gnomad_mito', 'mitomap', 'mitimpact', 'hmtvar', 'helix_mito', 'dbnsfp_mito'],
-        VERSION,
-        reference_genome='38'
-    )
-
-    joined_ht = joined_ht.rename({'dbnsfp_mito': 'dbnsfp'})
-    output_path = args.output_path if args.output_path else OUTPUT_PATH
-    logger.info(f'Writing to {output_path}')
-    joined_ht.write(output_path, overwrite=args.force_write)
-    logger.info('Done')
-
-
-if __name__ == "__main__":
-    parser = argparse.ArgumentParser()
-    parser.add_argument('-f', '--force-write', help='Force write to an existing output file', action='store_true')
-    parser.add_argument('-o', '--output-path', help=f'Output path for the combined reference dataset. Default is {OUTPUT_PATH}')
-    args = parser.parse_args()
-
-    run(args)
diff --git a/download_and_create_reference_datasets/v02/mito/write_mito_helix_ht.py b/download_and_create_reference_datasets/v02/mito/write_mito_helix_ht.py
deleted file mode 100644
index f0f14a48e..000000000
--- a/download_and_create_reference_datasets/v02/mito/write_mito_helix_ht.py
+++ /dev/null
@@ -1,19 +0,0 @@
-import hail as hl
-
-from download_and_create_reference_datasets.v02.mito.utils import load
-
-CONFIG = {
-    'input_path': 'https://helix-research-public.s3.amazonaws.com/mito/HelixMTdb_20200327.tsv',
-    'input_type': 'tsv',
-    'output_path': 'gs://seqr-reference-data/GRCh38/mitochondrial/Helix/HelixMTdb_20200327.ht',
-    'field_types': {'counts_hom': hl.tint32, 'AF_hom': hl.tfloat64, 'counts_het': hl.tint32,
-                    'AF_het': hl.tfloat64, 'max_ARF': hl.tfloat64, 'alleles': hl.tarray(hl.tstr)},
-    'annotate': {
-        'locus': lambda ht: hl.locus('chrM', hl.parse_int32(ht.locus.split(':')[1])),
-        'AN': lambda ht: hl.if_else(ht.AF_hom > 0, hl.int32(ht.counts_hom/ht.AF_hom), hl.int32(ht.counts_het/ht.AF_het))
-    },
-}
-
-
-if __name__ == "__main__":
-    load(CONFIG)
diff --git a/download_and_create_reference_datasets/v02/mito/write_mito_hmtvar_ht.py b/download_and_create_reference_datasets/v02/mito/write_mito_hmtvar_ht.py
deleted file mode 100644
index 05a9974bb..000000000
--- a/download_and_create_reference_datasets/v02/mito/write_mito_hmtvar_ht.py
+++ /dev/null
@@ -1,19 +0,0 @@
-import hail as hl
-
-from download_and_create_reference_datasets.v02.mito.utils import load
-
-CONFIG = {
-    'input_path': 'https://www.hmtvar.uniba.it/api/main/',
-    'input_type': 'json',
-    'skip_verify_ssl': True, # The certificate of the website has expired.
-    'output_path': 'gs://seqr-reference-data/GRCh38/mitochondrial/HmtVar/HmtVar Jan. 10 2022.ht',
-    'annotate': {
-        'locus': lambda ht: hl.locus('chrM', hl.parse_int32(ht.nt_start)),
-        'alleles': lambda ht: [ht.ref_rCRS, ht.alt],
-        'disease_score': lambda ht: hl.parse_float(ht.disease_score),
-    },
-}
-
-
-if __name__ == "__main__":
-    load(CONFIG)
diff --git a/download_and_create_reference_datasets/v02/mito/write_mito_mitimpact_ht.py b/download_and_create_reference_datasets/v02/mito/write_mito_mitimpact_ht.py
deleted file mode 100644
index 0e158b018..000000000
--- a/download_and_create_reference_datasets/v02/mito/write_mito_mitimpact_ht.py
+++ /dev/null
@@ -1,18 +0,0 @@
-import hail as hl
-
-from download_and_create_reference_datasets.v02.mito.utils import load
-
-CONFIG = {
-    'input_path': 'https://mitimpact.css-mendel.it/cdn/MitImpact_db_3.1.3.txt.zip',
-    'input_type': 'tsv',
-    'output_path': 'gs://seqr-reference-data/GRCh38/mitochondrial/MitImpact/MitImpact_db_3.1.3.ht',
-    'annotate': {
-        'locus': lambda ht: hl.locus('chrM', hl.parse_int32(ht.Start)),
-        'alleles': lambda ht: [ht.Ref, ht.Alt],
-        'APOGEE2_score': lambda ht: hl.parse_float(ht.APOGEE2_score),
-    },
-}
-
-
-if __name__ == "__main__":
-    load(CONFIG)
diff --git a/download_and_create_reference_datasets/v02/mito/write_mito_mitomap_ht.py b/download_and_create_reference_datasets/v02/mito/write_mito_mitomap_ht.py
deleted file mode 100644
index 7ade39494..000000000
--- a/download_and_create_reference_datasets/v02/mito/write_mito_mitomap_ht.py
+++ /dev/null
@@ -1,20 +0,0 @@
-import hail as hl
-
-from download_and_create_reference_datasets.v02.mito.utils import load
-
-CONFIG = {
-    # The data source is https://www.mitomap.org/foswiki/bin/view/MITOMAP/ConfirmedMutations and it is a regular web
-    # page. So we download it manually and save the data to a file in tsv format.
-    'input_path': 'https://storage.googleapis.com/seqr-reference-data/GRCh38/mitochondrial/MITOMAP/Mitomap%20Confirmed%20Mutations%20Feb.%2004%202022.tsv',
-    'input_type': 'tsv',
-    'output_path': 'gs://seqr-reference-data/GRCh38/mitochondrial/MITOMAP/Mitomap Confirmed Mutations Feb. 04 2022.ht',
-    'annotate': {
-        'locus': lambda ht: hl.locus('chrM', hl.parse_int32(ht.Allele.first_match_in('m.([0-9]+)')[0])),
-        'alleles': lambda ht: ht.Allele.first_match_in('m.[0-9]+([ATGC]+)>([ATGC]+)'),
-        'pathogenic': lambda ht: True
-    },
-}
-
-
-if __name__ == "__main__":
-    load(CONFIG)
diff --git a/requirements-dev.in b/requirements-dev.in
index 79ef6c802..0c76ef725 100644
--- a/requirements-dev.in
+++ b/requirements-dev.in
@@ -6,3 +6,4 @@ pip-tools>=6.12.3
 responses>=0.23.1
 ruff>=0.1.8
 shellcheck-py>=0.10.0
+pysam==0.22.1
diff --git a/requirements-dev.txt b/requirements-dev.txt
index 20fac8892..65e628a44 100644
--- a/requirements-dev.txt
+++ b/requirements-dev.txt
@@ -12,11 +12,11 @@ babel==2.13.1
     # via sphinx
 build==1.0.3
     # via pip-tools
-certifi==2023.11.17
+certifi==2024.8.30
     # via
     #   -c requirements.txt
     #   requests
-charset-normalizer==3.3.2
+charset-normalizer==3.4.0
     # via
     #   -c requirements.txt
     #   requests
@@ -29,20 +29,18 @@ coverage==7.3.2
     #   -r requirements-dev.in
     #   nose-py3
 docutils==0.20.1
-    # via
-    #   -c requirements.txt
-    #   sphinx
-idna==3.6
+    # via sphinx
+idna==3.10
     # via
     #   -c requirements.txt
     #   requests
 imagesize==1.4.1
     # via sphinx
-jinja2==3.1.3
+jinja2==3.1.4
     # via
     #   -c requirements.txt
     #   sphinx
-markupsafe==2.1.3
+markupsafe==3.0.2
     # via
     #   -c requirements.txt
     #   jinja2
@@ -52,28 +50,30 @@ mypy-extensions==1.0.0
     # via mypy
 nose-py3==1.6.3
     # via -r requirements-dev.in
-numpy==1.26.2
+numpy==1.26.4
     # via
     #   -c requirements.txt
     #   nose-py3
-packaging==23.2
+packaging==24.1
     # via
     #   -c requirements.txt
     #   build
     #   sphinx
 pip-tools==7.3.0
     # via -r requirements-dev.in
-pygments==2.17.2
+pygments==2.18.0
     # via
     #   -c requirements.txt
     #   sphinx
 pyproject-hooks==1.0.0
     # via build
-pyyaml==6.0.1
+pysam==0.22.1
+    # via -r requirements-dev.in
+pyyaml==6.0.2
     # via
     #   -c requirements.txt
     #   responses
-requests==2.31.0
+requests==2.32.3
     # via
     #   -c requirements.txt
     #   responses
@@ -116,11 +116,11 @@ tomli==2.0.1
     #   mypy
     #   pip-tools
     #   pyproject-hooks
-typing-extensions==4.8.0
+typing-extensions==4.12.2
     # via
     #   -c requirements.txt
     #   mypy
-urllib3==2.0.7
+urllib3==2.2.3
     # via
     #   -c requirements.txt
     #   requests
diff --git a/v03_pipeline/bin/rsync_reference_data.bash b/v03_pipeline/bin/rsync_reference_data.bash
index 825c583e5..db937c132 100755
--- a/v03_pipeline/bin/rsync_reference_data.bash
+++ b/v03_pipeline/bin/rsync_reference_data.bash
@@ -39,7 +39,7 @@ else
     fi
 fi
 
-gsutil -m rsync -rd "gs://seqr-reference-data/v03/$REFERENCE_GENOME" $REFERENCE_DATASETS_DIR/$REFERENCE_GENOME
+gsutil -m rsync -rd "gs://seqr-reference-data/v3.1/$REFERENCE_GENOME" $REFERENCE_DATASETS_DIR/$REFERENCE_GENOME
 if ! [[ $REFERENCE_DATASETS_DIR =~ gs://* ]]; then
     touch "$REFERENCE_DATASETS_DIR"/"$REFERENCE_GENOME"/_SUCCESS
 else
diff --git a/v03_pipeline/lib/annotations/enums.py b/v03_pipeline/lib/annotations/enums.py
index 16033214f..35a61179e 100644
--- a/v03_pipeline/lib/annotations/enums.py
+++ b/v03_pipeline/lib/annotations/enums.py
@@ -206,6 +206,21 @@
     'NEAREST_TSS',
 ]
 
+CLINVAR_ASSERTIONS = [
+    'Affects',
+    'association',
+    'association_not_found',
+    'confers_sensitivity',
+    'drug_response',
+    'low_penetrance',
+    'not_provided',
+    'other',
+    'protective',
+    'risk_factor',
+    'no_classification_for_the_single_variant',
+    'no_classifications_from_unflagged_records',
+]
+
 CLINVAR_DEFAULT_PATHOGENICITY = 'No_pathogenic_assertion'
 
 # NB: sorted by pathogenicity
diff --git a/v03_pipeline/lib/annotations/fields_test.py b/v03_pipeline/lib/annotations/fields_test.py
index 904369e23..fcb8a11d3 100644
--- a/v03_pipeline/lib/annotations/fields_test.py
+++ b/v03_pipeline/lib/annotations/fields_test.py
@@ -6,32 +6,39 @@
 from v03_pipeline.lib.annotations.fields import get_fields
 from v03_pipeline.lib.model import (
     DatasetType,
-    ReferenceDatasetCollection,
     ReferenceGenome,
 )
-from v03_pipeline.lib.paths import valid_reference_dataset_collection_path
+from v03_pipeline.lib.paths import valid_reference_dataset_path
+from v03_pipeline.lib.reference_datasets.reference_dataset import ReferenceDataset
 from v03_pipeline.lib.test.mocked_dataroot_testcase import MockedDatarootTestCase
 from v03_pipeline.lib.vep import run_vep
 from v03_pipeline.var.test.vep.mock_vep_data import MOCK_37_VEP_DATA, MOCK_38_VEP_DATA
 
-TEST_INTERVAL_1 = 'v03_pipeline/var/test/reference_data/test_interval_1.ht'
 GRCH37_TO_GRCH38_LIFTOVER_REF_PATH = (
     'v03_pipeline/var/test/liftover/grch37_to_grch38.over.chain.gz'
 )
 GRCH38_TO_GRCH37_LIFTOVER_REF_PATH = (
     'v03_pipeline/var/test/liftover/grch38_to_grch37.over.chain.gz'
 )
+TEST_GNOMAD_NONCODING_CONSTRAINT_38_HT = 'v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht'
+TEST_SCREEN_38_HT = 'v03_pipeline/var/test/reference_datasets/GRCh38/screen/1.0.ht'
 
 
 class FieldsTest(MockedDatarootTestCase):
     def setUp(self) -> None:
         super().setUp()
         shutil.copytree(
-            TEST_INTERVAL_1,
-            valid_reference_dataset_collection_path(
+            TEST_GNOMAD_NONCODING_CONSTRAINT_38_HT,
+            valid_reference_dataset_path(
                 ReferenceGenome.GRCh38,
-                DatasetType.SNV_INDEL,
-                ReferenceDatasetCollection.INTERVAL,
+                ReferenceDataset.gnomad_non_coding_constraint,
+            ),
+        )
+        shutil.copytree(
+            TEST_SCREEN_38_HT,
+            valid_reference_dataset_path(
+                ReferenceGenome.GRCh38,
+                ReferenceDataset.screen,
             ),
         )
 
@@ -120,18 +127,17 @@ def test_get_formatting_fields(self, mock_vep: Mock) -> None:
                             reference_genome,
                         ),
                         **{
-                            f'{rdc.value}_ht': hl.read_table(
-                                valid_reference_dataset_collection_path(
+                            f'{reference_dataset}_ht': hl.read_table(
+                                valid_reference_dataset_path(
                                     reference_genome,
-                                    DatasetType.SNV_INDEL,
-                                    rdc,
+                                    reference_dataset,
                                 ),
                             )
-                            for rdc in ReferenceDatasetCollection.for_reference_genome_dataset_type(
+                            for reference_dataset in ReferenceDataset.for_reference_genome_dataset_type_annotations(
                                 reference_genome,
                                 DatasetType.SNV_INDEL,
                             )
-                            if rdc.requires_annotation
+                            if reference_dataset.is_keyed_by_interval
                         },
                         **(
                             {
diff --git a/v03_pipeline/lib/annotations/mito.py b/v03_pipeline/lib/annotations/mito.py
index 49ed0c108..dbf483685 100644
--- a/v03_pipeline/lib/annotations/mito.py
+++ b/v03_pipeline/lib/annotations/mito.py
@@ -47,14 +47,6 @@ def HL(mt: hl.MatrixTable, **_: Any) -> hl.Expression:  # noqa: N802
     return hl.if_else(is_called, mt.HL, 0)
 
 
-def high_constraint_region_mito(
-    ht: hl.Table,
-    interval_ht: hl.Table,
-    **_: Any,
-) -> hl.Expression:
-    return hl.is_defined(interval_ht[ht.locus])
-
-
 def mito_cn(mt: hl.MatrixTable, **_: Any) -> hl.Expression:
     return hl.int32(mt.mito_cn)
 
diff --git a/v03_pipeline/lib/annotations/rdc_dependencies.py b/v03_pipeline/lib/annotations/rdc_dependencies.py
deleted file mode 100644
index 1f1e13d7a..000000000
--- a/v03_pipeline/lib/annotations/rdc_dependencies.py
+++ /dev/null
@@ -1,29 +0,0 @@
-import hail as hl
-
-from v03_pipeline.lib.model import (
-    DatasetType,
-    ReferenceDatasetCollection,
-    ReferenceGenome,
-)
-from v03_pipeline.lib.paths import (
-    valid_reference_dataset_collection_path,
-)
-
-
-def get_rdc_annotation_dependencies(
-    dataset_type: DatasetType,
-    reference_genome: ReferenceGenome,
-) -> dict[str, hl.Table]:
-    deps = {}
-    for rdc in ReferenceDatasetCollection.for_reference_genome_dataset_type(
-        reference_genome,
-        dataset_type,
-    ):
-        deps[f'{rdc.value}_ht'] = hl.read_table(
-            valid_reference_dataset_collection_path(
-                reference_genome,
-                dataset_type,
-                rdc,
-            ),
-        )
-    return deps
diff --git a/v03_pipeline/lib/annotations/snv_indel.py b/v03_pipeline/lib/annotations/snv_indel.py
index 44a26b044..de1a561c4 100644
--- a/v03_pipeline/lib/annotations/snv_indel.py
+++ b/v03_pipeline/lib/annotations/snv_indel.py
@@ -73,16 +73,16 @@ def gt_stats(
 
 def gnomad_non_coding_constraint(
     ht: hl.Table,
-    interval_ht: hl.Table,
+    gnomad_non_coding_constraint_ht: hl.Table,
     **_: Any,
 ) -> hl.Expression:
     return hl.Struct(
         z_score=(
-            interval_ht.index(ht.locus, all_matches=True)
+            gnomad_non_coding_constraint_ht.index(ht.locus, all_matches=True)
             .filter(
-                lambda x: hl.is_defined(x.gnomad_non_coding_constraint['z_score']),
+                lambda x: hl.is_defined(x['z_score']),
             )
-            .gnomad_non_coding_constraint.z_score.first()
+            .z_score.first()
         ),
     )
 
@@ -98,16 +98,16 @@ def rg38_locus(
 
 def screen(
     ht: hl.Table,
-    interval_ht: hl.Table,
+    screen_ht: hl.Table,
     **_: Any,
 ) -> hl.Expression:
     return hl.Struct(
         region_type_ids=(
-            interval_ht.index(
+            screen_ht.index(
                 ht.locus,
                 all_matches=True,
             ).flatmap(
-                lambda x: x.screen['region_type_ids'],
+                lambda x: x['region_type_ids'],
             )
         ),
     )
diff --git a/v03_pipeline/lib/misc/validation.py b/v03_pipeline/lib/misc/validation.py
index 5f5e4400c..77448bb03 100644
--- a/v03_pipeline/lib/misc/validation.py
+++ b/v03_pipeline/lib/misc/validation.py
@@ -3,17 +3,11 @@
 import hail as hl
 
 from v03_pipeline.lib.model import (
-    CachedReferenceDatasetQuery,
     DatasetType,
-    Env,
     ReferenceGenome,
     SampleType,
     Sex,
 )
-from v03_pipeline.lib.paths import (
-    cached_reference_dataset_query_path,
-    sex_check_table_path,
-)
 
 AMBIGUOUS_THRESHOLD_PERC: float = 0.01  # Fraction of samples identified as "ambiguous_sex" above which an error will be thrown.
MIN_ROWS_PER_CONTIG = 100 @@ -24,36 +18,6 @@ class SeqrValidationError(Exception): pass -def get_validation_dependencies( - dataset_type: DatasetType, - reference_genome: ReferenceGenome, - callset_path: str, - skip_check_sex_and_relatedness: bool, - **_: Any, -) -> dict[str, hl.Table]: - deps = {} - deps['coding_and_noncoding_variants_ht'] = hl.read_table( - cached_reference_dataset_query_path( - reference_genome, - dataset_type, - CachedReferenceDatasetQuery.GNOMAD_CODING_AND_NONCODING_VARIANTS, - ), - ) - if ( - Env.CHECK_SEX_AND_RELATEDNESS - and dataset_type.check_sex_and_relatedness - and not skip_check_sex_and_relatedness - ): - deps['sex_check_ht'] = hl.read_table( - sex_check_table_path( - reference_genome, - dataset_type, - callset_path, - ), - ) - return deps - - def validate_allele_type( mt: hl.MatrixTable, dataset_type: DatasetType, diff --git a/v03_pipeline/lib/misc/validation_test.py b/v03_pipeline/lib/misc/validation_test.py index d7b318f18..f32900a99 100644 --- a/v03_pipeline/lib/misc/validation_test.py +++ b/v03_pipeline/lib/misc/validation_test.py @@ -1,5 +1,4 @@ import unittest -from unittest.mock import Mock, patch import hail as hl @@ -111,8 +110,7 @@ def test_validate_allele_type(self) -> None: DatasetType.SNV_INDEL, ) - @patch('v03_pipeline.lib.misc.validation.Env') - def test_validate_imputed_sex_ploidy(self, mock_env: Mock) -> None: + def test_validate_imputed_sex_ploidy(self) -> None: female_sample = 'HG00731_1' male_sample_1 = 'HG00732_1' male_sample_2 = 'HG00732_1' @@ -121,7 +119,6 @@ def test_validate_imputed_sex_ploidy(self, mock_env: Mock) -> None: xyy_sample = 'NA20891_1' xxx_sample = 'NA20892_1' - mock_env.CHECK_SEX_AND_RELATEDNESS = True sex_check_ht = hl.read_table(TEST_SEX_CHECK_1) # All calls on X chromosome are valid diff --git a/v03_pipeline/lib/model/__init__.py b/v03_pipeline/lib/model/__init__.py index 1e6605876..bb4325d47 100644 --- a/v03_pipeline/lib/model/__init__.py +++ b/v03_pipeline/lib/model/__init__.py @@ -1,6 +1,3 
@@ -from v03_pipeline.lib.model.cached_reference_dataset_query import ( - CachedReferenceDatasetQuery, -) from v03_pipeline.lib.model.dataset_type import DatasetType from v03_pipeline.lib.model.definitions import ( AccessControl, @@ -10,18 +7,13 @@ Sex, ) from v03_pipeline.lib.model.environment import Env -from v03_pipeline.lib.model.reference_dataset_collection import ( - ReferenceDatasetCollection, -) __all__ = [ 'AccessControl', - 'CachedReferenceDatasetQuery', 'DatasetType', 'Env', 'Sex', 'PipelineVersion', - 'ReferenceDatasetCollection', 'ReferenceGenome', 'SampleType', ] diff --git a/v03_pipeline/lib/model/cached_reference_dataset_query.py b/v03_pipeline/lib/model/cached_reference_dataset_query.py deleted file mode 100644 index b950be51b..000000000 --- a/v03_pipeline/lib/model/cached_reference_dataset_query.py +++ /dev/null @@ -1,65 +0,0 @@ -from collections.abc import Callable -from enum import Enum - -import hail as hl - -from v03_pipeline.lib.model.dataset_type import DatasetType -from v03_pipeline.lib.model.definitions import ReferenceGenome -from v03_pipeline.lib.model.reference_dataset_collection import ( - ReferenceDatasetCollection, -) -from v03_pipeline.lib.reference_data.queries import ( - clinvar_path_variants, - gnomad_coding_and_noncoding_variants, - gnomad_qc, - high_af_variants, -) - - -class CachedReferenceDatasetQuery(str, Enum): - CLINVAR_PATH_VARIANTS = 'clinvar_path_variants' - GNOMAD_CODING_AND_NONCODING_VARIANTS = 'gnomad_coding_and_noncoding_variants' - GNOMAD_QC = 'gnomad_qc' - HIGH_AF_VARIANTS = 'high_af_variants' - - def dataset(self, dataset_type: DatasetType) -> str | None: - return { - CachedReferenceDatasetQuery.CLINVAR_PATH_VARIANTS: 'clinvar_mito' - if dataset_type == DatasetType.MITO - else 'clinvar', - CachedReferenceDatasetQuery.GNOMAD_CODING_AND_NONCODING_VARIANTS: 'gnomad_genomes', - CachedReferenceDatasetQuery.GNOMAD_QC: 'gnomad_qc', - CachedReferenceDatasetQuery.HIGH_AF_VARIANTS: 'gnomad_genomes', - }.get(self) - - 
@property - def reference_dataset_collection(self) -> ReferenceDatasetCollection: - return { - CachedReferenceDatasetQuery.CLINVAR_PATH_VARIANTS: ReferenceDatasetCollection.COMBINED, - CachedReferenceDatasetQuery.GNOMAD_CODING_AND_NONCODING_VARIANTS: None, - CachedReferenceDatasetQuery.GNOMAD_QC: None, - CachedReferenceDatasetQuery.HIGH_AF_VARIANTS: ReferenceDatasetCollection.COMBINED, - }[self] - - @property - def query(self) -> Callable[[hl.Table, ReferenceGenome], hl.Table]: - return { - CachedReferenceDatasetQuery.CLINVAR_PATH_VARIANTS: clinvar_path_variants, - CachedReferenceDatasetQuery.GNOMAD_CODING_AND_NONCODING_VARIANTS: gnomad_coding_and_noncoding_variants, - CachedReferenceDatasetQuery.GNOMAD_QC: gnomad_qc, - CachedReferenceDatasetQuery.HIGH_AF_VARIANTS: high_af_variants, - }[self] - - @classmethod - def for_reference_genome_dataset_type( - cls, - reference_genome: ReferenceGenome, - dataset_type: DatasetType, - ) -> list['CachedReferenceDatasetQuery']: - return { - (ReferenceGenome.GRCh38, DatasetType.SNV_INDEL): list(cls), - (ReferenceGenome.GRCh38, DatasetType.MITO): [ - CachedReferenceDatasetQuery.CLINVAR_PATH_VARIANTS, - ], - (ReferenceGenome.GRCh37, DatasetType.SNV_INDEL): list(cls), - }.get((reference_genome, dataset_type), []) diff --git a/v03_pipeline/lib/model/dataset_type.py b/v03_pipeline/lib/model/dataset_type.py index 8bad768bb..281cffd9b 100644 --- a/v03_pipeline/lib/model/dataset_type.py +++ b/v03_pipeline/lib/model/dataset_type.py @@ -246,7 +246,6 @@ def formatting_annotation_fns( DatasetType.MITO: [ mito.common_low_heteroplasmy, mito.haplogroup, - mito.high_constraint_region_mito, mito.mitotip, mito.rsid, shared.variant_id, diff --git a/v03_pipeline/lib/model/definitions.py b/v03_pipeline/lib/model/definitions.py index 71e8b2821..a87437c8f 100644 --- a/v03_pipeline/lib/model/definitions.py +++ b/v03_pipeline/lib/model/definitions.py @@ -63,6 +63,13 @@ def optional_contigs(self) -> set[str]: }, }[self] + @property + def mito_contig(self) 
-> str: + return { + ReferenceGenome.GRCh37: 'MT', + ReferenceGenome.GRCh38: 'chrM', + }[self] + def contig_recoding(self, include_mt: bool = False) -> dict[str, str]: recode = { ReferenceGenome.GRCh37: { diff --git a/v03_pipeline/lib/model/reference_dataset_collection.py b/v03_pipeline/lib/model/reference_dataset_collection.py deleted file mode 100644 index d05e1e328..000000000 --- a/v03_pipeline/lib/model/reference_dataset_collection.py +++ /dev/null @@ -1,110 +0,0 @@ -from enum import Enum - -import hail as hl - -from v03_pipeline.lib.model.dataset_type import DatasetType -from v03_pipeline.lib.model.definitions import AccessControl, ReferenceGenome -from v03_pipeline.lib.model.environment import Env - - -class ReferenceDatasetCollection(str, Enum): - COMBINED = 'combined' - HGMD = 'hgmd' - INTERVAL = 'interval' - - @property - def access_control(self) -> AccessControl: - if self == ReferenceDatasetCollection.HGMD: - return AccessControl.PRIVATE - return AccessControl.PUBLIC - - @property - def requires_annotation(self) -> bool: - return self == ReferenceDatasetCollection.INTERVAL - - def datasets(self, dataset_type: DatasetType) -> list[str]: - return { - (ReferenceDatasetCollection.COMBINED, DatasetType.SNV_INDEL): [ - 'cadd', - 'clinvar', - 'dbnsfp', - 'eigen', - 'exac', - 'gnomad_exomes', - 'gnomad_genomes', - 'mpc', - 'primate_ai', - 'splice_ai', - 'topmed', - ], - (ReferenceDatasetCollection.COMBINED, DatasetType.MITO): [ - 'clinvar_mito', - 'dbnsfp_mito', - 'gnomad_mito', - 'helix_mito', - 'hmtvar', - 'mitomap', - 'mitimpact', - 'local_constraint_mito', - ], - (ReferenceDatasetCollection.HGMD, DatasetType.SNV_INDEL): ['hgmd'], - (ReferenceDatasetCollection.INTERVAL, DatasetType.SNV_INDEL): [ - 'gnomad_non_coding_constraint', - 'screen', - ], - (ReferenceDatasetCollection.INTERVAL, DatasetType.MITO): [ - 'high_constraint_region_mito', - ], - }.get((self, dataset_type), []) - - def table_key_type( - self, - reference_genome: ReferenceGenome, - ) -> 
hl.tstruct: - default_key = hl.tstruct( - locus=hl.tlocus(reference_genome.value), - alleles=hl.tarray(hl.tstr), - ) - return { - ReferenceDatasetCollection.INTERVAL: hl.tstruct( - interval=hl.tinterval(hl.tlocus(reference_genome.value)), - ), - }.get(self, default_key) - - @classmethod - def for_reference_genome_dataset_type( - cls, - reference_genome: ReferenceGenome, - dataset_type: DatasetType, - ) -> list['ReferenceDatasetCollection']: - rdcs = { - (ReferenceGenome.GRCh38, DatasetType.SNV_INDEL): [ - ReferenceDatasetCollection.COMBINED, - ReferenceDatasetCollection.INTERVAL, - ReferenceDatasetCollection.HGMD, - ], - (ReferenceGenome.GRCh38, DatasetType.MITO): [ - ReferenceDatasetCollection.COMBINED, - ReferenceDatasetCollection.INTERVAL, - ], - (ReferenceGenome.GRCh37, DatasetType.SNV_INDEL): [ - ReferenceDatasetCollection.COMBINED, - ReferenceDatasetCollection.HGMD, - ], - }.get((reference_genome, dataset_type), []) - if not Env.ACCESS_PRIVATE_REFERENCE_DATASETS: - return [rdc for rdc in rdcs if rdc.access_control == AccessControl.PUBLIC] - return rdcs - - @classmethod - def for_dataset( - cls, - dataset: str, - dataset_type: DatasetType, - ) -> 'ReferenceDatasetCollection': - for rdc in cls: - if dataset in rdc.datasets(dataset_type): - return rdc - - err_msg = f'Dataset "{dataset}" not found in any reference dataset collection' - raise ValueError(err_msg) diff --git a/v03_pipeline/lib/paths.py b/v03_pipeline/lib/paths.py index 0ae158866..68d27bba2 100644 --- a/v03_pipeline/lib/paths.py +++ b/v03_pipeline/lib/paths.py @@ -4,14 +4,16 @@ from v03_pipeline.lib.model import ( AccessControl, - CachedReferenceDatasetQuery, DatasetType, Env, PipelineVersion, - ReferenceDatasetCollection, ReferenceGenome, SampleType, ) +from v03_pipeline.lib.reference_datasets.reference_dataset import ( + ReferenceDataset, + ReferenceDatasetQuery, +) def _pipeline_prefix( @@ -57,19 +59,24 @@ def _v03_reference_data_prefix( ) -def cached_reference_dataset_query_path( +def 
_v03_reference_dataset_prefix( + access_control: AccessControl, reference_genome: ReferenceGenome, - dataset_type: DatasetType, - cached_reference_dataset_query: CachedReferenceDatasetQuery, ) -> str: + root = ( + Env.PRIVATE_REFERENCE_DATASETS_DIR + if access_control == AccessControl.PRIVATE + else Env.REFERENCE_DATASETS_DIR + ) + if Env.INCLUDE_PIPELINE_VERSION_IN_PREFIX: + return os.path.join( + root, + PipelineVersion.V3_1.value, + reference_genome.value, + ) return os.path.join( - _v03_reference_data_prefix( - AccessControl.PUBLIC, - reference_genome, - dataset_type, - ), - 'cached_reference_dataset_queries', - f'{cached_reference_dataset_query.value}.ht', + root, + reference_genome.value, ) @@ -283,24 +290,32 @@ def valid_filters_path( ) -def valid_reference_dataset_collection_path( +def valid_reference_dataset_path( + reference_genome: ReferenceGenome, + reference_dataset: ReferenceDataset, +) -> str | None: + return os.path.join( + _v03_reference_dataset_prefix( + reference_dataset.access_control, + reference_genome, + ), + f'{reference_dataset.value}', + f'{reference_dataset.version(reference_genome)}.ht', + ) + + +def valid_reference_dataset_query_path( reference_genome: ReferenceGenome, dataset_type: DatasetType, - reference_dataset_collection: ReferenceDatasetCollection, + reference_dataset_query: ReferenceDatasetQuery, ) -> str | None: - if ( - not Env.ACCESS_PRIVATE_REFERENCE_DATASETS - and reference_dataset_collection.access_control == AccessControl.PRIVATE - ): - return None return os.path.join( - _v03_reference_data_prefix( - reference_dataset_collection.access_control, + _v03_reference_dataset_prefix( + reference_dataset_query.access_control, reference_genome, - dataset_type, ), - 'reference_datasets', - f'{reference_dataset_collection.value}.ht', + dataset_type.value, + f'{reference_dataset_query.value}.ht', ) diff --git a/v03_pipeline/lib/paths_test.py b/v03_pipeline/lib/paths_test.py index 28de9567b..f49e62b61 100644 --- 
a/v03_pipeline/lib/paths_test.py +++ b/v03_pipeline/lib/paths_test.py @@ -2,14 +2,11 @@ from unittest.mock import patch from v03_pipeline.lib.model import ( - CachedReferenceDatasetQuery, DatasetType, - ReferenceDatasetCollection, ReferenceGenome, SampleType, ) from v03_pipeline.lib.paths import ( - cached_reference_dataset_query_path, family_table_path, imported_callset_path, imputed_sex_path, @@ -23,23 +20,12 @@ remapped_and_subsetted_callset_path, sex_check_table_path, valid_filters_path, - valid_reference_dataset_collection_path, validation_errors_for_run_path, variant_annotations_table_path, ) class TestPaths(unittest.TestCase): - def test_cached_reference_dataset_query_path(self) -> None: - self.assertEqual( - cached_reference_dataset_query_path( - ReferenceGenome.GRCh38, - DatasetType.SNV_INDEL, - CachedReferenceDatasetQuery.CLINVAR_PATH_VARIANTS, - ), - '/var/seqr/seqr-reference-data/v03/GRCh38/SNV_INDEL/cached_reference_dataset_queries/clinvar_path_variants.ht', - ) - def test_family_table_path(self) -> None: self.assertEqual( family_table_path( @@ -103,26 +89,6 @@ def test_project_table_path(self) -> None: '/var/seqr/seqr-hail-search-data/v3.1/GRCh38/MITO/projects/WES/R0652_pipeline_test.ht', ) - def test_valid_reference_dataset_collection_path(self) -> None: - with patch('v03_pipeline.lib.paths.Env') as mock_env: - mock_env.ACCESS_PRIVATE_REFERENCE_DATASETS = False - self.assertEqual( - valid_reference_dataset_collection_path( - ReferenceGenome.GRCh37, - DatasetType.SNV_INDEL, - ReferenceDatasetCollection.HGMD, - ), - None, - ) - self.assertEqual( - valid_reference_dataset_collection_path( - ReferenceGenome.GRCh38, - DatasetType.SNV_INDEL, - ReferenceDatasetCollection.HGMD, - ), - '/var/seqr/seqr-reference-data-private/v03/GRCh38/SNV_INDEL/reference_datasets/hgmd.ht', - ) - def test_lookup_table_path(self) -> None: self.assertEqual( lookup_table_path( diff --git a/v03_pipeline/lib/reference_data/clinvar.py b/v03_pipeline/lib/reference_data/clinvar.py 
deleted file mode 100644 index 3e482e0b6..000000000 --- a/v03_pipeline/lib/reference_data/clinvar.py +++ /dev/null @@ -1,214 +0,0 @@ -import gzip -import os -import shutil -import tempfile -import urllib - -import hail as hl -import hailtop.fs as hfs -import requests - -from v03_pipeline.lib.annotations.enums import CLINVAR_PATHOGENICITIES_LOOKUP -from v03_pipeline.lib.logger import get_logger -from v03_pipeline.lib.misc.io import write -from v03_pipeline.lib.model import Env -from v03_pipeline.lib.model.definitions import ReferenceGenome -from v03_pipeline.lib.paths import clinvar_dataset_path - -CLINVAR_ASSERTIONS = [ - 'Affects', - 'association', - 'association_not_found', - 'confers_sensitivity', - 'drug_response', - 'low_penetrance', - 'not_provided', - 'other', - 'protective', - 'risk_factor', - 'no_classification_for_the_single_variant', - 'no_classifications_from_unflagged_records', -] -CLINVAR_GOLD_STARS_LOOKUP = hl.dict( - { - 'no_classification_for_the_single_variant': 0, - 'no_classification_provided': 0, - 'no_assertion_criteria_provided': 0, - 'no_classifications_from_unflagged_records': 0, - 'criteria_provided,_single_submitter': 1, - 'criteria_provided,_conflicting_classifications': 1, - 'criteria_provided,_multiple_submitters,_no_conflicts': 2, - 'reviewed_by_expert_panel': 3, - 'practice_guideline': 4, - }, -) -CLINVAR_SUBMISSION_SUMMARY_URL = ( - 'ftp://ftp.ncbi.nlm.nih.gov/pub/clinvar/tab_delimited/submission_summary.txt.gz' -) -MIN_HT_PARTITIONS = 2000 -logger = get_logger(__name__) - - -def parsed_clnsig(ht: hl.Table): - return ( - hl.delimit(ht.info.CLNSIG) - .replace( - 'Likely_pathogenic,_low_penetrance', - 'Likely_pathogenic|low_penetrance', - ) - .replace( - '/Pathogenic,_low_penetrance/Established_risk_allele', - '/Established_risk_allele|low_penetrance', - ) - .replace( - '/Pathogenic,_low_penetrance', - '|low_penetrance', - ) - .split(r'\|') - ) - - -def parse_to_count(entry: str): - splt = entry.split( - r'\(', - ) # pattern, count = 
entry... if destructuring worked on a hail expression! - return hl.Struct( - pathogenicity_id=CLINVAR_PATHOGENICITIES_LOOKUP[splt[0]], - count=hl.int32(splt[1][:-1]), - ) - - -def parsed_and_mapped_clnsigconf(ht: hl.Table): - return ( - hl.delimit(ht.info.CLNSIGCONF) - .replace(',_low_penetrance', '') - .split(r'\|') - .map(parse_to_count) - .group_by(lambda x: x.pathogenicity_id) - .map_values( - lambda values: ( - values.fold( - lambda x, y: x + y.count, - 0, - ) - ), - ) - .items() - .map(lambda x: hl.Struct(pathogenicity_id=x[0], count=x[1])) - ) - - -def get_clinvar_ht( - clinvar_url: str, - reference_genome: ReferenceGenome, -): - etag = requests.head(clinvar_url, timeout=10).headers.get('ETag').strip('"') - clinvar_ht_path = clinvar_dataset_path(reference_genome, etag) - if hfs.exists(clinvar_ht_path): - logger.info(f'Try using cached clinvar ht with etag {etag}') - ht = hl.read_table(clinvar_ht_path) - else: - logger.info('Cached clinvar ht not found, downloading latest clinvar vcf') - hl._set_flags(use_new_shuffle=None, no_whole_stage_codegen='1') # noqa: SLF001 - ht = download_and_import_latest_clinvar_vcf(clinvar_url, reference_genome) - write(ht, clinvar_ht_path, repartition=False) - hl._set_flags(use_new_shuffle='1', no_whole_stage_codegen='1') # noqa: SLF001 - return ht - - -def download_and_import_latest_clinvar_vcf( - clinvar_url: str, - reference_genome: ReferenceGenome, -) -> hl.Table: - version = parse_clinvar_release_date(clinvar_url) - with tempfile.NamedTemporaryFile(suffix='.vcf.gz', delete=False) as tmp_file: - urllib.request.urlretrieve(clinvar_url, tmp_file.name) # noqa: S310 - cached_tmp_file_name = os.path.join( - Env.HAIL_TMP_DIR, - os.path.basename(tmp_file.name), - ) - # In cases where HAIL_TMP_DIR is a remote path, copy the - # file there. If it's local, do nothing. 
- if tmp_file.name != cached_tmp_file_name: - hfs.copy(tmp_file.name, cached_tmp_file_name) - mt = hl.import_vcf( - cached_tmp_file_name, - reference_genome=reference_genome.value, - drop_samples=True, - skip_invalid_loci=True, - contig_recoding=reference_genome.contig_recoding(include_mt=True), - min_partitions=MIN_HT_PARTITIONS, - force_bgz=True, - ) - mt = mt.annotate_globals(version=version) - return join_to_submission_summary_ht(mt.rows()) - - -def parse_clinvar_release_date(clinvar_url: str) -> str: - response = requests.get(clinvar_url, stream=True, timeout=10) - for byte_line in gzip.GzipFile(fileobj=response.raw): - line = byte_line.decode('ascii').strip() - if not line: - continue - if line.startswith('##fileDate='): - return line.split('=')[-1].strip() - if not line.startswith('#'): - return None - return None - - -def join_to_submission_summary_ht(vcf_ht: hl.Table) -> hl.Table: - # https://ftp.ncbi.nlm.nih.gov/pub/clinvar/tab_delimited/README - submission_summary.txt - logger.info('Getting clinvar submission summary from NCBI FTP server') - ht = download_and_import_clinvar_submission_summary() - return vcf_ht.annotate( - submitters=ht[vcf_ht.rsid].Submitters, - conditions=ht[vcf_ht.rsid].Conditions, - ) - - -def download_and_import_clinvar_submission_summary() -> hl.Table: - with tempfile.NamedTemporaryFile( - suffix='.txt.gz', - delete=False, - ) as tmp_file, tempfile.NamedTemporaryFile( - suffix='.txt', - delete=False, - ) as unzipped_tmp_file: - urllib.request.urlretrieve(CLINVAR_SUBMISSION_SUMMARY_URL, tmp_file.name) # noqa: S310 - # Unzip the gzipped file first to fix gzip files being read by hail with single partition - with gzip.open(tmp_file.name, 'rb') as f_in, open( - unzipped_tmp_file.name, - 'wb', - ) as f_out: - shutil.copyfileobj(f_in, f_out) - cached_tmp_file_name = os.path.join( - Env.HAIL_TMP_DIR, - os.path.basename(unzipped_tmp_file.name), - ) - # In cases where HAIL_TMP_DIR is a remote path, copy the - # file there. 
If it's local, do nothing. - if unzipped_tmp_file.name != cached_tmp_file_name: - hfs.copy(unzipped_tmp_file.name, cached_tmp_file_name) - return import_submission_table(cached_tmp_file_name) - - -def import_submission_table(file_name: str) -> hl.Table: - ht = hl.import_table( - file_name, - force=True, - filter='^(#[^:]*:|^##).*$', # removes all comments except for the header line - types={ - '#VariationID': hl.tstr, - 'Submitter': hl.tstr, - 'ReportedPhenotypeInfo': hl.tstr, - }, - missing='-', - min_partitions=MIN_HT_PARTITIONS, - ) - ht = ht.rename({'#VariationID': 'VariationID'}) - ht = ht.select('VariationID', 'Submitter', 'ReportedPhenotypeInfo') - return ht.group_by('VariationID').aggregate( - Submitters=hl.agg.collect(ht.Submitter), - Conditions=hl.agg.collect(ht.ReportedPhenotypeInfo), - ) diff --git a/v03_pipeline/lib/reference_data/clinvar_test.py b/v03_pipeline/lib/reference_data/clinvar_test.py deleted file mode 100644 index fd8d4e832..000000000 --- a/v03_pipeline/lib/reference_data/clinvar_test.py +++ /dev/null @@ -1,281 +0,0 @@ -import gzip -import unittest -from unittest import mock - -import hail as hl -import responses - -from v03_pipeline.lib.reference_data.clinvar import ( - import_submission_table, - join_to_submission_summary_ht, - parse_clinvar_release_date, - parsed_and_mapped_clnsigconf, - parsed_clnsig, -) - -CLINVAR_VCF_DATA = b""" -##fileformat=VCFv4.1 -##fileDate=2024-10-27 -##source=ClinVar -##reference=GRCh37 -##ID= -##INFO= -""" - - -class ClinvarTest(unittest.TestCase): - @responses.activate - def test_parse_clinvar_release_date(self): - clinvar_url = ( - 'https://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh37/clinvar.vcf.gz' - ) - responses.get( - clinvar_url, - body=gzip.compress(CLINVAR_VCF_DATA), - ) - self.assertEqual( - parse_clinvar_release_date(clinvar_url), - '2024-10-27', - ) - - def test_parsed_clnsig(self): - ht = hl.Table.parallelize( - [ - {'info': hl.Struct(CLNSIG=['Pathogenic|Affects'])}, - { - 'info': hl.Struct( - 
CLNSIG=[ - 'Pathogenic/Likely_pathogenic/Pathogenic', - '_low_penetrance', - ], - ), - }, - { - 'info': hl.Struct( - CLNSIG=[ - 'Likely_pathogenic/Pathogenic', - '_low_penetrance|association|protective', - ], - ), - }, - {'info': hl.Struct(CLNSIG=['Likely_pathogenic', '_low_penetrance'])}, - {'info': hl.Struct(CLNSIG=['association|protective'])}, - { - 'info': hl.Struct( - CLNSIG=[ - 'Pathogenic/Likely_pathogenic/Pathogenic', - '_low_penetrance/Established_risk_allele', - ], - ), - }, - ], - hl.tstruct(info=hl.tstruct(CLNSIG=hl.tarray(hl.tstr))), - ) - self.assertListEqual( - parsed_clnsig(ht).collect(), - [ - ['Pathogenic', 'Affects'], - ['Pathogenic/Likely_pathogenic', 'low_penetrance'], - ['Likely_pathogenic', 'low_penetrance', 'association', 'protective'], - ['Likely_pathogenic', 'low_penetrance'], - ['association', 'protective'], - [ - 'Pathogenic/Likely_pathogenic/Established_risk_allele', - 'low_penetrance', - ], - ], - ) - - def test_parsed_and_mapped_clnsigconf(self): - ht = hl.Table.parallelize( - [ - {'info': hl.Struct(CLNSIGCONF=hl.missing(hl.tarray(hl.tstr)))}, - { - 'info': hl.Struct( - CLNSIGCONF=[ - 'Pathogenic(8)|Likely_pathogenic(2)|Pathogenic', - '_low_penetrance(1)|Uncertain_significance(1)', - ], - ), - }, - ], - hl.tstruct(info=hl.tstruct(CLNSIGCONF=hl.tarray(hl.tstr))), - ) - self.assertListEqual( - parsed_and_mapped_clnsigconf(ht).collect(), - [ - None, - [ - hl.Struct(count=9, pathogenicity_id=0), - hl.Struct(count=2, pathogenicity_id=5), - hl.Struct(count=1, pathogenicity_id=12), - ], - ], - ) - - @mock.patch( - 'v03_pipeline.lib.reference_data.clinvar.hl.import_table', - ) - def test_import_submission_table(self, mock_import_table): - mock_import_table.return_value = hl.Table.parallelize( - [ - { - '#VariationID': '5', - 'Submitter': 'OMIM', - 'ReportedPhenotypeInfo': 'C3661900:not provided', - }, - { - '#VariationID': '5', - 'Submitter': 'Broad Institute Rare Disease Group, Broad Institute', - 'ReportedPhenotypeInfo': 'C0023264:Leigh 
syndrome', - }, - { - '#VariationID': '5', - 'Submitter': 'PreventionGenetics, part of Exact Sciences', - 'ReportedPhenotypeInfo': 'na:FOXRED1-related condition', - }, - { - '#VariationID': '5', - 'Submitter': 'Invitae', - 'ReportedPhenotypeInfo': 'C4748791:Mitochondrial complex 1 deficiency, nuclear type 19', - }, - { - '#VariationID': '6', - 'Submitter': 'A', - 'ReportedPhenotypeInfo': 'na:B', - }, - ], - ) - ht = import_submission_table('mock_file_name') - self.assertEqual( - ht.collect(), - [ - hl.Struct( - VariationID='5', - Submitters=[ - 'OMIM', - 'Broad Institute Rare Disease Group, Broad Institute', - 'PreventionGenetics, part of Exact Sciences', - 'Invitae', - ], - Conditions=[ - 'C3661900:not provided', - 'C0023264:Leigh syndrome', - 'na:FOXRED1-related condition', - 'C4748791:Mitochondrial complex 1 deficiency, nuclear type 19', - ], - ), - hl.Struct( - VariationID='6', - Submitters=['A'], - Conditions=['na:B'], - ), - ], - ) - - @mock.patch( - 'v03_pipeline.lib.reference_data.clinvar.download_and_import_clinvar_submission_summary', - ) - def test_join_to_submission_summary_ht( - self, - mock_download, - ): - vcf_ht = hl.Table.parallelize( - [ - { - 'locus': hl.Locus( - contig='chr1', - position=871269, - reference_genome='GRCh38', - ), - 'alleles': ['A', 'C'], - 'rsid': '5', - 'info': hl.Struct(ALLELEID=1), - }, - { - 'locus': hl.Locus( - contig='chr1', - position=871269, - reference_genome='GRCh38', - ), - 'alleles': ['A', 'AC'], - 'rsid': '7', - 'info': hl.Struct(ALLELEID=1), - }, - ], - hl.tstruct( - locus=hl.tlocus('GRCh38'), - alleles=hl.tarray(hl.tstr), - rsid=hl.tstr, - info=hl.tstruct(ALLELEID=hl.tint32), - ), - ) - submitters_ht = hl.Table.parallelize( - [ - { - 'VariationID': '5', - 'Submitters': [ - 'OMIM', - 'Broad Institute Rare Disease Group, Broad Institute', - 'PreventionGenetics, part of Exact Sciences', - 'Invitae', - ], - 'Conditions': [ - 'C3661900:not provided', - 'C0023264:Leigh syndrome', - 'na:FOXRED1-related condition', - 
'C4748791:Mitochondrial complex 1 deficiency, nuclear type 19', - ], - }, - {'VariationID': '6', 'Submitters': ['A'], 'Conditions': ['na:B']}, - ], - hl.tstruct( - VariationID=hl.tstr, - Submitters=hl.tarray(hl.tstr), - Conditions=hl.tarray(hl.tstr), - ), - key='VariationID', - ) - expected_clinvar_ht_rows = [ - hl.Struct( - locus=hl.Locus( - contig='chr1', - position=871269, - reference_genome='GRCh38', - ), - alleles=['A', 'C'], - rsid='5', - info=hl.Struct(ALLELEID=1), - submitters=[ - 'OMIM', - 'Broad Institute Rare Disease Group, Broad Institute', - 'PreventionGenetics, part of Exact Sciences', - 'Invitae', - ], - conditions=[ - 'C3661900:not provided', - 'C0023264:Leigh syndrome', - 'na:FOXRED1-related condition', - 'C4748791:Mitochondrial complex 1 deficiency, nuclear type 19', - ], - ), - hl.Struct( - locus=hl.Locus( - contig='chr1', - position=871269, - reference_genome='GRCh38', - ), - alleles=['A', 'AC'], - rsid='7', - info=hl.Struct(ALLELEID=1), - submitters=None, - conditions=None, - ), - ] - - mock_download.return_value = submitters_ht - ht = join_to_submission_summary_ht(vcf_ht) - self.assertEqual( - ht.collect(), - expected_clinvar_ht_rows, - ) diff --git a/v03_pipeline/lib/reference_data/compare_globals.py b/v03_pipeline/lib/reference_data/compare_globals.py deleted file mode 100644 index c295b3a35..000000000 --- a/v03_pipeline/lib/reference_data/compare_globals.py +++ /dev/null @@ -1,137 +0,0 @@ -import dataclasses - -import hail as hl - -from v03_pipeline.lib.logger import get_logger -from v03_pipeline.lib.model import ( - DatasetType, - ReferenceGenome, -) -from v03_pipeline.lib.reference_data.clinvar import parse_clinvar_release_date -from v03_pipeline.lib.reference_data.config import CONFIG -from v03_pipeline.lib.reference_data.dataset_table_operations import ( - get_all_select_fields, - get_enum_select_fields, - import_ht_from_config_path, -) - -logger = get_logger(__name__) - - -def clinvar_versions_equal( - ht: hl.Table, - reference_genome: 
ReferenceGenome, - dataset_type: DatasetType, -) -> bool: - dataset = 'clinvar_mito' if dataset_type == DatasetType.MITO else 'clinvar' - return hl.eval(ht.globals.versions[dataset]) == parse_clinvar_release_date( - CONFIG[dataset][reference_genome.v02_value], - ) - - -@dataclasses.dataclass -class Globals: - paths: dict[str, str] - versions: dict[str, str] - enums: dict[str, dict[str, list[str]]] - selects: dict[str, dict[str, hl.dtype]] - - def __getitem__(self, name: str): - return getattr(self, name) - - @classmethod - def from_dataset_configs( - cls, - reference_genome: ReferenceGenome, - datasets: list[str], - ): - paths, versions, enums, selects = {}, {}, {}, {} - for dataset in datasets: - dataset_config = CONFIG[dataset][reference_genome.v02_value] - dataset_ht = import_ht_from_config_path( - dataset_config, - dataset, - reference_genome, - ) - dataset_ht_globals = hl.eval(dataset_ht.globals) - paths[dataset] = dataset_ht_globals.path - versions[dataset] = dataset_ht_globals.version - enums[dataset] = dict(dataset_ht_globals.enums) - dataset_ht = dataset_ht.select( - **get_all_select_fields(dataset_ht, dataset_config), - ) - dataset_ht = dataset_ht.transmute( - **get_enum_select_fields(dataset_ht, dataset_config), - ) - selects[dataset] = { - k: v.dtype - for k, v in dict(dataset_ht.row).items() - if k not in set(dataset_ht.key) - } - return cls(paths, versions, enums, selects) - - @classmethod - def from_ht( - cls, - ht: hl.Table, - datasets: list[str], - ): - rdc_globals_struct = hl.eval(ht.globals) - paths = dict(rdc_globals_struct.paths) - versions = dict(rdc_globals_struct.versions) - # enums are nested structs - enums = {k: dict(v) for k, v in rdc_globals_struct.enums.items() if k in paths} - selects = {} - for dataset in datasets: - if dataset in ht.row: - # NB: handle an edge case (mito high constraint) where we annotate a bool from the reference dataset collection - selects[dataset] = ( - {k: v.dtype for k, v in dict(ht[dataset]).items()} - if 
isinstance(ht[dataset], hl.StructExpression) - else {} - ) - return cls(paths, versions, enums, selects) - - -def validate_selects_types( - ht1_globals: Globals, - ht2_globals: Globals, - dataset: str, -) -> None: - # Assert that all shared annotations have identical types - shared_selects = ( - ht1_globals['selects'][dataset].keys() - & ht2_globals['selects'].get(dataset).keys() - ) - mismatched_select_types = [ - (select, ht2_globals['selects'][dataset][select]) - for select in shared_selects - if ( - ht1_globals['selects'][dataset][select] - != ht2_globals['selects'][dataset][select] - ) - ] - if mismatched_select_types: - msg = f'Unexpected field types detected in {dataset}: {mismatched_select_types}' - raise ValueError(msg) - - -def get_datasets_to_update( - ht1_globals: Globals, - ht2_globals: Globals, - validate_selects: bool = True, -) -> list[str]: - datasets_to_update = set() - for field in dataclasses.fields(Globals): - if field.name == 'selects' and not validate_selects: - continue - datasets_to_update.update( - ht1_globals[field.name].keys() ^ ht2_globals[field.name].keys(), - ) - for dataset in ht1_globals[field.name].keys() & ht2_globals[field.name].keys(): - if field.name == 'selects': - validate_selects_types(ht1_globals, ht2_globals, dataset) - if ht1_globals[field.name][dataset] != ht2_globals[field.name][dataset]: - logger.info(f'{field.name} mismatch for {dataset}') - datasets_to_update.add(dataset) - return sorted(datasets_to_update) diff --git a/v03_pipeline/lib/reference_data/compare_globals_test.py b/v03_pipeline/lib/reference_data/compare_globals_test.py deleted file mode 100644 index 786964fcb..000000000 --- a/v03_pipeline/lib/reference_data/compare_globals_test.py +++ /dev/null @@ -1,321 +0,0 @@ -import unittest -from unittest import mock - -import hail as hl - -from v03_pipeline.lib.model import ( - ReferenceGenome, -) -from v03_pipeline.lib.reference_data.compare_globals import ( - Globals, - get_datasets_to_update, -) - -CONFIG = { - 
'a': { - '38': { - 'custom_import': None, - 'source_path': 'a_path', # 'a' has a custom import - 'select': { - 'test_select': 'info.test_select', - 'test_enum': 'test_enum', - }, - 'version': 'a_version', - 'enum_select': {'test_enum': ['A', 'B']}, - }, - }, - 'b': { # b is missing version - '38': { - 'path': 'b_path', - 'select': { - 'test_select': 'info.test_select', - 'test_enum': 'test_enum', - }, - 'enum_select': {'test_enum': ['C', 'D']}, - 'custom_select': lambda ht: {'field2': ht.info.test_select_2}, - }, - }, -} - -B_TABLE = hl.Table.parallelize( - [], - schema=hl.tstruct( - locus=hl.tlocus('GRCh38'), - alleles=hl.tarray(hl.tstr), - info=hl.tstruct( - test_select=hl.tint, - test_select_2=hl.tint, - ), - test_enum=hl.tstr, - ), - globals=hl.Struct( - version='b_version', - path='b_path', - enums=hl.Struct(test_enum=['C', 'D']), - ), - key=['locus', 'alleles'], -) - - -class CompareGlobalsTest(unittest.TestCase): - @mock.patch.dict('v03_pipeline.lib.reference_data.compare_globals.CONFIG', CONFIG) - @mock.patch( - 'v03_pipeline.lib.reference_data.compare_globals.import_ht_from_config_path', - ) - def test_create_globals_from_dataset_configs( - self, - mock_import_dataset_ht, - ): - mock_import_dataset_ht.side_effect = [ - hl.Table.parallelize( - [], - schema=hl.tstruct( - locus=hl.tlocus('GRCh38'), - alleles=hl.tarray(hl.tstr), - info=hl.tstruct( - test_select=hl.tint, - ), - test_enum=hl.tstr, - ), - globals=hl.Struct( - version='a_version', - path='a_path', - enums=hl.Struct(test_enum=['A', 'B']), - ), - key=['locus', 'alleles'], - ), - B_TABLE, - ] - dataset_config_globals = Globals.from_dataset_configs( - reference_genome=ReferenceGenome.GRCh38, - datasets=['a', 'b'], - ) - self.assertTrue( - dataset_config_globals.versions == {'a': 'a_version', 'b': 'b_version'}, - ) - self.assertTrue( - dataset_config_globals.paths == {'a': 'a_path', 'b': 'b_path'}, - ) - self.assertTrue( - dataset_config_globals.enums - == {'a': {'test_enum': ['A', 'B']}, 'b': 
{'test_enum': ['C', 'D']}}, - ) - self.assertTrue( - dataset_config_globals.selects - == { - 'a': { - 'test_select': hl.tint32, - 'test_enum_id': hl.tint32, - }, - 'b': { - 'test_select': hl.tint32, - 'field2': hl.tint32, - 'test_enum_id': hl.tint32, - }, - }, - ) - - @mock.patch.dict('v03_pipeline.lib.reference_data.compare_globals.CONFIG', CONFIG) - @mock.patch( - 'v03_pipeline.lib.reference_data.dataset_table_operations.hl.read_table', - ) - def test_create_globals_from_dataset_configs_single_dataset(self, mock_read_table): - # by mocking hl.read_table() (only possible for a dataset without a custom import), - # we can test the code inside import_ht_from_config_path() - mock_read_table.return_value = B_TABLE - - dataset_config_globals = Globals.from_dataset_configs( - reference_genome=ReferenceGenome.GRCh38, - datasets=['b'], - ) - - self.assertTrue( - dataset_config_globals.versions == {'b': 'b_version'}, - ) - self.assertTrue( - dataset_config_globals.paths == {'b': 'b_path'}, - ) - self.assertTrue( - dataset_config_globals.enums == {'b': {'test_enum': ['C', 'D']}}, - ) - self.assertTrue( - dataset_config_globals.selects - == { - 'b': { - 'test_select': hl.tint32, - 'field2': hl.tint32, - 'test_enum_id': hl.tint32, - }, - }, - ) - - def test_from_rdc_or_annotations_ht(self): - rdc_ht = hl.Table.parallelize( - [], - schema=hl.tstruct( - locus=hl.tlocus('GRCh38'), - alleles=hl.tarray(hl.tstr), - gnomad_non_coding_constraint=hl.tstruct( - z_score=hl.tfloat32, - ), - screen=hl.tstruct( - region_type_ids=hl.tarray(hl.tint32), - ), - ), - globals=hl.Struct( - paths=hl.Struct( - gnomad_non_coding_constraint='gnomad_non_coding_constraint_path', - screen='screen_path', - ), - versions=hl.Struct( - gnomad_non_coding_constraint='v1', - screen='v2', - ), - enums=hl.Struct( - screen=hl.Struct(region_type=['C', 'D']), - ), - ), - ) - rdc_globals = Globals.from_ht( - rdc_ht, - ['gnomad_non_coding_constraint', 'screen'], - ) - self.assertTrue( - rdc_globals.versions - == 
{'gnomad_non_coding_constraint': 'v1', 'screen': 'v2'}, - ) - self.assertTrue( - rdc_globals.paths - == { - 'gnomad_non_coding_constraint': 'gnomad_non_coding_constraint_path', - 'screen': 'screen_path', - }, - ) - self.assertTrue( - rdc_globals.enums == {'screen': {'region_type': ['C', 'D']}}, - ) - self.assertTrue( - rdc_globals.selects - == { - 'gnomad_non_coding_constraint': {'z_score': hl.tfloat32}, - 'screen': {'region_type_ids': hl.tarray(hl.tint32)}, - }, - ) - - def test_get_datasets_to_update_version_different(self): - result = get_datasets_to_update( - ht1_globals=Globals( - paths={'a': 'a_path', 'b': 'b_path'}, - # 'a' has a different version, 'c' is missing version in ht2_globals - versions={'a': 'v2', 'b': 'v2', 'c': 'v1'}, - enums={'a': {}, 'b': {}, 'c': {}}, - selects={'a': {}, 'b': {}}, - ), - ht2_globals=Globals( - paths={'a': 'a_path', 'b': 'b_path'}, - versions={'a': 'v1', 'b': 'v2'}, - enums={'a': {}, 'b': {}}, - selects={'a': {}, 'b': {}}, - ), - ) - self.assertTrue(result == ['a', 'c']) - - def test_get_datasets_to_update_path_different(self): - result = get_datasets_to_update( - ht1_globals=Globals( - # 'b' has a different path, 'c' is missing path in ht2_globals - paths={'a': 'a_path', 'b': 'old_b_path', 'c': 'extra_c_path'}, - versions={'a': 'v1', 'b': 'v2'}, - enums={'a': {}, 'b': {}}, - selects={'a': {}, 'b': {}}, - ), - ht2_globals=Globals( - paths={'a': 'a_path', 'b': 'b_path'}, - versions={'a': 'v1', 'b': 'v2'}, - enums={'a': {}, 'b': {}}, - selects={'a': {}, 'b': {}}, - ), - ) - self.assertTrue(result == ['b', 'c']) - - def test_get_datasets_to_update_enum_different(self): - result = get_datasets_to_update( - ht1_globals=Globals( - paths={'a': 'a_path', 'b': 'b_path'}, - versions={'a': 'v1', 'b': 'v2'}, - # 'a' has different enum values, 'b' has different enum key, 'c' is missing enum in ht2_globals - enums={ - 'a': {'test_enum': ['A', 'B']}, - 'b': {'enum_key_1': []}, - 'c': {}, - }, - selects={'a': {}, 'b': {}}, - ), - 
ht2_globals=Globals( - paths={'a': 'a_path', 'b': 'b_path'}, - versions={'a': 'v1', 'b': 'v2'}, - enums={'a': {'test_enum': ['C', 'D']}, 'b': {'enum_key_2': []}}, - selects={'a': {}, 'b': {}}, - ), - ) - self.assertTrue(result == ['a', 'b', 'c']) - - def test_get_datasets_to_update_select_different(self): - result = get_datasets_to_update( - ht1_globals=Globals( - paths={'a': 'a_path', 'b': 'b_path'}, - versions={'a': 'v1', 'b': 'v2'}, - enums={'a': {}, 'b': {}}, - # 'a' has extra select, 'b' has different select, 'c' is missing select in ht2_globals - selects={ - 'a': {'field1': hl.tint32, 'field2': hl.tint32}, - 'b': {'test_select': hl.tint32}, - 'c': {'test_select': hl.tint32}, - }, - ), - ht2_globals=Globals( - paths={'a': 'a_path', 'b': 'b_path'}, - versions={'a': 'v1', 'b': 'v2'}, - enums={'a': {}, 'b': {}}, - selects={'a': {'field1': hl.tint32}, 'b': {'test_select_2': hl.tint32}}, - ), - ) - self.assertTrue(result == ['a', 'b', 'c']) - - def test_get_datasets_to_update_select_type_validation(self): - self.assertRaisesRegex( - ValueError, - "Unexpected field types detected in a: \\[\\('field1', dtype\\('int32'\\)\\)\\]", - get_datasets_to_update, - ht1_globals=Globals( - paths={'a': 'a_path'}, - versions={'a': 'v1'}, - enums={'a': {}}, - selects={ - 'a': {'field1': hl.tarray(hl.tint32)}, - }, - ), - ht2_globals=Globals( - paths={'a': 'a_path'}, - versions={'a': 'v1'}, - enums={'a': {}}, - selects={'a': {'field1': hl.tint32, 'field2': hl.tint32}}, - ), - ) - result = get_datasets_to_update( - ht1_globals=Globals( - paths={'a': 'a_path'}, - versions={'a': 'v1'}, - enums={'a': {}}, - selects={ - 'a': {'field1': hl.tarray(hl.tint32)}, - }, - ), - ht2_globals=Globals( - paths={'a': 'a_path'}, - versions={'a': 'v1'}, - enums={'a': {}}, - selects={'a': {'field1': hl.tarray(hl.tint32), 'field2': hl.tint32}}, - ), - ) - self.assertTrue(result == ['a']) diff --git a/v03_pipeline/lib/reference_data/config.py b/v03_pipeline/lib/reference_data/config.py deleted file mode 
100644 index 047532c0c..000000000 --- a/v03_pipeline/lib/reference_data/config.py +++ /dev/null @@ -1,549 +0,0 @@ -from typing import Any - -import hail as hl - -from v03_pipeline.lib.annotations.enums import ( - CLINVAR_DEFAULT_PATHOGENICITY, - CLINVAR_PATHOGENICITIES, - CLINVAR_PATHOGENICITIES_LOOKUP, -) -from v03_pipeline.lib.model.definitions import ReferenceGenome -from v03_pipeline.lib.reference_data.clinvar import ( - CLINVAR_ASSERTIONS, - CLINVAR_GOLD_STARS_LOOKUP, - get_clinvar_ht, - parsed_and_mapped_clnsigconf, - parsed_clnsig, -) -from v03_pipeline.lib.reference_data.hgmd import download_and_import_hgmd_vcf -from v03_pipeline.lib.reference_data.mito import ( - download_and_import_local_constraint_tsv, -) - - -def import_locus_intervals( - url: str, - reference_genome: ReferenceGenome, -) -> hl.Table: - return hl.import_locus_intervals(url, reference_genome.value) - - -def import_matrix_table( - url: str, - _: Any, -) -> hl.Table: - return hl.read_matrix_table(url).rows() - - -def predictor_parse(field: hl.StringExpression): - return field.split(';').find(lambda p: p != '.') - - -def clinvar_custom_select(ht): - selects = {} - clnsigs = parsed_clnsig(ht) - selects['pathogenicity'] = hl.if_else( - CLINVAR_PATHOGENICITIES_LOOKUP.contains(clnsigs[0]), - clnsigs[0], - CLINVAR_DEFAULT_PATHOGENICITY, - ) - selects['assertion'] = hl.if_else( - CLINVAR_PATHOGENICITIES_LOOKUP.contains(clnsigs[0]), - clnsigs[1:], - clnsigs, - ) - # NB: the `enum_select` does not support mapping a list of tuples - # so there's a hidden enum-mapping inside this clinvar function. 
- selects['conflictingPathogenicities'] = parsed_and_mapped_clnsigconf(ht) - selects['goldStars'] = CLINVAR_GOLD_STARS_LOOKUP.get(hl.delimit(ht.info.CLNREVSTAT)) - selects['submitters'] = ht.submitters - selects['conditions'] = hl.map( - lambda p: p.split(r':')[1], - ht.conditions, - ) # assumes the format 'MedGen#:condition', e.g.'C0023264:Leigh syndrome' - return selects - - -def dbnsfp_custom_select(ht): - selects = {} - selects['REVEL_score'] = hl.parse_float32(ht.REVEL_score) - selects['SIFT_score'] = hl.parse_float32(predictor_parse(ht.SIFT_score)) - selects['Polyphen2_HVAR_score'] = hl.parse_float32( - predictor_parse(ht.Polyphen2_HVAR_score), - ) - selects['MutationTaster_pred'] = predictor_parse(ht.MutationTaster_pred) - return selects - - -def dbnsfp_custom_select_38(ht): - selects = dbnsfp_custom_select(ht) - selects['VEST4_score'] = hl.parse_float32(predictor_parse(ht.VEST4_score)) - selects['MutPred_score'] = hl.parse_float32(ht.MutPred_score) - selects['fathmm_MKL_coding_score'] = hl.float32(ht.fathmm_MKL_coding_score) - return selects - - -def dbnsfp_mito_custom_select(ht): - selects = {} - selects['SIFT_score'] = hl.parse_float32(predictor_parse(ht.SIFT_score)) - selects['MutationTaster_pred'] = predictor_parse(ht.MutationTaster_pred) - return selects - - -def custom_gnomad_mito(ht): - selects = {} - selects['AN'] = hl.int32(ht.AN) - selects['AC_hom'] = hl.int32(ht.AC_hom) - selects['AC_het'] = hl.int32(ht.AC_het) - selects['AF_hom'] = ht.AF_hom - selects['AF_het'] = ht.AF_het - selects['max_hl'] = ht.max_hl - return selects - - -def custom_gnomad_select_v2(ht): - """ - Custom select for public gnomad v2 dataset (which we did not generate). Extracts fields like - 'AF', 'AN', and generates 'hemi'. 
- :param ht: hail table - :return: select expression dict - """ - selects = {} - global_idx = hl.eval(ht.globals.freq_index_dict['gnomad']) - selects['AF'] = hl.float32(ht.freq[global_idx].AF) - selects['AN'] = ht.freq[global_idx].AN - selects['AC'] = ht.freq[global_idx].AC - selects['Hom'] = ht.freq[global_idx].homozygote_count - - selects['AF_POPMAX_OR_GLOBAL'] = hl.float32( - hl.or_else( - ht.popmax[ht.globals.popmax_index_dict['gnomad']].AF, - ht.freq[global_idx].AF, - ), - ) - selects['FAF_AF'] = hl.float32(ht.faf[ht.globals.popmax_index_dict['gnomad']].faf95) - selects['Hemi'] = hl.if_else( - ht.locus.in_autosome_or_par(), - 0, - ht.freq[ht.globals.freq_index_dict['gnomad_male']].AC, - ) - return selects - - -def custom_gnomad_select_v4(ht): - """ - Custom select for public gnomad v4 dataset (which we did not generate). Extracts fields like - 'AF', 'AN', and generates 'hemi'. - :param ht: hail table - :return: select expression dict - """ - selects = {} - global_idx = hl.eval(ht.globals.freq_index_dict['adj']) - selects['AF'] = hl.float32(ht.freq[global_idx].AF) - selects['AN'] = ht.freq[global_idx].AN - selects['AC'] = ht.freq[global_idx].AC - selects['Hom'] = ht.freq[global_idx].homozygote_count - - grpmax_af = ht.grpmax['gnomad'].AF if hasattr(ht.grpmax, 'gnomad') else ht.grpmax.AF - selects['AF_POPMAX_OR_GLOBAL'] = hl.float32( - hl.or_else(grpmax_af, ht.freq[global_idx].AF), - ) - selects['FAF_AF'] = hl.float32(ht.faf[ht.globals.faf_index_dict['adj']].faf95) - selects['Hemi'] = hl.if_else( - ht.locus.in_autosome_or_par(), - 0, - ht.freq[ht.globals.freq_index_dict['XY_adj']].AC, - ) - return selects - - -def custom_mpc_select(ht): - selects = {} - selects['MPC'] = hl.parse_float32(ht.info.MPC) - return selects - - -""" -Configurations of dataset to combine. 
-Format: -'': { - '': { - 'path': 'gs://path/to/hailtable.ht', - 'select': '', - 'custom_select': '', - 'enum_select': '' - 'custom_import': '', - 'source_path': '' - }, -""" -CONFIG = { - 'cadd': { - '37': { - 'version': 'v1.6', - 'path': 'gs://seqr-reference-data/GRCh37/CADD/CADD_snvs_and_indels.v1.6.ht', - 'select': ['PHRED'], - }, - '38': { - 'version': 'v1.6', - 'path': 'gs://seqr-reference-data/GRCh38/CADD/CADD_snvs_and_indels.v1.6.ht', - 'select': ['PHRED'], - }, - }, - 'clinvar': { - '37': { - 'custom_import': get_clinvar_ht, - 'source_path': 'https://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh37/clinvar.vcf.gz', - 'select': {'alleleId': 'info.ALLELEID'}, - 'custom_select': clinvar_custom_select, - 'enum_select': { - 'pathogenicity': CLINVAR_PATHOGENICITIES, - 'assertion': CLINVAR_ASSERTIONS, - }, - 'filter': lambda ht: ht.locus.contig != 'MT', - }, - '38': { - 'custom_import': get_clinvar_ht, - 'source_path': 'https://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh38/clinvar.vcf.gz', - 'select': {'alleleId': 'info.ALLELEID'}, - 'custom_select': clinvar_custom_select, - 'enum_select': { - 'pathogenicity': CLINVAR_PATHOGENICITIES, - 'assertion': CLINVAR_ASSERTIONS, - }, - 'filter': lambda ht: ht.locus.contig != 'chrM', - }, - }, - 'dbnsfp': { - '37': { - 'version': '2.9.3', - 'path': 'gs://seqr-reference-data/GRCh37/dbNSFP/v2.9.3/dbNSFP2.9.3_variant.with_new_scores.ht', - 'custom_select': dbnsfp_custom_select, - 'enum_select': { - 'MutationTaster_pred': ['D', 'A', 'N', 'P'], - }, - 'filter': lambda ht: ht.locus.contig != 'MT', - }, - '38': { - 'version': '4.2', - 'path': 'gs://seqr-reference-data/GRCh38/dbNSFP/v4.2/dbNSFP4.2a_variant.with_new_scores.ht', - 'custom_select': dbnsfp_custom_select_38, - 'enum_select': { - 'MutationTaster_pred': ['D', 'A', 'N', 'P'], - }, - 'filter': lambda ht: ht.locus.contig != 'chrM', - }, - }, - 'eigen': { - '37': { - 'path': 'gs://seqr-reference-data/GRCh37/eigen/EIGEN_coding_noncoding.grch37.ht', - 'select': {'Eigen_phred': 
'info.Eigen-phred'}, - }, - '38': { - 'path': 'gs://seqr-reference-data/GRCh38/eigen/EIGEN_coding_noncoding.liftover_grch38.ht', - 'select': {'Eigen_phred': 'info.Eigen-phred'}, - }, - }, - 'hgmd': { - '37': { - 'custom_import': download_and_import_hgmd_vcf, - 'version': 'HGMD_Pro_2023', - 'source_path': 'gs://seqr-reference-data-private/GRCh37/HGMD/HGMD_Pro_2023.1_hg19.vcf.gz', - 'select': {'accession': 'rsid', 'class': 'info.CLASS'}, - 'enum_select': { - 'class': [ - 'DM', - 'DM?', - 'DP', - 'DFP', - 'FP', - 'R', - ], - }, - }, - '38': { - 'custom_import': download_and_import_hgmd_vcf, - 'version': 'HGMD_Pro_2023', - 'source_path': 'gs://seqr-reference-data-private/GRCh38/HGMD/HGMD_Pro_2023.1_hg38.vcf.gz', - 'select': {'accession': 'rsid', 'class': 'info.CLASS'}, - 'enum_select': { - 'class': [ - 'DM', - 'DM?', - 'DP', - 'DFP', - 'FP', - 'R', - ], - }, - }, - }, - 'mpc': { - '37': { - 'path': 'gs://seqr-reference-data/GRCh37/MPC/fordist_constraint_official_mpc_values.ht', - 'custom_select': custom_mpc_select, - }, - '38': { - 'path': 'gs://seqr-reference-data/GRCh38/MPC/fordist_constraint_official_mpc_values.liftover.GRCh38.ht', - 'custom_select': custom_mpc_select, - }, - }, - 'primate_ai': { - '37': { - 'version': 'v0.2', - 'path': 'gs://seqr-reference-data/GRCh37/primate_ai/PrimateAI_scores_v0.2.ht', - 'select': {'score': 'info.score'}, - }, - '38': { - 'version': 'v0.2', - 'path': 'gs://seqr-reference-data/GRCh38/primate_ai/PrimateAI_scores_v0.2.liftover_grch38.ht', - 'select': {'score': 'info.score'}, - }, - }, - 'splice_ai': { - '37': { - 'path': 'gs://seqr-reference-data/GRCh37/spliceai/spliceai_scores.ht', - 'select': { - 'delta_score': 'info.max_DS', - 'splice_consequence': 'info.splice_consequence', - }, - 'enum_select': { - 'splice_consequence': [ - 'Acceptor gain', - 'Acceptor loss', - 'Donor gain', - 'Donor loss', - 'No consequence', - ], - }, - }, - '38': { - 'path': 'gs://seqr-reference-data/GRCh38/spliceai/spliceai_scores.ht', - 'select': { - 
'delta_score': 'info.max_DS', - 'splice_consequence': 'info.splice_consequence', - }, - 'enum_select': { - 'splice_consequence': [ - 'Acceptor gain', - 'Acceptor loss', - 'Donor gain', - 'Donor loss', - 'No consequence', - ], - }, - }, - }, - 'topmed': { - '37': { - 'path': 'gs://seqr-reference-data/GRCh37/TopMed/bravo-dbsnp-all.removed_chr_prefix.liftunder_GRCh37.ht', - 'select': { - 'AC': 'info.AC#', - 'AF': 'info.AF#', - 'AN': 'info.AN', - 'Hom': 'info.Hom#', - 'Het': 'info.Het#', - }, - }, - '38': { - 'path': 'gs://seqr-reference-data/GRCh38/TopMed/freeze8/TOPMed.all.ht', - 'select': { - 'AC': 'info.AC', - 'AF': 'info.AF', - 'AN': 'info.AN', - 'Hom': 'info.Hom', - 'Het': 'info.Het', - }, - }, - }, - 'gnomad_exomes': { - '37': { - 'version': 'r2.1.1', - 'path': 'gs://gcp-public-data--gnomad/release/2.1.1/ht/exomes/gnomad.exomes.r2.1.1.sites.ht', - 'custom_select': custom_gnomad_select_v2, - }, - '38': { - 'version': '4.1', - 'path': 'gs://gcp-public-data--gnomad/release/4.1/ht/exomes/gnomad.exomes.v4.1.sites.ht', - 'custom_select': custom_gnomad_select_v4, - }, - }, - 'gnomad_genomes': { - '37': { - 'version': 'r2.1.1', - 'path': 'gs://gcp-public-data--gnomad/release/2.1.1/ht/genomes/gnomad.genomes.r2.1.1.sites.ht', - 'custom_select': custom_gnomad_select_v2, - }, - '38': { - 'version': '4.1', - 'path': 'gs://gcp-public-data--gnomad/release/4.1/ht/genomes/gnomad.genomes.v4.1.sites.ht', - 'custom_select': custom_gnomad_select_v4, - }, - }, - 'gnomad_qc': { - '37': { - 'version': 'v2', - 'custom_import': import_matrix_table, - # Note: copied from 'gs://gnomad/sample_qc/mt/gnomad.joint.high_callrate_common_biallelic_snps.pruned.mt' - 'source_path': 'gs://seqr-reference-data/gnomad_qc/GRCh37/gnomad.joint.high_callrate_common_biallelic_snps.pruned.mt', - }, - '38': { - 'version': '4.0', - 'path': 'gs://gcp-public-data--gnomad/release/4.0/pca/gnomad.v4.0.pca_loadings.ht', - }, - }, - 'exac': { - '37': { - 'path': 
'gs://seqr-reference-data/GRCh37/gnomad/ExAC.r1.sites.vep.ht', - 'select': { - 'AF_POPMAX': 'info.AF_POPMAX', - 'AF': 'info.AF#', - 'AC_Adj': 'info.AC_Adj#', - 'AC_Het': 'info.AC_Het#', - 'AC_Hom': 'info.AC_Hom#', - 'AC_Hemi': 'info.AC_Hemi#', - 'AN_Adj': 'info.AN_Adj', - }, - }, - '38': { - 'path': 'gs://seqr-reference-data/GRCh38/gnomad/ExAC.r1.sites.liftover.b38.ht', - 'select': { - 'AF_POPMAX': 'info.AF_POPMAX', - 'AF': 'info.AF#', - 'AC_Adj': 'info.AC_Adj#', - 'AC_Het': 'info.AC_Het#', - 'AC_Hom': 'info.AC_Hom#', - 'AC_Hemi': 'info.AC_Hemi#', - 'AN_Adj': 'info.AN_Adj', - }, - }, - }, - 'gnomad_non_coding_constraint': { - '38': { - 'path': 'gs://seqr-reference-data/GRCh38/gnomad_nc_constraint/gnomad_non-coding_constraint_z_scores.ht', - 'select': {'z_score': 'target'}, - }, - }, - 'screen': { - '38': { - 'path': 'gs://seqr-reference-data/GRCh38/ccREs/GRCh38-ccREs.ht', - 'select': {'region_type': 'target'}, - 'enum_select': { - 'region_type': [ - 'CTCF-bound', - 'CTCF-only', - 'DNase-H3K4me3', - 'PLS', - 'dELS', - 'pELS', - 'DNase-only', - 'low-DNase', - ], - }, - }, - }, - 'clinvar_mito': { - '37': { - 'custom_import': get_clinvar_ht, - 'source_path': 'https://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh37/clinvar.vcf.gz', - 'select': {'alleleId': 'info.ALLELEID'}, - 'custom_select': clinvar_custom_select, - 'enum_select': { - 'pathogenicity': CLINVAR_PATHOGENICITIES, - 'assertion': CLINVAR_ASSERTIONS, - }, - 'filter': lambda ht: ht.locus.contig == 'MT', - }, - '38': { - 'custom_import': get_clinvar_ht, - 'source_path': 'https://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh38/clinvar.vcf.gz', - 'select': {'alleleId': 'info.ALLELEID'}, - 'custom_select': clinvar_custom_select, - 'enum_select': { - 'pathogenicity': CLINVAR_PATHOGENICITIES, - 'assertion': CLINVAR_ASSERTIONS, - }, - 'filter': lambda ht: ht.locus.contig == 'chrM', - }, - }, - 'dbnsfp_mito': { - '37': { - 'version': '2.9.3', - 'path': 
'gs://seqr-reference-data/GRCh37/dbNSFP/v2.9.3/dbNSFP2.9.3_variant.with_new_scores.ht', - 'custom_select': dbnsfp_mito_custom_select, - 'enum_select': { - 'MutationTaster_pred': ['D', 'A', 'N', 'P'], - }, - 'filter': lambda ht: ht.locus.contig == 'MT', - }, - '38': { - 'version': '4.2', - 'path': 'gs://seqr-reference-data/GRCh38/dbNSFP/v4.2/dbNSFP4.2a_variant.with_new_scores.ht', - 'custom_select': dbnsfp_mito_custom_select, - 'enum_select': { - 'MutationTaster_pred': ['D', 'A', 'N', 'P'], - }, - 'filter': lambda ht: ht.locus.contig == 'chrM', - }, - }, - 'gnomad_mito': { - '38': { - 'version': 'v3.1', - 'path': 'gs://gcp-public-data--gnomad/release/3.1/ht/genomes/gnomad.genomes.v3.1.sites.chrM.ht', - 'custom_select': custom_gnomad_mito, - }, - }, - 'mitomap': { - '38': { - 'version': 'Feb. 04 2022', - 'path': 'gs://seqr-reference-data/GRCh38/mitochondrial/MITOMAP/mitomap-confirmed-mutations-2022-02-04.ht', - 'select': ['pathogenic'], - }, - }, - 'mitimpact': { - '38': { - 'version': '3.1.3', - 'path': 'gs://seqr-reference-data/GRCh38/mitochondrial/MitImpact/MitImpact_db_3.1.3.ht', - 'select': {'score': 'APOGEE2_score'}, - }, - }, - 'hmtvar': { - '38': { - 'version': 'Jan. 
10 2022', - 'path': 'gs://seqr-reference-data/GRCh38/mitochondrial/HmtVar/HmtVar%20Jan.%2010%202022.ht', - 'select': {'score': 'disease_score'}, - }, - }, - 'helix_mito': { - '38': { - 'version': '20200327', - 'path': 'gs://seqr-reference-data/GRCh38/mitochondrial/Helix/HelixMTdb_20200327.ht', - 'select': { - 'AC_hom': 'counts_hom', - 'AF_hom': 'AF_hom', - 'AC_het': 'counts_het', - 'AF_het': 'AF_het', - 'AN': 'AN', - 'max_hl': 'max_ARF', - }, - }, - }, - 'high_constraint_region_mito': { - '38': { - 'version': 'Feb-15-2022', - 'source_path': 'gs://seqr-reference-data/GRCh38/mitochondrial/Helix high constraint intervals Feb-15-2022.tsv', - 'custom_import': import_locus_intervals, - }, - }, - 'local_constraint_mito': { - '38': { - 'version': '2024-07-24', - # Originally sourced from https://www.biorxiv.org/content/10.1101/2022.12.16.520778v2.supplementary-material - # Supplementary Table 7. - 'source_path': 'gs://seqr-reference-data/GRCh38/mitochondrial/local_constraint.tsv', - 'custom_import': download_and_import_local_constraint_tsv, - 'select': {'score': 'MLC_score'}, - }, - }, -} diff --git a/v03_pipeline/lib/reference_data/dataset_table_operations.py b/v03_pipeline/lib/reference_data/dataset_table_operations.py deleted file mode 100644 index 7d4d44b67..000000000 --- a/v03_pipeline/lib/reference_data/dataset_table_operations.py +++ /dev/null @@ -1,218 +0,0 @@ -from datetime import datetime -from types import FunctionType - -import hail as hl -import pytz - -from v03_pipeline.lib.misc.nested_field import parse_nested_field -from v03_pipeline.lib.model import ( - DatasetType, - ReferenceDatasetCollection, - ReferenceGenome, -) -from v03_pipeline.lib.reference_data.config import CONFIG - - -def update_or_create_joined_ht( - reference_dataset_collection: ReferenceDatasetCollection, - dataset_type: DatasetType, - reference_genome: ReferenceGenome, - datasets: list[str], - joined_ht: hl.Table, -) -> hl.Table: - for dataset in datasets: - # Drop the dataset if it exists. 
- if dataset in joined_ht.row: - joined_ht = joined_ht.drop(dataset) - joined_ht = joined_ht.annotate_globals( - paths=joined_ht.paths.drop(dataset), - versions=joined_ht.versions.drop(dataset), - enums=joined_ht.enums.drop(dataset), - ) - - # Handle cases where a dataset has been dropped OR renamed. - if dataset not in CONFIG: - continue - - # Join the new one! - dataset_ht = get_dataset_ht(dataset, reference_genome) - joined_ht = joined_ht.join(dataset_ht, 'outer') - joined_ht = annotate_dataset_globals(joined_ht, dataset, dataset_ht) - - return joined_ht.filter( - hl.any( - [ - ~hl.is_missing(joined_ht[dataset]) - for dataset in reference_dataset_collection.datasets(dataset_type) - ], - ), - ) - - -def get_dataset_ht( - dataset: str, - reference_genome: ReferenceGenome, -) -> hl.Table: - config = CONFIG[dataset][reference_genome.v02_value] - ht = import_ht_from_config_path(config, dataset, reference_genome) - if hasattr(ht, 'locus'): - ht = ht.filter( - hl.set(reference_genome.standard_contigs).contains(ht.locus.contig), - ) - - ht = ht.filter(config['filter'](ht)) if 'filter' in config else ht - ht = ht.select(**get_all_select_fields(ht, config)) - ht = ht.transmute(**get_enum_select_fields(ht, config)) - return ht.select(**{dataset: ht.row.drop(*ht.key)}).distinct() - - -def get_ht_path(config: dict) -> str: - return config['source_path'] if 'custom_import' in config else config['path'] - - -def import_ht_from_config_path( - config: dict, - dataset: str, - reference_genome: ReferenceGenome, -) -> hl.Table: - path = get_ht_path(config) - ht = ( - config['custom_import'](path, reference_genome) - if 'custom_import' in config - else hl.read_table(path) - ) - return ht.annotate_globals( - path=path, - version=parse_dataset_version(ht, dataset, config), - enums=hl.Struct( - **config.get( - 'enum_select', - hl.missing(hl.tstruct(hl.tstr, hl.tarray(hl.tstr))), - ), - ), - ) - - -def get_select_fields(selects: list | dict | None, base_ht: hl.Table) -> dict: - """ - 
Generic function that takes in a select config and base_ht and generates a - select dict that is generated from traversing the base_ht and extracting the right - annotation. If '#' is included at the end of a select field, the appropriate - biallelic position will be selected (e.g. 'x#' -> x[base_ht.a_index-1]. - :param selects: mapping or list of selections - :param base_ht: base_ht to traverse - :return: select mapping from annotation name to base_ht annotation - """ - select_fields = {} - if selects is None: - return select_fields - if isinstance(selects, list): - select_fields = {selection: base_ht[selection] for selection in selects} - elif isinstance(selects, dict): - for key, val in selects.items(): - expression = parse_nested_field(base_ht, val) - # Parse float64s into float32s to save space! - if expression.dtype == hl.tfloat64: - expression = hl.float32(expression) - select_fields[key] = expression - return select_fields - - -def get_custom_select_fields(custom_select: FunctionType | None, ht: hl.Table) -> dict: - if custom_select is None: - return {} - return custom_select(ht) - - -def get_all_select_fields( - ht: hl.Table, - config: dict, -) -> dict: - return { - **get_select_fields(config.get('select'), ht), - **get_custom_select_fields(config.get('custom_select'), ht), - } - - -def get_enum_select_fields(ht: hl.Table, config: dict) -> dict: - enum_selects = config.get('enum_select') - enum_select_fields = {} - if enum_selects is None: - return enum_select_fields - for field_name, values in enum_selects.items(): - lookup = hl.dict( - hl.enumerate(values, index_first=False).extend( - # NB: adding missing values here allows us to - # hard fail if a mapped key is present and has an unexpected value - # but propagate missing values. - [(hl.missing(hl.tstr), hl.missing(hl.tint32))], - ), - ) - # NB: this conditioning on type is "outside" the hail expression context. 
- if ( - isinstance(ht[field_name].dtype, hl.tarray | hl.tset) - and ht[field_name].dtype.element_type == hl.tstr - ): - enum_select_fields[f'{field_name}_ids'] = ht[field_name].map( - lambda x: lookup[x], # noqa: B023 - ) - else: - enum_select_fields[f'{field_name}_id'] = lookup[ht[field_name]] - return enum_select_fields - - -def parse_dataset_version( - ht: hl.Table, - dataset: str, - config: dict, -) -> hl.StringExpression: - annotated_version = ht.globals.get('version', hl.missing(hl.tstr)) - config_version = config.get('version', hl.missing(hl.tstr)) - return ( - hl.case() - .when(hl.is_missing(config_version), annotated_version) - .when(hl.is_missing(annotated_version), config_version) - .when(annotated_version == config_version, config_version) - .or_error( - hl.format( - 'found mismatching versions for dataset %s. config version: %s, ht version: %s', - dataset, - config_version, - annotated_version, - ), - ) - ) - - -def annotate_dataset_globals(joined_ht: hl.Table, dataset: str, dataset_ht: hl.Table): - return joined_ht.select_globals( - paths=joined_ht.paths.annotate(**{dataset: dataset_ht.index_globals().path}), - versions=joined_ht.versions.annotate( - **{dataset: dataset_ht.index_globals().version}, - ), - enums=joined_ht.enums.annotate(**{dataset: dataset_ht.index_globals().enums}), - date=datetime.now(tz=pytz.timezone('US/Eastern')).isoformat(), - ) - - -def join_hts( - reference_genome: ReferenceGenome, - dataset_type: DatasetType, - reference_dataset_collection: ReferenceDatasetCollection, -): - key_type = reference_dataset_collection.table_key_type(reference_genome) - joined_ht = hl.Table.parallelize( - [], - key_type, - key=key_type.fields, - globals=hl.Struct( - paths=hl.Struct(), - versions=hl.Struct(), - enums=hl.Struct(), - ), - ) - for dataset in reference_dataset_collection.datasets(dataset_type): - dataset_ht = get_dataset_ht(dataset, reference_genome) - joined_ht = joined_ht.join(dataset_ht, 'outer') - joined_ht = 
annotate_dataset_globals(joined_ht, dataset, dataset_ht) - return joined_ht diff --git a/v03_pipeline/lib/reference_data/dataset_table_operations_test.py b/v03_pipeline/lib/reference_data/dataset_table_operations_test.py deleted file mode 100644 index f1376c8ff..000000000 --- a/v03_pipeline/lib/reference_data/dataset_table_operations_test.py +++ /dev/null @@ -1,585 +0,0 @@ -import unittest -from datetime import datetime -from unittest import mock - -import hail as hl -import pytz - -from v03_pipeline.lib.model import ( - DatasetType, - ReferenceDatasetCollection, - ReferenceGenome, -) -from v03_pipeline.lib.reference_data.config import ( - dbnsfp_custom_select, - dbnsfp_mito_custom_select, -) -from v03_pipeline.lib.reference_data.dataset_table_operations import ( - get_dataset_ht, - get_enum_select_fields, - update_or_create_joined_ht, -) - -MOCK_CONFIG = { - 'a': { - '38': { - 'path': '', - 'select': [ - 'd', - ], - }, - }, - 'b': { - '38': { - 'path': '', - 'select': [ - 'e', - ], - 'enum_select': {}, - }, - }, -} -MOCK_JOINED_REFERENCE_DATA_HT = hl.Table.parallelize( - [ - { - 'locus': hl.Locus( - contig='chr1', - position=1, - reference_genome='GRCh38', - ), - 'alleles': ['A', 'C'], - 'a': hl.Struct(d=1), - 'b': hl.Struct(e=2), - }, - { - 'locus': hl.Locus( - contig='chr1', - position=2, - reference_genome='GRCh38', - ), - 'alleles': ['A', 'C'], - 'a': hl.Struct(d=3), - 'b': hl.Struct(e=4), - }, - ], - hl.tstruct( - locus=hl.tlocus('GRCh38'), - alleles=hl.tarray(hl.tstr), - a=hl.tstruct(d=hl.tint32), - b=hl.tstruct(e=hl.tint32), - ), - key=['locus', 'alleles'], - globals=hl.Struct( - paths=hl.Struct( - a='a_path', - b='b_path', - ), - versions=hl.Struct( - a='a_version', - b='b_version', - ), - enums=hl.Struct( - a=hl.Struct(), - b=hl.Struct(), - ), - ), -) -MOCK_A_DATASET_HT = hl.Table.parallelize( - [ - { - 'locus': hl.Locus( - contig='chr1', - position=1, - reference_genome='GRCh38', - ), - 'alleles': ['A', 'C'], - 'a': hl.Struct(d=1), - }, - { - 'locus': 
hl.Locus( - contig='chr1', - position=2, - reference_genome='GRCh38', - ), - 'alleles': ['A', 'C'], - 'a': hl.Struct(d=3), - }, - ], - hl.tstruct( - locus=hl.tlocus('GRCh38'), - alleles=hl.tarray(hl.tstr), - a=hl.tstruct(d=hl.tint32), - ), - key=['locus', 'alleles'], - globals=hl.Struct( - path='a_path', - version='a_version', - enums=hl.Struct(), - ), -) -MOCK_B_DATASET_HT = hl.Table.parallelize( - [ - { - 'locus': hl.Locus( - contig='chr1', - position=1, - reference_genome='GRCh38', - ), - 'alleles': ['A', 'C'], - 'b': hl.Struct(e=5, f=1), - }, - { - 'locus': hl.Locus( - contig='chr1', - position=3, - reference_genome='GRCh38', - ), - 'alleles': ['A', 'C'], - 'b': hl.Struct(e=7, f=2), - }, - ], - hl.tstruct( - locus=hl.tlocus('GRCh38'), - alleles=hl.tarray(hl.tstr), - b=hl.tstruct(e=hl.tint32, f=hl.tint32), - ), - key=['locus', 'alleles'], - globals=hl.Struct( - path='b_new_path', - version='b_new_version', - enums=hl.Struct( - enum_1=[ - 'D', - 'F', - ], - ), - ), -) -EXPECTED_JOINED_DATA = [ - hl.Struct( - locus=hl.Locus( - contig='chr1', - position=1, - reference_genome='GRCh38', - ), - alleles=['A', 'C'], - a=hl.Struct(d=1), - b=hl.Struct(e=5, f=1), - ), - hl.Struct( - locus=hl.Locus( - contig='chr1', - position=2, - reference_genome='GRCh38', - ), - alleles=['A', 'C'], - a=hl.Struct(d=3), - b=None, - ), - hl.Struct( - locus=hl.Locus( - contig='chr1', - position=3, - reference_genome='GRCh38', - ), - alleles=['A', 'C'], - a=None, - b=hl.Struct(e=7, f=2), - ), -] -EXPECTED_GLOBALS = [ - hl.Struct( - date='2023-04-19T16:43:39.361110-04:56', - paths=hl.Struct( - a='a_path', - b='b_new_path', - ), - versions=hl.Struct( - a='a_version', - b='b_new_version', - ), - enums=hl.Struct( - a=hl.Struct(), - b=hl.Struct( - enum_1=[ - 'D', - 'F', - ], - ), - ), - ), -] - -MOCK_DATETIME = datetime( - 2023, - 4, - 19, - 16, - 43, - 39, - 361110, - tzinfo=pytz.timezone('US/Eastern'), -) - -PATH_TO_FILE_UNDER_TEST = 'v03_pipeline.lib.reference_data.dataset_table_operations' - - 
-class DatasetTableOperationsTest(unittest.TestCase): - def test_get_enum_select_fields(self): - ht = hl.Table.parallelize( - [ - {'variant': ['1', '2'], 'sv_type': 'a', 'sample_fix': '1'}, - { - 'variant': ['1', '3', '2'], - 'sv_type': 'b', - 'sample_fix': '2', - }, - {'variant': ['1', '3'], 'sv_type': 'c', 'sample_fix': '3'}, - {'variant': ['4'], 'sv_type': 'd', 'sample_fix': '4'}, - ], - hl.tstruct( - variant=hl.dtype('array'), - sv_type=hl.dtype('str'), - sample_fix=hl.dtype('str'), - ), - ) - enum_select_fields = get_enum_select_fields( - ht, - { - 'enum_select': { - 'variant': ['1', '2', '3', '4'], - 'sv_type': ['a', 'b', 'c', 'd'], - }, - }, - ) - mapped_ht = ht.transmute(**enum_select_fields) - self.assertListEqual( - mapped_ht.collect(), - [ - hl.Struct(variant_ids=[0, 1], sv_type_id=0, sample_fix='1'), - hl.Struct(variant_ids=[0, 2, 1], sv_type_id=1, sample_fix='2'), - hl.Struct(variant_ids=[0, 2], sv_type_id=2, sample_fix='3'), - hl.Struct(variant_ids=[3], sv_type_id=3, sample_fix='4'), - ], - ) - - enum_select_fields = get_enum_select_fields( - ht, - { - 'enum_select': {'sv_type': ['d']}, - }, - ) - mapped_ht = ht.select(**enum_select_fields) - self.assertRaises(Exception, mapped_ht.collect) - - @mock.patch.dict( - f'{PATH_TO_FILE_UNDER_TEST}.CONFIG', - { - 'mock_dbnsfp': { - '38': { - 'path': '', - 'select': [ - 'fathmm_MKL_coding_pred', - ], - 'custom_select': dbnsfp_custom_select, - 'enum_select': { - 'MutationTaster_pred': ['D', 'A', 'N', 'P'], - 'fathmm_MKL_coding_pred': ['D', 'N'], - }, - }, - }, - 'mock_dbnsfp_mito': { - '38': { - 'path': '', - 'custom_select': dbnsfp_mito_custom_select, - 'enum_select': { - 'MutationTaster_pred': ['D', 'A', 'N', 'P'], - }, - 'filter': lambda ht: ht.locus.contig == 'chrM', - }, - }, - }, - ) - @mock.patch(f'{PATH_TO_FILE_UNDER_TEST}.hl.read_table') - def test_dbnsfp_select_and_filter(self, mock_read_table): - mock_read_table.return_value = hl.Table.parallelize( - [ - { - 'locus': hl.Locus( - contig='chr1', - 
position=1, - reference_genome='GRCh38', - ), - 'REVEL_score': hl.missing(hl.tstr), - 'SIFT_score': '.;0.082', - 'Polyphen2_HVAR_score': '.;0.401', - 'MutationTaster_pred': 'P', - 'fathmm_MKL_coding_pred': 'N', - }, - { - 'locus': hl.Locus( - contig='chrM', - position=2, - reference_genome='GRCh38', - ), - 'REVEL_score': '0.052', - 'SIFT_score': '.;0.082', - 'Polyphen2_HVAR_score': '.;0.401', - 'MutationTaster_pred': 'P', - 'fathmm_MKL_coding_pred': 'D', - }, - ], - hl.tstruct( - locus=hl.tlocus('GRCh38'), - REVEL_score=hl.tstr, - SIFT_score=hl.tstr, - Polyphen2_HVAR_score=hl.tstr, - MutationTaster_pred=hl.tstr, - fathmm_MKL_coding_pred=hl.tstr, - ), - key='locus', - ) - ht = get_dataset_ht( - 'mock_dbnsfp', - ReferenceGenome.GRCh38, - ) - self.assertCountEqual( - ht.collect(), - [ - hl.Struct( - locus=hl.Locus( - contig='chr1', - position=1, - reference_genome='GRCh38', - ), - mock_dbnsfp=hl.Struct( - REVEL_score=None, - SIFT_score=hl.eval(hl.float32(0.082)), - Polyphen2_HVAR_score=hl.eval(hl.float32(0.401)), - MutationTaster_pred_id=3, - fathmm_MKL_coding_pred_id=1, - ), - ), - hl.Struct( - locus=hl.Locus( - contig='chrM', - position=2, - reference_genome='GRCh38', - ), - mock_dbnsfp=hl.Struct( - REVEL_score=hl.eval(hl.float32(0.052)), - SIFT_score=hl.eval(hl.float32(0.082)), - Polyphen2_HVAR_score=hl.eval(hl.float32(0.401)), - MutationTaster_pred_id=3, - fathmm_MKL_coding_pred_id=0, - ), - ), - ], - ) - ht = get_dataset_ht( - 'mock_dbnsfp_mito', - ReferenceGenome.GRCh38, - ) - self.assertCountEqual( - ht.collect(), - [ - hl.Struct( - locus=hl.Locus( - contig='chrM', - position=2, - reference_genome='GRCh38', - ), - mock_dbnsfp_mito=hl.Struct( - SIFT_score=hl.eval(hl.float32(0.0820000022649765)), - MutationTaster_pred_id=3, - ), - ), - ], - ) - - @mock.patch.dict( - f'{PATH_TO_FILE_UNDER_TEST}.CONFIG', - { - 'a': { - '38': { - 'path': 'gs://a.com', - 'select': ['b'], - 'version': '2.2.2', - }, - }, - }, - ) - 
@mock.patch(f'{PATH_TO_FILE_UNDER_TEST}.hl.read_table') - def test_parse_version(self, mock_read_table): - ht = hl.Table.parallelize( - [ - { - 'locus': hl.Locus( - contig='chr1', - position=1, - reference_genome='GRCh38', - ), - 'b': 1, - }, - { - 'locus': hl.Locus( - contig='chr1', - position=2, - reference_genome='GRCh38', - ), - 'b': 2, - }, - ], - hl.tstruct( - locus=hl.tlocus('GRCh38'), - b=hl.tint32, - ), - key=['locus'], - globals=hl.Struct( - version='2.2.2', - ), - ) - mock_read_table.return_value = ht - self.assertCountEqual( - get_dataset_ht( - 'a', - ReferenceGenome.GRCh38, - ).globals.collect(), - [ - hl.Struct( - path='gs://a.com', - version='2.2.2', - enums=hl.Struct(), - ), - ], - ) - mock_read_table.return_value = ht.annotate_globals(version=hl.missing(hl.tstr)) - - self.assertCountEqual( - get_dataset_ht( - 'a', - ReferenceGenome.GRCh38, - ).globals.collect(), - [ - hl.Struct( - path='gs://a.com', - version='2.2.2', - enums=hl.Struct(), - ), - ], - ) - - mock_read_table.return_value = ht.annotate_globals(version='1.2.3') - ht = get_dataset_ht( - 'a', - ReferenceGenome.GRCh38, - ) - self.assertRaises(Exception, ht.globals.collect) - - @mock.patch.dict(f'{PATH_TO_FILE_UNDER_TEST}.CONFIG', MOCK_CONFIG) - @mock.patch(f'{PATH_TO_FILE_UNDER_TEST}.get_dataset_ht') - @mock.patch(f'{PATH_TO_FILE_UNDER_TEST}.datetime', wraps=datetime) - @mock.patch.object(ReferenceDatasetCollection, 'datasets') - def test_update_or_create_joined_ht_one_dataset( - self, - mock_reference_dataset_collection_datasets, - mock_datetime, - mock_get_dataset_ht, - ): - mock_reference_dataset_collection_datasets.return_value = ['a', 'b'] - mock_datetime.now.return_value = MOCK_DATETIME - mock_get_dataset_ht.return_value = MOCK_B_DATASET_HT - - ht = update_or_create_joined_ht( - ReferenceDatasetCollection.INTERVAL, - DatasetType.SNV_INDEL, - ReferenceGenome.GRCh38, - datasets=['b'], - joined_ht=MOCK_JOINED_REFERENCE_DATA_HT, - ) - self.assertCountEqual( - ht.collect(), - 
EXPECTED_JOINED_DATA, - ) - self.assertCountEqual(ht.globals.collect(), EXPECTED_GLOBALS) - - @mock.patch.dict(f'{PATH_TO_FILE_UNDER_TEST}.CONFIG', MOCK_CONFIG) - @mock.patch(f'{PATH_TO_FILE_UNDER_TEST}.get_dataset_ht') - @mock.patch(f'{PATH_TO_FILE_UNDER_TEST}.datetime', wraps=datetime) - @mock.patch.object(ReferenceDatasetCollection, 'datasets') - def test_update_or_create_joined_ht_all_datasets( - self, - mock_reference_dataset_collection_datasets, - mock_datetime, - mock_get_dataset_ht, - ): - mock_reference_dataset_collection_datasets.return_value = ['a', 'b'] - mock_datetime.now.return_value = MOCK_DATETIME - mock_get_dataset_ht.side_effect = [MOCK_A_DATASET_HT, MOCK_B_DATASET_HT] - - empty_ht = hl.Table.parallelize( - [], - hl.tstruct( - locus=hl.tlocus(ReferenceGenome.GRCh38.value), - alleles=hl.tarray(hl.tstr), - ), - key=('locus', 'alleles'), - globals=hl.Struct( - paths=hl.Struct(), - versions=hl.Struct(), - enums=hl.Struct(), - ), - ) - - ht = update_or_create_joined_ht( - ReferenceDatasetCollection.COMBINED, - DatasetType.SNV_INDEL, - ReferenceGenome.GRCh38, - datasets=['a', 'b'], - joined_ht=empty_ht, - ) - self.assertCountEqual( - ht.collect(), - EXPECTED_JOINED_DATA, - ) - self.assertCountEqual(ht.globals.collect(), EXPECTED_GLOBALS) - - @mock.patch.dict(f'{PATH_TO_FILE_UNDER_TEST}.CONFIG', MOCK_CONFIG) - @mock.patch.object(ReferenceDatasetCollection, 'datasets') - def test_update_or_create_joined_ht_drop_a_dataset( - self, - mock_reference_dataset_collection_datasets, - ): - mock_reference_dataset_collection_datasets.return_value = ['b'] - ht = hl.Table.parallelize( - [], - hl.tstruct( - locus=hl.tlocus(ReferenceGenome.GRCh38.value), - alleles=hl.tarray(hl.tstr), - c=hl.tint32, - b=hl.tint32, - ), - key=('locus', 'alleles'), - globals=hl.Struct( - paths=hl.Struct(c='abc', b='123'), - versions=hl.Struct(c='def', b='456'), - enums=hl.Struct(c=hl.Struct(d=['a', 'b'])), - ), - ) - ht = update_or_create_joined_ht( - ReferenceDatasetCollection.COMBINED, 
- DatasetType.SNV_INDEL, - ReferenceGenome.GRCh38, - datasets=['c'], - joined_ht=ht, - ) - self.assertCountEqual( - ht.globals.collect(), - [ - hl.Struct( - paths=hl.Struct(b='123'), - versions=hl.Struct(b='456'), - enums=hl.Struct(), - ), - ], - ) diff --git a/v03_pipeline/lib/reference_data/hgmd.py b/v03_pipeline/lib/reference_data/hgmd.py deleted file mode 100644 index d71825888..000000000 --- a/v03_pipeline/lib/reference_data/hgmd.py +++ /dev/null @@ -1,18 +0,0 @@ -import hail as hl - -from v03_pipeline.lib.model.definitions import ReferenceGenome - - -def download_and_import_hgmd_vcf( - hgmd_url: str, - reference_genome: ReferenceGenome, -) -> hl.Table: - mt = hl.import_vcf( - hgmd_url, - reference_genome=reference_genome.value, - force=True, - min_partitions=100, - skip_invalid_loci=True, - contig_recoding=reference_genome.contig_recoding(), - ) - return mt.rows() diff --git a/v03_pipeline/lib/reference_data/hgmd_test.py b/v03_pipeline/lib/reference_data/hgmd_test.py deleted file mode 100644 index cac7d9be5..000000000 --- a/v03_pipeline/lib/reference_data/hgmd_test.py +++ /dev/null @@ -1,12 +0,0 @@ -import unittest - -from v03_pipeline.lib.model import ReferenceGenome -from v03_pipeline.lib.reference_data.hgmd import download_and_import_hgmd_vcf - -TEST_HGMD_VCF = 'v03_pipeline/var/test/reference_data/test_hgmd.vcf' - - -class HGMDTest(unittest.TestCase): - def test_import_hgmd_vcf(self): - ht = download_and_import_hgmd_vcf(TEST_HGMD_VCF, ReferenceGenome.GRCh38) - self.assertEqual(ht.count(), 1) diff --git a/v03_pipeline/lib/reference_data/mito.py b/v03_pipeline/lib/reference_data/mito.py deleted file mode 100644 index 7df647324..000000000 --- a/v03_pipeline/lib/reference_data/mito.py +++ /dev/null @@ -1,16 +0,0 @@ -import hail as hl - -from v03_pipeline.lib.model.definitions import ReferenceGenome - - -def download_and_import_local_constraint_tsv( - url: str, - reference_genome: ReferenceGenome, -) -> hl.Table: - ht = hl.import_table(url, types={'Position': 
hl.tint32, 'MLC_score': hl.tfloat32}) - ht = ht.select( - locus=hl.locus('chrM', ht.Position, reference_genome.value), - alleles=[ht.Reference, ht.Alternate], - MLC_score=ht.MLC_score, - ) - return ht.key_by('locus', 'alleles') diff --git a/v03_pipeline/lib/reference_data/__init__.py b/v03_pipeline/lib/reference_datasets/__init__.py similarity index 100% rename from v03_pipeline/lib/reference_data/__init__.py rename to v03_pipeline/lib/reference_datasets/__init__.py diff --git a/v03_pipeline/lib/reference_datasets/clinvar.py b/v03_pipeline/lib/reference_datasets/clinvar.py new file mode 100644 index 000000000..89e70e47b --- /dev/null +++ b/v03_pipeline/lib/reference_datasets/clinvar.py @@ -0,0 +1,172 @@ +import gzip +import shutil +import tempfile + +import hail as hl +import requests + +from v03_pipeline.lib.annotations.enums import ( + CLINVAR_ASSERTIONS, + CLINVAR_DEFAULT_PATHOGENICITY, + CLINVAR_PATHOGENICITIES, + CLINVAR_PATHOGENICITIES_LOOKUP, +) +from v03_pipeline.lib.model.definitions import ReferenceGenome +from v03_pipeline.lib.reference_datasets.misc import vcf_to_ht + +CLINVAR_GOLD_STARS_LOOKUP = hl.dict( + { + 'no_classification_for_the_single_variant': 0, + 'no_classification_provided': 0, + 'no_assertion_criteria_provided': 0, + 'no_classifications_from_unflagged_records': 0, + 'criteria_provided,_single_submitter': 1, + 'criteria_provided,_conflicting_classifications': 1, + 'criteria_provided,_multiple_submitters,_no_conflicts': 2, + 'reviewed_by_expert_panel': 3, + 'practice_guideline': 4, + }, +) +CLINVAR_SUBMISSION_SUMMARY_URL = ( + 'https://ftp.ncbi.nlm.nih.gov/pub/clinvar/tab_delimited/submission_summary.txt.gz' +) + +ENUMS = { + 'assertion': CLINVAR_ASSERTIONS, + 'pathogenicity': CLINVAR_PATHOGENICITIES, +} + + +def parsed_clnsig(ht: hl.Table): + return ( + hl.delimit(ht.info.CLNSIG) + .replace( + 'Likely_pathogenic,_low_penetrance', + 'Likely_pathogenic|low_penetrance', + ) + .replace( + '/Pathogenic,_low_penetrance/Established_risk_allele', 
+ '/Established_risk_allele|low_penetrance', + ) + .replace( + '/Pathogenic,_low_penetrance', + '|low_penetrance', + ) + .split(r'\|') + ) + + +def parse_to_count(entry: str): + splt = entry.split( + r'\(', + ) # pattern, count = entry... if destructuring worked on a hail expression! + return hl.Struct( + pathogenicity_id=CLINVAR_PATHOGENICITIES_LOOKUP[splt[0]], + count=hl.int32(splt[1][:-1]), + ) + + +def parsed_and_mapped_clnsigconf(ht: hl.Table): + return ( + hl.delimit(ht.info.CLNSIGCONF) + .replace(',_low_penetrance', '') + .split(r'\|') + .map(parse_to_count) + .group_by(lambda x: x.pathogenicity_id) + .map_values( + lambda values: ( + values.fold( + lambda x, y: x + y.count, + 0, + ) + ), + ) + .items() + .map(lambda x: hl.Struct(pathogenicity_id=x[0], count=x[1])) + ) + + +def parse_clinvar_release_date(clinvar_url: str) -> str: + response = requests.get(clinvar_url, stream=True, timeout=10) + for byte_line in gzip.GzipFile(fileobj=response.raw): + line = byte_line.decode('ascii').strip() + if not line: + continue + if line.startswith('##fileDate='): + return line.split('=')[-1].strip() + if not line.startswith('#'): + return None + return None + + +def get_submission_summary_ht() -> hl.Table: + with tempfile.NamedTemporaryFile( + suffix='.txt.gz', + delete=False, + ) as tmp_file, requests.get( + CLINVAR_SUBMISSION_SUMMARY_URL, + stream=True, + timeout=10, + ) as r: + shutil.copyfileobj(r.raw, tmp_file) + ht = hl.import_table( + tmp_file.name, + force=True, + filter='^(#[^:]*:|^##).*$', # removes all comments except for the header line + types={ + '#VariationID': hl.tstr, + 'Submitter': hl.tstr, + 'ReportedPhenotypeInfo': hl.tstr, + }, + missing='-', + ) + ht = ht.rename({'#VariationID': 'VariationID'}) + ht = ht.select('VariationID', 'Submitter', 'ReportedPhenotypeInfo') + return ht.group_by('VariationID').aggregate( + Submitters=hl.agg.collect(ht.Submitter), + Conditions=hl.agg.collect(ht.ReportedPhenotypeInfo), + ) + + +def select_fields(ht): + clnsigs = 
parsed_clnsig(ht) + return ht.select( + alleleId=ht.info.ALLELEID, + pathogenicity=hl.if_else( + CLINVAR_PATHOGENICITIES_LOOKUP.contains(clnsigs[0]), + clnsigs[0], + CLINVAR_DEFAULT_PATHOGENICITY, + ), + assertion=hl.if_else( + CLINVAR_PATHOGENICITIES_LOOKUP.contains(clnsigs[0]), + clnsigs[1:], + clnsigs, + ), + # NB: there's a hidden enum-mapping inside this clinvar function. + conflictingPathogenicities=parsed_and_mapped_clnsigconf(ht), + goldStars=CLINVAR_GOLD_STARS_LOOKUP.get(hl.delimit(ht.info.CLNREVSTAT)), + submitters=ht.submitters, + # assumes the format 'MedGen#:condition', e.g.'C0023264:Leigh syndrome' + conditions=hl.map( + lambda p: p.split(r':')[1], + ht.conditions, + ), + ) + + +def get_ht( + clinvar_url: str, + reference_genome: ReferenceGenome, +) -> hl.Table: + with tempfile.NamedTemporaryFile( + suffix='.vcf.gz', + delete=False, + ) as tmp_file, requests.get(clinvar_url, stream=True, timeout=10) as r: + shutil.copyfileobj(r.raw, tmp_file) + ht = vcf_to_ht(tmp_file.name, reference_genome) + submitters_ht = get_submission_summary_ht() + ht = ht.annotate( + submitters=submitters_ht[ht.rsid].Submitters, + conditions=submitters_ht[ht.rsid].Conditions, + ) + return select_fields(ht) diff --git a/v03_pipeline/lib/reference_datasets/clinvar_path_variants.py b/v03_pipeline/lib/reference_datasets/clinvar_path_variants.py new file mode 100644 index 000000000..f77fa4726 --- /dev/null +++ b/v03_pipeline/lib/reference_datasets/clinvar_path_variants.py @@ -0,0 +1,38 @@ +import hail as hl + +from v03_pipeline.lib.annotations.enums import ( + CLINVAR_PATHOGENICITIES_LOOKUP, +) + +CLINVAR_PATH_RANGE = ('Pathogenic', 'Pathogenic/Likely_risk_allele') +CLINVAR_LIKELY_PATH_RANGE = ('Pathogenic/Likely_pathogenic', 'Likely_risk_allele') + + +def get_ht( + ht: hl.Table, + *_, +) -> hl.Table: + ht = ht.select_globals() + ht = ht.select( + is_pathogenic=( + ( + ht.pathogenicity_id + >= CLINVAR_PATHOGENICITIES_LOOKUP[CLINVAR_PATH_RANGE[0]] + ) + & ( + ht.pathogenicity_id + 
<= CLINVAR_PATHOGENICITIES_LOOKUP[CLINVAR_PATH_RANGE[1]] + ) + ), + is_likely_pathogenic=( + ( + ht.pathogenicity_id + >= CLINVAR_PATHOGENICITIES_LOOKUP[CLINVAR_LIKELY_PATH_RANGE[0]] + ) + & ( + ht.pathogenicity_id + <= CLINVAR_PATHOGENICITIES_LOOKUP[CLINVAR_LIKELY_PATH_RANGE[1]] + ) + ), + ) + return ht.filter(ht.is_pathogenic | ht.is_likely_pathogenic) diff --git a/v03_pipeline/lib/reference_datasets/clinvar_test.py b/v03_pipeline/lib/reference_datasets/clinvar_test.py new file mode 100644 index 000000000..e62e34e2d --- /dev/null +++ b/v03_pipeline/lib/reference_datasets/clinvar_test.py @@ -0,0 +1,171 @@ +import unittest + +import hail as hl +import responses + +from v03_pipeline.lib.annotations.enums import ( + CLINVAR_ASSERTIONS, + CLINVAR_PATHOGENICITIES, +) +from v03_pipeline.lib.model.definitions import ReferenceGenome +from v03_pipeline.lib.reference_datasets.clinvar import ( + parsed_and_mapped_clnsigconf, + parsed_clnsig, +) +from v03_pipeline.lib.reference_datasets.reference_dataset import ReferenceDataset +from v03_pipeline.lib.test.mock_clinvar_urls import mock_clinvar_urls + + +class ClinvarTest(unittest.TestCase): + @responses.activate + def test_get_clinvar_version(self): + with mock_clinvar_urls(): + self.assertEqual( + ReferenceDataset.clinvar.version(ReferenceGenome.GRCh38), + '2024-11-11', + ) + + def test_parsed_clnsig(self): + ht = hl.Table.parallelize( + [ + {'info': hl.Struct(CLNSIG=['Pathogenic|Affects'])}, + { + 'info': hl.Struct( + CLNSIG=[ + 'Pathogenic/Likely_pathogenic/Pathogenic', + '_low_penetrance', + ], + ), + }, + { + 'info': hl.Struct( + CLNSIG=[ + 'Likely_pathogenic/Pathogenic', + '_low_penetrance|association|protective', + ], + ), + }, + {'info': hl.Struct(CLNSIG=['Likely_pathogenic', '_low_penetrance'])}, + {'info': hl.Struct(CLNSIG=['association|protective'])}, + { + 'info': hl.Struct( + CLNSIG=[ + 'Pathogenic/Likely_pathogenic/Pathogenic', + '_low_penetrance/Established_risk_allele', + ], + ), + }, + ], + 
hl.tstruct(info=hl.tstruct(CLNSIG=hl.tarray(hl.tstr))), + ) + self.assertListEqual( + parsed_clnsig(ht).collect(), + [ + ['Pathogenic', 'Affects'], + ['Pathogenic/Likely_pathogenic', 'low_penetrance'], + ['Likely_pathogenic', 'low_penetrance', 'association', 'protective'], + ['Likely_pathogenic', 'low_penetrance'], + ['association', 'protective'], + [ + 'Pathogenic/Likely_pathogenic/Established_risk_allele', + 'low_penetrance', + ], + ], + ) + + def test_parsed_and_mapped_clnsigconf(self): + ht = hl.Table.parallelize( + [ + {'info': hl.Struct(CLNSIGCONF=hl.missing(hl.tarray(hl.tstr)))}, + { + 'info': hl.Struct( + CLNSIGCONF=[ + 'Pathogenic(8)|Likely_pathogenic(2)|Pathogenic', + '_low_penetrance(1)|Uncertain_significance(1)', + ], + ), + }, + ], + hl.tstruct(info=hl.tstruct(CLNSIGCONF=hl.tarray(hl.tstr))), + ) + self.assertListEqual( + parsed_and_mapped_clnsigconf(ht).collect(), + [ + None, + [ + hl.Struct(count=9, pathogenicity_id=0), + hl.Struct(count=2, pathogenicity_id=5), + hl.Struct(count=1, pathogenicity_id=12), + ], + ], + ) + + @responses.activate + def test_get_ht(self): + with mock_clinvar_urls(): + ht = ReferenceDataset.clinvar.get_ht( + ReferenceGenome.GRCh38, + ) + self.assertEqual( + ht.globals.collect()[0], + hl.Struct( + version='2024-11-11', + enums=hl.Struct( + assertion=CLINVAR_ASSERTIONS, + pathogenicity=CLINVAR_PATHOGENICITIES, + ), + ), + ) + self.assertEqual( + ht.collect()[:3], + [ + hl.Struct( + locus=hl.Locus( + contig='chr1', + position=69134, + reference_genome='GRCh38', + ), + alleles=['A', 'G'], + alleleId=2193183, + conflictingPathogenicities=None, + goldStars=1, + submitters=None, + conditions=None, + pathogenicity_id=0, + assertion_ids=[], + ), + hl.Struct( + locus=hl.Locus( + contig='chr1', + position=69314, + reference_genome='GRCh38', + ), + alleles=['T', 'G'], + alleleId=3374047, + conflictingPathogenicities=None, + goldStars=1, + submitters=['Paris Brain Institute, Inserm - ICM', 'OMIM'], + conditions=[ + 'Hereditary spastic 
paraplegia 48', + 'Hereditary spastic paraplegia 48', + ], + pathogenicity_id=12, + assertion_ids=[], + ), + hl.Struct( + locus=hl.Locus( + contig='chr1', + position=69423, + reference_genome='GRCh38', + ), + alleles=['G', 'A'], + alleleId=3374048, + conflictingPathogenicities=None, + goldStars=1, + submitters=['OMIM'], + conditions=['Hereditary spastic paraplegia 48'], + pathogenicity_id=12, + assertion_ids=[], + ), + ], + ) diff --git a/v03_pipeline/lib/reference_datasets/dbnsfp.py b/v03_pipeline/lib/reference_datasets/dbnsfp.py new file mode 100644 index 000000000..b011cf034 --- /dev/null +++ b/v03_pipeline/lib/reference_datasets/dbnsfp.py @@ -0,0 +1,83 @@ +import hail as hl + +from v03_pipeline.lib.model import DatasetType, ReferenceGenome +from v03_pipeline.lib.reference_datasets.misc import ( + download_zip_file, + key_by_locus_alleles, +) + +SHARED_TYPES = { + 'REVEL_score': hl.tfloat32, + 'fathmm-MKL_coding_score': hl.tfloat32, + 'MutPred_score': hl.tfloat32, + 'PrimateAI_score': hl.tfloat32, +} +TYPES = { + ReferenceGenome.GRCh37: { + **SHARED_TYPES, + 'pos(1-based)': hl.tint, + 'CADD_phred_hg19': hl.tfloat32, + }, + ReferenceGenome.GRCh38: { + **SHARED_TYPES, + 'hg19_pos(1-based)': hl.tint, + 'CADD_phred': hl.tfloat32, + }, +} + +SHARED_RENAME = { + 'fathmm-MKL_coding_score': 'fathmm_MKL_coding_score', +} +RENAME = { + ReferenceGenome.GRCh37: { + **SHARED_RENAME, + 'hg19_chr': 'chrom', + 'hg19_pos(1-based)': 'pos', + }, + ReferenceGenome.GRCh38: { + **SHARED_RENAME, + '#chr': 'chrom', + 'pos(1-based)': 'pos', + }, +} + +PREDICTOR_SCORES = { + 'REVEL_score', + 'SIFT_score', + 'Polyphen2_HVAR_score', + 'VEST4_score', + 'MPC_score', +} +PREDICTOR_FIELDS = ['MutationTaster_pred'] + + +def predictor_parse(field: hl.StringExpression) -> hl.StringExpression: + return field.split(';').find(lambda p: p != '.') + + +def get_ht(path: str, reference_genome: ReferenceGenome) -> hl.Table: + types = TYPES[reference_genome] + rename = RENAME[reference_genome] + + with 
download_zip_file(path) as unzipped_dir: + ht = hl.import_table( + f'{unzipped_dir}/dbNSFP*_variant.chr*.gz', + types=types, + missing='.', + force=True, + ) + select_fields = {'ref', 'alt', *types.keys(), *rename.keys()} + ht = ht.select( + *select_fields, + **{k: hl.parse_float32(predictor_parse(ht[k])) for k in PREDICTOR_SCORES}, + **{k: predictor_parse(ht[k]) for k in PREDICTOR_FIELDS}, + ) + ht = ht.rename(**rename) + + return key_by_locus_alleles(ht, reference_genome) + + +def select(_: ReferenceGenome, dataset_type: DatasetType, ht: hl.Table) -> hl.Table: + if dataset_type == DatasetType.MITO: + return ht.select(ht.SIFT_score, ht.MutationTaster_pred_id) + return ht diff --git a/v03_pipeline/lib/reference_datasets/eigen.py b/v03_pipeline/lib/reference_datasets/eigen.py new file mode 100644 index 000000000..5e56cfdca --- /dev/null +++ b/v03_pipeline/lib/reference_datasets/eigen.py @@ -0,0 +1,6 @@ +import hail as hl + + +def get_ht(path: str, *_) -> hl.Table: + ht = hl.read_table(path) + return ht.select(Eigen_phred=ht.info['Eigen-phred']) diff --git a/v03_pipeline/lib/reference_datasets/exac.py b/v03_pipeline/lib/reference_datasets/exac.py new file mode 100644 index 000000000..45e8e6edb --- /dev/null +++ b/v03_pipeline/lib/reference_datasets/exac.py @@ -0,0 +1,22 @@ +import hail as hl + +from v03_pipeline.lib.misc.nested_field import parse_nested_field +from v03_pipeline.lib.model import ReferenceGenome +from v03_pipeline.lib.reference_datasets.misc import vcf_to_ht + +SELECT = { + 'AF_POPMAX': 'info.POPMAX', + 'AF': 'info.AF#', + 'AC_Adj': 'info.AC_Adj#', + 'AC_Het': 'info.AC_Het#', + 'AC_Hom': 'info.AC_Hom#', + 'AC_Hemi': 'info.AC_Hemi#', + 'AN_Adj': 'info.AN_Adj', +} + + +def get_ht(path: str, reference_genome: ReferenceGenome) -> hl.Table: + ht = vcf_to_ht(path, reference_genome, split_multi=True) + return ht.select( + **{k: parse_nested_field(ht, v) for k, v in SELECT.items()}, + ) diff --git a/v03_pipeline/lib/reference_datasets/exac_test.py 
b/v03_pipeline/lib/reference_datasets/exac_test.py new file mode 100644 index 000000000..c5256c0bf --- /dev/null +++ b/v03_pipeline/lib/reference_datasets/exac_test.py @@ -0,0 +1,54 @@ +import unittest +from unittest.mock import patch + +import hail as hl + +from v03_pipeline.lib.model.definitions import ReferenceGenome +from v03_pipeline.lib.reference_datasets.reference_dataset import ReferenceDataset + +EXAC_PATH = 'v03_pipeline/var/test/reference_datasets/raw/exac.vcf' + + +class ExacTest(unittest.TestCase): + def test_exac(self): + with patch.object( + ReferenceDataset, + 'path', + return_value=EXAC_PATH, + ): + ht = ReferenceDataset.exac.get_ht(ReferenceGenome.GRCh38) + self.assertEqual( + ht.collect(), + [ + hl.Struct( + locus=hl.Locus( + contig='chr1', + position=1046973, + reference_genome='GRCh38', + ), + alleles=['G', 'A'], + AF_POPMAX=['NA', 'NFE'], + AF=1.702e-05, + AC_Adj=0, + AC_Het=0, + AC_Hom=0, + AC_Hemi=None, + AN_Adj=27700, + ), + hl.Struct( + locus=hl.Locus( + contig='chr1', + position=1046973, + reference_genome='GRCh38', + ), + alleles=['G', 'T'], + AF_POPMAX=['NA', 'NFE'], + AF=1.702e-05, + AC_Adj=1, + AC_Het=1, + AC_Hom=0, + AC_Hemi=None, + AN_Adj=27700, + ), + ], + ) diff --git a/v03_pipeline/lib/reference_data/gencode/__init__.py b/v03_pipeline/lib/reference_datasets/gencode/__init__.py similarity index 100% rename from v03_pipeline/lib/reference_data/gencode/__init__.py rename to v03_pipeline/lib/reference_datasets/gencode/__init__.py diff --git a/v03_pipeline/lib/reference_data/gencode/mapping_gene_ids.py b/v03_pipeline/lib/reference_datasets/gencode/mapping_gene_ids.py similarity index 100% rename from v03_pipeline/lib/reference_data/gencode/mapping_gene_ids.py rename to v03_pipeline/lib/reference_datasets/gencode/mapping_gene_ids.py diff --git a/v03_pipeline/lib/reference_data/gencode/mapping_gene_ids_tests.py b/v03_pipeline/lib/reference_datasets/gencode/mapping_gene_ids_tests.py similarity index 97% rename from 
v03_pipeline/lib/reference_data/gencode/mapping_gene_ids_tests.py rename to v03_pipeline/lib/reference_datasets/gencode/mapping_gene_ids_tests.py index 58c037048..1827a3453 100644 --- a/v03_pipeline/lib/reference_data/gencode/mapping_gene_ids_tests.py +++ b/v03_pipeline/lib/reference_datasets/gencode/mapping_gene_ids_tests.py @@ -3,7 +3,7 @@ import responses -from v03_pipeline.lib.reference_data.gencode.mapping_gene_ids import ( +from v03_pipeline.lib.reference_datasets.gencode.mapping_gene_ids import ( GENCODE_ENSEMBL_TO_REFSEQ_URL, GENCODE_GTF_URL, load_gencode_ensembl_to_refseq_id, diff --git a/v03_pipeline/lib/reference_datasets/gnomad_coding_and_noncoding.py b/v03_pipeline/lib/reference_datasets/gnomad_coding_and_noncoding.py new file mode 100644 index 000000000..f9aac07a4 --- /dev/null +++ b/v03_pipeline/lib/reference_datasets/gnomad_coding_and_noncoding.py @@ -0,0 +1,59 @@ +import hail as hl + +from v03_pipeline.lib.annotations.enums import ( + TRANSCRIPT_CONSEQUENCE_TERMS, +) +from v03_pipeline.lib.annotations.expression_helpers import ( + get_expr_for_vep_sorted_transcript_consequences_array, + get_expr_for_worst_transcript_consequence_annotations_struct, +) +from v03_pipeline.lib.model import ReferenceGenome + +GNOMAD_CODING_NONCODING_HIGH_AF_THRESHOLD = 0.90 +TRANSCRIPT_CONSEQUENCE_TERM_RANK_LOOKUP = hl.dict( + hl.enumerate(TRANSCRIPT_CONSEQUENCE_TERMS, index_first=False), +) + + +def get_ht( + path: str, + reference_genome: ReferenceGenome, +) -> hl.Table: + ht = hl.read_table(path) + filtered_contig = 'chr1' if reference_genome == ReferenceGenome.GRCh38 else '1' + ht = hl.filter_intervals( + ht, + [ + hl.parse_locus_interval( + filtered_contig, + reference_genome=reference_genome.value, + ), + ], + ) + ht = ht.filter(ht.freq[0].AF > GNOMAD_CODING_NONCODING_HIGH_AF_THRESHOLD) + ht = ht.annotate( + sorted_transaction_consequences=( + get_expr_for_vep_sorted_transcript_consequences_array( + ht.vep, + omit_consequences=[], + ) + ), + ) + ht = ht.annotate( 
+ main_transcript=( + get_expr_for_worst_transcript_consequence_annotations_struct( + ht.sorted_transaction_consequences, + ) + ), + ) + ht = ht.select( + coding=( + ht.main_transcript.major_consequence_rank + <= TRANSCRIPT_CONSEQUENCE_TERM_RANK_LOOKUP['synonymous_variant'] + ), + noncoding=( + ht.main_transcript.major_consequence_rank + >= TRANSCRIPT_CONSEQUENCE_TERM_RANK_LOOKUP['downstream_gene_variant'] + ), + ) + return ht.filter(ht.coding | ht.noncoding) diff --git a/v03_pipeline/lib/reference_datasets/gnomad_exomes.py b/v03_pipeline/lib/reference_datasets/gnomad_exomes.py new file mode 100644 index 000000000..a5e2919ec --- /dev/null +++ b/v03_pipeline/lib/reference_datasets/gnomad_exomes.py @@ -0,0 +1,21 @@ +import hail as hl + +from v03_pipeline.lib.model import ReferenceGenome +from v03_pipeline.lib.reference_datasets.gnomad_utils import get_ht as _get_ht + + +def af_popmax_expression( + ht: hl.Table, + reference_genome: ReferenceGenome, +) -> hl.Expression: + if reference_genome == ReferenceGenome.GRCh37: + return ht.popmax[ht.globals.popmax_index_dict['gnomad']].AF + return ht.grpmax['gnomad'].AF + + +def get_ht(path: str, reference_genome: ReferenceGenome) -> hl.Table: + return _get_ht( + path, + reference_genome, + af_popmax_expression, + ) diff --git a/v03_pipeline/lib/reference_datasets/gnomad_exomes_test.py b/v03_pipeline/lib/reference_datasets/gnomad_exomes_test.py new file mode 100644 index 000000000..75c81eb4e --- /dev/null +++ b/v03_pipeline/lib/reference_datasets/gnomad_exomes_test.py @@ -0,0 +1,72 @@ +import unittest +from unittest.mock import patch + +import hail as hl + +from v03_pipeline.lib.model.definitions import ReferenceGenome +from v03_pipeline.lib.reference_datasets.reference_dataset import ReferenceDataset + +GNOMAD_EXOMES_37_PATH = ( + 'v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_37.ht' +) +GNOMAD_EXOMES_38_PATH = ( + 'v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht' +) + + +class 
GnomadTest(unittest.TestCase): + def test_gnomad_exomes_37(self): + with patch.object( + ReferenceDataset, + 'path', + return_value=GNOMAD_EXOMES_37_PATH, + ): + ht = ReferenceDataset.gnomad_exomes.get_ht(ReferenceGenome.GRCh37) + self.assertEqual( + ht.collect(), + [ + hl.Struct( + locus=hl.Locus( + contig='1', + position=12586, + reference_genome='GRCh37', + ), + alleles=['C', 'T'], + AF=0.0005589714855886996, + AN=3578, + AC=2, + Hom=0, + AF_POPMAX_OR_GLOBAL=0.0022026430815458298, + FAF_AF=9.839000267675146e-05, + Hemi=0, + ), + ], + ) + + def test_gnomad_exomes_38(self): + with patch.object( + ReferenceDataset, + 'path', + return_value=GNOMAD_EXOMES_38_PATH, + ): + ht = ReferenceDataset.gnomad_exomes.get_ht(ReferenceGenome.GRCh38) + self.assertEqual( + ht.collect(), + [ + hl.Struct( + locus=hl.Locus( + contig='chr1', + position=12138, + reference_genome='GRCh38', + ), + alleles=['C', 'A'], + AF=0.00909090880304575, + AN=110, + AC=1, + Hom=0, + AF_POPMAX_OR_GLOBAL=0.009803921915590763, + FAF_AF=0.0, + Hemi=0, + ), + ], + ) diff --git a/v03_pipeline/lib/reference_datasets/gnomad_genomes.py b/v03_pipeline/lib/reference_datasets/gnomad_genomes.py new file mode 100644 index 000000000..acd78efed --- /dev/null +++ b/v03_pipeline/lib/reference_datasets/gnomad_genomes.py @@ -0,0 +1,21 @@ +import hail as hl + +from v03_pipeline.lib.model import ReferenceGenome +from v03_pipeline.lib.reference_datasets.gnomad_utils import get_ht as _get_ht + + +def af_popmax_expression( + ht: hl.Table, + reference_genome: ReferenceGenome, +) -> hl.Expression: + if reference_genome == ReferenceGenome.GRCh37: + return ht.popmax[ht.globals.popmax_index_dict['gnomad']].AF + return ht.grpmax.AF + + +def get_ht(path: str, reference_genome: ReferenceGenome) -> hl.Table: + return _get_ht( + path, + reference_genome, + af_popmax_expression, + ) diff --git a/v03_pipeline/lib/reference_datasets/gnomad_genomes_test.py b/v03_pipeline/lib/reference_datasets/gnomad_genomes_test.py new file mode 100644 
index 000000000..a620ab88c --- /dev/null +++ b/v03_pipeline/lib/reference_datasets/gnomad_genomes_test.py @@ -0,0 +1,72 @@ +import unittest +from unittest.mock import patch + +import hail as hl + +from v03_pipeline.lib.model.definitions import ReferenceGenome +from v03_pipeline.lib.reference_datasets.reference_dataset import ReferenceDataset + +GNOMAD_GENOMES_37_PATH = ( + 'v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht' +) +GNOMAD_GENOMES_38_PATH = ( + 'v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_38.ht' +) + + +class GnomadTest(unittest.TestCase): + def test_gnomad_genomes_37(self): + with patch.object( + ReferenceDataset, + 'path', + return_value=GNOMAD_GENOMES_37_PATH, + ): + ht = ReferenceDataset.gnomad_genomes.get_ht(ReferenceGenome.GRCh37) + self.assertEqual( + ht.collect(), + [ + hl.Struct( + locus=hl.Locus( + contig='1', + position=10131, + reference_genome='GRCh37', + ), + alleles=['CT', 'C'], + AF=3.6635403375839815e-05, + AN=27296, + AC=1, + Hom=0, + AF_POPMAX_OR_GLOBAL=3.6635403375839815e-05, + FAF_AF=0.0, + Hemi=0, + ), + ], + ) + + def test_gnomad_genomes_38(self): + with patch.object( + ReferenceDataset, + 'path', + return_value=GNOMAD_GENOMES_38_PATH, + ): + ht = ReferenceDataset.gnomad_genomes.get_ht(ReferenceGenome.GRCh38) + self.assertEqual( + ht.collect(), + [ + hl.Struct( + locus=hl.Locus( + contig='chr1', + position=10057, + reference_genome='GRCh38', + ), + alleles=['A', 'C'], + AF=2.642333674884867e-05, + AN=113536, + AC=3, + Hom=0, + AF_POPMAX_OR_GLOBAL=3.779861071961932e-05, + FAF_AF=7.019999884505523e-06, + Hemi=0, + ), + ], + ) diff --git a/v03_pipeline/lib/reference_datasets/gnomad_mito.py b/v03_pipeline/lib/reference_datasets/gnomad_mito.py new file mode 100644 index 000000000..bc5e64954 --- /dev/null +++ b/v03_pipeline/lib/reference_datasets/gnomad_mito.py @@ -0,0 +1,14 @@ +import hail as hl + + +def get_ht(path: str, *_) -> hl.Table: + ht = hl.read_table(path) + ht = ht.select( + AN=hl.int32(ht.AN), 
+ AC_hom=hl.int32(ht.AC_hom), + AC_het=hl.int32(ht.AC_het), + AF_hom=ht.AF_hom, + AF_het=ht.AF_het, + max_hl=ht.max_hl, + ) + return ht.select_globals() diff --git a/v03_pipeline/lib/reference_datasets/gnomad_non_coding_constraint.py b/v03_pipeline/lib/reference_datasets/gnomad_non_coding_constraint.py new file mode 100644 index 000000000..fda2064b2 --- /dev/null +++ b/v03_pipeline/lib/reference_datasets/gnomad_non_coding_constraint.py @@ -0,0 +1,23 @@ +import hail as hl + +from v03_pipeline.lib.model import ReferenceGenome +from v03_pipeline.lib.reference_datasets.misc import ( + select_for_interval_reference_dataset, +) + + +def get_ht(path: str, reference_genome: ReferenceGenome) -> hl.Table: + ht = hl.import_table( + path, + types={ + 'start': hl.tint32, + 'end': hl.tint32, + 'z': hl.tfloat32, + }, + force_bgz=True, + ) + return select_for_interval_reference_dataset( + ht, + reference_genome, + {'z_score': ht['z']}, + ) diff --git a/v03_pipeline/lib/reference_datasets/gnomad_qc.py b/v03_pipeline/lib/reference_datasets/gnomad_qc.py new file mode 100644 index 000000000..fd580ef61 --- /dev/null +++ b/v03_pipeline/lib/reference_datasets/gnomad_qc.py @@ -0,0 +1,9 @@ +import hail as hl + +from v03_pipeline.lib.model import ReferenceGenome + + +def get_ht(path: str, reference_genome: ReferenceGenome) -> hl.Table: + if reference_genome == ReferenceGenome.GRCh37: + return hl.read_matrix_table(path).rows() + return hl.read_table(path) diff --git a/v03_pipeline/lib/reference_datasets/gnomad_utils.py b/v03_pipeline/lib/reference_datasets/gnomad_utils.py new file mode 100644 index 000000000..846758db9 --- /dev/null +++ b/v03_pipeline/lib/reference_datasets/gnomad_utils.py @@ -0,0 +1,55 @@ +from collections.abc import Callable + +import hail as hl + +from v03_pipeline.lib.model import ReferenceGenome + + +def global_idx_field(reference_genome: ReferenceGenome) -> str: + return 'gnomad' if reference_genome == ReferenceGenome.GRCh37 else 'adj' + + +def 
faf_globals_field(reference_genome: ReferenceGenome) -> str: + return ( + 'popmax_index_dict' + if reference_genome == ReferenceGenome.GRCh37 + else 'faf_index_dict' + ) + + +def hemi_field(reference_genome: ReferenceGenome) -> str: + return 'gnomad_male' if reference_genome == ReferenceGenome.GRCh37 else 'XY_adj' + + +def get_ht( + path: str, + reference_genome: ReferenceGenome, + af_popmax_expression: Callable, +) -> hl.Table: + ht = hl.read_table(path) + global_idx = hl.eval(ht.globals.freq_index_dict[global_idx_field(reference_genome)]) + ht = ht.select( + AF=hl.float32(ht.freq[global_idx].AF), + AN=ht.freq[global_idx].AN, + AC=ht.freq[global_idx].AC, + Hom=ht.freq[global_idx].homozygote_count, + AF_POPMAX_OR_GLOBAL=hl.float32( + hl.or_else( + af_popmax_expression(ht, reference_genome), + ht.freq[global_idx].AF, + ), + ), + FAF_AF=hl.float32( + ht.faf[ + ht.globals[faf_globals_field(reference_genome)][ + global_idx_field(reference_genome) + ] + ].faf95, + ), + Hemi=hl.if_else( + ht.locus.in_autosome_or_par(), + 0, + ht.freq[ht.globals.freq_index_dict[hemi_field(reference_genome)]].AC, + ), + ) + return ht.select_globals() diff --git a/v03_pipeline/lib/reference_datasets/helix_mito.py b/v03_pipeline/lib/reference_datasets/helix_mito.py new file mode 100644 index 000000000..f81571a6d --- /dev/null +++ b/v03_pipeline/lib/reference_datasets/helix_mito.py @@ -0,0 +1,53 @@ +import shutil +import tempfile + +import hail as hl +import requests + +from v03_pipeline.lib.model.definitions import ReferenceGenome + +RENAME = { + 'counts_hom': 'AC_hom', + 'counts_het': 'AC_het', + 'max_ARF': 'max_hl', +} + + +def get_ht( + url: str, + reference_genome: ReferenceGenome, +) -> hl.Table: + with tempfile.NamedTemporaryFile( + suffix='.tsv', + delete=False, + ) as tmp_file, requests.get(url, stream=True, timeout=10) as r: + shutil.copyfileobj(r.raw, tmp_file) + ht = hl.import_table( + tmp_file.name, + types={ + 'counts_hom': hl.tint32, + 'counts_het': hl.tint32, + 'max_ARF': 
hl.tfloat32, + 'AF_het': hl.tfloat32, + 'AF_hom': hl.tfloat32, + 'alleles': hl.tarray(hl.tstr), + }, + ) + ht = ht.rename(RENAME) + ht = ht.select( + *RENAME.values(), + locus=hl.locus( + 'chrM', + hl.parse_int32(ht.locus.split(':')[1]), + reference_genome, + ), + alleles=ht.alleles, + AN=hl.if_else( + ht.AF_hom > 0, + hl.int32(ht.AC_hom / ht.AF_hom), + hl.int32(ht.AC_het / ht.AF_het), + ), + AF_hom=ht.AF_hom, + AF_het=ht.AF_het, + ) + return ht.key_by('locus', 'alleles') diff --git a/v03_pipeline/lib/reference_datasets/hgmd.py b/v03_pipeline/lib/reference_datasets/hgmd.py new file mode 100644 index 000000000..df922ac01 --- /dev/null +++ b/v03_pipeline/lib/reference_datasets/hgmd.py @@ -0,0 +1,20 @@ +import hail as hl + +from v03_pipeline.lib.model import ReferenceGenome + + +def get_ht(path: str, reference_genome: ReferenceGenome) -> hl.Table: + mt = hl.import_vcf( + path, + reference_genome=reference_genome.value, + force=True, + skip_invalid_loci=True, + contig_recoding=reference_genome.contig_recoding(), + ) + ht = mt.rows() + return ht.select( + **{ + 'accession': ht.rsid, + 'class': ht.info.CLASS, + }, + ) diff --git a/v03_pipeline/lib/reference_datasets/hgmd_test.py b/v03_pipeline/lib/reference_datasets/hgmd_test.py new file mode 100644 index 000000000..953ffe7f5 --- /dev/null +++ b/v03_pipeline/lib/reference_datasets/hgmd_test.py @@ -0,0 +1,41 @@ +import unittest +from unittest.mock import patch + +import hail as hl + +from v03_pipeline.lib.model import ReferenceGenome +from v03_pipeline.lib.reference_datasets.reference_dataset import ReferenceDataset + +TEST_HGMD_VCF = 'v03_pipeline/var/test/reference_datasets/raw/test_hgmd.vcf' + + +class HGMDTest(unittest.TestCase): + def test_hgmd_38(self): + with patch.object( + ReferenceDataset, + 'path', + return_value=TEST_HGMD_VCF, + ): + ht = ReferenceDataset.hgmd.get_ht(ReferenceGenome.GRCh38) + self.assertEqual( + ht.collect(), + [ + hl.Struct( + locus=hl.Locus( + contig='chr1', + position=925942, + 
reference_genome='GRCh38', + ), + alleles=['A', 'G'], + accession='CM2039807', + class_id=1, + ), + ], + ) + self.assertEqual( + ht.globals.collect()[0], + hl.Struct( + version='1.0', + enums=hl.Struct(**ReferenceDataset.hgmd.enums), + ), + ) diff --git a/v03_pipeline/lib/reference_datasets/high_af_variants.py b/v03_pipeline/lib/reference_datasets/high_af_variants.py new file mode 100644 index 000000000..f73a077d3 --- /dev/null +++ b/v03_pipeline/lib/reference_datasets/high_af_variants.py @@ -0,0 +1,20 @@ +import hail as hl + +ONE_TENTH_PERCENT = 0.001 +ONE_PERCENT = 0.01 +THREE_PERCENT = 0.03 +FIVE_PERCENT = 0.05 +TEN_PERCENT = 0.10 + + +def get_ht( + ht: hl.Table, +) -> hl.Table: + ht = ht.select_globals() + ht = ht.filter(ht.AF_POPMAX_OR_GLOBAL > ONE_TENTH_PERCENT) + return ht.select( + is_gt_1_percent=ht.AF_POPMAX_OR_GLOBAL > ONE_PERCENT, + is_gt_3_percent=ht.AF_POPMAX_OR_GLOBAL > THREE_PERCENT, + is_gt_5_percent=ht.AF_POPMAX_OR_GLOBAL > FIVE_PERCENT, + is_gt_10_percent=ht.AF_POPMAX_OR_GLOBAL > TEN_PERCENT, + ) diff --git a/v03_pipeline/lib/reference_datasets/hmtvar.py b/v03_pipeline/lib/reference_datasets/hmtvar.py new file mode 100644 index 000000000..0fdcdecd8 --- /dev/null +++ b/v03_pipeline/lib/reference_datasets/hmtvar.py @@ -0,0 +1,23 @@ +import hail as hl +import requests + +from v03_pipeline.lib.model.definitions import ReferenceGenome + + +def get_ht( + url: str, + reference_genome: ReferenceGenome, +) -> hl.Table: + response = requests.get(url, stream=True, timeout=10) + data = response.json() + ht = hl.Table.parallelize(data) + ht = ht.select( + locus=hl.locus( + reference_genome.mito_contig, + ht.nt_start, + reference_genome.value, + ), + alleles=hl.array([ht.ref_rCRS, ht.alt]), + score=ht.disease_score, + ) + return ht.key_by('locus', 'alleles') diff --git a/v03_pipeline/lib/reference_datasets/local_constraint_mito.py b/v03_pipeline/lib/reference_datasets/local_constraint_mito.py new file mode 100644 index 000000000..bb473318d --- /dev/null +++ 
b/v03_pipeline/lib/reference_datasets/local_constraint_mito.py @@ -0,0 +1,24 @@ +import os + +import hail as hl + +from v03_pipeline.lib.model import ReferenceGenome +from v03_pipeline.lib.reference_datasets.misc import download_zip_file + +EXTRACTED_FILE_NAME = 'supplementary_dataset_7.tsv' + + +def get_ht(url: str, reference_genome: ReferenceGenome) -> hl.Table: + with download_zip_file(url, decode_content=True) as unzipped_dir: + ht = hl.import_table( + os.path.join( + unzipped_dir, + EXTRACTED_FILE_NAME, + ), + ) + ht = ht.select( + locus=hl.locus('chrM', hl.parse_int32(ht.Position), reference_genome.value), + alleles=[ht.Reference, ht.Alternate], + score=hl.parse_float32(ht.MLC_score), + ) + return ht.key_by('locus', 'alleles') diff --git a/v03_pipeline/lib/reference_datasets/misc.py b/v03_pipeline/lib/reference_datasets/misc.py new file mode 100644 index 000000000..5e1004c71 --- /dev/null +++ b/v03_pipeline/lib/reference_datasets/misc.py @@ -0,0 +1,150 @@ +import contextlib +import os +import tempfile +import zipfile + +import hail as hl +import requests + +from v03_pipeline.lib.misc.io import split_multi_hts +from v03_pipeline.lib.model.dataset_type import DatasetType +from v03_pipeline.lib.model.definitions import ReferenceGenome + +BIALLELIC = 2 + + +def get_enum_select_fields( + ht: hl.Table, + enums: dict | None, +) -> dict[str, hl.Expression]: + enum_select_fields = {} + for field_name, values in (enums or {}).items(): + if not hasattr(ht, field_name): + if hasattr(ht, f'{field_name}_id') or hasattr(ht, f'{field_name}_ids'): + continue + error = f'Unused enum {field_name}' + raise ValueError(error) + + lookup = hl.dict( + hl.enumerate(values, index_first=False).extend( + # NB: adding missing values here allows us to + # hard fail if a mapped key is present and has an unexpected value + # but propagate missing values. + [(hl.missing(hl.tstr), hl.missing(hl.tint32))], + ), + ) + # NB: this conditioning on type is "outside" the hail expression context. 
+ if ( + isinstance(ht[field_name].dtype, hl.tarray | hl.tset) + and ht[field_name].dtype.element_type == hl.tstr + ): + enum_select_fields[f'{field_name}_ids'] = ht[field_name].map( + lambda x: lookup[x], # noqa: B023 + ) + else: + enum_select_fields[f'{field_name}_id'] = lookup[ht[field_name]] + return enum_select_fields + + +def filter_mito_contigs( + reference_genome: ReferenceGenome, + dataset_type: DatasetType, + ht: hl.Table, +) -> hl.Table: + if dataset_type == DatasetType.MITO: + return ht.filter(ht.locus.contig == reference_genome.mito_contig) + return ht.filter(ht.locus.contig != reference_genome.mito_contig) + + +def filter_contigs(ht, reference_genome: ReferenceGenome): + if hasattr(ht, 'interval'): + return ht.filter( + hl.set(reference_genome.standard_contigs).contains( + ht.interval.start.contig, + ), + ) + return ht.filter( + hl.set(reference_genome.standard_contigs).contains(ht.locus.contig), + ) + + +def vcf_to_ht( + file_name: str, + reference_genome: ReferenceGenome, + split_multi=False, +) -> hl.Table: + mt = hl.import_vcf( + file_name, + reference_genome=reference_genome.value, + drop_samples=True, + skip_invalid_loci=True, + contig_recoding=reference_genome.contig_recoding(include_mt=True), + force_bgz=True, + array_elements_required=False, + ) + if split_multi: + return split_multi_hts(mt, True).rows() + + # Validate that there exist no multiallelic variants in the table. 
+ count_non_biallelic = mt.aggregate_rows( + hl.agg.count_where(hl.len(mt.alleles) > BIALLELIC), + ) + if count_non_biallelic: + error = f'Encountered {count_non_biallelic} multiallelic variants' + raise ValueError(error) + return mt.rows() + + +def key_by_locus_alleles(ht: hl.Table, reference_genome: ReferenceGenome) -> hl.Table: + chrom = ( + hl.format('chr%s', ht.chrom) + if reference_genome == ReferenceGenome.GRCh38 + else ht.chrom + ) + ht = ht.transmute( + locus=hl.locus(chrom, ht.pos, reference_genome.value), + alleles=hl.array([ht.ref, ht.alt]), + ) + return ht.key_by('locus', 'alleles') + + +def copyfileobj(fsrc, fdst, decode_content, length=16 * 1024): + """Copy data from file-like object fsrc to file-like object fdst.""" + while True: + buf = fsrc.read(length, decode_content=decode_content) + if not buf: + break + fdst.write(buf) + + +@contextlib.contextmanager +def download_zip_file(url, suffix='.zip', decode_content=False): + with tempfile.NamedTemporaryFile( + suffix=suffix, + ) as tmp_file, requests.get(url, stream=True, timeout=10) as r: + copyfileobj(r.raw, tmp_file, decode_content) + with zipfile.ZipFile(tmp_file.name, 'r') as zipf: + zipf.extractall(os.path.dirname(tmp_file.name)) + # Extracting the zip file + yield os.path.dirname(tmp_file.name) + + +def select_for_interval_reference_dataset( + ht: hl.Table, + reference_genome: ReferenceGenome, + additional_selects: dict, + chrom_field: str = 'chrom', + start_field: str = 'start', + end_field: str = 'end', +) -> hl.Table: + ht = ht.select( + interval=hl.locus_interval( + ht[chrom_field], + ht[start_field] + 1, + ht[end_field] + 1, + reference_genome=reference_genome.value, + invalid_missing=True, + ), + **additional_selects, + ) + return ht.key_by('interval') diff --git a/v03_pipeline/lib/reference_datasets/misc_test.py b/v03_pipeline/lib/reference_datasets/misc_test.py new file mode 100644 index 000000000..dbfc0de72 --- /dev/null +++ b/v03_pipeline/lib/reference_datasets/misc_test.py @@ -0,0 
+1,76 @@ +import unittest + +import hail as hl + +from v03_pipeline.lib.model.definitions import ReferenceGenome +from v03_pipeline.lib.reference_datasets.misc import get_enum_select_fields, vcf_to_ht + +EXAC_PATH = 'v03_pipeline/var/test/reference_datasets/raw/exac.vcf' + + +class MiscTest(unittest.TestCase): + def test_get_enum_select_fields(self): + ht = hl.Table.parallelize( + [ + {'variant': ['1', '2'], 'sv_type': 'a', 'sample_fix': '1'}, + { + 'variant': ['1', '3', '2'], + 'sv_type': 'b', + 'sample_fix': '2', + }, + {'variant': ['1', '3'], 'sv_type': 'c', 'sample_fix': '3'}, + {'variant': ['4'], 'sv_type': 'd', 'sample_fix': '4'}, + ], + hl.tstruct( + variant=hl.dtype('array<str>'), + sv_type=hl.dtype('str'), + sample_fix=hl.dtype('str'), + ), + ) + enum_select_fields = get_enum_select_fields( + ht, + { + 'variant': ['1', '2', '3', '4'], + 'sv_type': ['a', 'b', 'c', 'd'], + }, + ) + mapped_ht = ht.transmute(**enum_select_fields) + self.assertListEqual( + mapped_ht.collect(), + [ + hl.Struct(variant_ids=[0, 1], sv_type_id=0, sample_fix='1'), + hl.Struct(variant_ids=[0, 2, 1], sv_type_id=1, sample_fix='2'), + hl.Struct(variant_ids=[0, 2], sv_type_id=2, sample_fix='3'), + hl.Struct(variant_ids=[3], sv_type_id=3, sample_fix='4'), + ], + ) + + mapped_enum_select_fields = get_enum_select_fields( + mapped_ht, + { + 'variant': ['1', '2', '3', '4'], + 'sv_type': ['a', 'b', 'c', 'd'], + }, + ) + self.assertDictEqual(mapped_enum_select_fields, {}) + + enum_select_fields = get_enum_select_fields( + ht, + {'sv_type': ['d']}, + ) + mapped_ht = ht.select(**enum_select_fields) + self.assertRaises(Exception, mapped_ht.collect) + + with self.assertRaises(ValueError) as cm: + get_enum_select_fields(ht, {'variant_renamed': ['1', '2', '3', '4']}) + self.assertEqual(str(cm.exception), 'Unused enum variant_renamed') + + self.assertDictEqual(get_enum_select_fields(ht, None), {}) + + def test_vcf_to_ht_throw_multiallelic(self): + self.assertRaises( + ValueError, + vcf_to_ht, + EXAC_PATH, 
+ ReferenceGenome.GRCh38, + ) diff --git a/v03_pipeline/lib/reference_datasets/mitimpact.py b/v03_pipeline/lib/reference_datasets/mitimpact.py new file mode 100644 index 000000000..f12c507c1 --- /dev/null +++ b/v03_pipeline/lib/reference_datasets/mitimpact.py @@ -0,0 +1,26 @@ +import os + +import hail as hl + +from v03_pipeline.lib.model.definitions import ReferenceGenome +from v03_pipeline.lib.reference_datasets.misc import download_zip_file + + +def get_ht( + url: str, + reference_genome: ReferenceGenome, +) -> hl.Table: + extracted_filename = url.removesuffix('.zip').split('/')[-1] + with download_zip_file(url, suffix='.txt.zip') as unzipped_dir: + ht = hl.import_table( + os.path.join( + unzipped_dir, + extracted_filename, + ), + ) + ht = ht.select( + locus=hl.locus('chrM', hl.parse_int32(ht.Start), reference_genome), + alleles=[ht.Ref, ht.Alt], + score=hl.parse_float32(ht.APOGEE2_score), + ) + return ht.key_by('locus', 'alleles') diff --git a/v03_pipeline/lib/reference_datasets/mitomap.py b/v03_pipeline/lib/reference_datasets/mitomap.py new file mode 100644 index 000000000..346856acf --- /dev/null +++ b/v03_pipeline/lib/reference_datasets/mitomap.py @@ -0,0 +1,22 @@ +import hail as hl + +from v03_pipeline.lib.model import ReferenceGenome + + +def get_ht(path: str, reference_genome: ReferenceGenome) -> hl.Table: + ht = hl.import_table( + path, + delimiter=',', + quote='"', + types={'Position': hl.tint32}, + ) + ht = ht.select( + locus=hl.locus( + 'chrM', + ht.Position, + reference_genome=reference_genome.value, + ), + alleles=ht.Allele.first_match_in('m.[0-9]+([ATGC]+)>([ATGC]+)'), + pathogenic=True, + ) + return ht.key_by('locus', 'alleles') diff --git a/v03_pipeline/lib/reference_datasets/mitomap_test.py b/v03_pipeline/lib/reference_datasets/mitomap_test.py new file mode 100644 index 000000000..bef0b036e --- /dev/null +++ b/v03_pipeline/lib/reference_datasets/mitomap_test.py @@ -0,0 +1,51 @@ +import unittest +from unittest.mock import patch + +import hail as 
hl + +from v03_pipeline.lib.model import ReferenceGenome +from v03_pipeline.lib.reference_datasets.reference_dataset import ReferenceDataset + +TEST_MITOMAP_CSV = 'v03_pipeline/var/test/reference_datasets/raw/test_mitomap.csv' + + +class MitomapTest(unittest.TestCase): + def test_mitomap(self): + with patch.object( + ReferenceDataset, + 'path', + return_value=TEST_MITOMAP_CSV, + ): + ht = ReferenceDataset.mitomap.get_ht(ReferenceGenome.GRCh38) + self.assertEqual( + ht.collect(), + [ + hl.Struct( + locus=hl.Locus( + contig='chrM', + position=583, + reference_genome='GRCh38', + ), + alleles=['G', 'A'], + pathogenic=True, + ), + hl.Struct( + locus=hl.Locus( + contig='chrM', + position=591, + reference_genome='GRCh38', + ), + alleles=['C', 'T'], + pathogenic=True, + ), + hl.Struct( + locus=hl.Locus( + contig='chrM', + position=616, + reference_genome='GRCh38', + ), + alleles=['T', 'C'], + pathogenic=True, + ), + ], + ) diff --git a/v03_pipeline/lib/reference_data/queries.py b/v03_pipeline/lib/reference_datasets/queries.py similarity index 100% rename from v03_pipeline/lib/reference_data/queries.py rename to v03_pipeline/lib/reference_datasets/queries.py diff --git a/v03_pipeline/lib/reference_datasets/reference_dataset.py b/v03_pipeline/lib/reference_datasets/reference_dataset.py new file mode 100644 index 000000000..bc5d42f0f --- /dev/null +++ b/v03_pipeline/lib/reference_datasets/reference_dataset.py @@ -0,0 +1,412 @@ +import importlib +import types +from collections.abc import Callable +from enum import Enum +from typing import Union + +import hail as hl + +from v03_pipeline.lib.model import AccessControl, DatasetType, Env, ReferenceGenome +from v03_pipeline.lib.reference_datasets import clinvar, dbnsfp +from v03_pipeline.lib.reference_datasets.misc import ( + filter_contigs, + filter_mito_contigs, + get_enum_select_fields, +) + +DATASET_TYPES = 'dataset_types' +ENUMS = 'enums' +EXCLUDE_FROM_ANNOTATIONS = 'exclude_from_annotations' +FILTER = 'filter' +IS_INTERVAL = 
'is_interval' +SELECT = 'select' +VERSION = 'version' +PATH = 'path' + + +class BaseReferenceDataset: + @classmethod + def for_reference_genome_dataset_type( + cls, + reference_genome: ReferenceGenome, + dataset_type: DatasetType, + ) -> set[Union['ReferenceDataset', 'ReferenceDatasetQuery']]: + reference_datasets = [ + dataset + for dataset, config in CONFIG.items() + if dataset_type in config.get(reference_genome, {}).get(DATASET_TYPES, []) + ] + if not Env.ACCESS_PRIVATE_REFERENCE_DATASETS: + return { + dataset + for dataset in reference_datasets + if dataset.access_control == AccessControl.PUBLIC + } + return set(reference_datasets) + + @classmethod + def for_reference_genome_dataset_type_annotations( + cls, + reference_genome: ReferenceGenome, + dataset_type: DatasetType, + ) -> set['ReferenceDataset']: + return { + dataset + for dataset in cls.for_reference_genome_dataset_type( + reference_genome, + dataset_type, + ) + if not CONFIG[dataset].get(EXCLUDE_FROM_ANNOTATIONS, False) + } + + @property + def is_keyed_by_interval(self) -> bool: + return CONFIG[self].get(IS_INTERVAL, False) + + @property + def access_control(self) -> AccessControl: + if self == ReferenceDataset.hgmd: + return AccessControl.PRIVATE + return AccessControl.PUBLIC + + def version(self, reference_genome: ReferenceGenome) -> str: + version = CONFIG[self][reference_genome][VERSION] + if isinstance(version, types.FunctionType): + return version( + self.path(reference_genome), + ) + return version + + @property + def enums(self) -> dict | None: + return CONFIG[self].get(ENUMS) + + @property + def enum_globals(self) -> hl.Struct: + if self.enums: + return hl.Struct(**self.enums) + return hl.missing(hl.tstruct(hl.tstr, hl.tarray(hl.tstr))) + + @property + def filter( # noqa: A003 + self, + ) -> Callable[[ReferenceGenome, DatasetType, hl.Table], hl.Table] | None: + return CONFIG[self].get(FILTER) + + @property + def select( + self, + ) -> Callable[[ReferenceGenome, DatasetType, hl.Table], 
hl.Table] | None: + return CONFIG[self].get(SELECT) + + def path(self, reference_genome: ReferenceGenome) -> str | list[str]: + return CONFIG[self][reference_genome][PATH] + + def get_ht( + self, + reference_genome: ReferenceGenome, + ) -> hl.Table: + module = importlib.import_module( + f'v03_pipeline.lib.reference_datasets.{self.name}', + ) + path = self.path(reference_genome) + ht = module.get_ht(path, reference_genome) + enum_selects = get_enum_select_fields(ht, self.enums) + if enum_selects: + ht = ht.transmute(**enum_selects) + ht = filter_contigs(ht, reference_genome) + # NB: we do not filter with "filter" here + # ReferenceDatasets are DatasetType agnostic and that + # filter is only used at annotation time. + return ht.annotate_globals( + version=self.version(reference_genome), + enums=self.enum_globals, + ) + + +class ReferenceDataset(BaseReferenceDataset, str, Enum): + clinvar = 'clinvar' + dbnsfp = 'dbnsfp' + exac = 'exac' + eigen = 'eigen' + helix_mito = 'helix_mito' + hgmd = 'hgmd' + hmtvar = 'hmtvar' + mitimpact = 'mitimpact' + splice_ai = 'splice_ai' + topmed = 'topmed' + gnomad_coding_and_noncoding = 'gnomad_coding_and_noncoding' + gnomad_exomes = 'gnomad_exomes' + gnomad_genomes = 'gnomad_genomes' + gnomad_qc = 'gnomad_qc' + gnomad_mito = 'gnomad_mito' + gnomad_non_coding_constraint = 'gnomad_non_coding_constraint' + screen = 'screen' + local_constraint_mito = 'local_constraint_mito' + mitomap = 'mitomap' + + +class ReferenceDatasetQuery(BaseReferenceDataset, str, Enum): + clinvar_path_variants = 'clinvar_path_variants' + high_af_variants = 'high_af_variants' + + @property + def requires(self) -> ReferenceDataset: + return { + self.clinvar_path_variants: ReferenceDataset.clinvar, + self.high_af_variants: ReferenceDataset.gnomad_genomes, + }[self] + + def get_ht( + self, + reference_genome: ReferenceGenome, + dataset_type: DatasetType, + reference_dataset_ht: hl.Table, + ) -> hl.Table: + module = importlib.import_module( + 
f'v03_pipeline.lib.reference_datasets.{self.name}', + ) + ht = module.get_ht(reference_dataset_ht) + if self.filter: + ht = self.filter(reference_genome, dataset_type, ht) + return ht.annotate_globals( + version=self.version(reference_genome), + ) + + +CONFIG = { + ReferenceDataset.dbnsfp: { + ENUMS: { + 'MutationTaster_pred': ['D', 'A', 'N', 'P'], + }, + FILTER: filter_mito_contigs, + SELECT: dbnsfp.select, + ReferenceGenome.GRCh37: { + DATASET_TYPES: frozenset([DatasetType.SNV_INDEL]), + VERSION: '1.0', + PATH: 'https://dbnsfp.s3.amazonaws.com/dbNSFP4.7a.zip', + }, + ReferenceGenome.GRCh38: { + DATASET_TYPES: frozenset([DatasetType.SNV_INDEL, DatasetType.MITO]), + VERSION: '1.0', + PATH: 'https://dbnsfp.s3.amazonaws.com/dbNSFP4.7a.zip', + }, + }, + ReferenceDataset.eigen: { + ReferenceGenome.GRCh37: { + DATASET_TYPES: frozenset([DatasetType.SNV_INDEL]), + VERSION: '1.0', + # NB: The download link on the Eigen website (http://www.columbia.edu/~ii2135/download.html) is broken + # as of 11/15/24 so we will host the data + PATH: 'gs://seqr-reference-data/GRCh37/eigen/EIGEN_coding_noncoding.grch37.ht', + }, + ReferenceGenome.GRCh38: { + DATASET_TYPES: frozenset([DatasetType.SNV_INDEL]), + VERSION: '1.0', + PATH: 'gs://seqr-reference-data/GRCh38/eigen/EIGEN_coding_noncoding.liftover_grch38.ht', + }, + }, + ReferenceDataset.clinvar: { + ENUMS: clinvar.ENUMS, + FILTER: filter_mito_contigs, + ReferenceGenome.GRCh37: { + DATASET_TYPES: frozenset([DatasetType.SNV_INDEL]), + VERSION: clinvar.parse_clinvar_release_date, + PATH: 'https://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh37/clinvar.vcf.gz', + }, + ReferenceGenome.GRCh38: { + DATASET_TYPES: frozenset([DatasetType.SNV_INDEL, DatasetType.MITO]), + VERSION: clinvar.parse_clinvar_release_date, + PATH: 'https://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh38/clinvar.vcf.gz', + }, + }, + ReferenceDataset.exac: { + ReferenceGenome.GRCh37: { + DATASET_TYPES: frozenset([DatasetType.SNV_INDEL]), + VERSION: '1.0', + PATH: 
'gs://gcp-public-data--gnomad/legacy/exacv1_downloads/release1/ExAC.r1.sites.vep.vcf.gz', + }, + ReferenceGenome.GRCh38: { + DATASET_TYPES: frozenset([DatasetType.SNV_INDEL]), + VERSION: '1.0', + # NB: Exac is only available on GRCh37 so we host a lifted over version + PATH: 'gs://seqr-reference-data/GRCh38/gnomad/ExAC.r1.sites.liftover.b38.vcf.gz', + }, + }, + ReferenceDataset.helix_mito: { + ReferenceGenome.GRCh38: { + DATASET_TYPES: frozenset([DatasetType.MITO]), + VERSION: '1.0', + PATH: 'https://helix-research-public.s3.amazonaws.com/mito/HelixMTdb_20200327.tsv', + }, + }, + ReferenceDataset.splice_ai: { + ENUMS: { + 'splice_consequence': [ + 'Acceptor gain', + 'Acceptor loss', + 'Donor gain', + 'Donor loss', + 'No consequence', + ], + }, + ReferenceGenome.GRCh37: { + DATASET_TYPES: frozenset([DatasetType.SNV_INDEL]), + VERSION: '1.0', + PATH: [ + 'gs://seqr-reference-data/GRCh37/spliceai/spliceai_scores.masked.snv.hg19.vcf.gz', + 'gs://seqr-reference-data/GRCh37/spliceai/spliceai_scores.masked.indel.hg19.vcf.gz', + ], + }, + ReferenceGenome.GRCh38: { + DATASET_TYPES: frozenset([DatasetType.SNV_INDEL]), + VERSION: '1.0', + # NB: SpliceAI data is only available to download for authenticated Illumina users, so we will host the data + PATH: [ + 'gs://seqr-reference-data/GRCh38/spliceai/spliceai_scores.masked.snv.hg38.vcf.gz', + 'gs://seqr-reference-data/GRCh38/spliceai/spliceai_scores.masked.indel.hg38.vcf.gz', + ], + }, + }, + ReferenceDataset.topmed: { + ReferenceGenome.GRCh37: { + DATASET_TYPES: frozenset([DatasetType.SNV_INDEL]), + VERSION: '1.0', + PATH: 'gs://seqr-reference-data/GRCh37/TopMed/bravo-dbsnp-all.removed_chr_prefix.liftunder_GRCh37.vcf.gz', + }, + ReferenceGenome.GRCh38: { + DATASET_TYPES: frozenset([DatasetType.SNV_INDEL]), + VERSION: '1.0', + # NB: TopMed data is available to download via https://legacy.bravo.sph.umich.edu/freeze8/hg38/downloads/vcf/ + # However, users must be authenticated and accept TOS to access it so for now we will host a 
copy of the data + PATH: 'gs://seqr-reference-data/GRCh38/TopMed/bravo-dbsnp-all.vcf.gz', + }, + }, + ReferenceDataset.hmtvar: { + ReferenceGenome.GRCh38: { + DATASET_TYPES: frozenset([DatasetType.MITO]), + VERSION: '1.0', + # NB: https://www.hmtvar.uniba.it is unavailable as of 11/15/24 so we will host the data + PATH: 'https://storage.googleapis.com/seqr-reference-data/GRCh38/mitochondrial/HmtVar/HmtVar%20Jan.%2010%202022.json', + }, + }, + ReferenceDataset.mitimpact: { + ReferenceGenome.GRCh38: { + DATASET_TYPES: frozenset([DatasetType.MITO]), + VERSION: '1.0', + PATH: 'https://mitimpact.css-mendel.it/cdn/MitImpact_db_3.1.3.txt.zip', + }, + }, + ReferenceDataset.hgmd: { + ENUMS: {'class': ['DM', 'DM?', 'DP', 'DFP', 'FP', 'R']}, + ReferenceGenome.GRCh37: { + DATASET_TYPES: frozenset([DatasetType.SNV_INDEL]), + VERSION: '1.0', + PATH: 'gs://seqr-reference-data-private/GRCh37/HGMD/HGMD_Pro_2023.1_hg19.vcf.gz', + }, + ReferenceGenome.GRCh38: { + DATASET_TYPES: frozenset([DatasetType.SNV_INDEL]), + VERSION: '1.0', + PATH: 'gs://seqr-reference-data-private/GRCh38/HGMD/HGMD_Pro_2023.1_hg38.vcf.gz', + }, + }, + ReferenceDataset.gnomad_exomes: { + ReferenceGenome.GRCh37: { + DATASET_TYPES: frozenset([DatasetType.SNV_INDEL]), + VERSION: '1.0', + PATH: 'gs://gcp-public-data--gnomad/release/2.1.1/ht/exomes/gnomad.exomes.r2.1.1.sites.ht', + }, + ReferenceGenome.GRCh38: { + DATASET_TYPES: frozenset([DatasetType.SNV_INDEL]), + VERSION: '1.0', + PATH: 'gs://gcp-public-data--gnomad/release/4.1/ht/exomes/gnomad.exomes.v4.1.sites.ht', + }, + }, + ReferenceDataset.gnomad_genomes: { + ReferenceGenome.GRCh37: { + DATASET_TYPES: frozenset([DatasetType.SNV_INDEL]), + VERSION: '1.0', + PATH: 'gs://gcp-public-data--gnomad/release/2.1.1/ht/genomes/gnomad.genomes.r2.1.1.sites.ht', + }, + ReferenceGenome.GRCh38: { + DATASET_TYPES: frozenset([DatasetType.SNV_INDEL]), + VERSION: '1.0', + PATH: 'gs://gcp-public-data--gnomad/release/4.1/ht/genomes/gnomad.genomes.v4.1.sites.ht', + }, + }, + 
ReferenceDataset.gnomad_qc: { + EXCLUDE_FROM_ANNOTATIONS: True, + ReferenceGenome.GRCh37: { + DATASET_TYPES: frozenset([DatasetType.SNV_INDEL]), + VERSION: '1.0', + PATH: 'gs://seqr-reference-data/gnomad_qc/GRCh37/gnomad.joint.high_callrate_common_biallelic_snps.pruned.mt', + }, + ReferenceGenome.GRCh38: { + DATASET_TYPES: frozenset([DatasetType.SNV_INDEL]), + VERSION: '1.0', + PATH: 'gs://gcp-public-data--gnomad/release/4.0/pca/gnomad.v4.0.pca_loadings.ht', + }, + }, + ReferenceDataset.mitomap: { + ReferenceGenome.GRCh38: { + DATASET_TYPES: frozenset([DatasetType.MITO]), + VERSION: '1.0', + # Downloaded via https://www.mitomap.org/foswiki/bin/view/MITOMAP/ConfirmedMutations + PATH: 'gs://seqr-reference-data/GRCh38/mitochondrial/MITOMAP/mitomap_confirmed_mutations_nov_2024.csv', + }, + }, + ReferenceDataset.gnomad_mito: { + ReferenceGenome.GRCh38: { + DATASET_TYPES: frozenset([DatasetType.MITO]), + VERSION: '1.0', + PATH: 'gs://gcp-public-data--gnomad/release/3.1/ht/genomes/gnomad.genomes.v3.1.sites.chrM.ht', + }, + }, + ReferenceDataset.gnomad_non_coding_constraint: { + IS_INTERVAL: True, + ReferenceGenome.GRCh38: { + DATASET_TYPES: frozenset([DatasetType.SNV_INDEL]), + VERSION: '1.0', + PATH: 'gs://gcp-public-data--gnomad/release/3.1/secondary_analyses/genomic_constraint/constraint_z_genome_1kb.qc.download.txt.gz', + }, + }, + ReferenceDataset.screen: { + ENUMS: { + 'region_type': [ + 'CTCF-bound', + 'CTCF-only', + 'DNase-H3K4me3', + 'PLS', + 'dELS', + 'pELS', + 'DNase-only', + 'low-DNase', + ], + }, + IS_INTERVAL: True, + ReferenceGenome.GRCh38: { + DATASET_TYPES: frozenset([DatasetType.SNV_INDEL]), + VERSION: '1.0', + PATH: 'https://downloads.wenglab.org/V3/GRCh38-cCREs.bed', + }, + }, + ReferenceDataset.local_constraint_mito: { + ReferenceGenome.GRCh38: { + DATASET_TYPES: frozenset([DatasetType.MITO]), + VERSION: '1.0', + PATH: 'https://www.biorxiv.org/content/biorxiv/early/2023/01/27/2022.12.16.520778/DC3/embed/media-3.zip', + }, + }, +} 
+CONFIG[ReferenceDatasetQuery.clinvar_path_variants] = { + EXCLUDE_FROM_ANNOTATIONS: True, + **CONFIG[ReferenceDataset.clinvar], +} +CONFIG[ReferenceDataset.gnomad_coding_and_noncoding] = { + EXCLUDE_FROM_ANNOTATIONS: True, + **CONFIG[ReferenceDataset.gnomad_genomes], +} +CONFIG[ReferenceDatasetQuery.high_af_variants] = { + EXCLUDE_FROM_ANNOTATIONS: True, + **CONFIG[ReferenceDataset.gnomad_genomes], +} diff --git a/v03_pipeline/lib/reference_datasets/screen.py b/v03_pipeline/lib/reference_datasets/screen.py new file mode 100644 index 000000000..4960e7adf --- /dev/null +++ b/v03_pipeline/lib/reference_datasets/screen.py @@ -0,0 +1,38 @@ +import shutil +import tempfile + +import hail as hl +import requests + +from v03_pipeline.lib.model import ReferenceGenome +from v03_pipeline.lib.reference_datasets.misc import ( + select_for_interval_reference_dataset, +) + + +def get_ht(path: str, reference_genome: ReferenceGenome) -> hl.Table: + with tempfile.NamedTemporaryFile( + suffix='.bed', + delete=False, + ) as tmp_file, requests.get( + path, + stream=True, + timeout=10, + ) as r: + shutil.copyfileobj(r.raw, tmp_file) + ht = hl.import_table( + tmp_file.name, + no_header=True, + types={ + 'f1': hl.tint32, + 'f2': hl.tint32, + }, + ) + return select_for_interval_reference_dataset( + ht, + reference_genome, + {'region_type': ht['f5'].split(',')}, + chrom_field='f0', + start_field='f1', + end_field='f2', + ) diff --git a/v03_pipeline/lib/reference_datasets/splice_ai.py b/v03_pipeline/lib/reference_datasets/splice_ai.py new file mode 100644 index 000000000..e0e3f6db1 --- /dev/null +++ b/v03_pipeline/lib/reference_datasets/splice_ai.py @@ -0,0 +1,33 @@ +import hail as hl + +from v03_pipeline.lib.model import ReferenceGenome +from v03_pipeline.lib.reference_datasets.misc import vcf_to_ht + + +def get_ht( + paths: list[str], + reference_genome: ReferenceGenome, +) -> hl.Table: + ht = vcf_to_ht(paths, reference_genome) + + # SpliceAI INFO field description from the VCF header: 
SpliceAIv1.3 variant annotation. These include + # delta scores (DS) and delta positions (DP) for acceptor gain (AG), acceptor loss (AL), donor gain (DG), and + # donor loss (DL). Format: ALLELE|SYMBOL|DS_AG|DS_AL|DS_DG|DS_DL|DP_AG|DP_AL|DP_DG|DP_DL + ds_start_index = 2 + ds_end_index = 6 + num_delta_scores = ds_end_index - ds_start_index + ht = ht.select( + delta_scores=ht.info.SpliceAI[0] + .split(delim='\\|')[ds_start_index:ds_end_index] + .map(hl.float32), + ) + ht = ht.annotate(delta_score=hl.max(ht.delta_scores)) + return ht.annotate( + splice_consequence_id=hl.if_else( + ht.delta_score > 0, + # Splice Consequence enum ID is the index of the max score + ht.delta_scores.index(ht.delta_score), + # If no score, use the last index for "No Consequence" + num_delta_scores, + ), + ).drop('delta_scores') diff --git a/v03_pipeline/lib/reference_datasets/topmed.py b/v03_pipeline/lib/reference_datasets/topmed.py new file mode 100644 index 000000000..4e0fddf11 --- /dev/null +++ b/v03_pipeline/lib/reference_datasets/topmed.py @@ -0,0 +1,20 @@ +import hail as hl + +from v03_pipeline.lib.misc.nested_field import parse_nested_field +from v03_pipeline.lib.model import ReferenceGenome +from v03_pipeline.lib.reference_datasets.misc import vcf_to_ht + +SELECT = { + 'AC': 'info.AC#', + 'AF': 'info.AF#', + 'AN': 'info.AN', + 'Hom': 'info.Hom#', + 'Het': 'info.Het#', +} + + +def get_ht(path: str, reference_genome: ReferenceGenome) -> hl.Table: + ht = vcf_to_ht(path, reference_genome) + return ht.select( + **{k: parse_nested_field(ht, v) for k, v in SELECT.items()}, + ) diff --git a/v03_pipeline/lib/tasks/base/base_update_variant_annotations_table.py b/v03_pipeline/lib/tasks/base/base_update_variant_annotations_table.py index 31c718034..cca50e609 100644 --- a/v03_pipeline/lib/tasks/base/base_update_variant_annotations_table.py +++ b/v03_pipeline/lib/tasks/base/base_update_variant_annotations_table.py @@ -2,30 +2,25 @@ import luigi from v03_pipeline.lib.annotations.misc import 
annotate_enums -from v03_pipeline.lib.annotations.rdc_dependencies import ( - get_rdc_annotation_dependencies, -) -from v03_pipeline.lib.model import ( - ReferenceDatasetCollection, -) from v03_pipeline.lib.paths import ( + valid_reference_dataset_path, variant_annotations_table_path, ) +from v03_pipeline.lib.reference_datasets.reference_dataset import ( + BaseReferenceDataset, + ReferenceDatasetQuery, +) from v03_pipeline.lib.tasks.base.base_update import BaseUpdateTask from v03_pipeline.lib.tasks.files import GCSorLocalTarget -from v03_pipeline.lib.tasks.reference_data.update_cached_reference_dataset_queries import ( - UpdateCachedReferenceDatasetQueries, +from v03_pipeline.lib.tasks.reference_data.updated_reference_dataset import ( + UpdatedReferenceDatasetTask, ) -from v03_pipeline.lib.tasks.reference_data.updated_reference_dataset_collection import ( - UpdatedReferenceDatasetCollectionTask, +from v03_pipeline.lib.tasks.reference_data.updated_reference_dataset_query import ( + UpdatedReferenceDatasetQueryTask, ) class BaseUpdateVariantAnnotationsTableTask(BaseUpdateTask): - @property - def rdc_annotation_dependencies(self) -> dict[str, hl.Table]: - return get_rdc_annotation_dependencies(self.dataset_type, self.reference_genome) - def output(self) -> luigi.Target: return GCSorLocalTarget( variant_annotations_table_path( @@ -35,20 +30,26 @@ def output(self) -> luigi.Target: ) def requires(self) -> list[luigi.Task]: - requirements = [ - self.clone(UpdateCachedReferenceDatasetQueries), - ] - requirements.extend( - self.clone( - UpdatedReferenceDatasetCollectionTask, - reference_dataset_collection=rdc, - ) - for rdc in ReferenceDatasetCollection.for_reference_genome_dataset_type( - self.reference_genome, - self.dataset_type, - ) - ) - return requirements + reqs = [] + for reference_dataset in BaseReferenceDataset.for_reference_genome_dataset_type( + self.reference_genome, + self.dataset_type, + ): + if isinstance(reference_dataset, ReferenceDatasetQuery): + 
reqs.append( + self.clone( + UpdatedReferenceDatasetQueryTask, + reference_dataset_query=reference_dataset, + ), + ) + else: + reqs.append( + self.clone( + UpdatedReferenceDatasetTask, + reference_dataset=reference_dataset, + ), + ) + return reqs def initialize_table(self) -> hl.Table: key_type = self.dataset_type.table_key_type(self.reference_genome) @@ -57,7 +58,6 @@ def initialize_table(self) -> hl.Table: key_type, key=key_type.fields, globals=hl.Struct( - paths=hl.Struct(), versions=hl.Struct(), enums=hl.Struct(), updates=hl.empty_set( @@ -79,28 +79,27 @@ def annotate_globals( ht: hl.Table, ) -> hl.Table: ht = ht.annotate_globals( - paths=hl.Struct(), versions=hl.Struct(), enums=hl.Struct(), ) - for rdc in ReferenceDatasetCollection.for_reference_genome_dataset_type( + for ( + reference_dataset + ) in BaseReferenceDataset.for_reference_genome_dataset_type_annotations( self.reference_genome, self.dataset_type, ): - rdc_ht = self.rdc_annotation_dependencies[f'{rdc.value}_ht'] - rdc_globals = rdc_ht.index_globals() + rd_ht = hl.read_table( + valid_reference_dataset_path(self.reference_genome, reference_dataset), + ) + rd_ht_globals = rd_ht.index_globals() ht = ht.select_globals( - paths=hl.Struct( - **ht.globals.paths, - **rdc_globals.paths, - ), versions=hl.Struct( **ht.globals.versions, - **rdc_globals.versions, + **{reference_dataset.name: rd_ht_globals.version}, ), enums=hl.Struct( **ht.globals.enums, - **rdc_globals.enums, + **{reference_dataset.name: rd_ht_globals.enums}, ), updates=ht.globals.updates, migrations=ht.globals.migrations, diff --git a/v03_pipeline/lib/tasks/base/base_update_variant_annotations_table_test.py b/v03_pipeline/lib/tasks/base/base_update_variant_annotations_table_test.py index ce7747768..1b50875e3 100644 --- a/v03_pipeline/lib/tasks/base/base_update_variant_annotations_table_test.py +++ b/v03_pipeline/lib/tasks/base/base_update_variant_annotations_table_test.py @@ -1,83 +1,55 @@ -import shutil -from unittest.mock import patch - 
import hail as hl import luigi.worker +import responses from v03_pipeline.lib.model import ( DatasetType, - ReferenceDatasetCollection, ReferenceGenome, ) -from v03_pipeline.lib.paths import valid_reference_dataset_collection_path +from v03_pipeline.lib.paths import valid_reference_dataset_query_path +from v03_pipeline.lib.reference_datasets.reference_dataset import ReferenceDatasetQuery from v03_pipeline.lib.tasks.base.base_update_variant_annotations_table import ( BaseUpdateVariantAnnotationsTableTask, ) from v03_pipeline.lib.tasks.files import GCSorLocalFolderTarget -from v03_pipeline.lib.test.mock_complete_task import MockCompleteTask -from v03_pipeline.lib.test.mocked_dataroot_testcase import MockedDatarootTestCase - -TEST_COMBINED_1 = 'v03_pipeline/var/test/reference_data/test_combined_1.ht' -TEST_HGMD_1 = 'v03_pipeline/var/test/reference_data/test_hgmd_1.ht' -TEST_INTERVAL_1 = 'v03_pipeline/var/test/reference_data/test_interval_1.ht' - +from v03_pipeline.lib.test.mock_clinvar_urls import mock_clinvar_urls +from v03_pipeline.lib.test.mocked_reference_datasets_testcase import ( + MockedReferenceDatasetsTestCase, +) -class BaseVariantAnnotationsTableTest(MockedDatarootTestCase): - def setUp(self) -> None: - super().setUp() - shutil.copytree( - TEST_COMBINED_1, - valid_reference_dataset_collection_path( - ReferenceGenome.GRCh38, - DatasetType.SNV_INDEL, - ReferenceDatasetCollection.COMBINED, - ), - ) - shutil.copytree( - TEST_HGMD_1, - valid_reference_dataset_collection_path( - ReferenceGenome.GRCh38, - DatasetType.SNV_INDEL, - ReferenceDatasetCollection.HGMD, - ), - ) - shutil.copytree( - TEST_INTERVAL_1, - valid_reference_dataset_collection_path( - ReferenceGenome.GRCh38, - DatasetType.SNV_INDEL, - ReferenceDatasetCollection.INTERVAL, - ), - ) - @patch( - 'v03_pipeline.lib.tasks.base.base_update_variant_annotations_table.UpdatedReferenceDatasetCollectionTask', - ) - @patch( - 
'v03_pipeline.lib.tasks.base.base_update_variant_annotations_table.UpdateCachedReferenceDatasetQueries', - ) +class BaseVariantAnnotationsTableTest(MockedReferenceDatasetsTestCase): + @responses.activate def test_should_create_initialized_table( self, - mock_update_crdqs_task, - mock_update_rdc_task, ) -> None: - mock_update_rdc_task.return_value = MockCompleteTask() - mock_update_crdqs_task.return_value = MockCompleteTask() - vat_task = BaseUpdateVariantAnnotationsTableTask( - reference_genome=ReferenceGenome.GRCh38, - dataset_type=DatasetType.SNV_INDEL, - ) - self.assertTrue('annotations.ht' in vat_task.output().path) - self.assertTrue(DatasetType.SNV_INDEL.value in vat_task.output().path) - self.assertFalse(vat_task.output().exists()) - self.assertFalse(vat_task.complete()) - - worker = luigi.worker.Worker() - worker.add(vat_task) - worker.run() - self.assertTrue(GCSorLocalFolderTarget(vat_task.output().path).exists()) - self.assertTrue(vat_task.complete()) - - ht = hl.read_table(vat_task.output().path) - self.assertEqual(ht.count(), 0) - self.assertEqual(list(ht.key.keys()), ['locus', 'alleles']) + with mock_clinvar_urls(ReferenceGenome.GRCh38): + vat_task = BaseUpdateVariantAnnotationsTableTask( + reference_genome=ReferenceGenome.GRCh38, + dataset_type=DatasetType.SNV_INDEL, + ) + self.assertTrue('annotations.ht' in vat_task.output().path) + self.assertFalse(vat_task.output().exists()) + self.assertFalse(vat_task.complete()) + + worker = luigi.worker.Worker() + worker.add(vat_task) + worker.run() + self.assertTrue(GCSorLocalFolderTarget(vat_task.output().path).exists()) + self.assertTrue(vat_task.complete()) + + ht = hl.read_table(vat_task.output().path) + self.assertEqual(ht.count(), 0) + self.assertEqual(list(ht.key.keys()), ['locus', 'alleles']) + self.assertEqual( + hl.eval( + hl.read_table( + valid_reference_dataset_query_path( + ReferenceGenome.GRCh38, + DatasetType.SNV_INDEL, + ReferenceDatasetQuery.clinvar_path_variants, + ), + ).globals.version, + ), 
+ '2024-11-11', + ) diff --git a/v03_pipeline/lib/tasks/reference_data/update_cached_reference_dataset_queries.py b/v03_pipeline/lib/tasks/reference_data/update_cached_reference_dataset_queries.py deleted file mode 100644 index dc9c2a17e..000000000 --- a/v03_pipeline/lib/tasks/reference_data/update_cached_reference_dataset_queries.py +++ /dev/null @@ -1,37 +0,0 @@ -import luigi -import luigi.util - -from v03_pipeline.lib.model import ( - CachedReferenceDatasetQuery, -) -from v03_pipeline.lib.tasks.base.base_loading_run_params import ( - BaseLoadingRunParams, -) -from v03_pipeline.lib.tasks.reference_data.updated_cached_reference_dataset_query import ( - UpdatedCachedReferenceDatasetQuery, -) - - -@luigi.util.inherits(BaseLoadingRunParams) -class UpdateCachedReferenceDatasetQueries(luigi.Task): - def __init__(self, *args, **kwargs): - super().__init__(*args, **kwargs) - self.checked_for_tasks = False - self.dynamic_crdq_tasks = set() - - def complete(self) -> bool: - return self.checked_for_tasks - - def run(self): - self.checked_for_tasks = True - for crdq in CachedReferenceDatasetQuery.for_reference_genome_dataset_type( - self.reference_genome, - self.dataset_type, - ): - self.dynamic_crdq_tasks.add( - UpdatedCachedReferenceDatasetQuery( - **self.param_kwargs, - crdq=crdq, - ), - ) - yield self.dynamic_crdq_tasks diff --git a/v03_pipeline/lib/tasks/reference_data/update_cached_reference_dataset_queries_test.py b/v03_pipeline/lib/tasks/reference_data/update_cached_reference_dataset_queries_test.py deleted file mode 100644 index d6bf33d36..000000000 --- a/v03_pipeline/lib/tasks/reference_data/update_cached_reference_dataset_queries_test.py +++ /dev/null @@ -1,124 +0,0 @@ -import unittest -from unittest import mock - -import luigi - -from v03_pipeline.lib.model import ( - CachedReferenceDatasetQuery, - DatasetType, - ReferenceGenome, - SampleType, -) -from v03_pipeline.lib.tasks.reference_data.update_cached_reference_dataset_queries import ( - 
UpdateCachedReferenceDatasetQueries, -) -from v03_pipeline.lib.test.mock_complete_task import MockCompleteTask - - -@mock.patch( - 'v03_pipeline.lib.tasks.reference_data.update_cached_reference_dataset_queries.UpdatedCachedReferenceDatasetQuery', -) -class UpdateCachedReferenceDatasetQueriesTest(unittest.TestCase): - def test_37_snv_indel(self, mock_crdq_task): - mock_crdq_task.return_value = MockCompleteTask() - worker = luigi.worker.Worker() - kwargs = { - 'sample_type': SampleType.WGS, - 'callset_path': '', - 'project_guids': [], - 'project_remap_paths': [], - 'project_pedigree_paths': [], - 'skip_validation': True, - 'run_id': '1', - } - task = UpdateCachedReferenceDatasetQueries( - reference_genome=ReferenceGenome.GRCh37, - dataset_type=DatasetType.SNV_INDEL, - **kwargs, - ) - worker.add(task) - worker.run() - self.assertTrue(task.complete()) - call_args_list = mock_crdq_task.call_args_list - self.assertEqual(len(call_args_list), 4) - self.assertEqual( - [x.kwargs['crdq'] for x in call_args_list], - list(CachedReferenceDatasetQuery), - ) - - def test_38_snv_indel(self, mock_crdq_task): - mock_crdq_task.return_value = MockCompleteTask() - worker = luigi.worker.Worker() - kwargs = { - 'sample_type': SampleType.WGS, - 'callset_path': '', - 'project_guids': [], - 'project_remap_paths': [], - 'project_pedigree_paths': [], - 'skip_validation': True, - 'run_id': '2', - } - task = UpdateCachedReferenceDatasetQueries( - reference_genome=ReferenceGenome.GRCh38, - dataset_type=DatasetType.SNV_INDEL, - **kwargs, - ) - worker.add(task) - worker.run() - self.assertTrue(task.complete()) - call_args_list = mock_crdq_task.call_args_list - self.assertEqual(len(call_args_list), 4) - self.assertEqual( - [x.kwargs['crdq'] for x in call_args_list], - list(CachedReferenceDatasetQuery), - ) - - def test_38_mito(self, mock_crdq_task): - mock_crdq_task.return_value = MockCompleteTask() - worker = luigi.worker.Worker() - kwargs = { - 'sample_type': SampleType.WGS, - 'callset_path': '', 
- 'project_guids': [], - 'project_remap_paths': [], - 'project_pedigree_paths': [], - 'skip_validation': True, - 'run_id': '3', - } - task = UpdateCachedReferenceDatasetQueries( - reference_genome=ReferenceGenome.GRCh38, - dataset_type=DatasetType.MITO, - **kwargs, - ) - worker.add(task) - worker.run() - self.assertTrue(task.complete()) - call_args_list = mock_crdq_task.call_args_list - self.assertEqual(len(call_args_list), 1) - self.assertEqual( - next(x.kwargs['crdq'] for x in call_args_list), - CachedReferenceDatasetQuery.CLINVAR_PATH_VARIANTS, - ) - - def test_38_sv(self, mock_crdq_task): - mock_crdq_task.return_value = MockCompleteTask() - worker = luigi.worker.Worker() - kwargs = { - 'sample_type': SampleType.WGS, - 'callset_path': '', - 'project_guids': [], - 'project_remap_paths': [], - 'project_pedigree_paths': [], - 'skip_validation': True, - 'run_id': '4', - } - task = UpdateCachedReferenceDatasetQueries( - reference_genome=ReferenceGenome.GRCh38, - dataset_type=DatasetType.SV, - **kwargs, - ) - worker.add(task) - worker.run() - self.assertTrue(task.complete()) - # assert no crdq tasks for this reference genome and dataset type - mock_crdq_task.assert_has_calls([]) diff --git a/v03_pipeline/lib/tasks/reference_data/update_variant_annotations_table_with_updated_reference_dataset.py b/v03_pipeline/lib/tasks/reference_data/update_variant_annotations_table_with_updated_reference_dataset.py index 9a0aeca2d..002eb8d62 100644 --- a/v03_pipeline/lib/tasks/reference_data/update_variant_annotations_table_with_updated_reference_dataset.py +++ b/v03_pipeline/lib/tasks/reference_data/update_variant_annotations_table_with_updated_reference_dataset.py @@ -3,15 +3,13 @@ from v03_pipeline.lib.annotations.fields import get_fields from v03_pipeline.lib.logger import get_logger -from v03_pipeline.lib.model import ReferenceDatasetCollection -from v03_pipeline.lib.reference_data.compare_globals import ( - Globals, - clinvar_versions_equal, - get_datasets_to_update, +from 
v03_pipeline.lib.paths import valid_reference_dataset_path +from v03_pipeline.lib.reference_datasets.reference_dataset import ( + BaseReferenceDataset, + ReferenceDataset, ) -from v03_pipeline.lib.reference_data.config import CONFIG -from v03_pipeline.lib.tasks.base.base_loading_run_params import ( - BaseLoadingRunParams, +from v03_pipeline.lib.tasks.base.base_loading_pipeline_params import ( + BaseLoadingPipelineParams, ) from v03_pipeline.lib.tasks.base.base_update_variant_annotations_table import ( BaseUpdateVariantAnnotationsTableTask, @@ -20,94 +18,89 @@ logger = get_logger(__name__) -@luigi.util.inherits(BaseLoadingRunParams) +@luigi.util.inherits(BaseLoadingPipelineParams) class UpdateVariantAnnotationsTableWithUpdatedReferenceDataset( BaseUpdateVariantAnnotationsTableTask, ): def __init__(self, *args, **kwargs): super().__init__(*args, **kwargs) - self._datasets_to_update = [] - - @property - def reference_dataset_collections(self) -> list[ReferenceDatasetCollection]: - return ReferenceDatasetCollection.for_reference_genome_dataset_type( - self.reference_genome, - self.dataset_type, - ) + self._datasets_to_update: set[str] = set() def complete(self) -> bool: - logger.info( - 'Checking if UpdateVariantAnnotationsTableWithUpdatedReferenceDataset is complete', - ) - self._datasets_to_update = [] - + reference_dataset_names = { + rd.name + for rd in BaseReferenceDataset.for_reference_genome_dataset_type_annotations( + self.reference_genome, + self.dataset_type, + ) + } if not super().complete(): - for rdc in self.reference_dataset_collections: - self._datasets_to_update.extend( - rdc.datasets( - self.dataset_type, - ), - ) + self._datasets_to_update = reference_dataset_names return False - - datasets_to_check = [ - dataset - for rdc in self.reference_dataset_collections - for dataset in rdc.datasets(self.dataset_type) - ] - - if any( - 'clinvar' in d for d in datasets_to_check - ) and not clinvar_versions_equal( - hl.read_table(self.output().path), - 
self.reference_genome, - self.dataset_type, - ): - datasets_to_check.remove('clinvar') - self._datasets_to_update.add('clinvar') - - annotations_ht_globals = Globals.from_ht( - hl.read_table(self.output().path), - datasets_to_check, + # Find datasets with mismatched versions + annotation_ht_versions = dict( + hl.eval(hl.read_table(self.output().path).globals.versions), ) - rdc_ht_globals = Globals.from_dataset_configs( - self.reference_genome, - datasets_to_check, + self._datasets_to_update = ( + reference_dataset_names ^ annotation_ht_versions.keys() ) - self._datasets_to_update.extend( - get_datasets_to_update( - annotations_ht_globals, - rdc_ht_globals, - ), + for dataset_name in reference_dataset_names & annotation_ht_versions.keys(): + if ( + ReferenceDataset(dataset_name).version(self.reference_genome) + != annotation_ht_versions[dataset_name] + ): + self._datasets_to_update.add(dataset_name) + logger.info( + f"Datasets to update: {', '.join(d for d in self._datasets_to_update)}", ) - logger.info(f'Datasets to update: {self._datasets_to_update}') return not self._datasets_to_update def update_table(self, ht: hl.Table) -> hl.Table: - for dataset in self._datasets_to_update: - if dataset in ht.row: - ht = ht.drop(dataset) - if dataset not in CONFIG: + for dataset_name in self._datasets_to_update: + if dataset_name in ht.row: + ht = ht.drop(dataset_name) + if dataset_name not in set(ReferenceDataset): continue - - rdc = ReferenceDatasetCollection.for_dataset(dataset, self.dataset_type) - rdc_ht = self.rdc_annotation_dependencies[f'{rdc.value}_ht'] - if rdc.requires_annotation: + reference_dataset = ReferenceDataset(dataset_name) + reference_dataset_ht = hl.read_table( + valid_reference_dataset_path(self.reference_genome, reference_dataset), + ) + if reference_dataset.is_keyed_by_interval: formatting_fn = next( x for x in self.dataset_type.formatting_annotation_fns( self.reference_genome, ) - if x.__name__ == dataset + if x.__name__ == reference_dataset.name ) ht 
= ht.annotate( **get_fields( ht, [formatting_fn], - **self.rdc_annotation_dependencies, + **{f'{reference_dataset.name}_ht': reference_dataset_ht}, **self.param_kwargs, ), ) else: - ht = ht.join(rdc_ht.select(dataset), 'left') + if reference_dataset.select: + reference_dataset_ht = reference_dataset.select( + self.reference_genome, + self.dataset_type, + reference_dataset_ht, + ) + if reference_dataset.filter: + reference_dataset_ht = reference_dataset.filter( + self.reference_genome, + self.dataset_type, + reference_dataset_ht, + ) + reference_dataset_ht = reference_dataset_ht.select( + **{ + f'{reference_dataset.name}': hl.Struct( + **reference_dataset_ht.row_value, + ), + }, + ) + ht = ht.join(reference_dataset_ht, 'left') + return self.annotate_globals(ht) diff --git a/v03_pipeline/lib/tasks/reference_data/update_variant_annotations_table_with_updated_reference_dataset_test.py b/v03_pipeline/lib/tasks/reference_data/update_variant_annotations_table_with_updated_reference_dataset_test.py index e008aaa8f..a0076f946 100644 --- a/v03_pipeline/lib/tasks/reference_data/update_variant_annotations_table_with_updated_reference_dataset_test.py +++ b/v03_pipeline/lib/tasks/reference_data/update_variant_annotations_table_with_updated_reference_dataset_test.py @@ -1,11 +1,12 @@ -import shutil -from unittest import mock +from unittest.mock import patch import hail as hl import luigi.worker +import responses from v03_pipeline.lib.annotations.enums import ( BIOTYPES, + CLINVAR_ASSERTIONS, CLINVAR_PATHOGENICITIES, FIVEUTR_CONSEQUENCES, LOF_FILTERS, @@ -17,725 +18,90 @@ ) from v03_pipeline.lib.model import ( DatasetType, - ReferenceDatasetCollection, ReferenceGenome, - SampleType, ) -from v03_pipeline.lib.paths import valid_reference_dataset_collection_path -from v03_pipeline.lib.reference_data.clinvar import CLINVAR_ASSERTIONS -from v03_pipeline.lib.reference_data.config import CONFIG +from v03_pipeline.lib.reference_datasets.reference_dataset import ( + BaseReferenceDataset, + 
ReferenceDataset, +) from v03_pipeline.lib.tasks.files import GCSorLocalFolderTarget from v03_pipeline.lib.tasks.reference_data.update_variant_annotations_table_with_updated_reference_dataset import ( UpdateVariantAnnotationsTableWithUpdatedReferenceDataset, ) -from v03_pipeline.lib.test.mock_complete_task import MockCompleteTask -from v03_pipeline.lib.test.mocked_dataroot_testcase import MockedDatarootTestCase +from v03_pipeline.lib.test.mock_clinvar_urls import mock_clinvar_urls +from v03_pipeline.lib.test.mocked_reference_datasets_testcase import ( + MockedReferenceDatasetsTestCase, +) -TEST_COMBINED_1 = 'v03_pipeline/var/test/reference_data/test_combined_1.ht' -TEST_HGMD_1 = 'v03_pipeline/var/test/reference_data/test_hgmd_1.ht' -TEST_INTERVAL_1 = 'v03_pipeline/var/test/reference_data/test_interval_1.ht' -TEST_COMBINED_MITO_1 = 'v03_pipeline/var/test/reference_data/test_combined_mito_1.ht' -TEST_INTERVAL_MITO_1 = 'v03_pipeline/var/test/reference_data/test_interval_mito_1.ht' -TEST_COMBINED_37 = 'v03_pipeline/var/test/reference_data/test_combined_37.ht' -TEST_HGMD_37 = 'v03_pipeline/var/test/reference_data/test_hgmd_37.ht' TEST_SNV_INDEL_VCF = 'v03_pipeline/var/test/callsets/1kg_30variants.vcf' -TEST_MITO_MT = 'v03_pipeline/var/test/callsets/mito_1.mt' - -MOCK_CADD_CONFIG = { - 'version': 'v1.6', - 'select': ['PHRED'], - 'source_path': 'gs://seqr-reference-data/GRCh37/CADD/CADD_snvs_and_indels.v1.6.ht', - 'custom_import': lambda *_: hl.Table.parallelize( - [], - hl.tstruct( - locus=hl.tlocus('GRCh38'), - alleles=hl.tarray(hl.tstr), - PHRED=hl.tfloat32, - ), - key=['locus', 'alleles'], - globals=hl.Struct( - version='v1.6', - ), +BASE_ENUMS = { + 'sorted_motif_feature_consequences': hl.Struct( + consequence_term=MOTIF_CONSEQUENCE_TERMS, ), -} -MOCK_CLINVAR_CONFIG = { - **CONFIG['clinvar']['38'], - 'source_path': 'https://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh37/clinvar.vcf.gz', - 'custom_import': lambda *_: hl.Table.parallelize( - [], - hl.tstruct( - 
locus=hl.tlocus('GRCh38'), - alleles=hl.tarray(hl.tstr), - info=hl.tstruct( - ALLELEID=hl.tint32, - CLNSIG=hl.tarray(hl.tstr), - CLNSIGCONF=hl.tarray(hl.tstr), - CLNREVSTAT=hl.tarray(hl.tstr), - ), - submitters=hl.tarray(hl.tstr), - conditions=hl.tarray(hl.tstr), - ), - key=['locus', 'alleles'], - globals=hl.Struct( - version='2023-11-26', - ), + 'sorted_regulatory_feature_consequences': hl.Struct( + biotype=REGULATORY_BIOTYPES, + consequence_term=REGULATORY_CONSEQUENCE_TERMS, ), -} - -MOCK_EIGEN_CONFIG = { - 'select': {'Eigen_phred': 'info.Eigen-phred'}, - 'source_path': 'gs://seqr-reference-data/GRCh37/eigen/EIGEN_coding_noncoding.grch37.ht', - 'custom_import': lambda *_: hl.Table.parallelize( - [], - hl.tstruct( - locus=hl.tlocus('GRCh38'), - alleles=hl.tarray(hl.tstr), - info=hl.tstruct(**{'Eigen-phred': hl.tfloat32}), + 'sorted_transcript_consequences': hl.Struct( + biotype=BIOTYPES, + consequence_term=TRANSCRIPT_CONSEQUENCE_TERMS, + loftee=hl.Struct( + lof_filter=LOF_FILTERS, ), - key=['locus', 'alleles'], - globals=hl.Struct(), - ), -} - -MOCK_EXAC_CONFIG = { - **CONFIG['exac']['38'], - 'source_path': 'gs://seqr-reference-data/GRCh37/gnomad/ExAC.r1.sites.vep.ht', - 'custom_import': lambda *_: hl.Table.parallelize( - [], - hl.tstruct( - locus=hl.tlocus('GRCh38'), - alleles=hl.tarray(hl.tstr), - info=hl.tstruct( - AF_POPMAX=hl.tfloat64, - AF=hl.tarray(hl.tfloat64), - AC_Adj=hl.tarray(hl.tint32), - AC_Het=hl.tarray(hl.tint32), - AC_Hom=hl.tarray(hl.tint32), - AC_Hemi=hl.tarray(hl.tint32), - AN_Adj=hl.tint32, - ), - a_index=hl.tint32, + utrannotator=hl.Struct( + fiveutr_consequence=FIVEUTR_CONSEQUENCES, ), - key=['locus', 'alleles'], - globals=hl.Struct(), ), } -MOCK_MPC_CONFIG = { - **CONFIG['mpc']['38'], - 'source_path': 'gs://seqr-reference-data/GRCh37/MPC/fordist_constraint_official_mpc_values.ht', - 'custom_import': lambda *_: hl.Table.parallelize( - [], - hl.tstruct( - locus=hl.tlocus('GRCh38'), - alleles=hl.tarray(hl.tstr), - info=hl.tstruct( - 
MPC=hl.tstr, - ), - ), - key=['locus', 'alleles'], - globals=hl.Struct(), - ), -} -MOCK_PRIMATE_AI_CONFIG = { - 'version': 'v0.2', - 'select': {'score': 'info.score'}, - 'source_path': 'gs://seqr-reference-data/GRCh37/primate_ai/PrimateAI_scores_v0.2.ht', - 'custom_import': lambda *_: hl.Table.parallelize( - [], - hl.tstruct( - locus=hl.tlocus('GRCh38'), - alleles=hl.tarray(hl.tstr), - info=hl.tstruct( - score=hl.tfloat64, - ), - ), - key=['locus', 'alleles'], - globals=hl.Struct( - version='v0.2', - ), - ), -} -MOCK_SPLICE_AI_CONFIG = { - **CONFIG['splice_ai']['38'], - 'source_path': 'gs://seqr-reference-data/GRCh37/spliceai/spliceai_scores.ht', - 'custom_import': lambda *_: hl.Table.parallelize( - [], - hl.tstruct( - locus=hl.tlocus('GRCh38'), - alleles=hl.tarray(hl.tstr), - info=hl.tstruct( - max_DS=hl.tfloat64, - splice_consequence=hl.tstr, - ), - ), - key=['locus', 'alleles'], - globals=hl.Struct(), - ), -} -MOCK_TOPMED_CONFIG = { - **CONFIG['topmed']['38'], - 'source_path': 'gs://seqr-reference-data/GRCh37/TopMed/bravo-dbsnp-all.removed_chr_prefix.liftunder_GRCh37.ht', - 'custom_import': lambda *_: hl.Table.parallelize( - [], - hl.tstruct( - locus=hl.tlocus('GRCh38'), - alleles=hl.tarray(hl.tstr), - info=hl.tstruct( - AC=hl.tint32, - AN=hl.tint32, - AF=hl.tfloat64, - Hom=hl.tint32, - Het=hl.tint32, - ), - ), - key=['locus', 'alleles'], - globals=hl.Struct(), - ), -} -MOCK_CONFIG = { - 'cadd': { - '37': MOCK_CADD_CONFIG, - '38': MOCK_CADD_CONFIG, - }, - 'clinvar': { - '37': MOCK_CLINVAR_CONFIG, - '38': MOCK_CLINVAR_CONFIG, - }, - 'dbnsfp': { - '37': { - **CONFIG['dbnsfp']['37'], - 'version': '2.9.3', - 'source_path': 'gs://seqr-reference-data/GRCh37/dbNSFP/v2.9.3/dbNSFP2.9.3_variant.ht', - 'custom_import': lambda *_: hl.Table.parallelize( - [], - hl.tstruct( - locus=hl.tlocus('GRCh38'), - alleles=hl.tarray(hl.tstr), - REVEL_score=hl.tstr, - SIFT_score=hl.tstr, - Polyphen2_HVAR_score=hl.tstr, - MutationTaster_pred=hl.tstr, - ), - key=['locus', 'alleles'], - 
globals=hl.Struct( - version='2.9.3', - ), - ), - }, - '38': { - **CONFIG['dbnsfp']['38'], - 'version': '2.9.3', - 'source_path': 'gs://seqr-reference-data/GRCh37/dbNSFP/v2.9.3/dbNSFP2.9.3_variant.ht', - 'custom_import': lambda *_: hl.Table.parallelize( - [], - hl.tstruct( - locus=hl.tlocus('GRCh38'), - alleles=hl.tarray(hl.tstr), - REVEL_score=hl.tstr, - SIFT_score=hl.tstr, - Polyphen2_HVAR_score=hl.tstr, - MutationTaster_pred=hl.tstr, - VEST4_score=hl.tstr, - MutPred_score=hl.tstr, - fathmm_MKL_coding_score=hl.tfloat64, - ), - key=['locus', 'alleles'], - globals=hl.Struct( - version='2.9.3', - ), - ), - }, - }, - 'eigen': { - '37': MOCK_EIGEN_CONFIG, - '38': MOCK_EIGEN_CONFIG, - }, - 'exac': { - '37': MOCK_EXAC_CONFIG, - '38': MOCK_EXAC_CONFIG, - }, - 'gnomad_exomes': { - '37': { - **CONFIG['gnomad_exomes']['37'], - 'source_path': 'gs://gcp-public-data--gnomad/release/2.1.1/ht/exomes/gnomad.exomes.r2.1.1.sites.ht', - 'custom_import': lambda *_: hl.Table.parallelize( - [], - hl.tstruct( - locus=hl.tlocus('GRCh38'), - alleles=hl.tarray(hl.tstr), - freq=hl.tarray( - hl.tstruct( - AF=hl.tfloat64, - AN=hl.tint32, - AC=hl.tint32, - homozygote_count=hl.tint32, - ), - ), - popmax=hl.tarray( - hl.tstruct( - AF=hl.tfloat64, - AN=hl.tint32, - AC=hl.tint32, - homozygote_count=hl.tint32, - pop=hl.tstr, - ), - ), - faf=hl.tarray(hl.tstruct(faf95=hl.tfloat64)), - ), - key=['locus', 'alleles'], - globals=hl.Struct( - freq_index_dict={'gnomad': 0, 'gnomad_male': 1}, - popmax_index_dict={'gnomad': 0}, - ), - ), - }, - '38': { - **CONFIG['gnomad_exomes']['38'], - 'version': '4.1', - 'source_path': 'gs://gcp-public-data--gnomad/release/4.1/ht/exomes/gnomad.exomes.v4.1.sites.ht', - 'custom_import': lambda *_: hl.Table.parallelize( - [], - hl.tstruct( - locus=hl.tlocus('GRCh38'), - alleles=hl.tarray(hl.tstr), - freq=hl.tarray( - hl.tstruct( - AF=hl.tfloat64, - AN=hl.tint32, - AC=hl.tint32, - homozygote_count=hl.tint32, - ), - ), - grpmax=hl.tstruct( - gnomad=hl.tstruct( - 
AF=hl.tfloat64, - AN=hl.tint32, - AC=hl.tint32, - homozygote_count=hl.tint32, - pop=hl.tstr, - ), - ), - faf=hl.tarray(hl.tstruct(faf95=hl.tfloat64)), - ), - key=['locus', 'alleles'], - globals=hl.Struct( - freq_index_dict={'adj': 0, 'XY_adj': 1}, - faf_index_dict={'adj': 0}, - ), - ), - }, - }, - 'gnomad_genomes': { - '37': { - **CONFIG['gnomad_genomes']['37'], - 'source_path': 'gs://gcp-public-data--gnomad/release/2.1.1/ht/genomes/gnomad.genomes.r2.1.1.sites.ht', - 'custom_import': lambda *_: hl.Table.parallelize( - [], - hl.tstruct( - locus=hl.tlocus('GRCh38'), - alleles=hl.tarray(hl.tstr), - freq=hl.tarray( - hl.tstruct( - AF=hl.tfloat64, - AN=hl.tint32, - AC=hl.tint32, - homozygote_count=hl.tint32, - ), - ), - popmax=hl.tarray( - hl.tstruct( - AF=hl.tfloat64, - AN=hl.tint32, - AC=hl.tint32, - homozygote_count=hl.tint32, - pop=hl.tstr, - ), - ), - faf=hl.tarray(hl.tstruct(faf95=hl.tfloat64)), - ), - key=['locus', 'alleles'], - globals=hl.Struct( - freq_index_dict={'gnomad': 0, 'gnomad_male': 1}, - popmax_index_dict={'gnomad': 0}, - ), - ), - }, - '38': { - **CONFIG['gnomad_genomes']['38'], - 'version': '4.1', - 'source_path': 'gs://gcp-public-data--gnomad/release/4.1/ht/genomes/gnomad.genomes.v4.1.sites.ht', - 'custom_import': lambda *_: hl.Table.parallelize( - [], - hl.tstruct( - locus=hl.tlocus('GRCh38'), - alleles=hl.tarray(hl.tstr), - freq=hl.tarray( - hl.tstruct( - AF=hl.tfloat64, - AN=hl.tint32, - AC=hl.tint32, - homozygote_count=hl.tint32, +class UpdateVATWithUpdatedReferenceDatasets(MockedReferenceDatasetsTestCase): + @responses.activate + def test_create_empty_annotations_table(self): + with patch.object( + BaseReferenceDataset, + 'for_reference_genome_dataset_type_annotations', + return_value=[ReferenceDataset.clinvar], + ), mock_clinvar_urls(ReferenceGenome.GRCh38): + task = UpdateVariantAnnotationsTableWithUpdatedReferenceDataset( + reference_genome=ReferenceGenome.GRCh38, + dataset_type=DatasetType.SNV_INDEL, + ) + worker = luigi.worker.Worker() + 
worker.add(task) + worker.run() + self.assertTrue(GCSorLocalFolderTarget(task.output().path).exists()) + self.assertTrue(task.complete()) + + ht = hl.read_table(task.output().path) + self.assertCountEqual( + ht.globals.collect(), + [ + hl.Struct( + versions=hl.Struct(clinvar='2024-11-11'), + enums=hl.Struct( + clinvar=hl.Struct( + assertion=CLINVAR_ASSERTIONS, + pathogenicity=CLINVAR_PATHOGENICITIES, + ), + **BASE_ENUMS, ), + migrations=[], + updates=set(), ), - grpmax=hl.tstruct( - AF=hl.tfloat64, - AN=hl.tint32, - AC=hl.tint32, - homozygote_count=hl.tint32, - pop=hl.tstr, - ), - faf=hl.tarray(hl.tstruct(faf95=hl.tfloat64)), - ), - key=['locus', 'alleles'], - globals=hl.Struct( - freq_index_dict={'adj': 0, 'XY_adj': 1}, - faf_index_dict={'adj': 0}, - ), - ), - }, - }, - 'mpc': { - '37': MOCK_MPC_CONFIG, - '38': MOCK_MPC_CONFIG, - }, - 'primate_ai': { - '37': MOCK_PRIMATE_AI_CONFIG, - '38': MOCK_PRIMATE_AI_CONFIG, - }, - 'splice_ai': { - '37': MOCK_SPLICE_AI_CONFIG, - '38': MOCK_SPLICE_AI_CONFIG, - }, - 'topmed': { - '37': MOCK_TOPMED_CONFIG, - '38': MOCK_TOPMED_CONFIG, - }, - 'hgmd': { - '37': { - **CONFIG['hgmd']['37'], - 'source_path': 'gs://seqr-reference-data-private/GRCh37/HGMD/HGMD_Pro_2023.1_hg19.vcf.gz', - 'custom_import': lambda *_: hl.Table.parallelize( - [], - hl.tstruct( - locus=hl.tlocus('GRCh38'), - alleles=hl.tarray(hl.tstr), - rsid=hl.tstr, - info=hl.tstruct( - CLASS=hl.tstr, - ), - ), - key=['locus', 'alleles'], - globals=hl.Struct(), - ), - }, - '38': { - **CONFIG['hgmd']['38'], - 'source_path': 'gs://seqr-reference-data-private/GRCh38/HGMD/HGMD_Pro_2023.1_hg38.vcf.gz', - 'custom_import': lambda *_: hl.Table.parallelize( - [], - hl.tstruct( - locus=hl.tlocus('GRCh38'), - alleles=hl.tarray(hl.tstr), - rsid=hl.tstr, - info=hl.tstruct( - CLASS=hl.tstr, - ), - ), - key=['locus', 'alleles'], - globals=hl.Struct(), - ), - }, - }, - 'gnomad_non_coding_constraint': { - '38': { - 'select': {'z_score': 'target'}, - 'source_path': 
'gs://seqr-reference-data/GRCh38/gnomad_nc_constraint/gnomad_non-coding_constraint_z_scores.ht', - 'custom_import': lambda *_: hl.Table.parallelize( - [], - hl.tstruct( - locus=hl.tlocus('GRCh38'), - alleles=hl.tarray(hl.tstr), - target=hl.tfloat64, - ), - key=['locus', 'alleles'], - globals=hl.Struct(), - ), - }, - }, - 'screen': { - '38': { - **CONFIG['screen']['38'], - 'source_path': 'gs://seqr-reference-data/GRCh38/ccREs/GRCh38-ccREs.ht', - 'custom_import': lambda *_: hl.Table.parallelize( - [], - hl.tstruct( - locus=hl.tlocus('GRCh38'), - alleles=hl.tarray(hl.tstr), - target=hl.tarray(hl.tstr), - ), - key=['locus', 'alleles'], - globals=hl.Struct(), - ), - }, - }, -} -MOCK_CONFIG_MITO = { - 'clinvar_mito': { - '38': { - **CONFIG['clinvar_mito']['38'], - 'source_path': 'https://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh38/clinvar.vcf.gz', - 'custom_import': lambda *_: hl.Table.parallelize( - [], - hl.tstruct( - locus=hl.tlocus('GRCh38'), - alleles=hl.tarray(hl.tstr), - info=hl.tstruct( - ALLELEID=hl.tint32, - CLNSIG=hl.tarray(hl.tstr), - CLNSIGCONF=hl.tarray(hl.tstr), - CLNREVSTAT=hl.tarray(hl.tstr), - ), - submitters=hl.tarray(hl.tstr), - conditions=hl.tarray(hl.tstr), - ), - key=['locus', 'alleles'], - globals=hl.Struct( - version='2023-07-22', - ), - ), - }, - }, - 'dbnsfp_mito': { - '38': { - **CONFIG['dbnsfp_mito']['38'], - 'source_path': 'gs://seqr-reference-data/GRCh38/dbNSFP/v4.2/dbNSFP4.2a_variant.with_new_scores.ht', - 'custom_import': lambda *_: hl.Table.parallelize( - [], - hl.tstruct( - locus=hl.tlocus('GRCh38'), - alleles=hl.tarray(hl.tstr), - SIFT_score=hl.tstr, - MutationTaster_pred=hl.tstr, - ), - key=['locus', 'alleles'], - globals=hl.Struct( - version='4.2', - ), - ), - }, - }, - 'gnomad_mito': { - '38': { - **CONFIG['gnomad_mito']['38'], - 'source_path': 'gs://gcp-public-data--gnomad/release/3.1/ht/genomes/gnomad.genomes.v3.1.sites.chrM.ht', - 'custom_import': lambda *_: hl.Table.parallelize( - [], - hl.tstruct( - locus=hl.tlocus('GRCh38'), - 
alleles=hl.tarray(hl.tstr), - AN=hl.tint64, - AC_hom=hl.tint64, - AC_het=hl.tint64, - AF_hom=hl.tfloat32, - AF_het=hl.tfloat32, - max_hl=hl.tfloat64, - ), - key=['locus', 'alleles'], - globals=hl.Struct(), - ), - }, - }, - 'helix_mito': { - '38': { - **CONFIG['helix_mito']['38'], - 'source_path': 'gs://seqr-reference-data/GRCh38/mitochondrial/Helix/HelixMTdb_20200327.ht', - 'custom_import': lambda *_: hl.Table.parallelize( - [], - hl.tstruct( - locus=hl.tlocus('GRCh38'), - alleles=hl.tarray(hl.tstr), - counts_hom=hl.tint32, - counts_het=hl.tint32, - AF_hom=hl.tfloat64, - AF_het=hl.tfloat64, - AN=hl.tint32, - max_ARF=hl.tfloat64, - ), - key=['locus', 'alleles'], - globals=hl.Struct(), - ), - }, - }, - 'hmtvar': { - '38': { - **CONFIG['hmtvar']['38'], - 'source_path': 'gs://seqr-reference-data/GRCh38/mitochondrial/HmtVar/HmtVar%20Jan.%2010%202022.ht', - 'custom_import': lambda *_: hl.Table.parallelize( - [], - hl.tstruct( - locus=hl.tlocus('GRCh38'), - alleles=hl.tarray(hl.tstr), - disease_score=hl.tfloat64, - ), - key=['locus', 'alleles'], - globals=hl.Struct(), - ), - }, - }, - 'mitomap': { - '38': { - **CONFIG['mitomap']['38'], - 'source_path': 'gs://seqr-reference-data/GRCh38/mitochondrial/MITOMAP/mitomap-confirmed-mutations-2022-02-04.ht', - 'custom_import': lambda *_: hl.Table.parallelize( - [], - hl.tstruct( - locus=hl.tlocus('GRCh38'), - alleles=hl.tarray(hl.tstr), - pathogenic=hl.tbool, - ), - key=['locus', 'alleles'], - globals=hl.Struct(), - ), - }, - }, - 'mitimpact': { - '38': { - **CONFIG['mitimpact']['38'], - 'source_path': 'gs://seqr-reference-data/GRCh38/mitochondrial/MitImpact/MitImpact_db_3.1.3.ht', - 'custom_import': lambda *_: hl.Table.parallelize( - [], - hl.tstruct( - locus=hl.tlocus('GRCh38'), - alleles=hl.tarray(hl.tstr), - APOGEE2_score=hl.tfloat64, - ), - key=['locus', 'alleles'], - globals=hl.Struct(), - ), - }, - }, - 'high_constraint_region_mito': { - '38': { - **CONFIG['high_constraint_region_mito']['38'], - 'source_path': 
'gs://seqr-reference-data/GRCh38/mitochondrial/Helix high constraint intervals Feb-15-2022.tsv', - 'custom_import': lambda *_: hl.Table.parallelize( - [], - hl.tstruct( - interval=hl.tstr, - ), - key=['interval'], - globals=hl.Struct(), - ), - }, - }, - 'local_constraint_mito': { - '38': { - **CONFIG['local_constraint_mito']['38'], - 'custom_import': lambda *_: hl.Table.parallelize( - [], - hl.tstruct( - locus=hl.tlocus('GRCh38'), - alleles=hl.tarray(hl.tstr), - MLC_score=hl.tfloat32, - ), - key=['locus', 'alleles'], - globals=hl.Struct(), - ), - }, - }, -} - - -@mock.patch( - 'v03_pipeline.lib.tasks.base.base_update_variant_annotations_table.UpdatedReferenceDatasetCollectionTask', -) -@mock.patch( - 'v03_pipeline.lib.tasks.base.base_update_variant_annotations_table.UpdateCachedReferenceDatasetQueries', -) -@mock.patch( - 'v03_pipeline.lib.tasks.base.base_update_variant_annotations_table.BaseUpdateVariantAnnotationsTableTask.initialize_table', -) -class UpdateVATWithUpdatedRDC(MockedDatarootTestCase): - def setUp(self) -> None: - super().setUp() - shutil.copytree( - TEST_COMBINED_1, - valid_reference_dataset_collection_path( - ReferenceGenome.GRCh38, - DatasetType.SNV_INDEL, - ReferenceDatasetCollection.COMBINED, - ), - ) - shutil.copytree( - TEST_HGMD_1, - valid_reference_dataset_collection_path( - ReferenceGenome.GRCh38, - DatasetType.SNV_INDEL, - ReferenceDatasetCollection.HGMD, - ), - ) - shutil.copytree( - TEST_INTERVAL_1, - valid_reference_dataset_collection_path( - ReferenceGenome.GRCh38, - DatasetType.SNV_INDEL, - ReferenceDatasetCollection.INTERVAL, - ), - ) - shutil.copytree( - TEST_COMBINED_MITO_1, - valid_reference_dataset_collection_path( - ReferenceGenome.GRCh38, - DatasetType.MITO, - ReferenceDatasetCollection.COMBINED, - ), - ) - shutil.copytree( - TEST_INTERVAL_MITO_1, - valid_reference_dataset_collection_path( - ReferenceGenome.GRCh38, - DatasetType.MITO, - ReferenceDatasetCollection.INTERVAL, - ), - ) - shutil.copytree( - TEST_COMBINED_37, - 
valid_reference_dataset_collection_path( - ReferenceGenome.GRCh37, - DatasetType.SNV_INDEL, - ReferenceDatasetCollection.COMBINED, - ), - ) - shutil.copytree( - TEST_HGMD_37, - valid_reference_dataset_collection_path( - ReferenceGenome.GRCh37, - DatasetType.SNV_INDEL, - ReferenceDatasetCollection.HGMD, - ), - ) + ], + ) - @mock.patch.dict( - 'v03_pipeline.lib.reference_data.compare_globals.CONFIG', - MOCK_CONFIG, + @responses.activate + @patch( + 'v03_pipeline.lib.tasks.base.base_update_variant_annotations_table.BaseUpdateVariantAnnotationsTableTask.initialize_table', ) - @mock.patch( - 'v03_pipeline.lib.tasks.reference_data.update_variant_annotations_table_with_updated_reference_dataset.clinvar_versions_equal', - ) - def test_update_vat_with_updated_rdc_snv_indel_38( + def test_update_vat_snv_indel_38( self, - mock_clinvar_versions_equal, - mock_initialize_table, - mock_update_crdqs_task, - mock_update_rdc_task, + mock_initialize_annotations_ht, ): - mock_clinvar_versions_equal.return_value = True - mock_update_rdc_task.return_value = MockCompleteTask() - mock_update_crdqs_task.return_value = MockCompleteTask() - mock_initialize_table.return_value = hl.Table.parallelize( + mock_initialize_annotations_ht.return_value = hl.Table.parallelize( [ hl.Struct( locus=hl.Locus( @@ -752,212 +118,141 @@ def test_update_vat_with_updated_rdc_snv_indel_38( ), key=['locus', 'alleles'], globals=hl.Struct( - paths=hl.Struct(), versions=hl.Struct(), enums=hl.Struct(), updates=hl.empty_set(hl.tstruct(callset=hl.tstr, project_guid=hl.tstr)), migrations=hl.empty_array(hl.tstr), ), ) - task = UpdateVariantAnnotationsTableWithUpdatedReferenceDataset( - reference_genome=ReferenceGenome.GRCh38, - dataset_type=DatasetType.SNV_INDEL, - sample_type=SampleType.WGS, - callset_path=TEST_SNV_INDEL_VCF, - project_guids=[], - project_remap_paths=[], - project_pedigree_paths=[], - skip_validation=True, - run_id='3', - ) - worker = luigi.worker.Worker() - worker.add(task) - worker.run() - 
self.assertTrue(GCSorLocalFolderTarget(task.output().path).exists()) - self.assertTrue(task.complete()) - ht = hl.read_table(task.output().path) - self.assertCountEqual( - ht.collect(), - [ - hl.Struct( - locus=hl.Locus( - contig='chr1', - position=871269, - reference_genome='GRCh38', - ), - alleles=['A', 'C'], - cadd=hl.Struct(PHRED=2), - clinvar=hl.Struct( - alleleId=None, - conflictingPathogenicities=None, - goldStars=None, - pathogenicity_id=None, - assertion_ids=None, - submitters=None, - conditions=None, - ), - dbnsfp=hl.Struct( - REVEL_score=0.0430000014603138, - SIFT_score=None, - Polyphen2_HVAR_score=None, - MutationTaster_pred_id=0, - VEST4_score=None, - MutPred_score=None, - fathmm_MKL_coding_score=None, - ), - eigen=hl.Struct(Eigen_phred=1.5880000591278076), - exac=hl.Struct( - AF_POPMAX=0.0004100881633348763, - AF=0.0004633000062312931, - AC_Adj=51, - AC_Het=51, - AC_Hom=0, - AC_Hemi=None, - AN_Adj=108288, - ), - gnomad_exomes=hl.Struct( - AF=0.00012876000255346298, - AN=240758, - AC=31, - Hom=0, - AF_POPMAX_OR_GLOBAL=0.0001119549197028391, - FAF_AF=9.315000352216884e-05, - Hemi=0, - ), - gnomad_genomes=None, - mpc=None, - primate_ai=None, - splice_ai=hl.Struct( - delta_score=0.029999999329447746, - splice_consequence_id=3, - ), - topmed=None, - gnomad_non_coding_constraint=hl.Struct(z_score=0.75), - screen=hl.Struct(region_type_ids=[1]), - hgmd=hl.Struct(accession='abcdefg', class_id=3), - ), - ], - ) - self.assertCountEqual( - ht.globals.collect(), - [ - hl.Struct( - paths=hl.Struct( - cadd='gs://seqr-reference-data/GRCh37/CADD/CADD_snvs_and_indels.v1.6.ht', - clinvar='https://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh37/clinvar.vcf.gz', - dbnsfp='gs://seqr-reference-data/GRCh37/dbNSFP/v2.9.3/dbNSFP2.9.3_variant.ht', - eigen='gs://seqr-reference-data/GRCh37/eigen/EIGEN_coding_noncoding.grch37.ht', - exac='gs://seqr-reference-data/GRCh37/gnomad/ExAC.r1.sites.vep.ht', - 
gnomad_exomes='gs://gcp-public-data--gnomad/release/4.1/ht/exomes/gnomad.exomes.v4.1.sites.ht', - gnomad_genomes='gs://gcp-public-data--gnomad/release/4.1/ht/genomes/gnomad.genomes.v4.1.sites.ht', - mpc='gs://seqr-reference-data/GRCh37/MPC/fordist_constraint_official_mpc_values.ht', - primate_ai='gs://seqr-reference-data/GRCh37/primate_ai/PrimateAI_scores_v0.2.ht', - splice_ai='gs://seqr-reference-data/GRCh37/spliceai/spliceai_scores.ht', - topmed='gs://seqr-reference-data/GRCh37/TopMed/bravo-dbsnp-all.removed_chr_prefix.liftunder_GRCh37.ht', - gnomad_non_coding_constraint='gs://seqr-reference-data/GRCh38/gnomad_nc_constraint/gnomad_non-coding_constraint_z_scores.ht', - screen='gs://seqr-reference-data/GRCh38/ccREs/GRCh38-ccREs.ht', - hgmd='gs://seqr-reference-data-private/GRCh38/HGMD/HGMD_Pro_2023.1_hg38.vcf.gz', - ), - versions=hl.Struct( - cadd='v1.6', - clinvar='2023-11-26', - dbnsfp='2.9.3', - eigen=None, - exac=None, - gnomad_exomes='4.1', - gnomad_genomes='4.1', - mpc=None, - primate_ai='v0.2', - splice_ai=None, - topmed=None, - gnomad_non_coding_constraint=None, - screen=None, - hgmd='HGMD_Pro_2023', - ), - enums=hl.Struct( - cadd=hl.Struct(), - clinvar=hl.Struct( - pathogenicity=CLINVAR_PATHOGENICITIES, - assertion=CLINVAR_ASSERTIONS, + with mock_clinvar_urls(): + task = UpdateVariantAnnotationsTableWithUpdatedReferenceDataset( + reference_genome=ReferenceGenome.GRCh38, + dataset_type=DatasetType.SNV_INDEL, + ) + worker = luigi.worker.Worker() + worker.add(task) + worker.run() + self.assertTrue(GCSorLocalFolderTarget(task.output().path).exists()) + self.assertTrue(task.complete()) + + ht = hl.read_table(task.output().path) + self.assertCountEqual( + ht.globals.collect(), + [ + hl.Struct( + versions=hl.Struct( + dbnsfp='1.0', + eigen='1.0', + clinvar='2024-11-11', + exac='1.0', + splice_ai='1.0', + topmed='1.0', + hgmd='1.0', + gnomad_exomes='1.0', + gnomad_genomes='1.0', + screen='1.0', + gnomad_non_coding_constraint='1.0', ), - dbnsfp=hl.Struct( - 
MutationTaster_pred=['D', 'A', 'N', 'P'], + enums=hl.Struct( + dbnsfp=ReferenceDataset.dbnsfp.enum_globals, + eigen=hl.Struct(), + clinvar=ReferenceDataset.clinvar.enum_globals, + exac=hl.Struct(), + splice_ai=ReferenceDataset.splice_ai.enum_globals, + topmed=hl.Struct(), + hgmd=ReferenceDataset.hgmd.enum_globals, + gnomad_exomes=hl.Struct(), + gnomad_genomes=hl.Struct(), + screen=ReferenceDataset.screen.enum_globals, + gnomad_non_coding_constraint=hl.Struct(), + **BASE_ENUMS, ), - eigen=hl.Struct(), - exac=hl.Struct(), - gnomad_exomes=hl.Struct(), - gnomad_genomes=hl.Struct(), - mpc=hl.Struct(), - primate_ai=hl.Struct(), - splice_ai=hl.Struct( - splice_consequence=[ - 'Acceptor gain', - 'Acceptor loss', - 'Donor gain', - 'Donor loss', - 'No consequence', - ], + migrations=[], + updates=set(), + ), + ], + ) + self.assertCountEqual( + ht.collect(), + [ + hl.Struct( + locus=hl.Locus( + contig='chr1', + position=871269, + reference_genome='GRCh38', + ), + alleles=['A', 'C'], + dbnsfp=hl.Struct( + REVEL_score=0.0430000014603138, + SIFT_score=None, + Polyphen2_HVAR_score=None, + MutationTaster_pred_id=0, + VEST4_score=None, + MutPred_score=None, + fathmm_MKL_coding_score=None, + MPC_score=None, + CADD_phred=2, + PrimateAI_score=None, ), - topmed=hl.Struct(), - gnomad_non_coding_constraint=hl.Struct(), - screen=hl.Struct( - region_type=[ - 'CTCF-bound', - 'CTCF-only', - 'DNase-H3K4me3', - 'PLS', - 'dELS', - 'pELS', - 'DNase-only', - 'low-DNase', - ], + eigen=hl.Struct(Eigen_phred=1.5880000591278076), + clinvar=hl.Struct( + alleleId=None, + conflictingPathogenicities=None, + goldStars=None, + pathogenicity_id=None, + assertion_ids=None, + submitters=None, + conditions=None, ), - hgmd=hl.Struct( - **{'class': ['DM', 'DM?', 'DP', 'DFP', 'FP', 'R']}, + exac=hl.Struct( + AF_POPMAX=0.0004100881633348763, + AF=0.0004633000062312931, + AC_Adj=51, + AC_Het=51, + AC_Hom=0, + AC_Hemi=None, + AN_Adj=108288, ), - sorted_motif_feature_consequences=hl.Struct( - 
consequence_term=MOTIF_CONSEQUENCE_TERMS, + splice_ai=hl.Struct( + delta_score=0.029999999329447746, + splice_consequence_id=3, ), - sorted_regulatory_feature_consequences=hl.Struct( - biotype=REGULATORY_BIOTYPES, - consequence_term=REGULATORY_CONSEQUENCE_TERMS, + topmed=hl.Struct(AC=None, AF=None, AN=None, Hom=None, Het=None), + hgmd=hl.Struct(accession='abcdefg', class_id=3), + gnomad_exomes=hl.Struct( + AF=0.00012876000255346298, + AN=240758, + AC=31, + Hom=0, + AF_POPMAX_OR_GLOBAL=0.0001119549197028391, + FAF_AF=9.315000352216884e-05, + Hemi=0, ), - sorted_transcript_consequences=hl.Struct( - biotype=BIOTYPES, - consequence_term=TRANSCRIPT_CONSEQUENCE_TERMS, - loftee=hl.Struct( - lof_filter=LOF_FILTERS, - ), - utrannotator=hl.Struct( - fiveutr_consequence=FIVEUTR_CONSEQUENCES, - ), + gnomad_genomes=hl.Struct( + AC=None, + AF=None, + AN=None, + Hom=None, + AF_POPMAX_OR_GLOBAL=None, + FAF_AF=None, + Hemi=None, ), + gnomad_non_coding_constraint=hl.Struct(z_score=0.75), + screen=hl.Struct(region_type_ids=[1]), ), - migrations=[], - updates=set(), - ), - ], - ) + ], + ) - @mock.patch.dict( - 'v03_pipeline.lib.reference_data.compare_globals.CONFIG', - MOCK_CONFIG_MITO, - ) - @mock.patch( - 'v03_pipeline.lib.tasks.reference_data.update_variant_annotations_table_with_updated_reference_dataset.clinvar_versions_equal', + @responses.activate + @patch( + 'v03_pipeline.lib.tasks.base.base_update_variant_annotations_table.BaseUpdateVariantAnnotationsTableTask.initialize_table', ) - def test_update_vat_with_updated_rdc_mito_38( + def test_update_vat_mito_38( self, - mock_clinvar_versions_equal, - mock_initialize_table, - mock_update_crdqs_task, - mock_update_rdc_task, + mock_initialize_annotations_ht, ): - mock_clinvar_versions_equal.return_value = (True,) - mock_update_rdc_task.return_value = MockCompleteTask() - mock_update_crdqs_task.return_value = MockCompleteTask() - mock_initialize_table.return_value = hl.Table.parallelize( + mock_initialize_annotations_ht.return_value 
= hl.Table.parallelize( [ hl.Struct( locus=hl.Locus( @@ -974,152 +269,118 @@ def test_update_vat_with_updated_rdc_mito_38( ), key=['locus', 'alleles'], globals=hl.Struct( - paths=hl.Struct(), versions=hl.Struct(), enums=hl.Struct(), updates=hl.empty_set(hl.tstruct(callset=hl.tstr, project_guid=hl.tstr)), migrations=hl.empty_array(hl.tstr), ), ) - task = UpdateVariantAnnotationsTableWithUpdatedReferenceDataset( - reference_genome=ReferenceGenome.GRCh38, - dataset_type=DatasetType.MITO, - sample_type=SampleType.WGS, - callset_path=TEST_MITO_MT, - project_guids=[], - project_remap_paths=[], - project_pedigree_paths=[], - skip_validation=True, - run_id='1', - ) - worker = luigi.worker.Worker() - worker.add(task) - worker.run() - self.assertTrue(GCSorLocalFolderTarget(task.output().path).exists()) - self.assertTrue(task.complete()) - ht = hl.read_table(task.output().path) - self.assertCountEqual( - ht.globals.collect(), - [ - hl.Struct( - paths=hl.Struct( - gnomad_mito='gs://gcp-public-data--gnomad/release/3.1/ht/genomes/gnomad.genomes.v3.1.sites.chrM.ht', - helix_mito='gs://seqr-reference-data/GRCh38/mitochondrial/Helix/HelixMTdb_20200327.ht', - hmtvar='gs://seqr-reference-data/GRCh38/mitochondrial/HmtVar/HmtVar%20Jan.%2010%202022.ht', - mitomap='gs://seqr-reference-data/GRCh38/mitochondrial/MITOMAP/mitomap-confirmed-mutations-2022-02-04.ht', - mitimpact='gs://seqr-reference-data/GRCh38/mitochondrial/MitImpact/MitImpact_db_3.1.3.ht', - clinvar_mito='https://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh38/clinvar.vcf.gz', - dbnsfp_mito='gs://seqr-reference-data/GRCh38/dbNSFP/v4.2/dbNSFP4.2a_variant.with_new_scores.ht', - high_constraint_region_mito='gs://seqr-reference-data/GRCh38/mitochondrial/Helix high constraint intervals Feb-15-2022.tsv', - local_constraint_mito='gs://seqr-reference-data/GRCh38/mitochondrial/local_constraint.tsv', - ), - versions=hl.Struct( - gnomad_mito='v3.1', - helix_mito='20200327', - hmtvar='Jan. 10 2022', - mitomap='Feb. 
04 2022', - mitimpact='3.1.3', - clinvar_mito='2023-07-22', - dbnsfp_mito='4.2', - high_constraint_region_mito='Feb-15-2022', - local_constraint_mito='2024-07-24', - ), - enums=hl.Struct( - gnomad_mito=hl.Struct(), - helix_mito=hl.Struct(), - hmtvar=hl.Struct(), - mitomap=hl.Struct(), - mitimpact=hl.Struct(), - clinvar_mito=hl.Struct( - pathogenicity=CLINVAR_PATHOGENICITIES, - assertion=CLINVAR_ASSERTIONS, + with mock_clinvar_urls(): + task = UpdateVariantAnnotationsTableWithUpdatedReferenceDataset( + reference_genome=ReferenceGenome.GRCh38, + dataset_type=DatasetType.MITO, + ) + worker = luigi.worker.Worker() + worker.add(task) + worker.run() + self.assertTrue(GCSorLocalFolderTarget(task.output().path).exists()) + self.assertTrue(task.complete()) + + ht = hl.read_table(task.output().path) + self.assertCountEqual( + ht.globals.collect(), + [ + hl.Struct( + versions=hl.Struct( + helix_mito='1.0', + hmtvar='1.0', + mitimpact='1.0', + mitomap='1.0', + gnomad_mito='1.0', + local_constraint_mito='1.0', + clinvar='2024-11-11', + dbnsfp='1.0', ), - dbnsfp_mito=hl.Struct( - MutationTaster_pred=['D', 'A', 'N', 'P'], + enums=hl.Struct( + helix_mito=hl.Struct(), + hmtvar=hl.Struct(), + mitimpact=hl.Struct(), + mitomap=hl.Struct(), + gnomad_mito=hl.Struct(), + local_constraint_mito=hl.Struct(), + clinvar=ReferenceDataset.clinvar.enum_globals, + dbnsfp=ReferenceDataset.dbnsfp.enum_globals, + sorted_transcript_consequences=hl.Struct( + biotype=BIOTYPES, + consequence_term=TRANSCRIPT_CONSEQUENCE_TERMS, + lof_filter=LOF_FILTERS, + ), + mitotip=hl.Struct( + trna_prediction=MITOTIP_PATHOGENICITIES, + ), ), - high_constraint_region_mito=hl.Struct(), - local_constraint_mito=hl.Struct(), - sorted_transcript_consequences=hl.Struct( - biotype=BIOTYPES, - consequence_term=TRANSCRIPT_CONSEQUENCE_TERMS, - lof_filter=LOF_FILTERS, + migrations=[], + updates=set(), + ), + ], + ) + self.assertCountEqual( + ht.collect(), + [ + hl.Struct( + locus=hl.Locus( + contig='chrM', + position=1, + 
reference_genome='GRCh38', ), - mitotip=hl.Struct( - trna_prediction=MITOTIP_PATHOGENICITIES, + alleles=['A', 'C'], + helix_mito=hl.Struct( + AC_het=0, + AF_het=0.0, + AN=195982, + max_hl=None, + AC_hom=0, + AF_hom=0, + ), + hmtvar=hl.Struct(score=0.6700000166893005), + mitimpact=hl.Struct(score=0.42500001192092896), + mitomap=hl.Struct(pathogenic=None), + gnomad_mito=hl.Struct( + AC_het=0, + AF_het=0.0, + AN=195982, + max_hl=None, + AC_hom=0, + AF_hom=0, + ), + local_constraint_mito=hl.Struct(score=0.5), + clinvar=hl.Struct( + alleleId=None, + conflictingPathogenicities=None, + goldStars=None, + pathogenicity_id=None, + assertion_ids=None, + submitters=None, + conditions=None, + ), + dbnsfp=hl.Struct( + SIFT_score=None, + MutationTaster_pred_id=2, ), ), - migrations=[], - updates=set(), - ), - ], - ) - self.assertCountEqual( - ht.collect(), - [ - hl.Struct( - locus=hl.Locus( - contig='chrM', - position=1, - reference_genome='GRCh38', - ), - alleles=['A', 'C'], - clinvar_mito=hl.Struct( - alleleId=None, - conflictingPathogenicities=None, - goldStars=None, - pathogenicity_id=None, - assertion_ids=None, - submitters=None, - conditions=None, - ), - dbnsfp_mito=hl.Struct( - SIFT_score=None, - MutationTaster_pred_id=2, - ), - gnomad_mito=hl.Struct( - AC_het=0, - AF_het=0.0, - AN=195982, - max_hl=None, - AC_hom=0, - AF_hom=0, - ), - helix_mito=hl.Struct( - AC_het=0, - AF_het=0.0, - AN=195982, - max_hl=None, - AC_hom=0, - AF_hom=0, - ), - hmtvar=hl.Struct(score=0.6700000166893005), - mitomap=None, - mitimpact=hl.Struct(score=0.42500001192092896), - high_constraint_region_mito=True, - local_constraint_mito=hl.Struct(score=0.5), - ), - ], - ) + ], + ) - @mock.patch.dict( - 'v03_pipeline.lib.reference_data.compare_globals.CONFIG', - MOCK_CONFIG, + @responses.activate + @patch( + 'v03_pipeline.lib.tasks.base.base_update_variant_annotations_table.BaseUpdateVariantAnnotationsTableTask.initialize_table', ) - @mock.patch( - 
'v03_pipeline.lib.tasks.reference_data.update_variant_annotations_table_with_updated_reference_dataset.clinvar_versions_equal', - ) - def test_update_vat_with_updated_rdc_snv_indel_37( + def test_update_vat_snv_indel_37( self, - mock_clinvar_versions_equal, - mock_initialize_table, - mock_update_crdqs_task, - mock_update_rdc_task, + mock_initialize_annotations_ht, ): - mock_clinvar_versions_equal.return_value = True - mock_update_rdc_task.return_value = MockCompleteTask() - mock_update_crdqs_task.return_value = MockCompleteTask() - mock_initialize_table.return_value = hl.Table.parallelize( + mock_initialize_annotations_ht.return_value = hl.Table.parallelize( [ hl.Struct( locus=hl.Locus( @@ -1136,156 +397,123 @@ def test_update_vat_with_updated_rdc_snv_indel_37( ), key=['locus', 'alleles'], globals=hl.Struct( - paths=hl.Struct(), versions=hl.Struct(), enums=hl.Struct(), updates=hl.empty_set(hl.tstruct(callset=hl.tstr, project_guid=hl.tstr)), migrations=hl.empty_array(hl.tstr), ), ) - task = UpdateVariantAnnotationsTableWithUpdatedReferenceDataset( - reference_genome=ReferenceGenome.GRCh37, - dataset_type=DatasetType.SNV_INDEL, - sample_type=SampleType.WGS, - callset_path=TEST_SNV_INDEL_VCF, - project_guids=[], - project_remap_paths=[], - project_pedigree_paths=[], - skip_validation=True, - run_id='2', - ) - worker = luigi.worker.Worker() - worker.add(task) - worker.run() - self.assertTrue(GCSorLocalFolderTarget(task.output().path).exists()) - self.assertTrue(task.complete()) - ht = hl.read_table(task.output().path) - self.assertCountEqual( - ht.globals.collect(), - [ - hl.Struct( - paths=hl.Struct( - cadd='gs://seqr-reference-data/GRCh37/CADD/CADD_snvs_and_indels.v1.6.ht', - clinvar='https://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh37/clinvar.vcf.gz', - dbnsfp='gs://seqr-reference-data/GRCh37/dbNSFP/v2.9.3/dbNSFP2.9.3_variant.ht', - eigen='gs://seqr-reference-data/GRCh37/eigen/EIGEN_coding_noncoding.grch37.ht', - 
exac='gs://seqr-reference-data/GRCh37/gnomad/ExAC.r1.sites.vep.ht', - gnomad_exomes='gs://gcp-public-data--gnomad/release/2.1.1/ht/exomes/gnomad.exomes.r2.1.1.sites.ht', - gnomad_genomes='gs://gcp-public-data--gnomad/release/2.1.1/ht/genomes/gnomad.genomes.r2.1.1.sites.ht', - mpc='gs://seqr-reference-data/GRCh37/MPC/fordist_constraint_official_mpc_values.ht', - primate_ai='gs://seqr-reference-data/GRCh37/primate_ai/PrimateAI_scores_v0.2.ht', - splice_ai='gs://seqr-reference-data/GRCh37/spliceai/spliceai_scores.ht', - topmed='gs://seqr-reference-data/GRCh37/TopMed/bravo-dbsnp-all.removed_chr_prefix.liftunder_GRCh37.ht', - hgmd='gs://seqr-reference-data-private/GRCh37/HGMD/HGMD_Pro_2023.1_hg19.vcf.gz', - ), - versions=hl.Struct( - cadd='v1.6', - clinvar='2023-11-26', - dbnsfp='2.9.3', - eigen=None, - exac=None, - gnomad_exomes='r2.1.1', - gnomad_genomes='r2.1.1', - mpc=None, - primate_ai='v0.2', - splice_ai=None, - topmed=None, - hgmd='HGMD_Pro_2023', - ), - enums=hl.Struct( - cadd=hl.Struct(), - clinvar=hl.Struct( - pathogenicity=CLINVAR_PATHOGENICITIES, - assertion=CLINVAR_ASSERTIONS, + with mock_clinvar_urls(ReferenceGenome.GRCh37): + task = UpdateVariantAnnotationsTableWithUpdatedReferenceDataset( + reference_genome=ReferenceGenome.GRCh37, + dataset_type=DatasetType.SNV_INDEL, + ) + worker = luigi.worker.Worker() + worker.add(task) + worker.run() + self.assertTrue(GCSorLocalFolderTarget(task.output().path).exists()) + self.assertTrue(task.complete()) + + ht = hl.read_table(task.output().path) + self.assertCountEqual( + ht.globals.collect(), + [ + hl.Struct( + versions=hl.Struct( + dbnsfp='1.0', + eigen='1.0', + clinvar='2024-11-11', + exac='1.0', + splice_ai='1.0', + topmed='1.0', + hgmd='1.0', + gnomad_exomes='1.0', + gnomad_genomes='1.0', + ), + enums=hl.Struct( + dbnsfp=ReferenceDataset.dbnsfp.enum_globals, + eigen=hl.Struct(), + clinvar=ReferenceDataset.clinvar.enum_globals, + exac=hl.Struct(), + splice_ai=ReferenceDataset.splice_ai.enum_globals, + 
topmed=hl.Struct(), + hgmd=ReferenceDataset.hgmd.enum_globals, + gnomad_exomes=hl.Struct(), + gnomad_genomes=hl.Struct(), + sorted_transcript_consequences=hl.Struct( + biotype=BIOTYPES, + consequence_term=TRANSCRIPT_CONSEQUENCE_TERMS, + lof_filter=LOF_FILTERS, + ), ), + migrations=[], + updates=set(), + ), + ], + ) + self.assertCountEqual( + ht.collect(), + [ + hl.Struct( + locus=hl.Locus( + contig=1, + position=871269, + reference_genome='GRCh37', + ), + alleles=['A', 'C'], dbnsfp=hl.Struct( - MutationTaster_pred=['D', 'A', 'N', 'P'], + REVEL_score=0.0430000014603138, + SIFT_score=None, + Polyphen2_HVAR_score=None, + MutationTaster_pred_id=0, + CADD_phred=9.699999809265137, + MPC_score=None, + PrimateAI_score=None, + ), + eigen=hl.Struct(Eigen_phred=1.5880000591278076), + clinvar=hl.Struct( + alleleId=None, + conflictingPathogenicities=None, + goldStars=None, + pathogenicity_id=None, + assertion_ids=None, + submitters=None, + conditions=None, + ), + exac=hl.Struct( + AF_POPMAX=0.0004100881633348763, + AF=0.0004633000062312931, + AC_Adj=51, + AC_Het=51, + AC_Hom=0, + AC_Hemi=None, + AN_Adj=108288, ), - eigen=hl.Struct(), - exac=hl.Struct(), - gnomad_exomes=hl.Struct(), - gnomad_genomes=hl.Struct(), - mpc=hl.Struct(), - primate_ai=hl.Struct(), splice_ai=hl.Struct( - splice_consequence=[ - 'Acceptor gain', - 'Acceptor loss', - 'Donor gain', - 'Donor loss', - 'No consequence', - ], + delta_score=0.029999999329447746, + splice_consequence_id=3, ), - topmed=hl.Struct(), - hgmd=hl.Struct( - **{'class': ['DM', 'DM?', 'DP', 'DFP', 'FP', 'R']}, + topmed=hl.Struct(AC=None, AF=None, AN=None, Hom=None, Het=None), + hgmd=None, + gnomad_exomes=hl.Struct( + AF=0.00012876000255346298, + AN=240758, + AC=31, + Hom=0, + AF_POPMAX_OR_GLOBAL=0.0001119549197028391, + FAF_AF=9.315000352216884e-05, + Hemi=0, ), - sorted_transcript_consequences=hl.Struct( - biotype=BIOTYPES, - consequence_term=TRANSCRIPT_CONSEQUENCE_TERMS, - lof_filter=LOF_FILTERS, + gnomad_genomes=hl.Struct( + AC=None, + 
AF=None, + AN=None, + Hom=None, + AF_POPMAX_OR_GLOBAL=None, + FAF_AF=None, + Hemi=None, ), ), - migrations=[], - updates=set(), - ), - ], - ) - self.assertCountEqual( - ht.collect(), - [ - hl.Struct( - locus=hl.Locus( - contig=1, - position=871269, - reference_genome='GRCh37', - ), - alleles=['A', 'C'], - cadd=hl.Struct(PHRED=9.699999809265137), - clinvar=hl.Struct( - alleleId=None, - conflictingPathogenicities=None, - goldStars=None, - pathogenicity_id=None, - assertion_ids=None, - submitters=None, - conditions=None, - ), - dbnsfp=hl.Struct( - REVEL_score=0.0430000014603138, - SIFT_score=None, - Polyphen2_HVAR_score=None, - MutationTaster_pred_id=0, - ), - eigen=hl.Struct(Eigen_phred=1.5880000591278076), - exac=hl.Struct( - AF_POPMAX=0.0004100881633348763, - AF=0.0004633000062312931, - AC_Adj=51, - AC_Het=51, - AC_Hom=0, - AC_Hemi=None, - AN_Adj=108288, - ), - gnomad_exomes=hl.Struct( - AF=0.00012876000255346298, - AN=240758, - AC=31, - Hom=0, - AF_POPMAX_OR_GLOBAL=0.0001119549197028391, - FAF_AF=9.315000352216884e-05, - Hemi=0, - ), - gnomad_genomes=None, - mpc=None, - primate_ai=None, - splice_ai=hl.Struct( - delta_score=0.029999999329447746, - splice_consequence_id=3, - ), - topmed=None, - hgmd=None, - ), - ], - ) + ], + ) diff --git a/v03_pipeline/lib/tasks/reference_data/updated_cached_reference_dataset_query.py b/v03_pipeline/lib/tasks/reference_data/updated_cached_reference_dataset_query.py deleted file mode 100644 index 2302c8ac3..000000000 --- a/v03_pipeline/lib/tasks/reference_data/updated_cached_reference_dataset_query.py +++ /dev/null @@ -1,142 +0,0 @@ -import hail as hl -import luigi - -from v03_pipeline.lib.logger import get_logger -from v03_pipeline.lib.model import ( - CachedReferenceDatasetQuery, - ReferenceDatasetCollection, -) -from v03_pipeline.lib.paths import ( - cached_reference_dataset_query_path, - valid_reference_dataset_collection_path, -) -from v03_pipeline.lib.reference_data.compare_globals import ( - Globals, - clinvar_versions_equal, 
- get_datasets_to_update, -) -from v03_pipeline.lib.reference_data.config import CONFIG -from v03_pipeline.lib.reference_data.dataset_table_operations import ( - get_ht_path, - import_ht_from_config_path, -) -from v03_pipeline.lib.tasks.base.base_loading_run_params import ( - BaseLoadingRunParams, -) -from v03_pipeline.lib.tasks.base.base_write import BaseWriteTask -from v03_pipeline.lib.tasks.files import GCSorLocalTarget, HailTableTask - -logger = get_logger(__name__) - - -@luigi.util.inherits(BaseLoadingRunParams) -class UpdatedCachedReferenceDatasetQuery(BaseWriteTask): - crdq = luigi.EnumParameter(enum=CachedReferenceDatasetQuery) - - def complete(self) -> bool: - if not super().complete(): - logger.info( - f'UpdatedCachedReferenceDatasetQuery: {self.output().path} does not exist', - ) - return False - - dataset = self.crdq.dataset(self.dataset_type) - if 'clinvar' in dataset and not clinvar_versions_equal( - hl.read_table(self.output().path), - self.reference_genome, - self.dataset_type, - ): - return False - - crdq_globals = Globals.from_ht( - hl.read_table(self.output().path), - [dataset], - ) - dataset_config_globals = Globals.from_dataset_configs( - self.reference_genome, - [dataset], - ) - return not get_datasets_to_update( - crdq_globals, - dataset_config_globals, - validate_selects=False, - ) - - def output(self) -> luigi.Target: - return GCSorLocalTarget( - cached_reference_dataset_query_path( - self.reference_genome, - self.dataset_type, - self.crdq, - ), - ) - - def requires(self) -> luigi.Task: - if not self.crdq.reference_dataset_collection: - return HailTableTask( - get_ht_path( - CONFIG[self.crdq.dataset(self.dataset_type)][ - self.reference_genome.v02_value - ], - ), - ) - # Special nested import to avoid a circular dependency issue - # (ValidateCallset -> this file -> UpdatedReferenceDatasetCollection -> ValidateCallset) - # The specific CRDQ referenced in ValidateCallset will never reach - # this line due to it being a raw dataset query. 
In theory this - # would be fixed by splitting the CRDQ into raw_dataset and non-raw_dataset - # queries. - from v03_pipeline.lib.tasks.reference_data.updated_reference_dataset_collection import ( - UpdatedReferenceDatasetCollectionTask, - ) - - return UpdatedReferenceDatasetCollectionTask( - self.reference_genome, - self.dataset_type, - self.crdq.reference_dataset_collection, - ) - - def create_table(self) -> hl.Table: - dataset: str = self.crdq.dataset(self.dataset_type) - if not self.crdq.reference_dataset_collection: - query_ht = import_ht_from_config_path( - CONFIG[dataset][self.reference_genome.v02_value], - dataset, - self.reference_genome, - ) - else: - query_ht = hl.read_table( - valid_reference_dataset_collection_path( - self.reference_genome, - self.dataset_type, - ReferenceDatasetCollection.COMBINED, - ), - ) - ht = self.crdq.query( - query_ht, - dataset_type=self.dataset_type, - reference_genome=self.reference_genome, - ) - return ht.select_globals( - paths=hl.Struct( - **{ - dataset: query_ht.index_globals().path - if not self.crdq.reference_dataset_collection - else query_ht.index_globals().paths[dataset], - }, - ), - versions=hl.Struct( - **{ - dataset: query_ht.index_globals().version - if not self.crdq.reference_dataset_collection - else query_ht.index_globals().versions[dataset], - }, - ), - enums=hl.Struct( - **{ - dataset: query_ht.index_globals().enums - if not self.crdq.reference_dataset_collection - else query_ht.index_globals().enums[dataset], - }, - ), - ) diff --git a/v03_pipeline/lib/tasks/reference_data/updated_cached_reference_dataset_query_test.py b/v03_pipeline/lib/tasks/reference_data/updated_cached_reference_dataset_query_test.py deleted file mode 100644 index 566337f2e..000000000 --- a/v03_pipeline/lib/tasks/reference_data/updated_cached_reference_dataset_query_test.py +++ /dev/null @@ -1,264 +0,0 @@ -import shutil -from typing import Any -from unittest import mock - -import hail as hl -import luigi - -import 
v03_pipeline.lib.tasks.reference_data.updated_reference_dataset_collection -from v03_pipeline.lib.annotations.enums import CLINVAR_PATHOGENICITIES -from v03_pipeline.lib.model import ( - CachedReferenceDatasetQuery, - DatasetType, - ReferenceDatasetCollection, - ReferenceGenome, - SampleType, -) -from v03_pipeline.lib.paths import ( - cached_reference_dataset_query_path, - valid_reference_dataset_collection_path, -) -from v03_pipeline.lib.reference_data.clinvar import CLINVAR_ASSERTIONS -from v03_pipeline.lib.reference_data.config import CONFIG -from v03_pipeline.lib.tasks.reference_data.updated_cached_reference_dataset_query import ( - UpdatedCachedReferenceDatasetQuery, -) -from v03_pipeline.lib.test.mock_complete_task import MockCompleteTask -from v03_pipeline.lib.test.mocked_dataroot_testcase import MockedDatarootTestCase - -COMBINED_1_PATH = 'v03_pipeline/var/test/reference_data/test_combined_1.ht' -CLINVAR_CRDQ_PATH = ( - 'v03_pipeline/var/test/reference_data/test_clinvar_path_variants_crdq.ht' -) -TEST_SNV_INDEL_VCF = 'v03_pipeline/var/test/callsets/1kg_30variants.vcf' - -MOCK_CONFIG = { - 'gnomad_qc': { - '38': { - 'version': 'v3.1', - 'source_path': 'gs://gnomad/sample_qc/mt/genomes_v3.1/gnomad_v3.1_qc_mt_v2_sites_dense.mt', - 'custom_import': lambda *_: hl.Table.parallelize( - [ - { - 'locus': hl.Locus( - contig='chr1', - position=871269, - reference_genome='GRCh38', - ), - 'alleles': ['A', 'C'], - }, - ], - hl.tstruct( - locus=hl.tlocus('GRCh38'), - alleles=hl.tarray(hl.tstr), - ), - key=['locus', 'alleles'], - globals=hl.Struct(), - ), - }, - }, - 'clinvar': { - '38': { - **CONFIG['clinvar']['38'], - 'source_path': 'https://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh37/clinvar.vcf.gz', - 'custom_import': lambda *_: hl.Table.parallelize( - [], - hl.tstruct( - locus=hl.tlocus('GRCh38'), - alleles=hl.tarray(hl.tstr), - info=hl.tstruct( - ALLELEID=hl.tint32, - CLNSIG=hl.tarray(hl.tstr), - CLNSIGCONF=hl.tarray(hl.tstr), - CLNREVSTAT=hl.tarray(hl.tstr), - ), - 
submitters=hl.tarray(hl.tstr), - conditions=hl.tarray(hl.tstr), - ), - key=['locus', 'alleles'], - globals=hl.Struct( - version='2023-11-26', - ), - ), - }, - }, -} - - -class UpdatedCachedReferenceDatasetQueryTest(MockedDatarootTestCase): - @mock.patch.dict( - 'v03_pipeline.lib.reference_data.compare_globals.CONFIG', - MOCK_CONFIG, - ) - @mock.patch( - 'v03_pipeline.lib.tasks.reference_data.updated_cached_reference_dataset_query.CONFIG', - MOCK_CONFIG, - ) - @mock.patch( - 'v03_pipeline.lib.tasks.reference_data.updated_cached_reference_dataset_query.HailTableTask', - ) - def test_gnomad_qc( - self, - mock_hailtabletask, - ) -> None: - """ - Given a crdq task for gnomad_qc, expect the crdq table to be created by querying the raw dataset. - """ - # raw dataset dependency exists - mock_hailtabletask.return_value = MockCompleteTask() - - worker = luigi.worker.Worker() - task = UpdatedCachedReferenceDatasetQuery( - reference_genome=ReferenceGenome.GRCh38, - dataset_type=DatasetType.SNV_INDEL, - crdq=CachedReferenceDatasetQuery.GNOMAD_QC, - sample_type=SampleType.WGS, - callset_path=TEST_SNV_INDEL_VCF, - project_guids=[], - project_remap_paths=[], - project_pedigree_paths=[], - skip_validation=True, - run_id='1', - ) - worker.add(task) - worker.run() - self.assertTrue(task.complete()) - - ht = hl.read_table(task.output().path) - self.assertCountEqual( - ht.collect(), - [ - hl.Struct( - locus=hl.Locus( - contig='chr1', - position=871269, - reference_genome='GRCh38', - ), - alleles=['A', 'C'], - ), - ], - ) - self.assertCountEqual( - ht.globals.collect(), - [ - hl.Struct( - paths=hl.Struct(gnomad_qc=CONFIG['gnomad_qc']['38']['source_path']), - versions=hl.Struct(gnomad_qc='v3.1'), - enums=hl.Struct(gnomad_qc=hl.Struct()), - ), - ], - ) - - @mock.patch.dict( - 'v03_pipeline.lib.reference_data.compare_globals.CONFIG', - MOCK_CONFIG, - ) - @mock.patch.object( - v03_pipeline.lib.tasks.reference_data.updated_reference_dataset_collection, - 
'UpdatedReferenceDatasetCollectionTask', - ) - @mock.patch( - 'v03_pipeline.lib.tasks.reference_data.updated_cached_reference_dataset_query.CachedReferenceDatasetQuery.query', - ) - @mock.patch( - 'v03_pipeline.lib.tasks.reference_data.updated_cached_reference_dataset_query.clinvar_versions_equal', - ) - def test_clinvar( - self, - mock_clinvar_versions_equal, - mock_crdq_query, - mock_updated_rdc_task, - ) -> None: - """ - Given a crdq task where there exists a clinvar crdq table and a clinvar rdc table, - expect task to replace the clinvar crdq table with new version. - """ - mock_clinvar_versions_equal.return_value = True - - # rdc dependency exists - mock_updated_rdc_task.return_value = MockCompleteTask() - - # copy existing crdq to test path - # clinvar has version '2022-01-01' - shutil.copytree( - CLINVAR_CRDQ_PATH, - cached_reference_dataset_query_path( - ReferenceGenome.GRCh38, - DatasetType.SNV_INDEL, - CachedReferenceDatasetQuery.CLINVAR_PATH_VARIANTS, - ), - ) - - # copy existing rdc to test path - # clinvar has version '2023-11-26' - shutil.copytree( - COMBINED_1_PATH, - valid_reference_dataset_collection_path( - ReferenceGenome.GRCh38, - DatasetType.SNV_INDEL, - ReferenceDatasetCollection.COMBINED, - ), - ) - - # mock the clinvar_path_variants query to something simpler for testing - def _clinvar_path_variants(table, **_: Any): - table = table.select_globals() - return table.select( - is_pathogenic=False, - is_likely_pathogenic=True, - ) - - mock_crdq_query.side_effect = _clinvar_path_variants - - worker = luigi.worker.Worker() - task = UpdatedCachedReferenceDatasetQuery( - reference_genome=ReferenceGenome.GRCh38, - dataset_type=DatasetType.SNV_INDEL, - crdq=CachedReferenceDatasetQuery.CLINVAR_PATH_VARIANTS, - sample_type=SampleType.WGS, - callset_path=TEST_SNV_INDEL_VCF, - project_guids=[], - project_remap_paths=[], - project_pedigree_paths=[], - skip_validation=True, - run_id='2', - ) - worker.add(task) - worker.run() - 
self.assertTrue(task.complete()) - - ht = hl.read_table(task.output().path) - self.assertCountEqual( - ht.collect(), - [ - hl.Struct( - locus=hl.Locus( - contig='chr1', - position=871269, - reference_genome='GRCh38', - ), - alleles=['A', 'C'], - is_pathogenic=False, - is_likely_pathogenic=True, - ), - ], - ) - self.assertCountEqual( - ht.globals.collect(), - [ - hl.Struct( - paths=hl.Struct( - clinvar=MOCK_CONFIG['clinvar']['38']['source_path'], - ), - enums=hl.Struct( - clinvar=hl.Struct( - pathogenicity=CLINVAR_PATHOGENICITIES, - assertion=CLINVAR_ASSERTIONS, - ), - ), - versions=hl.Struct( - clinvar='2023-11-26', # crdq table should have new clinvar version - ), - ), - ], - ) diff --git a/v03_pipeline/lib/tasks/reference_data/updated_reference_dataset.py b/v03_pipeline/lib/tasks/reference_data/updated_reference_dataset.py new file mode 100644 index 000000000..e73f2db8b --- /dev/null +++ b/v03_pipeline/lib/tasks/reference_data/updated_reference_dataset.py @@ -0,0 +1,27 @@ +import luigi + +from luigi_pipeline.lib.hail_tasks import GCSorLocalTarget +from v03_pipeline.lib.paths import valid_reference_dataset_path +from v03_pipeline.lib.reference_datasets.reference_dataset import ReferenceDataset +from v03_pipeline.lib.tasks.base.base_loading_pipeline_params import ( + BaseLoadingPipelineParams, +) +from v03_pipeline.lib.tasks.base.base_write import BaseWriteTask + + +@luigi.util.inherits(BaseLoadingPipelineParams) +class UpdatedReferenceDatasetTask(BaseWriteTask): + reference_dataset: ReferenceDataset = luigi.EnumParameter( + enum=ReferenceDataset, + ) + + def output(self): + return GCSorLocalTarget( + valid_reference_dataset_path( + self.reference_genome, + self.reference_dataset, + ), + ) + + def create_table(self): + return self.reference_dataset.get_ht(self.reference_genome) diff --git a/v03_pipeline/lib/tasks/reference_data/updated_reference_dataset_collection.py b/v03_pipeline/lib/tasks/reference_data/updated_reference_dataset_collection.py deleted file mode 
100644 index af2144839..000000000 --- a/v03_pipeline/lib/tasks/reference_data/updated_reference_dataset_collection.py +++ /dev/null @@ -1,115 +0,0 @@ -import hail as hl -import luigi - -from v03_pipeline.lib.logger import get_logger -from v03_pipeline.lib.model import ReferenceDatasetCollection -from v03_pipeline.lib.paths import valid_reference_dataset_collection_path -from v03_pipeline.lib.reference_data.compare_globals import ( - Globals, - clinvar_versions_equal, - get_datasets_to_update, -) -from v03_pipeline.lib.reference_data.dataset_table_operations import ( - update_or_create_joined_ht, -) -from v03_pipeline.lib.tasks.base.base_loading_run_params import ( - BaseLoadingRunParams, -) -from v03_pipeline.lib.tasks.base.base_update import BaseUpdateTask -from v03_pipeline.lib.tasks.files import GCSorLocalTarget -from v03_pipeline.lib.tasks.validate_callset import ValidateCallsetTask - -logger = get_logger(__name__) - - -@luigi.util.inherits(BaseLoadingRunParams) -class UpdatedReferenceDatasetCollectionTask(BaseUpdateTask): - reference_dataset_collection = luigi.EnumParameter(enum=ReferenceDatasetCollection) - - def __init__(self, *args, **kwargs): - super().__init__(*args, **kwargs) - self._datasets_to_update = [] - - def requires(self) -> luigi.Task: - # Though there is no explicit functional dependency between - # validing the callset and updating the reference data, it's - # a more user-friendly experience for the callset validation - # to fail/succeed prior to attempting any - # compute intensive work. - # - # Note that, if validation is disabled or skipped the task - # still runs but is a no-op. 
- return self.clone(ValidateCallsetTask) - - def complete(self) -> bool: - self._datasets_to_update = [] - datasets = self.reference_dataset_collection.datasets(self.dataset_type) - - if not super().complete(): - logger.info('Creating a new reference dataset collection') - self._datasets_to_update.extend( - self.reference_dataset_collection.datasets( - self.dataset_type, - ), - ) - return False - - if any('clinvar' in d for d in datasets) and not clinvar_versions_equal( - hl.read_table(self.output().path), - self.reference_genome, - self.dataset_type, - ): - datasets.remove('clinvar') - self._datasets_to_update.add('clinvar') - - joined_ht_globals = Globals.from_ht( - hl.read_table(self.output().path), - datasets, - ) - dataset_config_globals = Globals.from_dataset_configs( - self.reference_genome, - datasets, - ) - self._datasets_to_update.extend( - get_datasets_to_update( - joined_ht_globals, - dataset_config_globals, - ), - ) - logger.info( - f'Datasets to update: {self._datasets_to_update} for {self.reference_dataset_collection}', - ) - return not self._datasets_to_update - - def output(self) -> luigi.Target: - return GCSorLocalTarget( - valid_reference_dataset_collection_path( - self.reference_genome, - self.dataset_type, - self.reference_dataset_collection, - ), - ) - - def initialize_table(self) -> hl.Table: - key_type = self.reference_dataset_collection.table_key_type( - self.reference_genome, - ) - return hl.Table.parallelize( - [], - key_type, - key=key_type.fields, - globals=hl.Struct( - paths=hl.Struct(), - versions=hl.Struct(), - enums=hl.Struct(), - ), - ) - - def update_table(self, ht: hl.Table) -> hl.Table: - return update_or_create_joined_ht( - self.reference_dataset_collection, - self.dataset_type, - self.reference_genome, - self._datasets_to_update, - ht, - ) diff --git a/v03_pipeline/lib/tasks/reference_data/updated_reference_dataset_collection_test.py b/v03_pipeline/lib/tasks/reference_data/updated_reference_dataset_collection_test.py deleted 
file mode 100644 index bc19d39d5..000000000 --- a/v03_pipeline/lib/tasks/reference_data/updated_reference_dataset_collection_test.py +++ /dev/null @@ -1,345 +0,0 @@ -import shutil -from unittest import mock -from unittest.mock import ANY - -import hail as hl -import luigi.worker - -from v03_pipeline.lib.annotations.enums import CLINVAR_PATHOGENICITIES -from v03_pipeline.lib.model import ( - DatasetType, - ReferenceDatasetCollection, - ReferenceGenome, - SampleType, -) -from v03_pipeline.lib.paths import valid_reference_dataset_collection_path -from v03_pipeline.lib.reference_data.clinvar import CLINVAR_ASSERTIONS -from v03_pipeline.lib.reference_data.config import CONFIG -from v03_pipeline.lib.tasks.reference_data.updated_reference_dataset_collection import ( - UpdatedReferenceDatasetCollectionTask, -) -from v03_pipeline.lib.test.mocked_dataroot_testcase import MockedDatarootTestCase - -COMBINED_2_PATH = 'v03_pipeline/var/test/reference_data/test_combined_2.ht' -TEST_SNV_INDEL_VCF = 'v03_pipeline/var/test/callsets/1kg_30variants.vcf' - -MOCK_PRIMATE_AI_DATASET_HT = hl.Table.parallelize( - [ - { - 'locus': hl.Locus( - contig='chr1', - position=871269, - reference_genome='GRCh38', - ), - 'alleles': ['A', 'C'], - 'info': hl.Struct(score=0.25), - }, - ], - hl.tstruct( - locus=hl.tlocus('GRCh38'), - alleles=hl.tarray(hl.tstr), - info=hl.tstruct(score=hl.tfloat32), - ), - key=['locus', 'alleles'], - globals=hl.Struct( - version='v0.3', - ), -) -MOCK_CADD_DATASET_HT = hl.Table.parallelize( - [ - { - 'locus': hl.Locus( - contig='chr1', - position=871269, - reference_genome='GRCh38', - ), - 'alleles': ['A', 'C'], - 'PHRED': 1, - }, - ], - hl.tstruct( - locus=hl.tlocus('GRCh38'), - alleles=hl.tarray(hl.tstr), - PHRED=hl.tint32, - ), - key=['locus', 'alleles'], - globals=hl.Struct( - version='v1.6', - ), -) -MOCK_CONFIG = { - 'primate_ai': { - '38': { - 'version': 'v0.3', - 'source_path': 'gs://seqr-reference-data/GRCh38/primate_ai/PrimateAI_scores_v0.2.liftover_grch38.ht', - 
'select': { - 'score': 'info.score', - }, - 'custom_import': lambda *_: MOCK_PRIMATE_AI_DATASET_HT, - }, - }, - 'cadd': { - '38': { - 'version': 'v1.6', - 'source_path': 'gs://seqr-reference-data/GRCh38/CADD/CADD_snvs_and_indels.v1.6.ht', - 'select': ['PHRED'], - 'custom_import': lambda *_: MOCK_CADD_DATASET_HT, - }, - }, - 'clinvar': { - '38': { - **CONFIG['clinvar']['38'], - 'custom_import': lambda *_: hl.Table.parallelize( - [ - { - 'locus': hl.Locus( - contig='chr1', - position=871269, - reference_genome='GRCh38', - ), - 'alleles': ['A', 'C'], - 'rsid': '5', - 'info': hl.Struct( - ALLELEID=1, - CLNSIG=[ - 'Pathogenic/Likely_pathogenic/Pathogenic', - '_low_penetrance', - ], - CLNSIGCONF=[ - 'Pathogenic(8)|Likely_pathogenic(2)|Pathogenic', - '_low_penetrance(1)|Uncertain_significance(1)', - ], - CLNREVSTAT=['no_classifications_from_unflagged_records'], - ), - 'submitters': [ - 'OMIM', - 'Broad Institute Rare Disease Group, Broad Institute', - 'PreventionGenetics, part of Exact Sciences', - 'Invitae', - ], - 'conditions': [ - 'C3661900:not provided', - 'C0023264:Leigh syndrome', - 'na:FOXRED1-related condition', - 'C4748791:Mitochondrial complex 1 deficiency, nuclear type 19', - ], - }, - ], - hl.tstruct( - locus=hl.tlocus('GRCh38'), - alleles=hl.tarray(hl.tstr), - rsid=hl.tstr, - info=hl.tstruct( - ALLELEID=hl.tint32, - CLNSIG=hl.tarray(hl.tstr), - CLNSIGCONF=hl.tarray(hl.tstr), - CLNREVSTAT=hl.tarray(hl.tstr), - ), - submitters=hl.tarray(hl.tstr), - conditions=hl.tarray(hl.tstr), - ), - key=['locus', 'alleles'], - globals=hl.Struct( - version='2023-11-26', - ), - ), - }, - }, -} - - -class UpdatedReferenceDatasetCollectionTaskTest(MockedDatarootTestCase): - @mock.patch.dict( - 'v03_pipeline.lib.reference_data.compare_globals.CONFIG', - MOCK_CONFIG, - ) - @mock.patch.dict( - 'v03_pipeline.lib.reference_data.dataset_table_operations.CONFIG', - MOCK_CONFIG, - ) - @mock.patch.object(ReferenceDatasetCollection, 'datasets') - @mock.patch( - 
'v03_pipeline.lib.tasks.reference_data.updated_reference_dataset_collection.clinvar_versions_equal', - ) - def test_update_task_with_empty_reference_data_table( - self, - mock_clinvar_versions_equal, - mock_rdc_datasets, - ) -> None: - """ - Given a new task with no existing reference dataset collection table, - expect the task to create a new reference dataset collection table for all datasets in the collection. - """ - mock_clinvar_versions_equal.return_value = True - mock_rdc_datasets.return_value = ['cadd', 'primate_ai', 'clinvar'] - worker = luigi.worker.Worker() - task = UpdatedReferenceDatasetCollectionTask( - reference_genome=ReferenceGenome.GRCh38, - dataset_type=DatasetType.SNV_INDEL, - reference_dataset_collection=ReferenceDatasetCollection.COMBINED, - sample_type=SampleType.WGS, - callset_path=TEST_SNV_INDEL_VCF, - project_guids=[], - project_remap_paths=[], - project_pedigree_paths=[], - skip_validation=True, - run_id='2', - ) - worker.add(task) - worker.run() - self.assertTrue(task.complete()) - - ht = hl.read_table(task.output().path) - self.assertCountEqual( - ht.collect(), - [ - hl.Struct( - locus=hl.Locus( - contig='chr1', - position=871269, - reference_genome='GRCh38', - ), - alleles=['A', 'C'], - primate_ai=hl.Struct(score=0.25), - cadd=hl.Struct(PHRED=1), - clinvar=hl.Struct( - alleleId=1, - submitters=[ - 'OMIM', - 'Broad Institute Rare Disease Group, Broad Institute', - 'PreventionGenetics, part of Exact Sciences', - 'Invitae', - ], - conditions=[ - 'not provided', - 'Leigh syndrome', - 'FOXRED1-related condition', - 'Mitochondrial complex 1 deficiency, nuclear type 19', - ], - conflictingPathogenicities=[ - hl.Struct(pathogenicity_id=0, count=9), - hl.Struct(pathogenicity_id=5, count=2), - hl.Struct(pathogenicity_id=12, count=1), - ], - goldStars=0, - pathogenicity_id=1, - assertion_ids=[5], - ), - ), - ], - ) - self.assertEqual( - ht.globals.collect(), - [ - hl.Struct( - paths=hl.Struct( - 
primate_ai='gs://seqr-reference-data/GRCh38/primate_ai/PrimateAI_scores_v0.2.liftover_grch38.ht', - cadd='gs://seqr-reference-data/GRCh38/CADD/CADD_snvs_and_indels.v1.6.ht', - clinvar='https://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh38/clinvar.vcf.gz', - ), - versions=hl.Struct( - primate_ai='v0.3', - cadd='v1.6', - clinvar='2023-11-26', - ), - enums=hl.Struct( - primate_ai=hl.Struct(), - cadd=hl.Struct(), - clinvar=hl.Struct( - pathogenicity=CLINVAR_PATHOGENICITIES, - assertion=CLINVAR_ASSERTIONS, - ), - ), - date=ANY, - ), - ], - ) - - @mock.patch.dict( - 'v03_pipeline.lib.reference_data.compare_globals.CONFIG', - MOCK_CONFIG, - ) - @mock.patch.dict( - 'v03_pipeline.lib.reference_data.dataset_table_operations.CONFIG', - MOCK_CONFIG, - ) - @mock.patch.object(ReferenceDatasetCollection, 'datasets') - def test_update_task_with_existing_reference_dataset_collection_table( - self, - mock_rdc_datasets, - ) -> None: - """ - Given an existing reference dataset collection which contains only the primate_ai dataset and has globals: - Struct(paths=Struct(primate_ai='gs://seqr-reference-data/GRCh38/primate_ai/PrimateAI_scores_v0.2.liftover_grch38.ht'), - versions=Struct(primate_ai='v0.2'), - enums=Struct(primate_ai=Struct()), - date=ANY), - expect the task to update the existing reference dataset collection table with the new dataset (cadd), - new values for primate_ai, and update the globals with the new primate_ai dataset's globals and cadd's globals. 
- """ - # copy existing reference dataset collection (primate_ai only) in COMBINED_2_PATH to test path - shutil.copytree( - COMBINED_2_PATH, - valid_reference_dataset_collection_path( - ReferenceGenome.GRCh38, - DatasetType.SNV_INDEL, - ReferenceDatasetCollection.COMBINED, - ), - ) - - mock_rdc_datasets.return_value = ['cadd', 'primate_ai'] - worker = luigi.worker.Worker() - task = UpdatedReferenceDatasetCollectionTask( - reference_genome=ReferenceGenome.GRCh38, - dataset_type=DatasetType.SNV_INDEL, - reference_dataset_collection=ReferenceDatasetCollection.COMBINED, - sample_type=SampleType.WGS, - callset_path=TEST_SNV_INDEL_VCF, - project_guids=[], - project_remap_paths=[], - project_pedigree_paths=[], - skip_validation=True, - run_id='2', - ) - worker.add(task) - worker.run() - self.assertTrue(task.complete()) - - ht = hl.read_table(task.output().path) - self.assertCountEqual( - ht.collect(), - [ - hl.Struct( - locus=hl.Locus( - contig='chr1', - position=871269, - reference_genome='GRCh38', - ), - alleles=['A', 'C'], - primate_ai=hl.Struct( - score=0.25, - ), # expect row in primate_ai to be updated from 0.5 to 0.25 - cadd=hl.Struct(PHRED=1), - ), - ], - ) - self.assertEqual( - ht.globals.collect(), - [ - hl.Struct( - paths=hl.Struct( - cadd='gs://seqr-reference-data/GRCh38/CADD/CADD_snvs_and_indels.v1.6.ht', - primate_ai='gs://seqr-reference-data/GRCh38/primate_ai/PrimateAI_scores_v0.2.liftover_grch38.ht', - ), - versions=hl.Struct( - cadd='v1.6', - primate_ai='v0.3', # expect primate_ai version to be updated - ), - enums=hl.Struct( - cadd=hl.Struct(), - primate_ai=hl.Struct(), - ), - date=ANY, - ), - ], - ) diff --git a/v03_pipeline/lib/tasks/reference_data/updated_reference_dataset_query.py b/v03_pipeline/lib/tasks/reference_data/updated_reference_dataset_query.py new file mode 100644 index 000000000..ff7db2db8 --- /dev/null +++ b/v03_pipeline/lib/tasks/reference_data/updated_reference_dataset_query.py @@ -0,0 +1,54 @@ +import hail as hl +import luigi + +from 
luigi_pipeline.lib.hail_tasks import GCSorLocalTarget +from v03_pipeline.lib.paths import valid_reference_dataset_query_path +from v03_pipeline.lib.reference_datasets.reference_dataset import ( + ReferenceDatasetQuery, +) +from v03_pipeline.lib.tasks.base.base_loading_pipeline_params import ( + BaseLoadingPipelineParams, +) +from v03_pipeline.lib.tasks.base.base_write import BaseWriteTask +from v03_pipeline.lib.tasks.reference_data.updated_reference_dataset import ( + UpdatedReferenceDatasetTask, +) + + +@luigi.util.inherits(BaseLoadingPipelineParams) +class UpdatedReferenceDatasetQueryTask(BaseWriteTask): + reference_dataset_query: ReferenceDatasetQuery = luigi.EnumParameter( + enum=ReferenceDatasetQuery, + ) + + # Reference Dataset Queries do not include version + # in the path to allow for simpler reading logic + # when they are used downstream by the hail search + # service. + def complete(self): + return super().complete() and hl.eval( + hl.read_table(self.output().path).version + == self.reference_dataset_query.version(self.reference_genome), + ) + + def requires(self): + return self.clone( + UpdatedReferenceDatasetTask, + reference_dataset=self.reference_dataset_query.requires, + ) + + def output(self): + return GCSorLocalTarget( + valid_reference_dataset_query_path( + self.reference_genome, + self.dataset_type, + self.reference_dataset_query, + ), + ) + + def create_table(self): + return self.reference_dataset_query.get_ht( + self.reference_genome, + self.dataset_type, + hl.read_table(self.input().path), + ) diff --git a/v03_pipeline/lib/tasks/reference_data/updated_reference_dataset_query_test.py b/v03_pipeline/lib/tasks/reference_data/updated_reference_dataset_query_test.py new file mode 100644 index 000000000..fae45c7ef --- /dev/null +++ b/v03_pipeline/lib/tasks/reference_data/updated_reference_dataset_query_test.py @@ -0,0 +1,182 @@ +from unittest.mock import patch + +import hail as hl +import luigi +import responses + +from v03_pipeline.lib.misc.io 
import write +from v03_pipeline.lib.model.dataset_type import DatasetType +from v03_pipeline.lib.model.definitions import ReferenceGenome +from v03_pipeline.lib.paths import ( + valid_reference_dataset_path, + valid_reference_dataset_query_path, +) +from v03_pipeline.lib.reference_datasets.reference_dataset import ( + ReferenceDataset, + ReferenceDatasetQuery, +) +from v03_pipeline.lib.tasks.reference_data.updated_reference_dataset_query import ( + UpdatedReferenceDatasetQueryTask, +) +from v03_pipeline.lib.test.mock_clinvar_urls import mock_clinvar_urls +from v03_pipeline.lib.test.mocked_dataroot_testcase import MockedDatarootTestCase + +GNOMAD_GENOMES_38_PATH = ( + 'v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_38.ht' +) + + +class UpdatedReferenceDatasetQueryTaskTest(MockedDatarootTestCase): + def setUp(self) -> None: + super().setUp() + # clinvar ReferenceDataset exists but is old + # clinvar_path ReferenceDatasetQuery dne + with patch.object( + ReferenceDataset, + 'version', + return_value='2021-01-01', + ): + write( + hl.Table.parallelize( + [ + { + 'locus': hl.Locus( + contig='chr1', + position=1, + reference_genome='GRCh38', + ), + 'alleles': ['A', 'C'], + }, + ], + hl.tstruct( + locus=hl.tlocus('GRCh38'), + alleles=hl.tarray(hl.tstr), + ), + key=['locus', 'alleles'], + globals=hl.Struct(version='2021-01-01'), + ), + valid_reference_dataset_path( + ReferenceGenome.GRCh38, + ReferenceDataset.clinvar, + ), + ) + + @responses.activate + def test_updated_query_and_dependency( + self, + ) -> None: + with mock_clinvar_urls(): + worker = luigi.worker.Worker() + task = UpdatedReferenceDatasetQueryTask( + reference_genome=ReferenceGenome.GRCh38, + dataset_type=DatasetType.SNV_INDEL, + reference_dataset_query=ReferenceDatasetQuery.clinvar_path_variants, + ) + worker.add(task) + worker.run() + self.assertTrue(task.complete()) + clinvar_ht_path = valid_reference_dataset_path( + ReferenceGenome.GRCh38, + ReferenceDataset.clinvar, + ) + clinvar_ht = 
hl.read_table(clinvar_ht_path) + self.assertTrue('2024-11-11' in clinvar_ht_path) + self.assertEqual( + hl.eval(clinvar_ht.version), + '2024-11-11', + ) + self.assertTrue(hasattr(clinvar_ht, 'submitters')) + contigs = clinvar_ht.aggregate(hl.agg.collect_as_set(clinvar_ht.locus.contig)) + self.assertTrue( + 'chr1' in contigs, + ) + self.assertTrue( + 'chrM' in contigs, + ) + clinvar_path_ht_path = valid_reference_dataset_query_path( + ReferenceGenome.GRCh38, + DatasetType.SNV_INDEL, + ReferenceDatasetQuery.clinvar_path_variants, + ) + clinvar_path_ht = hl.read_table(clinvar_path_ht_path) + self.assertEqual( + hl.eval(clinvar_path_ht.version), + '2024-11-11', + ) + self.assertTrue(hasattr(clinvar_path_ht, 'is_likely_pathogenic')) + contigs = clinvar_path_ht.aggregate( + hl.agg.collect_as_set(clinvar_path_ht.locus.contig), + ) + self.assertTrue( + 'chr1' in contigs, + ) + self.assertFalse( + 'chrM' in contigs, + ) + + @responses.activate + def test_updated_clinvar_query_and_dependency_mito( + self, + ) -> None: + with mock_clinvar_urls(): + worker = luigi.worker.Worker() + task = UpdatedReferenceDatasetQueryTask( + reference_genome=ReferenceGenome.GRCh38, + dataset_type=DatasetType.MITO, + reference_dataset_query=ReferenceDatasetQuery.clinvar_path_variants, + ) + worker.add(task) + worker.run() + self.assertTrue(task.complete()) + clinvar_ht = hl.read_table( + valid_reference_dataset_path( + ReferenceGenome.GRCh38, + ReferenceDataset.clinvar, + ), + ) + self.assertEqual( + hl.eval(clinvar_ht.version), + '2024-11-11', + ) + clinvar_path_ht_path = valid_reference_dataset_query_path( + ReferenceGenome.GRCh38, + DatasetType.MITO, + ReferenceDatasetQuery.clinvar_path_variants, + ) + clinvar_path_ht = hl.read_table(clinvar_path_ht_path) + contigs = clinvar_path_ht.aggregate( + hl.agg.collect_as_set(clinvar_path_ht.locus.contig), + ) + self.assertFalse( + 'chr1' in contigs, + ) + self.assertTrue( + 'chrM' in contigs, + ) + + def test_updated_query_high_af_variants(self) -> 
None: + with patch.object( + ReferenceDataset, + 'path', + return_value=GNOMAD_GENOMES_38_PATH, + ): + worker = luigi.worker.Worker() + task = UpdatedReferenceDatasetQueryTask( + reference_genome=ReferenceGenome.GRCh38, + dataset_type=DatasetType.SNV_INDEL, + reference_dataset_query=ReferenceDatasetQuery.high_af_variants, + ) + worker.add(task) + worker.run() + self.assertTrue(task.complete()) + high_af_variants_ht_path = valid_reference_dataset_query_path( + ReferenceGenome.GRCh38, + DatasetType.SNV_INDEL, + ReferenceDatasetQuery.high_af_variants, + ) + high_af_variants_ht = hl.read_table(high_af_variants_ht_path) + self.assertEqual( + hl.eval(high_af_variants_ht.version), + '1.0', + ) + self.assertTrue(hasattr(high_af_variants_ht, 'is_gt_1_percent')) diff --git a/v03_pipeline/lib/tasks/update_variant_annotations_table_with_new_samples_test.py b/v03_pipeline/lib/tasks/update_variant_annotations_table_with_new_samples_test.py index 345966489..5ae628407 100644 --- a/v03_pipeline/lib/tasks/update_variant_annotations_table_with_new_samples_test.py +++ b/v03_pipeline/lib/tasks/update_variant_annotations_table_with_new_samples_test.py @@ -4,10 +4,10 @@ import hail as hl import luigi.worker +import responses from v03_pipeline.lib.annotations.enums import ( BIOTYPES, - CLINVAR_PATHOGENICITIES, FIVEUTR_CONSEQUENCES, LOF_FILTERS, MITOTIP_PATHOGENICITIES, @@ -22,17 +22,14 @@ from v03_pipeline.lib.misc.io import remap_pedigree_hash from v03_pipeline.lib.misc.validation import validate_expected_contig_frequency from v03_pipeline.lib.model import ( - CachedReferenceDatasetQuery, DatasetType, - ReferenceDatasetCollection, ReferenceGenome, SampleType, ) from v03_pipeline.lib.paths import ( - cached_reference_dataset_query_path, - valid_reference_dataset_collection_path, + valid_reference_dataset_path, ) -from v03_pipeline.lib.reference_data.clinvar import CLINVAR_ASSERTIONS +from v03_pipeline.lib.reference_datasets.reference_dataset import ReferenceDataset from 
v03_pipeline.lib.tasks.base.base_update_variant_annotations_table import ( BaseUpdateVariantAnnotationsTableTask, ) @@ -40,8 +37,11 @@ from v03_pipeline.lib.tasks.update_variant_annotations_table_with_new_samples import ( UpdateVariantAnnotationsTableWithNewSamplesTask, ) +from v03_pipeline.lib.test.mock_clinvar_urls import mock_clinvar_urls from v03_pipeline.lib.test.mock_complete_task import MockCompleteTask -from v03_pipeline.lib.test.mocked_dataroot_testcase import MockedDatarootTestCase +from v03_pipeline.lib.test.mocked_reference_datasets_testcase import ( + MockedReferenceDatasetsTestCase, +) from v03_pipeline.var.test.vep.mock_vep_data import MOCK_37_VEP_DATA, MOCK_38_VEP_DATA GRCH38_TO_GRCH37_LIFTOVER_REF_PATH = ( @@ -55,13 +55,6 @@ TEST_PEDIGREE_3 = 'v03_pipeline/var/test/pedigrees/test_pedigree_3.tsv' TEST_PEDIGREE_4 = 'v03_pipeline/var/test/pedigrees/test_pedigree_4.tsv' TEST_PEDIGREE_5 = 'v03_pipeline/var/test/pedigrees/test_pedigree_5.tsv' -TEST_COMBINED_1 = 'v03_pipeline/var/test/reference_data/test_combined_1.ht' -TEST_COMBINED_37 = 'v03_pipeline/var/test/reference_data/test_combined_37.ht' -TEST_COMBINED_MITO_1 = 'v03_pipeline/var/test/reference_data/test_combined_mito_1.ht' -TEST_HGMD_1 = 'v03_pipeline/var/test/reference_data/test_hgmd_1.ht' -TEST_HGMD_37 = 'v03_pipeline/var/test/reference_data/test_hgmd_37.ht' -TEST_INTERVAL_1 = 'v03_pipeline/var/test/reference_data/test_interval_1.ht' -TEST_INTERVAL_MITO_1 = 'v03_pipeline/var/test/reference_data/test_interval_mito_1.ht' GENE_ID_MAPPING = { 'OR4F5': 'ENSG00000186092', @@ -85,140 +78,70 @@ TEST_RUN_ID = 'manual__2024-04-03' -@patch( - 'v03_pipeline.lib.tasks.base.base_update_variant_annotations_table.UpdatedReferenceDatasetCollectionTask', -) -@patch( - 'v03_pipeline.lib.tasks.base.base_update_variant_annotations_table.UpdateCachedReferenceDatasetQueries', -) -class UpdateVariantAnnotationsTableWithNewSamplesTaskTest(MockedDatarootTestCase): - def setUp(self) -> None: - super().setUp() - 
shutil.copytree( - TEST_COMBINED_1, - valid_reference_dataset_collection_path( - ReferenceGenome.GRCh38, - DatasetType.SNV_INDEL, - ReferenceDatasetCollection.COMBINED, - ), - ) - shutil.copytree( - TEST_COMBINED_37, - valid_reference_dataset_collection_path( - ReferenceGenome.GRCh37, - DatasetType.SNV_INDEL, - ReferenceDatasetCollection.COMBINED, - ), - ) - shutil.copytree( - TEST_HGMD_1, - valid_reference_dataset_collection_path( - ReferenceGenome.GRCh38, - DatasetType.SNV_INDEL, - ReferenceDatasetCollection.HGMD, - ), - ) - shutil.copytree( - TEST_HGMD_37, - valid_reference_dataset_collection_path( - ReferenceGenome.GRCh37, - DatasetType.SNV_INDEL, - ReferenceDatasetCollection.HGMD, - ), - ) - shutil.copytree( - TEST_INTERVAL_1, - valid_reference_dataset_collection_path( - ReferenceGenome.GRCh38, - DatasetType.SNV_INDEL, - ReferenceDatasetCollection.INTERVAL, - ), - ) - shutil.copytree( - TEST_COMBINED_MITO_1, - valid_reference_dataset_collection_path( - ReferenceGenome.GRCh38, - DatasetType.MITO, - ReferenceDatasetCollection.COMBINED, - ), - ) - shutil.copytree( - TEST_INTERVAL_MITO_1, - valid_reference_dataset_collection_path( - ReferenceGenome.GRCh38, - DatasetType.MITO, - ReferenceDatasetCollection.INTERVAL, - ), - ) - +class UpdateVariantAnnotationsTableWithNewSamplesTaskTest( + MockedReferenceDatasetsTestCase, +): + @responses.activate @patch( 'v03_pipeline.lib.tasks.write_new_variants_table.UpdateVariantAnnotationsTableWithUpdatedReferenceDataset', ) def test_missing_pedigree( self, mock_update_vat_with_rdc_task, - mock_update_crdqs_task, - mock_update_rdc_task, ) -> None: - mock_update_rdc_task.return_value = MockCompleteTask() - mock_update_crdqs_task.return_value = MockCompleteTask() - mock_update_vat_with_rdc_task.return_value = MockCompleteTask() - uvatwns_task = UpdateVariantAnnotationsTableWithNewSamplesTask( - reference_genome=ReferenceGenome.GRCh38, - dataset_type=DatasetType.SNV_INDEL, - sample_type=SampleType.WGS, - 
callset_path=TEST_SNV_INDEL_VCF, - project_guids=['R0113_test_project'], - project_remap_paths=[TEST_REMAP], - project_pedigree_paths=['bad_pedigree'], - skip_validation=True, - run_id=TEST_RUN_ID, - ) - worker = luigi.worker.Worker() - worker.add(uvatwns_task) - worker.run() - self.assertFalse(uvatwns_task.complete()) + with mock_clinvar_urls(): + mock_update_vat_with_rdc_task.return_value = MockCompleteTask() + uvatwns_task = UpdateVariantAnnotationsTableWithNewSamplesTask( + reference_genome=ReferenceGenome.GRCh38, + dataset_type=DatasetType.SNV_INDEL, + sample_type=SampleType.WGS, + callset_path=TEST_SNV_INDEL_VCF, + project_guids=['R0113_test_project'], + project_remap_paths=[TEST_REMAP], + project_pedigree_paths=['bad_pedigree'], + skip_validation=True, + run_id=TEST_RUN_ID, + ) + worker = luigi.worker.Worker() + worker.add(uvatwns_task) + worker.run() + self.assertFalse(uvatwns_task.complete()) + @responses.activate @patch( 'v03_pipeline.lib.tasks.write_new_variants_table.UpdateVariantAnnotationsTableWithUpdatedReferenceDataset', ) - def test_missing_interval_reference( + def test_missing_interval_reference_dataset( self, - mock_update_vat_with_rdc_task, - mock_update_crdqs_task, - mock_update_rdc_task, + mock_update_vat_with_rd_task, ) -> None: - mock_update_rdc_task.return_value = MockCompleteTask() - mock_update_crdqs_task.return_value = MockCompleteTask() - mock_update_vat_with_rdc_task.return_value = MockCompleteTask() - shutil.rmtree( - valid_reference_dataset_collection_path( - ReferenceGenome.GRCh38, - DatasetType.SNV_INDEL, - ReferenceDatasetCollection.INTERVAL, - ), - ) - uvatwns_task = UpdateVariantAnnotationsTableWithNewSamplesTask( - reference_genome=ReferenceGenome.GRCh38, - dataset_type=DatasetType.SNV_INDEL, - sample_type=SampleType.WGS, - callset_path=TEST_SNV_INDEL_VCF, - project_guids=['R0113_test_project'], - project_remap_paths=[TEST_REMAP], - project_pedigree_paths=[TEST_PEDIGREE_3], - skip_validation=True, - run_id=TEST_RUN_ID, - ) - 
worker = luigi.worker.Worker() - worker.add(uvatwns_task) - worker.run() - self.assertFalse(uvatwns_task.complete()) + with mock_clinvar_urls(): + mock_update_vat_with_rd_task.return_value = MockCompleteTask() + shutil.rmtree( + valid_reference_dataset_path( + ReferenceGenome.GRCh38, + ReferenceDataset.screen, + ), + ) + uvatwns_task = UpdateVariantAnnotationsTableWithNewSamplesTask( + reference_genome=ReferenceGenome.GRCh38, + dataset_type=DatasetType.SNV_INDEL, + sample_type=SampleType.WGS, + callset_path=TEST_SNV_INDEL_VCF, + project_guids=['R0113_test_project'], + project_remap_paths=[TEST_REMAP], + project_pedigree_paths=[TEST_PEDIGREE_3], + skip_validation=True, + run_id=TEST_RUN_ID, + ) + worker = luigi.worker.Worker() + worker.add(uvatwns_task) + worker.run() + self.assertFalse(uvatwns_task.complete()) + @responses.activate @patch('v03_pipeline.lib.tasks.write_new_variants_table.register_alleles_in_chunks') @patch('v03_pipeline.lib.tasks.write_new_variants_table.Env') - @patch( - 'v03_pipeline.lib.tasks.validate_callset.UpdatedCachedReferenceDatasetQuery', - ) @patch( 'v03_pipeline.lib.tasks.write_new_variants_table.UpdateVariantAnnotationsTableWithUpdatedReferenceDataset', ) @@ -236,17 +159,11 @@ def test_multiple_update_vat( mock_load_gencode_ensembl_to_refseq_id: Mock, mock_vep: Mock, mock_standard_contigs: Mock, - mock_update_vat_with_rdc_task: Mock, - mock_updated_cached_reference_dataset_query, + mock_update_vat_with_rd_task: Mock, mock_env: Mock, mock_register_alleles: Mock, - mock_update_crdqs_task, - mock_update_rdc_task: Mock, ) -> None: - mock_updated_cached_reference_dataset_query.return_value = MockCompleteTask() - mock_update_rdc_task.return_value = MockCompleteTask() - mock_update_crdqs_task.return_value = MockCompleteTask() - mock_update_vat_with_rdc_task.return_value = ( + mock_update_vat_with_rd_task.return_value = ( BaseUpdateVariantAnnotationsTableTask( reference_genome=ReferenceGenome.GRCh38, dataset_type=DatasetType.SNV_INDEL, @@ 
-348,337 +265,284 @@ def test_multiple_update_vat( ), key='locus', globals=hl.Struct( - paths=hl.Struct( - gnomad_genomes='gs://gcp-public-data--gnomad/release/4.1/ht/genomes/gnomad.genomes.v4.1.sites.ht', - ), versions=hl.Struct( - gnomad_genomes='4.1', + gnomad_genomes='1.0', ), enums=hl.Struct( gnomad_genomes=hl.Struct(), ), ), ) - coding_and_noncoding_variants_ht.write( - cached_reference_dataset_query_path( - ReferenceGenome.GRCh38, - DatasetType.SNV_INDEL, - CachedReferenceDatasetQuery.GNOMAD_CODING_AND_NONCODING_VARIANTS, - ), - ) - worker = luigi.worker.Worker() - uvatwns_task_3 = UpdateVariantAnnotationsTableWithNewSamplesTask( - reference_genome=ReferenceGenome.GRCh38, - dataset_type=DatasetType.SNV_INDEL, - sample_type=SampleType.WGS, - callset_path=TEST_SNV_INDEL_VCF, - project_guids=['R0113_test_project'], - project_remap_paths=[TEST_REMAP], - project_pedigree_paths=[TEST_PEDIGREE_3], - skip_validation=False, - run_id=TEST_RUN_ID, - ) - worker.add(uvatwns_task_3) - worker.run() - self.assertTrue(uvatwns_task_3.complete()) - ht = hl.read_table(uvatwns_task_3.output().path) - self.assertEqual(ht.count(), 30) - self.assertEqual( - [ - x - for x in ht.select( - 'gt_stats', - 'CAID', - ).collect() - if x.locus.position <= 871269 # noqa: PLR2004 - ], - [ - hl.Struct( - locus=hl.Locus( - contig='chr1', - position=871269, - reference_genome='GRCh38', - ), - alleles=['A', 'C'], - gt_stats=hl.Struct(AC=0, AN=6, AF=0.0, hom=0), - CAID='CA1', + + with mock_clinvar_urls(): + coding_and_noncoding_variants_ht.write( + valid_reference_dataset_path( + ReferenceGenome.GRCh38, + ReferenceDataset.gnomad_coding_and_noncoding, ), - ], - ) - self.assertEqual( - ht.globals.updates.collect(), - [ - { + overwrite=True, + ) + worker = luigi.worker.Worker() + + uvatwns_task_3 = UpdateVariantAnnotationsTableWithNewSamplesTask( + reference_genome=ReferenceGenome.GRCh38, + dataset_type=DatasetType.SNV_INDEL, + sample_type=SampleType.WGS, + callset_path=TEST_SNV_INDEL_VCF, + 
project_guids=['R0113_test_project'], + project_remap_paths=[TEST_REMAP], + project_pedigree_paths=[TEST_PEDIGREE_3], + skip_validation=False, + run_id=TEST_RUN_ID, + ) + worker.add(uvatwns_task_3) + worker.run() + self.assertTrue(uvatwns_task_3.complete()) + ht = hl.read_table(uvatwns_task_3.output().path) + self.assertEqual(ht.count(), 30) + self.assertEqual( + [ + x + for x in ht.select( + 'gt_stats', + 'CAID', + ).collect() + if x.locus.position <= 871269 # noqa: PLR2004 + ], + [ hl.Struct( - callset=TEST_SNV_INDEL_VCF, - project_guid='R0113_test_project', - remap_pedigree_hash=hl.eval( - remap_pedigree_hash(TEST_REMAP, TEST_PEDIGREE_3), + locus=hl.Locus( + contig='chr1', + position=871269, + reference_genome='GRCh38', ), + alleles=['A', 'C'], + gt_stats=hl.Struct(AC=0, AN=6, AF=0.0, hom=0), + CAID='CA1', ), - }, - ], - ) - - # Ensure that new variants are added correctly to the table. - uvatwns_task_4 = UpdateVariantAnnotationsTableWithNewSamplesTask( - reference_genome=ReferenceGenome.GRCh38, - dataset_type=DatasetType.SNV_INDEL, - sample_type=SampleType.WGS, - callset_path=TEST_SNV_INDEL_VCF, - project_guids=['R0114_project4'], - project_remap_paths=[TEST_REMAP], - project_pedigree_paths=[TEST_PEDIGREE_4], - skip_validation=False, - run_id=TEST_RUN_ID + '-another-run', - ) - worker.add(uvatwns_task_4) - worker.run() - self.assertTrue(uvatwns_task_4.complete()) - ht = hl.read_table(uvatwns_task_4.output().path) - self.assertCountEqual( - [ - x - for x in ht.select( - 'cadd', - 'clinvar', - 'hgmd', - 'variant_id', - 'xpos', - 'gt_stats', - 'screen', - 'CAID', - ).collect() - if x.locus.position <= 878809 # noqa: PLR2004 - ], - [ - hl.Struct( - locus=hl.Locus( - contig='chr1', - position=871269, - reference_genome='GRCh38', - ), - alleles=['A', 'C'], - cadd=hl.Struct(PHRED=2), - clinvar=hl.Struct( - alleleId=None, - conflictingPathogenicities=None, - goldStars=None, - pathogenicity_id=None, - assertion_ids=None, - submitters=None, - conditions=None, - ), - 
hgmd=hl.Struct( - accession='abcdefg', - class_id=3, - ), - variant_id='1-871269-A-C', - xpos=1000871269, - gt_stats=hl.Struct(AC=1, AN=32, AF=0.03125, hom=0), - screen=hl.Struct(region_type_ids=[1]), - CAID='CA1', - ), - hl.Struct( - locus=hl.Locus( - contig='chr1', - position=874734, - reference_genome='GRCh38', - ), - alleles=['C', 'T'], - cadd=None, - clinvar=None, - hgmd=None, - variant_id='1-874734-C-T', - xpos=1000874734, - gt_stats=hl.Struct(AC=1, AN=32, AF=0.03125, hom=0), - screen=hl.Struct(region_type_ids=[]), - CAID='CA2', - ), - hl.Struct( - locus=hl.Locus( - contig='chr1', - position=876499, - reference_genome='GRCh38', - ), - alleles=['A', 'G'], - cadd=None, - clinvar=None, - hgmd=None, - variant_id='1-876499-A-G', - xpos=1000876499, - gt_stats=hl.Struct(AC=31, AN=32, AF=0.96875, hom=15), - screen=hl.Struct(region_type_ids=[]), - CAID='CA3', - ), - hl.Struct( - locus=hl.Locus( - contig='chr1', - position=878314, - reference_genome='GRCh38', - ), - alleles=['G', 'C'], - cadd=None, - clinvar=None, - hgmd=None, - variant_id='1-878314-G-C', - xpos=1000878314, - gt_stats=hl.Struct(AC=3, AN=32, AF=0.09375, hom=0), - screen=hl.Struct(region_type_ids=[]), - CAID='CA4', - ), - hl.Struct( - locus=hl.Locus( - contig='chr1', - position=878809, - reference_genome='GRCh38', - ), - alleles=['C', 'T'], - cadd=None, - clinvar=None, - hgmd=None, - variant_id='1-878809-C-T', - xpos=1000878809, - gt_stats=hl.Struct(AC=1, AN=32, AF=0.03125, hom=0), - screen=hl.Struct(region_type_ids=[]), - CAID=None, - ), - ], - ) - self.assertCountEqual( - ht.filter( - ht.locus.position <= 878809, # noqa: PLR2004 - ).sorted_transcript_consequences.consequence_term_ids.collect(), - [ - [[9], [23, 26], [23, 13, 26]], - [[9], [23, 26], [23, 13, 26]], - [[9], [23, 26], [23, 13, 26]], - [[9], [23, 26], [23, 13, 26]], - [[9], [23, 26], [23, 13, 26]], - ], - ) - self.assertCountEqual( - ht.globals.collect(), - [ - hl.Struct( - updates={ + ], + ) + self.assertEqual( + 
ht.globals.updates.collect(), + [ + { hl.Struct( - callset='v03_pipeline/var/test/callsets/1kg_30variants.vcf', + callset=TEST_SNV_INDEL_VCF, project_guid='R0113_test_project', remap_pedigree_hash=hl.eval( - remap_pedigree_hash( - TEST_REMAP, - TEST_PEDIGREE_3, - ), - ), - ), - hl.Struct( - callset='v03_pipeline/var/test/callsets/1kg_30variants.vcf', - project_guid='R0114_project4', - remap_pedigree_hash=hl.eval( - remap_pedigree_hash( - TEST_REMAP, - TEST_PEDIGREE_4, - ), + remap_pedigree_hash(TEST_REMAP, TEST_PEDIGREE_3), ), ), }, - paths=hl.Struct( - cadd='gs://seqr-reference-data/GRCh37/CADD/CADD_snvs_and_indels.v1.6.ht', - clinvar='https://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh37/clinvar.vcf.gz', - dbnsfp='gs://seqr-reference-data/GRCh37/dbNSFP/v2.9.3/dbNSFP2.9.3_variant.ht', - eigen='gs://seqr-reference-data/GRCh37/eigen/EIGEN_coding_noncoding.grch37.ht', - exac='gs://seqr-reference-data/GRCh37/gnomad/ExAC.r1.sites.vep.ht', - gnomad_exomes='gs://gcp-public-data--gnomad/release/4.1/ht/exomes/gnomad.exomes.v4.1.sites.ht', - gnomad_genomes='gs://gcp-public-data--gnomad/release/4.1/ht/genomes/gnomad.genomes.v4.1.sites.ht', - mpc='gs://seqr-reference-data/GRCh37/MPC/fordist_constraint_official_mpc_values.ht', - primate_ai='gs://seqr-reference-data/GRCh37/primate_ai/PrimateAI_scores_v0.2.ht', - splice_ai='gs://seqr-reference-data/GRCh37/spliceai/spliceai_scores.ht', - topmed='gs://seqr-reference-data/GRCh37/TopMed/bravo-dbsnp-all.removed_chr_prefix.liftunder_GRCh37.ht', - gnomad_non_coding_constraint='gs://seqr-reference-data/GRCh38/gnomad_nc_constraint/gnomad_non-coding_constraint_z_scores.ht', - screen='gs://seqr-reference-data/GRCh38/ccREs/GRCh38-ccREs.ht', - hgmd='gs://seqr-reference-data-private/GRCh38/HGMD/HGMD_Pro_2023.1_hg38.vcf.gz', - ), - versions=hl.Struct( - cadd='v1.6', - clinvar='2023-11-26', - dbnsfp='2.9.3', - eigen=None, - exac=None, - gnomad_exomes='4.1', - gnomad_genomes='4.1', - mpc=None, - primate_ai='v0.2', - splice_ai=None, - topmed=None, 
- gnomad_non_coding_constraint=None, - screen=None, - hgmd='HGMD_Pro_2023', - ), - migrations=[], - enums=hl.Struct( - cadd=hl.Struct(), + ], + ) + + # Ensure that new variants are added correctly to the table. + uvatwns_task_4 = UpdateVariantAnnotationsTableWithNewSamplesTask( + reference_genome=ReferenceGenome.GRCh38, + dataset_type=DatasetType.SNV_INDEL, + sample_type=SampleType.WGS, + callset_path=TEST_SNV_INDEL_VCF, + project_guids=['R0114_project4'], + project_remap_paths=[TEST_REMAP], + project_pedigree_paths=[TEST_PEDIGREE_4], + skip_validation=False, + run_id=TEST_RUN_ID + '-another-run', + ) + worker.add(uvatwns_task_4) + worker.run() + self.assertTrue(uvatwns_task_4.complete()) + ht = hl.read_table(uvatwns_task_4.output().path) + self.assertCountEqual( + [ + x + for x in ht.select( + 'clinvar', + 'hgmd', + 'variant_id', + 'xpos', + 'gt_stats', + 'screen', + 'CAID', + ).collect() + if x.locus.position <= 878809 # noqa: PLR2004 + ], + [ + hl.Struct( + locus=hl.Locus( + contig='chr1', + position=871269, + reference_genome='GRCh38', + ), + alleles=['A', 'C'], clinvar=hl.Struct( - assertion=CLINVAR_ASSERTIONS, - pathogenicity=CLINVAR_PATHOGENICITIES, + alleleId=None, + conflictingPathogenicities=None, + goldStars=None, + pathogenicity_id=None, + assertion_ids=None, + submitters=None, + conditions=None, ), - dbnsfp=hl.Struct( - MutationTaster_pred=['D', 'A', 'N', 'P'], + hgmd=hl.Struct( + accession='abcdefg', + class_id=3, ), - eigen=hl.Struct(), - exac=hl.Struct(), - gnomad_exomes=hl.Struct(), - gnomad_genomes=hl.Struct(), - mpc=hl.Struct(), - primate_ai=hl.Struct(), - splice_ai=hl.Struct( - splice_consequence=[ - 'Acceptor gain', - 'Acceptor loss', - 'Donor gain', - 'Donor loss', - 'No consequence', - ], + variant_id='1-871269-A-C', + xpos=1000871269, + gt_stats=hl.Struct(AC=1, AN=32, AF=0.03125, hom=0), + screen=hl.Struct(region_type_ids=[1]), + CAID='CA1', + ), + hl.Struct( + locus=hl.Locus( + contig='chr1', + position=874734, + reference_genome='GRCh38', 
), - topmed=hl.Struct(), - hgmd=hl.Struct( - **{'class': ['DM', 'DM?', 'DP', 'DFP', 'FP', 'R']}, + alleles=['C', 'T'], + clinvar=None, + hgmd=None, + variant_id='1-874734-C-T', + xpos=1000874734, + gt_stats=hl.Struct(AC=1, AN=32, AF=0.03125, hom=0), + screen=hl.Struct(region_type_ids=[]), + CAID='CA2', + ), + hl.Struct( + locus=hl.Locus( + contig='chr1', + position=876499, + reference_genome='GRCh38', ), - gnomad_non_coding_constraint=hl.Struct(), - screen=hl.Struct( - region_type=[ - 'CTCF-bound', - 'CTCF-only', - 'DNase-H3K4me3', - 'PLS', - 'dELS', - 'pELS', - 'DNase-only', - 'low-DNase', - ], + alleles=['A', 'G'], + clinvar=None, + hgmd=None, + variant_id='1-876499-A-G', + xpos=1000876499, + gt_stats=hl.Struct(AC=31, AN=32, AF=0.96875, hom=15), + screen=hl.Struct(region_type_ids=[]), + CAID='CA3', + ), + hl.Struct( + locus=hl.Locus( + contig='chr1', + position=878314, + reference_genome='GRCh38', ), - sorted_motif_feature_consequences=hl.Struct( - consequence_term=MOTIF_CONSEQUENCE_TERMS, + alleles=['G', 'C'], + clinvar=None, + hgmd=None, + variant_id='1-878314-G-C', + xpos=1000878314, + gt_stats=hl.Struct(AC=3, AN=32, AF=0.09375, hom=0), + screen=hl.Struct(region_type_ids=[]), + CAID='CA4', + ), + hl.Struct( + locus=hl.Locus( + contig='chr1', + position=878809, + reference_genome='GRCh38', ), - sorted_regulatory_feature_consequences=hl.Struct( - biotype=REGULATORY_BIOTYPES, - consequence_term=REGULATORY_CONSEQUENCE_TERMS, + alleles=['C', 'T'], + clinvar=None, + hgmd=None, + variant_id='1-878809-C-T', + xpos=1000878809, + gt_stats=hl.Struct(AC=1, AN=32, AF=0.03125, hom=0), + screen=hl.Struct(region_type_ids=[]), + CAID=None, + ), + ], + ) + self.assertCountEqual( + ht.filter( + ht.locus.position <= 878809, # noqa: PLR2004 + ).sorted_transcript_consequences.consequence_term_ids.collect(), + [ + [[9], [23, 26], [23, 13, 26]], + [[9], [23, 26], [23, 13, 26]], + [[9], [23, 26], [23, 13, 26]], + [[9], [23, 26], [23, 13, 26]], + [[9], [23, 26], [23, 13, 26]], + ], + ) 
+ self.assertCountEqual( + ht.globals.collect(), + [ + hl.Struct( + updates={ + hl.Struct( + callset='v03_pipeline/var/test/callsets/1kg_30variants.vcf', + project_guid='R0113_test_project', + remap_pedigree_hash=hl.eval( + remap_pedigree_hash( + TEST_REMAP, + TEST_PEDIGREE_3, + ), + ), + ), + hl.Struct( + callset='v03_pipeline/var/test/callsets/1kg_30variants.vcf', + project_guid='R0114_project4', + remap_pedigree_hash=hl.eval( + remap_pedigree_hash( + TEST_REMAP, + TEST_PEDIGREE_4, + ), + ), + ), + }, + versions=hl.Struct( + clinvar='2024-11-11', + dbnsfp='1.0', + eigen='1.0', + exac='1.0', + gnomad_exomes='1.0', + gnomad_genomes='1.0', + splice_ai='1.0', + topmed='1.0', + gnomad_non_coding_constraint='1.0', + screen='1.0', + hgmd='1.0', ), - sorted_transcript_consequences=hl.Struct( - biotype=BIOTYPES, - consequence_term=TRANSCRIPT_CONSEQUENCE_TERMS, - loftee=hl.Struct( - lof_filter=LOF_FILTERS, + migrations=[], + enums=hl.Struct( + clinvar=ReferenceDataset.clinvar.enum_globals, + dbnsfp=ReferenceDataset.dbnsfp.enum_globals, + eigen=hl.Struct(), + exac=hl.Struct(), + gnomad_exomes=hl.Struct(), + gnomad_genomes=hl.Struct(), + splice_ai=ReferenceDataset.splice_ai.enum_globals, + topmed=hl.Struct(), + hgmd=ReferenceDataset.hgmd.enum_globals, + gnomad_non_coding_constraint=hl.Struct(), + screen=ReferenceDataset.screen.enum_globals, + sorted_motif_feature_consequences=hl.Struct( + consequence_term=MOTIF_CONSEQUENCE_TERMS, ), - utrannotator=hl.Struct( - fiveutr_consequence=FIVEUTR_CONSEQUENCES, + sorted_regulatory_feature_consequences=hl.Struct( + biotype=REGULATORY_BIOTYPES, + consequence_term=REGULATORY_CONSEQUENCE_TERMS, + ), + sorted_transcript_consequences=hl.Struct( + biotype=BIOTYPES, + consequence_term=TRANSCRIPT_CONSEQUENCE_TERMS, + loftee=hl.Struct( + lof_filter=LOF_FILTERS, + ), + utrannotator=hl.Struct( + fiveutr_consequence=FIVEUTR_CONSEQUENCES, + ), ), ), ), - ), - ], - ) + ], + ) + @responses.activate 
@patch('v03_pipeline.lib.tasks.write_new_variants_table.register_alleles_in_chunks') @patch( 'v03_pipeline.lib.tasks.write_new_variants_table.UpdateVariantAnnotationsTableWithUpdatedReferenceDataset', @@ -687,14 +551,10 @@ def test_multiple_update_vat( def test_update_vat_grch37( self, mock_vep: Mock, - mock_update_vat_with_rdc_task: Mock, + mock_update_vat_with_rd_task: Mock, mock_register_alleles: Mock, - mock_update_crdqs_task, - mock_update_rdc_task: Mock, ) -> None: - mock_update_rdc_task.return_value = MockCompleteTask() - mock_update_crdqs_task.return_value = MockCompleteTask() - mock_update_vat_with_rdc_task.return_value = ( + mock_update_vat_with_rd_task.return_value = ( BaseUpdateVariantAnnotationsTableTask( reference_genome=ReferenceGenome.GRCh37, dataset_type=DatasetType.SNV_INDEL, @@ -702,151 +562,147 @@ def test_update_vat_grch37( ) mock_vep.side_effect = lambda ht, **_: ht.annotate(vep=MOCK_37_VEP_DATA) mock_register_alleles.side_effect = None - worker = luigi.worker.Worker() - uvatwns_task = UpdateVariantAnnotationsTableWithNewSamplesTask( - reference_genome=ReferenceGenome.GRCh37, - dataset_type=DatasetType.SNV_INDEL, - sample_type=SampleType.WGS, - callset_path=TEST_SNV_INDEL_VCF, - project_guids=['R0113_test_project'], - project_remap_paths=[TEST_REMAP], - project_pedigree_paths=[TEST_PEDIGREE_3], - skip_validation=True, - run_id=TEST_RUN_ID, - ) - worker.add(uvatwns_task) - worker.run() - self.assertTrue(uvatwns_task.complete()) - ht = hl.read_table(uvatwns_task.output().path) - self.assertEqual(ht.count(), 30) - self.assertCountEqual( - ht.globals.paths.collect(), - [ + + with mock_clinvar_urls(ReferenceGenome.GRCh37): + worker = luigi.worker.Worker() + uvatwns_task = UpdateVariantAnnotationsTableWithNewSamplesTask( + reference_genome=ReferenceGenome.GRCh37, + dataset_type=DatasetType.SNV_INDEL, + sample_type=SampleType.WGS, + callset_path=TEST_SNV_INDEL_VCF, + project_guids=['R0113_test_project'], + project_remap_paths=[TEST_REMAP], + 
project_pedigree_paths=[TEST_PEDIGREE_3], + skip_validation=True, + run_id=TEST_RUN_ID, + ) + worker.add(uvatwns_task) + worker.run() + self.assertTrue(uvatwns_task.complete()) + ht = hl.read_table(uvatwns_task.output().path) + self.assertEqual(ht.count(), 30) + self.assertFalse(hasattr(ht, 'rg37_locus')) + self.assertEqual( + ht.collect()[0], hl.Struct( - cadd='gs://seqr-reference-data/GRCh37/CADD/CADD_snvs_and_indels.v1.6.ht', - clinvar='https://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh37/clinvar.vcf.gz', - dbnsfp='gs://seqr-reference-data/GRCh37/dbNSFP/v2.9.3/dbNSFP2.9.3_variant.ht', - eigen='gs://seqr-reference-data/GRCh37/eigen/EIGEN_coding_noncoding.grch37.ht', - exac='gs://seqr-reference-data/GRCh37/gnomad/ExAC.r1.sites.vep.ht', - hgmd='gs://seqr-reference-data-private/GRCh37/HGMD/HGMD_Pro_2023.1_hg19.vcf.gz', - gnomad_exomes='gs://gcp-public-data--gnomad/release/2.1.1/ht/exomes/gnomad.exomes.r2.1.1.sites.ht', - gnomad_genomes='gs://gcp-public-data--gnomad/release/2.1.1/ht/genomes/gnomad.genomes.r2.1.1.sites.ht', - mpc='gs://seqr-reference-data/GRCh37/MPC/fordist_constraint_official_mpc_values.ht', - primate_ai='gs://seqr-reference-data/GRCh37/primate_ai/PrimateAI_scores_v0.2.ht', - splice_ai='gs://seqr-reference-data/GRCh37/spliceai/spliceai_scores.ht', - topmed='gs://seqr-reference-data/GRCh37/TopMed/bravo-dbsnp-all.removed_chr_prefix.liftunder_GRCh37.ht', - ), - ], - ) - self.assertFalse(hasattr(ht, 'rg37_locus')) - self.assertEqual( - ht.collect()[0], - hl.Struct( - locus=hl.Locus(contig=1, position=871269, reference_genome='GRCh37'), - alleles=['A', 'C'], - rsid=None, - variant_id='1-871269-A-C', - xpos=1000871269, - sorted_transcript_consequences=[ - hl.Struct( - amino_acids='S/L', - canonical=1, - codons='tCg/tTg', - gene_id='ENSG00000188976', - hgvsc='ENST00000327044.6:c.1667C>T', - hgvsp='ENSP00000317992.6:p.Ser556Leu', - transcript_id='ENST00000327044', - biotype_id=39, - consequence_term_ids=[9], - is_lof_nagnag=None, - lof_filter_ids=[0, 1], + 
locus=hl.Locus( + contig=1, + position=871269, + reference_genome='GRCh37', ), - hl.Struct( - amino_acids=None, - canonical=None, - codons=None, - gene_id='ENSG00000188976', - hgvsc='ENST00000477976.1:n.3114C>T', - hgvsp=None, - transcript_id='ENST00000477976', - biotype_id=38, - consequence_term_ids=[23, 26], - is_lof_nagnag=None, - lof_filter_ids=None, + alleles=['A', 'C'], + rsid=None, + variant_id='1-871269-A-C', + xpos=1000871269, + sorted_transcript_consequences=[ + hl.Struct( + amino_acids='S/L', + canonical=1, + codons='tCg/tTg', + gene_id='ENSG00000188976', + hgvsc='ENST00000327044.6:c.1667C>T', + hgvsp='ENSP00000317992.6:p.Ser556Leu', + transcript_id='ENST00000327044', + biotype_id=39, + consequence_term_ids=[9], + is_lof_nagnag=None, + lof_filter_ids=[0, 1], + ), + hl.Struct( + amino_acids=None, + canonical=None, + codons=None, + gene_id='ENSG00000188976', + hgvsc='ENST00000477976.1:n.3114C>T', + hgvsp=None, + transcript_id='ENST00000477976', + biotype_id=38, + consequence_term_ids=[23, 26], + is_lof_nagnag=None, + lof_filter_ids=None, + ), + hl.Struct( + amino_acids=None, + canonical=None, + codons=None, + gene_id='ENSG00000188976', + hgvsc='ENST00000483767.1:n.523C>T', + hgvsp=None, + transcript_id='ENST00000483767', + biotype_id=38, + consequence_term_ids=[23, 26], + is_lof_nagnag=None, + lof_filter_ids=None, + ), + ], + rg38_locus=hl.Locus( + contig='chr1', + position=935889, + reference_genome='GRCh38', ), - hl.Struct( - amino_acids=None, - canonical=None, - codons=None, - gene_id='ENSG00000188976', - hgvsc='ENST00000483767.1:n.523C>T', - hgvsp=None, - transcript_id='ENST00000483767', - biotype_id=38, - consequence_term_ids=[23, 26], - is_lof_nagnag=None, - lof_filter_ids=None, + clinvar=hl.Struct( + alleleId=None, + conflictingPathogenicities=None, + goldStars=None, + pathogenicity_id=None, + assertion_ids=None, + submitters=None, + conditions=None, ), - ], - rg38_locus=hl.Locus( - contig='chr1', - position=935889, - reference_genome='GRCh38', - ), 
- cadd=hl.Struct(PHRED=9.699999809265137), - clinvar=hl.Struct( - alleleId=None, - conflictingPathogenicities=None, - goldStars=None, - pathogenicity_id=None, - assertion_ids=None, - submitters=None, - conditions=None, - ), - eigen=hl.Struct(Eigen_phred=1.5880000591278076), - exac=hl.Struct( - AF_POPMAX=0.0004100881633348763, - AF=0.0004633000062312931, - AC_Adj=51, - AC_Het=51, - AC_Hom=0, - AC_Hemi=None, - AN_Adj=108288, - ), - gnomad_exomes=hl.Struct( - AF=0.00012876000255346298, - AN=240758, - AC=31, - Hom=0, - AF_POPMAX_OR_GLOBAL=0.0001119549197028391, - FAF_AF=9.315000352216884e-05, - Hemi=0, - ), - gnomad_genomes=None, - mpc=None, - primate_ai=None, - splice_ai=hl.Struct( - delta_score=0.029999999329447746, - splice_consequence_id=3, - ), - topmed=None, - dbnsfp=hl.Struct( - REVEL_score=0.0430000014603138, - SIFT_score=None, - Polyphen2_HVAR_score=None, - MutationTaster_pred_id=0, + eigen=hl.Struct(Eigen_phred=1.5880000591278076), + exac=hl.Struct( + AF_POPMAX=0.0004100881633348763, + AF=0.0004633000062312931, + AC_Adj=51, + AC_Het=51, + AC_Hom=0, + AC_Hemi=None, + AN_Adj=108288, + ), + gnomad_exomes=hl.Struct( + AF=0.00012876000255346298, + AN=240758, + AC=31, + Hom=0, + AF_POPMAX_OR_GLOBAL=0.0001119549197028391, + FAF_AF=9.315000352216884e-05, + Hemi=0, + ), + gnomad_genomes=hl.Struct( + AF=None, + AN=None, + AC=None, + Hom=None, + AF_POPMAX_OR_GLOBAL=None, + FAF_AF=None, + Hemi=None, + ), + splice_ai=hl.Struct( + delta_score=0.029999999329447746, + splice_consequence_id=3, + ), + topmed=hl.Struct(AC=None, AF=None, AN=None, Hom=None, Het=None), + dbnsfp=hl.Struct( + REVEL_score=0.0430000014603138, + SIFT_score=None, + Polyphen2_HVAR_score=None, + MutationTaster_pred_id=0, + CADD_phred=9.699999809265137, + MPC_score=None, + PrimateAI_score=None, + ), + hgmd=None, + gt_stats=hl.Struct(AC=0, AN=6, AF=0.0, hom=0), + CAID=None, ), - hgmd=None, - gt_stats=hl.Struct(AC=0, AN=6, AF=0.0, hom=0), - CAID=None, - ), - ) + ) + @responses.activate 
@patch('v03_pipeline.lib.tasks.write_new_variants_table.register_alleles_in_chunks') @patch( 'v03_pipeline.lib.tasks.write_new_variants_table.UpdateVariantAnnotationsTableWithUpdatedReferenceDataset', ) - @patch('v03_pipeline.lib.model.reference_dataset_collection.Env') + @patch('v03_pipeline.lib.reference_datasets.reference_dataset.Env') @patch('v03_pipeline.lib.vep.hl.vep') @patch( 'v03_pipeline.lib.tasks.write_new_variants_table.load_gencode_ensembl_to_refseq_id', @@ -855,177 +711,151 @@ def test_update_vat_without_accessing_private_datasets( self, mock_load_gencode_ensembl_to_refseq_id: Mock, mock_vep: Mock, - mock_rdc_env: Mock, - mock_update_vat_with_rdc_task: Mock, + mock_rd_env: Mock, + mock_update_vat_with_rd_task: Mock, mock_register_alleles: Mock, - mock_update_crdqs_task, - mock_update_rdc_task: Mock, ) -> None: mock_load_gencode_ensembl_to_refseq_id.return_value = hl.dict( {'ENST00000327044': 'NM_015658.4'}, ) - mock_update_rdc_task.return_value = MockCompleteTask() - mock_update_crdqs_task.return_value = MockCompleteTask() - mock_update_vat_with_rdc_task.return_value = ( + mock_update_vat_with_rd_task.return_value = ( BaseUpdateVariantAnnotationsTableTask( reference_genome=ReferenceGenome.GRCh38, dataset_type=DatasetType.SNV_INDEL, ) ) shutil.rmtree( - valid_reference_dataset_collection_path( + valid_reference_dataset_path( ReferenceGenome.GRCh38, - DatasetType.SNV_INDEL, - ReferenceDatasetCollection.HGMD, + ReferenceDataset.hgmd, ), ) - mock_rdc_env.ACCESS_PRIVATE_REFERENCE_DATASETS = False + mock_rd_env.ACCESS_PRIVATE_REFERENCE_DATASETS = False mock_vep.side_effect = lambda ht, **_: ht.annotate(vep=MOCK_38_VEP_DATA) mock_register_alleles.side_effect = None - worker = luigi.worker.Worker() - uvatwns_task = UpdateVariantAnnotationsTableWithNewSamplesTask( - reference_genome=ReferenceGenome.GRCh38, - dataset_type=DatasetType.SNV_INDEL, - sample_type=SampleType.WGS, - callset_path=TEST_SNV_INDEL_VCF, - project_guids=['R0113_test_project'], - 
project_remap_paths=[TEST_REMAP], - project_pedigree_paths=[TEST_PEDIGREE_3], - skip_validation=True, - run_id=TEST_RUN_ID, - ) - worker.add(uvatwns_task) - worker.run() - self.assertTrue(uvatwns_task.complete()) - ht = hl.read_table(uvatwns_task.output().path) - self.assertEqual(ht.count(), 30) - self.assertCountEqual( - ht.globals.versions.collect(), - [ - hl.Struct( - cadd='v1.6', - clinvar='2023-11-26', - dbnsfp='2.9.3', - eigen=None, - exac=None, - gnomad_exomes='4.1', - gnomad_genomes='4.1', - mpc=None, - primate_ai='v0.2', - splice_ai=None, - topmed=None, - gnomad_non_coding_constraint=None, - screen=None, - ), - ], - ) + with mock_clinvar_urls(): + worker = luigi.worker.Worker() + uvatwns_task = UpdateVariantAnnotationsTableWithNewSamplesTask( + reference_genome=ReferenceGenome.GRCh38, + dataset_type=DatasetType.SNV_INDEL, + sample_type=SampleType.WGS, + callset_path=TEST_SNV_INDEL_VCF, + project_guids=['R0113_test_project'], + project_remap_paths=[TEST_REMAP], + project_pedigree_paths=[TEST_PEDIGREE_3], + skip_validation=True, + run_id=TEST_RUN_ID, + ) + worker.add(uvatwns_task) + worker.run() + self.assertTrue(uvatwns_task.complete()) + ht = hl.read_table(uvatwns_task.output().path) + self.assertEqual(ht.count(), 30) + self.assertCountEqual( + ht.globals.versions.collect(), + [ + hl.Struct( + clinvar='2024-11-11', + dbnsfp='1.0', + eigen='1.0', + exac='1.0', + gnomad_exomes='1.0', + gnomad_genomes='1.0', + splice_ai='1.0', + topmed='1.0', + gnomad_non_coding_constraint='1.0', + screen='1.0', + ), + ], + ) + + @responses.activate @patch('v03_pipeline.lib.tasks.write_new_variants_table.register_alleles_in_chunks') @patch( 'v03_pipeline.lib.tasks.write_new_variants_table.UpdateVariantAnnotationsTableWithUpdatedReferenceDataset', ) def test_mito_update_vat( self, - mock_update_vat_with_rdc_task: Mock, + mock_update_vat_with_rd_task: Mock, mock_register_alleles: Mock, - mock_update_crdqs_task, - mock_update_rdc_task: Mock, ) -> None: - 
mock_update_rdc_task.return_value = MockCompleteTask() - mock_update_crdqs_task.return_value = MockCompleteTask() - mock_update_vat_with_rdc_task.return_value = ( + mock_update_vat_with_rd_task.return_value = ( BaseUpdateVariantAnnotationsTableTask( reference_genome=ReferenceGenome.GRCh38, dataset_type=DatasetType.MITO, ) ) mock_register_alleles.side_effect = None - worker = luigi.worker.Worker() - update_variant_annotations_task = ( - UpdateVariantAnnotationsTableWithNewSamplesTask( - reference_genome=ReferenceGenome.GRCh38, - dataset_type=DatasetType.MITO, - sample_type=SampleType.WGS, - callset_path=TEST_MITO_MT, - project_guids=['R0115_test_project2'], - project_remap_paths=['not_a_real_file'], - project_pedigree_paths=[TEST_PEDIGREE_5], - skip_validation=True, - run_id=TEST_RUN_ID, + + with mock_clinvar_urls(): + worker = luigi.worker.Worker() + update_variant_annotations_task = ( + UpdateVariantAnnotationsTableWithNewSamplesTask( + reference_genome=ReferenceGenome.GRCh38, + dataset_type=DatasetType.MITO, + sample_type=SampleType.WGS, + callset_path=TEST_MITO_MT, + project_guids=['R0115_test_project2'], + project_remap_paths=['not_a_real_file'], + project_pedigree_paths=[TEST_PEDIGREE_5], + skip_validation=True, + run_id=TEST_RUN_ID, + ) ) - ) - worker.add(update_variant_annotations_task) - worker.run() - self.assertTrue(update_variant_annotations_task.complete()) - ht = hl.read_table(update_variant_annotations_task.output().path) - self.assertEqual(ht.count(), 5) - self.assertCountEqual( - ht.globals.collect(), - [ - hl.Struct( - paths=hl.Struct( - high_constraint_region_mito='gs://seqr-reference-data/GRCh38/mitochondrial/Helix high constraint intervals Feb-15-2022.tsv', - clinvar_mito='https://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh38/clinvar.vcf.gz', - dbnsfp_mito='gs://seqr-reference-data/GRCh38/dbNSFP/v4.2/dbNSFP4.2a_variant.with_new_scores.ht', - gnomad_mito='gs://gcp-public-data--gnomad/release/3.1/ht/genomes/gnomad.genomes.v3.1.sites.chrM.ht', - 
helix_mito='gs://seqr-reference-data/GRCh38/mitochondrial/Helix/HelixMTdb_20200327.ht', - hmtvar='gs://seqr-reference-data/GRCh38/mitochondrial/HmtVar/HmtVar%20Jan.%2010%202022.ht', - mitomap='gs://seqr-reference-data/GRCh38/mitochondrial/MITOMAP/mitomap-confirmed-mutations-2022-02-04.ht', - mitimpact='gs://seqr-reference-data/GRCh38/mitochondrial/MitImpact/MitImpact_db_3.1.3.ht', - local_constraint_mito='gs://seqr-reference-data/GRCh38/mitochondrial/local_constraint.tsv', - ), - versions=hl.Struct( - high_constraint_region_mito='Feb-15-2022', - clinvar_mito='2023-07-22', - dbnsfp_mito='4.2', - gnomad_mito='v3.1', - helix_mito='20200327', - hmtvar='Jan. 10 2022', - mitomap='Feb. 04 2022', - mitimpact='3.1.3', - local_constraint_mito='2024-07-24', - ), - enums=hl.Struct( - high_constraint_region_mito=hl.Struct(), - local_constraint_mito=hl.Struct(), - clinvar_mito=hl.Struct( - assertion=CLINVAR_ASSERTIONS, - pathogenicity=CLINVAR_PATHOGENICITIES, - ), - dbnsfp_mito=hl.Struct( - MutationTaster_pred=['D', 'A', 'N', 'P'], + worker.add(update_variant_annotations_task) + worker.run() + self.assertTrue(update_variant_annotations_task.complete()) + ht = hl.read_table(update_variant_annotations_task.output().path) + self.assertEqual(ht.count(), 5) + self.assertCountEqual( + ht.globals.collect(), + [ + hl.Struct( + versions=hl.Struct( + clinvar='2024-11-11', + dbnsfp='1.0', + gnomad_mito='1.0', + helix_mito='1.0', + hmtvar='1.0', + mitomap='1.0', + mitimpact='1.0', + local_constraint_mito='1.0', ), - gnomad_mito=hl.Struct(), - helix_mito=hl.Struct(), - hmtvar=hl.Struct(), - mitomap=hl.Struct(), - mitimpact=hl.Struct(), - sorted_transcript_consequences=hl.Struct( - biotype=BIOTYPES, - consequence_term=TRANSCRIPT_CONSEQUENCE_TERMS, - lof_filter=LOF_FILTERS, + enums=hl.Struct( + local_constraint_mito=hl.Struct(), + clinvar=ReferenceDataset.clinvar.enum_globals, + dbnsfp=ReferenceDataset.dbnsfp.enum_globals, + gnomad_mito=hl.Struct(), + helix_mito=hl.Struct(), + 
hmtvar=hl.Struct(), + mitomap=hl.Struct(), + mitimpact=hl.Struct(), + sorted_transcript_consequences=hl.Struct( + biotype=BIOTYPES, + consequence_term=TRANSCRIPT_CONSEQUENCE_TERMS, + lof_filter=LOF_FILTERS, + ), + mitotip=hl.Struct(trna_prediction=MITOTIP_PATHOGENICITIES), ), - mitotip=hl.Struct(trna_prediction=MITOTIP_PATHOGENICITIES), - ), - migrations=[], - updates={ - hl.Struct( - callset='v03_pipeline/var/test/callsets/mito_1.mt', - project_guid='R0115_test_project2', - remap_pedigree_hash=hl.eval( - remap_pedigree_hash( - 'not_a_real_file', - TEST_PEDIGREE_5, + migrations=[], + updates={ + hl.Struct( + callset='v03_pipeline/var/test/callsets/mito_1.mt', + project_guid='R0115_test_project2', + remap_pedigree_hash=hl.eval( + remap_pedigree_hash( + 'not_a_real_file', + TEST_PEDIGREE_5, + ), ), ), - ), - }, - ), - ], - ) - self.assertCountEqual( - ht.collect(), - [ + }, + ), + ], + ) + self.assertCountEqual( + ht.collect()[0], hl.Struct( locus=hl.Locus( contig='chrM', @@ -1035,7 +865,6 @@ def test_mito_update_vat( alleles=['T', 'C'], common_low_heteroplasmy=False, haplogroup=hl.Struct(is_defining=False), - high_constraint_region_mito=True, mitotip=hl.Struct(trna_prediction_id=None), rg37_locus=hl.Locus( contig='MT', @@ -1046,152 +875,8 @@ def test_mito_update_vat( sorted_transcript_consequences=None, variant_id='M-3-T-C', xpos=25000000003, - clinvar_mito=None, - dbnsfp_mito=None, - gnomad_mito=None, - helix_mito=None, - hmtvar=None, - mitomap=None, - mitimpact=None, - gt_stats=hl.Struct( - AC_het=1, - AF_het=0.25, - AC_hom=0, - AF_hom=0.0, - AN=4, - ), - local_constraint_mito=None, - ), - hl.Struct( - locus=hl.Locus( - contig='chrM', - position=8, - reference_genome='GRCh38', - ), - alleles=['G', 'T'], - common_low_heteroplasmy=False, - haplogroup=hl.Struct(is_defining=False), - high_constraint_region_mito=True, - mitotip=hl.Struct(trna_prediction_id=None), - rg37_locus=hl.Locus( - contig='MT', - position=8, - reference_genome='GRCh37', - ), - rsid=None, - 
sorted_transcript_consequences=None, - variant_id='M-8-G-T', - xpos=25000000008, - clinvar_mito=None, - dbnsfp_mito=None, - gnomad_mito=None, - helix_mito=None, - hmtvar=None, - mitomap=None, - mitimpact=None, - gt_stats=hl.Struct( - AC_het=1, - AF_het=0.25, - AC_hom=0, - AF_hom=0.0, - AN=4, - ), - local_constraint_mito=None, - ), - hl.Struct( - locus=hl.Locus( - contig='chrM', - position=12, - reference_genome='GRCh38', - ), - alleles=['T', 'C'], - common_low_heteroplasmy=False, - haplogroup=hl.Struct(is_defining=False), - high_constraint_region_mito=False, - mitotip=hl.Struct(trna_prediction_id=None), - rg37_locus=hl.Locus( - contig='MT', - position=12, - reference_genome='GRCh37', - ), - rsid=None, - sorted_transcript_consequences=None, - variant_id='M-12-T-C', - xpos=25000000012, - clinvar_mito=None, - dbnsfp_mito=None, - gnomad_mito=None, - helix_mito=None, - hmtvar=None, - mitomap=None, - mitimpact=None, - gt_stats=hl.Struct( - AC_het=1, - AF_het=0.25, - AC_hom=0, - AF_hom=0.0, - AN=4, - ), - local_constraint_mito=None, - ), - hl.Struct( - locus=hl.Locus( - contig='chrM', - position=16, - reference_genome='GRCh38', - ), - alleles=['A', 'T'], - common_low_heteroplasmy=False, - haplogroup=hl.Struct(is_defining=True), - high_constraint_region_mito=False, - mitotip=hl.Struct(trna_prediction_id=None), - rg37_locus=hl.Locus( - contig='MT', - position=16, - reference_genome='GRCh37', - ), - rsid='rs1556422363', - sorted_transcript_consequences=None, - variant_id='M-16-A-T', - xpos=25000000016, - clinvar_mito=None, - dbnsfp_mito=None, - gnomad_mito=None, - helix_mito=None, - hmtvar=None, - mitomap=None, - mitimpact=None, - gt_stats=hl.Struct( - AC_het=1, - AF_het=0.25, - AC_hom=0, - AF_hom=0.0, - AN=4, - ), - local_constraint_mito=None, - ), - hl.Struct( - locus=hl.Locus( - contig='chrM', - position=18, - reference_genome='GRCh38', - ), - alleles=['C', 'T'], - common_low_heteroplasmy=False, - haplogroup=hl.Struct(is_defining=False), - 
high_constraint_region_mito=False, - mitotip=hl.Struct(trna_prediction_id=None), - rg37_locus=hl.Locus( - contig='MT', - position=18, - reference_genome='GRCh37', - ), - rsid=None, - sorted_transcript_consequences=None, - variant_id='M-18-C-T', - xpos=25000000018, - clinvar_mito=None, - dbnsfp_mito=None, + clinvar=None, + dbnsfp=None, gnomad_mito=None, helix_mito=None, hmtvar=None, @@ -1206,8 +891,7 @@ def test_mito_update_vat( ), local_constraint_mito=None, ), - ], - ) + ) @patch( 'v03_pipeline.lib.tasks.write_new_variants_table.load_gencode_gene_symbol_to_gene_id', @@ -1215,11 +899,7 @@ def test_mito_update_vat( def test_sv_update_vat( self, mock_load_gencode: Mock, - mock_update_crdqs_task, - mock_update_rdc_task: Mock, ) -> None: - mock_update_rdc_task.return_value = MockCompleteTask() - mock_update_crdqs_task.return_value = MockCompleteTask() mock_load_gencode.return_value = GENE_ID_MAPPING worker = luigi.worker.Worker() update_variant_annotations_task = ( @@ -1249,7 +929,6 @@ def test_sv_update_vat( ht.globals.collect(), [ hl.Struct( - paths=hl.Struct(), versions=hl.Struct(), enums=hl.Struct( sv_type=SV_TYPES, @@ -1797,11 +1476,7 @@ def test_sv_update_vat( def test_gcnv_update_vat( self, - mock_update_crdqs_task, - mock_update_rdc_task, ) -> None: - mock_update_rdc_task.return_value = MockCompleteTask() - mock_update_crdqs_task.return_value = MockCompleteTask() worker = luigi.worker.Worker() update_variant_annotations_task = ( UpdateVariantAnnotationsTableWithNewSamplesTask( @@ -1830,7 +1505,6 @@ def test_gcnv_update_vat( ht.globals.collect(), [ hl.Struct( - paths=hl.Struct(), versions=hl.Struct(), enums=hl.Struct( sv_type=SV_TYPES, diff --git a/v03_pipeline/lib/tasks/validate_callset.py b/v03_pipeline/lib/tasks/validate_callset.py index 3b2077446..e5601875b 100644 --- a/v03_pipeline/lib/tasks/validate_callset.py +++ b/v03_pipeline/lib/tasks/validate_callset.py @@ -4,23 +4,24 @@ from v03_pipeline.lib.misc.validation import ( SeqrValidationError, - 
get_validation_dependencies, validate_allele_type, validate_expected_contig_frequency, validate_imputed_sex_ploidy, validate_no_duplicate_variants, validate_sample_type, ) -from v03_pipeline.lib.model import CachedReferenceDatasetQuery from v03_pipeline.lib.model.environment import Env from v03_pipeline.lib.paths import ( imported_callset_path, + sex_check_table_path, + valid_reference_dataset_path, ) +from v03_pipeline.lib.reference_datasets.reference_dataset import ReferenceDataset from v03_pipeline.lib.tasks.base.base_loading_run_params import BaseLoadingRunParams from v03_pipeline.lib.tasks.base.base_update import BaseUpdateTask from v03_pipeline.lib.tasks.files import CallsetTask, GCSorLocalTarget -from v03_pipeline.lib.tasks.reference_data.updated_cached_reference_dataset_query import ( - UpdatedCachedReferenceDatasetQuery, +from v03_pipeline.lib.tasks.reference_data.updated_reference_dataset import ( + UpdatedReferenceDatasetTask, ) from v03_pipeline.lib.tasks.write_imported_callset import WriteImportedCallsetTask from v03_pipeline.lib.tasks.write_sex_check_table import WriteSexCheckTableTask @@ -31,6 +32,28 @@ @luigi.util.inherits(BaseLoadingRunParams) class ValidateCallsetTask(BaseUpdateTask): + def get_validation_dependencies(self) -> dict[str, hl.Table]: + deps = {} + deps['coding_and_noncoding_variants_ht'] = hl.read_table( + valid_reference_dataset_path( + self.reference_genome, + ReferenceDataset.gnomad_coding_and_noncoding, + ), + ) + if ( + Env.CHECK_SEX_AND_RELATEDNESS + and self.dataset_type.check_sex_and_relatedness + and not self.skip_check_sex_and_relatedness + ): + deps['sex_check_ht'] = hl.read_table( + sex_check_table_path( + self.reference_genome, + self.dataset_type, + self.callset_path, + ), + ) + return deps + def complete(self) -> luigi.Target: if super().complete(): mt = hl.read_matrix_table(self.output().path) @@ -57,8 +80,8 @@ def requires(self) -> list[luigi.Task]: *requirements, ( self.clone( - UpdatedCachedReferenceDatasetQuery, - 
crdq=CachedReferenceDatasetQuery.GNOMAD_CODING_AND_NONCODING_VARIANTS, + UpdatedReferenceDatasetTask, + reference_dataset=ReferenceDataset.gnomad_coding_and_noncoding, ) ), ] @@ -98,9 +121,7 @@ def update_table(self, mt: hl.MatrixTable) -> hl.MatrixTable: callset_path=self.callset_path, validated_sample_type=self.sample_type.value, ) - validation_dependencies = get_validation_dependencies( - **self.param_kwargs, - ) + validation_dependencies = self.get_validation_dependencies() for validation_f in [ validate_allele_type, validate_imputed_sex_ploidy, diff --git a/v03_pipeline/lib/tasks/validate_callset_test.py b/v03_pipeline/lib/tasks/validate_callset_test.py index 991412824..8f3638376 100644 --- a/v03_pipeline/lib/tasks/validate_callset_test.py +++ b/v03_pipeline/lib/tasks/validate_callset_test.py @@ -1,29 +1,27 @@ import json import shutil -from unittest.mock import Mock, patch import luigi.worker from v03_pipeline.lib.model import ( - CachedReferenceDatasetQuery, DatasetType, ReferenceGenome, SampleType, ) from v03_pipeline.lib.paths import ( - cached_reference_dataset_query_path, + valid_reference_dataset_path, ) +from v03_pipeline.lib.reference_datasets.reference_dataset import ReferenceDataset from v03_pipeline.lib.tasks.validate_callset import ( ValidateCallsetTask, ) from v03_pipeline.lib.tasks.write_validation_errors_for_run import ( WriteValidationErrorsForRunTask, ) -from v03_pipeline.lib.test.mock_complete_task import MockCompleteTask from v03_pipeline.lib.test.mocked_dataroot_testcase import MockedDatarootTestCase -TEST_CODING_NONCODING_CRDQ_1 = ( - 'v03_pipeline/var/test/reference_data/test_gnomad_coding_noncoding_crdq_1.ht' +TEST_CODING_AND_NONCODING_HT = ( + 'v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht' ) MULTIPLE_VALIDATION_EXCEPTIONS_VCF = ( 'v03_pipeline/var/test/callsets/multiple_validation_exceptions.vcf' @@ -36,22 +34,16 @@ class ValidateCallsetTest(MockedDatarootTestCase): def setUp(self) -> None: 
super().setUp() shutil.copytree( - TEST_CODING_NONCODING_CRDQ_1, - cached_reference_dataset_query_path( + TEST_CODING_AND_NONCODING_HT, + valid_reference_dataset_path( ReferenceGenome.GRCh38, - DatasetType.SNV_INDEL, - CachedReferenceDatasetQuery.GNOMAD_CODING_AND_NONCODING_VARIANTS, + ReferenceDataset.gnomad_coding_and_noncoding, ), ) - @patch( - 'v03_pipeline.lib.tasks.validate_callset.UpdatedCachedReferenceDatasetQuery', - ) def test_validate_callset_multiple_exceptions( self, - mock_updated_cached_reference_dataset_query: Mock, ) -> None: - mock_updated_cached_reference_dataset_query.return_value = MockCompleteTask() worker = luigi.worker.Worker() validate_callset_task = ValidateCallsetTask( reference_genome=ReferenceGenome.GRCh38, diff --git a/v03_pipeline/lib/tasks/write_new_variants_table.py b/v03_pipeline/lib/tasks/write_new_variants_table.py index a312084b4..ff31c4ee8 100644 --- a/v03_pipeline/lib/tasks/write_new_variants_table.py +++ b/v03_pipeline/lib/tasks/write_new_variants_table.py @@ -5,25 +5,23 @@ import luigi.util from v03_pipeline.lib.annotations.fields import get_fields -from v03_pipeline.lib.annotations.rdc_dependencies import ( - get_rdc_annotation_dependencies, -) from v03_pipeline.lib.misc.allele_registry import register_alleles_in_chunks from v03_pipeline.lib.misc.callsets import get_callset_ht from v03_pipeline.lib.misc.io import remap_pedigree_hash from v03_pipeline.lib.misc.math import constrain from v03_pipeline.lib.model import ( Env, - ReferenceDatasetCollection, ) from v03_pipeline.lib.paths import ( new_variants_table_path, + valid_reference_dataset_path, variant_annotations_table_path, ) -from v03_pipeline.lib.reference_data.gencode.mapping_gene_ids import ( +from v03_pipeline.lib.reference_datasets.gencode.mapping_gene_ids import ( load_gencode_ensembl_to_refseq_id, load_gencode_gene_symbol_to_gene_id, ) +from v03_pipeline.lib.reference_datasets.reference_dataset import BaseReferenceDataset from 
v03_pipeline.lib.tasks.base.base_loading_run_params import ( BaseLoadingRunParams, ) @@ -49,7 +47,17 @@ class WriteNewVariantsTableTask(BaseWriteTask): @property def annotation_dependencies(self) -> dict[str, hl.Table]: - deps = get_rdc_annotation_dependencies(self.dataset_type, self.reference_genome) + deps = {} + for ( + reference_dataset + ) in BaseReferenceDataset.for_reference_genome_dataset_type_annotations( + self.reference_genome, + self.dataset_type, + ): + deps[f'{reference_dataset.value}_ht'] = hl.read_table( + valid_reference_dataset_path(self.reference_genome, reference_dataset), + ) + if self.dataset_type.has_gencode_ensembl_to_refseq_id_mapping( self.reference_genome, ): @@ -163,15 +171,26 @@ def create_table(self) -> hl.Table: ), ) - # Join new variants against the reference dataset collections that are not "annotated". - for rdc in ReferenceDatasetCollection.for_reference_genome_dataset_type( + # Join new variants against the reference datasets that are not "annotated". + for ( + reference_dataset + ) in BaseReferenceDataset.for_reference_genome_dataset_type_annotations( self.reference_genome, self.dataset_type, ): - if rdc.requires_annotation: + if reference_dataset.is_keyed_by_interval: continue - rdc_ht = self.annotation_dependencies[f'{rdc.value}_ht'] - new_variants_ht = new_variants_ht.join(rdc_ht, 'left') + reference_dataset_ht = self.annotation_dependencies[ + f'{reference_dataset.value}_ht' + ] + reference_dataset_ht = reference_dataset_ht.select( + **{ + f'{reference_dataset.name}': hl.Struct( + **reference_dataset_ht.row_value, + ), + }, + ) + new_variants_ht = new_variants_ht.join(reference_dataset_ht, 'left') # Register the new variant alleles to the Clingen Allele Registry # and annotate new_variants table with CAID. 
@@ -197,6 +216,7 @@ def create_table(self) -> hl.Table: new_variants_ht = new_variants_ht.join(ar_ht, 'left') elif self.dataset_type.should_send_to_allele_registry: new_variants_ht = new_variants_ht.annotate(CAID=hl.missing(hl.tstr)) + return new_variants_ht.select_globals( updates={ hl.Struct( diff --git a/v03_pipeline/lib/tasks/write_relatedness_check_table.py b/v03_pipeline/lib/tasks/write_relatedness_check_table.py index edfe0d716..a7056d7b2 100644 --- a/v03_pipeline/lib/tasks/write_relatedness_check_table.py +++ b/v03_pipeline/lib/tasks/write_relatedness_check_table.py @@ -3,15 +3,15 @@ import luigi.util from v03_pipeline.lib.methods.relatedness import call_relatedness -from v03_pipeline.lib.model import CachedReferenceDatasetQuery from v03_pipeline.lib.paths import ( relatedness_check_table_path, ) +from v03_pipeline.lib.reference_datasets.reference_dataset import ReferenceDataset from v03_pipeline.lib.tasks.base.base_loading_run_params import BaseLoadingRunParams from v03_pipeline.lib.tasks.base.base_write import BaseWriteTask from v03_pipeline.lib.tasks.files import GCSorLocalTarget -from v03_pipeline.lib.tasks.reference_data.updated_cached_reference_dataset_query import ( - UpdatedCachedReferenceDatasetQuery, +from v03_pipeline.lib.tasks.reference_data.updated_reference_dataset import ( + UpdatedReferenceDatasetTask, ) from v03_pipeline.lib.tasks.validate_callset import ValidateCallsetTask @@ -31,8 +31,8 @@ def requires(self): return [ self.clone(ValidateCallsetTask), self.clone( - UpdatedCachedReferenceDatasetQuery, - crdq=CachedReferenceDatasetQuery.GNOMAD_QC, + UpdatedReferenceDatasetTask, + reference_dataset=ReferenceDataset.gnomad_qc, ), ] diff --git a/v03_pipeline/lib/tasks/write_relatedness_check_table_test.py b/v03_pipeline/lib/tasks/write_relatedness_check_table_test.py index c96ba9ecb..135710545 100644 --- a/v03_pipeline/lib/tasks/write_relatedness_check_table_test.py +++ b/v03_pipeline/lib/tasks/write_relatedness_check_table_test.py @@ -1,60 
+1,40 @@ import shutil -from unittest import mock +from unittest.mock import patch import hail as hl import luigi.worker from v03_pipeline.lib.misc.io import import_vcf from v03_pipeline.lib.model import ( - CachedReferenceDatasetQuery, DatasetType, ReferenceGenome, SampleType, ) from v03_pipeline.lib.paths import ( - cached_reference_dataset_query_path, imported_callset_path, relatedness_check_table_path, + valid_reference_dataset_path, ) +from v03_pipeline.lib.reference_datasets.reference_dataset import ReferenceDataset from v03_pipeline.lib.tasks.write_relatedness_check_table import ( WriteRelatednessCheckTableTask, ) from v03_pipeline.lib.test.mocked_dataroot_testcase import MockedDatarootTestCase -TEST_GNOMAD_QC_HT = 'v03_pipeline/var/test/reference_data/gnomad_qc_crdq.ht' +TEST_GNOMAD_QC_HT = 'v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_qc/1.0.ht' TEST_VCF = 'v03_pipeline/var/test/callsets/1kg_30variants.vcf' - TEST_RUN_ID = 'manual__2024-04-03' -MOCK_CONFIG = { - 'gnomad_qc': { - '38': { - 'version': '4.0', - 'source_path': TEST_GNOMAD_QC_HT, - 'custom_import': lambda *_: hl.Table.parallelize( - [], - hl.tstruct( - locus=hl.tlocus('GRCh38'), - alleles=hl.tarray(hl.tstr), - ), - key=['locus', 'alleles'], - ), - }, - }, -} - class WriteRelatednessCheckTableTaskTest(MockedDatarootTestCase): def setUp(self) -> None: super().setUp() - self.gnomad_qc_path = cached_reference_dataset_query_path( - ReferenceGenome.GRCh38, - DatasetType.SNV_INDEL, - CachedReferenceDatasetQuery.GNOMAD_QC, - ) shutil.copytree( TEST_GNOMAD_QC_HT, - self.gnomad_qc_path, + valid_reference_dataset_path( + ReferenceGenome.GRCh38, + ReferenceDataset.gnomad_qc, + ), ) # Force imported callset to be complete @@ -69,48 +49,67 @@ def setUp(self) -> None: ), ) - @mock.patch.dict( - 'v03_pipeline.lib.reference_data.compare_globals.CONFIG', - MOCK_CONFIG, - ) - @mock.patch.dict( - 'v03_pipeline.lib.tasks.reference_data.updated_cached_reference_dataset_query.CONFIG', - MOCK_CONFIG, - ) def 
test_relatedness_check_table_task_gnomad_qc_updated( self, ) -> None: - ht = hl.read_table( - self.gnomad_qc_path, - ) - self.assertEqual( - hl.eval(ht.versions.gnomad_qc), - 'v3.1', - ) - worker = luigi.worker.Worker() - task = WriteRelatednessCheckTableTask( - reference_genome=ReferenceGenome.GRCh38, - dataset_type=DatasetType.SNV_INDEL, - run_id=TEST_RUN_ID, - sample_type=SampleType.WGS, - callset_path=TEST_VCF, - ) - worker.add(task) - worker.run() - self.assertTrue(task.complete()) - ht = hl.read_table(self.gnomad_qc_path) self.assertEqual( - hl.eval(ht.versions.gnomad_qc), - '4.0', - ) - ht = hl.read_table( - relatedness_check_table_path( - ReferenceGenome.GRCh38, - DatasetType.SNV_INDEL, - TEST_VCF, + hl.eval( + hl.read_table( + valid_reference_dataset_path( + ReferenceGenome.GRCh38, + ReferenceDataset.gnomad_qc, + ), + ).version, ), + '1.0', ) - self.assertEqual( - ht.collect(), - [], - ) + with patch.object( + ReferenceDataset, + 'version', + return_value='2.0', + ), patch.object( + ReferenceDataset, + 'get_ht', + lambda *_: hl.Table.parallelize( + [], + hl.tstruct( + locus=hl.tlocus('GRCh38'), + alleles=hl.tarray(hl.tstr), + ), + key=['locus', 'alleles'], + globals=hl.Struct(version='2.0'), + ), + ): + worker = luigi.worker.Worker() + task = WriteRelatednessCheckTableTask( + reference_genome=ReferenceGenome.GRCh38, + dataset_type=DatasetType.SNV_INDEL, + run_id=TEST_RUN_ID, + sample_type=SampleType.WGS, + callset_path=TEST_VCF, + ) + worker.add(task) + worker.run() + self.assertTrue(task.complete()) + self.assertEqual( + hl.eval( + hl.read_table( + valid_reference_dataset_path( + ReferenceGenome.GRCh38, + ReferenceDataset.gnomad_qc, + ), + ).version, + ), + '2.0', + ) + ht = hl.read_table( + relatedness_check_table_path( + ReferenceGenome.GRCh38, + DatasetType.SNV_INDEL, + TEST_VCF, + ), + ) + self.assertEqual( + ht.collect(), + [], + ) diff --git a/v03_pipeline/lib/tasks/write_variant_annotations_vcf_test.py 
b/v03_pipeline/lib/tasks/write_variant_annotations_vcf_test.py index 4a6b4baec..0ca2ee0ef 100644 --- a/v03_pipeline/lib/tasks/write_variant_annotations_vcf_test.py +++ b/v03_pipeline/lib/tasks/write_variant_annotations_vcf_test.py @@ -41,15 +41,15 @@ class WriteVariantAnnotationsVCFTest(MockedDatarootTestCase): 'v03_pipeline.lib.tasks.write_new_variants_table.load_gencode_gene_symbol_to_gene_id', ) @patch( - 'v03_pipeline.lib.tasks.base.base_update_variant_annotations_table.UpdateCachedReferenceDatasetQueries', + 'v03_pipeline.lib.tasks.base.base_update_variant_annotations_table.UpdatedReferenceDatasetQueryTask', ) def test_sv_export_vcf( self, - mock_update_crdqs_task: Mock, + mock_rd_query_task: Mock, mock_load_gencode: Mock, ) -> None: mock_load_gencode.return_value = GENE_ID_MAPPING - mock_update_crdqs_task.return_value = MockCompleteTask() + mock_rd_query_task.return_value = MockCompleteTask() worker = luigi.worker.Worker() update_variant_annotations_task = ( UpdateVariantAnnotationsTableWithNewSamplesTask( diff --git a/v03_pipeline/lib/test/mock_clinvar_urls.py b/v03_pipeline/lib/test/mock_clinvar_urls.py new file mode 100644 index 000000000..767fe6cf0 --- /dev/null +++ b/v03_pipeline/lib/test/mock_clinvar_urls.py @@ -0,0 +1,37 @@ +import gzip +import tempfile +from contextlib import contextmanager + +import pysam +import responses + +from v03_pipeline.lib.model.definitions import ReferenceGenome +from v03_pipeline.lib.reference_datasets.clinvar import CLINVAR_SUBMISSION_SUMMARY_URL +from v03_pipeline.lib.reference_datasets.reference_dataset import ( + ReferenceDataset, +) + +CLINVAR_VCF = 'v03_pipeline/var/test/reference_datasets/raw/clinvar.vcf' +CLINVAR_SUBMISSION_SUMMARY = ( + 'v03_pipeline/var/test/reference_datasets/raw/submission_summary.txt' +) + + +@contextmanager +def mock_clinvar_urls(reference_genome=ReferenceGenome.GRCh38): + with tempfile.NamedTemporaryFile( + suffix='.vcf.bgz', + ) as f1, open(CLINVAR_SUBMISSION_SUMMARY, 'rb') as f2: + 
responses.add_passthru('http://localhost') + # pysam is being used as it was the cleanest way to + # get a bgzip formatted file :/ + pysam.tabix_compress(CLINVAR_VCF, f1.name, force=True) + responses.get( + ReferenceDataset.clinvar.path(reference_genome), + body=f1.read(), + ) + responses.get( + CLINVAR_SUBMISSION_SUMMARY_URL, + body=gzip.compress(f2.read()), + ) + yield diff --git a/v03_pipeline/lib/test/mocked_reference_datasets_testcase.py b/v03_pipeline/lib/test/mocked_reference_datasets_testcase.py new file mode 100644 index 000000000..569996158 --- /dev/null +++ b/v03_pipeline/lib/test/mocked_reference_datasets_testcase.py @@ -0,0 +1,40 @@ +import os +import shutil + +import responses + +from v03_pipeline.lib.model.definitions import ReferenceGenome +from v03_pipeline.lib.paths import valid_reference_dataset_path +from v03_pipeline.lib.reference_datasets.reference_dataset import ReferenceDataset +from v03_pipeline.lib.test.mock_clinvar_urls import mock_clinvar_urls +from v03_pipeline.lib.test.mocked_dataroot_testcase import MockedDatarootTestCase + +REFERENCE_DATASETS_PATH = 'v03_pipeline/var/test/reference_datasets' + + +class MockedReferenceDatasetsTestCase(MockedDatarootTestCase): + @responses.activate + def setUp(self) -> None: + super().setUp() + for reference_genome in ReferenceGenome: + with mock_clinvar_urls(reference_genome): + path = os.path.join( + REFERENCE_DATASETS_PATH, + reference_genome.value, + ) + # Use listdir, allowing for missing datasets + # in the tests. + for dataset_name in os.listdir( + path, + ): + # Copy the entire directory tree under + # the dataset name. 
+ shutil.copytree( + os.path.join(path, dataset_name), + os.path.dirname( + valid_reference_dataset_path( + reference_genome, + ReferenceDataset(dataset_name), + ), + ), + )
diff --git a/v03_pipeline/var/test/reference_data/gnomad_qc_crdq.ht/.README.txt.crc b/v03_pipeline/var/test/reference_data/gnomad_qc_crdq.ht/.README.txt.crc deleted file mode 100644 index 0cd2ba4fdd7ad23eb31e149bd7c3f22a5db1f3c4..0000000000000000000000000000000000000000 GIT binary patch
diff --git a/v03_pipeline/var/test/reference_data/gnomad_qc_crdq.ht/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_data/gnomad_qc_crdq.ht/.metadata.json.gz.crc deleted file mode 100644 index ae7d0b4c652360fe14cddcbfee89e1678c1d2aed..0000000000000000000000000000000000000000 GIT binary patch
diff --git a/v03_pipeline/var/test/reference_data/gnomad_qc_crdq.ht/rows/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_data/gnomad_qc_crdq.ht/rows/.metadata.json.gz.crc deleted file mode 100644 index a5f8539c8c2d63b57d571254e52403f1395bb405..0000000000000000000000000000000000000000 GIT binary patch
diff --git a/v03_pipeline/var/test/reference_data/gnomad_qc_crdq.ht/rows/metadata.json.gz b/v03_pipeline/var/test/reference_data/gnomad_qc_crdq.ht/rows/metadata.json.gz deleted file mode 100644 index 483a5fe1475e940de35c24cf1be9f7f7e99d5723..0000000000000000000000000000000000000000 GIT binary patch
diff --git a/v03_pipeline/var/test/reference_data/gnomad_qc_crdq.ht/rows/parts/.part-0-fc4518f0-e0cb-4157-b60d-b6ab4c5f4a75.crc b/v03_pipeline/var/test/reference_data/gnomad_qc_crdq.ht/rows/parts/.part-0-fc4518f0-e0cb-4157-b60d-b6ab4c5f4a75.crc deleted file mode 100644 index 3003ab7a8cfe9e9c224518cb3b175add0b8ef7f9..0000000000000000000000000000000000000000 GIT binary patch
diff --git a/v03_pipeline/var/test/reference_data/gnomad_qc_crdq.ht/rows/parts/part-0-fc4518f0-e0cb-4157-b60d-b6ab4c5f4a75 b/v03_pipeline/var/test/reference_data/gnomad_qc_crdq.ht/rows/parts/part-0-fc4518f0-e0cb-4157-b60d-b6ab4c5f4a75 deleted file mode 100644 index b28df25c328107bed57a5177b6494328d7e522e2..0000000000000000000000000000000000000000 GIT binary patch
diff --git a/v03_pipeline/var/test/reference_data/test_clinvar_path_variants_crdq.ht/globals/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_data/test_clinvar_path_variants_crdq.ht/globals/.metadata.json.gz.crc deleted file mode 100644 index a3c8757f4420fd33d0e589cab5d6c5d06bc2d201..0000000000000000000000000000000000000000 GIT binary patch
diff --git a/v03_pipeline/var/test/reference_data/test_clinvar_path_variants_crdq.ht/globals/metadata.json.gz b/v03_pipeline/var/test/reference_data/test_clinvar_path_variants_crdq.ht/globals/metadata.json.gz deleted file mode 100644 index b6d39681d2c94a980b14e3fa6b55ff772cae7e82..0000000000000000000000000000000000000000 GIT binary patch
diff --git a/v03_pipeline/var/test/reference_data/test_clinvar_path_variants_crdq.ht/rows/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_data/test_clinvar_path_variants_crdq.ht/rows/.metadata.json.gz.crc deleted file mode 100644 index 682fea6e74bb55e9b2648b1e3a7d2bfe53a4718a..0000000000000000000000000000000000000000 GIT binary patch
diff --git a/v03_pipeline/var/test/reference_data/test_clinvar_path_variants_crdq.ht/rows/metadata.json.gz b/v03_pipeline/var/test/reference_data/test_clinvar_path_variants_crdq.ht/rows/metadata.json.gz deleted file mode 100644 index d37774da987b28605255ea68e1ed47f81f458942..0000000000000000000000000000000000000000 GIT binary patch
diff --git a/v03_pipeline/var/test/reference_data/test_clinvar_path_variants_crdq.ht/rows/parts/part-0-9e75273d-7113-40e4-a327-453f3451dc8c b/v03_pipeline/var/test/reference_data/test_clinvar_path_variants_crdq.ht/rows/parts/part-0-9e75273d-7113-40e4-a327-453f3451dc8c deleted file mode 100644 index 4bc78c56bb23317e319decf2e2b1616d435d9caa..0000000000000000000000000000000000000000 GIT binary patch
diff --git a/v03_pipeline/var/test/reference_data/test_combined_1.ht.ht/globals/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_data/test_combined_1.ht.ht/globals/.metadata.json.gz.crc deleted file mode 100644 index 92c2ee4f33acec3fa52c2ba069dcb372b9ce82f9..0000000000000000000000000000000000000000 GIT binary patch
diff --git a/v03_pipeline/var/test/reference_data/test_combined_1.ht.ht/globals/parts/.part-0.crc b/v03_pipeline/var/test/reference_data/test_combined_1.ht.ht/globals/parts/.part-0.crc deleted file mode 100644 index 66c4951841b92bd36f385f7c37d788ca9c45f69f..0000000000000000000000000000000000000000 GIT binary patch
diff --git a/v03_pipeline/var/test/reference_data/test_combined_1.ht.ht/globals/parts/part-0 b/v03_pipeline/var/test/reference_data/test_combined_1.ht.ht/globals/parts/part-0 deleted file mode 100644 index 31232639d8682902191606d0e4982ce0b412c2e0..0000000000000000000000000000000000000000 GIT binary patch
zrp;#3;AC(L0w!Lvx>S!o@?HZqQPGLaGzGIpssu>KrO`=t=2*P?X?^XoeAqrE#K;@QKb-Mo-cm( zUUyzMHiZmSk7#sH+c_|#?$5vr{x5mjq^v4XwN>62g|;fR5g;@PAw5=GXbpOo7yXOg zVDRlKy1v}S@wRu63U76IbxJ{7mPR@6Dc@omER;A# z>TF2TDiMxjBNMjwBTCpS7FLb7Ozu*v0oenqly@9idCz!)V+8L!0oejI#YK~}Vx@v4 zSOLPsA~K~UaskED1Z;AdU-srd%pBM;>7;Gstj?`e%j^oRlV1aNGA!^i$3kglSzwJk z4O9Y=?kfB~ngI%iLg3T7$b6GIl@uYIM%z*1IuQ~Pb!u{bEF}n~zgRnF{dG);*->HR zwZu|^K&jBjH3?bf|AJGnraU@IA5#hxRe^N-Ow}aq!5r+1EYV>1HV%C3K1=?g^caAX3whOcipr@%Nm@3x$O&t)Gyp z4r<6AlUp{vJZ=-Bbi+CYauySn?@6oejZqsL1*Hem?WzOr^bIE{dJrB)t5ISNl)OG2#nT$4VTZTE=hDo80facyqsWsdHrFW-mLiZ{_lCro|*XM6K+ z&vbTj7LMvxch^%!TtMy6+0_`R!l>cpclWm%p7!ciC4E$HU1s$q+iN->ug25X@r%dF zvIT{lp3(!bK6+RU=Plz0t6mTLYP?upbqS$yYN9*6ElGlAPg7latcSa4*Y|~f1tXrM zoB>nP3(j8g0OlgYiFp|3tGkhPS)>r%Z=8OdP>ea|J*nLIswbrdLd)zNI2XPfu`y*bDRr1apa6e*T{zr)GZuT%h$S Hj0gY#wW3}2 diff --git a/v03_pipeline/var/test/reference_data/test_combined_1.ht.ht/rows/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_data/test_combined_1.ht.ht/rows/.metadata.json.gz.crc deleted file mode 100644 index edeb9708277545d1bc428b01c4304a1c8b86c5ca..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 20 bcmYc;N@ieSU}E@Hx|{dO6$aN%(Exq`K`{nw diff --git a/v03_pipeline/var/test/reference_data/test_combined_1.ht.ht/rows/metadata.json.gz b/v03_pipeline/var/test/reference_data/test_combined_1.ht.ht/rows/metadata.json.gz deleted file mode 100644 index 8ab2a9563aa7a4c899e9271bff4aab7ca00c19a0..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 1064 zcmV+@1lRi?iwFP!000000Ns~gZ`(Eyz`u*0HcW0EJ8>Pq*{c*vnG27bmYBhV0qInRJd@~KW z_(~F6tFU;@MuiI(WAFn`B^SFTk^5)|f>VqFomM`p?XjVA^tfT>Af%3attCm) zQsKFcBuybvNux}=EY>z|df0|sN=VFL$^t}!<<)AgWiD`ZH8KOADTx>F6F}FxMDR$c zr8b^>%e&>x_@h>G$8+O)>W$gwGIqSX0%_hwp|uureS2$c%-o%bLc%R$*Et*;Wsu3RMSWF58OKkL`h5 z%^qfpQ$Tt+-SzeSz8>W=3Xbs}M9;f=7_aK9HwQ^nyWf)(2f~Z7kyR3>k#iqr>-+PT zJxirm6Z<%eVd&lbzED8J$*(D7nWEU&6U8ikE!6)nA6CvY^yJ 
zQXNJwvPgSqg^=H%X=x06NRbM|p+_l-ZM!TIX|A|jU69Xh|9I+76LXZ}Ce48CMIx1>tXf*Pz($r5< z1EU2MEui=&Ctkor6B6x=cnJ{}A+`tKhop8KTFfSAz2T%nrzRb%BKSotgS!5?mbsYdKt(viv-lK%j7h@;7hZ=ITls z0P*uv-#sxoaPH!X(>mi-y0|j>fycz>gL8}z%K^FBTATy`%89l9kHhTVY=LzaK4e_j!>g9)Aq zncDy4q<9#VDzW0j9xsX>Mz3_AoMAX+4?CmbyR+WlGOIrRF@P~@OdWjW4hZ(I*x`Z( i2L|gRFw7jZY^Gc&Nv3ikQYOk%mi-MbD`&m95C8yQc>>7* diff --git a/v03_pipeline/var/test/reference_data/test_combined_1.ht.ht/rows/parts/.part-0-3569201c-d630-43c4-9056-cbace806fe8d.crc b/v03_pipeline/var/test/reference_data/test_combined_1.ht.ht/rows/parts/.part-0-3569201c-d630-43c4-9056-cbace806fe8d.crc deleted file mode 100644 index dd555f5530818b965a42aefee5e49be3d67965d2..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 12 TcmYc;N@ieSU}8AWR@njo5<~+H diff --git a/v03_pipeline/var/test/reference_data/test_combined_1.ht.ht/rows/parts/part-0-3569201c-d630-43c4-9056-cbace806fe8d b/v03_pipeline/var/test/reference_data/test_combined_1.ht.ht/rows/parts/part-0-3569201c-d630-43c4-9056-cbace806fe8d deleted file mode 100644 index 446fb54911ffb22915fe6eeaa755c8b05942e6ed..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 106 zcmWG#U|{e7VvVi(e-%6&nHU&%uq0;`89tq6%Eai%=nRx{sArhweA-??it)Oo@bgcW z#>NZ{=hzqyFtb~JyTK>V5PtHNMSJ973kHV%Ltkx}<&P>D*fKEi0(F7_*kA@m1_l5U CmK@0d diff --git a/v03_pipeline/var/test/reference_data/test_combined_1.ht/.README.txt.crc b/v03_pipeline/var/test/reference_data/test_combined_1.ht/.README.txt.crc deleted file mode 100644 index 2796480e9024ef6664a5fc3ea3e7c508fb9fd7d5..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 12 TcmYc;N@ieSU}9KhQ~wPB63_#d diff --git a/v03_pipeline/var/test/reference_data/test_combined_1.ht/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_data/test_combined_1.ht/.metadata.json.gz.crc deleted file mode 100644 index 5def68f7f61e58f681686d4086173f9fea428d51..0000000000000000000000000000000000000000 GIT binary 
patch literal 0 HcmV?d00001 literal 16 XcmYc;N@ieSU}DhrId0s>z1|Z5A7KPF diff --git a/v03_pipeline/var/test/reference_data/test_combined_1.ht/README.txt b/v03_pipeline/var/test/reference_data/test_combined_1.ht/README.txt deleted file mode 100644 index 9b284affa..000000000 --- a/v03_pipeline/var/test/reference_data/test_combined_1.ht/README.txt +++ /dev/null @@ -1,3 +0,0 @@ -This folder comprises a Hail (www.hail.is) native Table or MatrixTable. - Written with version 0.2.133-4c60fddb171a - Created at 2024/11/02 15:22:20 \ No newline at end of file diff --git a/v03_pipeline/var/test/reference_data/test_combined_1.ht/globals/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_data/test_combined_1.ht/globals/.metadata.json.gz.crc deleted file mode 100644 index 92c2ee4f33acec3fa52c2ba069dcb372b9ce82f9..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 16 XcmYc;N@ieSU}8`;O8I}%$)56wXXzxnxfqkvtI71=&POY*yWcVaaM zhB(PjL&9yVfPWbY7X=78CR-QOz{cu^*^{F7!Q>IdDChaQ5R&S4#Y!QMRAWRB@SQY! zHS!=1Ag9|iW#i=_>x%L%Jl7SN6m?kvGW8w6EH&tGC~%`Bv7m=WF}=1;bG=R z1iu@Q3>^PiYuerok8P`Exv^7#xPUY)1? 
zGM2xjzA?`=4YW!#JN!8jifVf;N-*v#NbvIRJ;ATx;|-MuDC$OXq)x8eR0J}H(&Hw-a diff --git a/v03_pipeline/var/test/reference_data/test_combined_1.ht/globals/parts/.part-0.crc b/v03_pipeline/var/test/reference_data/test_combined_1.ht/globals/parts/.part-0.crc deleted file mode 100644 index 66c4951841b92bd36f385f7c37d788ca9c45f69f..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 16 XcmYc;N@ieSU}E@^v-XF~EXQyFDE$SF diff --git a/v03_pipeline/var/test/reference_data/test_combined_1.ht/globals/parts/part-0 b/v03_pipeline/var/test/reference_data/test_combined_1.ht/globals/parts/part-0 deleted file mode 100644 index 31232639d8682902191606d0e4982ce0b412c2e0..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 774 zcmV+h1Nr>%0ssI+1^@skwJ-f(LGP`VM&1X$LPEc*?j9v#v$0u`qdr{F=Hs_nH_7FAE z>(x-B7R90~EiW0rg*+|iu#C(3RtnvjbU=-t+kYaimggn&9nE(+Xo=i>3^_*XzWLSn zrqI1JRcUM<5DH3f1$xOgF2;OCG-lh&wqYA>{q5X3O^|13WK9;VJ};TI7xR5~wyDh1 zi5Y2BZDDuCj?w(SE^HPDndeBIH|2kjBfGx|AM%V%Js1lRtwKSf?c6z^b8)VYTIoDP z?|r0gSWafH3!k(I$h{Sn|3C$jBq>Ovm!+Vegr*_|?sPW4ilWlD7QzmKK){A=00?k! 
zrp;#3;AC(L0w!Lvx>S!o@?HZqQPGLaGzGIpssu>KrO`=t=2*P?X?^XoeAqrE#K;@QKb-Mo-cm( zUUyzMHiZmSk7#sH+c_|#?$5vr{x5mjq^v4XwN>62g|;fR5g;@PAw5=GXbpOo7yXOg zVDRlKy1v}S@wRu63U76IbxJ{7mPR@6Dc@omER;A# z>TF2TDiMxjBNMjwBTCpS7FLb7Ozu*v0oenqly@9idCz!)V+8L!0oejI#YK~}Vx@v4 zSOLPsA~K~UaskED1Z;AdU-srd%pBM;>7;Gstj?`e%j^oRlV1aNGA!^i$3kglSzwJk z4O9Y=?kfB~ngI%iLg3T7$b6GIl@uYIM%z*1IuQ~Pb!u{bEF}n~zgRnF{dG);*->HR zwZu|^K&jBjH3?bf|AJGnraU@IA5#hxRe^N-Ow}aq!5r+1EYV>1HV%C3K1=?g^caAX3whOcipr@%Nm@3x$O&t)Gyp z4r<6AlUp{vJZ=-Bbi+CYauySn?@6oejZqsL1*Hem?WzOr^bIE{dJrB)t5ISNl)OG2#nT$4VTZTE=hDo80facyqsWsdHrFW-mLiZ{_lCro|*XM6K+ z&vbTj7LMvxch^%!TtMy6+0_`R!l>cpclWm%p7!ciC4E$HU1s$q+iN->ug25X@r%dF zvIT{lp3(!bK6+RU=Plz0t6mTLYP?upbqS$yYN9*6ElGlAPg7latcSa4*Y|~f1tXrM zoB>nP3(j8g0OlgYiFp|3tGkhPS)>r%Z=8OdP>ea|J*nLIswbrdLd)zNI2XPfu`y*bDRr1apa6e*T{zr)GZuT%h$S Hj0gY#wW3}2 diff --git a/v03_pipeline/var/test/reference_data/test_combined_1.ht/rows/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_data/test_combined_1.ht/rows/.metadata.json.gz.crc deleted file mode 100644 index e7c96acca327dcfcffd1c72debd78506f20a762f..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 20 bcmYc;N@ieSU}9+doja8|P~Gz01+_~6Jwyi> diff --git a/v03_pipeline/var/test/reference_data/test_combined_1.ht/rows/metadata.json.gz b/v03_pipeline/var/test/reference_data/test_combined_1.ht/rows/metadata.json.gz deleted file mode 100644 index d2c7ccb1c58d4fa54ca801b2151bd026682c12bc..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 1062 zcmV+>1lju^iwFP!000000Ns~uZ`(Eyfd7j=ZJ6BEFUb;rvt7F}<|Kw41jPzNpv9BT zR-!bDF5(6H?>kBsMM_H36nz2e4~BF+-ka_oQI8HM5jaC<&M7XJ0)YBh@+dtDU&SbD4BEF?`7`EHws2VkmLJOb>sSkQrp_{YBha>qFES|e6t@i z{)q@%tFSny!@`A&F$5v=CF8p#k^6WDocS09I<0(G+Y>|Q=rP00flD3t#@)psT}vX; zQsJqMBuzo6q*10_7Hbcq96uj2-W;K$>@PWUWO^-`*G-Gk3@1h)@q6;z$+R3c&Do zQ&}soOc!OW-tEeJySN>X7S-Mvu9~)tV6=Oz$j|W~C3e-XmlNZmB!Wdmf=8I8l=`s% zLoH3PZyQ<^LPM(kiMcP9vgiHoC3G@Vj+W1qrY0{^bV1*uQq2x{ra%|!>U}LbRh8l}B7HDkFSeR+0 
zOf}#}KkMi*dE&I{u!8)KD2COFp7aOKGcw zYxW;1oS&Wb&zsx+NhQsx^Gp~4Z$U%mP>S69*y zNRXfUzUU}uz7YZ@Qi%`cp@8@Kz{L}%b;hf7ab@@&kCB-h2SEx_M?yG{iw3fNWZ9ofeQ!ScbqyceIISFsDZ@#Xfa(@DNdkf_q>jf+XulBH;jaW#& zd(#ze1Rzpqdr&XzLtO!@_nMj4^R8$8@=k3_<%vXo#pcDLedRGRuR&EE+e+pr;|URR z#?-erevYe~&g)K>c6;Z&ey=m=4z`^^5ReWbgMQ}{=vnXL@+{Z}z5G-C!;0`7jPXoJ z)&55(#lxUfi4`CAcv18)diD3Bi{X?#>p#T5? diff --git a/v03_pipeline/var/test/reference_data/test_combined_1.ht/rows/parts/.part-0-1d126232-414b-4ffa-aa43-9ed52895fbf2.crc b/v03_pipeline/var/test/reference_data/test_combined_1.ht/rows/parts/.part-0-1d126232-414b-4ffa-aa43-9ed52895fbf2.crc deleted file mode 100644 index dd555f5530818b965a42aefee5e49be3d67965d2..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 12 TcmYc;N@ieSU}8AWR@njo5<~+H diff --git a/v03_pipeline/var/test/reference_data/test_combined_1.ht/rows/parts/part-0-1d126232-414b-4ffa-aa43-9ed52895fbf2 b/v03_pipeline/var/test/reference_data/test_combined_1.ht/rows/parts/part-0-1d126232-414b-4ffa-aa43-9ed52895fbf2 deleted file mode 100644 index 446fb54911ffb22915fe6eeaa755c8b05942e6ed..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 106 zcmWG#U|{e7VvVi(e-%6&nHU&%uq0;`89tq6%Eai%=nRx{sArhweA-??it)Oo@bgcW z#>NZ{=hzqyFtb~JyTK>V5PtHNMSJ973kHV%Ltkx}<&P>D*fKEi0(F7_*kA@m1_l5U CmK@0d diff --git a/v03_pipeline/var/test/reference_data/test_combined_2.ht/.README.txt.crc b/v03_pipeline/var/test/reference_data/test_combined_2.ht/.README.txt.crc deleted file mode 100644 index 974f9ff55dcde3eea8f1f84ad7ace99fed2968a3..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 12 TcmYc;N@ieSU}EsLPq_#H5ds4G diff --git a/v03_pipeline/var/test/reference_data/test_combined_2.ht/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_data/test_combined_2.ht/.metadata.json.gz.crc deleted file mode 100644 index 522a8fe04d7a2e06ec143a4ae99cb73fbb965ff1..0000000000000000000000000000000000000000 GIT binary patch literal 
0 HcmV?d00001 literal 12 TcmYc;N@ieSU}BK%I1~#25p4qh diff --git a/v03_pipeline/var/test/reference_data/test_combined_2.ht/README.txt b/v03_pipeline/var/test/reference_data/test_combined_2.ht/README.txt deleted file mode 100644 index 743f636a9..000000000 --- a/v03_pipeline/var/test/reference_data/test_combined_2.ht/README.txt +++ /dev/null @@ -1,3 +0,0 @@ -This folder comprises a Hail (www.hail.is) native Table or MatrixTable. - Written with version 0.2.122-be9d88a80695 - Created at 2024/01/08 10:16:02 \ No newline at end of file diff --git a/v03_pipeline/var/test/reference_data/test_combined_2.ht/globals/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_data/test_combined_2.ht/globals/.metadata.json.gz.crc deleted file mode 100644 index 5098753b6cdacaa31f7a8665159e10cb402b9054..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 12 TcmYc;N@ieSU}D%`zAPC46OaRX diff --git a/v03_pipeline/var/test/reference_data/test_combined_2.ht/globals/metadata.json.gz b/v03_pipeline/var/test/reference_data/test_combined_2.ht/globals/metadata.json.gz deleted file mode 100644 index b6841a0e8574bb40e8825e575d8c76fcc6ec60ea..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 299 zcmV+`0o48t75BnYecpvcIQyDhFwOOg(1 z$$$6WZLaM$Me~$==bX!Fnih;G5Mp?;3N5PX%Y9a%96%{`?(Um@?$@S7)qcO6RI)N%B&z zX~@7dXPD_vGEBzfo5^YH)p@mz{xe67Q546?z>7Yj64pQH934Fq;KSt=@N@Z?3bVwd xtc_x+1?Nemu@pk1b{lPEUMu=hTZt|T4=iO5lfcu2_ttSHd;v)sMjNdH005;MltBOh diff --git a/v03_pipeline/var/test/reference_data/test_combined_2.ht/globals/parts/.part-0.crc b/v03_pipeline/var/test/reference_data/test_combined_2.ht/globals/parts/.part-0.crc deleted file mode 100644 index fd11dbde5ec6bb88e51da5b6dbb1c4019c824e25..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 12 TcmYc;N@ieSU}9hiOppNp4_^Wd diff --git a/v03_pipeline/var/test/reference_data/test_combined_2.ht/globals/parts/part-0 
b/v03_pipeline/var/test/reference_data/test_combined_2.ht/globals/parts/part-0 deleted file mode 100644 index 152082d8c8a9cee38cbae151a0e85d000adfd3f3..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 147 zcmb*za+1|HI1YFvq3g6ECYFC-`Sw>^wT%8yHO?C*hw%Bd6*>wDO w$F@g2UCo#s3#hn&7f^adzCyj2`r0HsfVk8b-fuL<$X1 zm!{!=7g*^sjm8Tp@B6(6eW61;E}-}XZMc$%%hgq z*0f!pf;RZ)QHCiE_eI#7YN{=_zQ2O0roTfU@R~-8IQbry`1NEHav{V#snw4@`12FP1sbV91838p=)LXoj=z^ieP zW2Q3X|Gw)40trkS)%sE%5IMdd=lCAwvLlMFfW&01I6MQNeE&3E18WnHBu}s<@gtEL z+vNC6r348%nc)%3I4~DT;@zRbx9uAEusXmfa1fq+7!7C+Q=>P`?22-64Pu!xV&O2M zsNQ<(EEh&(3X!ogVcUB9-U!@jO}CGS(3!W^Vnq!kRK%wxbCzkq@Ozn2SHU_yzxI4; zOdj0#yU&y1GVHyt8kKD@P^d+%Nv(dZ`h1h7pc|L+Fd78y-r#uT5w;YPpM(sM#}l@x z10>OO#II;3>t$)0gQ=KMr7}bJ*(W`$on122 zQOb#s%1H+S*SfD_b;hZ|=8YK8fL2s~LEpU0P(?ot>av9j$0n(ekTiyUZp0FKb<>q? zG~g?RUnX^>533ufdGB0#Q+HGK^-jGc7>y^1BS_?l{SFWd&k zrR*O3mfvoLVbl)-pLN2n-{CO#`zh`CUDiz@3er}m)hibE&k0ov7-Sp8UH*>b1}ivW z<$A{-rH8RATj~Jm96U_oUN>k*eS7>1fCTO^Vn}f01Qh=-b%bJa!_nFWF;or}HgYW- VVx1OhjCP%@{R7$0D73x^001Ux7~B8= diff --git a/v03_pipeline/var/test/reference_data/test_combined_2.ht/rows/parts/.part-0-20336911-c437-4deb-9fa4-7c7fe61f0408.crc b/v03_pipeline/var/test/reference_data/test_combined_2.ht/rows/parts/.part-0-20336911-c437-4deb-9fa4-7c7fe61f0408.crc deleted file mode 100644 index e372e0ff1deb84712bcc8f5f2d16854c476cf6dd..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 12 TcmYc;N@ieSU}8Aqw_Xnb6D9*P diff --git a/v03_pipeline/var/test/reference_data/test_combined_2.ht/rows/parts/.part-0-7d0599cd-6874-47f8-b6de-a7db0b41817c.crc b/v03_pipeline/var/test/reference_data/test_combined_2.ht/rows/parts/.part-0-7d0599cd-6874-47f8-b6de-a7db0b41817c.crc deleted file mode 100644 index 9fe01ae3cdbc0f6f4085de369af1c48ec2ef52a5..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 12 TcmYc;N@ieSU}DfW?QsGC5H$jA diff --git 
a/v03_pipeline/var/test/reference_data/test_combined_2.ht/rows/parts/part-0-20336911-c437-4deb-9fa4-7c7fe61f0408 b/v03_pipeline/var/test/reference_data/test_combined_2.ht/rows/parts/part-0-20336911-c437-4deb-9fa4-7c7fe61f0408 deleted file mode 100644 index 3a360f426cd8d1964da1834ca7584f941f8e6ffc..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 54 zcmY#nU|>*jFpTg1q_mX=OlMumxK`LdS}`dpwCo zf*_m~hmmmG8IT{33A_aaVdmAcZN(=5BSdWI7X8J_*SnNydIo-bH&VyYW zOG-?L$i#`G17sv-p|)I8Nkp>Jp5Bc-H&!=s z3DW{1_CDToIO{Pwk0Tp^0D1s{03lGzhpV88w$<@h)975WTCHh6tGS}_jXpv|K3uQG z3-VX{hFoFxI*p@oGDiY28I>y0i`8Vz+LJMJ6=uydy3z@A8MWHQ-AuC^mtr^UrE?$0 z{x6Tenx}2D=5wp&S2cW|wLg&v6mlA!D^wEV;KPdA*jra_rRvJMb04=rnQ|-k>J?wL z88vNfb3zJhFI5!HU@f_7Nicen@>gTHktgPyrpcafrO}@Vh~yZ)eQ4r}h5l;3rwOwc zG1HrnBTGr&H^chgG`@GPDwQroA|dIuPX4N``I>0Uy1(3}Z5qd|!<}0xD)KCyFba!S z$zKiItoc5B+te0qac}9Mt~-2m^x)t&c3bQh&G75SMrm+)mh^d3ei&J@JDl+&PuVnd zq43}+5+dBrofAG4=lZI>oM#EXe>e>6xy)7J5eE^u*P_=$5&;H-Nx+;|c7l!&nu-YA z>1>A9M5T{yjG%)60D%AxhuUnY4oAZw0Fd~r)vfyUl4on-7>(wWp`kD|9nWUdJe!ZF z^U-ilHYP}!Q3jL|WiS93uPTjeS#ZieJrk|+xy4Sa+23CODwU6RMWr91EZB@c?$e49 z)V-btR9kUID`rrCd$GqtC&GGD%(p)4RI3A*=Zv4d*PqvwP2)nz zDM}hqMIc%veMx?bwKH$MUQTR|>s_d|m_oaQ&B$_<7 zJO(HS?bPA`l~T+YL0l-NdUniV+5tDr044STa4Nlgjk53`efA-ratm_zH)S$NW|fU3q>O? 
cW|xt>4*s$9b11|V#n W?JW%@wHAyLh5{x3>B-&(2 zljL?L_`gs7kS*IS>}3yo3A*Rrb9HpA8=bs8p%A}-RD?4xxVW56zxdt}5!!*?`PuYr zGMjz+j4#gjNwS+HhT@{BXjwv% z6lsVNv4|3sU>DF_q+rDre&fx5s2o_9j%1hXqSMxdjk-D->FdFc!WwTh)qWrm5zAc=S)GZvN?N#DN2*E_>eXfs`0;$ax8}{tJPTArv)2OU^ zL`s8tgF4I88PhN!4WE zA+>HA$Zk_*w!X@CDbdEf(E&M2h^}U4@Cka?ZczrYIF`=04!AclT*lEubQgZNX+KWnQY5Ol8bV3q1U1X8&bb2 kAlL;517%S2Sx*Q6091opZ~y=R diff --git a/v03_pipeline/var/test/reference_data/test_combined_37.ht/rows/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_data/test_combined_37.ht/rows/.metadata.json.gz.crc deleted file mode 100644 index 5806303366e79ecd614b787f2fbc10c918006280..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 20 bcmYc;N@ieSU}C6zXL>^YQFs0q*?IZ^LcIro diff --git a/v03_pipeline/var/test/reference_data/test_combined_37.ht/rows/metadata.json.gz b/v03_pipeline/var/test/reference_data/test_combined_37.ht/rows/metadata.json.gz deleted file mode 100644 index c22d07b9f585fe64ceff1b498eac9a7b714748b1..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 1027 zcmV+e1pNCSiwFP!000000Nqz@Z<|OE{x5vGQ*qBWP15kqPH|(&Nh~{xPOU7oW(Rv) zSU7fF#TW74Zx#&f0%@B{?Md|qi=CZ!=6N(cS)9aRxz?Ohcm%q7c(+a=7+bdGe}<#w zezc+_I0!tGD1r!twZW?(p%5rV%YCXm+#ge*SM>#IffC`(D{o2!tYvlx;%+)wG`2Im)qQEn)Zwus|4;llHM@SZywYAunz5Kj@Ch@NM37{lsfiTl zwxdNMlEgQ4D^(mB0skC;<8wWMxh<141`|q6z5V^ry3kBOZqqFnVQKIl8Oq0>f}G@jv%KFTP=bP1zXV8a&1MMV`o zEMbyFhaWV0Xt9Gq4hA=v)L<|}FEKQj)JCvIWHXA-{&!Ll_@g7XB=}wX{&^5_Ae@&pPf19U9%5D(!%tGcGwdjaoaLiKP-Z&C974#ypk0= zR|b(H5tP4v{a2x`q!Ey?*sQ*5-O{NyWSNlI%Eyu=&)?X%)d@WGEH-%K{;X9kx05gw zAnP;;=Slg_&6s(<>ErRH4y_7jN|?oMGT`MDEU#>fDfhW2Ol!Xd27HwLd;cDv|y$uA$%zEYkka@!xMJKKdg*~Pp6qOF)%tZIXc!ea9vlnVP-z6U|`GO zyx2WN`2eFcgR!wRP-dF*X?q1J#_N{C&p%lj8#6SVV`DhL%x?MZ2A@1b_{mom?U9Eq z7#R8wef_}b&M@!2;D-=~o99HC9Gw|~x>=YVomm{6S>1sa*g`E}5Mgj?ZFs;Y%D3Q~ zfX<$Z1cSv4C%D-57p@U1I=Jt#A?ty(1uQJ=PHZ6#1(lUnjJ!Y(fB?t?3JgG>000xF BKBfQw diff --git 
a/v03_pipeline/var/test/reference_data/test_combined_mito_1.ht/.README.txt.crc b/v03_pipeline/var/test/reference_data/test_combined_mito_1.ht/.README.txt.crc deleted file mode 100644 index 50813df17212c2627d337264bb97fc170a78575c..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 12 TcmYc;N@ieSU}C7>+pq}$6J-Ob diff --git a/v03_pipeline/var/test/reference_data/test_combined_mito_1.ht/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_data/test_combined_mito_1.ht/.metadata.json.gz.crc deleted file mode 100644 index 2c25d63fb387cba0a11a787ce80db2f21c4fe4f1..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 16 XcmYc;N@ieSU}AWp7%jXmyU`l}B02<< diff --git a/v03_pipeline/var/test/reference_data/test_combined_mito_1.ht/globals/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_data/test_combined_mito_1.ht/globals/.metadata.json.gz.crc deleted file mode 100644 index 45f3552fc338fb0380cd33f0dd8291397a915fd6..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 12 TcmYc;N@ieSU}6wg%q#=|4}tEXz)T-nN{~_xKp{;Uzu@0|kmy{ICiws{5aJ%L?VCNW|&|8<9Re^OGzU zN{@^5a1=Z=6{;^Mg1gW|%zRj6;BZN{mUO6qb}k)3TT>RGR)wb06WV}nk{#yycZ?OZ z2RqcBEub0znQOxxkxZ?P+f)&lmlAS5#5l5`4Mj9px~M?X32Ku)TBlj?e{PSaF5X^! zAlm4F5py~tM1x}@cvvJPqN@h#drd&>@&hEhN}XrB60p4*92<9L-@G+JtE zVqpT#q4m?JBhNY-Y~8j#*84uzqd?C%%8)kO(8(jxIoMZ6=lXJVT91Wh_0asf!^qzC z4hW+AkMG%qbbiT6Fyly0!Z;Jd*#N!~y)Gu*dU5b(BR3KRfr|#6^(6?}y54NiMt`&4 z7TLw+=WL)7A$K}gK}*KNU-)ZAO=}M>^&f7yxxT)-naUm=saNScb-Xk}5i48&;Ug-r z^i}8R_|+kN@$r-JuaIB&V78c*wNair2uDovSQepC-GeqVuND9PUMV&QPuRluc%=2y SfXF(pY5xKOjn(PF1^@sb;N^t? 
diff --git a/v03_pipeline/var/test/reference_data/test_combined_mito_1.ht/globals/parts/.part-0.crc b/v03_pipeline/var/test/reference_data/test_combined_mito_1.ht/globals/parts/.part-0.crc deleted file mode 100644 index ebb29bf6d5622d53707cf0217854e890e2d646c1..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 16 XcmYc;N@ieSU}7+xwkyg*zGDgiA#DVm diff --git a/v03_pipeline/var/test/reference_data/test_combined_mito_1.ht/globals/parts/part-0 b/v03_pipeline/var/test/reference_data/test_combined_mito_1.ht/globals/parts/part-0 deleted file mode 100644 index 20df267218f337deccc6fcd5043625104ff6e5ed..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 708 zcmV;#0z3V$0ssJm1pojjwJ-f(g9Hr~0OqH8C7|Bu0dy95)g-4AeoJ?*7G&Z$sNs2E z1Qx&gO~G`axP{N;*aC0>YyfotEhHfL(@|8#)S0LX=dom>Dr|iDOy#G`pghw=O%$xG z7SK#n9(gpjwY*MQzq6;|nI0=5kEOHai>+j1_>%l|AUmbI{C!==c{bIg()%Qj!A?E$E>E+b}V*%n(C`de}OlgELw=J}I-lkRj3-Uf1q@fJcCNDy^Ggjxv z6oV*=6rz;P$%bR|v55jAfzQW%04Pezm?RH1Q3juB!c`o?nV32goPvO#F447Ym5gDL zULJy3wHLe>Ofl~Zc|NrDg4Hc-eHpI<-ufkfwVHwD`BlJTtG8-3j(v94piHF+e7$Oy zEyP0BLZJEH`ujRxcr7zn9;z4o>d>KOy*D1*mwa6u1X08p=xA`Poht;d{We6Xp+I7k zA;us91w$e-s_8I7q!eBU62M|npkp9z1jYzXg)=cTY0Gt60g0j579sW1$=@F);dL#AR=H#+B93&q$kS4_8=_AM=;{I(B|SRh1KSqsg#B0+=p3p6 q>?^Xa%u_FzW^IS}&O_O^1Y8R?Qw0qG00000001bpFa00@0RRB3RZ6G; diff --git a/v03_pipeline/var/test/reference_data/test_combined_mito_1.ht/index/part-0-f96f626e-c873-4613-a02b-88ee1e3f2923.idx/.index.crc b/v03_pipeline/var/test/reference_data/test_combined_mito_1.ht/index/part-0-f96f626e-c873-4613-a02b-88ee1e3f2923.idx/.index.crc deleted file mode 100644 index e1e6a76c1ada56e4fd03e209a3de9abe334b1dab..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 12 TcmYc;N@ieSU}D%1@i7Dd6Mh4T diff --git a/v03_pipeline/var/test/reference_data/test_combined_mito_1.ht/index/part-0-f96f626e-c873-4613-a02b-88ee1e3f2923.idx/.metadata.json.gz.crc 
b/v03_pipeline/var/test/reference_data/test_combined_mito_1.ht/index/part-0-f96f626e-c873-4613-a02b-88ee1e3f2923.idx/.metadata.json.gz.crc deleted file mode 100644 index 5925629957f8c6802c3238b48f037dfb27c39fa3..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 12 TcmYc;N@ieSU}8A1Fm?t26SV`8 diff --git a/v03_pipeline/var/test/reference_data/test_combined_mito_1.ht/index/part-0-f96f626e-c873-4613-a02b-88ee1e3f2923.idx/index b/v03_pipeline/var/test/reference_data/test_combined_mito_1.ht/index/part-0-f96f626e-c873-4613-a02b-88ee1e3f2923.idx/index deleted file mode 100644 index 3ba37303e7dcafed0effe3a70320e66164754f56..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 130 zcma!MU|^UA#2Q=m|0+x?Wnyq)W?@?~UVV02`3W^lI(DPaP#Ll_;mFgm|; zU&qAk{)z=C;LPZ_fZ6?4$P^YXh6e{4O0{VzGSlDk@WT?sTLNZ{Dcx<^tFqm`WZ_f*-VNv9}-5n$#x3j16Rka12I7`i-_P#OH8@$ zB+Z(p878_0N>oqczt;&oPj88|P}!4NV~x-%zL44g?M4;U(`vyVxyIT|X^WKx8D$dU za3Liwc9pz+d39=U0C56N3Drfn0AZy-+MfMJH|OHil_mNOMYuDpTp`RDFi?9W#fzu7 zO!(dYzWe&LOeVAUp4XEwB)MX)<8aOOWQ>fABcntc{^P}A=w+{~lSZ^H&$+JFAX1&@*9U2-iP;=lR=Ps5}0`=t3V&wwgYw@$jQ~Vkwti?@@NH1zk(8wIP zG5@^B^yAxmyy@KT*vis8phTpZ;J~#C!gEl)oo>l-YTwkSUtgAa(nHYRg6b1uDX@aR)Zm+3e2{N3n&ZyQOGS;Io7MFZAFT~v{V#di)UXw38O<9on2_3PDrgU@P-K!D@M@gM zHdPtQf8TWiNu0Q9sn(bF0oC{WT%T*ldxw(^IG!`*Av{5N_v`yCfuQ5Kj{g8V$Gdk} z9Bd`731h%OItx?T>O5u<^k zeD#y4b-V&j1I$teHQbYm&CaKlFk7yQx)$y#X5mZ`A&&;nIT9&BXiz9Cab&uME4NET zpg>fMIYNwin02G6OWpNGA$+5mGKd9u;|YsN);isq4M{vZI^DE5WKS@G%#blX z*ZTL9YEa49*9q2?g%y(_CVIf#HH?;8IHP~jpm%xM?KRzgOBH6WC;A}(Ae*{p zs{arK)wf~|!}KW2E=d7Q&koAJeEFK0YbmlT%C5p!-H+8nAfiGlWY1H6mf)`7H7s#q zxusC!z1Q8Ex^WcA<7tRx5YFSA=4Q?`UDyt)5=RAvajNL*Pd(L&sv0%bs`()7!{dY# zc8jEW8=JCx0=S(&8D0H|GVp7g-DF`dd0)omIRxNg651;|k%R?&gLqBYbqIYF=dI|f7xg+lXa)Um-PT2~-EEQc z&a&0-1GJ$Vb*?(y?B(+3j_?%>@zW*L{(DE_H;dB1@^=pYlQS5z25WJHDIm50e?_zz9_5|xZbUnO zC=PXuF>gw??OYl@KB`ofRA3Oi?bD4vEE0$Y@Dl*xktP# diff --git 
a/v03_pipeline/var/test/reference_data/test_gnomad_coding_noncoding_crdq_1.ht/globals/metadata.json.gz b/v03_pipeline/var/test/reference_data/test_gnomad_coding_noncoding_crdq_1.ht/globals/metadata.json.gz deleted file mode 100644 index c4fac28d709de768e0a957b73aec9035f72623ff..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 294 zcmV+>0onc^iwFP!000000F{zoPlGTN#lOpa6%(D=Vt7RH$;4zBA2ep^+B;SUg|uY} zl74sZWxB!q&r8cWzf(?2J20X|h~dsEw5a9}*IA8n0ud}W*h2jDOfy+2l&nMCeFW~? z8r9X_APQL`=Oj&p!_3=8`mPp!?c$M?o2nFQkt2=Rd)f%whG>h{>B{^SN7IzQvIj4O zz%94$El2&nj+3fQKIru`w(IjPGw85vRx7ke2YQ<(=ab9iU_~I;x=;foDJr?4Z;hZO ziM9D=;)FFznq={UzV4-FgYewV-#B* sI8P=mOCdCBH_=8GjiO)ETXZ^RM<;+UOlZ>huPwq3M_mwdq7mc zHPxsu73chl0;?J|Xxk9hlke@~r{-1Idj0;mO30=L5sN#um3Ze>JpSJoVQ;FrvC{f> zGJ)-IhauoKjaG5;Gc57j37zahlEE6h11>U7;^4ZZ3fWkxg;m`%bWjlHyM$13^2c8u Ot?mo5v{V#sn)TGT9A>?gzY(lE038p=)s*tg-i`Udf zj+x3({`+i)k0j8oY#i;e{3aNVmJ2t&Bw|0BF^6KxQ>~cJo6P}pgIn3 zM#C`5hKC~$kp(Ykqhx3l9}HQmXl}tN|7i-+UN26M+rE{Wy4ExMj7`9g-OkE>j0tWx z+3_&mVQ#SsiqBXt41Rfe<;pE;Zh^T)zG%D6O){=ZJ>&GLUR|1aCOUT8@07yp+t$|2 zN&e5ts65G733{)Gg^(I=YZ^>db$2Qthk+a8*26Z&d_GNk{Uh~`?kfqD1|@Lu$>iYg z0^;-(h|`g937j!rz^t>@<6zaue+p4erhrIAPfMSftG&-wmX7^(-F3bFE_tVherDv? 
z7)EnFe|*?Gw}*RZ*H_5WEV0rb%tlK4hO>YCz|T@384dzJjQh+_LY(?I7-W71ES&=m z2?c7HU3KX9(L^$Yi>+}WcD004jkAJqT= diff --git a/v03_pipeline/var/test/reference_data/test_hgmd_1.ht/.README.txt.crc b/v03_pipeline/var/test/reference_data/test_hgmd_1.ht/.README.txt.crc deleted file mode 100644 index 10a80e0784c8833aa9f1b0ffc1dfe623cbd4af69..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 12 TcmYc;N@ieSU}C76Ab1=A5&;7p diff --git a/v03_pipeline/var/test/reference_data/test_hgmd_1.ht/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_data/test_hgmd_1.ht/.metadata.json.gz.crc deleted file mode 100644 index 07a31a4483de2b05e10654420173909660e46af8..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 12 TcmYc;N@ieSU}Bi}qSqM!6b%EB diff --git a/v03_pipeline/var/test/reference_data/test_hgmd_1.ht/README.txt b/v03_pipeline/var/test/reference_data/test_hgmd_1.ht/README.txt deleted file mode 100644 index 17101d717..000000000 --- a/v03_pipeline/var/test/reference_data/test_hgmd_1.ht/README.txt +++ /dev/null @@ -1,3 +0,0 @@ -This folder comprises a Hail (www.hail.is) native Table or MatrixTable. 
- Written with version 0.2.120-f00f916faf78 - Created at 2024/03/28 16:27:47 \ No newline at end of file diff --git a/v03_pipeline/var/test/reference_data/test_hgmd_1.ht/globals/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_data/test_hgmd_1.ht/globals/.metadata.json.gz.crc deleted file mode 100644 index 96183ebc10b3c45b5a23a57bde3b2cd8b9b0448c..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 12 TcmYc;N@ieSU}7+TcqbeH6AlB0 diff --git a/v03_pipeline/var/test/reference_data/test_hgmd_1.ht/globals/metadata.json.gz b/v03_pipeline/var/test/reference_data/test_hgmd_1.ht/globals/metadata.json.gz deleted file mode 100644 index 63a2ed3a619d4b4042a52386257480415ab1a68e..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 312 zcmV-80muFyiwFP!000000F{zmPlGTNhW|@%#YAVe816*z%EV+CFI*PWwP#QVg|uY} z68^jA4CHHziCi7tkCTVenrWT^^VzGBU_h?lpP~Z9f_d!o02T8$Tbx8LL6FY;`nT{_ z6(Apb15-c}I#*%9Ekw?g!qo+LiS;K^5@(VWrMamZ?+C`KT>l*cU1paOV?m_cT0eX+ zhF6=exosLxatFgIB0YAb6UOmcoRyA8Y?sWIY32+dEk_&fF}f~6?8^Nf%7d0ssJSO_4bO diff --git a/v03_pipeline/var/test/reference_data/test_hgmd_1.ht/globals/parts/.part-0.crc b/v03_pipeline/var/test/reference_data/test_hgmd_1.ht/globals/parts/.part-0.crc deleted file mode 100644 index 001836877abb4bac2c47fbf1dcc7625e65d6a3d7..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 12 TcmYc;N@ieSU}E?aD0u<^6I}yo diff --git a/v03_pipeline/var/test/reference_data/test_hgmd_1.ht/globals/parts/part-0 b/v03_pipeline/var/test/reference_data/test_hgmd_1.ht/globals/parts/part-0 deleted file mode 100644 index 596900d4910a8fe47a993c4f2578e546cae5c825..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 162 zcmeBWU|^UC#2Q=m|0+ymWntKKTt?2}w9S_8zzTYiwFP!000000F6@5OT#b}{$FxZh@F}|Su>q_@Dg6$_v5{MWIz=cP<((kTuDUci&?(JsHR+eHBmOnC(}H;sj{rPoz5#- zEtUu)4HamR3{0Ux6fL#2Cx3N4#nQdEw|xs9c@3B0Ak>};vaz~jb_gPqF7=@)L(S#E 
z;c@)mAjq!yg@l?3Y7Dv4n)cfhdFLE1sfDxl9eT1AiggB4AwYoHVXcnIAtK~CoL3L) zRXtl?C*Gwm{yvv7|KPCiI0nZ>3SN>&X=k}}&Zogt(<%wl8bk(k3L=)0HX)$(+x;q0 zHZ8E2@j2MUcfPBR|JOyb7qwJdZvBb{Q{A}3Xz-GtRYZQ8CH{803qD%4;Q4UjaBhy8 kOIFE^<&s&^KS2*AB0e?nq2Q#Cev{V#snw4{UZ(Ijt$u?eY~CYbiH3L#_PfY;z4 z+e~H1|Gn1{k0taE_!)QP?OpV$yv(MYbHHc->h^51X zq-y7FGA)eABqC#JI)U{L+z8w$MRyO!)S0(hu6ey1Wel_~$`1`$O0}K~UN*FDyx)iM zYJ=Uo-*;ao!&N7IU-A~Lm?-4(zXCn*D* z=@FaOk`@k}ia(gp@AW$U^Rl1X@VxZOI#d93-K@0kK@q|^q@iJa#B#$ZNWW3DLj30D zW1jBN6ljv$;-9>7dXkAy+TbdmigW9CJ+x%a1XV6Z;#c&U^|B<JtuY zXP3-XNjVWxy6Pz4S|^rMoz2vk=8YK8fRZ%*ioSW7p>jS2^0I|;$IcUx!n|?V=SD1$ zS2tZjqkv_KUmog8AC?=adGB0#Q+HGK^-jGcf z3wJr=eeFK{mLIi(pwptEAJA^k@3w=!A4ZFopKuoTGl|%hB5TV5l5Q ZY~&~$Vx1OhjB-rYJ_C2*Sp(S!000tSAv^#8 diff --git a/v03_pipeline/var/test/reference_data/test_hgmd_37.ht/.README.txt.crc b/v03_pipeline/var/test/reference_data/test_hgmd_37.ht/.README.txt.crc deleted file mode 100644 index 0cf6165af98d846e1b718f4f035af38ef8c627c5..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 12 TcmYc;N@ieSU}A`G`FaHa6Gj8) diff --git a/v03_pipeline/var/test/reference_data/test_hgmd_37.ht/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_data/test_hgmd_37.ht/.metadata.json.gz.crc deleted file mode 100644 index 4d8d0ca9f6488a7cf1a5d01869cfbc5f9102e899..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 12 TcmYc;N@ieSU}E@d_2o7I6y*d# diff --git a/v03_pipeline/var/test/reference_data/test_hgmd_37.ht/README.txt b/v03_pipeline/var/test/reference_data/test_hgmd_37.ht/README.txt deleted file mode 100644 index 0da7c8796..000000000 --- a/v03_pipeline/var/test/reference_data/test_hgmd_37.ht/README.txt +++ /dev/null @@ -1,3 +0,0 @@ -This folder comprises a Hail (www.hail.is) native Table or MatrixTable. 
- Written with version 0.2.120-f00f916faf78 - Created at 2024/03/28 16:28:17 \ No newline at end of file diff --git a/v03_pipeline/var/test/reference_data/test_hgmd_37.ht/globals/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_data/test_hgmd_37.ht/globals/.metadata.json.gz.crc deleted file mode 100644 index 0f75b1ff0fb2a5ffe10232c1fe8d20398c6cb433..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 12 TcmYc;N@ieSU}C8JC;bZm6OIGN diff --git a/v03_pipeline/var/test/reference_data/test_hgmd_37.ht/globals/metadata.json.gz b/v03_pipeline/var/test/reference_data/test_hgmd_37.ht/globals/metadata.json.gz deleted file mode 100644 index a5bde9f31a283b2a9c26cd42ed5558f5081ed5d4..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 312 zcmV-80muFyiwFP!000000F{zmPlGTNhW|@%#YAVe816*z%EV+CFI*PWwP#QVg|uY} zlKywk85<1zNaX79ew;j%woLN^Sje7@1OsyU`jAv0W-MUuJ7`$ATH_>234(MMwx_~N zQ-Qqi4NL(^=v>4Rw~#nf3)fcMWi}j1Sz1U^l;)=K!5)lNrT#kvx-Nc642sCPrO=%* zyjh*fYwrWe9SqwDc7KE&VV-ZsS?Tl)?rl$u#qTILeo&`s&TjvMYt^ zy_*W79(r~z&-B0ARVW{`a@T`)se1cgr^P=7XBTYp6JloA$R%N)FA;u`nHT615CFS| Ifsug$099Q+mjD0& diff --git a/v03_pipeline/var/test/reference_data/test_hgmd_37.ht/metadata.json.gz b/v03_pipeline/var/test/reference_data/test_hgmd_37.ht/metadata.json.gz deleted file mode 100644 index 83bfe37a071dfeb4550e9cb7717acf9a835bb133..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 338 zcmV-Y0j>TYiwFP!000000F6@5OT#b}{$FxZh@F}|Su>q_@Dg6$_v5{MWIz=cP<((kTuDUci&?(JsHR+eHBmOnC(}H;sj{rPoz5#- zEtUu)4HamR3{0Ux6fL#2Cx3N4#nQdEw|xs9c@3B0Ak>};vaz~jb_gPqF7=@)L(S#E z;c@)mAjq!yg@l?3Y7Dv4n)cfhdFLE1sfDxl9eT1AiggB4AwYoHVXcnIAtK~CoL3L) zRXtl=C*Gwm{yvv7|KPCiI0nZ>3SN>&X=k}}&Zogt(<%wl8bk(k3L=)0HX)$(+x;q0 zHZ8E2@j2MUcfPBR|JOyb7qwJdZvBb{Q{A}3Xz-GtRYZQ8CH^+s1s|=adpP+{l5uY0PP;k=6zY(SG6BP9ix|9L{0F~6C5&!@I diff --git a/v03_pipeline/var/test/reference_data/test_hgmd_37.ht/rows/.metadata.json.gz.crc 
b/v03_pipeline/var/test/reference_data/test_hgmd_37.ht/rows/.metadata.json.gz.crc deleted file mode 100644 index db072ce74ee0db4564d7d42c35f8d4b54346c049..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 16 XcmYc;N@ieSU}6yV3$mVcdtw6s9<&6f diff --git a/v03_pipeline/var/test/reference_data/test_hgmd_37.ht/rows/metadata.json.gz b/v03_pipeline/var/test/reference_data/test_hgmd_37.ht/rows/metadata.json.gz deleted file mode 100644 index 33aecb87cab2c1e2990993beeb961f9b182f6c7a..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 585 zcmV-P0=E4hiwFP!000000Ns^cZ`v>vhX0GNHZAFvq=b;W(Xk1snx-;M+A4$``vklO z2iaySL;3GLPDn@slSZZ9l#7VGK40(2BY96GT?2{9RC0I(e)r?kYy-?E4p}_Fj>PvQ z%h^`rJIyl4K$AHhv77_5fh6A76~6B_z;BuZi~=j+$-BvzYM2?dWoBQNi#rm_r4dW3 z2}#w?+2mRnkxNvj(gc3t9D1YhQ7O8+uS4g~YPsfdyBK8*w6@uGIhIka7XrhoYK5Kc z&tbOCRJZo|=F8p9Dj2?PnpQ0_P{`$>;YqLR^w_G^DcPnesNS+}Cu7f##>XRvu$9Po zD`bo_Jz&vJP-4ZI_=^GIU=W0-X}`4LW$J}_=m6-tpIO)+nh+L`o`-3zm9!|xzEQJ6 z{p#v{8E(}KXj(GyU!b(VV=9z3xXQZ zE9J$ZtJ)}Fx#H)Cx{$+W10D8Go%enBU0+V>IV=A#@*3gAr9JbyDR*$(PPa-yZ1D|M z=CiOjGhWvA#r@z8Mg!_6;JHZ>M((u-i5mss(B*6x1c5&q0cYh!{`-!~C5-c}ftOkz$A`BFYtY94dQ4#RbA_jUD( zx+D#82sxx^c1QL6-=eAILVd4_GbJ2x{eE*kNn^Vf49{&pOKM=XHzUJcm zmdWK=k_D``B4TR-^4#Cw6tpe(Gl@MN_)1ZMlflS!Dwm@D^U|HInb%#zxefn098F%~ RtHBHO0|NojnN0Q8DE;Fq4cgJQfPo~ zsZ0M|%F3obm?b{JJ@=f$fy*APxqxg7+HfTixt+$*45JO@;sm1cC>l-T@l_6Vd|k}h zbe`WJj5Mr4gJfVb4I-~!|c8nnNXBcRkX;Y zk{MG4m!(^zv6^yedyYe}VH!XVRfmU=|FHnsRA=jZPWR2BuQik|g1;rX)0(!)!8+|a zTu=+glRls&i8BJdr6R>z?Se%B$afXny$MWC!|VEzvr2Os#IL)GqlohLg{ zO||8g`x`bxhQZ)9v=+Ye+br;tN!t5pRfFTh4~HW-XnwK^ZY-C~iuM^=C=hW^2JTUC S(EGpZrQH`FU6-CT0{{Th8?w{@ diff --git a/v03_pipeline/var/test/reference_data/test_interval_1.ht/rows/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_data/test_interval_1.ht/rows/.metadata.json.gz.crc deleted file mode 100644 index 017eef36d4f7bc8016369fbe313d6751d556149f..0000000000000000000000000000000000000000 GIT binary patch literal 0 
HcmV?d00001 literal 16 XcmYc;N@ieSU}8AQz)`t9Y+XD6Ald|< diff --git a/v03_pipeline/var/test/reference_data/test_interval_1.ht/rows/metadata.json.gz b/v03_pipeline/var/test/reference_data/test_interval_1.ht/rows/metadata.json.gz deleted file mode 100644 index 668d507fd9bddca275517f8e013e45fea4ff4925..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 643 zcmV-}0(|`+iwFP!000000NqwyZ`v>v{V#snw9xUH(!7n1PDs@>m1z${D01u@@EUAn zo2nG$zxO(UkOaEIp4RdL9N&9=?(w;aH$>6}kbsONhezOdzrK!_z}mzk;XSNLFd;%( z&<`{rQ^c6!z+40gHoMfH>m~5(VgaMTLAdg1IG_f`R{!Z zRIk0IGQtWa5gAF_X{R1Yer}48ELm|dp6AEr{@r%}UJ=qeRrl}L|VJlo+veEX( zORAjpHxa^>9%BP<20Qf zbNODVl1#ZsC3#L9iGs`ll^Th6;%XE@(&Rgu@!wpU?WeP+)e&E2V-Ut7fv|SQhwjdi zH}2%T2@UwfB;s#*1-$>-{VA}R_DD#v)1>bRFhNUYA(?` zH+WXYU5DJCMzhgsx4Wzpbs9A4MUCzizijlm&l`TT-(@U{nBVVYH>N*#RL)?aR+78? z4auzU`YA3}K?%$)jSSdaw~*S-(j@3#w%UFZ`QRA59(m;=w=N&&#jzzQu%Rmw)5Ji! dFZmxadM(f)sneHQt5H~V@iTJbe$31Y001twM6v(? diff --git a/v03_pipeline/var/test/reference_data/test_interval_1.ht/rows/parts/.part-0-1224c3b3-ab5b-49d7-8d6d-6084ccbbc683.crc b/v03_pipeline/var/test/reference_data/test_interval_1.ht/rows/parts/.part-0-1224c3b3-ab5b-49d7-8d6d-6084ccbbc683.crc deleted file mode 100644 index a4b13f78f85ea891e2fa7aecc6b2add89427b83c..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 12 TcmYc;N@ieSU}D(e=gW#P diff --git a/v03_pipeline/var/test/reference_data/test_interval_1.ht/rows/parts/part-0-1224c3b3-ab5b-49d7-8d6d-6084ccbbc683 b/v03_pipeline/var/test/reference_data/test_interval_1.ht/rows/parts/part-0-1224c3b3-ab5b-49d7-8d6d-6084ccbbc683 deleted file mode 100644 index 1d5c3980168f154018f031c1c8b8b1b40a775acd..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 60 zcmY#qU|^5}VvVi(e-)%IGB7Z*Bxe*E-kD|!rhiQ{WdzDP*fTOR@B$ei09MKX6b1m5 C%?w}w diff --git a/v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/.README.txt.crc 
b/v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/.README.txt.crc deleted file mode 100644 index 6f6856dcc34b8f296875750274c8ce308560eafb..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 12 TcmYc;N@ieSU}7++o^=!e5vl`4 diff --git a/v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/.metadata.json.gz.crc deleted file mode 100644 index 0cfa73ca9dbac98379d835afc61516a1000b4ccb..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 12 TcmYc;N@ieSU}8AotG)^V5`P0B diff --git a/v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/README.txt b/v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/README.txt deleted file mode 100644 index 6e5794f88..000000000 --- a/v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/README.txt +++ /dev/null @@ -1,3 +0,0 @@ -This folder comprises a Hail (www.hail.is) native Table or MatrixTable. 
- Written with version 0.2.114-cc8d36408b36 - Created at 2023/07/25 00:52:16 \ No newline at end of file diff --git a/v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/globals/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/globals/.metadata.json.gz.crc deleted file mode 100644 index 69cec0890730fb40bc51bc3d714a066814a6c26a..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 12 TcmYc;N@ieSU}DH>_WB0^5<>%U diff --git a/v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/globals/metadata.json.gz b/v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/globals/metadata.json.gz deleted file mode 100644 index 7375289521bb14abb4b38e69f36b4724792019b2..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 314 zcmV-A0mc3wiwFP!000000G*OiPlGTNh5t)m#YEBQ!Xp#o!xEEWe9#!uwRcblg|uY} z68^jQZgar)GUTb8bH1CNwk^}V02Z)UBf)@NzCXnk2%mW@*+9dB&o5kwQi33z1??a3 z)>I(pM*~wp5;|8=$SuUq)WWqDcbN@FQl`=jM{4T~SK1{8QdR1tP_Cq)(%dwiw*_NS z{uc|nE~bHfu diff --git a/v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/globals/parts/part-0 b/v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/globals/parts/part-0 deleted file mode 100644 index 785c16ba791482e6df6a798a40f31af7db30d227..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 153 zcmW-bF%H5Y6hLusaxw7&T;Lb@wP|HFQCE$t1Ge8nQfUOzcnH0#o2$f^yy0!{#Ta{M zzsK37k6MUGj{t#g6v(#(ok>RQPg`3y;_lMe+R=BxnL!+Aa8GVsZ+ScVn1XRW@hpYO t43Y2X3aSE>h((IiV#z)$o~r~|1FEw^jT?=!Q0q!IHORG=lFjqWpC7^1D@FhS diff --git a/v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/index/part-0-271a7dfb-7fc1-4e43-ac16-af1cf05d0ae0.idx/.index.crc b/v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/index/part-0-271a7dfb-7fc1-4e43-ac16-af1cf05d0ae0.idx/.index.crc deleted file mode 100644 index 24103e0dfcc56efcd02e1a288882aba80d7de9de..0000000000000000000000000000000000000000 GIT binary patch 
literal 0 HcmV?d00001 literal 12 TcmYc;N@ieSU}Cs@Z}%Yp6`llX diff --git a/v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/index/part-0-271a7dfb-7fc1-4e43-ac16-af1cf05d0ae0.idx/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/index/part-0-271a7dfb-7fc1-4e43-ac16-af1cf05d0ae0.idx/.metadata.json.gz.crc deleted file mode 100644 index 4af1b141a9bf7b8f739525a7ce9638693651ac47..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 12 TcmYc;N@ieSU}D%RT$2C*5#<8k diff --git a/v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/index/part-0-271a7dfb-7fc1-4e43-ac16-af1cf05d0ae0.idx/index b/v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/index/part-0-271a7dfb-7fc1-4e43-ac16-af1cf05d0ae0.idx/index deleted file mode 100644 index 9755f9219cbf6d8d2fdafe9892bb71c0aadda914..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 162 zcmXBO%?d$L6vgrNuiGg~I_F+RW}bjN0#hR&V|7c$cR~!rguH`^ynuoC@(31ot=(^r z_Bu#5H!j_Q8Ab`h-SlVLeFfQ9T=K@a{yy=W-efE?#w-`qSA4z=?U26 gO6OeV#lBJ~-3u)rK5lgVzU91wpL+f!e*0VF12Nkbi~s-t diff --git a/v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/index/part-0-271a7dfb-7fc1-4e43-ac16-af1cf05d0ae0.idx/metadata.json.gz b/v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/index/part-0-271a7dfb-7fc1-4e43-ac16-af1cf05d0ae0.idx/metadata.json.gz deleted file mode 100644 index 81af190beddb060cd4c4a3ccf3b187b456a38578..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 178 zcmV;j08RfNiwFP!0000009B2_3c@fDME_+^3OTe^tIbtVL_tyUA|7JgHo>xyWGj}^ ze>aL3Zv*emOb4Z{@q|8DMq-R%YgAd&Cb0L|A`$N&HU diff --git a/v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/metadata.json.gz b/v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/metadata.json.gz deleted file mode 100644 index cd05038116aa45e98b68b7a7929d516bc1d31e92..0000000000000000000000000000000000000000 GIT binary patch literal 0 
HcmV?d00001 literal 323 zcmV-J0lfYniwFP!000000F{!>YQr!Pgx^J{7S!PU-~?Y?D5X7=r1TPkQRIy+DoaLM zCm7>*ujE<@>B+rGv)_!CK^-{>3E3Hp6IWo0iMX7U}2Onf~4bA{}5m#|E-BY=+C=^H8zCjB)Nn9`B VTA{%|e4N4B{{Ygsr3A480003BlQ;kX diff --git a/v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/rows/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/rows/.metadata.json.gz.crc deleted file mode 100644 index 3b2baab1038ecf0ea3952ea62a9fbb5b4adf1438..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 16 XcmYc;N@ieSU}ErJXBx7cUHAb2AU_0R diff --git a/v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/rows/metadata.json.gz b/v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/rows/metadata.json.gz deleted file mode 100644 index e131355c96a41e961fa60f2c2e4123a180a5e065..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 605 zcmV-j0;2sNiwFP!000000Ns|sj@mE~hTp}fR!z19SfIHTN?W9DRYBE55sHjGA?^kn zIcBvi%DZ=*Ku7{zk=}MWk>4}pncrU&?}?;qAU>H$4v)a6zdp@2z}mzi!9DDVKPN&O z&<`{w3)Gn6z-$EZ_s7(qyAAMBv4BxvC0uzo9?}G6MsJzfAM5yKF)>t|io%pMV&!A1 zkW}xSjY@fw5(=G|K=8*LfL_-@k`sLRs5RZqD^TK!|1v2} z9TkQNm|{pOr84wTvO?+N5~nAZN*?77mAttU>Y|ymR%%VW7N*e!N#pN`;{P$4?WaLr zrZRg8f>6W|)DHQa?Ob`u9On~gU`q4L^EO1gFt>v9xv6za>xNzp|D_53(f_M_AJD93 zAHi()lPlHal-QaZgy%CJR$}iU_Q7qr?Y>9*JY2f{ka=zwx*eA?ujkUxV`1x-wiX;PcriDJjHFsFNrzR*a3_5H*D?5H1R!b1>b<@gSr@WfP0pg#6h6Y r#x~z7>8V)LHhQ&N{!KIC4&1G!RH8%GYS9zOH?bj diff --git a/v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/rows/parts/.part-0-271a7dfb-7fc1-4e43-ac16-af1cf05d0ae0.crc b/v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/rows/parts/.part-0-271a7dfb-7fc1-4e43-ac16-af1cf05d0ae0.crc deleted file mode 100644 index 55bcdaf13f679e66fe053b3e200d96bf8ce8c9db..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 12 TcmYc;N@ieSU}8v?e)kgq5;6mz diff --git 
a/v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/rows/parts/part-0-271a7dfb-7fc1-4e43-ac16-af1cf05d0ae0 b/v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/rows/parts/part-0-271a7dfb-7fc1-4e43-ac16-af1cf05d0ae0 deleted file mode 100644 index 3bb0e2240e4f18d053bd73f7423eeedd93b0a950..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 113 zcmYdcU|`q?#1k1A7+8`sihLQ_7))3g83h>x+1MDwIDs@Tkmd){5`VJZZ diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/clinvar/2024-11-11.ht/.README.txt.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/clinvar/2024-11-11.ht/.README.txt.crc new file mode 100644 index 0000000000000000000000000000000000000000..0e7ebec47a30d3b6a7c072dfb06d76ea34176df4 GIT binary patch literal 12 TcmYc;N@ieSU}A{i-fsZ_5Vr!A literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_data/gnomad_qc_crdq.ht/._SUCCESS.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/clinvar/2024-11-11.ht/._SUCCESS.crc similarity index 100% rename from v03_pipeline/var/test/reference_data/gnomad_qc_crdq.ht/._SUCCESS.crc rename to v03_pipeline/var/test/reference_datasets/GRCh37/clinvar/2024-11-11.ht/._SUCCESS.crc diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/clinvar/2024-11-11.ht/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/clinvar/2024-11-11.ht/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..a91293ea43aabcf30e7d615061df6c09ecbac702 GIT binary patch literal 12 TcmYc;N@ieSU}8A3q?;806PE)s literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_data/test_combined_mito_1.ht/README.txt b/v03_pipeline/var/test/reference_datasets/GRCh37/clinvar/2024-11-11.ht/README.txt similarity index 78% rename from v03_pipeline/var/test/reference_data/test_combined_mito_1.ht/README.txt rename to v03_pipeline/var/test/reference_datasets/GRCh37/clinvar/2024-11-11.ht/README.txt index ae1371f9b..a31bb6640 100644 --- 
a/v03_pipeline/var/test/reference_data/test_combined_mito_1.ht/README.txt +++ b/v03_pipeline/var/test/reference_datasets/GRCh37/clinvar/2024-11-11.ht/README.txt @@ -1,3 +1,3 @@ This folder comprises a Hail (www.hail.is) native Table or MatrixTable. Written with version 0.2.132-678e1f52b999 - Created at 2024/11/07 10:28:31 \ No newline at end of file + Created at 2024/11/21 18:34:48 \ No newline at end of file diff --git a/v03_pipeline/var/test/reference_data/gnomad_qc_crdq.ht/_SUCCESS b/v03_pipeline/var/test/reference_datasets/GRCh37/clinvar/2024-11-11.ht/_SUCCESS similarity index 100% rename from v03_pipeline/var/test/reference_data/gnomad_qc_crdq.ht/_SUCCESS rename to v03_pipeline/var/test/reference_datasets/GRCh37/clinvar/2024-11-11.ht/_SUCCESS diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/clinvar/2024-11-11.ht/globals/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/clinvar/2024-11-11.ht/globals/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..28013b1bcd69bf9575b63521d37f3ee66ef63015 GIT binary patch literal 12 TcmYc;N@ieSU}89?u{sj~63+uY literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/clinvar/2024-11-11.ht/globals/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh37/clinvar/2024-11-11.ht/globals/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..1a57f3c95643e27773ec8de216fb631c498d0035 GIT binary patch literal 298 zcmV+_0oDE=iwFP!000000F_ZsOT#b}|1R%p1)Xe2ZdU6_5LWTv7?LG#U0j=%BpuY! 
z@9ukRoon@CPRZ|YlGlQj1%iNQr_rHazCNTCsu@J^zQ+c{YE3&;YE+DaXul0FO@;dY zM34#@BIY8ArN=b5T7~w2&YRK-0*#ffiO|-C7b8t2!>-KG8f8Kwk~`;Qvl{g4NP3Sh zaK?n`I&M7-t&=`F8k-GmfA;iYz3#kag!dhrdb>q8`GBV+zMkF2KU4&2XL5ZeiSttJ zDP|?betwbaGpKg7YMgM23rCE09L<_w*UYD literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/clinvar/2024-11-11.ht/globals/parts/.part-0.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/clinvar/2024-11-11.ht/globals/parts/.part-0.crc new file mode 100644 index 0000000000000000000000000000000000000000..4a2b0ad2139ae28c6b4f708159ad249403b8cfac GIT binary patch literal 12 TcmYc;N@ieSU}BiFX|n_X6NCd) literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/clinvar/2024-11-11.ht/globals/parts/part-0 b/v03_pipeline/var/test/reference_datasets/GRCh37/clinvar/2024-11-11.ht/globals/parts/part-0 new file mode 100644 index 0000000000000000000000000000000000000000..1abb6a203abca8fb8013f09bbe023fca790f33bc GIT binary patch literal 360 zcmV-u0hj($0RRBQ0ssIgwJ-f(!vQS{0HzW;BM_0>u#S||ri^40WhsqF=_6PSz@9L6 zty9eH4-oc19pjz^HUKXGEdXsfVp1S+79a{vzI<=tg=-^}k?XsxcL~QLVdmw9PB;ln zHyua6DjKB}L_H7Xc_@~OL!|)-tvw7V5M zwK4j#{$kqHeLaS5^_jj2>BKEL`g2pl41NUZ%aUiK!!-2ByiWS+n~vVumYD&x7;gBy z7*WRVtKrWw%AsbK&aKxLgBaWmz2SFhuL<+HRmY4Tn;Y^nG(TIdk&`~l=a!nz=s5ri zBp^^CQjmB9TM85m=CO23$!hFfanLRJkwHdS4?zdiNkCt&mFy4}x-?u$;0#bN8QLg^ zFMIR9goY3D4pe2GXm?pX3#NI2ap-T7iIWV3xd$i{h2`W;IROm-00000001bpFa00@ G0RRBz_?y1~ literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/clinvar/2024-11-11.ht/index/part-0-16d3574b-02c6-4ade-8054-836f2bbce002.idx/.index.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/clinvar/2024-11-11.ht/index/part-0-16d3574b-02c6-4ade-8054-836f2bbce002.idx/.index.crc new file mode 100644 index 0000000000000000000000000000000000000000..3dfccd27db1e08d973ae1f380ee91f0c8053ffd4 GIT binary patch literal 12 TcmYc;N@ieSU}BiN=IVL?6j1~S literal 0 HcmV?d00001 diff --git 
a/v03_pipeline/var/test/reference_datasets/GRCh37/clinvar/2024-11-11.ht/index/part-0-16d3574b-02c6-4ade-8054-836f2bbce002.idx/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/clinvar/2024-11-11.ht/index/part-0-16d3574b-02c6-4ade-8054-836f2bbce002.idx/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..f5eb925f2e1cae6dcc1aa27dc8f82bbd5be66107 GIT binary patch literal 12 TcmYc;N@ieSU}A8tHVOv-5V!(@ literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/clinvar/2024-11-11.ht/index/part-0-16d3574b-02c6-4ade-8054-836f2bbce002.idx/index b/v03_pipeline/var/test/reference_datasets/GRCh37/clinvar/2024-11-11.ht/index/part-0-16d3574b-02c6-4ade-8054-836f2bbce002.idx/index new file mode 100644 index 0000000000000000000000000000000000000000..e543726c14fb614af82ee1f4d053e0ec1fdaf6a9 GIT binary patch literal 138 zcmYdhU|?7T#2Q=m|0*on%EVB@%)r3Sz-ah%nkf?lqa%}}BZD))yIhDaqq7a8yH`jA zlcO`EV;ZxgGmt7}add8FWOZkBoW{t*pujP!CB02?vc&u)3{8o%7G9N0INSh)&9lAb UfmR5?tPz^Xz`)1|xYr`NE{x3dl;L4h?Uh+0N7-f4X?bu5R#h8;+W;Cc- z2_gS|YHKeUgT07;H;?Dop@MPPeS^|W3Y1KyaWundPZ+;|=y4Q{#&Ps8na*HbJVn`j zK1UcSC_sUrpmz%Rt|n%0#TV2{49RS~X=|{|hdXp7ml@&7#;Asx?sSPeGErg>O{D|p z0%Zw;QD)lYSt-(PkGd2Stw9;*yute(e?gmUxulK8gmd7aQ>^46gnA}<57>|$?(Iqv%umy+ijLzF(Tf0)j%v{1Ol zGbz?hRxx92`vzs(I&_ zTT~wY@AG17s)>D!e~$0Qb-*f63(pM0f_^(5r!Gjz+BF7lBF^B+bgRmlwzH##w{Orw ajuJaA=v{x5#ow9tiwoAOpVHX&8h1l1l^A!O_m@EROs zo2d-tzwbDK+|f#(GC*+3E;&lMgHJK%S@1Eav2(DP|BrW)o(?U^~I<$|(>#nOm%fKftH zb#QjE7DmJpEmLWHFL6$-gUPK@^ze8JT%7Dv9%O+IRth7*jF9Ue)NJB4NWrjHky?SW zE5#g^qEu!tP2PRstmBB!4ORLaa4C^$4a!1CTY)z9EgZdHZG|zQ%3w}|AaI_}-1H@u z$v-D1D;(b2=Z9~zyNy5mm~)h+V<3`HV(7mF<$0~ll!toBeSZSV}9Gi48Y4lcOH zmzK>yPw-Y-n>cY8YA`trDrIcwDYw)iNm|8U%u$-6qF#&s+D*w(H89oCyc6}|(n%V9 zN71}aO=&*`vRyx^T$iDUU|BotrV2C5+bS;Wp@7ATwVPzd!7)IvXZez#>$+|?DQ;NU z`~d@Q2_82A$zN2%9}PXx literal 0 HcmV?d00001 diff --git 
a/v03_pipeline/var/test/reference_datasets/GRCh37/clinvar/2024-11-11.ht/rows/parts/.part-0-16d3574b-02c6-4ade-8054-836f2bbce002.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/clinvar/2024-11-11.ht/rows/parts/.part-0-16d3574b-02c6-4ade-8054-836f2bbce002.crc new file mode 100644 index 0000000000000000000000000000000000000000..c0689358515006ac3a30558f8ebda24635a49eda GIT binary patch literal 12 TcmYc;N@ieSU}CV1*l7U(5hen` literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/clinvar/2024-11-11.ht/rows/parts/part-0-16d3574b-02c6-4ade-8054-836f2bbce002 b/v03_pipeline/var/test/reference_datasets/GRCh37/clinvar/2024-11-11.ht/rows/parts/part-0-16d3574b-02c6-4ade-8054-836f2bbce002 new file mode 100644 index 0000000000000000000000000000000000000000..a0f3274abe66575c7efa8cb869707804c6d7aa7a GIT binary patch literal 100 zcmeZgU|?7Y#2Q=m|0=8$Wnyq)V*JC%X!vxRDH8*uBa@?}vwH|5BO{ZeGb6L3GmE1$ xt2?7312=<$fXP$CVz(|{)6_Y^?A5OX4#-ZBoWMFUU6vPU90-7oWME`q0039u78U>i literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/dbnsfp/1.0.ht/.README.txt.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/dbnsfp/1.0.ht/.README.txt.crc new file mode 100644 index 0000000000000000000000000000000000000000..b308c3ea696c7953533552704fc91370b5b80dea GIT binary patch literal 12 TcmYc;N@ieSU}De~OT7pH5D5aZ literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_data/test_clinvar_path_variants_crdq.ht/._SUCCESS.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/dbnsfp/1.0.ht/._SUCCESS.crc similarity index 100% rename from v03_pipeline/var/test/reference_data/test_clinvar_path_variants_crdq.ht/._SUCCESS.crc rename to v03_pipeline/var/test/reference_datasets/GRCh37/dbnsfp/1.0.ht/._SUCCESS.crc diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/dbnsfp/1.0.ht/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/dbnsfp/1.0.ht/.metadata.json.gz.crc new file mode 100644 index 
0000000000000000000000000000000000000000..4df064edbe2106af2ba8cf72f65f3a35022c0d26 GIT binary patch literal 12 TcmYc;N@ieSU}C6!akm`+6m0|q literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/dbnsfp/1.0.ht/README.txt b/v03_pipeline/var/test/reference_datasets/GRCh37/dbnsfp/1.0.ht/README.txt new file mode 100644 index 000000000..706fac138 --- /dev/null +++ b/v03_pipeline/var/test/reference_datasets/GRCh37/dbnsfp/1.0.ht/README.txt @@ -0,0 +1,3 @@ +This folder comprises a Hail (www.hail.is) native Table or MatrixTable. + Written with version 0.2.132-678e1f52b999 + Created at 2024/11/21 18:28:11 \ No newline at end of file diff --git a/v03_pipeline/var/test/reference_data/test_clinvar_path_variants_crdq.ht/_SUCCESS b/v03_pipeline/var/test/reference_datasets/GRCh37/dbnsfp/1.0.ht/_SUCCESS similarity index 100% rename from v03_pipeline/var/test/reference_data/test_clinvar_path_variants_crdq.ht/_SUCCESS rename to v03_pipeline/var/test/reference_datasets/GRCh37/dbnsfp/1.0.ht/_SUCCESS diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/dbnsfp/1.0.ht/globals/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/dbnsfp/1.0.ht/globals/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..9fda0e926d17c5ad57238bdb4257ca19f0bbe498 GIT binary patch literal 12 TcmYc;N@ieSU}E?cntK5N6cGc* literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/dbnsfp/1.0.ht/globals/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh37/dbnsfp/1.0.ht/globals/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..ff78cc258f3a50ab11cf766b32943b8a82c19b92 GIT binary patch literal 291 zcmV+;0o?u{iwFP!000000F_Z+OT#b_|1NoTf=)IhZ&vG*BCNv)#~4|1n{jPgF3F&l zes}NII=2dfdAj@kyCl~LrwR~Kyr2ODGkbl=D$rvw60055LN1n6>C%8^T*&6zFs~~x z_eX+Mzz{JPX`(!20izDh9?;v;Ndm1?SdYNP!pk$wa||kwXs&z!Tvc=rncg9)deOU< zZOaq({e(d_wbpI>lOJj%^t;W?i6qHOy`z{j(VD@` 
zev)81ncPf&+unRI2kSq5bZ1b-(An~#M^p;Kqs_t7x&$|uSAy5}F;i{}Ng17Csu4aA pNwH)CXS%{Vokv5fI8gL?aKuv9Fw%0G3IRQP+AsDWPSZ64005>OhM@oe literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/dbnsfp/1.0.ht/globals/parts/.part-0.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/dbnsfp/1.0.ht/globals/parts/.part-0.crc new file mode 100644 index 0000000000000000000000000000000000000000..097026b06e614a288ce16114a9787e1bc10ba372 GIT binary patch literal 12 TcmYc;N@ieSU}AXoppOv%6x9Qe literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/dbnsfp/1.0.ht/globals/parts/part-0 b/v03_pipeline/var/test/reference_datasets/GRCh37/dbnsfp/1.0.ht/globals/parts/part-0 new file mode 100644 index 0000000000000000000000000000000000000000..0e97c1b5e71ea981a3644b664ad36601cf6412cb GIT binary patch literal 51 ycmb1RU|x7#$h?7y}r1fwCX~R>8o?zyJWVGY04Y literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/dbnsfp/1.0.ht/index/part-0-67410585-d883-48cc-8d33-933fff287418.idx/.index.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/dbnsfp/1.0.ht/index/part-0-67410585-d883-48cc-8d33-933fff287418.idx/.index.crc new file mode 100644 index 0000000000000000000000000000000000000000..1f307251a5ee7d13cb486565a3c0ce15ac9edeb1 GIT binary patch literal 12 TcmYc;N@ieSU}A`J*|-t_5x)a7 literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/dbnsfp/1.0.ht/index/part-0-67410585-d883-48cc-8d33-933fff287418.idx/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/dbnsfp/1.0.ht/index/part-0-67410585-d883-48cc-8d33-933fff287418.idx/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..f5eb925f2e1cae6dcc1aa27dc8f82bbd5be66107 GIT binary patch literal 12 TcmYc;N@ieSU}A8tHVOv-5V!(@ literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/dbnsfp/1.0.ht/index/part-0-67410585-d883-48cc-8d33-933fff287418.idx/index 
b/v03_pipeline/var/test/reference_datasets/GRCh37/dbnsfp/1.0.ht/index/part-0-67410585-d883-48cc-8d33-933fff287418.idx/index new file mode 100644 index 0000000000000000000000000000000000000000..346dfe243ae21315fd971827040f140ef244b31f GIT binary patch literal 138 zcmYdhU|?7R#2Q=m|0*om%EVB@%)r3Sz-ah%nkf?lqa%}}BZITByIP1Pqq7gAds0XR zlcO`EV;>{4qcf0R$l~a{mDQcm@f0Hug9672Z{wbsNB^j>3GiIzS^JGanoWjHM($1w VlRVH0A(%Bn6B!s78G$?$TL7jG9}oZl literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/dbnsfp/1.0.ht/index/part-0-67410585-d883-48cc-8d33-933fff287418.idx/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh37/dbnsfp/1.0.ht/index/part-0-67410585-d883-48cc-8d33-933fff287418.idx/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..2d8bd5ec52a289cce7bb10a504c2361ae82d00be GIT binary patch literal 186 zcmV;r07d^FiwFP!0000009B5`3c@fDME_+^3OTg4RLxBUJt!zDUc^JJ+a?$iNw#1p z{dc$CyetDV^LBb@jKwR4XuJbiRavt5YX)3--v(h$g znIVM#zJj8G>|!^Bp7WeT+m`~J7zgqL+At{)9*ze7*h2|r{2`*(p5N;a{FmWq4E@ua zf0#@r$U_=Vph3i7hz3p+wTXitszWS2xO$r_aL%S$6lX3El;gG4S!|2jN2((*^-!8YQS?YsU7&jlIx=IKh%L`QA%`Qv YZn{vq@i+PU*Q%rCFM<6P(B~ z)ligw?{xwp52hVPn$564<1Lxjm2cVJb@WD3;aX4 z1<`cAt$Opa_4{5?U2#bx1=0y6Fo&~qePhrY4xMm|Xf9+nsML^rM-}uYB}1%d@%uE3 z!W5?Z`@`4yU~6_iW-L>Z=*Y#U$ngrpe-iu;zY4o5x#zg$sXgpH4`fLV(^0?CwE7p5 zLEJ6(SY-RHzzqdrhC5&ercAuoH6%ShodJf%A8v<|;-!`ED literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/dbnsfp/1.0.ht/rows/parts/.part-0-67410585-d883-48cc-8d33-933fff287418.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/dbnsfp/1.0.ht/rows/parts/.part-0-67410585-d883-48cc-8d33-933fff287418.crc new file mode 100644 index 0000000000000000000000000000000000000000..6ad8acdbf33b27ed95ec0aed7044913e1f30cb06 GIT binary patch literal 12 TcmYc;N@ieSU}De+e(?eT5zzyN literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/dbnsfp/1.0.ht/rows/parts/part-0-67410585-d883-48cc-8d33-933fff287418 
b/v03_pipeline/var/test/reference_datasets/GRCh37/dbnsfp/1.0.ht/rows/parts/part-0-67410585-d883-48cc-8d33-933fff287418 new file mode 100644 index 0000000000000000000000000000000000000000..95bff341dda94cda2a6bd3ac9905f41427f74101 GIT binary patch literal 128 zcmd01U|@I%#2Q=m|0+EE%EYjQnX!h^@aZ&DCI&`FCP&Ak3I?_e&h8;OjK;>&j_&i` z3x?c0Ckm8tW^`nBbY^jMW_4$TsAAw}P!O=#TUacouvLDFL-j%49eI9!(Y?&AjgAfF U2a?~;DB}fM1p;8}7#JBC01>_;?*IS* literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/.README.txt.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/.README.txt.crc new file mode 100644 index 0000000000000000000000000000000000000000..2f0fb448775907d4aeb05cf8d194bcacba5de68b GIT binary patch literal 12 TcmYc;N@ieSU}8|dy}A?t5@-Wi literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_data/test_combined_1.ht.ht/._SUCCESS.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/._SUCCESS.crc similarity index 100% rename from v03_pipeline/var/test/reference_data/test_combined_1.ht.ht/._SUCCESS.crc rename to v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/._SUCCESS.crc diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..5a70299a257f94deb094651db1e9178879442571 GIT binary patch literal 12 TcmYc;N@ieSU}9L(eEcr}6eI*D literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/README.txt b/v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/README.txt new file mode 100644 index 000000000..2fe7da443 --- /dev/null +++ b/v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/README.txt @@ -0,0 +1,3 @@ +This folder comprises a Hail (www.hail.is) native Table or MatrixTable. 
+ Written with version 0.2.132-678e1f52b999 + Created at 2024/11/21 18:02:58 \ No newline at end of file diff --git a/v03_pipeline/var/test/reference_data/test_combined_1.ht.ht/_SUCCESS b/v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/_SUCCESS similarity index 100% rename from v03_pipeline/var/test/reference_data/test_combined_1.ht.ht/_SUCCESS rename to v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/_SUCCESS diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/globals/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/globals/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..fb5ed3f93cbe103a7565c9830999566e73c689f4 GIT binary patch literal 12 TcmYc;N@ieSU}EU@{1F2H64L`? literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/globals/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/globals/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..c945547ba2d0854fe2a257585bae58f45b8fb39f GIT binary patch literal 263 zcmV+i0r>tOiwFP!000000F{tWPs1<_#lMTM6bXhlp&SF{goLyr4j_cG#2aC4TIJ8A zQsuj2k1=R>A}9O3_p=-i;8cSk;K3X8XqHd6Rfl>81-xys2l4Vsm2M4c)N` zxj7n?LYBxmE2RopMW1vWcIbm`ogmRV<@*9%YJwPLa%OyYz$YNJ)@=^h2w0`{8vSIg z`%+%cuH~;)fPS~N`4f?KtG6_B6k4%Z*-r{A=JTt?DfS$q*`xpDXlGQ-l+66-6P@z$ z&ej;{)WFU81@Or~mdb5Vw#gZ`1_+T%8cPs5^Ice{>tyJRdy8&^N3G-xE19RM7=2(S Ne*x-txD)XK004@8eX{@n literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/globals/parts/.part-0.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/globals/parts/.part-0.crc new file mode 100644 index 0000000000000000000000000000000000000000..1fdaa2e8d47c8a8a7300e42b27a1ba34c3a388da GIT binary patch literal 12 TcmYc;N@ieSU}9J_QCJQD5&r_8 literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/globals/parts/part-0 
b/v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/globals/parts/part-0 new file mode 100644 index 0000000000000000000000000000000000000000..b88778abc7fea54a002d3015eda3a26790756006 GIT binary patch literal 40 ncmWe;U|?VaVvVi(e-+pa85kIu4fPBdc!5G702XFoWMBXQWp@Qc literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/index/part-0-04c0af8a-a562-4e97-a303-1047deca5f45.idx/.index.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/index/part-0-04c0af8a-a562-4e97-a303-1047deca5f45.idx/.index.crc new file mode 100644 index 0000000000000000000000000000000000000000..0822dac8bac09b1f1548c652060d964053d59bc4 GIT binary patch literal 12 TcmYc;N@ieSU}CV#bPNLk5Q_qN literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/index/part-0-04c0af8a-a562-4e97-a303-1047deca5f45.idx/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/index/part-0-04c0af8a-a562-4e97-a303-1047deca5f45.idx/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..f5eb925f2e1cae6dcc1aa27dc8f82bbd5be66107 GIT binary patch literal 12 TcmYc;N@ieSU}A8tHVOv-5V!(@ literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/index/part-0-04c0af8a-a562-4e97-a303-1047deca5f45.idx/index b/v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/index/part-0-04c0af8a-a562-4e97-a303-1047deca5f45.idx/index new file mode 100644 index 0000000000000000000000000000000000000000..73b40e8a5264299f738a9050f72970bc4e401087 GIT binary patch literal 138 zcmYdhU|?7T#2Q=m|0*on%EVB@%)r3Sz-ah%nkf?lqa%}}BZD)ayG)1{qq7C0yJtug zlcO`EV>Yv+GmxreadhrrWOZkBoWaP$pujP!CB02?vc&u)3{8o%7G9N0INSh)&9lAb UfmR5?tPz^Xz`)1|0{9@s-)^BiY!9LHYx!G zxRGV?@5RI|+=n0xMJ26|L+0p$-E%aufAIxF%yMh)fSQbhNYe3BLU-E=mGta`|I zi^T$AWS{~Al7TG^h*DG61o%=7h@-m@?>Y@Z`5X>?E!2(*vUa9rt{*QMj$GClWP7kl 
z^B#XgM>ayS)?z9I2(Vjh)G-lJA@9R_^SW9##q2h5BT37s{0&R|Cd;-%kWm_f0j?s>#nE(4?YVVaGAFt>=wOdXcvJA<(UU*^SG4+H L?*(xqC;|Wg-8Pfm literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/rows/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/rows/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..3c794c785b0126bfbe421d0b6c58bdbdc3db0ae5 GIT binary patch literal 16 XcmYc;N@ieSU}9Lx`{3$kuZR5rCsPIw literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/rows/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/rows/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..6e0dec6fea867c3a11200eb8ab82c8c61f5fb835 GIT binary patch literal 581 zcmV-L0=oSliwFP!000000NqthZ`wc*{V#iJRXg<;AMq_toeC+cDu{Y$gshEs4BKqh zXuVXS$p5{wHW=FwDMZR891uJE=FPmFhh|G8eF74aiDd8$Z2IflYz3)DERsCJhJ^P- zWE7vPin*g0&zKmVQL$4(P!IXZ#-m zhJ%4WJWl&vXjY}3nTIlfKJ_z8^Fa~Z%($mvvX50GS0H{;y~Oy%#n%dMq5^1E4fa*j zZjUzOs?Z5%&-E;afrX|j@KC>c#~64MzL!%n{9re`f=7YFTgOXJIK-0X{yCxYIko1Oq?7}+ozK*MFC`fa~ zPIq*s2^;wx;~mTHi|$K4-=?ReyddNwIu{@9jr*i}gTw1|SxREaS5)a6ZWb%EkDuF9i6XkxJ;Gm+AC TOtn^qvn>AuP=vwCdk6pkGFcpG literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/rows/parts/.part-0-04c0af8a-a562-4e97-a303-1047deca5f45.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/rows/parts/.part-0-04c0af8a-a562-4e97-a303-1047deca5f45.crc new file mode 100644 index 0000000000000000000000000000000000000000..80f391d3e5417c3df01adb3edaaa21243f8d8bf7 GIT binary patch literal 12 TcmYc;N@ieSU}D&$#vKm;5ibIL literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/rows/parts/part-0-04c0af8a-a562-4e97-a303-1047deca5f45 b/v03_pipeline/var/test/reference_datasets/GRCh37/eigen/1.0.ht/rows/parts/part-0-04c0af8a-a562-4e97-a303-1047deca5f45 new file mode 100644 
index 0000000000000000000000000000000000000000..0e47e70164decc65618b1f2b31da23974278c528 GIT binary patch literal 107 zcmWGzU|?7U#2Q=m|0=BVWn$RE#K^*E_;i{n69c0olcS@vdk7-~qw_T9)AsHmEKH8h zjLeSCERN2s?u?EMJPZl~#{G(i+q=yhZ7vu!aLn4%kaVE%LBiu1Uc5k4K>%zr10w?i E0DO`gXaE2J literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/exac/1.0.ht/.README.txt.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/exac/1.0.ht/.README.txt.crc new file mode 100644 index 0000000000000000000000000000000000000000..f971c085311911b2fe55a0a8b5154bf76e582a1a GIT binary patch literal 12 TcmYc;N@ieSU}7+|o%;b literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/exac/1.0.ht/README.txt b/v03_pipeline/var/test/reference_datasets/GRCh37/exac/1.0.ht/README.txt new file mode 100644 index 000000000..3d248eeaf --- /dev/null +++ b/v03_pipeline/var/test/reference_datasets/GRCh37/exac/1.0.ht/README.txt @@ -0,0 +1,3 @@ +This folder comprises a Hail (www.hail.is) native Table or MatrixTable. + Written with version 0.2.132-678e1f52b999 + Created at 2024/11/21 18:12:31 \ No newline at end of file diff --git a/v03_pipeline/var/test/reference_data/test_combined_1.ht/_SUCCESS b/v03_pipeline/var/test/reference_datasets/GRCh37/exac/1.0.ht/_SUCCESS similarity index 100% rename from v03_pipeline/var/test/reference_data/test_combined_1.ht/_SUCCESS rename to v03_pipeline/var/test/reference_datasets/GRCh37/exac/1.0.ht/_SUCCESS diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/exac/1.0.ht/globals/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/exac/1.0.ht/globals/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..fb5ed3f93cbe103a7565c9830999566e73c689f4 GIT binary patch literal 12 TcmYc;N@ieSU}EU@{1F2H64L`? 
literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/exac/1.0.ht/globals/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh37/exac/1.0.ht/globals/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..c945547ba2d0854fe2a257585bae58f45b8fb39f GIT binary patch literal 263 zcmV+i0r>tOiwFP!000000F{tWPs1<_#lMTM6bXhlp&SF{goLyr4j_cG#2aC4TIJ8A zQsuj2k1=R>A}9O3_p=-i;8cSk;K3X8XqHd6Rfl>81-xys2l4Vsm2M4c)N` zxj7n?LYBxmE2RopMW1vWcIbm`ogmRV<@*9%YJwPLa%OyYz$YNJ)@=^h2w0`{8vSIg z`%+%cuH~;)fPS~N`4f?KtG6_B6k4%Z*-r{A=JTt?DfS$q*`xpDXlGQ-l+66-6P@z$ z&ej;{)WFU81@Or~mdb5Vw#gZ`1_+T%8cPs5^Ice{>tyJRdy8&^N3G-xE19RM7=2(S Ne*x-txD)XK004@8eX{@n literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/exac/1.0.ht/globals/parts/.part-0.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/exac/1.0.ht/globals/parts/.part-0.crc new file mode 100644 index 0000000000000000000000000000000000000000..1fdaa2e8d47c8a8a7300e42b27a1ba34c3a388da GIT binary patch literal 12 TcmYc;N@ieSU}9J_QCJQD5&r_8 literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/exac/1.0.ht/globals/parts/part-0 b/v03_pipeline/var/test/reference_datasets/GRCh37/exac/1.0.ht/globals/parts/part-0 new file mode 100644 index 0000000000000000000000000000000000000000..b88778abc7fea54a002d3015eda3a26790756006 GIT binary patch literal 40 ncmWe;U|?VaVvVi(e-+pa85kIu4fPBdc!5G702XFoWMBXQWp@Qc literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/exac/1.0.ht/index/part-0-dc3793f5-157b-42ff-8a87-4e367441c4b7.idx/.index.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/exac/1.0.ht/index/part-0-dc3793f5-157b-42ff-8a87-4e367441c4b7.idx/.index.crc new file mode 100644 index 0000000000000000000000000000000000000000..231013f32bdc59e772defc9dbc38856b3208e4a7 GIT binary patch literal 12 TcmYc;N@ieSU}Csv`l=fM6T$TYiwFP!000000F6;wOT#b}{x5k_sKd6nOWsOlD&EEv5ymK4cD6;+q$H`M 
zl>T?qtXVgodZAn5ypEET@Ru`7)Mvb(HO$bO|+U! zCMZA(HlRRI(3t|B6~yeUxJRwTkd=)$RRNZHZ;N`%<%)3pW>iT{-96HlOl$)XWv)Hf zG{HAic)=wtbwD@=4tg0Vx%Udk;HUPyxSOZDcy#G?I8lmJi$m@*NT%8Bb@rTm;whKJ z#F0h%YnmnNFZ>|7;vURNA@frvXTEyrO-*Bar7|z10AV!3H9EBEu7c!k`}2?6)Jxt9 zpl*W4v3Y%CQMv!W&xx(6Aj&XfN10Jsc7NysW}ZqN^Eb@v7h$+;gOn`58sIGA2zI7( k)tYI;1U0;RgbLOO+x0+W_vig?_0HP-0`JIuu5$ta01kenEdT%j literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/exac/1.0.ht/rows/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/exac/1.0.ht/rows/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..a71b9e605029f4673dcdf35f08b7182784265593 GIT binary patch literal 16 XcmYc;N@ieSU}A9pkds}pWR4I3BvhX0GNHZAE0354bj7;H#gi=djSRVXs?3Ahao za?DhQ^51uyKu7|UN~PYE3ltxpkN347DcKQ0S3o>65)58|O@4iwt|4d=gZR&|CEg>6 z;$V~Dnnn>skdPT}1u=s_ClYUW=y1PX1M3$Xm<2k7J6{Gpn!r@aO`vvJJFgrGVxjoL z!z?bS+#2gR;flusBO{^gW@_w5dxNf&bo;nZoEhDrKfW9P>^^vdVH8u$#Zc+F?|0dg zmv_;tAE3&scvab3@nU&fDonMNKYJ?`6F2(*)Avb#X}g~bk-0`p4kN8B0yL5%2fZ*1lRo4Jd6&p3`Yskn5reludcpk={gz#4KwIoMPeDdkjq5*oW10n zwH*V2PjHc6UHaP>nTfnU4qQwsCQ`x2MWkPyU@omX@rgoXVIc&E1O zrt6k3$Ml?){}}m*>d@zs=_!SQseT~eZ(uc=f+u3!529iSIx!(@T&UfW^Iv|8@m zbedskc4)^n9kAQ3<5+<+ce8JVrwJ7c=*1hsbp4Jb3Oi_H`4<|abv2Iw literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/exac/1.0.ht/rows/parts/.part-0-dc3793f5-157b-42ff-8a87-4e367441c4b7.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/exac/1.0.ht/rows/parts/.part-0-dc3793f5-157b-42ff-8a87-4e367441c4b7.crc new file mode 100644 index 0000000000000000000000000000000000000000..0e6bf3ecb7e19b0c6f48b2c00f62c12d8d937919 GIT binary patch literal 12 TcmYc;N@ieSU}8|;d9@P&5_$u9 literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/exac/1.0.ht/rows/parts/part-0-dc3793f5-157b-42ff-8a87-4e367441c4b7 b/v03_pipeline/var/test/reference_datasets/GRCh37/exac/1.0.ht/rows/parts/part-0-dc3793f5-157b-42ff-8a87-4e367441c4b7 new file mode 100644 
index 0000000000000000000000000000000000000000..ccdb1b1487402c44062e1d04246477c5faa74ca1 GIT binary patch literal 119 zcmYddU|`q|#2Q=m|0-->%fwK^%=m|q(eUXsQziyRM{x5vmR3>fh+D&|0nslnvsco3{VG*(nJ`%TJ zQ`1dmgj)1kJ$58vG>KQlpm{L)rW9SxdOnyK zow%TCBV@(D(4rA^%mbeWmGRjZT~xUoFhrfKq`G@~(OyH`6{9!+>&NM)!P5F9EtVuH zjZFJkih0wUpL0QUB2}kS>&OLz>~zC3ZcwAP?Sr2h*5N#}3E^D&f(c~e99AgDDRN~_JRe%8tc1`xt zT&?P}TuX8;#~q-!0mHpYaIf*3OK&c_x#Z?@TQg^e*{UJm*5}!sOiKOk`}BHo@#THw z@JVI6mr3ejwM5bkFL;ecx^JK^w?h?eHdMPr@qeD^(uB(ryJ9;e(E+)O@q_Gegin_i zs*aq!v)v7mzh{4k=-y-GAb1}P`Y1gRqld(HM;aW)WtxI=zHRO%qfh5wM?018B@3Rh zj>#yK$unl^8O2r@^1oJvT%6Vu6aDo?g0ocy{L0Hz!am;e9( literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_coding_and_noncoding/1.0.ht/globals/parts/part-0 b/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_coding_and_noncoding/1.0.ht/globals/parts/part-0 new file mode 100644 index 0000000000000000000000000000000000000000..8318eee0b03c5ca223652e155b6ace139af17e19 GIT binary patch literal 2599 zcmV+?3fT1$3IG5IF#rH4wJ-f(2rwm30IF8p6F)HKQ3p|X!|qBc?M_&A6>s*0YeJkH z{9;D14YDPJY)L9@6ER8x`hFus2F`N;tuZZeC)h5hf(xWjg&9fPDM|q*0U!ZDvM}Jk za9t7D;A4bT0#~FG_`hD+>3xOxud6ZkFGUyYLQxP_FAw@6lAi_a0}6b82f(ygLu%|6 zA<3e}@|aRgUiGoDeqLdu{)t~%X?$ShoM6|#qKo9f$U?fp>g`P^Feiq}hA|@c!27a> zE!0=lqvVK-p9&rW{)JyzDE9XD_5cC@sJExB5u7+~?vs+LeAW0}OduYg;2YvC!D)Fm<0B=Mh z7Euze{||m4FZ8)bT5{Hh{xONNwtMi4I^G(s?auT;Nt6I-J>TfGZTexLI~~kieYV4A zk}`)+G&A>e_{0BS1DU#~x$EtsO*=8`&0E}Y<{H|u^L6)A2ZO?D?or%kO|0h5TikKx z8rre*b@x+ey@OI$8hj`d1F4upnK&q&y1Y63b9yO0+baiMK!@ znWk~ltN8y+HQ+a~8eA1l;{XWDj;Tz~vcLi0Y-&d4-T2Azn$xp@o+M!rMwnx&0iOj^ zwV(|Q4^4C#HCrjDhWsc@(6gPbnmfsk%vn1ac@TYiWXHGZz$if17ne~?X zHAdDu$J4JucI>B?v7Z`8)UQQ;bewH;oQXht4``iMq_If3XMWz&=_RqR_eY+pfKq2Y2M52ZYJQ{bEfWT?s~gI-eRtURLr9;f47@f zd~@$o@)mcTxrTP^eBESnIINKL^6IkCwZ)~Zu;rB$B9p=jUR+&WTuLrw7gsWfOa{A@ zQHU8d6qtH><^N47NF%wrd>vhw>JQ$Tgc!kRme9bFu=s>m&@ zV{<>#cAlzbvtw@EoyCqh>N-mtoy}(lF*Y|XZ#Y6YX5}X4dCdhoGC5{${F-4M&BRI^ 
zH*TjLjydx%tuin&2`zW^i=)ya@a;Jtw#ESw4cnIbmq_RrUFIlFE8h zb)2hsJH<7QYRY;)b)2Di>%=uSX1#^HC1$;g9cL!qE^&>ARzaX}{X)e&j+!ve*S#~% z-HzU_UZ(ChXz;u}gUNcIiTV{zn!NqgX6Upb({YxwMhvvqY2CDCcsr-+S2h5C9plCL zD-tT^SD?^2Uzf`?_dC7aE}6Qk*x|iBPyL!C>+J;US1V8E_ER^J`c=!6x{foJC`zDx zOG~9?!rLrVze=(Cb;;AOM}DPZZuKN|zHWhO?gaF94P@%3YC7=t3}tNkDkr9=7a+CN zaRvZ2g!TdLR9ob2fT)j}dZO{4iCcF!iNCf{F(1Oi`MNWluM0QL-H6_9T&C_II^^CS zg1V?uw1!$DA|j%oAV~`8q6-j)LrK<3N)Hl1Q4|M)AP0gJ!x)88L_{QznUP3H#5t63 zh7SG&ln=ybT*6TBwq8f`6|%3%aRW_bx;pa3$W1uhf>f^lVn8l9CLt-zH50GOk35ZC zEe60rH#r(_cyONFj8B#to%E$9{Y)bNBA6W*AMsAljCsFo${><9kPOT>)zlP&Js5)@ z=P-^~1OO8+3=eF02@wb6E}F#TlIP0Vqy7>#ahB{e-`n|VwX7(pMZDFa$vM=aVh`a|blt)?%m!@7UMQG_@T z;t4&#=Ybdv;2eNio?H-_Pm4*gEto7@!%8a9oL|JQME1@SyITQ(`?jGq;0xoW5kqWe27kDZ#Co%wE5 zHSjW%7}b~E76Ww#(6r~})^4LJo!L_}3a*>MlNqiR$@k~I z&Z#sz*>!VOA7hRA*41$Ikj=+yM}xaxT$A+!V6Vk^rQT-hg=u?=KU%h1f zw@V;TTjE|-bSgVvEb&DMt|WvKvRJ5OHc6%Q=n6^_r<~N(eih@MNet+Tu#gL-*2}s4 zNSfb?$&PlD^@%b#Gi@{ye}sh+b7zt!?+h=F{FisQ4RnCtu($U%)?*mGU0f#9_Q%rs{I75?Cxn9<7>g zx#~jLCq;3vh2FC3Tqfh9=i#e^E{|=|c}WxjYJ;a*xg2-Ze7MuY;Ix)nxMd z=dk*FdK3xbz{T-Ts4UDnv)3|uxv@9Gb6^x6js$fO$Xu1}l8l`BJt zMk*0QZZdZyWUU%>-m${9@@0e}z?BE!1+#u)`L;Z11aS2@-kvoH;WP&%C<9S>ViB#U zc;kE`Ni2z2^+k+o29YHDAbwRTk|H$eR|$vr#kDsb!|;_*M7+f)XKL^e_Er4gutjO2=LzLXa&!}k+>JMRm~9Gs8%#r$xKP96K^hrS-%}|@^=Qvx^K$b&@-;@(f*APX- zYdfewn&6A!i@2(CnZED}9sw3aLj3s!If|s2f<;O+f)bE1x@uAbDhDNf(1~O0+9YwZ zJaZ;4gV~c3Lo92_@tDE)OTs+{XO|e3k?7O0000004TLD J{U87V008tT=Xn4C literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_data/test_hgmd_37.ht/index/part-0-595a2be1-bb68-41eb-8367-dc7333299edc.idx/.index.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_coding_and_noncoding/1.0.ht/index/part-0-690f60f1-5897-4a95-9d74-fce92d3e5de7.idx/.index.crc similarity index 100% rename from v03_pipeline/var/test/reference_data/test_hgmd_37.ht/index/part-0-595a2be1-bb68-41eb-8367-dc7333299edc.idx/.index.crc rename to 
v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_coding_and_noncoding/1.0.ht/index/part-0-690f60f1-5897-4a95-9d74-fce92d3e5de7.idx/.index.crc diff --git a/v03_pipeline/var/test/reference_data/test_hgmd_37.ht/index/part-0-595a2be1-bb68-41eb-8367-dc7333299edc.idx/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_coding_and_noncoding/1.0.ht/index/part-0-690f60f1-5897-4a95-9d74-fce92d3e5de7.idx/.metadata.json.gz.crc similarity index 100% rename from v03_pipeline/var/test/reference_data/test_hgmd_37.ht/index/part-0-595a2be1-bb68-41eb-8367-dc7333299edc.idx/.metadata.json.gz.crc rename to v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_coding_and_noncoding/1.0.ht/index/part-0-690f60f1-5897-4a95-9d74-fce92d3e5de7.idx/.metadata.json.gz.crc diff --git a/v03_pipeline/var/test/reference_data/test_hgmd_37.ht/index/part-0-595a2be1-bb68-41eb-8367-dc7333299edc.idx/index b/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_coding_and_noncoding/1.0.ht/index/part-0-690f60f1-5897-4a95-9d74-fce92d3e5de7.idx/index similarity index 100% rename from v03_pipeline/var/test/reference_data/test_hgmd_37.ht/index/part-0-595a2be1-bb68-41eb-8367-dc7333299edc.idx/index rename to v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_coding_and_noncoding/1.0.ht/index/part-0-690f60f1-5897-4a95-9d74-fce92d3e5de7.idx/index diff --git a/v03_pipeline/var/test/reference_data/test_hgmd_37.ht/index/part-0-595a2be1-bb68-41eb-8367-dc7333299edc.idx/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_coding_and_noncoding/1.0.ht/index/part-0-690f60f1-5897-4a95-9d74-fce92d3e5de7.idx/metadata.json.gz similarity index 100% rename from v03_pipeline/var/test/reference_data/test_hgmd_37.ht/index/part-0-595a2be1-bb68-41eb-8367-dc7333299edc.idx/metadata.json.gz rename to v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_coding_and_noncoding/1.0.ht/index/part-0-690f60f1-5897-4a95-9d74-fce92d3e5de7.idx/metadata.json.gz diff --git 
a/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_coding_and_noncoding/1.0.ht/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_coding_and_noncoding/1.0.ht/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..44bf874737b8e9e5109f18ec12e571b8ffa2b6a4 GIT binary patch literal 604 zcmV-i0;ByOiwFP!000000F6|^Zrd;n{TH7$Kr7m&NCw}w1Ra8HSegvG1cpG%bfQ(3 z97#?a1pfEYj_piAf#zU9Bp;t>@`=YNmkRkFt&>Ja@pv-2|IDH-NOg8bw>S4Uccam# z2Qj|Qvn;)v+(E=5i#b}<0v)%gcnN+>;Io$?9#W$BhY~6ByD^R%Wl~V_%G*l#qs{x% z9;^iI9ZwH!LOheguPG??hRwAf-6t$Z@RdbgU`ALMKth@mMHkW>l&4RjLKnAOY?sB4 z%77m~uo6;U!0TX3og4o1X$ksyRTZIo^=lbA<~p@V<><}SV4jD;B|^3^?oFo0RcJrW zE(-|l+2T}roiLBibE!SrJt!A{u@(;Nd1tSG7*V+o-yvYY8rLwD;6;g92jK#i{b(0}=MsmGA zr)Igu9f{6E_eU4HYyK`dfg0{t-n8^B{g$RAp}hlw4ZefOA-{qnla738)ed*i2t{l5 zjE8uw8lAB2RV2)sm%bToBo1`c<1R)jOcu^khn~8oB_IsxMI*iKC_YA~_un2h6wO{( z{+UgC$f`uqu9jh+^!S>*=H>dl7EDPcSTCC%aAB&hhK^v-ho)`hlB~zCZphwpg6EQ+bWSd*ao{B|X_sr5f_AtD@$1pojxej}Rz literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_coding_and_noncoding/1.0.ht/rows/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_coding_and_noncoding/1.0.ht/rows/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..f14ed6e1a3cd463b531520fc5a7fbd2b71d62c13 GIT binary patch literal 16 XcmYc;N@ieSU}ET+esE(;*}X>qDCh== literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_coding_and_noncoding/1.0.ht/rows/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_coding_and_noncoding/1.0.ht/rows/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..4745a0f5d6f0711b8851274cae6b391f56f1fb9b GIT binary patch literal 596 zcmV-a0;~NWiwFP!000000NqtlZ`v>r{x5#ow4@6Gl9D%I3?Wt1RL~w)A>_sv@S57l zF;f}xfA2YgkOU?{rM{Gh2%qo1`##?%-Hr(Q3?w2G!Qchh{jYD+HN-w~$l?jMBzhpJ zj5iscX_`U`8JXi%EE&W$kwm)#!fd++HY_hN3v38)zKjMmgQ-@VSnqPUs2Yi-(0mzT 
z78g`)owdxk=2Bo}B6Qm~&c1at>MKRJ5BtQqgEY8UBAApZP~iwE8X;qO{N3}YF}Sne zPviUHsvUkVI}{Bt5kX4)Q}`C1rp-o)D~-^k{=q z{tpv^Znqtrmi^8&%S*4ULrp-Ro0Z1?Pz1N*Z)lhtV!0a>q(7-%Vf^OiYo2bS_zFor z^NWm7oqEDmrWc&O6k`rL4*D&_LotGyXYeL^FB2I@CP@amq(Oy{n(k{W9gLZubJI~r zMMY_&1N&>OSF$?c)L`>L+^0ban*Ko3yvio@@vLBmE9EGlzhES&q?`@kdNqGe6%+%m;4QmuTw`c#DcG>(%-q=xwwYy z!S&pr=OuxcwA}7(FLXP!*L8a=?6^q`J)gCq%OK2G^q+ewmN1YT!EF7GWEv5yuwtR( zpTfb|wS`;`_9aov^McSE7d|$hVZyB69}b6?dtOiwFP!000000F{tWPs1<_#lMTM6bXhlp&SF{goLyr4j_cG#2aC4TIJ8A zQsuj2k1=R>A}9O3_p=-i;8cSk;K3X8XqHd6Rfl>81-xys2l4Vsm2M4c)N` zxj7n?LYBxmE2RopMW1vWcIbm`ogmRV<@*9%YJwPLa%OyYz$YNJ)@=^h2w0`{8vSIg z`%+%cuH~;)fPS~N`4f?KtG6_B6k4%Z*-r{A=JTt?DfS$q*`xpDXlGQ-l+66-6P@z$ z&ej;{)WFU81@Or~mdb5Vw#gZ`1_+T%8cPs5^Ice{>tyJRdy8&^N3G-xE19RM7=2(S Ne*x-txD)XK004@8eX{@n literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_exomes/1.0.ht/globals/parts/.part-0.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_exomes/1.0.ht/globals/parts/.part-0.crc new file mode 100644 index 0000000000000000000000000000000000000000..1fdaa2e8d47c8a8a7300e42b27a1ba34c3a388da GIT binary patch literal 12 TcmYc;N@ieSU}9J_QCJQD5&r_8 literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_exomes/1.0.ht/globals/parts/part-0 b/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_exomes/1.0.ht/globals/parts/part-0 new file mode 100644 index 0000000000000000000000000000000000000000..b88778abc7fea54a002d3015eda3a26790756006 GIT binary patch literal 40 ncmWe;U|?VaVvVi(e-+pa85kIu4fPBdc!5G702XFoWMBXQWp@Qc literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_exomes/1.0.ht/index/part-0-5419bf36-548c-4524-b44c-cd77ed3f191e.idx/.index.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_exomes/1.0.ht/index/part-0-5419bf36-548c-4524-b44c-cd77ed3f191e.idx/.index.crc new file mode 100644 index 0000000000000000000000000000000000000000..1d4cef49544aa9f0c5eb07095e6d48a2259f3ac4 GIT 
binary patch literal 12 TcmYc;N@ieSU}8A3ob@UI6J!H` literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_exomes/1.0.ht/index/part-0-5419bf36-548c-4524-b44c-cd77ed3f191e.idx/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_exomes/1.0.ht/index/part-0-5419bf36-548c-4524-b44c-cd77ed3f191e.idx/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..f5eb925f2e1cae6dcc1aa27dc8f82bbd5be66107 GIT binary patch literal 12 TcmYc;N@ieSU}A8tHVOv-5V!(@ literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_exomes/1.0.ht/index/part-0-5419bf36-548c-4524-b44c-cd77ed3f191e.idx/index b/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_exomes/1.0.ht/index/part-0-5419bf36-548c-4524-b44c-cd77ed3f191e.idx/index new file mode 100644 index 0000000000000000000000000000000000000000..17abb0bb15f4cdc3324388d8ff19d3a4f092ddde GIT binary patch literal 138 zcmYdhU|?7b#2Q=m|0*or%EVB@%)r3Sz-ah%nkf?lqa%}}BZD))yIhDaqq7a8drU|c zlcO`EV+FILGmvU$WN~z!&g#zSxPp;~L4jk1w{g$RmftF@0z8*_)_!AnaDsytkKn&*g@4vec?fb=^JEUYXcq5M@?4 zu4(WWRAj|vS{9gc4jhU#R`Tc+mLb#jycsW|Lom2?Iua^HtJNuYjl-D4Tv8qQID8|| z!WL0iOl0PuIGNApui@l34eX${b#Y`Y$Cs)&$PE(oXQh?9}A~iZSX-h$~ zw)w@!Q{*Je98x>K^U$2WF{nKL-)F_flvAadHiJx3S$2Qu0_Kj2Eb}+a=~te&ZiAG} t!5H8w;tY1CYt@z&nu%2N>IEv;B4U;UjoF^_+r?XJ^8?e=`}uJK005_^rSt#* literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_exomes/1.0.ht/rows/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_exomes/1.0.ht/rows/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..96657c79c5793517b4857a6f97fd44b58c5eb6ec GIT binary patch literal 16 XcmYc;N@ieSU}6yZxk&KM9LE0uByk0S literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_exomes/1.0.ht/rows/metadata.json.gz 
b/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_exomes/1.0.ht/rows/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..782f0d8ffad2b065efedbc30710486eb314fbb2c GIT binary patch literal 630 zcmV-+0*U<}iwFP!000000Ns{dZ`v>vhX0GNHZ637@YUP}25lu)5LDB&3PmP90k6S9 zj+shP{`-y-$QM*9m3mVyP;z`e-q(Jlcuxdf1F^}SVDJoV`s>GR173p|#C?Jtu^&kg zdfN!!X%IjF5t-wX7c%g4BC+?!4iCExuzt3HS)fC>@@>$g5zLg_dTO7vvx^Zg6pAlx z%;JK|ov{fcu6QUgawnA4h>gQ&d*Db(caMj}x#0}$!7vCZCK=AXou(U3|M+wpuCw6K zos1_xo!{fVgTJ!h2H4so^E>lW7hP#9|?{|-h<7DoF2%gwaEvN~TgiK#+NTb?xWx$L-45n zQfxC-Cz>j1J_&saq@ck)Li0YFl6exud43XHU7rWwRu21=!Zh-+#HBMNc%fwH6{AQ!19w3%7#l`aOvhcF@Ms+XOzO1mjc_ zwH)y!wpp*&%{aep;P3}ba0nhZ0m0tX!|HY#R$cEA7hbSs2yikU{XrHR=w><+YsFM4 Qse8upe--uh&=Uy&03*jPssI20 literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_exomes/1.0.ht/rows/parts/.part-0-5419bf36-548c-4524-b44c-cd77ed3f191e.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_exomes/1.0.ht/rows/parts/.part-0-5419bf36-548c-4524-b44c-cd77ed3f191e.crc new file mode 100644 index 0000000000000000000000000000000000000000..d293389ed168ed4a87a14e18b26b56363ca9b8da GIT binary patch literal 12 TcmYc;N@ieSU}8u)dr%($6DR{* literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_exomes/1.0.ht/rows/parts/part-0-5419bf36-548c-4524-b44c-cd77ed3f191e b/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_exomes/1.0.ht/rows/parts/part-0-5419bf36-548c-4524-b44c-cd77ed3f191e new file mode 100644 index 0000000000000000000000000000000000000000..14036f84e1c94b573c62ab10d9c7e50c4f532951 GIT binary patch literal 123 zcmYdeU|`q{#2Q=m|0?W0%fzsSneh)Jqv6wOrc4Zsj!ce@&h8tOiwFP!000000F{tWPs1<_#lMTM6bXhlp&SF{goLyr4j_cG#2aC4TIJ8A zQsuj2k1=R>A}9O3_p=-i;8cSk;K3X8XqHd6Rfl>81-xys2l4Vsm2M4c)N` zxj7n?LYBxmE2RopMW1vWcIbm`ogmRV<@*9%YJwPLa%OyYz$YNJ)@=^h2w0`{8vSIg z`%+%cuH~;)fPS~N`4f?KtG6_B6k4%Z*-r{A=JTt?DfS$q*`xpDXlGQ-l+66-6P@z$ 
z&ej;{)WFU81@Or~mdb5Vw#gZ`1_+T%8cPs5^Ice{>tyJRdy8&^N3G-xE19RM7=2(S Ne*x-txD)XK004@8eX{@n literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_genomes/1.0.ht/globals/parts/.part-0.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_genomes/1.0.ht/globals/parts/.part-0.crc new file mode 100644 index 0000000000000000000000000000000000000000..1fdaa2e8d47c8a8a7300e42b27a1ba34c3a388da GIT binary patch literal 12 TcmYc;N@ieSU}9J_QCJQD5&r_8 literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_genomes/1.0.ht/globals/parts/part-0 b/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_genomes/1.0.ht/globals/parts/part-0 new file mode 100644 index 0000000000000000000000000000000000000000..b88778abc7fea54a002d3015eda3a26790756006 GIT binary patch literal 40 ncmWe;U|?VaVvVi(e-+pa85kIu4fPBdc!5G702XFoWMBXQWp@Qc literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_genomes/1.0.ht/index/part-0-ef7f1a2e-5a3b-443d-992c-32cbd5d9ceb8.idx/.index.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_genomes/1.0.ht/index/part-0-ef7f1a2e-5a3b-443d-992c-32cbd5d9ceb8.idx/.index.crc new file mode 100644 index 0000000000000000000000000000000000000000..3dfccd27db1e08d973ae1f380ee91f0c8053ffd4 GIT binary patch literal 12 TcmYc;N@ieSU}BiN=IVL?6j1~S literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_genomes/1.0.ht/index/part-0-ef7f1a2e-5a3b-443d-992c-32cbd5d9ceb8.idx/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_genomes/1.0.ht/index/part-0-ef7f1a2e-5a3b-443d-992c-32cbd5d9ceb8.idx/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..f5eb925f2e1cae6dcc1aa27dc8f82bbd5be66107 GIT binary patch literal 12 TcmYc;N@ieSU}A8tHVOv-5V!(@ literal 0 HcmV?d00001 diff --git 
a/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_genomes/1.0.ht/index/part-0-ef7f1a2e-5a3b-443d-992c-32cbd5d9ceb8.idx/index b/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_genomes/1.0.ht/index/part-0-ef7f1a2e-5a3b-443d-992c-32cbd5d9ceb8.idx/index new file mode 100644 index 0000000000000000000000000000000000000000..e543726c14fb614af82ee1f4d053e0ec1fdaf6a9 GIT binary patch literal 138 zcmYdhU|?7T#2Q=m|0*on%EVB@%)r3Sz-ah%nkf?lqa%}}BZD))yIhDaqq7a8yH`jA zlcO`EV;ZxgGmt7}add8FWOZkBoW{t*pujP!CB02?vc&u)3{8o%7G9N0INSh)&9lAb UfmR5?tPz^Xz`)1|naDsytkKn&*g@4vec?fb=^JEUYXcq5M@?4 zu4(WWRAj|vS{9gc4jhU#R`Tc+mLb#jycsW|Lom2?Iua^HtJNuYjl-D4Tv8qQID8|| z!WL0iOl0PuIGNApui@l34eX${b#Y`Y$Cs)&$PE(oXQh?9}A~iZSX-h$~ zw)w@!Q{*Je98x>K^U$2WF{nKL-)F_flvAadHiJx3S$2Qu0_Kj2Eb}+a=~te&ZiAG} t!5H8w;tY1CYt@z&nu%2N>IEv;B4U;UjoF^_+r?XJ^8?e=`}uJK005_^rSt#* literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_genomes/1.0.ht/rows/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_genomes/1.0.ht/rows/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..73b9be35b6abece7829d3f4cf096dd9e1c7dff3e GIT binary patch literal 16 XcmYc;N@ieSU}9*?$q$Ne?`s7BB6bAl literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_genomes/1.0.ht/rows/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_genomes/1.0.ht/rows/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..d6d0a81ca6f18cb88a8712243d6a95bc2b8bb757 GIT binary patch literal 630 zcmV-+0*U<}iwFP!000000Ns{dZ`v>vhX0GNHZ63J@R_^7phaQ@K{ZXQP-Nm0@ERQC zn5h)yzwbDKBm^pzO1&u;C^G zZyVt|4FU)tB6D2wLI$2rB=-K;;bFG{*2@+!3v>uqzV*8_f|-(APwkU-b}{0GLh+@I zSzJ)LGd5ww6%PeQhC*3pY#c`0eMd^Vdpso04QF8Y2SG?N$#CxNG~IA|$EUk+odpN( zWIXxl{C3Av_jWYyI-|UHfUCt`cd+J%!dM&mcQA5dVWj`R+)jHdtMxULm@33XE|!VF zg1$^%uI_blC2c3ta%GFH?}IFz#Hin`neA@AWDsw~16D}sqK7Axc#^cx!5M$Sgl3~* 
zHP6fbM2aP)SJvTv09}_WWBo%DT(ej5Fg(T*6eR=`s#X}kx%r-?>u3PfPk?(9g+=6i zE+ge~_M9@-tQ!b>gfHpwodAm~_D2yM8=>zRUMSgRMOQ+Y z**^jA+;-V?+4A+6Ub6Cvkbu%4_wPLE2 Q)IH<)KR*uTkP`_20BS}mP5=M^ literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_genomes/1.0.ht/rows/parts/.part-0-ef7f1a2e-5a3b-443d-992c-32cbd5d9ceb8.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_genomes/1.0.ht/rows/parts/.part-0-ef7f1a2e-5a3b-443d-992c-32cbd5d9ceb8.crc new file mode 100644 index 0000000000000000000000000000000000000000..c0689358515006ac3a30558f8ebda24635a49eda GIT binary patch literal 12 TcmYc;N@ieSU}CV1*l7U(5hen` literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_genomes/1.0.ht/rows/parts/part-0-ef7f1a2e-5a3b-443d-992c-32cbd5d9ceb8 b/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_genomes/1.0.ht/rows/parts/part-0-ef7f1a2e-5a3b-443d-992c-32cbd5d9ceb8 new file mode 100644 index 0000000000000000000000000000000000000000..a0f3274abe66575c7efa8cb869707804c6d7aa7a GIT binary patch literal 100 zcmeZgU|?7Y#2Q=m|0=8$Wnyq)V*JC%X!vxRDH8*uBa@?}vwH|5BO{ZeGb6L3GmE1$ xt2?7312=<$fXP$CVz(|{)6_Y^?A5OX4#-ZBoWMFUU6vPU90-7oWME`q0039u78U>i literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_qc/1.0.ht/.README.txt.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_qc/1.0.ht/.README.txt.crc new file mode 100644 index 0000000000000000000000000000000000000000..04a6ed182e311a29cdcdd10be0383fe08992b688 GIT binary patch literal 12 TcmYc;N@ieSU}8`>yCE9@5;6lr literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_data/test_gnomad_coding_noncoding_crdq_1.ht/._SUCCESS.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_qc/1.0.ht/._SUCCESS.crc similarity index 100% rename from v03_pipeline/var/test/reference_data/test_gnomad_coding_noncoding_crdq_1.ht/._SUCCESS.crc rename to v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_qc/1.0.ht/._SUCCESS.crc diff --git 
a/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_qc/1.0.ht/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_qc/1.0.ht/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..3619124aff1abbe46f26b17ee6f80736241301c4 GIT binary patch literal 12 TcmYc;N@ieSU}6aK{;dZ95#0j? literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_data/test_clinvar_path_variants_crdq.ht/README.txt b/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_qc/1.0.ht/README.txt similarity index 78% rename from v03_pipeline/var/test/reference_data/test_clinvar_path_variants_crdq.ht/README.txt rename to v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_qc/1.0.ht/README.txt index 9aea8fa4b..97552d24a 100644 --- a/v03_pipeline/var/test/reference_data/test_clinvar_path_variants_crdq.ht/README.txt +++ b/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_qc/1.0.ht/README.txt @@ -1,3 +1,3 @@ This folder comprises a Hail (www.hail.is) native Table or MatrixTable. 
Written with version 0.2.133-4c60fddb171a - Created at 2024/11/02 13:12:12 \ No newline at end of file + Created at 2024/11/23 12:20:19 \ No newline at end of file diff --git a/v03_pipeline/var/test/reference_data/test_gnomad_coding_noncoding_crdq_1.ht/_SUCCESS b/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_qc/1.0.ht/_SUCCESS similarity index 100% rename from v03_pipeline/var/test/reference_data/test_gnomad_coding_noncoding_crdq_1.ht/_SUCCESS rename to v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_qc/1.0.ht/_SUCCESS diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_qc/1.0.ht/globals/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_qc/1.0.ht/globals/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..ed78b289266aa1790fc8fbdd2a2105fe8fccecdf GIT binary patch literal 12 TcmYc;N@ieSU}DHLIpGQb5tRbv literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_qc/1.0.ht/globals/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_qc/1.0.ht/globals/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..10338f5f259b1468c2dfeb6f31a4ec0cd962146e GIT binary patch literal 254 zcmVc#I}A-zhz$#rv`SK> zQuV*j?lv~cX*t>Nea}hUfYV2U1ePAj6RuunSxIID19(4B4axeAD^nmD(IIJm0?&0x z_;fLtB9Yj+OO+0kMPHfNoX7`TxP(RPw69wsPv2yZ;nuocM{KI?mVEEZMXDY~kLuSN zfcdaF{y|h;m;(>Ff;J*%_RPR^G8s>AW3M4%7yW0BUrhSgPquRS6Q%ZXZgUE}Mv!KB z53V>v_)zY6P2ULmisvv^T-(NbYVX=R#@(Fe)2pDQ)j_sszS E04>mRLjV8( literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_qc/1.0.ht/globals/parts/.part-0.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_qc/1.0.ht/globals/parts/.part-0.crc new file mode 100644 index 0000000000000000000000000000000000000000..1fdaa2e8d47c8a8a7300e42b27a1ba34c3a388da GIT binary patch literal 12 TcmYc;N@ieSU}9J_QCJQD5&r_8 literal 0 HcmV?d00001 diff --git 
a/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_qc/1.0.ht/globals/parts/part-0 b/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_qc/1.0.ht/globals/parts/part-0 new file mode 100644 index 0000000000000000000000000000000000000000..b88778abc7fea54a002d3015eda3a26790756006 GIT binary patch literal 40 ncmWe;U|?VaVvVi(e-+pa85kIu4fPBdc!5G702XFoWMBXQWp@Qc literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_qc/1.0.ht/index/part-0-60dc0150-c0ed-4ee2-aa12-a4459d0ae33b.idx/.index.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_qc/1.0.ht/index/part-0-60dc0150-c0ed-4ee2-aa12-a4459d0ae33b.idx/.index.crc new file mode 100644 index 0000000000000000000000000000000000000000..748a41603f7d8f434943b1b117f36754a0ac1d5a GIT binary patch literal 12 TcmYc;N@ieSU}BIy6y^c|5h4Pz literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_qc/1.0.ht/index/part-0-60dc0150-c0ed-4ee2-aa12-a4459d0ae33b.idx/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_qc/1.0.ht/index/part-0-60dc0150-c0ed-4ee2-aa12-a4459d0ae33b.idx/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..7b9ae4ad7c263def2614a8404535d91bce0b3b5b GIT binary patch literal 12 TcmYc;N@ieSU}Bi|S|tf-nVwoj_?ukQyWv0BW5I A^#A|> literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_qc/1.0.ht/index/part-0-60dc0150-c0ed-4ee2-aa12-a4459d0ae33b.idx/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_qc/1.0.ht/index/part-0-60dc0150-c0ed-4ee2-aa12-a4459d0ae33b.idx/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..5f7a34128fb937f0389896b2ad710b51382ba266 GIT binary patch literal 185 zcmV;q07m~GiwFP!0000009B5`3c@fDME_+^3OQ6;s^%tw9uyQ6FXADs+a?$iNw#1p z{dc$CyetDV^LBb@jKv#6+dM z5-t?FntJQ&F3Q7rC%f}T=ZmRO)}qBI*CD8M=O7aw0|-#gc~ZnL{~@F2_Lw5bnE>OS n_&F)k9QbK=gZFf7nut6j&zVF_tvW)k^uzE490YGC&j0`bbSY6H literal 
0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_qc/1.0.ht/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_qc/1.0.ht/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..74c5ccad05fc9b71ca9fad9e97e106e5004ba35d GIT binary patch literal 473 zcmV;~0Ve(*iwFP!000000F6^!Z=)~}{4YG+Np97IDmn2+fVAk+d_a_!C_;`|8mo@Y z8FLki`0q8;IBKes?ge>gXJ*&daB&LCAo~llqJlf#Wa^H8A?JiCyMoTc{rG<5x_6Ic z@=)aYaWoxa2RRZ7kiZF4nFN-VSe-QaifW3&M{Tc~5_D(V6_-~gj+l9=k~L8mGo&ji ze{InlG~S*GQdbBw1`MhLlHz@!_Z@H1eSbV9!O5Ne9C)5DC2sbgMp{(S+_<7YwH`QU z0;|a*(ic{GsF#AWWXcOMjPL{F6m#W{56GXJ+8AhL>GeG{!xd~2%-_=bZ2bd9%Tl}E z;^EAj3Bdq!6t0aym|84ah6bFcR`ZNYCc4*4iI8C7m>(rTu7?Byio)VAi|BUy4*GV1 zm(5l3ZU1HT|hMmLo4Vzw`K!>KRR?iQk0Bt!EGbBb)?( literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_qc/1.0.ht/rows/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh37/gnomad_qc/1.0.ht/rows/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..8fbf7562a95dddb77d3aacb21bdfe68c0b44d013 GIT binary patch literal 813 zcmV+|1Je8-iwFP!000000Ns{LZ`wc*fd7k6t!h%9De;YmNs$NvjB;p%(Bd5@+c>+f zy;Pyd|Gl#|enS)@M2Z0inlCHPer#vM_} z-W#MgQMQ%d8>Ny%G?Yz^SgkA9`GW8#Yx|s~8@SB#7=NfJhJDN?gbRf0!&wq!Xk}7! zI!wePD|7rwJXc7(K+6_BWb!fJ_y81h4Fot&rns4y)n1hFNIN(4oL{w@sd?R7j_=I% za?u-4*VcGxX7$c)@q9`-A7)V&KamFU*(B0Jp8t;uc9$=^&Ak#7k=R%T$Z!SX^u8uwGSs%GB(q}xW{@lHgR#mB~RLv)$4_6&0?i8VU zA5Bp{ANX;8B3#$FQDI#gc9p_d^0CClIefrvNsm`_CWLAG1@NkEmzyrPd_JbfS~=0k zXCxOd?VbPp(H&&BQ;ZU7D!4~{Ib!OK7GHAJp;oV5)~Q`@bm}#`4z$(=Xx0d6;8(ld zxuSIft=1-b0sf7DYY)&DTTYe#UJE2fP>sc}%lIQE7_&C8mIPnTXfzvFUDX|M`WE|c rw{g{JD+k7HcT17gC=<2?xJEt6AN=gLo*{&uIq~nP=9NXKel2R_|RbKft&X3yyHuu5c|yN<&}rQ1L2vL4u9 j_DvFquzX_Qf9R`5Ouv8wlW!mcFVLkR0Cp_{BLf2fH2*@? 
literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/.README.txt.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/.README.txt.crc new file mode 100644 index 0000000000000000000000000000000000000000..b0c586c924118c9a31d342520800864de1046d2f GIT binary patch literal 12 TcmYc;N@ieSU}E^MuDu)p67B;t literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_data/test_hgmd_1.ht/._SUCCESS.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/._SUCCESS.crc similarity index 100% rename from v03_pipeline/var/test/reference_data/test_hgmd_1.ht/._SUCCESS.crc rename to v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/._SUCCESS.crc diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..a907128509ab678d5754c1aa0783881fa477d068 GIT binary patch literal 12 TcmYc;N@ieSU}A`SSbiP=6IcVl literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/README.txt b/v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/README.txt new file mode 100644 index 000000000..1cbf42ffc --- /dev/null +++ b/v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/README.txt @@ -0,0 +1,3 @@ +This folder comprises a Hail (www.hail.is) native Table or MatrixTable. 
+ Written with version 0.2.132-678e1f52b999 + Created at 2024/11/21 18:19:21 \ No newline at end of file diff --git a/v03_pipeline/var/test/reference_data/test_hgmd_1.ht/_SUCCESS b/v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/_SUCCESS similarity index 100% rename from v03_pipeline/var/test/reference_data/test_hgmd_1.ht/_SUCCESS rename to v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/_SUCCESS diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/globals/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/globals/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..83b98791e07aeb11c3a03fd3b50d4cecdd64d1f7 GIT binary patch literal 12 TcmYc;N@ieSU}9hhN(le}4~_ys literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/globals/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/globals/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..f4a48b8e45d113a8b50f6ecca6df61eac5356005 GIT binary patch literal 280 zcmV+z0q6c7iwFP!000000F{wTOT$1A$G^*5ji8kl*_+XL5`;!PSc+_Sr*&71FbG=bUU-r{ucpn0gqf6CQ1i z-3_!6r>FG(z}Up?7Tx3mPg#0ByG?(r1k{Hq^sh)-RBBI{QE08(Isl>!lNnB)29G$F0X(;=W{9T4$~@H%~lIOkV#l7q1F9htSX|W e2R~Z$TzJ+p&al!kO@`n+ce-z9+2_w20ssKw%!n@l literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/globals/parts/.part-0.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/globals/parts/.part-0.crc new file mode 100644 index 0000000000000000000000000000000000000000..9c24a2765f4fc635c83e90ddc19106eebbf9df08 GIT binary patch literal 12 TcmYc;N@ieSU}DgcFPskm5C#I2 literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/globals/parts/part-0 b/v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/globals/parts/part-0 new file mode 100644 index 
0000000000000000000000000000000000000000..50fd51ef86d0b018d8991e218e5ba65bfb03675d GIT binary patch literal 60 zcmY#qU|^5}VvVi(e-)%IGB7YQ8|oRbF}e6MyZG8Oxdbq~xCJn|1uzCN@B)>A09Yvl HBLf2fG~o&A literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/index/part-0-182502ba-0456-4d1b-a8ac-1cdd20cfa893.idx/.index.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/index/part-0-182502ba-0456-4d1b-a8ac-1cdd20cfa893.idx/.index.crc new file mode 100644 index 0000000000000000000000000000000000000000..13f00a251039a4ad147575decc1f8e9f0bf93c4b GIT binary patch literal 12 TcmYc;N@ieSU}BgRIr|a-68i(2 literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/index/part-0-182502ba-0456-4d1b-a8ac-1cdd20cfa893.idx/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/index/part-0-182502ba-0456-4d1b-a8ac-1cdd20cfa893.idx/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..d05ce69d6bfae819659d76653375508ea008edd0 GIT binary patch literal 12 TcmYc;N@ieSU}DhTev<_N5(fhM literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/index/part-0-182502ba-0456-4d1b-a8ac-1cdd20cfa893.idx/index b/v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/index/part-0-182502ba-0456-4d1b-a8ac-1cdd20cfa893.idx/index new file mode 100644 index 0000000000000000000000000000000000000000..0a9ded614e6f4256899eed06b9077182b70ce738 GIT binary patch literal 19 Ycmd;QU|?VZVvVi(e-)Sn85kHD02Jy1PXGV_ literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/index/part-0-182502ba-0456-4d1b-a8ac-1cdd20cfa893.idx/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/index/part-0-182502ba-0456-4d1b-a8ac-1cdd20cfa893.idx/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..f4cbe64c4ce5d27fb4cf7d8a003b3c9c0d16da70 GIT binary patch 
literal 182 zcmV;n07?HJiwFP!0000009B1K3&JoEh5yS<3K@!)s@X))K|xV*5r=EMHo=fcas@-_ ze>Zh=c^tg&zSl!zEZrzX;~mJVD$1pRI-pb9xOR(*eE?Y&t4e?-(`p7e6Y!#RO>Y7T z7Y?1`-g?zVc^GrDJ8$%SF%@VnS&DKU0;YQgv+*&20CdhXa-RH`jGfz4iXaOC#y#nC kLe?4h-MJ0k^Ra0n@ee`H(z;{HAN??V0XHd31I_>d0B>APAOHXW literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..a9d578a809ece22878537a4fae721097f7fb00cd GIT binary patch literal 317 zcmV-D0mA+tiwFP!000000F6=4Yr`NE{$D(8;7XgYw(&MP7-Kt(cI;A0Fy^I|Sq*Ad zLhyf|v9WD-+C_Zt`|&=X9jds1{0Fq*N+L2}q{$Ma1Lfk{MAJz!nWf3Ye6fUC^_Ub{ zmLZHZRG>jJFu4Yi*VG<7`P%gqOABvr+ZsIb*%#ejsDcWzvASWl8;0~HmwODdIhu$u zL5(3#TGMuW#&^!)Z)nL@DApKEg#ZC&hqXFIQ9#Ihf8M>USBG?QA9-UdOqtJkJ3P$G zc}ZTSO%vyQex{DKk|3=?q(kTE;VS8|f71Hvc@-&}8d%uMWfhU{d{>?RuM1=^s;Rcz z`h+=C&5&U*_=wOdAb-swe>>gvK3dh_`7m<0GzZO?RdQpwWLC6q&_aobUoSYn|ELeY P8KmwPh1}EAMgjl;-btCs literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/rows/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/rows/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..4ed1c5fa38ad139a332d32ceb79d42aac57c2e36 GIT binary patch literal 16 XcmYc;N@ieSU}6Y1w3?xlDaZr>9KHiN literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/rows/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh37/hgmd/1.0.ht/rows/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..f90d95be0d9d9528b91647b285aee723bab6b7c5 GIT binary patch literal 576 zcmV-G0>AwqiwFP!000000NqtlYuhjo{x5pknBB!r;<7|9rHB?Lt}Ct6iG zNXlpy{NH=BV>@;-SPFfqAB_F!?z{W$KG2>h`T!&$6UE^X`1Hr8*#=lhJhHfl9ZBwp z%-Gi8Ka~_Gftl4?X7>eM(t=o~tXMfr zD5`hfCNsi{Od&E+wjbu+p*8{^wWhnfDs=8q#=saCTxBy!jadku9Fkope-53rdboA( 
zH(#bV>;8wgb&0Y71`4%0rZ16(y&+$2(-d@P8n@$75XPhC$RlhmBtHrnVH@|z*$P@Z za3=m@Kr|foqtmos#_}Td!aNiJ^r4%Xy9Y%GxBrfYNre@5&>+82yGH!#>U|OJPzf|G zmi}K(INo|Hw6QqJ$8yclz(d2eR@}tmN%Dd&%DpU24cNL9s#Ip_A^WICp3l$usgZId zq&89|;8OF|U7hsQ>E@Lf(1KP}ensCr&rn4@4eGpw3db%|Az{&3_Nfs|>h}i2$RF@v>C>1pKVY1PJ(kk=x?kMXzo%5KV3chYclmo_EHXG@<-Lv{ zrH8R+H>0(KhtQ1PdA#D$ar=~`A zdn8DO3=wmdXToCC+R9uZ@;SU988Ch*l!;H_nM>aj-4B zj$7=87Nc;wYP%Vt{|@aVd+`vsit9DH(P$5Ob~(AuzNrM{n>Ol)Br}!VQp|~H#bBX7 zNidsEuV%k(FFvTf^`Aa^Gm3KT?D)_SwQ%8H8}zg;;LG_1@Y>$z!fr6DqE$?_-~*8q oOD43cFSL~=Dq6*!qQAoq literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/splice_ai/1.0.ht/index/part-0-592864e3-2b8f-4984-b6ac-79d57ab6be5e.idx/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/splice_ai/1.0.ht/index/part-0-592864e3-2b8f-4984-b6ac-79d57ab6be5e.idx/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..f5eb925f2e1cae6dcc1aa27dc8f82bbd5be66107 GIT binary patch literal 12 TcmYc;N@ieSU}A8tHVOv-5V!(@ literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/splice_ai/1.0.ht/index/part-0-592864e3-2b8f-4984-b6ac-79d57ab6be5e.idx/index b/v03_pipeline/var/test/reference_datasets/GRCh37/splice_ai/1.0.ht/index/part-0-592864e3-2b8f-4984-b6ac-79d57ab6be5e.idx/index new file mode 100644 index 0000000000000000000000000000000000000000..c05ecffd0b102cca641f706c77f24009cd497947 GIT binary patch literal 138 zcmYdhU|?7R#2Q=m|0*om%EVB@%)r3Sz-ah%nkf?lqa%}}BZITByIP1Pqq7gAdt68X zlcO`EV-q8@qcf16%Hrs}lGUBjaUUZOg9672Z{wbsNB^j>3GiIzS^JGanoWjHM($1w VlRVH0A(%Bn6B!s78G$?$TL6>>9@GE; literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/splice_ai/1.0.ht/index/part-0-592864e3-2b8f-4984-b6ac-79d57ab6be5e.idx/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh37/splice_ai/1.0.ht/index/part-0-592864e3-2b8f-4984-b6ac-79d57ab6be5e.idx/metadata.json.gz new file mode 100644 index 
0000000000000000000000000000000000000000..2d8bd5ec52a289cce7bb10a504c2361ae82d00be GIT binary patch literal 186 zcmV;r07d^FiwFP!0000009B5`3c@fDME_+^3OTg4RLxBUJt!zDUc^JJ+a?$iNw#1p z{dc$CyetDV^LBb@jKwR4XuJbiRavs2Mh3QEKr*nU0a0q|_5r?B1LEi=#JfgAP~N*kyAx_d1z9>%GuQT)bVn|C7-W60 zNwZpT1|?I{!dKlzl6zxlvpT_R@9`%zWF-`4nn&e#SmIY%w(f$A(hv+Vh&UH}(~zpT hbzCwhnipuGLL|%^eE5*$kAEGl{s(y^btq&4006!-ojw2n literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/splice_ai/1.0.ht/rows/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/splice_ai/1.0.ht/rows/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..ccd12753145d93a7cbff5eb44ea4c7b42a2febbd GIT binary patch literal 16 XcmYc;N@ieSU}AWum)tB8J~I#iBG&|x literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/splice_ai/1.0.ht/rows/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh37/splice_ai/1.0.ht/rows/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..1a0ea5cb8d8e60a3ff1f5ce3c27b00724f7c8756 GIT binary patch literal 601 zcmV-f0;c^RiwFP!000000Ns{RZ`v>r$G?l8HZ^G>fdF|c9i5P>X@Y7Gs}OSH3wU)M zW;;_E%6IQMfj|PAN~OM(2Na*r_rL$;C(MRO`UE5-W69td*yP8T=?Y?xIAs0^YZ5*X zk;a+EH5CE`XfngCIAsvqL=tXx74Fw7U^mqUW`Pai&c$d*HB61lVzVjAg@=(im4+`u z%;J)&wX;e!H$0UX8B61PxwDlHN7qWx^}{xC<}eV3&UKtB2uHUfrI_vLmEbX8i&Dey z3?x2t7H&(~w*G9}sx@?HzdwDQ+${ax=Sr~D!$d9@b-Dkm?Zu)jNkBD5bvGKez5Z~& z<&b#E1v^L?Vq1@>T1$#-aLWH;LeTB{!SS)5+OXp2ndi_V(5L3f+>-C8_v^(7cVNLQjD_%};^r zJmCW7jhB5)VHtT};z|t#_&dW+cXTF%mHirc$F`fMo0iYV^pur<82O0g;?UlB*Az3@ zPp4ibC${*CDs#*2jf|JHUAQf`JMj8J2YlCy`ia{a^gC`8(Aez_Shq)`Ac8Iw7x=FU zl?xcAnPm3-hG>Hotg-UC$A^+&RBdA~JA6slZnawNeE)X<#qTlUn&9guAla*G`2C*e nx9p6#j77!};6fh#K@}U>Vm6Z7j;S%q*39!4Fl?nFv&iCF&Ro|QyN573_aFMo z=pMr4=*;K{)Wg8S?C8wm=*;TQ2vTOt%)rf{AYie#u$b?{2kuX{R>|rzr$i4ms5`jd RGCa%+(g6fun-~}w7y!y8AF}`e literal 0 HcmV?d00001 diff --git 
a/v03_pipeline/var/test/reference_datasets/GRCh37/topmed/1.0.ht/.README.txt.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/topmed/1.0.ht/.README.txt.crc new file mode 100644 index 0000000000000000000000000000000000000000..3a1d6c82b7ca9ea0efc3447f1149c924e4e4816a GIT binary patch literal 12 TcmYc;N@ieSU}9*_*;NAo62Jps literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_data/test_interval_1.ht/._SUCCESS.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/topmed/1.0.ht/._SUCCESS.crc similarity index 100% rename from v03_pipeline/var/test/reference_data/test_interval_1.ht/._SUCCESS.crc rename to v03_pipeline/var/test/reference_datasets/GRCh37/topmed/1.0.ht/._SUCCESS.crc diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/topmed/1.0.ht/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/topmed/1.0.ht/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..8c7544dcbe2014bc58c5223caef314a5f09d4551 GIT binary patch literal 12 TcmYc;N@ieSU}8|cof`}Q5z+$t literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/topmed/1.0.ht/README.txt b/v03_pipeline/var/test/reference_datasets/GRCh37/topmed/1.0.ht/README.txt new file mode 100644 index 000000000..664e0a800 --- /dev/null +++ b/v03_pipeline/var/test/reference_datasets/GRCh37/topmed/1.0.ht/README.txt @@ -0,0 +1,3 @@ +This folder comprises a Hail (www.hail.is) native Table or MatrixTable. 
+ Written with version 0.2.132-678e1f52b999 + Created at 2024/11/21 18:15:58 \ No newline at end of file diff --git a/v03_pipeline/var/test/reference_data/test_interval_1.ht/_SUCCESS b/v03_pipeline/var/test/reference_datasets/GRCh37/topmed/1.0.ht/_SUCCESS similarity index 100% rename from v03_pipeline/var/test/reference_data/test_interval_1.ht/_SUCCESS rename to v03_pipeline/var/test/reference_datasets/GRCh37/topmed/1.0.ht/_SUCCESS diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/topmed/1.0.ht/globals/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/topmed/1.0.ht/globals/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..fb5ed3f93cbe103a7565c9830999566e73c689f4 GIT binary patch literal 12 TcmYc;N@ieSU}EU@{1F2H64L`? literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/topmed/1.0.ht/globals/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh37/topmed/1.0.ht/globals/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..c945547ba2d0854fe2a257585bae58f45b8fb39f GIT binary patch literal 263 zcmV+i0r>tOiwFP!000000F{tWPs1<_#lMTM6bXhlp&SF{goLyr4j_cG#2aC4TIJ8A zQsuj2k1=R>A}9O3_p=-i;8cSk;K3X8XqHd6Rfl>81-xys2l4Vsm2M4c)N` zxj7n?LYBxmE2RopMW1vWcIbm`ogmRV<@*9%YJwPLa%OyYz$YNJ)@=^h2w0`{8vSIg z`%+%cuH~;)fPS~N`4f?KtG6_B6k4%Z*-r{A=JTt?DfS$q*`xpDXlGQ-l+66-6P@z$ z&ej;{)WFU81@Or~mdb5Vw#gZ`1_+T%8cPs5^Ice{>tyJRdy8&^N3G-xE19RM7=2(S Ne*x-txD)XK004@8eX{@n literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/topmed/1.0.ht/globals/parts/.part-0.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/topmed/1.0.ht/globals/parts/.part-0.crc new file mode 100644 index 0000000000000000000000000000000000000000..1fdaa2e8d47c8a8a7300e42b27a1ba34c3a388da GIT binary patch literal 12 TcmYc;N@ieSU}9J_QCJQD5&r_8 literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/topmed/1.0.ht/globals/parts/part-0 
b/v03_pipeline/var/test/reference_datasets/GRCh37/topmed/1.0.ht/globals/parts/part-0 new file mode 100644 index 0000000000000000000000000000000000000000..b88778abc7fea54a002d3015eda3a26790756006 GIT binary patch literal 40 ncmWe;U|?VaVvVi(e-+pa85kIu4fPBdc!5G702XFoWMBXQWp@Qc literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/topmed/1.0.ht/index/part-0-c09ec7db-1671-4dc3-95d4-6426532e00f1.idx/.index.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/topmed/1.0.ht/index/part-0-c09ec7db-1671-4dc3-95d4-6426532e00f1.idx/.index.crc new file mode 100644 index 0000000000000000000000000000000000000000..93f7867efba3bfdb7fc4bb2ead48ee0801204618 GIT binary patch literal 12 TcmYc;N@ieSU}6YgcETJ05|slt literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/topmed/1.0.ht/index/part-0-c09ec7db-1671-4dc3-95d4-6426532e00f1.idx/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/topmed/1.0.ht/index/part-0-c09ec7db-1671-4dc3-95d4-6426532e00f1.idx/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..86fea99366eebf7228fe85122e048c8a693c65f0 GIT binary patch literal 12 TcmYc;N@ieSU}EU4kLU#e5^e)5 literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/topmed/1.0.ht/index/part-0-c09ec7db-1671-4dc3-95d4-6426532e00f1.idx/index b/v03_pipeline/var/test/reference_datasets/GRCh37/topmed/1.0.ht/index/part-0-c09ec7db-1671-4dc3-95d4-6426532e00f1.idx/index new file mode 100644 index 0000000000000000000000000000000000000000..eed268fe3b4c53a02cd7132f2a4149f1c41932d3 GIT binary patch literal 137 zcmYddU|?7X#2Q=m|0*n8%fyhv%)r3Sz-ah%nkf?lqa%}}BZD)ayG)1{qq7C0yK6`g zlcO`EV?48?Gmy$b!T+!VdP;@;86+dM z5-t?FntJQ&F3Q8Glihiv=ZmRO)}qBI*CD8M=O7aw0|-#gc~Zoc|B%shdrT2zo(nMU oiJy}q&57U5cJQ8#O%suKIwH-H*~ylBIX!$rO^^L%NyG zW(Xq@VG3s0|flWp%@BH$2i`x!hxrO=TkI z9R7xu6hg7aU@8O%Fl(&UF)9K>-umCmb1P)b_$A6kAr=25b$frXiz2NCtoqw4s7Ul4m! 
zO||9L52s|R8G0B3J|dL|=5JWk7fG`2gS4u_8{jJ93=XDi)s`E}C9|S^g%-Am_<@7- T8;Sm~%wToDp9xgHMFIc-;JcOj literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/topmed/1.0.ht/rows/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh37/topmed/1.0.ht/rows/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..8f4c9c0ade2cd5c07341658ed3edefa2d03ef504 GIT binary patch literal 16 XcmYc;N@ieSU}9kS&%-UqJ$nlP95VyF literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh37/topmed/1.0.ht/rows/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh37/topmed/1.0.ht/rows/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..2157672461830463d7b6b31cf3a8622bc43afdd6 GIT binary patch literal 600 zcmV-e0;l~SiwFP!000000Ns{dZ`v>vhX0GNHZAG$0in$uFxV`3@X z;)HEeoYOdl7*aCBC6;htCXs~uvckh|1AJI5U>2AVu6!8{XbMxUwoLEya$z&V5~;;9 z#4I7H+F6?<6GWsJ~S3}T; zVP;zUduZ*bsXO!k>HB23@_U~v(Lxgwgp{I_ z5w;RBZ=?*cuP2l}N?MrURQ$t)pxgC>(`~;~&GW4n?n942*X^B|`JoA62x)n^En_(_ z3gVwsuP}af^)*j7(HLlyWAQ3hnqWjirFt&-b3sDTu@K1=C&iTiDaf@sMmWnwU4Vys~_d^*jPiGk6P$61f9G literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/README.txt b/v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/README.txt new file mode 100644 index 000000000..4341aff5f --- /dev/null +++ b/v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/README.txt @@ -0,0 +1,3 @@ +This folder comprises a Hail (www.hail.is) native Table or MatrixTable. 
+ Written with version 0.2.132-678e1f52b999 + Created at 2024/11/21 17:40:02 \ No newline at end of file diff --git a/v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/_SUCCESS b/v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/_SUCCESS similarity index 100% rename from v03_pipeline/var/test/reference_data/test_interval_mito_1.ht/_SUCCESS rename to v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/_SUCCESS diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/globals/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/globals/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..28013b1bcd69bf9575b63521d37f3ee66ef63015 GIT binary patch literal 12 TcmYc;N@ieSU}89?u{sj~63+uY literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/globals/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/globals/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..1a57f3c95643e27773ec8de216fb631c498d0035 GIT binary patch literal 298 zcmV+_0oDE=iwFP!000000F_ZsOT#b}|1R%p1)Xe2ZdU6_5LWTv7?LG#U0j=%BpuY! 
z@9ukRoon@CPRZ|YlGlQj1%iNQr_rHazCNTCsu@J^zQ+c{YE3&;YE+DaXul0FO@;dY zM34#@BIY8ArN=b5T7~w2&YRK-0*#ffiO|-C7b8t2!>-KG8f8Kwk~`;Qvl{g4NP3Sh zaK?n`I&M7-t&=`F8k-GmfA;iYz3#kag!dhrdb>q8`GBV+zMkF2KU4&2XL5ZeiSttJ zDP|?betwbaGpKg7YMgM23rCE09L<_w*UYD literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/globals/parts/.part-0.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/globals/parts/.part-0.crc new file mode 100644 index 0000000000000000000000000000000000000000..4a2b0ad2139ae28c6b4f708159ad249403b8cfac GIT binary patch literal 12 TcmYc;N@ieSU}BiFX|n_X6NCd) literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/globals/parts/part-0 b/v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/globals/parts/part-0 new file mode 100644 index 0000000000000000000000000000000000000000..1abb6a203abca8fb8013f09bbe023fca790f33bc GIT binary patch literal 360 zcmV-u0hj($0RRBQ0ssIgwJ-f(!vQS{0HzW;BM_0>u#S||ri^40WhsqF=_6PSz@9L6 zty9eH4-oc19pjz^HUKXGEdXsfVp1S+79a{vzI<=tg=-^}k?XsxcL~QLVdmw9PB;ln zHyua6DjKB}L_H7Xc_@~OL!|)-tvw7V5M zwK4j#{$kqHeLaS5^_jj2>BKEL`g2pl41NUZ%aUiK!!-2ByiWS+n~vVumYD&x7;gBy z7*WRVtKrWw%AsbK&aKxLgBaWmz2SFhuL<+HRmY4Tn;Y^nG(TIdk&`~l=a!nz=s5ri zBp^^CQjmB9TM85m=CO23$!hFfanLRJkwHdS4?zdiNkCt&mFy4}x-?u$;0#bN8QLg^ zFMIR9goY3D4pe2GXm?pX3#NI2ap-T7iIWV3xd$i{h2`W;IROm-00000001bpFa00@ G0RRBz_?y1~ literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/index/part-0-a71ea1dc-61b1-4cba-985b-155a977bebff.idx/.index.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/index/part-0-a71ea1dc-61b1-4cba-985b-155a977bebff.idx/.index.crc new file mode 100644 index 0000000000000000000000000000000000000000..88704d58d3d431b333e17d3c6aa29acf6b0b171a GIT binary patch literal 12 TcmYc;N@ieSU}E4qoLT|^5kCUu literal 0 HcmV?d00001 diff --git 
a/v03_pipeline/var/test/reference_data/gnomad_qc_crdq.ht/index/part-0-fc4518f0-e0cb-4157-b60d-b6ab4c5f4a75.idx/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/index/part-0-a71ea1dc-61b1-4cba-985b-155a977bebff.idx/.metadata.json.gz.crc similarity index 100% rename from v03_pipeline/var/test/reference_data/gnomad_qc_crdq.ht/index/part-0-fc4518f0-e0cb-4157-b60d-b6ab4c5f4a75.idx/.metadata.json.gz.crc rename to v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/index/part-0-a71ea1dc-61b1-4cba-985b-155a977bebff.idx/.metadata.json.gz.crc diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/index/part-0-a71ea1dc-61b1-4cba-985b-155a977bebff.idx/index b/v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/index/part-0-a71ea1dc-61b1-4cba-985b-155a977bebff.idx/index new file mode 100644 index 0000000000000000000000000000000000000000..2cbc16fbb138cd4c174b13bf1615d50e2c7f9aac GIT binary patch literal 73 zcmY#jU|D6A literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_data/gnomad_qc_crdq.ht/index/part-0-fc4518f0-e0cb-4157-b60d-b6ab4c5f4a75.idx/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/index/part-0-a71ea1dc-61b1-4cba-985b-155a977bebff.idx/metadata.json.gz similarity index 100% rename from v03_pipeline/var/test/reference_data/gnomad_qc_crdq.ht/index/part-0-fc4518f0-e0cb-4157-b60d-b6ab4c5f4a75.idx/metadata.json.gz rename to v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/index/part-0-a71ea1dc-61b1-4cba-985b-155a977bebff.idx/metadata.json.gz diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/index/part-1-eeb8cbde-9d95-4ba8-bf3c-e7682fbf3168.idx/.index.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/index/part-1-eeb8cbde-9d95-4ba8-bf3c-e7682fbf3168.idx/.index.crc new file mode 100644 index 
0000000000000000000000000000000000000000..d981b2bc003abdf429edbb0cb53d7221950d904b GIT binary patch literal 12 TcmYc;N@ieSU}Cs8^M3;X6+r|s literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/index/part-1-eeb8cbde-9d95-4ba8-bf3c-e7682fbf3168.idx/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/index/part-1-eeb8cbde-9d95-4ba8-bf3c-e7682fbf3168.idx/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..cc8b04991a11eb0dee87d4dbad4ecd063a46d5f6 GIT binary patch literal 12 TcmYc;N@ieSU}C8G7`_|;6PN?0 literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/index/part-1-eeb8cbde-9d95-4ba8-bf3c-e7682fbf3168.idx/index b/v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/index/part-1-eeb8cbde-9d95-4ba8-bf3c-e7682fbf3168.idx/index new file mode 100644 index 0000000000000000000000000000000000000000..a167f2c9dc3d5866a297822bccb8211fdec35853 GIT binary patch literal 129 zcma!IU|^UC#2Q=m|0+z(WnxfaW?@?~UVV02`3W^m^Zkz)d~Ll_;6ot@mn zncZ_(fLv!r$7=U3#*mqe+zduP8yF8fFsNufec%Gab%y?gjE3yy*$MJMlZ9ZW3jyt6 KWCZe135mxfJlibWLvr z2^R`|Lxc5AALU`Xv)y^4SLIwNYtdqq>kw4BbC8LT5dxYr`NE{x3dl;L4h^UVIxJjIuqHcI>5uV$De^GaA&a zgpmI}wY8Uw!Cpkao5yo(QO*Tyzd;+WBqHO^f*Wc!%^}uo=jnwKP9`_ zY=$t>kb?%vz-%>$ZAIVgkrV9R0t4YQmoY>h6aYb`}OX1HQ$e> z_fglsBA|SB3Fw-><5-{9>=AJ?^^{7fNJnOSL(%f0)j%bWpq| zFex_muH@Fb_Vvnkb!OaqOusd59!!;$k|3=?q(#S3gsr5dAJOXbWgaE#3Rsx(d9+dA zIaD3~?+apQs;GNRcuwfXb-*%G^T71OqJA^P!^hM|X;t~wz)i>*JlSql88>cv%!=j> b8psfF1B1?endlE6G+4VIj literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/rows/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/rows/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..d0ca49c72e6c85ee4e376f083e2c22b4e105f8e6 GIT binary patch literal 16 
XcmYc;N@ieSU}Ct+DKEelV^;?N9D@U9 literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/rows/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/rows/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..4481918794e3fef528a503598efc8480f5852ec3 GIT binary patch literal 718 zcmV;<0x|s`iwFP!000000NqwwZ`v>v{x5#ow4__Ogpjww*o4$g6I6Rxg^**PfY;PP zj+x3({`-!TG~`ybRO?H5K*_m&=gW6wABm)QKzuTn9PWWHetur=fc1z)*0*pV{tby@ zwomY#MiE4ikQKUE%z>Fe;vXL>TpxD8r=}B{8tS{6O`M)LIk_ytwj$z{kO|Uui&bhtfiW({Ukq@E!@k>2`;};(rJk9G zkb&McGgI@!A%xkt#ticZTef}*BJ+*VesS?J3pdsXXqYY8vo^1)_E4xq<2Sg^mF*5J zT<`>6IyM76{%dV*YQ;4@^&0f&Zc2%&g{gt&ji?Wo zPSWT*isof%%K9mg-TKMox(-DI>&C}!t1!2`uHy0@3RtXoH=;8|SdL#4UVH53(9Mz0 zmuc51j|}n_m5Zdk@GQ(YII&Z&k`hyVN0t5}%q0i!J?6#!WYduY)Zxr_oxrgNETHz- z8wR#B9MbV<6hIJ$SUHP!Y=FRH0S9}`$3uG%P|ps+KC@xudc6=I9oNflP`?*cZeW53 zDmUGZB*BwZ8yfQa7Cz*9M~rN%9MhuEJMq1d({sn^=}v}lFEL_DP#^>(|1WifWYM0( zG6#k{L1~+r7RHiVPa*$S>*%}$biKr&USf*Q>*wP&+*7TUX_BV@0P{7XmH!F=055xD A@c;k- literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/rows/parts/.part-0-a71ea1dc-61b1-4cba-985b-155a977bebff.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/rows/parts/.part-0-a71ea1dc-61b1-4cba-985b-155a977bebff.crc new file mode 100644 index 0000000000000000000000000000000000000000..457206bd300fab3169b5233282cb22010ff92ee8 GIT binary patch literal 12 TcmYc;N@ieSU}E^I)DjB-6F&nx literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/rows/parts/.part-1-eeb8cbde-9d95-4ba8-bf3c-e7682fbf3168.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/rows/parts/.part-1-eeb8cbde-9d95-4ba8-bf3c-e7682fbf3168.crc new file mode 100644 index 0000000000000000000000000000000000000000..1050557d2751e00b4f428f4ab73dc755fdd70204 GIT binary patch literal 12 TcmYc;N@ieSU}AVzvvDr~6w3q? 
literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/rows/parts/part-0-a71ea1dc-61b1-4cba-985b-155a977bebff b/v03_pipeline/var/test/reference_datasets/GRCh38/clinvar/2024-11-11.ht/rows/parts/part-0-a71ea1dc-61b1-4cba-985b-155a977bebff new file mode 100644 index 0000000000000000000000000000000000000000..cca9ee963362b13e166e6afb58d1307492e16e70 GIT binary patch literal 52 zcmb1VU|C%8^T*&6zFs~~x z_eX+Mzz{JPX`(!20izDh9?;v;Ndm1?SdYNP!pk$wa||kwXs&z!Tvc=rncg9)deOU< zZOaq({e(d_wbpI>lOJj%^t;W?i6qHOy`z{j(VD@` zev)81ncPf&+unRI2kSq5bZ1b-(An~#M^p;Kqs_t7x&$|uSAy5}F;i{}Ng17Csu4aA pNwH)CXS%{Vokv5fI8gL?aKuv9Fw%0G3IRQP+AsDWPSZ64005>OhM@oe literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/dbnsfp/1.0.ht/globals/parts/.part-0.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/dbnsfp/1.0.ht/globals/parts/.part-0.crc new file mode 100644 index 0000000000000000000000000000000000000000..097026b06e614a288ce16114a9787e1bc10ba372 GIT binary patch literal 12 TcmYc;N@ieSU}AXoppOv%6x9Qe literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/dbnsfp/1.0.ht/globals/parts/part-0 b/v03_pipeline/var/test/reference_datasets/GRCh38/dbnsfp/1.0.ht/globals/parts/part-0 new file mode 100644 index 0000000000000000000000000000000000000000..0e97c1b5e71ea981a3644b664ad36601cf6412cb GIT binary patch literal 51 ycmb1RU|x7#$h?7y}r1fwCX~R>8o?zyJWVGY04Y literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/dbnsfp/1.0.ht/index/part-0-113d0935-f89b-4d20-9f25-225c16c2f941.idx/.index.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/dbnsfp/1.0.ht/index/part-0-113d0935-f89b-4d20-9f25-225c16c2f941.idx/.index.crc new file mode 100644 index 0000000000000000000000000000000000000000..88704d58d3d431b333e17d3c6aa29acf6b0b171a GIT binary patch literal 12 TcmYc;N@ieSU}E4qoLT|^5kCUu literal 0 HcmV?d00001 diff --git 
a/v03_pipeline/var/test/reference_datasets/GRCh38/dbnsfp/1.0.ht/index/part-0-113d0935-f89b-4d20-9f25-225c16c2f941.idx/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/dbnsfp/1.0.ht/index/part-0-113d0935-f89b-4d20-9f25-225c16c2f941.idx/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..5532412215f7e55602c6570cb0300b9d3e270fee GIT binary patch literal 12 TcmYc;N@ieSU}E69!7C2{5NQHQ literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/dbnsfp/1.0.ht/index/part-0-113d0935-f89b-4d20-9f25-225c16c2f941.idx/index b/v03_pipeline/var/test/reference_datasets/GRCh38/dbnsfp/1.0.ht/index/part-0-113d0935-f89b-4d20-9f25-225c16c2f941.idx/index new file mode 100644 index 0000000000000000000000000000000000000000..2cbc16fbb138cd4c174b13bf1615d50e2c7f9aac GIT binary patch literal 73 zcmY#jU|D6A literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/dbnsfp/1.0.ht/index/part-0-113d0935-f89b-4d20-9f25-225c16c2f941.idx/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh38/dbnsfp/1.0.ht/index/part-0-113d0935-f89b-4d20-9f25-225c16c2f941.idx/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..847060b9ae64f4f4e7be710d9828b72ff537dd5b GIT binary patch literal 185 zcmV;q07m~GiwFP!0000009B5`3c@fDME_+^3OQ6;O3h6KJt!zDUc^IMw@ol4l5D|H z`tNSNc^MXF_RS2?8jCj!(Rc^4EVH5#PzU82ZCtli4fz1F$X2BQji%ECaw*`2>6+dM z5-t?_h6d}KKFY(Sv)y^4^J*@XwP-QQbqFfmImpDv2m+LIo)mHAKV@?~UVV02`3W^fk>QD6eHLl_+moE_W) znB5atfLv!r#{%~T#*hh&+zduP8yF8fFsNufec%Gab%y?gjE3yy*$MJMlZ9ZW3jyt6 KWCZe135mxfJlibWLvr z2^R`|Lxc5AALU`Xv)y^4SLIwNYtdqq>kw4BbC8LT5dxYr-%T{x5mjARDvpqIr|5o6Lcz_#zTwd$eI`5|VVI zNdNni&N|xdVK1fmZs+8joX~*_pdVl?R}zuOr+zfRC}UjwfavZvy6yF&o5!aC^bQZv zemEQ=j0_yWfK0$r1A=PiGB5r?y~MG-k9Va8&#c4YbQEgO1X(##B<^%B5{_IRF~}mf z4fj?!<~+G()`6kgKuSi&uyXsY-c=QThmvfCN(zga5Fo(ruu-2)M9q+S_+C9P zq_)w6A7q}>x7RsMRO-9y@bPForaBAbxCAZYZZC%C%)w~Z#;eM2B~iIn5~Q<;oY40| 
z>R-w7u&Cu`95<4chNRy0Wx5sR>%cB+uEv>TgPSMMEMFOq=@*J0+oMCjlX`TwYB;JTbxu>9|HgYR~XJf literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/dbnsfp/1.0.ht/rows/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/dbnsfp/1.0.ht/rows/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..8b18cd63187ad63750fb0842f3c76dac1aa15df1 GIT binary patch literal 16 XcmYc;N@ieSU}A6&Y0|sV+j>8x4PW~@q=%mJ0l)ylj1K2jWo{JtD8 z=v`fT>m{b1)fx{fh(x%e5_-duBUQZq-t$^ubfmsNe4PxIR`=tPv79grL?{Kmo}sHs z)<3zbqpF^%=ZLD|sY=sRES;0I*aQKHl14{v-!QxVlS?E1GK|<+NFV8ZKs<$@g)&aV zKN!$%wXAkE?PnrcmU>|xz8avba%QSNI7FznvNXf+V9SXaBcMUH ziZ9yF^JNS|kw{d)W4inr5kiaRl!t3^G?)lBu)PUH_;qAB?|xmPiJhm literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/dbnsfp/1.0.ht/rows/parts/.part-0-113d0935-f89b-4d20-9f25-225c16c2f941.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/dbnsfp/1.0.ht/rows/parts/.part-0-113d0935-f89b-4d20-9f25-225c16c2f941.crc new file mode 100644 index 0000000000000000000000000000000000000000..76c67a13e224ed59f630e5a72b5d25afd09e3a73 GIT binary patch literal 12 TcmYc;N@ieSU}ErVQF;jg5pe?; literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/dbnsfp/1.0.ht/rows/parts/.part-1-a918a0a7-ef41-490f-9d13-73a3e17beead.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/dbnsfp/1.0.ht/rows/parts/.part-1-a918a0a7-ef41-490f-9d13-73a3e17beead.crc new file mode 100644 index 0000000000000000000000000000000000000000..3a792717daaea22856fab94f8e9135c3ef1b731e GIT binary patch literal 12 TcmYc;N@ieSU}BKic>Fj35=aA* literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/dbnsfp/1.0.ht/rows/parts/part-0-113d0935-f89b-4d20-9f25-225c16c2f941 b/v03_pipeline/var/test/reference_datasets/GRCh38/dbnsfp/1.0.ht/rows/parts/part-0-113d0935-f89b-4d20-9f25-225c16c2f941 new file mode 100644 index 
0000000000000000000000000000000000000000..c1df695a437734fa5dfbf17ab7b44f3b64e86987 GIT binary patch literal 61 zcmdO3U|^62VvVi(e-)%}GB7ZHVPi?oC^CFH&6J6O(UH-a;i!UvEl`C6122#S0kB#I HMg|4|jh+k7 literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/dbnsfp/1.0.ht/rows/parts/part-1-a918a0a7-ef41-490f-9d13-73a3e17beead b/v03_pipeline/var/test/reference_datasets/GRCh38/dbnsfp/1.0.ht/rows/parts/part-1-a918a0a7-ef41-490f-9d13-73a3e17beead new file mode 100644 index 0000000000000000000000000000000000000000..85a1e8b949043967ee31ae4fbe039be1c74e1116 GIT binary patch literal 98 zcmeZeU|?7X#2Q=m|0*nGWnu_nV*JC-lAKZG%gDsQ=*Z~I8IaEQ>c7wv{;~FA!cJt-)7v=7ca0FVHX$02|1_$iM&q(UKGR literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/eigen/1.0.ht/.README.txt.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/eigen/1.0.ht/.README.txt.crc new file mode 100644 index 0000000000000000000000000000000000000000..da33c11b7e5f66bbb2d6d2445746e91630960d0a GIT binary patch literal 12 TcmYc;N@ieSU}DJT(?1LV5O)Hm literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/eigen/1.0.ht/._SUCCESS.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/eigen/1.0.ht/._SUCCESS.crc new file mode 100644 index 0000000000000000000000000000000000000000..3b7b044936a890cd8d651d349a752d819d71d22c GIT binary patch literal 8 PcmYc;N@ieSU}69O2$TUk literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/eigen/1.0.ht/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/eigen/1.0.ht/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..d48f4ee9d3ce873752b54c5cf001895ae09c0299 GIT binary patch literal 12 TcmYc;N@ieSU}6x8lY9pN5CQ_S literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/eigen/1.0.ht/README.txt b/v03_pipeline/var/test/reference_datasets/GRCh38/eigen/1.0.ht/README.txt new file mode 100644 index 000000000..ad58e0ac8 --- 
/dev/null +++ b/v03_pipeline/var/test/reference_datasets/GRCh38/eigen/1.0.ht/README.txt @@ -0,0 +1,3 @@ +This folder comprises a Hail (www.hail.is) native Table or MatrixTable. + Written with version 0.2.132-678e1f52b999 + Created at 2024/11/21 11:56:06 \ No newline at end of file diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/eigen/1.0.ht/_SUCCESS b/v03_pipeline/var/test/reference_datasets/GRCh38/eigen/1.0.ht/_SUCCESS new file mode 100644 index 000000000..e69de29bb diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/eigen/1.0.ht/globals/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/eigen/1.0.ht/globals/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..fb5ed3f93cbe103a7565c9830999566e73c689f4 GIT binary patch literal 12 TcmYc;N@ieSU}EU@{1F2H64L`? literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/eigen/1.0.ht/globals/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh38/eigen/1.0.ht/globals/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..c945547ba2d0854fe2a257585bae58f45b8fb39f GIT binary patch literal 263 zcmV+i0r>tOiwFP!000000F{tWPs1<_#lMTM6bXhlp&SF{goLyr4j_cG#2aC4TIJ8A zQsuj2k1=R>A}9O3_p=-i;8cSk;K3X8XqHd6Rfl>81-xys2l4Vsm2M4c)N` zxj7n?LYBxmE2RopMW1vWcIbm`ogmRV<@*9%YJwPLa%OyYz$YNJ)@=^h2w0`{8vSIg z`%+%cuH~;)fPS~N`4f?KtG6_B6k4%Z*-r{A=JTt?DfS$q*`xpDXlGQ-l+66-6P@z$ z&ej;{)WFU81@Or~mdb5Vw#gZ`1_+T%8cPs5^Ice{>tyJRdy8&^N3G-xE19RM7=2(S Ne*x-txD)XK004@8eX{@n literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/eigen/1.0.ht/globals/parts/.part-0.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/eigen/1.0.ht/globals/parts/.part-0.crc new file mode 100644 index 0000000000000000000000000000000000000000..1fdaa2e8d47c8a8a7300e42b27a1ba34c3a388da GIT binary patch literal 12 TcmYc;N@ieSU}9J_QCJQD5&r_8 literal 0 HcmV?d00001 diff --git 
a/v03_pipeline/var/test/reference_datasets/GRCh38/eigen/1.0.ht/globals/parts/part-0 b/v03_pipeline/var/test/reference_datasets/GRCh38/eigen/1.0.ht/globals/parts/part-0 new file mode 100644 index 0000000000000000000000000000000000000000..b88778abc7fea54a002d3015eda3a26790756006 GIT binary patch literal 40 ncmWe;U|?VaVvVi(e-+pa85kIu4fPBdc!5G702XFoWMBXQWp@Qc literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_data/test_clinvar_path_variants_crdq.ht/index/part-0-9e75273d-7113-40e4-a327-453f3451dc8c.idx/.index.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/eigen/1.0.ht/index/part-0-24084335-917b-4b51-8a30-4fe509d64745.idx/.index.crc similarity index 100% rename from v03_pipeline/var/test/reference_data/test_clinvar_path_variants_crdq.ht/index/part-0-9e75273d-7113-40e4-a327-453f3451dc8c.idx/.index.crc rename to v03_pipeline/var/test/reference_datasets/GRCh38/eigen/1.0.ht/index/part-0-24084335-917b-4b51-8a30-4fe509d64745.idx/.index.crc diff --git a/v03_pipeline/var/test/reference_data/test_clinvar_path_variants_crdq.ht/index/part-0-9e75273d-7113-40e4-a327-453f3451dc8c.idx/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/eigen/1.0.ht/index/part-0-24084335-917b-4b51-8a30-4fe509d64745.idx/.metadata.json.gz.crc similarity index 100% rename from v03_pipeline/var/test/reference_data/test_clinvar_path_variants_crdq.ht/index/part-0-9e75273d-7113-40e4-a327-453f3451dc8c.idx/.metadata.json.gz.crc rename to v03_pipeline/var/test/reference_datasets/GRCh38/eigen/1.0.ht/index/part-0-24084335-917b-4b51-8a30-4fe509d64745.idx/.metadata.json.gz.crc diff --git a/v03_pipeline/var/test/reference_data/test_clinvar_path_variants_crdq.ht/index/part-0-9e75273d-7113-40e4-a327-453f3451dc8c.idx/index b/v03_pipeline/var/test/reference_datasets/GRCh38/eigen/1.0.ht/index/part-0-24084335-917b-4b51-8a30-4fe509d64745.idx/index similarity index 100% rename from 
v03_pipeline/var/test/reference_data/test_clinvar_path_variants_crdq.ht/index/part-0-9e75273d-7113-40e4-a327-453f3451dc8c.idx/index rename to v03_pipeline/var/test/reference_datasets/GRCh38/eigen/1.0.ht/index/part-0-24084335-917b-4b51-8a30-4fe509d64745.idx/index diff --git a/v03_pipeline/var/test/reference_data/test_clinvar_path_variants_crdq.ht/index/part-0-9e75273d-7113-40e4-a327-453f3451dc8c.idx/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh38/eigen/1.0.ht/index/part-0-24084335-917b-4b51-8a30-4fe509d64745.idx/metadata.json.gz similarity index 100% rename from v03_pipeline/var/test/reference_data/test_clinvar_path_variants_crdq.ht/index/part-0-9e75273d-7113-40e4-a327-453f3451dc8c.idx/metadata.json.gz rename to v03_pipeline/var/test/reference_datasets/GRCh38/eigen/1.0.ht/index/part-0-24084335-917b-4b51-8a30-4fe509d64745.idx/metadata.json.gz diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/eigen/1.0.ht/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh38/eigen/1.0.ht/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..fe5b0259546131a43884fa7218910270d5ce8d7b GIT binary patch literal 312 zcmV-80muFyiwFP!000000F9B$Zo?oDMgN6WE77R7)jZf$?qks|s-)^BiY!9LF)9HB zxRGV?@5RI|+=n0xMJ26|L;0o4x}%aufAIxF%yMh)fSQbhNYe3BLU-E=mGtbWM1 zi^T$AWS|BEl7X!Zh)Pq}1o%=7h@;yO?>Y@Z`5X@YUZ^b3G#A0ssJ}J$riwFP!000000NqthkDD+M{VzVXDr`f5d~n;O*(%bis* zq|q+Je<}nB$jAzhqLe{o0!eVF7{2Ydz^3&9MuCCwsT9sTczm!p$c7DGrj>CZW9G8nB9q#ns`(E*Rgo3?7exv z`@EPY?&y8ZT5@6_mzy@@DdBn-$Dlf*?`IRIKbka07Ksur*h$C)OFtrWD`;)NCI5{9 z1K)QC=V?DO&5G13^H2uRr*39$9u&dNm^&Kg6;@1Lf%r;wg80qN$0FRI0%%;!_MZlR zI^3A6OoyC3m(v`07W%3+$tD(0f>$(4Zl!UYfv!8DQl^?7vrlT|`RbB~8Yw42YV%YA zt~FfU)mcxSZr+FiH7H5N7xc}`3{})qATL{}aBLWJ0bys@=SD1%S2tZ*MnRM+cDbo5 zeOTW>%X{a_ySlrouXpMtDSr|27R`&7_QI`F+(C0YZI+xE@-0>Rj++gOOV|9^J-hGq zMxN{X_ShM&ZEx*6_K3PY+l#^PjoH8(dVaBxe=expz$D#CX7Udt)5u_imFpWnN)Kbz zC{Yf!Q+b#KqoLCujPqv2V*|GsF(o*10+Rg~9xfxXV`%N7F;osUHZvs*k{c~lYh^0s N`5&g<=Xjn70044(5n=!U literal 0 
HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/eigen/1.0.ht/rows/parts/.part-0-24084335-917b-4b51-8a30-4fe509d64745.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/eigen/1.0.ht/rows/parts/.part-0-24084335-917b-4b51-8a30-4fe509d64745.crc new file mode 100644 index 0000000000000000000000000000000000000000..541eeefc3a87f19a83846a73f459ff9001cf09c2 GIT binary patch literal 12 TcmYc;N@ieSU}D&<=v4p!5z_+E literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/eigen/1.0.ht/rows/parts/part-0-24084335-917b-4b51-8a30-4fe509d64745 b/v03_pipeline/var/test/reference_datasets/GRCh38/eigen/1.0.ht/rows/parts/part-0-24084335-917b-4b51-8a30-4fe509d64745 new file mode 100644 index 0000000000000000000000000000000000000000..0ab20ea110c6b5cc8a75f52125362c8a38146365 GIT binary patch literal 54 zcmY#nU|tOiwFP!000000F{tWPs1<_#lMTM6bXhlp&SF{goLyr4j_cG#2aC4TIJ8A zQsuj2k1=R>A}9O3_p=-i;8cSk;K3X8XqHd6Rfl>81-xys2l4Vsm2M4c)N` zxj7n?LYBxmE2RopMW1vWcIbm`ogmRV<@*9%YJwPLa%OyYz$YNJ)@=^h2w0`{8vSIg z`%+%cuH~;)fPS~N`4f?KtG6_B6k4%Z*-r{A=JTt?DfS$q*`xpDXlGQ-l+66-6P@z$ z&ej;{)WFU81@Or~mdb5Vw#gZ`1_+T%8cPs5^Ice{>tyJRdy8&^N3G-xE19RM7=2(S Ne*x-txD)XK004@8eX{@n literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/exac/1.0.ht/globals/parts/.part-0.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/exac/1.0.ht/globals/parts/.part-0.crc new file mode 100644 index 0000000000000000000000000000000000000000..1fdaa2e8d47c8a8a7300e42b27a1ba34c3a388da GIT binary patch literal 12 TcmYc;N@ieSU}9J_QCJQD5&r_8 literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/exac/1.0.ht/globals/parts/part-0 b/v03_pipeline/var/test/reference_datasets/GRCh38/exac/1.0.ht/globals/parts/part-0 new file mode 100644 index 0000000000000000000000000000000000000000..b88778abc7fea54a002d3015eda3a26790756006 GIT binary patch literal 40 ncmWe;U|?VaVvVi(e-+pa85kIu4fPBdc!5G702XFoWMBXQWp@Qc literal 0 HcmV?d00001 diff --git 
a/v03_pipeline/var/test/reference_data/test_combined_1.ht.ht/index/part-0-3569201c-d630-43c4-9056-cbace806fe8d.idx/.index.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/exac/1.0.ht/index/part-0-018c9528-a303-4d50-8cf8-eb42ad4d7486.idx/.index.crc similarity index 100% rename from v03_pipeline/var/test/reference_data/test_combined_1.ht.ht/index/part-0-3569201c-d630-43c4-9056-cbace806fe8d.idx/.index.crc rename to v03_pipeline/var/test/reference_datasets/GRCh38/exac/1.0.ht/index/part-0-018c9528-a303-4d50-8cf8-eb42ad4d7486.idx/.index.crc diff --git a/v03_pipeline/var/test/reference_data/test_combined_1.ht.ht/index/part-0-3569201c-d630-43c4-9056-cbace806fe8d.idx/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/exac/1.0.ht/index/part-0-018c9528-a303-4d50-8cf8-eb42ad4d7486.idx/.metadata.json.gz.crc similarity index 100% rename from v03_pipeline/var/test/reference_data/test_combined_1.ht.ht/index/part-0-3569201c-d630-43c4-9056-cbace806fe8d.idx/.metadata.json.gz.crc rename to v03_pipeline/var/test/reference_datasets/GRCh38/exac/1.0.ht/index/part-0-018c9528-a303-4d50-8cf8-eb42ad4d7486.idx/.metadata.json.gz.crc diff --git a/v03_pipeline/var/test/reference_data/test_combined_1.ht.ht/index/part-0-3569201c-d630-43c4-9056-cbace806fe8d.idx/index b/v03_pipeline/var/test/reference_datasets/GRCh38/exac/1.0.ht/index/part-0-018c9528-a303-4d50-8cf8-eb42ad4d7486.idx/index similarity index 100% rename from v03_pipeline/var/test/reference_data/test_combined_1.ht.ht/index/part-0-3569201c-d630-43c4-9056-cbace806fe8d.idx/index rename to v03_pipeline/var/test/reference_datasets/GRCh38/exac/1.0.ht/index/part-0-018c9528-a303-4d50-8cf8-eb42ad4d7486.idx/index diff --git a/v03_pipeline/var/test/reference_data/test_combined_1.ht.ht/index/part-0-3569201c-d630-43c4-9056-cbace806fe8d.idx/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh38/exac/1.0.ht/index/part-0-018c9528-a303-4d50-8cf8-eb42ad4d7486.idx/metadata.json.gz similarity index 100% rename from 
v03_pipeline/var/test/reference_data/test_combined_1.ht.ht/index/part-0-3569201c-d630-43c4-9056-cbace806fe8d.idx/metadata.json.gz rename to v03_pipeline/var/test/reference_datasets/GRCh38/exac/1.0.ht/index/part-0-018c9528-a303-4d50-8cf8-eb42ad4d7486.idx/metadata.json.gz diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/exac/1.0.ht/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh38/exac/1.0.ht/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..a4b9862660656bdd0a1520b5297faf9d059abbbd GIT binary patch literal 337 zcmV-X0j~ZZiwFP!000000F6;wYr-%T{x5mjAd}fLFL^5+Zeur583U0L>ru0%NlDU4 zDgEzDV`FtsdkHz;Eji~@Xv;VxKcF;|0+GQm3`Q93DdQc8ZhJwm9|kvr;RyQMyI?&Y zj}b-+wxB?ypc4f=$*I{}afe!op=%p&${Z}SXo_mb<(hIbH>yZY)jracOzbd-BGaC$ zYy1r*S#p^c8dJ`JgI-}J4_@II^3t4F53_h5j;_6qL`u z7+9pc#!0mKB2S_%9>JUyGCOB7=Bu~PRM)oGRA#vpAdE(&LdQ1URnV+yfA(=7d&xY9 z)J^a-Hm|QODi8nnIk7e6R2gRMD3eqc?H}5Jsi$Jc{0;N^Wq;M=D4B!R!$ragY(1B% j4bz5+)bR2dO4uM`w*$3ZpZB}XJ8S(5YXKm%asmJV=fs(B literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/exac/1.0.ht/rows/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/exac/1.0.ht/rows/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..97942d24469e57904a243a7144871106b06f06bf GIT binary patch literal 16 XcmYc;N@ieSU}8AiA0~5AV(mHrB;f^3 literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/exac/1.0.ht/rows/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh38/exac/1.0.ht/rows/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..175ed76404b0e5cec47d27954e9f3ea6f91a16ba GIT binary patch literal 615 zcmV-t0+{_DiwFP!000000Ns{dZ`v>vhX0GNHZ61zKAPqZ7;H#gi(s0jRVXs{3Ahao za?DhQ^51uykWZL2s`aK^p!oQFysz^}@qtLX0pgO0b4zQ$JrR@D@FGYS?a>-kG<*r^lSfzI~oTe#cT|fo`+taKe}lj zv-$$Myb3p!y^YA1ccsGExcPN-Q}J%0Pf9PW z!xI6zsaMA4hbDwUu;yWs#S$VF1XHTl7{9vun53I%0Mt*o`zIL7;Q2yD+7tXaC9T)7 
zkogK_MK12RuLyE%g}xu*%g>ZhDML+<)ukD6yf}wrVWngw=P)yHDZ9C<6HPTWZv;RM zO48sCxp^5-$vg$}JVFVvo-YFMYA^c~#5D5i(4}J(uu$>yqAuhxKY#i>+ z^_-Qz7c#I}A-zhz$#rv`SK> zQuV*j?lv~cX*t>Nea}hUfYV2U1ePAj6RuunSxIID19(4B4axeAD^nmD(IIJm0?&0x z_;fLtB9Yj+OO+0kMPHfNoX7`TxP(RPw69wsPv2yZ;nuocM{KI?mVEEZMXDY~kLuSN zfcdaF{y|h;m;(>Ff;J*%_RPR^G8s>AW3M4%7yW0BUrhSgPquRS6Q%ZXZgUE}Mv!KB z53V>v_)zY6P2ULmisvv^T-(NbYVX=R#@(Fe)2pDQ)j_sszS E04>mRLjV8( literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/globals/parts/.part-0.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/globals/parts/.part-0.crc new file mode 100644 index 0000000000000000000000000000000000000000..1fdaa2e8d47c8a8a7300e42b27a1ba34c3a388da GIT binary patch literal 12 TcmYc;N@ieSU}9J_QCJQD5&r_8 literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/globals/parts/part-0 b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/globals/parts/part-0 new file mode 100644 index 0000000000000000000000000000000000000000..b88778abc7fea54a002d3015eda3a26790756006 GIT binary patch literal 40 ncmWe;U|?VaVvVi(e-+pa85kIu4fPBdc!5G702XFoWMBXQWp@Qc literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_data/test_gnomad_coding_noncoding_crdq_1.ht/index/part-0-86ec8a00-137f-41a6-a098-8ef6bea1cded.idx/.index.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/index/part-0-345f1488-be53-4c4b-8207-b052e86084d6.idx/.index.crc similarity index 100% rename from v03_pipeline/var/test/reference_data/test_gnomad_coding_noncoding_crdq_1.ht/index/part-0-86ec8a00-137f-41a6-a098-8ef6bea1cded.idx/.index.crc rename to v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/index/part-0-345f1488-be53-4c4b-8207-b052e86084d6.idx/.index.crc diff --git 
a/v03_pipeline/var/test/reference_data/test_gnomad_coding_noncoding_crdq_1.ht/index/part-0-86ec8a00-137f-41a6-a098-8ef6bea1cded.idx/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/index/part-0-345f1488-be53-4c4b-8207-b052e86084d6.idx/.metadata.json.gz.crc similarity index 100% rename from v03_pipeline/var/test/reference_data/test_gnomad_coding_noncoding_crdq_1.ht/index/part-0-86ec8a00-137f-41a6-a098-8ef6bea1cded.idx/.metadata.json.gz.crc rename to v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/index/part-0-345f1488-be53-4c4b-8207-b052e86084d6.idx/.metadata.json.gz.crc diff --git a/v03_pipeline/var/test/reference_data/test_gnomad_coding_noncoding_crdq_1.ht/index/part-0-86ec8a00-137f-41a6-a098-8ef6bea1cded.idx/index b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/index/part-0-345f1488-be53-4c4b-8207-b052e86084d6.idx/index similarity index 100% rename from v03_pipeline/var/test/reference_data/test_gnomad_coding_noncoding_crdq_1.ht/index/part-0-86ec8a00-137f-41a6-a098-8ef6bea1cded.idx/index rename to v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/index/part-0-345f1488-be53-4c4b-8207-b052e86084d6.idx/index diff --git a/v03_pipeline/var/test/reference_data/test_gnomad_coding_noncoding_crdq_1.ht/index/part-0-86ec8a00-137f-41a6-a098-8ef6bea1cded.idx/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/index/part-0-345f1488-be53-4c4b-8207-b052e86084d6.idx/metadata.json.gz similarity index 100% rename from v03_pipeline/var/test/reference_data/test_gnomad_coding_noncoding_crdq_1.ht/index/part-0-86ec8a00-137f-41a6-a098-8ef6bea1cded.idx/metadata.json.gz rename to v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/index/part-0-345f1488-be53-4c4b-8207-b052e86084d6.idx/metadata.json.gz diff --git 
a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/index/part-0-90a40f33-45f1-4319-b895-a6f9f6f3364c.idx/.index.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/index/part-0-90a40f33-45f1-4319-b895-a6f9f6f3364c.idx/.index.crc new file mode 100644 index 0000000000000000000000000000000000000000..13f00a251039a4ad147575decc1f8e9f0bf93c4b GIT binary patch literal 12 TcmYc;N@ieSU}BgRIr|a-68i(2 literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/index/part-0-90a40f33-45f1-4319-b895-a6f9f6f3364c.idx/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/index/part-0-90a40f33-45f1-4319-b895-a6f9f6f3364c.idx/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..9c040e1020c294e61dd75a274ce43016da1eea27 GIT binary patch literal 12 TcmYc;N@ieSU}6Xr^1ciJ5OV^+ literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/index/part-0-90a40f33-45f1-4319-b895-a6f9f6f3364c.idx/index b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/index/part-0-90a40f33-45f1-4319-b895-a6f9f6f3364c.idx/index new file mode 100644 index 0000000000000000000000000000000000000000..0a9ded614e6f4256899eed06b9077182b70ce738 GIT binary patch literal 19 Ycmd;QU|?VZVvVi(e-)Sn85kHD02Jy1PXGV_ literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/index/part-0-90a40f33-45f1-4319-b895-a6f9f6f3364c.idx/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/index/part-0-90a40f33-45f1-4319-b895-a6f9f6f3364c.idx/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..56e0d13b418c54a1bcf2c3d7353dcbf2f3385376 GIT binary patch literal 181 
zcmV;m080NKiwFP!0000009B1K3&JoEh5yS<3K@!)QnQJmgMy;sA`aJhZGs_@Y7T z7Y@DR!Ftt4d6;sxJ8$%|o(r^=EJe8v0n~| literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_data/test_gnomad_coding_noncoding_crdq_1.ht/index/part-1-76400422-6fd3-4b0f-9c37-42546b3e19ff.idx/.index.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/index/part-1-f6cdce1a-0e07-4a8e-80de-c0b568f5fa07.idx/.index.crc similarity index 100% rename from v03_pipeline/var/test/reference_data/test_gnomad_coding_noncoding_crdq_1.ht/index/part-1-76400422-6fd3-4b0f-9c37-42546b3e19ff.idx/.index.crc rename to v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/index/part-1-f6cdce1a-0e07-4a8e-80de-c0b568f5fa07.idx/.index.crc diff --git a/v03_pipeline/var/test/reference_data/test_gnomad_coding_noncoding_crdq_1.ht/index/part-1-76400422-6fd3-4b0f-9c37-42546b3e19ff.idx/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/index/part-1-f6cdce1a-0e07-4a8e-80de-c0b568f5fa07.idx/.metadata.json.gz.crc similarity index 100% rename from v03_pipeline/var/test/reference_data/test_gnomad_coding_noncoding_crdq_1.ht/index/part-1-76400422-6fd3-4b0f-9c37-42546b3e19ff.idx/.metadata.json.gz.crc rename to v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/index/part-1-f6cdce1a-0e07-4a8e-80de-c0b568f5fa07.idx/.metadata.json.gz.crc diff --git a/v03_pipeline/var/test/reference_data/test_gnomad_coding_noncoding_crdq_1.ht/index/part-1-76400422-6fd3-4b0f-9c37-42546b3e19ff.idx/index b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/index/part-1-f6cdce1a-0e07-4a8e-80de-c0b568f5fa07.idx/index similarity index 100% rename from v03_pipeline/var/test/reference_data/test_gnomad_coding_noncoding_crdq_1.ht/index/part-1-76400422-6fd3-4b0f-9c37-42546b3e19ff.idx/index rename to 
v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/index/part-1-f6cdce1a-0e07-4a8e-80de-c0b568f5fa07.idx/index diff --git a/v03_pipeline/var/test/reference_data/test_gnomad_coding_noncoding_crdq_1.ht/index/part-1-76400422-6fd3-4b0f-9c37-42546b3e19ff.idx/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/index/part-1-f6cdce1a-0e07-4a8e-80de-c0b568f5fa07.idx/metadata.json.gz similarity index 100% rename from v03_pipeline/var/test/reference_data/test_gnomad_coding_noncoding_crdq_1.ht/index/part-1-76400422-6fd3-4b0f-9c37-42546b3e19ff.idx/metadata.json.gz rename to v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/index/part-1-f6cdce1a-0e07-4a8e-80de-c0b568f5fa07.idx/metadata.json.gz diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..5d1f2440f71eba4b39abe0f273c4944d80dfc357 GIT binary patch literal 294 zcmV+>0onc^iwFP!000000F9B&Zo?oDgx`fzE77QS)ckO4D)rDFs-)^AiY$WH2`XU` zCXFnMcki0oiIP(<2xh+-cA;gJHcl#jA;{xF%&AgiA8 z?R-9GoO!6ggH=!$9!%k6*b{tF2?=sb@u6`*mCx?b?hJ0F5v$;jDzyD2-BH_J*@~yY=pMx!8|q_grB~7V(T|Kw56mUMn$D0}2lgE!a>q(Xzk0)7HvE z*VFRrWicQt2P%zqo@L-8sd)OoFTrGT(gz)AzCz)#@1YO)7)py&xeXimb;h$zm*mj_ sIp8YeEcUKzs?>GRRt3|%K?5Zd^p+4KrS<;wA;}tl0oq(Fz6k;V01dp482|tP literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/rows/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/rows/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..08f95ba005da264158d34a545cbff0df7542b9b2 GIT binary patch literal 16 XcmYc;N@ieSU}8A0^0g~5H}xCu literal 0 HcmV?d00001 diff --git 
a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/rows/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/rows/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..a8dd96ccee2ab0811e2ebfab8a9c78d5829aff78 GIT binary patch literal 587 zcmV-R0<`@fiwFP!000000NqtxZ__Xo{V#q3k&Glw(kyRu6jW(Ks6{+5O_gKctYMj5 zIWAhM^5403`jMm^tiaRyP_@VRo^y}yQL-U|J^)F{L~wWnKK=TBwgeUshs^I`P0|}u zO17%VEdoq(VCI;ln_b_R^%D5i{s61MKzQ$rL?%KHg4#j7Lf8r^OQ6CLQuMLfNdDMLcEqmD_p6W7tHmIBz2iD&YV%~PJO|Zt zcs(BZK{7fVIfN~WlDA4msCbWLoubTuv*ITf#8EVekK4Xgn%mYh`_LxPhkj>mK4_va zo9ua*>@d4n1?6X|7l>b6ytU;9#aqbjBA>P0_9pW}ReE0V$7XeL=wNirw%;j5*EgN5 z<4O9@$*3Kf=N0I^9u-1rx~*w3)%ESEfE)&HjoS>{8uRHi?e&k;d%7J9B%^ZS&FI~d88;6`mjDvb=r zaPwxxA5B(qMz&DCk=SOHq{+|^;!EQk1V6i{5UgBr46PMJ|FFl#hRz_hbM8cOczO0j ZyPjuFWNwXWtqf~j{{rJu^3|gV002&=CBy&# literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_data/test_gnomad_coding_noncoding_crdq_1.ht/rows/parts/.part-0-86ec8a00-137f-41a6-a098-8ef6bea1cded.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/rows/parts/.part-0-345f1488-be53-4c4b-8207-b052e86084d6.crc similarity index 100% rename from v03_pipeline/var/test/reference_data/test_gnomad_coding_noncoding_crdq_1.ht/rows/parts/.part-0-86ec8a00-137f-41a6-a098-8ef6bea1cded.crc rename to v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/rows/parts/.part-0-345f1488-be53-4c4b-8207-b052e86084d6.crc diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/rows/parts/.part-0-90a40f33-45f1-4319-b895-a6f9f6f3364c.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/rows/parts/.part-0-90a40f33-45f1-4319-b895-a6f9f6f3364c.crc new file mode 100644 index 0000000000000000000000000000000000000000..c2966988d737bb6b66c9ac057718eee2ab7b485b GIT binary patch literal 12 
TcmYc;N@ieSU}E4kQ1}P{4|M{G literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_data/test_gnomad_coding_noncoding_crdq_1.ht/rows/parts/.part-1-76400422-6fd3-4b0f-9c37-42546b3e19ff.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/rows/parts/.part-1-f6cdce1a-0e07-4a8e-80de-c0b568f5fa07.crc similarity index 100% rename from v03_pipeline/var/test/reference_data/test_gnomad_coding_noncoding_crdq_1.ht/rows/parts/.part-1-76400422-6fd3-4b0f-9c37-42546b3e19ff.crc rename to v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/rows/parts/.part-1-f6cdce1a-0e07-4a8e-80de-c0b568f5fa07.crc diff --git a/v03_pipeline/var/test/reference_data/test_gnomad_coding_noncoding_crdq_1.ht/rows/parts/part-0-86ec8a00-137f-41a6-a098-8ef6bea1cded b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/rows/parts/part-0-345f1488-be53-4c4b-8207-b052e86084d6 similarity index 100% rename from v03_pipeline/var/test/reference_data/test_gnomad_coding_noncoding_crdq_1.ht/rows/parts/part-0-86ec8a00-137f-41a6-a098-8ef6bea1cded rename to v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/rows/parts/part-0-345f1488-be53-4c4b-8207-b052e86084d6 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/rows/parts/part-0-90a40f33-45f1-4319-b895-a6f9f6f3364c b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_coding_and_noncoding/1.0.ht/rows/parts/part-0-90a40f33-45f1-4319-b895-a6f9f6f3364c new file mode 100644 index 0000000000000000000000000000000000000000..7e0d00872a3923d250bbbe7b14ece6b492e8f33e GIT binary patch literal 35 ecmd;MU|?VbVvVi(e-#)xfh=Ai1_7`LNDcrtOiwFP!000000F{tWPs1<_#lMTM6bXhlp&SF{goLyr4j_cG#2aC4TIJ8A zQsuj2k1=R>A}9O3_p=-i;8cSk;K3X8XqHd6Rfl>81-xys2l4Vsm2M4c)N` zxj7n?LYBxmE2RopMW1vWcIbm`ogmRV<@*9%YJwPLa%OyYz$YNJ)@=^h2w0`{8vSIg z`%+%cuH~;)fPS~N`4f?KtG6_B6k4%Z*-r{A=JTt?DfS$q*`xpDXlGQ-l+66-6P@z$ 
z&ej;{)WFU81@Or~mdb5Vw#gZ`1_+T%8cPs5^Ice{>tyJRdy8&^N3G-xE19RM7=2(S Ne*x-txD)XK004@8eX{@n literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_exomes/1.0.ht/globals/parts/.part-0.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_exomes/1.0.ht/globals/parts/.part-0.crc new file mode 100644 index 0000000000000000000000000000000000000000..1fdaa2e8d47c8a8a7300e42b27a1ba34c3a388da GIT binary patch literal 12 TcmYc;N@ieSU}9J_QCJQD5&r_8 literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_exomes/1.0.ht/globals/parts/part-0 b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_exomes/1.0.ht/globals/parts/part-0 new file mode 100644 index 0000000000000000000000000000000000000000..b88778abc7fea54a002d3015eda3a26790756006 GIT binary patch literal 40 ncmWe;U|?VaVvVi(e-+pa85kIu4fPBdc!5G702XFoWMBXQWp@Qc literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_data/test_combined_1.ht/index/part-0-1d126232-414b-4ffa-aa43-9ed52895fbf2.idx/.index.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_exomes/1.0.ht/index/part-0-3ff9afe8-37ef-4f6d-a894-cfc7eb27f97d.idx/.index.crc similarity index 100% rename from v03_pipeline/var/test/reference_data/test_combined_1.ht/index/part-0-1d126232-414b-4ffa-aa43-9ed52895fbf2.idx/.index.crc rename to v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_exomes/1.0.ht/index/part-0-3ff9afe8-37ef-4f6d-a894-cfc7eb27f97d.idx/.index.crc diff --git a/v03_pipeline/var/test/reference_data/test_combined_1.ht/index/part-0-1d126232-414b-4ffa-aa43-9ed52895fbf2.idx/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_exomes/1.0.ht/index/part-0-3ff9afe8-37ef-4f6d-a894-cfc7eb27f97d.idx/.metadata.json.gz.crc similarity index 100% rename from v03_pipeline/var/test/reference_data/test_combined_1.ht/index/part-0-1d126232-414b-4ffa-aa43-9ed52895fbf2.idx/.metadata.json.gz.crc rename to 
v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_exomes/1.0.ht/index/part-0-3ff9afe8-37ef-4f6d-a894-cfc7eb27f97d.idx/.metadata.json.gz.crc diff --git a/v03_pipeline/var/test/reference_data/test_combined_1.ht/index/part-0-1d126232-414b-4ffa-aa43-9ed52895fbf2.idx/index b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_exomes/1.0.ht/index/part-0-3ff9afe8-37ef-4f6d-a894-cfc7eb27f97d.idx/index similarity index 100% rename from v03_pipeline/var/test/reference_data/test_combined_1.ht/index/part-0-1d126232-414b-4ffa-aa43-9ed52895fbf2.idx/index rename to v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_exomes/1.0.ht/index/part-0-3ff9afe8-37ef-4f6d-a894-cfc7eb27f97d.idx/index diff --git a/v03_pipeline/var/test/reference_data/test_combined_1.ht/index/part-0-1d126232-414b-4ffa-aa43-9ed52895fbf2.idx/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_exomes/1.0.ht/index/part-0-3ff9afe8-37ef-4f6d-a894-cfc7eb27f97d.idx/metadata.json.gz similarity index 100% rename from v03_pipeline/var/test/reference_data/test_combined_1.ht/index/part-0-1d126232-414b-4ffa-aa43-9ed52895fbf2.idx/metadata.json.gz rename to v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_exomes/1.0.ht/index/part-0-3ff9afe8-37ef-4f6d-a894-cfc7eb27f97d.idx/metadata.json.gz diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_exomes/1.0.ht/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_exomes/1.0.ht/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..c11d3ce6f58dba0613ab8850a7a1f36975eb3705 GIT binary patch literal 346 zcmV-g0j2&QiwFP!000000F6;wPr@)1{x5yfq!QU8m%h!Jff0p(F(!m8WgMWfYnQH* zEX)48UAKbx-*l_U^s&Q{=wUh z$76(%f;}h@3FNT?9%ob^jrf8ZiKbfvhX0GNHZAGKNBPKIV9-`#1;I2;t59U(WAGXr zLoOKHk@UqegO(JpjS%>et4X}Q`fLUNdxbkVxqY=!s+WLB*v~x-!Ka^T5 zO-x9tcGf10gchO1$eq-UdTbqLW5894?jEw#x#bR>!7vCZrdjU2lNMNRKYO|j*LiU0 zO~#XN?oV$#^=?Mvo;xaPhq!v&>lW7HP#7C8e-2(sE=K16!_Bn6YIHv4B2$T&DC9B` zc%(0vSEzegTt!<+TA^&QOlVzS 
zH(IALL6Hm=Mm^!36(O`^iX{?BaY|iU@WX08ObTk3|vZX zuIfZnMa??_P=k^*xJPbY2UIdofjo~;LafI{0AA&=PeDv0Zw_5LMu8tHc3#wl9OgSv zV(-*>)pynP<)ogo@`91~2rq8hD?d-k9u&J%?h+Hzd_$EU3iIX1%Yb>Yzt|1VyOcx6 zZnS~hP2OTQ?R1;A&;2$m>TTX_v*e=idrIXJdf`?wQ@)JVvVi(e-)Gj85tNESdue}44+OjWny$>bUwh$Zu#v7pFBhO$yXNbk%ui9 P7(};sGb1;5)==t literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_genomes/1.0.ht/.README.txt.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_genomes/1.0.ht/.README.txt.crc new file mode 100644 index 0000000000000000000000000000000000000000..f51ca0ebae6700f40e7c08ecc29fdf4b46618bb7 GIT binary patch literal 12 TcmYc;N@ieSU}Ctl?U6G86&VBa literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_genomes/1.0.ht/._SUCCESS.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_genomes/1.0.ht/._SUCCESS.crc new file mode 100644 index 0000000000000000000000000000000000000000..3b7b044936a890cd8d651d349a752d819d71d22c GIT binary patch literal 8 PcmYc;N@ieSU}69O2$TUk literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_genomes/1.0.ht/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_genomes/1.0.ht/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..b608109630c25297c26441b2b974ebf6b6612f7c GIT binary patch literal 12 TcmYc;N@ieSU}Crsu%ZS46MO@I literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_genomes/1.0.ht/README.txt b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_genomes/1.0.ht/README.txt new file mode 100644 index 000000000..92e64ba24 --- /dev/null +++ b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_genomes/1.0.ht/README.txt @@ -0,0 +1,3 @@ +This folder comprises a Hail (www.hail.is) native Table or MatrixTable. 
+ Written with version 0.2.132-678e1f52b999 + Created at 2024/11/21 15:52:38 \ No newline at end of file diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_genomes/1.0.ht/_SUCCESS b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_genomes/1.0.ht/_SUCCESS new file mode 100644 index 000000000..e69de29bb diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_genomes/1.0.ht/globals/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_genomes/1.0.ht/globals/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..fb5ed3f93cbe103a7565c9830999566e73c689f4 GIT binary patch literal 12 TcmYc;N@ieSU}EU@{1F2H64L`? literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_genomes/1.0.ht/globals/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_genomes/1.0.ht/globals/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..c945547ba2d0854fe2a257585bae58f45b8fb39f GIT binary patch literal 263 zcmV+i0r>tOiwFP!000000F{tWPs1<_#lMTM6bXhlp&SF{goLyr4j_cG#2aC4TIJ8A zQsuj2k1=R>A}9O3_p=-i;8cSk;K3X8XqHd6Rfl>81-xys2l4Vsm2M4c)N` zxj7n?LYBxmE2RopMW1vWcIbm`ogmRV<@*9%YJwPLa%OyYz$YNJ)@=^h2w0`{8vSIg z`%+%cuH~;)fPS~N`4f?KtG6_B6k4%Z*-r{A=JTt?DfS$q*`xpDXlGQ-l+66-6P@z$ z&ej;{)WFU81@Or~mdb5Vw#gZ`1_+T%8cPs5^Ice{>tyJRdy8&^N3G-xE19RM7=2(S Ne*x-txD)XK004@8eX{@n literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_genomes/1.0.ht/globals/parts/.part-0.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_genomes/1.0.ht/globals/parts/.part-0.crc new file mode 100644 index 0000000000000000000000000000000000000000..1fdaa2e8d47c8a8a7300e42b27a1ba34c3a388da GIT binary patch literal 12 TcmYc;N@ieSU}9J_QCJQD5&r_8 literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_genomes/1.0.ht/globals/parts/part-0 
b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_genomes/1.0.ht/globals/parts/part-0 new file mode 100644 index 0000000000000000000000000000000000000000..b88778abc7fea54a002d3015eda3a26790756006 GIT binary patch literal 40 ncmWe;U|?VaVvVi(e-+pa85kIu4fPBdc!5G702XFoWMBXQWp@Qc literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_data/test_combined_2.ht/index/part-0-20336911-c437-4deb-9fa4-7c7fe61f0408.idx/.index.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_genomes/1.0.ht/index/part-0-7791073a-d4da-48f7-903f-59f1ac95d459.idx/.index.crc similarity index 100% rename from v03_pipeline/var/test/reference_data/test_combined_2.ht/index/part-0-20336911-c437-4deb-9fa4-7c7fe61f0408.idx/.index.crc rename to v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_genomes/1.0.ht/index/part-0-7791073a-d4da-48f7-903f-59f1ac95d459.idx/.index.crc diff --git a/v03_pipeline/var/test/reference_data/test_combined_2.ht/index/part-0-20336911-c437-4deb-9fa4-7c7fe61f0408.idx/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_genomes/1.0.ht/index/part-0-7791073a-d4da-48f7-903f-59f1ac95d459.idx/.metadata.json.gz.crc similarity index 100% rename from v03_pipeline/var/test/reference_data/test_combined_2.ht/index/part-0-20336911-c437-4deb-9fa4-7c7fe61f0408.idx/.metadata.json.gz.crc rename to v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_genomes/1.0.ht/index/part-0-7791073a-d4da-48f7-903f-59f1ac95d459.idx/.metadata.json.gz.crc diff --git a/v03_pipeline/var/test/reference_data/test_combined_2.ht/index/part-0-20336911-c437-4deb-9fa4-7c7fe61f0408.idx/index b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_genomes/1.0.ht/index/part-0-7791073a-d4da-48f7-903f-59f1ac95d459.idx/index similarity index 100% rename from v03_pipeline/var/test/reference_data/test_combined_2.ht/index/part-0-20336911-c437-4deb-9fa4-7c7fe61f0408.idx/index rename to 
v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_genomes/1.0.ht/index/part-0-7791073a-d4da-48f7-903f-59f1ac95d459.idx/index diff --git a/v03_pipeline/var/test/reference_data/test_combined_2.ht/index/part-0-20336911-c437-4deb-9fa4-7c7fe61f0408.idx/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_genomes/1.0.ht/index/part-0-7791073a-d4da-48f7-903f-59f1ac95d459.idx/metadata.json.gz similarity index 100% rename from v03_pipeline/var/test/reference_data/test_combined_2.ht/index/part-0-20336911-c437-4deb-9fa4-7c7fe61f0408.idx/metadata.json.gz rename to v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_genomes/1.0.ht/index/part-0-7791073a-d4da-48f7-903f-59f1ac95d459.idx/metadata.json.gz diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_genomes/1.0.ht/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_genomes/1.0.ht/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..c11d3ce6f58dba0613ab8850a7a1f36975eb3705 GIT binary patch literal 346 zcmV-g0j2&QiwFP!000000F6;wPr@)1{x5yfq!QU8m%h!Jff0p(F(!m8WgMWfYnQH* zEX)48UAKbx-*l_U^s&Q{=wUh z$76(%f;}h@3FNT?9%ob^jrf8ZiKbfv$G?l8HZAG6U-A|hv`DNVn5JnJirn}Fyaoq3 zW-3GZ?mJG%6)KHteJKwpIX;*F-~LJQkqEj4Vv&hp@Cci!FmQDuv5sknZ-*VQVYYx-phLLwX*8fAES225>KM1PN+CB8im!Ey z3n~xBE(p2efxyT_sAeNFPO~v!OGyt8Y3j>aD@n_hJ?=dZWaTiXqd~pV8{|s{aW~v&g_Hq$ctVM#q(>cG@;{i+X}6o5 z^Rl0zVsYt}b+{iux7Er>|Ih^2_*Fbi(pXHR1bWl5ju~{ zP&u4ECzN$s2J&7aEYHOi>kT1}jKK3kP?-`cM4;%Yx-cV(R+nJRtrU!86s88QB{x%b zrm3RlodBpn3hLh@H?IRKo+m+EMkprM@wg98<*?5|Od@X%T{1?38%TCp)Ri1&J5XZp z+((2cew1H5P=JjYLxM+6K(PPPhYQ!=GekQUhRz|2jr0f|iL@80l+=@==r7DX JwO%#}001Q^DD?mU literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_genomes/1.0.ht/rows/parts/.part-0-7791073a-d4da-48f7-903f-59f1ac95d459.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_genomes/1.0.ht/rows/parts/.part-0-7791073a-d4da-48f7-903f-59f1ac95d459.crc new file mode 100644 index 
0000000000000000000000000000000000000000..b3dd3cfa7b3663c0f3b1408b4259e1d9aa9ad55b GIT binary patch literal 12 TcmYc;N@ieSU}C6lbY|cMN`e4b0Rtlg0|4?b2}u9| literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/.README.txt.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/.README.txt.crc new file mode 100644 index 0000000000000000000000000000000000000000..615160675971560be3a8ffbb12e7ed632f010089 GIT binary patch literal 12 TcmYc;N@ieSU}E?h{_!CI6&wUv literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/._SUCCESS.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/._SUCCESS.crc new file mode 100644 index 0000000000000000000000000000000000000000..3b7b044936a890cd8d651d349a752d819d71d22c GIT binary patch literal 8 PcmYc;N@ieSU}69O2$TUk literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..9deb961327b7478481bfee0d48d4fb059878ed10 GIT binary patch literal 12 TcmYc;N@ieSU}Bh@o^u=k5}*TU literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/README.txt b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/README.txt new file mode 100644 index 000000000..000c2be8a --- /dev/null +++ b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/README.txt @@ -0,0 +1,3 @@ +This folder comprises a Hail (www.hail.is) native Table or MatrixTable. 
+ Written with version 0.2.132-678e1f52b999 + Created at 2024/11/21 17:27:14 \ No newline at end of file diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/_SUCCESS b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/_SUCCESS new file mode 100644 index 000000000..e69de29bb diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/globals/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/globals/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..fb5ed3f93cbe103a7565c9830999566e73c689f4 GIT binary patch literal 12 TcmYc;N@ieSU}EU@{1F2H64L`? literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/globals/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/globals/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..c945547ba2d0854fe2a257585bae58f45b8fb39f GIT binary patch literal 263 zcmV+i0r>tOiwFP!000000F{tWPs1<_#lMTM6bXhlp&SF{goLyr4j_cG#2aC4TIJ8A zQsuj2k1=R>A}9O3_p=-i;8cSk;K3X8XqHd6Rfl>81-xys2l4Vsm2M4c)N` zxj7n?LYBxmE2RopMW1vWcIbm`ogmRV<@*9%YJwPLa%OyYz$YNJ)@=^h2w0`{8vSIg z`%+%cuH~;)fPS~N`4f?KtG6_B6k4%Z*-r{A=JTt?DfS$q*`xpDXlGQ-l+66-6P@z$ z&ej;{)WFU81@Or~mdb5Vw#gZ`1_+T%8cPs5^Ice{>tyJRdy8&^N3G-xE19RM7=2(S Ne*x-txD)XK004@8eX{@n literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/globals/parts/.part-0.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/globals/parts/.part-0.crc new file mode 100644 index 0000000000000000000000000000000000000000..1fdaa2e8d47c8a8a7300e42b27a1ba34c3a388da GIT binary patch literal 12 TcmYc;N@ieSU}9J_QCJQD5&r_8 literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/globals/parts/part-0 
b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/globals/parts/part-0 new file mode 100644 index 0000000000000000000000000000000000000000..b88778abc7fea54a002d3015eda3a26790756006 GIT binary patch literal 40 ncmWe;U|?VaVvVi(e-+pa85kIu4fPBdc!5G702XFoWMBXQWp@Qc literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/index/part-0-bccae774-994f-469e-9b30-01becb2109a0.idx/.index.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/index/part-0-bccae774-994f-469e-9b30-01becb2109a0.idx/.index.crc new file mode 100644 index 0000000000000000000000000000000000000000..9b404cc0630c825e14481ececd37a01be089c7a2 GIT binary patch literal 12 TcmYc;N@ieSU}BhN-gX865>*3Q literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/index/part-0-bccae774-994f-469e-9b30-01becb2109a0.idx/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/index/part-0-bccae774-994f-469e-9b30-01becb2109a0.idx/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..46cb776986309b649bc5ca8fd1afd9c6fb8fe64e GIT binary patch literal 12 TcmYc;N@ieSU}9+8Yx)}i6CeYr literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/index/part-0-bccae774-994f-469e-9b30-01becb2109a0.idx/index b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/index/part-0-bccae774-994f-469e-9b30-01becb2109a0.idx/index new file mode 100644 index 0000000000000000000000000000000000000000..9188ce12d9c82bc08422c1f8d11696973ca5639a GIT binary patch literal 127 zcma!GU|^U6#2Q=m|0+z0WnvIvW?@?~UVV02`3W^k7dF=GO JBanw;GyvaM8*u;t literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/index/part-0-bccae774-994f-469e-9b30-01becb2109a0.idx/metadata.json.gz 
b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/index/part-0-bccae774-994f-469e-9b30-01becb2109a0.idx/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..ebb08f484c48feab9295bd61abbf10cf62547913 GIT binary patch literal 186 zcmV;r07d^FiwFP!0000009B5`3c@fDME_+^3OQ6QrOizQJt!zDUc^IMw@ol4l5D|H z`tNSNc^MXF_RS2?8jCj!(Rc^4D6+g1PzU82ZCtli4fy~v&#FRzM$>5mxfJlibWLvr z2^R`|Lxc5AALU`v+3vj2t8y-swP-QQbqFfmImpDv2m+LIo)mHAKV}(CDNl8*? zDf#cFS+hBxd@22|?K!7KJ1!vq0&TdGh>XT*GQp^zT%1XCJ4gn@G`Sg#CotUIC!6VX ziZIf!0}Yaa$u)?)qPFnnGwUsuZhXCMD)7pC2im<*8!E`s>YCZM|40Y9++&b+X`<#F z{)C3CgkrV9R0t4Y)>x}UbOeRGbkD2DMOLJf>*ymht!cA5_O9_P%L}mNN!m2Qvv~k> zp=f}wJzvuAybwozbQdzJ{FU*b&p79YW2&r_1ZfQ-EjrE>UL`HNc^4lKS+p!GU}0sa zJw?CstUCNZ7tB6XQEj>PyUUrX`+w*Q-lCKRPF1Gc+w@?-)40ItxV)c^nh literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/rows/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/rows/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..7823d5e0796c65b79ae3e1c907c38ada5dfcfdf8 GIT binary patch literal 16 XcmYc;N@ieSU}9*@3ULmoSUw*BAvOfd literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/rows/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/rows/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..b47131dc1a55906da3739c246ae7d199a4fa3e16 GIT binary patch literal 613 zcmV-r0-F6FiwFP!000000Ns{bZ`v>v$G?l8HZAG6m6p5#g9)jtCaCtX3L(ciAzqDx z95a=neD@wZ7m~oFQmHTHA;RbSKlV?u4@A-p5Rc4c0M8K2e|=kQfpv&OR!^`e-Xjq) z+aQTO(9sVozQfn2UjT{Tq|O*KUgK{Yv5G5#(L zK{fXLZZc|jZbwI#Ls-N`Pzf2K>?c%J3-XO|!T(`E|N6SyKTZ3c=)k64n1@>cbkodC z%@2p*hO)*Cv%+SfQy_j)9ijc|>Z=VmRsl4$XuXQJDqJB~iC*#GIXi0HXv8|fMRsB} zG|}U|7ptighhb8{iZZB_v8H)$r9+x7&%t+`qcTNFU7`L`QcG1Qni^_82z?rqq~ac- zc^gf(p8|QFAB*cM65i7eZK$Ujy%S?B>wTkuS&e+$jGU 
z$cj{GAVv296(o=4(cD(J)!fm-eV-#*~dhTG*3*BCS0Peu=w%k_R z2j+L$tpRP>m%yJnm1`KqI~kb#14%SiFoR|92Kc*0q38>b}G)dDJ4S^%|4G91Me?KQx literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/rows/parts/.part-0-bccae774-994f-469e-9b30-01becb2109a0.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/rows/parts/.part-0-bccae774-994f-469e-9b30-01becb2109a0.crc new file mode 100644 index 0000000000000000000000000000000000000000..0e6384955385e1aac667972fc61af08e7bcbe6ca GIT binary patch literal 12 TcmYc;N@ieSU}AU|S9%Tr6bb{% literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/rows/parts/part-0-bccae774-994f-469e-9b30-01becb2109a0 b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_mito/1.0.ht/rows/parts/part-0-bccae774-994f-469e-9b30-01becb2109a0 new file mode 100644 index 0000000000000000000000000000000000000000..ea9c8eeed7f768a7e70a7e09d7987e37df335ab9 GIT binary patch literal 107 zcmWGzU|`^9WMI(Ps{c2EpTU)hp@fN%g(W$o$d{3cfzgrCnW67DH-mc!6Ob3e=;-Xu z%mQRNyN59FF+31R;NS3p&Gq5plv~Uu%-a~YG5%^eYj?g$fhmBU7icmFfJ|0k09pY6 Du|gO- literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/.README.txt.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/.README.txt.crc new file mode 100644 index 0000000000000000000000000000000000000000..26c13730600130a1f7c536d7179eab2b1f798e27 GIT binary patch literal 12 TcmYc;N@ieSU}C6`{E-d-65a!H literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/._SUCCESS.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/._SUCCESS.crc new file mode 100644 index 0000000000000000000000000000000000000000..3b7b044936a890cd8d651d349a752d819d71d22c GIT binary patch literal 8 PcmYc;N@ieSU}69O2$TUk literal 0 HcmV?d00001 diff --git 
a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..08d9b4aea8c77e9f64423a2e2811b3ce051ac302 GIT binary patch literal 12 TcmYc;N@ieSU}D&_#Q!A#6XOHZ literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/README.txt b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/README.txt new file mode 100644 index 000000000..147b7af6a --- /dev/null +++ b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/README.txt @@ -0,0 +1,3 @@ +This folder comprises a Hail (www.hail.is) native Table or MatrixTable. + Written with version 0.2.132-678e1f52b999 + Created at 2024/11/21 10:26:06 \ No newline at end of file diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/_SUCCESS b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/_SUCCESS new file mode 100644 index 000000000..e69de29bb diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/globals/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/globals/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..fb5ed3f93cbe103a7565c9830999566e73c689f4 GIT binary patch literal 12 TcmYc;N@ieSU}EU@{1F2H64L`? 
literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/globals/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/globals/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..c945547ba2d0854fe2a257585bae58f45b8fb39f GIT binary patch literal 263 zcmV+i0r>tOiwFP!000000F{tWPs1<_#lMTM6bXhlp&SF{goLyr4j_cG#2aC4TIJ8A zQsuj2k1=R>A}9O3_p=-i;8cSk;K3X8XqHd6Rfl>81-xys2l4Vsm2M4c)N` zxj7n?LYBxmE2RopMW1vWcIbm`ogmRV<@*9%YJwPLa%OyYz$YNJ)@=^h2w0`{8vSIg z`%+%cuH~;)fPS~N`4f?KtG6_B6k4%Z*-r{A=JTt?DfS$q*`xpDXlGQ-l+66-6P@z$ z&ej;{)WFU81@Or~mdb5Vw#gZ`1_+T%8cPs5^Ice{>tyJRdy8&^N3G-xE19RM7=2(S Ne*x-txD)XK004@8eX{@n literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/globals/parts/.part-0.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/globals/parts/.part-0.crc new file mode 100644 index 0000000000000000000000000000000000000000..1fdaa2e8d47c8a8a7300e42b27a1ba34c3a388da GIT binary patch literal 12 TcmYc;N@ieSU}9J_QCJQD5&r_8 literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/globals/parts/part-0 b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/globals/parts/part-0 new file mode 100644 index 0000000000000000000000000000000000000000..b88778abc7fea54a002d3015eda3a26790756006 GIT binary patch literal 40 ncmWe;U|?VaVvVi(e-+pa85kIu4fPBdc!5G702XFoWMBXQWp@Qc literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_data/test_interval_1.ht/index/part-0-1224c3b3-ab5b-49d7-8d6d-6084ccbbc683.idx/.index.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/index/part-0-17cf743a-b6dc-4c51-ae0c-c4ffa69513ba.idx/.index.crc similarity index 100% rename from 
v03_pipeline/var/test/reference_data/test_interval_1.ht/index/part-0-1224c3b3-ab5b-49d7-8d6d-6084ccbbc683.idx/.index.crc rename to v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/index/part-0-17cf743a-b6dc-4c51-ae0c-c4ffa69513ba.idx/.index.crc diff --git a/v03_pipeline/var/test/reference_data/test_interval_1.ht/index/part-0-1224c3b3-ab5b-49d7-8d6d-6084ccbbc683.idx/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/index/part-0-17cf743a-b6dc-4c51-ae0c-c4ffa69513ba.idx/.metadata.json.gz.crc similarity index 100% rename from v03_pipeline/var/test/reference_data/test_interval_1.ht/index/part-0-1224c3b3-ab5b-49d7-8d6d-6084ccbbc683.idx/.metadata.json.gz.crc rename to v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/index/part-0-17cf743a-b6dc-4c51-ae0c-c4ffa69513ba.idx/.metadata.json.gz.crc diff --git a/v03_pipeline/var/test/reference_data/test_interval_1.ht/index/part-0-1224c3b3-ab5b-49d7-8d6d-6084ccbbc683.idx/index b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/index/part-0-17cf743a-b6dc-4c51-ae0c-c4ffa69513ba.idx/index similarity index 100% rename from v03_pipeline/var/test/reference_data/test_interval_1.ht/index/part-0-1224c3b3-ab5b-49d7-8d6d-6084ccbbc683.idx/index rename to v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/index/part-0-17cf743a-b6dc-4c51-ae0c-c4ffa69513ba.idx/index diff --git a/v03_pipeline/var/test/reference_data/test_interval_1.ht/index/part-0-1224c3b3-ab5b-49d7-8d6d-6084ccbbc683.idx/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/index/part-0-17cf743a-b6dc-4c51-ae0c-c4ffa69513ba.idx/metadata.json.gz similarity index 100% rename from v03_pipeline/var/test/reference_data/test_interval_1.ht/index/part-0-1224c3b3-ab5b-49d7-8d6d-6084ccbbc683.idx/metadata.json.gz rename to 
v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/index/part-0-17cf743a-b6dc-4c51-ae0c-c4ffa69513ba.idx/metadata.json.gz diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..27165cc50f87fca71ac8d62e28feded3785920bb GIT binary patch literal 299 zcmV+`0o48=)r8p3n8R*kI`6K(v~eq z`rR!^x6G5b_kRES-s^~J#-Tid(o713*PA#>0@M)3FCe;KMXNB5?$(#alO>Hof2>`Wa|h8eeQNoBvxumrq&Dw~8z!jX!z&E1N7!8vth=!H&!00}?GI@_aP}1f002X=i{JnN literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/rows/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_non_coding_constraint/1.0.ht/rows/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..95e91fc0baf0108a12ed56fce965e098ffa4ece7 GIT binary patch literal 16 XcmYc;N@ieSU}AXqrNHFKp~3r$Nv{UZCcV1egvAg(a{O1nkJa`Fock?FW@z` zk!`9{l>fcw1VR#+3VT}16Z!h?eE0Py;4P8#0Z2rKlEV}5(XX%L6|g4p$m||AB$^N* zE$By@kSS_RabPZjMB9Dp&&>+>O|gJc;3Qo6H0VF8NS3TQnDKr;r4pdB&?*eW+)q?t?=JM@wyn;aDV*|FBlS-*9-IXkD zoTk%LP#%mb*_4}9vgb^Ki)M~msWtIVs74bcO}-jXZ92J=}vj%j$^30iZS-9P?}s9Zo_ttEH) zTauyg<|(czQAx~^#tzt02iV#kX%h9iK{IS%KlmEF9(nB|ca)Fw;y4l%+Q1cwX<{Hf dl>Co6dM(f?sq>dwtI=4b_ys%c*p3{R0DWEz ANB{r; literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_qc/1.0.ht/.README.txt.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_qc/1.0.ht/.README.txt.crc new file mode 100644 index 0000000000000000000000000000000000000000..1601350d233095b994868a7839af8082e9f1beda GIT binary patch literal 12 TcmYc;N@ieSU}CVXJ$o7e5>Nw; literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_qc/1.0.ht/._SUCCESS.crc 
b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_qc/1.0.ht/._SUCCESS.crc new file mode 100644 index 0000000000000000000000000000000000000000..3b7b044936a890cd8d651d349a752d819d71d22c GIT binary patch literal 8 PcmYc;N@ieSU}69O2$TUk literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_qc/1.0.ht/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_qc/1.0.ht/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..be4fcf6c1bb8c35e2a97d6d8e21372fe2ff6ba0a GIT binary patch literal 12 TcmYc;N@ieSU}8{8I<*P_5o`l9 literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_data/test_combined_1.ht.ht/README.txt b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_qc/1.0.ht/README.txt similarity index 78% rename from v03_pipeline/var/test/reference_data/test_combined_1.ht.ht/README.txt rename to v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_qc/1.0.ht/README.txt index 1b764aef2..ca607da80 100644 --- a/v03_pipeline/var/test/reference_data/test_combined_1.ht.ht/README.txt +++ b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_qc/1.0.ht/README.txt @@ -1,3 +1,3 @@ This folder comprises a Hail (www.hail.is) native Table or MatrixTable. 
Written with version 0.2.133-4c60fddb171a - Created at 2024/11/02 13:13:26 \ No newline at end of file + Created at 2024/11/23 12:20:50 \ No newline at end of file diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_qc/1.0.ht/_SUCCESS b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_qc/1.0.ht/_SUCCESS new file mode 100644 index 000000000..e69de29bb diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_qc/1.0.ht/globals/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_qc/1.0.ht/globals/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..ed78b289266aa1790fc8fbdd2a2105fe8fccecdf GIT binary patch literal 12 TcmYc;N@ieSU}DHLIpGQb5tRbv literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_qc/1.0.ht/globals/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_qc/1.0.ht/globals/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..10338f5f259b1468c2dfeb6f31a4ec0cd962146e GIT binary patch literal 254 zcmVc#I}A-zhz$#rv`SK> zQuV*j?lv~cX*t>Nea}hUfYV2U1ePAj6RuunSxIID19(4B4axeAD^nmD(IIJm0?&0x z_;fLtB9Yj+OO+0kMPHfNoX7`TxP(RPw69wsPv2yZ;nuocM{KI?mVEEZMXDY~kLuSN zfcdaF{y|h;m;(>Ff;J*%_RPR^G8s>AW3M4%7yW0BUrhSgPquRS6Q%ZXZgUE}Mv!KB z53V>v_)zY6P2ULmisvv^T-(NbYVX=R#@(Fe)2pDQ)j_sszS E04>mRLjV8( literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_qc/1.0.ht/globals/parts/.part-0.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_qc/1.0.ht/globals/parts/.part-0.crc new file mode 100644 index 0000000000000000000000000000000000000000..1fdaa2e8d47c8a8a7300e42b27a1ba34c3a388da GIT binary patch literal 12 TcmYc;N@ieSU}9J_QCJQD5&r_8 literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_qc/1.0.ht/globals/parts/part-0 b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_qc/1.0.ht/globals/parts/part-0 new file mode 100644 
index 0000000000000000000000000000000000000000..b88778abc7fea54a002d3015eda3a26790756006 GIT binary patch literal 40 ncmWe;U|?VaVvVi(e-+pa85kIu4fPBdc!5G702XFoWMBXQWp@Qc literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_qc/1.0.ht/index/part-0-46f30121-756f-4290-b7f1-e0f9993c9593.idx/.index.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_qc/1.0.ht/index/part-0-46f30121-756f-4290-b7f1-e0f9993c9593.idx/.index.crc new file mode 100644 index 0000000000000000000000000000000000000000..d5055a353b519890868ee563c213517b35c0c1f7 GIT binary patch literal 12 TcmYc;N@ieSU}AWy7WxhV6Fvix literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_qc/1.0.ht/index/part-0-46f30121-756f-4290-b7f1-e0f9993c9593.idx/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_qc/1.0.ht/index/part-0-46f30121-756f-4290-b7f1-e0f9993c9593.idx/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..5532412215f7e55602c6570cb0300b9d3e270fee GIT binary patch literal 12 TcmYc;N@ieSU}E69!7C2{5NQHQ literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_qc/1.0.ht/index/part-0-46f30121-756f-4290-b7f1-e0f9993c9593.idx/index b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_qc/1.0.ht/index/part-0-46f30121-756f-4290-b7f1-e0f9993c9593.idx/index new file mode 100644 index 0000000000000000000000000000000000000000..256ce27b134f5bf0b18aebd3bd8332a56f020392 GIT binary patch literal 73 zcmY#jU|lzK= literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_qc/1.0.ht/index/part-0-46f30121-756f-4290-b7f1-e0f9993c9593.idx/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_qc/1.0.ht/index/part-0-46f30121-756f-4290-b7f1-e0f9993c9593.idx/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..847060b9ae64f4f4e7be710d9828b72ff537dd5b GIT binary patch literal 
185 zcmV;q07m~GiwFP!0000009B5`3c@fDME_+^3OQ6;O3h6KJt!zDUc^IMw@ol4l5D|H z`tNSNc^MXF_RS2?8jCj!(Rc^4EVH5#PzU82ZCtli4fz1F$X2BQji%ECaw*`2>6+dM z5-t?_h6d}KKFY(Sv)y^4^J*@XwP-QQbqFfmImpDv2m+LIo)mHAKVi;t4PM9$tcV7r>U5Y>#Evj z^NbTp9Ms@I3usCQs&nKIo@v0K?PNYSGY;?E7pECEvOZ|icDxkhKL7uYS32;sdJY^nYK3(PX*v(&HPH!W@Sg zhJer1TE@b%7 literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_qc/1.0.ht/rows/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/gnomad_qc/1.0.ht/rows/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..972f882ed1081543c8bf530e6918dedf9f85a48a GIT binary patch literal 16 XcmYc;N@ieSU}Csv67@s>v#lMT6HZ64d11ZT{VYEW3rU|M&tU}1x*WfjE zkYlDYl<&Ul1VR#+R4VnQJRov>eeXHWPcl2A=rfR*j1`9$;FF)25My1^v&}pt2nnV)xpZQf?2a&sc1P0Npy7`_aJj`-9_>L)c0tyjC(m zH%~aDQM9n)RQ|<+u-6O1(`~;p!;7sK?!#n2pW8cg`@s;>ve3#fK4=9R8j>H>tWdwY z`dXA*H32FL0A58!9RVSAX6BN=l+Tb4(p`)KXM19o%0EVCCw>Q_IZa+8eK!;tY6h#4xdQni^*S{xJEn$#u6u0|# zBs18-1}krO{818&Q`t&;Xy-sMiKD>vyZ!w54*(6^VZ{x>S583j*Y*j$$PJr)hFI}X m^pT1s8;)of#xOZl+Q^QuN^Y}IW3-)==l=j&TSrU12mktOiwFP!000000F{tWPs1<_#lMTM6bXhlp&SF{goLyr4j_cG#2aC4TIJ8A zQsuj2k1=R>A}9O3_p=-i;8cSk;K3X8XqHd6Rfl>81-xys2l4Vsm2M4c)N` zxj7n?LYBxmE2RopMW1vWcIbm`ogmRV<@*9%YJwPLa%OyYz$YNJ)@=^h2w0`{8vSIg z`%+%cuH~;)fPS~N`4f?KtG6_B6k4%Z*-r{A=JTt?DfS$q*`xpDXlGQ-l+66-6P@z$ z&ej;{)WFU81@Or~mdb5Vw#gZ`1_+T%8cPs5^Ice{>tyJRdy8&^N3G-xE19RM7=2(S Ne*x-txD)XK004@8eX{@n literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/helix_mito/1.0.ht/globals/parts/.part-0.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/helix_mito/1.0.ht/globals/parts/.part-0.crc new file mode 100644 index 0000000000000000000000000000000000000000..1fdaa2e8d47c8a8a7300e42b27a1ba34c3a388da GIT binary patch literal 12 TcmYc;N@ieSU}9J_QCJQD5&r_8 literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/helix_mito/1.0.ht/globals/parts/part-0 
b/v03_pipeline/var/test/reference_datasets/GRCh38/helix_mito/1.0.ht/globals/parts/part-0 new file mode 100644 index 0000000000000000000000000000000000000000..b88778abc7fea54a002d3015eda3a26790756006 GIT binary patch literal 40 ncmWe;U|?VaVvVi(e-+pa85kIu4fPBdc!5G702XFoWMBXQWp@Qc literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/helix_mito/1.0.ht/index/part-0-eceecf38-7b1a-46ab-98c2-147256aff633.idx/.index.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/helix_mito/1.0.ht/index/part-0-eceecf38-7b1a-46ab-98c2-147256aff633.idx/.index.crc new file mode 100644 index 0000000000000000000000000000000000000000..9b404cc0630c825e14481ececd37a01be089c7a2 GIT binary patch literal 12 TcmYc;N@ieSU}BhN-gX865>*3Q literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/helix_mito/1.0.ht/index/part-0-eceecf38-7b1a-46ab-98c2-147256aff633.idx/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/helix_mito/1.0.ht/index/part-0-eceecf38-7b1a-46ab-98c2-147256aff633.idx/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..46cb776986309b649bc5ca8fd1afd9c6fb8fe64e GIT binary patch literal 12 TcmYc;N@ieSU}9+8Yx)}i6CeYr literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/helix_mito/1.0.ht/index/part-0-eceecf38-7b1a-46ab-98c2-147256aff633.idx/index b/v03_pipeline/var/test/reference_datasets/GRCh38/helix_mito/1.0.ht/index/part-0-eceecf38-7b1a-46ab-98c2-147256aff633.idx/index new file mode 100644 index 0000000000000000000000000000000000000000..9188ce12d9c82bc08422c1f8d11696973ca5639a GIT binary patch literal 127 zcma!GU|^U6#2Q=m|0+z0WnvIvW?@?~UVV02`3W^k7dF=GO JBanw;GyvaM8*u;t literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/helix_mito/1.0.ht/index/part-0-eceecf38-7b1a-46ab-98c2-147256aff633.idx/metadata.json.gz 
b/v03_pipeline/var/test/reference_datasets/GRCh38/helix_mito/1.0.ht/index/part-0-eceecf38-7b1a-46ab-98c2-147256aff633.idx/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..ebb08f484c48feab9295bd61abbf10cf62547913 GIT binary patch literal 186 zcmV;r07d^FiwFP!0000009B5`3c@fDME_+^3OQ6QrOizQJt!zDUc^IMw@ol4l5D|H z`tNSNc^MXF_RS2?8jCj!(Rc^4D6+g1PzU82ZCtli4fy~v&#FRzM$>5mxfJlibWLvr z2^R`|Lxc5AALU`v+3vj2t8y-swP-QQbqFfmImpDv2m+LIo)mHAKVNx;+1VOQlai#) zQu5zTvu1NX`BM5_+jCBf_FO>z1=?^W5gAX?WQtKixww#MI7kMgG`Sm3rZC#yC)?R< zhA`5w2Mv;e$u)?)qPFnn3+pYGZhgIND)7pC2ik*BTPn!P>YCZM|40Y9JYbM@X`<#F z{)C3Cg<`eAR0t4YHdw1;bOeRGbkCc|Wmcrq+vp>6t!cA9^=|My%L}mNN!m2Q^F;s) zp=f}wJzvuAybu@OXH@ws<3X=E=Z9metds<44I(W%%@tlHExUP_9}iiytSVq(W#>Ib zzw@j*{y!JYK2%X{x%IotnX3DL=nLMWlm+E)S@f?*!%bJDRRul(R~2Wn4_&)<+*mG| e70oj=utUV(1Drot%!gZtsQUpn*P3c%0ssKG@14Q` literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/helix_mito/1.0.ht/rows/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/helix_mito/1.0.ht/rows/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..c181591800692884b26bdd1a0c2e291a391ed25b GIT binary patch literal 16 XcmYc;N@ieSU}8veZhOjd>diC&BgzGv literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/helix_mito/1.0.ht/rows/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh38/helix_mito/1.0.ht/rows/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..de1bce944fd5da358c00b8b606537426427aee3d GIT binary patch literal 611 zcmV-p0-XIHiwFP!000000Ns{dZ`v>vhX0GNHZAD}ft2MA7)(fAH9@tDRVXsf3Go^n zG34rN9;vLHh4~H!Tn&$Rx(h5m-}JN^CeeLjwy!N*EjE(`;?SXa6ls%n}Vidupia_VAMyD$XV z+Vi`~sM8&cPA-eEh{r)KWQ4MxP*o#nrHu>z4+GqO-*L~=ey1uhsaNLV769F}Gc*0e zA-JZjHN&j58R#U$KdFk)eslBLgllUIG&E?vh_^aiA(yH0d9crq+U;40b&8Ap#2RR# z$9pSQGb;(hw1gF9P$3dUi`-g=EL&cJ?=(kkii)~Y{k5c4s?IdE)Vve=6i7kidxYk7 
zG?{)9#ASXAu71d4@LPv{PGN3&TjKH_5?CUG%Zjdquo}Ms-uc+=q1z*0kLjgR{xisX zBo{aBm3NWZgVXLbt7JqM-%_cjTwm>Y+v|n>#RdkzLT6y#t~%82xpZZJ9k8z5>D_kw zE)7H1am-8L&zy=ijFO!QbpDZ~3M=Tr@^=IN$qB};ERsJ^PDwEFI@#%2fxU<591}ck x0wQ=}iC(kB7@ZHM1Q#sc1qif}gY!aVo9OD=lIaPkQc^d`vS%vF(Y_4{005CdDKG#4 literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/helix_mito/1.0.ht/rows/parts/.part-0-eceecf38-7b1a-46ab-98c2-147256aff633.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/helix_mito/1.0.ht/rows/parts/.part-0-eceecf38-7b1a-46ab-98c2-147256aff633.crc new file mode 100644 index 0000000000000000000000000000000000000000..0e6384955385e1aac667972fc61af08e7bcbe6ca GIT binary patch literal 12 TcmYc;N@ieSU}AU|S9%Tr6bb{% literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/helix_mito/1.0.ht/rows/parts/part-0-eceecf38-7b1a-46ab-98c2-147256aff633 b/v03_pipeline/var/test/reference_datasets/GRCh38/helix_mito/1.0.ht/rows/parts/part-0-eceecf38-7b1a-46ab-98c2-147256aff633 new file mode 100644 index 0000000000000000000000000000000000000000..ea9c8eeed7f768a7e70a7e09d7987e37df335ab9 GIT binary patch literal 107 zcmWGzU|`^9WMI(Ps{c2EpTU)hp@fN%g(W$o$d{3cfzgrCnW67DH-mc!6Ob3e=;-Xu z%mQRNyN59FF+31R;NS3p&Gq5plv~Uu%-a~YG5%^eYj?g$fhmBU7icmFfJ|0k09pY6 Du|gO- literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/hgmd/1.0.ht/.README.txt.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/hgmd/1.0.ht/.README.txt.crc new file mode 100644 index 0000000000000000000000000000000000000000..5e2eae6e763643a3d0a14253deb22221590cd426 GIT binary patch literal 12 TcmYc;N@ieSU}E^YbQc!@6yXD* literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/hgmd/1.0.ht/._SUCCESS.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/hgmd/1.0.ht/._SUCCESS.crc new file mode 100644 index 0000000000000000000000000000000000000000..3b7b044936a890cd8d651d349a752d819d71d22c GIT binary patch literal 8 PcmYc;N@ieSU}69O2$TUk literal 0 
HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/hgmd/1.0.ht/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/hgmd/1.0.ht/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..7d6d258bd60c3fea62cf26183a2ee6fa19760c06 GIT binary patch literal 12 TcmYc;N@ieSU}D(uxXcXz6gvZw literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/hgmd/1.0.ht/README.txt b/v03_pipeline/var/test/reference_datasets/GRCh38/hgmd/1.0.ht/README.txt new file mode 100644 index 000000000..959764f4d --- /dev/null +++ b/v03_pipeline/var/test/reference_datasets/GRCh38/hgmd/1.0.ht/README.txt @@ -0,0 +1,3 @@ +This folder comprises a Hail (www.hail.is) native Table or MatrixTable. + Written with version 0.2.132-678e1f52b999 + Created at 2024/11/21 12:37:09 \ No newline at end of file diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/hgmd/1.0.ht/_SUCCESS b/v03_pipeline/var/test/reference_datasets/GRCh38/hgmd/1.0.ht/_SUCCESS new file mode 100644 index 000000000..e69de29bb diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/hgmd/1.0.ht/globals/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/hgmd/1.0.ht/globals/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..9c29118d9038cd2333fcdf0b6e3f531aff1aaa30 GIT binary patch literal 12 TcmYc;N@ieSU}Dgy{o4ir5*Gtw literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/hgmd/1.0.ht/globals/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh38/hgmd/1.0.ht/globals/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..d0a1a6b6f802fbad72b536d04495cc14b5327ccb GIT binary patch literal 281 zcmV+!0p|W6iwFP!000000F{x?OT#b}$N!giwSrDIq&K78Nf1`?;24r6Z!=q(mLwh2 z(*ND});hO(GNa6BAJAx5<1;)j8nGO f^xQ{_o(oS}#u-*RrpXw6;7<1qNrXOT7y_}VkM1Ted}1u(e3JU-L literal 0 HcmV?d00001 diff --git 
a/v03_pipeline/var/test/reference_data/test_combined_2.ht/index/part-0-7d0599cd-6874-47f8-b6de-a7db0b41817c.idx/.index.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/hgmd/1.0.ht/index/part-0-2accd7be-40d6-42bd-abc3-f6dc7b382f0a.idx/.index.crc similarity index 100% rename from v03_pipeline/var/test/reference_data/test_combined_2.ht/index/part-0-7d0599cd-6874-47f8-b6de-a7db0b41817c.idx/.index.crc rename to v03_pipeline/var/test/reference_datasets/GRCh38/hgmd/1.0.ht/index/part-0-2accd7be-40d6-42bd-abc3-f6dc7b382f0a.idx/.index.crc diff --git a/v03_pipeline/var/test/reference_data/test_combined_2.ht/index/part-0-7d0599cd-6874-47f8-b6de-a7db0b41817c.idx/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/hgmd/1.0.ht/index/part-0-2accd7be-40d6-42bd-abc3-f6dc7b382f0a.idx/.metadata.json.gz.crc similarity index 100% rename from v03_pipeline/var/test/reference_data/test_combined_2.ht/index/part-0-7d0599cd-6874-47f8-b6de-a7db0b41817c.idx/.metadata.json.gz.crc rename to v03_pipeline/var/test/reference_datasets/GRCh38/hgmd/1.0.ht/index/part-0-2accd7be-40d6-42bd-abc3-f6dc7b382f0a.idx/.metadata.json.gz.crc diff --git a/v03_pipeline/var/test/reference_data/test_combined_2.ht/index/part-0-7d0599cd-6874-47f8-b6de-a7db0b41817c.idx/index b/v03_pipeline/var/test/reference_datasets/GRCh38/hgmd/1.0.ht/index/part-0-2accd7be-40d6-42bd-abc3-f6dc7b382f0a.idx/index similarity index 100% rename from v03_pipeline/var/test/reference_data/test_combined_2.ht/index/part-0-7d0599cd-6874-47f8-b6de-a7db0b41817c.idx/index rename to v03_pipeline/var/test/reference_datasets/GRCh38/hgmd/1.0.ht/index/part-0-2accd7be-40d6-42bd-abc3-f6dc7b382f0a.idx/index diff --git a/v03_pipeline/var/test/reference_data/test_combined_2.ht/index/part-0-7d0599cd-6874-47f8-b6de-a7db0b41817c.idx/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh38/hgmd/1.0.ht/index/part-0-2accd7be-40d6-42bd-abc3-f6dc7b382f0a.idx/metadata.json.gz similarity index 100% rename from 
v03_pipeline/var/test/reference_data/test_combined_2.ht/index/part-0-7d0599cd-6874-47f8-b6de-a7db0b41817c.idx/metadata.json.gz rename to v03_pipeline/var/test/reference_datasets/GRCh38/hgmd/1.0.ht/index/part-0-2accd7be-40d6-42bd-abc3-f6dc7b382f0a.idx/metadata.json.gz diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/hgmd/1.0.ht/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh38/hgmd/1.0.ht/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..c50eefd455d45739862e88a4c0d3b2623fbdb695 GIT binary patch literal 317 zcmV-D0mA+tiwFP!000000F6;wOT#b}{x5k_h{LtGOWve{h(5?BzKl|`^mGlTNl8*i zDf#cFS+hBxd z>ovkiLj@Wn15;=aMN4h%$(OFDSi1N2wr{~BA7e2bgxXU;s5R|(Cp>o!e?d=nLb1+ZDg+2HIo9eJMFAmi<9YME$?9}<8+q4Om@=R7_wX>U z;3avHHcg!K`I%~3B|%z)$be4K!&TB|eA4FgDT|bC3oLBqyo$(ozN?P^*9Ec{wNzVf zeZqpNZpttjyhkVt$X~O_-_7&UN2?Y*AFdqE%}I03D!H*-GAsI5=%GZ!uNR!(f7FNH P3{v+44lJUNMgjl;^o5yq literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/hgmd/1.0.ht/rows/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/hgmd/1.0.ht/rows/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..739b1660de5af794d22019bbf4470cfa68cb224b GIT binary patch literal 16 XcmYc;N@ieSU}AWfC7iaTW5;R$B@G3R literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/hgmd/1.0.ht/rows/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh38/hgmd/1.0.ht/rows/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..43d1cc0ef8d61899781dc50e4ad06cdcdf40de78 GIT binary patch literal 576 zcmV-G0>AwqiwFP!000000NqtxZ`v>v{V#snw4{YV1I=4uY(lE038p=)Lddaiz^m&Z z+e~FB|9#g9Nl0MQsMeSAfZ+IkoOADy?w%<61SBF;#o-zF?EA<023SBmvV4Rci5`ed z+1B8HDkVrTWPwjC<-lAZiS}iMZ@UffarJ;v;2?bYFd0z;bE~(^?(=d{17fMNV&yQQ zsNQ*-)Cen5g~(LdUXXc*-U!@jO?MAv=)$9nfiXq!A)QHT%u?_w-EsMK;L>1m@4nxC zo{iVN;rpsd(E$C!=mK9Mw-AVQV4zQOF2sJYvm8Q0&0D 
z_>BSK%}p;nP5WUi&r>hVLjgda+L@XApa|g{(9$q1vD^?EyHHPFk(z_tOiwFP!000000F{tWPs1<_#lMTM6bXhlp&SF{goLyr4j_cG#2aC4TIJ8A zQsuj2k1=R>A}9O3_p=-i;8cSk;K3X8XqHd6Rfl>81-xys2l4Vsm2M4c)N` zxj7n?LYBxmE2RopMW1vWcIbm`ogmRV<@*9%YJwPLa%OyYz$YNJ)@=^h2w0`{8vSIg z`%+%cuH~;)fPS~N`4f?KtG6_B6k4%Z*-r{A=JTt?DfS$q*`xpDXlGQ-l+66-6P@z$ z&ej;{)WFU81@Or~mdb5Vw#gZ`1_+T%8cPs5^Ice{>tyJRdy8&^N3G-xE19RM7=2(S Ne*x-txD)XK004@8eX{@n literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/hmtvar/1.0.ht/globals/parts/.part-0.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/hmtvar/1.0.ht/globals/parts/.part-0.crc new file mode 100644 index 0000000000000000000000000000000000000000..1fdaa2e8d47c8a8a7300e42b27a1ba34c3a388da GIT binary patch literal 12 TcmYc;N@ieSU}9J_QCJQD5&r_8 literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/hmtvar/1.0.ht/globals/parts/part-0 b/v03_pipeline/var/test/reference_datasets/GRCh38/hmtvar/1.0.ht/globals/parts/part-0 new file mode 100644 index 0000000000000000000000000000000000000000..b88778abc7fea54a002d3015eda3a26790756006 GIT binary patch literal 40 ncmWe;U|?VaVvVi(e-+pa85kIu4fPBdc!5G702XFoWMBXQWp@Qc literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/hmtvar/1.0.ht/index/part-0-c858683f-c7bf-4a88-baab-d7bdeb020fa5.idx/.index.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/hmtvar/1.0.ht/index/part-0-c858683f-c7bf-4a88-baab-d7bdeb020fa5.idx/.index.crc new file mode 100644 index 0000000000000000000000000000000000000000..ffe0281749ccfae4c32eac33972c4e339bba2fc7 GIT binary patch literal 12 TcmYc;N@ieSU}BJcbNN326I}!r literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/hmtvar/1.0.ht/index/part-0-c858683f-c7bf-4a88-baab-d7bdeb020fa5.idx/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/hmtvar/1.0.ht/index/part-0-c858683f-c7bf-4a88-baab-d7bdeb020fa5.idx/.metadata.json.gz.crc new file mode 100644 index 
0000000000000000000000000000000000000000..cc8b04991a11eb0dee87d4dbad4ecd063a46d5f6 GIT binary patch literal 12 TcmYc;N@ieSU}C8G7`_|;6PN?0 literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/hmtvar/1.0.ht/index/part-0-c858683f-c7bf-4a88-baab-d7bdeb020fa5.idx/index b/v03_pipeline/var/test/reference_datasets/GRCh38/hmtvar/1.0.ht/index/part-0-c858683f-c7bf-4a88-baab-d7bdeb020fa5.idx/index new file mode 100644 index 0000000000000000000000000000000000000000..8de554cd5ee24c1fef3bb268a6bd7c91cf82844f GIT binary patch literal 129 zcma!IU|^UC#2Q=m|0+z(WnxfaW?@?~UVV02`3W^fk@QDFkILl_;+oITv5 znB6m2fLv!r$1?X0#*kTz+zduP8yF8fFsNufec%Gab%y?gjE3yy*$MJMlZ9ZW3jyt6 KWCZe135mxfJlibWLvr z2^R`|Lxc5AALU`Xv)y^4SLIwNYtdqq>kw4BbC8LT5d9i~ghBbR#&vOAin zIfuWYCtIOdXD}541eg+Qb&5n#$osh7y{?Nxwz`iTdD5Ep+jDM-4O1G(i%`^NsdK(H zrjD(WAgw`UKhynKfW4`u+H&hBubAql9;Seg zC>25Z8y5L(GB3vS+siwFP!000000NqtxZ`v>v{V#sn)TE<)HOgDV=!Dcw6I6Rxg^*+4fLGT+ zjx&{^{P$fu5RyQpQmHTH0pa^`&ON?IcuN$00uqvm;_wW7`s?d#4XjH%vUr3I2_J|| zS)SpUN(mA&GRG84IWSvD!tEa6ezOKXsuEZQHiVgraYQqi8J#n;b#P%l!ct|#GQ=vO zsNQ(%G!sUo3L_I`dfmd?QHSFjt?A}rS2*`FmTCycw=$)+&T{_UaW>fQ?DwZ{)6uFI ze6G++Ff0^mxyPzQ#d(r|ZmGN*M_1h-I;1?pRzmVxNra@2h}kHLZEz<3VZm@P=naqC zerJZe)-(H10MMs)XJJ1mLfE~wG)(p}w`L9UCp9aKUtWH3x8|Z`k+HNEsf=`)-&4GvRx=qwvxs$-q=&P^D5scfB=+qL`nu&Ot{_Mdg|O^ym7l zn(An#)-)f)eFn6m@&}scZEkXW8q{h3IJ+(qAz{&4_OT1g$oo33?6&B(vt)qe;808xh@`~Uy| literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/hmtvar/1.0.ht/rows/parts/.part-0-c858683f-c7bf-4a88-baab-d7bdeb020fa5.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/hmtvar/1.0.ht/rows/parts/.part-0-c858683f-c7bf-4a88-baab-d7bdeb020fa5.crc new file mode 100644 index 0000000000000000000000000000000000000000..db8f9208a4d50c7d2c4660700a932e8c9927c7a0 GIT binary patch literal 12 TcmYc;N@ieSU}EsIpC1YU5eNdj literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/hmtvar/1.0.ht/rows/parts/part-0-c858683f-c7bf-4a88-baab-d7bdeb020fa5 
b/v03_pipeline/var/test/reference_datasets/GRCh38/hmtvar/1.0.ht/rows/parts/part-0-c858683f-c7bf-4a88-baab-d7bdeb020fa5 new file mode 100644 index 0000000000000000000000000000000000000000..d28026cfbb7a69b1307e8e64a1d21c19d1846d62 GIT binary patch literal 127 zcmc~|U|?7a#2Q=m|0=9~%f!IK!pOjqoKfV<$i%?t$mlHJs%`IXY-}9D1muS>I>z)n zIWH-&wP$2;XLe@+ia9enYR2^2y7wRY8Zv99q#Xk{gM*rBzq{|YHZ{Y$tQQ&B8Xg25 WV>Jpq&cnjwlOd=FaQ953Le=2 literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/local_constraint_mito/1.0.ht/.README.txt.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/local_constraint_mito/1.0.ht/.README.txt.crc new file mode 100644 index 0000000000000000000000000000000000000000..117a62e4d918d0d364409936bb7a24a3cf780724 GIT binary patch literal 12 TcmYc;N@ieSU}E5^)LRY!5EcTH literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/local_constraint_mito/1.0.ht/._SUCCESS.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/local_constraint_mito/1.0.ht/._SUCCESS.crc new file mode 100644 index 0000000000000000000000000000000000000000..3b7b044936a890cd8d651d349a752d819d71d22c GIT binary patch literal 8 PcmYc;N@ieSU}69O2$TUk literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/local_constraint_mito/1.0.ht/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/local_constraint_mito/1.0.ht/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..8c4aa7a9422c8fc0a3d42b8195045ec9859ba64e GIT binary patch literal 12 TcmYc;N@ieSU}AV;)NvL76N3Y$ literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/local_constraint_mito/1.0.ht/README.txt b/v03_pipeline/var/test/reference_datasets/GRCh38/local_constraint_mito/1.0.ht/README.txt new file mode 100644 index 000000000..157ff427c --- /dev/null +++ b/v03_pipeline/var/test/reference_datasets/GRCh38/local_constraint_mito/1.0.ht/README.txt @@ -0,0 +1,3 @@ +This folder comprises a Hail 
(www.hail.is) native Table or MatrixTable. + Written with version 0.2.132-678e1f52b999 + Created at 2024/11/21 17:37:22 \ No newline at end of file diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/local_constraint_mito/1.0.ht/_SUCCESS b/v03_pipeline/var/test/reference_datasets/GRCh38/local_constraint_mito/1.0.ht/_SUCCESS new file mode 100644 index 000000000..e69de29bb diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/local_constraint_mito/1.0.ht/globals/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/local_constraint_mito/1.0.ht/globals/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..fb5ed3f93cbe103a7565c9830999566e73c689f4 GIT binary patch literal 12 TcmYc;N@ieSU}EU@{1F2H64L`? literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/local_constraint_mito/1.0.ht/globals/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh38/local_constraint_mito/1.0.ht/globals/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..c945547ba2d0854fe2a257585bae58f45b8fb39f GIT binary patch literal 263 zcmV+i0r>tOiwFP!000000F{tWPs1<_#lMTM6bXhlp&SF{goLyr4j_cG#2aC4TIJ8A zQsuj2k1=R>A}9O3_p=-i;8cSk;K3X8XqHd6Rfl>81-xys2l4Vsm2M4c)N` zxj7n?LYBxmE2RopMW1vWcIbm`ogmRV<@*9%YJwPLa%OyYz$YNJ)@=^h2w0`{8vSIg z`%+%cuH~;)fPS~N`4f?KtG6_B6k4%Z*-r{A=JTt?DfS$q*`xpDXlGQ-l+66-6P@z$ z&ej;{)WFU81@Or~mdb5Vw#gZ`1_+T%8cPs5^Ice{>tyJRdy8&^N3G-xE19RM7=2(S Ne*x-txD)XK004@8eX{@n literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/local_constraint_mito/1.0.ht/globals/parts/.part-0.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/local_constraint_mito/1.0.ht/globals/parts/.part-0.crc new file mode 100644 index 0000000000000000000000000000000000000000..1fdaa2e8d47c8a8a7300e42b27a1ba34c3a388da GIT binary patch literal 12 TcmYc;N@ieSU}9J_QCJQD5&r_8 literal 0 HcmV?d00001 diff --git 
a/v03_pipeline/var/test/reference_datasets/GRCh38/local_constraint_mito/1.0.ht/globals/parts/part-0 b/v03_pipeline/var/test/reference_datasets/GRCh38/local_constraint_mito/1.0.ht/globals/parts/part-0 new file mode 100644 index 0000000000000000000000000000000000000000..b88778abc7fea54a002d3015eda3a26790756006 GIT binary patch literal 40 ncmWe;U|?VaVvVi(e-+pa85kIu4fPBdc!5G702XFoWMBXQWp@Qc literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/local_constraint_mito/1.0.ht/index/part-0-b707f718-6196-4c02-9d68-148cf0c9438e.idx/.index.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/local_constraint_mito/1.0.ht/index/part-0-b707f718-6196-4c02-9d68-148cf0c9438e.idx/.index.crc new file mode 100644 index 0000000000000000000000000000000000000000..8c41c40c2d05dc077890580f9af5c27ab23676b9 GIT binary patch literal 12 TcmYc;N@ieSU}E_AimMp_6qy5~ literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/local_constraint_mito/1.0.ht/index/part-0-b707f718-6196-4c02-9d68-148cf0c9438e.idx/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/local_constraint_mito/1.0.ht/index/part-0-b707f718-6196-4c02-9d68-148cf0c9438e.idx/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..cc8b04991a11eb0dee87d4dbad4ecd063a46d5f6 GIT binary patch literal 12 TcmYc;N@ieSU}C8G7`_|;6PN?0 literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/local_constraint_mito/1.0.ht/index/part-0-b707f718-6196-4c02-9d68-148cf0c9438e.idx/index b/v03_pipeline/var/test/reference_datasets/GRCh38/local_constraint_mito/1.0.ht/index/part-0-b707f718-6196-4c02-9d68-148cf0c9438e.idx/index new file mode 100644 index 0000000000000000000000000000000000000000..49a1d678266dc5c0966d8e26b5cb6d73642c855e GIT binary patch literal 129 zcma!IU|^UC#2Q=m|0+z(WnxfaW?@?~UVV02`3W^fk@QDFkILl_;+oITv5 znB8+&fLv!r$2#{3j3J8{xfzUpHZUG|U{KL|`oIN->kRz~84cOZvlHZjCJVt#7XsSD K$Oz=27!Cj_@*PG1 literal 0 HcmV?d00001 
diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/local_constraint_mito/1.0.ht/index/part-0-b707f718-6196-4c02-9d68-148cf0c9438e.idx/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh38/local_constraint_mito/1.0.ht/index/part-0-b707f718-6196-4c02-9d68-148cf0c9438e.idx/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..4714eaea2a5d56c20578449bbb4597004c7c7aca GIT binary patch literal 185 zcmV;q07m~GiwFP!0000009B5`3c@fDME_+^3OQ6QrOizQJt!zDUc^IMw@ol4l5D|H z`tNSNd07T#=Isp78jCj!(Rc^4D6+g1PzU82ZCtli4fy~v&#FRzM$>5mxfJlibWLvr z2^R`|Lxc5AALU`Xv)y^4SLIwNYtdqq>kw4BbC8LT5d9i~ghBbR#&vOAin zIfuWYCtIOdXD}541eg+Qb&5n#$osh7y{?Nxwz`iTdD5Ep+jDM-4O1G(i%`^NsdK(H zrjD(WAgw`UKhynKfW4`u+H&hBubAql9;Seg zC>25Z8y5L(GB3vAwqiwFP!000000NqtxZ`v>v{V#snw4~#!A>^$vIw5t_1l1l^A!O_u@aj6q zF;f}Jf8Vu3LK2u%D)ps2Abda0xySbi?}?&MKq4|x9G-zse}0*5fOUvR7LTwa(F2hQ z+opJ?Qi6n(%rV6h4$KykXkQ@Q?>4|kWdf_fhA{JKJftbijNUS{cW_ZP!V+b~D#9wE zsNQ*-Bo#&^3L_I`x}D5BP)Fk%t?BOJP&oHemT17TTbWQ>Xi5JZNNaR=_WQ%v>1f>z zKbCMg8Wsw*D!{5x%WWKkZpgeF58It^cuaYOt%c;3k|B~lB4({uzw` z_A52qwO-hV0)Rd>J2U%15yCFEp&UN9HAQoPsIGph_i%9(pSbSvEgsouiJ5iqbR1^q2aq zoa$tz#x(E5eFn6m@;jR5b#8Ke8q|6JIJ+)lAz{&2_Nfc=$lE$D=g@%vCH(wE7n-n? 
zUo+mR?560Zbcad&XMojOWJEck|Mb1Up?iVv(k_$`0gZ}sDT zJM@G0An<#v)$s>B2>o_1WO0iPdfgD*tOiwFP!000000F{tWPs1<_#lMTM6bXhlp&SF{goLyr4j_cG#2aC4TIJ8A zQsuj2k1=R>A}9O3_p=-i;8cSk;K3X8XqHd6Rfl>81-xys2l4Vsm2M4c)N` zxj7n?LYBxmE2RopMW1vWcIbm`ogmRV<@*9%YJwPLa%OyYz$YNJ)@=^h2w0`{8vSIg z`%+%cuH~;)fPS~N`4f?KtG6_B6k4%Z*-r{A=JTt?DfS$q*`xpDXlGQ-l+66-6P@z$ z&ej;{)WFU81@Or~mdb5Vw#gZ`1_+T%8cPs5^Ice{>tyJRdy8&^N3G-xE19RM7=2(S Ne*x-txD)XK004@8eX{@n literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/mitimpact/1.0.ht/globals/parts/.part-0.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/mitimpact/1.0.ht/globals/parts/.part-0.crc new file mode 100644 index 0000000000000000000000000000000000000000..1fdaa2e8d47c8a8a7300e42b27a1ba34c3a388da GIT binary patch literal 12 TcmYc;N@ieSU}9J_QCJQD5&r_8 literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/mitimpact/1.0.ht/globals/parts/part-0 b/v03_pipeline/var/test/reference_datasets/GRCh38/mitimpact/1.0.ht/globals/parts/part-0 new file mode 100644 index 0000000000000000000000000000000000000000..b88778abc7fea54a002d3015eda3a26790756006 GIT binary patch literal 40 ncmWe;U|?VaVvVi(e-+pa85kIu4fPBdc!5G702XFoWMBXQWp@Qc literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/mitimpact/1.0.ht/index/part-0-e16f2759-68b2-4794-978c-4bfcd2f29974.idx/.index.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/mitimpact/1.0.ht/index/part-0-e16f2759-68b2-4794-978c-4bfcd2f29974.idx/.index.crc new file mode 100644 index 0000000000000000000000000000000000000000..8c41c40c2d05dc077890580f9af5c27ab23676b9 GIT binary patch literal 12 TcmYc;N@ieSU}E_AimMp_6qy5~ literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/mitimpact/1.0.ht/index/part-0-e16f2759-68b2-4794-978c-4bfcd2f29974.idx/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/mitimpact/1.0.ht/index/part-0-e16f2759-68b2-4794-978c-4bfcd2f29974.idx/.metadata.json.gz.crc new file mode 
100644 index 0000000000000000000000000000000000000000..cc8b04991a11eb0dee87d4dbad4ecd063a46d5f6 GIT binary patch literal 12 TcmYc;N@ieSU}C8G7`_|;6PN?0 literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/mitimpact/1.0.ht/index/part-0-e16f2759-68b2-4794-978c-4bfcd2f29974.idx/index b/v03_pipeline/var/test/reference_datasets/GRCh38/mitimpact/1.0.ht/index/part-0-e16f2759-68b2-4794-978c-4bfcd2f29974.idx/index new file mode 100644 index 0000000000000000000000000000000000000000..49a1d678266dc5c0966d8e26b5cb6d73642c855e GIT binary patch literal 129 zcma!IU|^UC#2Q=m|0+z(WnxfaW?@?~UVV02`3W^fk@QDFkILl_;+oITv5 znB8+&fLv!r$2#{3j3J8{xfzUpHZUG|U{KL|`oIN->kRz~84cOZvlHZjCJVt#7XsSD K$Oz=27!Cj_@*PG1 literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/mitimpact/1.0.ht/index/part-0-e16f2759-68b2-4794-978c-4bfcd2f29974.idx/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh38/mitimpact/1.0.ht/index/part-0-e16f2759-68b2-4794-978c-4bfcd2f29974.idx/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..4714eaea2a5d56c20578449bbb4597004c7c7aca GIT binary patch literal 185 zcmV;q07m~GiwFP!0000009B5`3c@fDME_+^3OQ6QrOizQJt!zDUc^IMw@ol4l5D|H z`tNSNd07T#=Isp78jCj!(Rc^4D6+g1PzU82ZCtli4fy~v&#FRzM$>5mxfJlibWLvr z2^R`|Lxc5AALU`Xv)y^4SLIwNYtdqq>kw4BbC8LT5d9i~ghBbR#&vOAin zIfuWYCtIOdXD}541eg+Qb&5n#$osh7y{?Nxwz`iTdD5Ep+jDM-4O1G(i%`^NsdK(H zrjD(WAgw`UKhynKfW4`u+H&hBubAql9;Seg zC>25Z8y5L(GB3vBP literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/mitimpact/1.0.ht/rows/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh38/mitimpact/1.0.ht/rows/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..99e90e7fd15409ab66ba5a85f4a71cde51e25b88 GIT binary patch literal 576 zcmV-G0>AwqiwFP!000000NqtxZ`v>v{V#snw4@s$eC4e$Iw5t_1l1l^A!O_u@aj6q zF;f}JfA6(JLK2u%D)ps2Abda0xySbi?}?(1Kq4|x9G-wre|?>8fCa=OiwD?|=$^=g 
zZBsl`DM3O?=9pp$2WAUNv@a0ub{pWMGJ#cKLzwwA9?}$MMsJzfJGiJCVTm$g6=9W7 zRPVe^k_sadg^`Ie?I80G)Y14_Yr4BX6wbYrB^t2oMkdr2TGGD<(i+{Z{r>Q6I$F2; zA4|9#4GV=@6<}4UweGGtpbwBlD6tPQjF8P^A(>54{zJESsOR&QV82Md_Jh`b&LQ zPIWRU(SPf~*jR#^U2v>{H6}*{w@pCt7ea(}h!H059E4CzZaEs-je~ljjEyb2jbv5tOiwFP!000000F{tWPs1<_#lMTM6bXhlp&SF{goLyr4j_cG#2aC4TIJ8A zQsuj2k1=R>A}9O3_p=-i;8cSk;K3X8XqHd6Rfl>81-xys2l4Vsm2M4c)N` zxj7n?LYBxmE2RopMW1vWcIbm`ogmRV<@*9%YJwPLa%OyYz$YNJ)@=^h2w0`{8vSIg z`%+%cuH~;)fPS~N`4f?KtG6_B6k4%Z*-r{A=JTt?DfS$q*`xpDXlGQ-l+66-6P@z$ z&ej;{)WFU81@Or~mdb5Vw#gZ`1_+T%8cPs5^Ice{>tyJRdy8&^N3G-xE19RM7=2(S Ne*x-txD)XK004@8eX{@n literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/mitomap/1.0.ht/globals/parts/.part-0.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/mitomap/1.0.ht/globals/parts/.part-0.crc new file mode 100644 index 0000000000000000000000000000000000000000..1fdaa2e8d47c8a8a7300e42b27a1ba34c3a388da GIT binary patch literal 12 TcmYc;N@ieSU}9J_QCJQD5&r_8 literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/mitomap/1.0.ht/globals/parts/part-0 b/v03_pipeline/var/test/reference_datasets/GRCh38/mitomap/1.0.ht/globals/parts/part-0 new file mode 100644 index 0000000000000000000000000000000000000000..b88778abc7fea54a002d3015eda3a26790756006 GIT binary patch literal 40 ncmWe;U|?VaVvVi(e-+pa85kIu4fPBdc!5G702XFoWMBXQWp@Qc literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/mitomap/1.0.ht/index/part-0-430d2a33-3c80-49e7-91ec-31484d8fc41b.idx/.index.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/mitomap/1.0.ht/index/part-0-430d2a33-3c80-49e7-91ec-31484d8fc41b.idx/.index.crc new file mode 100644 index 0000000000000000000000000000000000000000..4fce8e00d6cbce99de098ec61d6b25590651827b GIT binary patch literal 12 TcmYc;N@ieSU}9K$qG34z6cYo_ literal 0 HcmV?d00001 diff --git 
a/v03_pipeline/var/test/reference_datasets/GRCh38/mitomap/1.0.ht/index/part-0-430d2a33-3c80-49e7-91ec-31484d8fc41b.idx/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/mitomap/1.0.ht/index/part-0-430d2a33-3c80-49e7-91ec-31484d8fc41b.idx/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..46cb776986309b649bc5ca8fd1afd9c6fb8fe64e GIT binary patch literal 12 TcmYc;N@ieSU}9+8Yx)}i6CeYr literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/mitomap/1.0.ht/index/part-0-430d2a33-3c80-49e7-91ec-31484d8fc41b.idx/index b/v03_pipeline/var/test/reference_datasets/GRCh38/mitomap/1.0.ht/index/part-0-430d2a33-3c80-49e7-91ec-31484d8fc41b.idx/index new file mode 100644 index 0000000000000000000000000000000000000000..d1574170bdd7bbb0aa10ffefaa87d4d06030cd93 GIT binary patch literal 127 zcma!GU|{G6VvVi(e--*;nHX4@85p=2Sdue}d>NS-7#$g%8Ql3oWSD^L5JpEWXA5^X zX7>;lAlI4EG0D9kq>h`RL0N*4hp#z&2M-T#0q+NnHntsRJ2=_of#wRq%odu+z`)1| I5mxfJlibWLvr z2^R`|Lxc5AALU`v+3vj2t8y-swP-QQbqFfmImpDv2m+LIo)mHAKV*m5hP` zTw@IX`;bbT?zD@M@BQBMeK?>47f}9yHe5+W7RxNnF={9m*CCqE(pi$F4~u0E$>A}r zR;v}lNW%d%NCu|VAj+27#>3aCM=Y&;yzN`?%IA0(YN0AB$kytP*q-ZvFZtQ{B{I z3iyc9Iw*g`B3~r)-58`*3*G=X5trd)x~2BqST30r{TuYKN5o$#IDhBSAAVP`x?f1i J=~*WN000H&oD~27 literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/mitomap/1.0.ht/rows/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/mitomap/1.0.ht/rows/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..2ccc6f456ed5078f9cde4dcb2dd58a06832228f3 GIT binary patch literal 16 XcmYc;N@ieSU}8Af_$*(ZQ$Gp-B-sRm literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/mitomap/1.0.ht/rows/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh38/mitomap/1.0.ht/rows/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..db48fe98a77f939f199f2ecb447c9da71a2c7af5 
GIT binary patch literal 577 zcmV-H0>1qpiwFP!000000NqvJZ`v>r{$KpGX`yRK%4psSV-r$0O;GJ&6+*_ofY;PU zjx&`Z|L;3H5R$;8QmHTHA;S0LyZi1w>9$1DM<59qOAb%KCqF+=*TA|YAoB;ofit1!!)QQrm>OL$v-NOE-NKYKVwqr+ zkW_DiwaSGNN}^>fP1Lo)u5>cG(VA}VcY(8DP0dOzK?;U|N@YN$b%p=z3O7pLy6?NM zli@1rzpo%mL<|&i*#NBLEs8V+-Qs&Y8id{c;NS`fTZxRguScRyS`%eFn6o*>@Dp%hcrSX^^M&G*K*Z;A^VQ7vat-URw9)Tz2B< zl6Pqobt2Zs*EM|VT!+9qQ5g4Q-cMN^F8me#J)v?5164@w^0y>6h~S8oHyi(y3dX8z zrN3c2dj*puw1<<9wQtZlB)Dw?l0Oq7ZbA&vdFLU7WLd#c(7qky3zco;(4ECP2Gkhs Pidg#(EOXA}i3k7ya^n`~ literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/mitomap/1.0.ht/rows/parts/.part-0-430d2a33-3c80-49e7-91ec-31484d8fc41b.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/mitomap/1.0.ht/rows/parts/.part-0-430d2a33-3c80-49e7-91ec-31484d8fc41b.crc new file mode 100644 index 0000000000000000000000000000000000000000..d7ddc91583dea7f276f36791c839220a7d79c3b9 GIT binary patch literal 12 TcmYc;N@ieSU}ESA_LBhs5n%#^ literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/mitomap/1.0.ht/rows/parts/part-0-430d2a33-3c80-49e7-91ec-31484d8fc41b b/v03_pipeline/var/test/reference_datasets/GRCh38/mitomap/1.0.ht/rows/parts/part-0-430d2a33-3c80-49e7-91ec-31484d8fc41b new file mode 100644 index 0000000000000000000000000000000000000000..c01f44610a09b96177d97da12431db13872df42b GIT binary patch literal 95 zcmeZbU|{G4VvVi(e-(P)GBT(zF|x2EXB7D|GBGeZGCI45Faa4MjE>Ik%q&0_P=JA( n;Q@z4!sJ$qxh`NSxr4$&m@>q!sEZOklmk2BPJPCtVuUqC?ak3olKD z=Kf@G3Pob)B8ip9G*GQWyGQc2bPcU578N!Ss>%et3?ub;gJmyzjC+z*e z9#MH~T{bZPp!KXiHIO#-c8fH+%u^CyPjBOIYXbV+=H^Ev&P%;x%voq7VroA*FrUqC z=D%xiKA1!GpFKV^s$$&iWbq{`Md8ur==p;{HkVf*AN8?NF2lI2oe`@AAJ}9pm(ZDh juukW-;m1E%{AhU663K|t2~CASUOL?uQIY}@CjtNfRS1O; literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/screen/1.0.ht/globals/parts/.part-0.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/screen/1.0.ht/globals/parts/.part-0.crc new file mode 100644 index 
0000000000000000000000000000000000000000..2c5dc31e6c5760bd7b566b2c3c781d370c5a9ae8 GIT binary patch literal 12 TcmYc;N@ieSU}Ctj-75tE6a52t literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/screen/1.0.ht/globals/parts/part-0 b/v03_pipeline/var/test/reference_datasets/GRCh38/screen/1.0.ht/globals/parts/part-0 new file mode 100644 index 0000000000000000000000000000000000000000..4353c3208ffbd7615fbff348f7da20e57c7cb036 GIT binary patch literal 111 zcma!IU|{eCVvVi(e-(T)nHU(E4fPB-xST_r-E@=kOY>4V!OZ-;oJw97zr^BHT@Pb# rliXBe<^Z2ymK0YYQ2->kAaWpOoH_aBx?nB?FVJKV0GrOh$iM&q1~40w literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/screen/1.0.ht/index/part-0-6ac8d5d4-cb28-4030-9208-f0d0e0f595e1.idx/.index.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/screen/1.0.ht/index/part-0-6ac8d5d4-cb28-4030-9208-f0d0e0f595e1.idx/.index.crc new file mode 100644 index 0000000000000000000000000000000000000000..23324f542252c6a23a4e82c7ff3ae91d7e155c49 GIT binary patch literal 12 TcmYc;N@ieSU}9ME-1rm#6ZHeb literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/screen/1.0.ht/index/part-0-6ac8d5d4-cb28-4030-9208-f0d0e0f595e1.idx/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/screen/1.0.ht/index/part-0-6ac8d5d4-cb28-4030-9208-f0d0e0f595e1.idx/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..576d4ffd2f53ecd87bc049cfc67827d78f6adf26 GIT binary patch literal 12 TcmYc;N@ieSU}9*~ZOjA!5sCuX literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/screen/1.0.ht/index/part-0-6ac8d5d4-cb28-4030-9208-f0d0e0f595e1.idx/index b/v03_pipeline/var/test/reference_datasets/GRCh38/screen/1.0.ht/index/part-0-6ac8d5d4-cb28-4030-9208-f0d0e0f595e1.idx/index new file mode 100644 index 0000000000000000000000000000000000000000..952d782a33fc4e7095e476ee391088b2cd8ec83b GIT binary patch literal 79 
zcmY#pU|3c@fHM&IR53K^;`rOhfRqM)d_h(j80n_#(-z}-h4#wCXMmzQ*Mbdb!VKI>; zwut!McTG)8_q4a6!**5SD_9OsJ6Dj zGjKr5r|GjXtjo^SZWy8;=x8RC19N0@7a4sDsUXs~>5zSB-rBp@&ALjL_sCvrb5Yo- z8tF8_W*nO9y3q=hvxN5KJg+}0R*&!4e7~#%vTcC-Rb1X5_yJXY`hU*D7;2bt!i^Jh z-A*%11)srMd*^Ri;Inu)_88p&@^I5|5vR;8RS4^Z;!bw&&_O{ct`r7bcgXu+=%d3g L*&dhVI|2XzyKa~c literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/screen/1.0.ht/rows/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/screen/1.0.ht/rows/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..a3b21f0b766c2d5398b08936aab09bb9e41b8c48 GIT binary patch literal 16 XcmYc;N@ieSU}A`5{c~7L`CB{yB5DNs literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/screen/1.0.ht/rows/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh38/screen/1.0.ht/rows/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..2b5ad46aa8ca3129f05ae7dae1203d0658bdf918 GIT binary patch literal 594 zcmV-Y0O4{4YMWYO)RCqanAlr4`busz}vC5kkg(0e7=D zvdwBK%75>30wDydg`T?03C8b@-;Bq?TO#QLkeG}lhbQ3UAD@yHu#k9Ub`Kj8Pl%8f z^drs46fveaFjqn1?Y{NNT_5Jx_du8cVf3hM1C8EF2~z z)f;c6jIcsUL`KrK!`uVOuWb>MWor&5+3%-R22@t~u%s-}+|#hY_L)T;Grns-J!;w6W;uQ+eFIj)&2$pINOISK=2N%iL3GYQR=Hsg%mn z9c69fJU^SzgHbh`N|S2#To~tSn3GW&L%fnyqY08`UonjTmeOKB4f1?0i)%1TMFz9R z8K0)RL|%B3OA#8F()_&SE5lv+dlcwXUz@HrJ-s;o=M?;9{?~XT(5vhI{|e~`Q;wT+ z;y72BozHky#yvdT4?pmGltsMDJ3gC-k>3g0f!_~<$WH?vK#+F(U1$|=lV4*h7cfw3 z$zA`J6w!BjswYfb1ItL`1Z=5WINCnaB#u6`!d`&$;5B%@vhON)DNjmrlmrzvbR;Rw g4CDvO|7fD;3LTO=f2p+^jl~rI0k~aD^EL_q0CY+ocK`qY literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/screen/1.0.ht/rows/parts/.part-0-6ac8d5d4-cb28-4030-9208-f0d0e0f595e1.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/screen/1.0.ht/rows/parts/.part-0-6ac8d5d4-cb28-4030-9208-f0d0e0f595e1.crc new file mode 100644 index 
0000000000000000000000000000000000000000..2e0ae9c56974f5cb01695d1dacfd65482ceab5f4 GIT binary patch literal 12 TcmYc;N@ieSU}BiN>GJ{r6p#cR literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/screen/1.0.ht/rows/parts/part-0-6ac8d5d4-cb28-4030-9208-f0d0e0f595e1 b/v03_pipeline/var/test/reference_datasets/GRCh38/screen/1.0.ht/rows/parts/part-0-6ac8d5d4-cb28-4030-9208-f0d0e0f595e1 new file mode 100644 index 0000000000000000000000000000000000000000..93fd423c87f22227a1b00814ef453cb39bd4b792 GIT binary patch literal 56 zcmY#pU|+H>0(KhtQ1PdA#D$ar=~`A zdn8DO3=wmdXToCC+R9uZ@;SU988Ch*l!;H_nM>aj-4B zj$7=87Nc;wYP%Vt{|@aVd+`vsit9DH(P$5Ob~(AuzNrM{n>Ol)Br}!VQp|~H#bBX7 zNidsEuV%k(FFvTf^`Aa^Gm3KT?D)_SwQ%8H8}zg;;LG_1@Y>$z!fr6DqE$?_-~*8q oOD43cFSL~=Dq6*!qQAozJJ0*Y@imMe+KWSVC)jB3ioxrlB@*=U?+H6^@KpfqMc-LwO%6ot4c0z5bAS-7Y=DOjM{>bGHgKYLT zX;ur)pkPW`_-gt{a&HW6*GG8cJ^qB2tc7BY#Z(9oU^m#PLrRDWdF|J$r{$u~XV;0l zjKU=dM+?i80rDsmb$N#W-e2&NJjb`Zi>Uk#OZ2r+I5O}yqI28xaB~^0k gxMWVWFVI4XNEkQx@E^$^zdBm|524e^q+vh5w76HZAEOKpOH^7@d%+X@Y4Fs}OSH8}RBn z%yyff#1QHV1G^+KbJRov?ea|`O6J|#weFPGcsbug3Z1(lTd;_sZ9I||XEeY?5 zNaIZ7GZg{^XtKbiIAsvqMiTB06TWOWz{b@Ac7ZM7%Dc&kYM2|9#b#HG3mc_4m4>fu zjZ3Pw&L-8|@Kj=DDoxMJoxL^|Zj_?i`$Ox(VIT}$>Nr&pPHsg?vER`f!DB#|Qp1l7 zB!1>B+z+$Pe(s$$Jl)y%Hy>x?buV~ZNtS}x$mOaj_jk2gEXtAuRO__vCZnzwjOrza z#A`0tQOgKzJ)ml%DYC^m|Ah_yV9@hVw|&!w6*< z5TB`8WBuytebH{K0%%e|@lT{2Q;~3`4W8sviGkmD5Xc4sbuX@jFGyMLq)DQ|R8FXr zsiFJbM;dv)ILB8VD?tn}r&qtWuu;28s!&DAx!80M&yYG5`Po literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/splice_ai/1.0.ht/rows/parts/.part-0-6272a9a2-b08b-4926-9552-feb84ffa2308.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/splice_ai/1.0.ht/rows/parts/.part-0-6272a9a2-b08b-4926-9552-feb84ffa2308.crc new file mode 100644 index 0000000000000000000000000000000000000000..0dcc92f011ee6764ca553fbc9be5bbd766f3fe98 GIT binary patch literal 12 TcmYc;N@ieSU}D(#AZ--@6j}q^ literal 0 HcmV?d00001 diff --git 
a/v03_pipeline/var/test/reference_datasets/GRCh38/splice_ai/1.0.ht/rows/parts/part-0-6272a9a2-b08b-4926-9552-feb84ffa2308 b/v03_pipeline/var/test/reference_datasets/GRCh38/splice_ai/1.0.ht/rows/parts/part-0-6272a9a2-b08b-4926-9552-feb84ffa2308 new file mode 100644 index 0000000000000000000000000000000000000000..4304a92fc0ac5a0f2df202142c89d1560582a3be GIT binary patch literal 55 zcmY#lU|sNFaQ8U C5DSI? literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/.README.txt.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/.README.txt.crc new file mode 100644 index 0000000000000000000000000000000000000000..19491808d2dcd6127c3e812597b22de7c90d5598 GIT binary patch literal 12 TcmYc;N@ieSU}89U^~!Vr6+#3& literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/._SUCCESS.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/._SUCCESS.crc new file mode 100644 index 0000000000000000000000000000000000000000..3b7b044936a890cd8d651d349a752d819d71d22c GIT binary patch literal 8 PcmYc;N@ieSU}69O2$TUk literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..db3eac1a7bd97c08939aeb17713a6cd2c50a9f53 GIT binary patch literal 12 TcmYc;N@ieSU}6X;`>+fE5~u^0 literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/README.txt b/v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/README.txt new file mode 100644 index 000000000..f37ec7de7 --- /dev/null +++ b/v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/README.txt @@ -0,0 +1,3 @@ +This folder comprises a Hail (www.hail.is) native Table or MatrixTable. 
+ Written with version 0.2.132-678e1f52b999 + Created at 2024/11/21 12:31:19 \ No newline at end of file diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/_SUCCESS b/v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/_SUCCESS new file mode 100644 index 000000000..e69de29bb diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/globals/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/globals/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..fb5ed3f93cbe103a7565c9830999566e73c689f4 GIT binary patch literal 12 TcmYc;N@ieSU}EU@{1F2H64L`? literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/globals/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/globals/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..c945547ba2d0854fe2a257585bae58f45b8fb39f GIT binary patch literal 263 zcmV+i0r>tOiwFP!000000F{tWPs1<_#lMTM6bXhlp&SF{goLyr4j_cG#2aC4TIJ8A zQsuj2k1=R>A}9O3_p=-i;8cSk;K3X8XqHd6Rfl>81-xys2l4Vsm2M4c)N` zxj7n?LYBxmE2RopMW1vWcIbm`ogmRV<@*9%YJwPLa%OyYz$YNJ)@=^h2w0`{8vSIg z`%+%cuH~;)fPS~N`4f?KtG6_B6k4%Z*-r{A=JTt?DfS$q*`xpDXlGQ-l+66-6P@z$ z&ej;{)WFU81@Or~mdb5Vw#gZ`1_+T%8cPs5^Ice{>tyJRdy8&^N3G-xE19RM7=2(S Ne*x-txD)XK004@8eX{@n literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/globals/parts/.part-0.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/globals/parts/.part-0.crc new file mode 100644 index 0000000000000000000000000000000000000000..1fdaa2e8d47c8a8a7300e42b27a1ba34c3a388da GIT binary patch literal 12 TcmYc;N@ieSU}9J_QCJQD5&r_8 literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/globals/parts/part-0 b/v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/globals/parts/part-0 new file mode 100644 index 
0000000000000000000000000000000000000000..b88778abc7fea54a002d3015eda3a26790756006 GIT binary patch literal 40 ncmWe;U|?VaVvVi(e-+pa85kIu4fPBdc!5G702XFoWMBXQWp@Qc literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/index/part-0-795ab066-10c9-4aac-ad59-f29794a4b01f.idx/.index.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/index/part-0-795ab066-10c9-4aac-ad59-f29794a4b01f.idx/.index.crc new file mode 100644 index 0000000000000000000000000000000000000000..7cb9c5aafd74b42fc55f3e032d9cb437c12c28bd GIT binary patch literal 12 TcmYc;N@ieSU}9+I_!kQR5&Z)p literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/index/part-0-795ab066-10c9-4aac-ad59-f29794a4b01f.idx/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/index/part-0-795ab066-10c9-4aac-ad59-f29794a4b01f.idx/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..9af5fa9257e8d86b6f5732f8c6c4649397ec5485 GIT binary patch literal 12 TcmYc;N@ieSU}89S=g%_$777Hv literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/index/part-0-795ab066-10c9-4aac-ad59-f29794a4b01f.idx/index b/v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/index/part-0-795ab066-10c9-4aac-ad59-f29794a4b01f.idx/index new file mode 100644 index 0000000000000000000000000000000000000000..a979d82bf4f0b890ba94fef0dfa6090fb25d2ee4 GIT binary patch literal 69 zcmb1RU|7|snHU`zof+hTLP9VVLKA`Vj0_B@N&vXe B4MzX~ literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/index/part-0-795ab066-10c9-4aac-ad59-f29794a4b01f.idx/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/index/part-0-795ab066-10c9-4aac-ad59-f29794a4b01f.idx/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..051d3e03d56a7a32825d7404a65aab17a00602f9 GIT binary patch 
literal 185 zcmV;q07m~GiwFP!0000009B5`3c@fDME_+^3OQ6;O3h6KJt!zDUc^IMw@ol4l5D|H z`tNSNd07T#=Isp78jCj!(Rc^4EVH5#PzU82ZCtli4fz1F$X2BQji%ECaw*`2>6+dM z5-t?_h6d}KKFY&%XS?%8=ha*&Ytdqq>kw4BbC8LT5dJdw~KHj!9c;~Y}bO)ifRFJjR4YS>FNq^+>fI&8u ziJo)#8(OjviZupPAwYo1u~x@O1c$u!-8!V5%8r7y>?`l?CT-SmZb3ywB092G7G)!dV=8uBjb2mP=+u`wA`W5b^s4 V=N}~Yhqnx^?ibo0gM>r^004?ZoOJ*I literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/rows/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/rows/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..d838c358d7f7d689a05f4ebc1ba062455f810351 GIT binary patch literal 16 XcmYc;N@ieSU}6YwzW#nftFZ_GCbtDQ literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/rows/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/GRCh38/topmed/1.0.ht/rows/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..de84d37052811c55080dc7c387504f6db46c7f1a GIT binary patch literal 592 zcmV-W0?KQj4|G zgrw@=?BYad5ld92Qny>FbMi)EpcFkk=A{c~Fbc9FXD|uIlPIQk{E(|(;%zYmeHx~w zyI-g7Dt+CX|978fqfNX2zOco`bdg@dvdXM$HmEp_5BOz@@Volq%b zO;4-OJkoS=4X`rGS&)LPJix6q7F}KRRO{xA0H{Gp8a*I4F9XW@DUjC@vcxV!5y7(d zvM)i*lvjr?vr)ie#jiJYBZtKaRM@*rUe8@meLJbwqWmSuTZ9)6?S)s4J%jRes=K7b zY`&vPPlUOj@ya$2Zo}>QUAk&`p4)6N-|bM!T*|w?8@BwO-=UpVqZ!)k{O63yH4Nju zbY|cMN`e4b0Rtlg0|4eP2 +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +#CHROM POS ID REF ALT QUAL FILTER INFO +1 69134 1 A G . . 
ALLELEID=2193183;CLNDISDB=MedGen:CN169374;CLNDN=not_specified;CLNHGVS=NC_000001.11:g.69134A>G;CLNREVSTAT=criteria_provided,_single_submitter;CLNSIG=Pathogenic;CLNVC=single_nucleotide_variant;CLNVCSO=SO:0001483;GENEINFO=OR4F5:79501;MC=SO:0001583|missense_variant;ORIGIN=1 +1 69314 2 T G . . ALLELEID=3374047;CLNDISDB=MedGen:CN169374;CLNDN=not_specified;CLNHGVS=NC_000001.11:g.69314T>G;CLNREVSTAT=criteria_provided,_single_submitter;CLNSIG=Uncertain_significance;CLNVC=single_nucleotide_variant;CLNVCSO=SO:0001483;GENEINFO=OR4F5:79501;MC=SO:0001583|missense_variant;ORIGIN=1 +1 69423 3 G A . . ALLELEID=3374048;CLNDISDB=MedGen:CN169374;CLNDN=not_specified;CLNHGVS=NC_000001.11:g.69423G>A;CLNREVSTAT=criteria_provided,_single_submitter;CLNSIG=Uncertain_significance;CLNVC=single_nucleotide_variant;CLNVCSO=SO:0001483;GENEINFO=OR4F5:79501;MC=SO:0001583|missense_variant;ORIGIN=1 +1 69581 4 C G . . ALLELEID=2238986;CLNDISDB=MedGen:CN169374;CLNDN=not_specified;CLNHGVS=NC_000001.11:g.69581C>G;CLNREVSTAT=criteria_provided,_single_submitter;CLNSIG=Uncertain_significance;CLNVC=single_nucleotide_variant;CLNVCSO=SO:0001483;GENEINFO=OR4F5:79501;MC=SO:0001583|missense_variant;ORIGIN=1 +1 69682 5 G A . . ALLELEID=2386655;CLNDISDB=MedGen:CN169374;CLNDN=not_specified;CLNHGVS=NC_000001.11:g.69682G>A;CLNREVSTAT=criteria_provided,_single_submitter;CLNSIG=Uncertain_significance;CLNVC=single_nucleotide_variant;CLNVCSO=SO:0001483;GENEINFO=OR4F5:79501;MC=SO:0001583|missense_variant;ORIGIN=1 +1 69731 6 T C . . ALLELEID=3374049;CLNDISDB=MedGen:CN169374;CLNDN=not_specified;CLNHGVS=NC_000001.11:g.69731T>C;CLNREVSTAT=criteria_provided,_single_submitter;CLNSIG=Uncertain_significance;CLNVC=single_nucleotide_variant;CLNVCSO=SO:0001483;GENEINFO=OR4F5:79501;MC=SO:0001583|missense_variant;ORIGIN=1 +1 69769 7 T C . . 
ALLELEID=2278803;CLNDISDB=MedGen:CN169374;CLNDN=not_specified;CLNHGVS=NC_000001.11:g.69769T>C;CLNREVSTAT=criteria_provided,_single_submitter;CLNSIG=Uncertain_significance;CLNVC=single_nucleotide_variant;CLNVCSO=SO:0001483;GENEINFO=OR4F5:79501;MC=SO:0001583|missense_variant;ORIGIN=1 +1 69995 8 G C . . ALLELEID=2333177;CLNDISDB=MedGen:CN169374;CLNDN=not_specified;CLNHGVS=NC_000001.11:g.69995G>C;CLNREVSTAT=criteria_provided,_single_submitter;CLNSIG=Uncertain_significance;CLNVC=single_nucleotide_variant;CLNVCSO=SO:0001483;GENEINFO=OR4F5:79501;MC=SO:0001583|missense_variant;ORIGIN=1 +1 925946 9 C G . . ALLELEID=1983057;CLNDISDB=MedGen:C3661900;CLNDN=not_provided;CLNHGVS=NC_000001.11:g.925946C>G;CLNREVSTAT=criteria_provided,_single_submitter;CLNSIG=Uncertain_significance;CLNVC=single_nucleotide_variant;CLNVCSO=SO:0001483;GENEINFO=SAMD11:148398;MC=SO:0001583|missense_variant;ORIGIN=1 +1 925952 10 G A . . ALLELEID=1003021;CLNDISDB=MedGen:C3661900;CLNDN=not_provided;CLNHGVS=NC_000001.11:g.925952G>A;CLNREVSTAT=criteria_provided,_single_submitter;CLNSIG=Uncertain_significance;CLNVC=single_nucleotide_variant;CLNVCSO=SO:0001483;GENEINFO=SAMD11:148398;MC=SO:0001583|missense_variant;ORIGIN=1;RS=1640863258 +1 925956 11 C T . . ALLELEID=1632777;CLNDISDB=MedGen:C3661900;CLNDN=not_provided;CLNHGVS=NC_000001.11:g.925956C>T;CLNREVSTAT=criteria_provided,_single_submitter;CLNSIG=Likely_benign;CLNVC=single_nucleotide_variant;CLNVCSO=SO:0001483;GENEINFO=SAMD11:148398;MC=SO:0001819|synonymous_variant;ORIGIN=1;RS=1342334044 +1 925956 11 C T . . ALLELEID=1632777;CLNDISDB=MedGen:C3661900;CLNDN=not_provided;CLNHGVS=NC_000001.11:g.925956C>T;CLNREVSTAT=criteria_provided,_single_submitter;CLNSIG=Likely_benign;CLNVC=single_nucleotide_variant;CLNVCSO=SO:0001483;GENEINFO=SAMD11:148398;MC=SO:0001819|synonymous_variant;ORIGIN=1;RS=1342334044 +MT 13112 693521 T C . . 
ALLELEID=680411;CLNDISDB=MONDO:MONDO:0009723,MedGen:C0023264,OMIM:256000,Orphanet:506;CLNDN=Leigh_syndrome;CLNHGVS=NC_012920.1:m.13112T>C;CLNREVSTAT=criteria_provided,_single_submitter;CLNSIG=Pathogenic;CLNVC=single_nucleotide_variant;CLNVCSO=SO:0001483;GENEINFO=MT-ND5:4540;ORIGIN=1;RS=1603224043 \ No newline at end of file diff --git a/v03_pipeline/var/test/reference_datasets/raw/exac.vcf b/v03_pipeline/var/test/reference_datasets/raw/exac.vcf new file mode 100644 index 000000000..3392033b9 --- /dev/null +++ b/v03_pipeline/var/test/reference_datasets/raw/exac.vcf @@ -0,0 +1,202 @@ +##fileformat=VCFv4.2 +##ALT= +##FILTER== 20 and DP >= 10)"> +##FILTER= +##FILTER= +##FILTER= -2.632 && InbreedingCoeff >-0.8"> +##FILTER= +##FILTER= +##FILTER= +##FILTER= +##FILTER= +##FILTER= +##FILTER= +##FILTER= +##FILTER= +##FILTER= +##FILTER= +##FILTER= +##FILTER= +##FORMAT= +##FORMAT= +##FORMAT= +##FORMAT= +##FORMAT= +##FORMAT= +##FORMAT= +##GATKCommandLine= +##GATKCommandLine= +##GATKCommandLine= +##GATKCommandLine= -2.632 && InbreedingCoeff >-0.8, InbreedingCoeff <= -0.8] filterName=[NewCut_Filter, InbreedingCoeff_Filter] genotypeFilterExpression=[] genotypeFilterName=[] clusterSize=3 clusterWindowSize=0 maskExtension=0 maskName=Mask filterNotInMask=false missingValuesInExpressionsShouldEvaluateAsFailing=false invalidatePreviousFilters=false filter_reads_with_N_cigar=false filter_mismatching_base_and_quals=false filter_bases_not_stored=false"> +##GVCFBlock=minGQ=0(inclusive),maxGQ=5(exclusive) +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= 0.05"> +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= 0.05"> +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= 
+##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= 0.05"> +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##LoF=Loss-of-function annotation (HC = High Confidence; LC = Low Confidence) +##LoF_filter=Reason for LoF not being HC +##LoF_flags=Possible warning flags for LoF +##LoF_info=Info used for LoF annotation +##VEP=v85 cache=/humgen/atgu1/fs03/konradk/vep//homo_sapiens/85_GRCh37 db=. sift=sift5.2.2 polyphen=2.2.2 COSMIC=71 ESP=20141103 gencode=GENCODE 19 HGMD-PUBLIC=20152 genebuild=2011-04 regbuild=13 ClinVar=201507 dbSNP=144 assembly=GRCh37.p13 +##ancestral=ancestral allele +##context=1 base context around the variant +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +##contig= +#CHROM POS ID REF ALT QUAL FILTER INFO +1 1046973 . 
G A,T 526.79 PASS AC=2,2;AC_AFR=0,0;AC_AMR=0,0;AC_Adj=0,1;AC_CONSANGUINEOUS=.,0;AC_EAS=0,0;AC_FEMALE=.,1;AC_FIN=0,0;AC_Het=0,1,0;AC_Hom=0,0;AC_MALE=.,0;AC_NFE=0,1;AC_OTH=0,0;AC_POPMAX=NA,1;AC_SAS=0,0;AF=1.702e-05,1.702e-05;AGE_HISTOGRAM_HET=.,0|0|0|0|0|0|0|0|0|0|0|0;AGE_HISTOGRAM_HOM=.,0|0|0|0|0|0|0|0|0|0|0|0;AN=117528;AN_AFR=3096;AN_AMR=860;AN_Adj=27700;AN_CONSANGUINEOUS=1488;AN_EAS=1442;AN_FEMALE=12128;AN_FIN=302;AN_MALE=15572;AN_NFE=13416;AN_OTH=228;AN_POPMAX=NA,13416;AN_SAS=8356;BaseQRankSum=0.439;CSQ=A|intron_variant|MODIFIER|AGRN|ENSG00000188157|Transcript|ENST00000379370|protein_coding||19/35|ENST00000379370.2:c.3388+16G>A|||||||rs751608650|1||1||SNV||HGNC|329|YES|||CCDS30551.1|ENSP00000368678|O00468|Q5XG79|UPI00001D7C8B|1||||||||||||||A:0&T:0|A:1.702e-05&T:1.702e-05|A:0&T:0|A:0&T:3.61e-05|A:0&T:0|A:0&T:0|A:0&T:7.454e-05|A:0&T:0|||||||||||||GGG|,T|intron_variant|MODIFIER|AGRN|ENSG00000188157|Transcript|ENST00000379370|protein_coding||19/35|ENST00000379370.2:c.3388+16G>T|||||||rs751608650|2||1||SNV||HGNC|329|YES|||CCDS30551.1|ENSP00000368678|O00468|Q5XG79|UPI00001D7C8B|1||||||||||||||A:0&T:0|A:1.702e-05&T:1.702e-05|A:0&T:0|A:0&T:3.61e-05|A:0&T:0|A:0&T:0|A:0&T:7.454e-05|A:0&T:0|||||||||||||GGG|,A|upstream_gene_variant|MODIFIER|AGRN|ENSG00000188157|Transcript|ENST00000419249|protein_coding||||||||||rs751608650|1|3583|1|cds_start_NF&cds_end_NF|SNV||HGNC|329|||||ENSP00000400771|||UPI000059CF46|1||||||||||||||A:0&T:0|A:1.702e-05&T:1.702e-05|A:0&T:0|A:0&T:3.61e-05|A:0&T:0|A:0&T:0|A:0&T:7.454e-05|A:0&T:0|||||||||||||GGG|,T|upstream_gene_variant|MODIFIER|AGRN|ENSG00000188157|Transcript|ENST00000419249|protein_coding||||||||||rs751608650|2|3583|1|cds_start_NF&cds_end_NF|SNV||HGNC|329|||||ENSP00000400771|||UPI000059CF46|1||||||||||||||A:0&T:0|A:1.702e-05&T:1.702e-05|A:0&T:0|A:0&T:3.61e-05|A:0&T:0|A:0&T:0|A:0&T:7.454e-05|A:0&T:0|||||||||||||GGG|,A|upstream_gene_variant|MODIFIER|AGRN|ENSG00000188157|Transcript|ENST00000466223|retained_intron||||||||||rs751608650|1|228|1||
SNV||HGNC|329|||||||||1||||||||||||||A:0&T:0|A:1.702e-05&T:1.702e-05|A:0&T:0|A:0&T:3.61e-05|A:0&T:0|A:0&T:0|A:0&T:7.454e-05|A:0&T:0|||||||||||||GGG|,T|upstream_gene_variant|MODIFIER|AGRN|ENSG00000188157|Transcript|ENST00000466223|retained_intron||||||||||rs751608650|2|228|1||SNV||HGNC|329|||||||||1||||||||||||||A:0&T:0|A:1.702e-05&T:1.702e-05|A:0&T:0|A:0&T:3.61e-05|A:0&T:0|A:0&T:0|A:0&T:7.454e-05|A:0&T:0|||||||||||||GGG|,A|upstream_gene_variant|MODIFIER|AGRN|ENSG00000188157|Transcript|ENST00000478677|retained_intron||||||||||rs751608650|1|502|1||SNV||HGNC|329|||||||||1||||||||||||||A:0&T:0|A:1.702e-05&T:1.702e-05|A:0&T:0|A:0&T:3.61e-05|A:0&T:0|A:0&T:0|A:0&T:7.454e-05|A:0&T:0|||||||||||||GGG|,T|upstream_gene_variant|MODIFIER|AGRN|ENSG00000188157|Transcript|ENST00000478677|retained_intron||||||||||rs751608650|2|502|1||SNV||HGNC|329|||||||||1||||||||||||||A:0&T:0|A:1.702e-05&T:1.702e-05|A:0&T:0|A:0&T:3.61e-05|A:0&T:0|A:0&T:0|A:0&T:7.454e-05|A:0&T:0|||||||||||||GGG|,A|downstream_gene_variant|MODIFIER|AGRN|ENSG00000188157|Transcript|ENST00000479707|retained_intron||||||||||rs751608650|1|624|1||SNV||HGNC|329|||||||||1||||||||||||||A:0&T:0|A:1.702e-05&T:1.702e-05|A:0&T:0|A:0&T:3.61e-05|A:0&T:0|A:0&T:0|A:0&T:7.454e-05|A:0&T:0|||||||||||||GGG|,T|downstream_gene_variant|MODIFIER|AGRN|ENSG00000188157|Transcript|ENST00000479707|retained_intron||||||||||rs751608650|2|624|1||SNV||HGNC|329|||||||||1||||||||||||||A:0&T:0|A:1.702e-05&T:1.702e-05|A:0&T:0|A:0&T:3.61e-05|A:0&T:0|A:0&T:0|A:0&T:7.454e-05|A:0&T:0|||||||||||||GGG|,A|upstream_gene_variant|MODIFIER|AGRN|ENSG00000188157|Transcript|ENST00000492947|retained_intron||||||||||rs751608650|1|1556|1||SNV||HGNC|329|||||||||1||||||||||||||A:0&T:0|A:1.702e-05&T:1.702e-05|A:0&T:0|A:0&T:3.61e-05|A:0&T:0|A:0&T:0|A:0&T:7.454e-05|A:0&T:0|||||||||||||GGG|,T|upstream_gene_variant|MODIFIER|AGRN|ENSG00000188157|Transcript|ENST00000492947|retained_intron||||||||||rs751608650|2|1556|1||SNV||HGNC|329|||||||||1||||||||||||||A:0&T:0|A:1.702e-05&T:1.7
02e-05|A:0&T:0|A:0&T:3.61e-05|A:0&T:0|A:0&T:0|A:0&T:7.454e-05|A:0&T:0|||||||||||||GGG|;ClippingRankSum=-3.580e-01;DOUBLETON_DIST=.,.;DP=500454;DP_HIST=21516|23185|5132|497|5360|2303|632|78|31|12|8|5|1|2|1|0|1|0|0|0,1|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0,0|1|0|0|1|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0;ESP_AC=0,0;ESP_AF_GLOBAL=0,0;ESP_AF_POPMAX=0,0;FS=9.313;GQ_HIST=1031|21164|2478|2925|20769|980|656|232|77|69|65|68|6174|1026|336|365|177|61|47|64,0|1|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0,0|0|0|0|0|0|0|0|1|0|0|0|0|0|0|0|0|0|0|1;GQ_MEAN=20.85;GQ_STDDEV=18.47;Het_AFR=0,0,0;Het_AMR=0,0,0;Het_EAS=0,0,0;Het_FIN=0,0,0;Het_NFE=0,1,0;Het_OTH=0,0,0;Het_SAS=0,0,0;Hom_AFR=0,0;Hom_AMR=0,0;Hom_CONSANGUINEOUS=.,0;Hom_EAS=0,0;Hom_FIN=0,0;Hom_NFE=0,0;Hom_OTH=0,0;Hom_SAS=0,0;InbreedingCoeff=0.0337;K1_RUN=G:1,G:1;K2_RUN=GG:0,GG:0;K3_RUN=GGC:0,GGC:0;KG_AC=0,0;KG_AF_GLOBAL=0,0;KG_AF_POPMAX=0,0;MQ=60.00;MQ0=0;MQRankSum=0.556;NCC=4154;POPMAX=NA,NFE;QD=11.45;ReadPosRankSum=0.968;VQSLOD=-7.116e-01;culprit=MQ \ No newline at end of file diff --git a/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_37.ht/.README.txt.crc b/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_37.ht/.README.txt.crc new file mode 100644 index 0000000000000000000000000000000000000000..202d5a4b2c258c4c144eacffd514460541df91cc GIT binary patch literal 12 TcmYc;N@ieSU}EUzuU-TI5pe?Y literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_37.ht/._SUCCESS.crc b/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_37.ht/._SUCCESS.crc new file mode 100644 index 0000000000000000000000000000000000000000..3b7b044936a890cd8d651d349a752d819d71d22c GIT binary patch literal 8 PcmYc;N@ieSU}69O2$TUk literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_37.ht/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_37.ht/.metadata.json.gz.crc new file mode 100644 index 
0000000000000000000000000000000000000000..4090709b13a5821b90df11d81d06d48b4708032a GIT binary patch literal 24 fcmYc;N@ieSU}E^mxQ3x5Wri1{;Bm#@Z@KIMRAdLm literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_data/test_gnomad_coding_noncoding_crdq_1.ht/README.txt b/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_37.ht/README.txt similarity index 78% rename from v03_pipeline/var/test/reference_data/test_gnomad_coding_noncoding_crdq_1.ht/README.txt rename to v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_37.ht/README.txt index 95969a3d8..1359f75de 100644 --- a/v03_pipeline/var/test/reference_data/test_gnomad_coding_noncoding_crdq_1.ht/README.txt +++ b/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_37.ht/README.txt @@ -1,3 +1,3 @@ This folder comprises a Hail (www.hail.is) native Table or MatrixTable. Written with version 0.2.130-bea04d9c79b5 - Created at 2024/09/23 00:15:49 \ No newline at end of file + Created at 2024/11/13 15:32:27 \ No newline at end of file diff --git a/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_37.ht/_SUCCESS b/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_37.ht/_SUCCESS new file mode 100644 index 000000000..e69de29bb diff --git a/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_37.ht/globals/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_37.ht/globals/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..036f69746b82c9e7092527f601994bd7307adbad GIT binary patch literal 16 XcmYc;N@ieSU}DJLykOoMe{XjHCHVy; literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_37.ht/globals/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_37.ht/globals/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..40495e5c5673c97c5eef102eb4223d2e882d4b75 GIT binary patch literal 776 
zcmV+j1NZzNiwFP!000000Hs!KZ<{a>{x5vmR3>fh+D&|0nslnvsco3{VG*(nJ`%TJ zQ`1dmgj)1kJ$58vG>KQlpm{L)rW9SxdOnyK zow%TCBV@(D(4rA^%mbeWmGRjZT~xUoFhrfKq`G@~(OyH`6{9!+>&NM)!P5F9EtVuH zjZFJkih0wUpL0QUB2}kS>&OLz>~zC3ZcwAP?Sr2h*5N#}3E^D&f(c~e99AgDDRN~_JRe%8tc1`xt zT&?P}TuX8;#~q-!0mHpYaIf*3OK&c_x#Z?@TQg^e*{UJm*5}!sOiKOk`}BHo@#THw z@JVI6mr3ejwM5bkFL;ecx^JK^w?h?eHdMPr@qeD^(uB(ryJ9;e(E+)O@q_Gegin_i zs*aq!v)v7mzh{4k=-y-GAb1}P`Y1gRqld(HM;aW)WtxI=zHRO%qfh5wM?018B@3Rh zj>#yK$unl^8O2r@^1oSu_uKp6Fn+z%=F^_MuWRudQ zte;Avq_G6?XYbw1ZH(j7M!nkqZlxVaxZ;|Sy|)ao0d)an0e2NLnNei|nyPQ&Xphf) zCKG#IFLa8XTZ>;Aw-KpZi}ZG>a%+KZY2r2_bsLf1+e6U>Lo}fZ4pTHZOz%b2KbPVd z+l&*$W>9G5hl%k@2u1g(7OVBqj@5#vFB5iZ6xTJ77ejo2jO{hl3QTJh;Sn)bFoLJW z2;LU-s*D(sF}CGtg1G~BFiya+JgktyUi8eMI9Huw?ue%*&FTryl1D5`4^}~W6_pZL zprUxlSV6-ZjUcV8^rog(Ninu9J2b)E0gVn|06R3n+^w_f%5vn9sJr*BLh_;ys>bkE zPi=Oo%*^_P>OQ#XRpm^wGq!cP_ugxqkVm4fEJxm2%MFIG7d->(9pO4h z12RT}G+gVP6N1Q`(&ABu0yk{wRmqrXVr+v#Boc{40MI@@GzW^`e=a+|B9Ub-6&%!k}!KPS)#gNP`9WbV}BDD_F5(K2j@v*LUwl z13_X@=o9+RhH62fFDUe-tqIrEhC;QoV6m7MBo2E)VXrt;gF>}9Wa78YjQcfrQw<8! 
zIIUzhS={w=BfqXH#E=S8VJb|8sW4qrOG7K35r@Jc%JQ9C%xrR{ZGZe*JHP+4>*kWj zy=A>WYrf^w=5D?@`|o!A{j=sQ?j1kR%Z`)n*2=6lcT-iLiP;|AcP3^(|LDGR7Wa;y z=Viyqc57w+tkvdjs&|RR=@auM601$;=5fA@npzFXjj3Y*smblGX;8#z@pflOjEjR9 z?a_#7T=!n&jr4l0D-nc?pQ8-Nj-Oq3b=q=*nz_=^YOLgV?(43b2E9TNY!FAU)Asl@oFlc?d zA@Q#uyDJnkSeMM4oLDCdIgMASXbZdR9v#RehuSpC`8QjfRwDoTn>KOWSu4x#3MaIn z(;g)_Zu_z0Bv%B&ddYS-_@@KscLiDR2mjP%UlH2=3)0Wp>zSDKx&y#~U>5@ZZ6E>P zatw5Hc?`P(0CfBX#C*$+fF)zQIBWJy%!=Ir0`8Azd<5Q$#O&5_zl);kp!%hb``r_B zQpt96O167avfZx!`O5iyRn{BUKeedOT9d4{KPw*hcKodIKI^<^VixS8qqH(_(#kAJ zD|05Xi^jsgPpRTtu9a>s16=e|$8V|QFSYEQmYveFr(#2~m^f?dnV1!DQBOy*i&o+p zAHOad>7sQWPGh(w5(705*nvP`uLtY_y9!j1&o*}pOm*+Qu5RvpebifbrEP!wTRXr1 zv+L%P$Gv5}d+%`;cb=CWC)=%+S$$&8+`QErj;sCt=IlS`eyw!IY25XO3v0h0A6P6F z&xXA??8PBxaqswfUUr;pw^n9V_*tvMpYZovTT@d@OG68y71h+xip8@tB5^oGTN|bc zeF0aDNrN&OK8bL%Vhjx}a8le>fgl6H&jCzitM2_u&~Z2LStIE96>#-Yt5HjHQd)$6#UpS6#(DbK);JAlu+tK;|LEEmuCx1gu` zQcaT>b=(U*F~1ZEQL^2j{u#>oy;Ig()IW8lN2J>Ro(z4~jx0esq?Orj6#kvZiEp`2 zk`{IRnC=oS`zDqW+laH7@D%ZkkA!ZD2hN#}JD4Ztbl8kZw%gA?|2V(z$$A6&r-sLhi|z)Z_IT3b=r_w_B?DH77b_ZWPEf}cGEIE z<0IHjzua`nk{})TT2IWM9O;p4cb0!@aelXw_1^MN-K0pDw*T?7zJl8Ntcg4mvtl=O zVg$XVm3fGOn{vUwrBvZt4$Ih0l>q7Zp&;?G>{M7LtPswoQJ#revYRFuaMK?=<73!G zY25>3Tpf4xiFt$xs3YkzCW~Z5L`0ILAZZL^7$6J`=$WKwP7f2n0YZW?Nh}l%2g6_> zEIJHC9Elj>3?f887=n%$1IZ4R0!@5EEmUXyRG=9q;BtnCR@TH3?0}98I6@TQTJn?l z>q{4a*T(n$#M`l5*?px^uth92ApG`EL*P+f-f|#y+?Sjggw%(N0?5COJNRm8ai^n9 zJe1%_OEf(?MWTcZKFMcs2t`Jd(6diUCW>qYJtMe1)#TMMHpDZ95k2gdCmd4KNOqCJ zb;$5t_Hl?)`-FjHM#TseVJq|}Q5lpwMP~(b;mW?rcM~ZtDXQxe9)=h#)kDxxKorW~ zo=ZI;$`8jWyMCo%vKIkazyPoh5uE~7G7Dva++r|ah#lEDdeBiim$i9Qjv5ullEV&1 zQMWp~d}zb;tdt#2+-arr?8uOF1;~PBl(7RvuE131)xToX^aO}rbWNaZjPg)HQ$+>XbIwJ0eNuVJH97 zF-S;Fu2b1~BUVjl0~{+wLs$mR+Ll2J9EUO8Cn3zM_s26CxmqvMdomG7_KubB05QQ%3We&mlqvQl63X*sllGgXxZGK2T0uPrr-r5 zoFIBc+R~aPC@G`5B1*;W#U}mES>P9^yNso1ZLfJ$b4K|?6ICMR z9%;T+Nd#l<#`Qp0mw0lo+vNKX77pAoL^*_I-#Gr;K=bTy5xX)U@fB!~8~6%`AhzJP zqL^(XGvE{GwogR0N6%@ktY&9$NrHT%ua#w{v(!?ape0pATy<2^Zcdr#hIzxH$d_)xcM%33-K-!(eGh=aO`V 
zyHSDp;hGfW23y_jtSgwW0}z;fGNWR%%gE?!7yx6)?cWm4L&rW@hu=EC_q~r*-yXT< z+y2_#Q}P+aH<6NhulMXixs1Ij^A!hLuSyZUi5$!u`&n?NSp*it$18{py!=7FyOnbS z(Os_E-rMCmpQmIB*@dJYX=bBjxxkE+T0s{(ekGnzu}fg}KhH$D`Hx%^jA1ox4ZvG5 zTL5ld?y5QvCDo0rPxWRS5w~od7AMOmp{=EDDY0zVCO7)D)StC!wN}eRlv+gaX*Z}O zOB;Rg5N$KS&>N!iFX}ITMZ>d+wj9V#GRzVOU6NWRqI)(qrnvz{6*-SUP^@mtq|jzw zlO*1JUtjRk*(kmGQIsZU(eFNTl4Rc01MVi@9zMS7k@ON`ZUZ(bJn~JaAs&0@{eF_l zjr^n9q&NwGKed?}-Zud4;xH$1b~Levl&LkPoFM&^4=GB3VMX_R`K9F|g?dR=)5eW6 zIksD;rtZck+H{!~ADob74j-q$tk1YrZZXgxU@G-mMV5tC9IYR%Zx{=Hi7@+$kke(J ztyD+cXwYvW+1v^FhH*8KB+Y4+PbhtA!Cn>UL2KGkt z`9Pcuz_1}Eb4zZqwft5)n3n+v5p#*v@&#(Rm4W?SzV#>~POg5*48=NsFbK1qK%!=U zq(J)sM6gF0tBdu}Zz%`5Bi3@UXm3IbSrw)la;&hxYaN}HGwir`HI%b04`<}Hn;^JN zrFM;Q#YuU{jiCzK_g>D*w6U)y1X{KBA#r;N$JrUHhAB!Vo{R(<(lbfThzNDk;MsdZ zsZfLbDPXQX%Y{6p5K!p^l@;Otc#4BZhch$!}nqR=C0e3OvWKb>mUg;KVkbIn0e4Kv@s2Z5S`i&=-_^~m1G1l#?SEH4=c9Kd_Xlj1Gt zd-=^e$UiBy!opE`(G#iauCR+tEI`MWOMbBlz1l@7`Bz0=Q2exXk%ht9Jw*y5TlVZ} zc}zJ(6+dM z5-t?FntJQ&F3Q8Glihiv^TkvsYtdqq>kw4BbC8LT0R$-LJSpPJf5_;$J*Eh9CcwBS neol%s2YxrZ!FxJ3O+=oN^A1H!tvW(}>4)J9{jxpC&j0`bO+iqn literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_37.ht/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_37.ht/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..b2a26b2121b5900f1e25a5b865691477930d14f4 GIT binary patch literal 1606 zcmV-M2D$kkiwFP!000000M%IEbKE8n{$K8?ol!GMoFDbQIX)-#G<6(bCNG{F1|cNT zAdmr_ef8Y`-bE4!Bwg$_^+O+S=J@znc9&(@)%_x8oT4pNnn|%(zIl7}^4DZhV#Wt> z@%-7#XD^PP|B_Pt{8e^L-X5o~7s*1=oGK~^)ksl}YHUg%4=7L!PC>lc)f8sF*uj0l zWs3Q7ZB#?dy~^FcEmjN*qfxqR7c5tdm}{71qDan#IeNKCa*9o(D5_`%TpdwjF`bBh zN+8S){5a)QFI!p3O}2c`B{pwfB{fb_h41{BiCm&TK3`$+ZQWEcy#90*1Lq=D6x0#L zsib)xPhL@+T}nMG>HRt$e_o9hh{vnTPV@aHF;pAGgrRDSxn77rYG>SWG-KHa1*>}jgbS{2tSwo;b zK*9xmfRW?;5i&BS0ZW_qaD9xx>^}KMcgt&T<0R>C#U@em);Xh{Sw?Z>c%vh=Hpek_ za@zaxJIydy<_F{;l6K=K9<-<wkphQ3^#oL?XqoW+ETvkxbS2(l8)4=Tp_AY33gJ$R3uj&+zu 
z+e91!QPyvt26(7b#+li9qo=0XQ`I$)h5!naX99lt*544n7zJR;+9+Wp1R>+2VPbZj zr5s!_ns{L2*=cuBYifM9D{i{f0kOS&lB5yrz@KP`tbS@6h8>>D21pH0Y{N|hC8`VI zEv?-)SZi9Pe1`C3?;|>#K@9K!=;3;>8 zH}Q~%r^_d9@$iH@W50tZ+?AK3i`bbcpf>1)1ylsz))76gctSQJBD5T4{gA1PWi92=>2|Hcz3lVMHzW5+n7%HWAZ`on(Ec^A-tMQ%D2 z(Onfu6syfYu@l>?i7;WAKhUx17z#l~;4$iBgbY-Y35-IGT41>LuFpTh?GHOtM+E-f7I5QVP3Wm;QlZb1;sgK(p4`Q%28V9;BVhi;L> z35W8_!LHhIPP3u&4j!IKhj_|aUBikozIx|77T?9K)*d^%SoL7g6aI#4-d?PAC_CmXiIUq7!uei~MghS!Q2??k2j+3fd=WI<$AOF^MHT`cbx z-KX02uf2Kn=>eNxzH6`s=VE^gY@%Mnr+sChK$*>55!7v+i-S>}Bz@?iyZD+>9 z9RK>o#>!E$rog9>1P}IWVwO!!Oc29&ztbJf7E1^zy5EDeetrA(x?Q^e8>~aV{x%l? E0Dt-pmH+?% literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_37.ht/rows/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_37.ht/rows/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..3ba9eb28cb11178f047b9d56224391555a81c9fd GIT binary patch literal 28 kcmYc;N@ieSU}E^mm~Lm9zU|cflLd2>r(cj-r5;rQ0ET%Au>b%7 literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_37.ht/rows/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_37.ht/rows/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..07dc2ffd3cbc4483f9a9ea4d0af13aa8e23425a2 GIT binary patch literal 2409 zcmV-v36}OBiwFP!000000NtA1bK5o&!2cIL_2%SWBPX$=H-9DdOwLKHOdc-d;XouP zp(X*E09~sY{qMIR@eNRp<0=pBL&sPw7Q2fD1pLBp6N+j)nNKchfgf;j{pEbo;Cwcj zPO@9PnasaVDwcN~+Ni2J=Ci*vO&V829V%6N?h(sHjd zpx*T8={>4C?Avi4K6h>A== zwHgP8lJjLWLzT!(F%B{RiFj5M*gDqW8bn;UQ`{BS_XaJqKJQvUXvwnwe!52VcG1;= zeDUErkX+Jbi*Z5d>WtyC40^9IDn2rvHuiQA2tJ)hZUln!k2+A%BFbwb1THcTvsYNa z9yKABh*a4cwMABvO5m2;uHMkc4~xssmtQWj&(|lHR~a-e0*$MScPFD7tfs`aYEo6s z1T=#+ZEJuFSfEplR6>ElEYh#wH*UrCe>^fSOZYIcu>U1X;S( zd(IfhT2k5owx{XI^ne7=VXN`{NTWF zj_*1ww|8>?4pCEiZyBsQBQh{k$3IJ?Zcr}V^pP4;WBk&WJ3kE&OU>Y5)kRe__e;HKHr$-r}wG 
zc-mUL>MA5yyLm=#mV|j1>*p_X!YwB^qtU>u84QKTrk}a-T`*jO^OiK?2LeZOduJ#o zgbJ&VpnwyD&}78L z(ms})d_t!2Lw}7P`rtV5)~9dx$D5v`m+(!!WR(xEay;kG3 zGFFxxo0J6A1xHrvcw4d3{W~iwwBo*`zcYIx`Jx#w*M?Oa885s$;3b#$w8P>L)fYT0 z;E5uzytvu5)go_wwa`fiC#6VD)ysov{A%Za63(SXGsdZZX9m&A+ns54l)w`^e|!V# zj=X{kqLJ%6m`b9pFZpNuP(G_j)4*jd>Nr2O&oujxJzrSq#nrhMyN~1FQB(E1a^WXu z^zH4wn-fp@olo1HFJt$GKK8cv7{QlMi}xR#L3vj-EfF@iEw}5om6+~cVLA`cy_f(0 z<=yq!`tayq2EYwFe$iiz6@D`!1&$0iaWFqc@}$?b`HB*t;y6ZP0!1GT;|g+ioqVCU_Jy}U|~)>7l7^M(D10&er9k{ z*S=@iR}3!H+RqF<$#Owfl41GhpTb32Xb6@PBmkCokXv-6Eeymb0j{)y;xB1zrIqSv zSnUrp%*y|1mzT^4SSb^l&cwLNDER$d=!9lfW2j22xuhxusyw|YLms9Yqg0-5ozP9h z?;@%(LgngeKo#++F+AlC)^utNPPw|8O^u-`SCg{Orq+WVSj$&?dWAsUz z8HZ2L^VVZdIxjmTh=7ylqI(ir+%!s?f=A}=96g4bc4tCk6>|VTXBpFYNhy!{AHpSF z?;{mGN|tuV=^`aHLm0XNN)AubeHJGLG|7iaAxrX5(s!oVa3miidEpgH^N6r1)A&gB zXnN!}2Y3{-BUi^qN2*7YBh!@N$Z0fgLt`V48rjpOwLVH_WdBsi$Q2C4$dUA{W2F32 zxJao6v?v9OQmiOKMY=cxQsj#L8NAZ9m`DxqDUlKOkVv&OBYI>P`GiP`JwTLVLt|*j zAwv!rGF-^TLWT;BK%p@vHANs^`;4wyZ zIC3lhgCL0fkS&PeQp@`9+FD$FL}Cr{Uw{2iuP&vK-m?FC?Yll{bD?BOS}x1{VfbJT z5`Bv|92z6vIPL3mxb0LOPudMc;bw-^SZ9Xq_9g}XY#dG$-svp8e<~$3UPV3y-uy$V z4~}(;sw*g(AFHN*z82GT{q*H3e?4#(EOuOlgJi!IH_XvuI1!7qM^7w56TVk?aoAzd zVaQJ}(^M!Uf&3FI7n1ga{|?aSz-Gs{vYkld@Ix=&68ZJ;{Od$E4t_g8`SCGE$FqZ@ zqhoyV@@O_YSiVHB4qhKEf153jjt^(EqW{?Vk8AmMIAt9z3+2gmgA-b1eV>?q8KZ literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_37.ht/rows/parts/.part-0-ac88ea82-778e-4722-b4a5-67b02b78322d.crc b/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_37.ht/rows/parts/.part-0-ac88ea82-778e-4722-b4a5-67b02b78322d.crc new file mode 100644 index 0000000000000000000000000000000000000000..3917586c696ed69a79304d88eb538f1663b7cd17 GIT binary patch literal 28 kcmYc;N@ieSU}Cr{$o}Bo17}U4D^<%sybS5Rap;{A0FtK+r2qf` literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_37.ht/rows/parts/part-0-ac88ea82-778e-4722-b4a5-67b02b78322d 
b/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_37.ht/rows/parts/part-0-ac88ea82-778e-4722-b4a5-67b02b78322d new file mode 100644 index 0000000000000000000000000000000000000000..b9257c7030f37f752d0d9ee97f5c5bee4ca76764 GIT binary patch literal 2166 zcmV-+2#NP$2mk=jBLDy>wJ-f(&mx^d02*Dq6+SRvR%6`(ENCL->58UMVi6-NB}7Uf zEwM!bmuXmQIVqzN?a27xtTXn}88#O`$8)Tx&`|7R*(Ye8w)d7PH33TjS^@jZht~`u zdzipKl;8%FLsEB!hJ}A*;12Wh_r*Taf`0*gTdV6X*>#uHe}KEa#W$t@r@KwzNRmR| z30?YcLbRpGCUUE%L=FCb8_muD~fD^zC`afRTR3UJd z%+S;NU!vPzhD4?dreFOV<&+#HDMI=qf|2q56>CHXbsuk$>>y?*2m6FPiEZ?gdjG=dGSi~hCX?t00v zz%`il4^N4n>4&%cPB;D~Lc4>FIVbarKlqpH{7Qg(tj3rj9oRwtao@%+fgQDcZ}czs zIMrKJ5%UP`_7js`2Qz+&;$O;edyzR((5DmuA1JOAqz1jKl-;&6K% zF7UHKjq6kEa>r@QZb;tmtZTYk(%T;3$UL05Q(!82HWH~(vQdes_)iAzDb^VvHUoxq z`>E-yn#6ewqaJSE*f0(ZffbP;aV&BeczZ>Zy`~a{p6EYq(*SZn4@qkBp9p8Gp)m5) z+9dqLc9)B65n6LB^Ekz@Z3TjjX_y-_kbAV!1CQb+({A)Vmk^gex zFXufrjJp^Lssq?WUy+s6?6JHNZ+mcYrT{^z+f^=HbMSW9(SOy8n8x4sum+C-oZnUd zRDUDt*dVAFiK)5sVCrl#n0;o}V+{bYxE?$0OHmg4hS3zZZ;)_Hm-!Ei`SqQ{gH&7p zi~st}uj#G_ku8i6Oz1lKOev|9Fl9k0e*n@cX*?d4j<9?r7)}~)FGq-&ayZf8KkB=w zE$o2hAzi@#0Kn0ixwI~quurO3(-_($F)af_I&n%`f{_~|1@4{x7F77h4C#jB!)|ZutNGLPyBSxHqcqr3pBAl{; zq|=gf@COPh1@RAj9=;u5Vz!i3AYcU)Dg(Sq zt-g$J!8RwMw!j00gWV7)mqP3%(%)mg)*(8Syx*v=GClh zD(5MtatU2xST4<|)FdV}tLesVL$hw%wgwJc)3VsLE}D0H?YdPdwHQ{lQ#nVQgsPhP zoSRH%wdv8LM~|FMWwc40W&jJ5$)I5zx;`K=n@lK^sRRQs$a>T}#`pVXa@&>h#HNdtK{Z->loVuG@BvsR~xS zDqa<<#cA=XSX?5jSX1jWfsv>oJ&8y}0ud3Bq##K;0S3q*g{p`i5@2B2q7Oy$4}DXY^ID^oH36O3&d zeM?Ej39E{@dfiQqCTEi?@mM6W4g897c z(WE44Vsk?qlHSsnv+zBdW#;3_k{1ghh(ZGAME*UFKBduxl`INiwW@P4Dg}npU|0%7 z2uKeO5*F3niu#;QxQwK{v~ao6Y!a=C^Tf8Qt5~Z{G{=i$BFYup@IOaQL{itBZ|#lj zre7ArBxzJ8nwB-ksP!5Z4@p++t{*eCv02vCDRWe1R+>qZVrRpP&}!E}Y7H|Qv(iaN zyEerz;wVRd=LJ>|EX%%1HzL8dUT`c9>ZFLX^UTC>BVFT#^ynVj_BzT^44^75Ll3Y? 
zkr`xXB2s_5lUM@w=Xvzh{KN_x6+76(8en>RM_Efx7$rh z6d2}_D~JgiWQ-4RG)LfFzgLH!fxx1p+^}SFCXc)a)>mjS7nBqp9wQ&g*D)Io+hWh+ s&Z>CV*}{nfIhSJ}YL9b4N`_&{^DUgG4FCWD00000D77#BAOHaX03>H51ONa4 literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/.README.txt.crc b/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/.README.txt.crc new file mode 100644 index 0000000000000000000000000000000000000000..8c34199dd969af1df2f08f60cc1dfec4a44e4716 GIT binary patch literal 12 TcmYc;N@ieSU}7jdX%+$i5?=!U literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/._SUCCESS.crc b/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/._SUCCESS.crc new file mode 100644 index 0000000000000000000000000000000000000000..3b7b044936a890cd8d651d349a752d819d71d22c GIT binary patch literal 8 PcmYc;N@ieSU}69O2$TUk literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..04e94db9f3e346478cf803bcf25461b001594fa3 GIT binary patch literal 24 fcmYc;N@ieSU}Df%CmxqLv-P(BoNG-djRU0sS;Yv7 literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_data/test_interval_1.ht/README.txt b/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/README.txt similarity index 78% rename from v03_pipeline/var/test/reference_data/test_interval_1.ht/README.txt rename to v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/README.txt index 3d9a5ac98..d66e21027 100644 --- a/v03_pipeline/var/test/reference_data/test_interval_1.ht/README.txt +++ b/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/README.txt @@ -1,3 +1,3 @@ This folder comprises a Hail (www.hail.is) native Table or MatrixTable. 
Written with version 0.2.130-bea04d9c79b5 - Created at 2024/05/20 13:22:32 \ No newline at end of file + Created at 2024/11/13 15:27:31 \ No newline at end of file diff --git a/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/_SUCCESS b/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/_SUCCESS new file mode 100644 index 000000000..e69de29bb diff --git a/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/globals/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/globals/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..b8ae778a4f116d9eb190fbc896d6fca231fad880 GIT binary patch literal 16 XcmYc;N@ieSU}BJ1FHv^$$<+`5BH;yg literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/globals/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/globals/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..fd09f8a9e7f142c42f1034fd7a0e232a237750c5 GIT binary patch literal 898 zcmV-|1AY7-iwFP!000000IiqZZ__Xk!2g%L5~ML;LVcsGNJt>04e`KKd9Hm)%^W+k zokE%NzjJPq#)(}RAA4zi-*>+JvOZ*rR*)^SkA{N*e)akCx(0Th;K_c!BbV1~KSU?~(e z`+P@b1IJ^#Ugw7>LUH&`xL|Ja_!#e$dw$;b5qurQ(yEq#JvAx}bTC9aTKtD{F~~nv zOG`jk>x|emPH455xNd|Zz)P?}>sQZ1~{2I8LF> z{q*JEx%!;WP@#PBJ8A^o$jPX?+SYxJ$C&zhHU@xuJEoy1#?tz{Xn`ZaaguvRYHDx{ zc(&8pYaCiFsVlUpyhdvYRONW7N3jj$TTyPweM5&q{LO_1cDHA@quuSZ@Nj+`OweDy zpX^W)vEs+L+_}T*$PRnMy`J^QDCdJ}jBa2DG8S80QU+ASnwqLo;|2$s`Q6pqz0`H= z$~RUe#8wn8kw?pYy#*C%+I(m1i5|VivMp3J4(N6D-feXpGPiJVpw_8eP?J*X4VCj$ zkZ8pVw;=1OglScXa+H;zJoIDtiYg!$B(QiO?SZlly6(x<>h`xu84knI}iM}8Wx6SLEB{S8`EP)#HCG*IK| ze`;{YrpuEL7rst`F^R<#5c^MX6oNyB5P?JVIRHEP41cG= z*M~`9Z%8_W-Z6fFyaeoqxC82fa}T;+GRIsmPC;%+9U*QQng`r5(Zt8YZPOG5nA!i( z?^nxb=dYHdD$m$fNFJChrDk^+WMZ`O!fRZfY;p1O<@1Z!?TdB%;QEJtSTWFQ?%Dc3 zYyux1_g3H&h`$Rme|UP9`QHa$uW-#^iCpDg^^jSIN(_rmpm_I1C?*?@@1}zaFAa{e YmACPd)<>hx8S8!8pZjm4QJxF{04U46#{d8T literal 0 
HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/globals/parts/.part-0.crc b/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/globals/parts/.part-0.crc new file mode 100644 index 0000000000000000000000000000000000000000..5fd2a8f0f815c562d10f0f3c336e0c8d5f22cf2e GIT binary patch literal 60 zcmV-C0K@-da$^7h00IE+2Hfl(&m4gTvdjS4RiH9Dpw8E-9NM=a&jw+O90CTw`W48i S9;ulpDDyyj6~;mp0J`Z&GD8rUR@(pxgA5hzU2tFoo8tI%Uyi4$ z<|LygQ*y{miR~1*xpI;lJQE2yi4F__+B9q;QJZW-%xpuazW04=F1A_s2yFv*1FiyF z_%t2~oQbsqKbTme2_=&7;)pXy7*PZfGT3XzMHd9Dh7ckMGeDI2 zVJRj{QZvQ?o)}?}0mc_zbg{(aV#%rpaVf@wxD+EJF2#Hhmts5Uki!^pDaOQ8BMkz) zHfXHT#5;qH8D*?%hP*!bD8So84?hI(^1$N`J65lb09IjeFkl5n%mAw!Y^Z?-CaW|} zR$Gt+0KiiK0r}h?56=Jsn0ozoc$zTT`uucw0vPf*e>pr$gm4~zI6Of>EB;nE6Z8Cx zK4d@fhXiM0oF1Hs$1ZQH>X4~<~{CGeJJwTHQEzKa1{U{~LVk=Ch(}vLEjUnUE(YjE^@a$OCi3kY|IBheOb4Tf)Hx9ROj0dUCoF zo6W=?kA@kBJQ->T@?!AuV0dB3bKy&n=f!KmrwSEui3S;15rHtcV2YI0f*y|rF5-~4 zLKiW}Qz3;_^znEQf4m+y;#viQ$fHNep4J*;4p|DJ6+XFi~7(v!z!|DQTAUf~~Jm zM;4pSL~(tEIWpP$3UO3?<0`tdzQPr{waTv8Zyh&#aK>@8jkeJ?+D6-GTgMY;wAofv z+2eS0>#gycdy;%^Q?FZ}9%r0i`-ySNjme!+o|O)FoQZMRmUmRgTYcdLxOBOqE}L^m z;77|S_E^48U4IHvWpit!&+1-!BsS-eyvq4U@+CKs__;L_sMG0?kZA1x>Gxe)&;M&3 z)Hmg`$*MWMt&4UJQjp8$rLk(3@_jEh=kiP`oRxTU{ju&iSH5ESg+;E<*=fF#l6FSR z-JUKOEGx-iCn$S`mv+_Ln*GlAel)+i_C0U#eNV1a6(}fEqA)pGxlmA;nwpp{6qVW( zr`iTLg9U#!2FAcJwmt&5hw@WgH6Yv4E*;cELS#tADQYD#y516WA z9IBT{_MGG7=K7bopF&>8ESXpC`EPL#j)&w5Hn1F>rzY~LrsKI46K5y#sV20xPg5H^ zUD1Gis`W&9Ym>;K@z8@D;tncHJ&eQJTk=a z%>3bvA>(-=#M@Hl3L{j$Fu_p7B`8b)PTt(%t!d+(VZ$3k#uL-UGog;t3zaWLT$-c- zTEmkEP9f&U#KbG#=*|jA2M`4&j)XMmw6JAI4Hr6G&d4zX2u>STto8r`0XPa^;iCWw zi4&SDRaPjVf&_!ADM*^Cq9CbNA)!D5BuaJy7%~87VxXhZr~sUafr>^32*8;bh-h43 z0Gx?|o{TI3a3%(FGTtUQ69XwQCcp$|VkHQS2mI(P5y}Yy!uj!8^(gVaBmnR`ohNB zQszrTO;8w*3^6<}gj`|7GxG-%3`1bR41odTPs0b0Y82Tmd8=PBaH23|ND zBa8{60tlfg(DcNS#16BKT6KX9Rtky15uUlg2-66qG8kOZ1WX=HnrLISCJ~R1L!7+1 zqe&WXO&d>4m(z=Ph7HezI$CP9XsOX+^~J}Nz=yz>1ZQHO14aXaJRR;hUJ^d)xj9vC 
zNol#6BK)%58n4_eH~wCmo15I5C5g9LmMSjw<{Nw6Sr(hR#3bK(#hYt1cQQ-%`nK!* zW3@$d@BEkeBsbR^_uq;oeazJ+T1GccOVV>TuXuaTlIPZBit4oeIG^OHKhHhqvM0ai zYfDN?_PWIEs@I-B*;?D3%bnXSrQR9s`EUKH^548{Nom3Tcl>>+KVG?SNtt=QwQTk{ zdu@`PlbPLpzH`c&x+TK?oB(Hre)AI5I#l0Cwr8~EtG@>y>p|EdTNMB;P&X)^?Lw&8mt?(t=tBO(axJ1E(rYBor1aP*|#vECI zH+w91s`Hav+Z}bMIzL^m?@FB~_o~J}o450`I%glOy zPkzy@Z!SA;`nv0K+4|YbUsm;bf6MMEH>>(2f6H#|a@`TGc75O1ZI)|pd^3u@ z&+GL%{db2|*6S#0PF=3=i{SH{>(BCAYL6@UJ}LiqUR$)|s4JB{zTuBemurN)wR)c1 zv(s3yfe5(9uB+y)Gng0`3%_$PvIhcR@_4iww znD-lqghcb~|0#Q~)SABbI>+8t?tHzSH(jvk#J;_|+t(^O@mX|RM@_XRBXsMVQ!{Jz zb~i3>T4LT=7T>&Lf4zoM1EJLNKk?nQprmE%Z>!Yb>yN*`TjSqz`QdxR9ladtf;TO3 zzMY?4tE$%J_zfRPNkv6v$^8U>8bOxaPooU}^hR)hTD0}WADiCmocnt8)$#Ot9Z&t$aW&sARTf)0RJjpJQm?q&*tIiCQf2?;P~}F4 z`x0OSc8=Q&9RJpkz3g+ATXkN!vm7mw+n>+6O^t(KDvwFj-#yR|LeXlR-`1;(c zICrz>sqXr%Z!0hFx4T_4@AA~$b?kKtb-|*Q_w)Lb-(1%H=YILWMZeT0NoJ4b`#^N- zn{TPli;u|sjjmU5YwDafxArYZcOmeVr>!F@HKXfKey@r@|F?D2R5NNFL8%$tzgu4B zk9W^s^%q~SbNcVe@85aNYQAZSnJ&1w-umwPd9U9Z|IN$$aqswUe7)v-fLtC>fQQ(TXX-sv)*f+ z|EKm_=g&K<&D>dX=ak*4oW;BARjVcGTa_jA##zmt%p0%#+`8wl>XSUVN}gQpf9vee zlas8^l|F~2C~KskTpA630#{s5WG z1sg_Z!X~V}hIWtq@Qc7R7H}Ss|ML?a_oO7aoB^?dnzn^|{_UK{E`?bM#i5leu=0I#G<4ztSY@R8)JL8TuVAajaOSU)n5&(x5tH9Y|>lsDvsT zNg!G}BM6Q0wWCiFgD%{uW6IQbMN+!>QxvOG%FT}yK2SEFlbExtWO6aN^gH)DJb6=+ z6vf=(g#}&#=+NTfQ1gIt7SUP}8T+;)kdsFuWV!@UShy^k;nQ#~{o3RCp`RQ9ON0R{ z(D8OwRryp=(0eB7=Z@51mBlHfH^KQHd5B0cp4%tfW_kM{d~d* z2(UX0EQ@3+|LSX(<|0CTG9wUF!=*uA$rG-h?5%P53;zAYF`<}w+*%mm`omlLe3R&8 za#|y7_@nv=(E`!Cz!44F1Frm$2I}>!=h-y1lSBqLvMH1@-cIx$Ukz$pqRoZqU%sM_ z978mr3pM70&jW6IPzsnywenFr*AE;#6@yfBd*Y8#hBY43$@k!N!qOo?j4K?#z?~H1 zcf`1E8cClRyDTZiQ{Z}VdSkSJEQYR=LVXV|Daoan4aot*#RXJ2vSSt1V**QWgj>bU z0Z}Xk$Hrt1^F1LT2qVLaJXvuwUD*+FAosMeMW{t~)ckj+YNE&fU_XW>v|VQ&Rt_Sg zA6pZ<_H3&IOCBFB5NcVU?#x&McsGZM)^RY@kn%So~tAq9EDrY(>HccGpp zRVAhWgIZNoC17?3siz~|bh3~lj4Uqp>sx|2Em%20pvDX}nQc3V*WQIMEGGtrK^Jx- zr-HawA*6iBhjZ&s;S6mis=_T|0h{2Qsemze8-yj1TXgvU{8N&tqKKzs1MksF8$yB4 zjhE1`Aym+fpV0^rbf6Hy`0fY;A$XCZQ)d^+aH;gu?LcRY)Bk6IWYYzs 
zTYoqo5RWlMXxOmRU|Go944h2hAbl`n@B^pzit4kBoH<58!Kf1ZdoH7=b3x9T-tclI zT|T19?mm~%&>15~<~O+fVXrR;xu5$U7~50%8^`IcOvb=kpy>V>r0^JF1|x$M2uo zq1wfMrIwCLL!6SbI1D){_!TDQu<_`dGV!bRvDDk5^upU;kUcvsv!^HCN8a?|{ez>* ziz=u(t;BbhuXNz=F*gZ^G7s0SWj&x=LQ1Z#4)9!^ZU9e|*~|i1v4hk$BvZJJwRGti z)-12BMSX9#U1t>0om4-thR7M5u>(g*D#3!JEB6E>5b^GMZ90!H7;^E}lLF~ag-&>M zRXj0(A!mqwD&kAtRK4hSJ4mz_e?6YbO5wv2zgv=+F`>RS;rLRYfvr`!%}5E{5E>~;e1IPMTF5Gjhz z1ZGB~jtLCUXcrA8=)cmma}S$){dgfc$H0fDGUUD1b9Pyt{s2^{kM)x1o;QZ{zH2_)e zv*>IHTjTmZm^EgjEn`ucLxGvOoh7=g;_yeV@>Ty$4g=vc(wom@?|YecwN~&qn(wYa z-WXZwm*%JO9ymO=L8#{#mS6P}$-U9Scs+xjRt=hA%@U0a;m^+`;dv<>NG2r*&r6}8 zc=&!Xt|q@b5A^1QWR6lpv_(y9>-QC}EhcZtv%u24AYW~NiHoUz)31f!sS`$jYpX<~ zLdd&snVLgvX9fk$uLUYCAN!3eB3C z>R3d<=EWl09&yy`IxsAo+@C3%^X@|VDkxa35w0{=&HKiR*=7#3XR2p8+R>^fZEj1A zM|Q8hB^9#GthB*r3%ARs^lcl)g!~X^7%30wa%u1dGaXEAE6$!`-L%vG%a{}b41Xkq z?-hG0-C(P>TNA>Fva8d_426?$6N3G$A<=H~CO?tD<44~| z9q5^yR5!n z0R~|)Y;h^t6b#nZC($kId(=(DmVOe-X{sfH)?!tTr#wFStDE}Gm=19v#%wFjnC+72pSm;Rq0bo|s{Z zQwyMg!Cu}QZ6neLWqH6LBLcigWI&)~#N$m^gA~;Gl~s@dklkevrV|=7{x54HL!47# z!5^`wV|Mf&n(c(lhF(PLJ0voSbFION_u8U2FT$@Q3v!vKfbi&xU@Bx`Wb#Z`#X*qN zrSqGE_P^C#W-bm9tcy}79i2{>ytksnq46W0NopEl5==@b{(u*n2 zdTon$7A}wuy~~8FQmkQl2EV*WorUjVT2SF_rHf~cPL?}#vRt1L;^HiV^N907C<5O8 z_T8tT*Bu=1!1AWx?b%L!&=Y0*TAHjlvqT@TL+zS5F+naDJZuod@VL^dss7YU2 zNvF}RLW*hgY3$G}*iYB3GRh5^$1|Y17x?mOBj~`Rws}GgoB=4S!%lM!-n8P7r1*_N zer@okRYQV|zW0I{_-OBP)FnfEtWi8LV^}Kaxw?b2$vGVvA2eHHlmPBn82s>Ga6RZ^ zmF5;*4zi%eSfQWGMQ`XW+-|i2zO+O3w==hW^Zj;9gCKLq;h0lNmSnCs8M^U|AJ9gH zmbV0Xh(4~%rG~?XP&|!1juO}A_VOElsNrDYTiDs6+E$KJN)3kq193i@U)%4&>|8j! 
z1Tx8~2$=ysoVbyj{k^zdA+R~Z)&cp+_4aLh3>SY8!J6m0-8^wgOCx zJxww88dhlCg!Y(i!^7$L-`CNi^!{Fjg$2$gv*6^Y?;~9726)`gZSy`2`mb_mK`5uO z8e)fc)@T|EOpg0L@p9)t2Gzlz*H7(C2h-8-kEqykz*Rd!oIn+j8kqs-<31l@u{l&6 z5zi4PzA73nX*?A+I(d>1x7lE^+=9vy3G{&$V7wY?O8G4&@k4u?*Qh z@6xJ}B2F45SX{J#SZK0hpoNS3%-?P7I>hw(8<@yjpQO&V;>DE@&vfNMGnIGbeS8b2 zjJ3_CaAao9c=(x~^)SEcUwYEB^w3ioKXotV(d6QS4m7E6z9j-KTteh%db8A${_{Ma zw}T%CZ-g-$j%uMvE84*lNxiN`Yp_ zDC&K!d;$dNl+R7sMa&w%FY&Y4O?uj1@&J$bSkL)Yc!r`uQ9CJc1q}cI0000004TLD J{U87V002PBlZyZV literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/index/part-0-90a97bd8-3648-4074-89bd-3a64d58266e2.idx/.index.crc b/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/index/part-0-90a97bd8-3648-4074-89bd-3a64d58266e2.idx/.index.crc new file mode 100644 index 0000000000000000000000000000000000000000..9933a2de564be1a59afe40ba530bfd43b3051ece GIT binary patch literal 12 TcmYc;N@ieSU}DgT(tQ8`5O@O4 literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/index/part-0-90a97bd8-3648-4074-89bd-3a64d58266e2.idx/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/index/part-0-90a97bd8-3648-4074-89bd-3a64d58266e2.idx/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..1ebb40475832fe624a5f861dbde3c00c6394b5eb GIT binary patch literal 12 TcmYc;N@ieSU}E6*>pltq5OD(J literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/index/part-0-90a97bd8-3648-4074-89bd-3a64d58266e2.idx/index b/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/index/part-0-90a97bd8-3648-4074-89bd-3a64d58266e2.idx/index new file mode 100644 index 0000000000000000000000000000000000000000..b77f409e3e3f73eecc6087d18748856c7c36baf8 GIT binary patch literal 71 zcmb1VU|3wH866oEfMUWx3|1mM6R3a@$U{{F E0JK*OjQ{`u literal 0 HcmV?d00001 diff --git 
a/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/index/part-0-90a97bd8-3648-4074-89bd-3a64d58266e2.idx/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/index/part-0-90a97bd8-3648-4074-89bd-3a64d58266e2.idx/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..172607a22a25fc75d8e6556b90df6cfebbf5d65d GIT binary patch literal 185 zcmV;q07m~GiwFP!0000009B5`3c@fDME_+^3OQ6;OU+FLJt!zDUc^JJ+a?$iNw#1p z{dc$CyetDV^LBb@jKwR4XuJbimRV5=cm(BYZCtxiHTeLt$d;u5ji%KEaxUP7>5|?E z63!I5ntJQ&F3SCQC)?9n=hajwYtdqqs}NMWagd4k0R$-LJSpOr|B%shJ4_MeOn`Au n{2Uc&4*WE`!FxJ1O+=oNFPKD3eRPCe>4)J9c<2+M&j0`bP##cr literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..f8b50dec6f7ac3a3649581403b128c57cb371e62 GIT binary patch literal 1713 zcmV;i22S}OiwFP!000000PR^zbK5o${x3SUGi)Y_^Qd)`EvJ*I)5KPqi^qe3z$J+Z z2+#m1Tk+_>cL5S0NLkIJ^{Kts#D~ScfPDbEpA}pH*?=}&$=UqX>!attX0wWNF@UqD zPo6({cJ%a@HPELo@?-Y;c>Qvg%`_CCK{7BzgCGsH6%r39l32P%_IBF<&U|!(`%bbGbr%wFLfmu1%dyS~mm5tljKksRmeDbZVLkDm7=jt?=k zZcrvut+YMUKtqeKi$a>emqbgzIfK}`wdkK<$(l<7c?qUx|3)ZkU%j}_+=Nqatc)b4 zrb2+8djQ9R>M{&>S*~uSHw8tTksc` zio}RspBlCHE9Si+yUh7Te9r>M-)hRGQ*T2xcLyN&r+0bJ^=o8K`dS0zj)o~Hil9g@ z4d>C|4He{u;Zf9RCXP+evq5Jxg4#mqdVZpm04g)|e*{fLD_&OQrlrBuhrIu>#WOZO zJXzRq+!W`Fz?c~R1;Vaclo0|e>c(iE2QX}nh7E`*Xz(;JpnQrqRF(=4O2o$=ZmO+N z&2&+|HgXR%ys&%RmDn~F$O*p7Hrnjj;xNN1sOglAn`ySTQd35mfg25t+UE}q>9WPZ z9`;mrPz9n%$4tqBmjNj;8)0lr7uHk)F(80VP2Tx*(&Dl$xjb7e-v7^lKNVeUL5Q#Ph!RqsJ;$ zt1sJ9*^u-Q#^}4wO5N1-F32yXs%bv|agNT9gi00-mbjJe=k>qHa_92z*mMMrUq*gz zbQ}|mPuB-c^ec&*PFH(`jOBK0_ehT)AlDE8e&@_9++=T%!rS4#tWkTHq*A#E+(o5p3vsKg&3+%t)%klfU|*IiY?~0Vbh&r z7pwF4%cSa!t74qk47Kmm<%%qPrXg>a(`4f&SC(pP#x8OAnam$Rk=pK+XP>Quz%H#3 z8fJ-DvJ({8Rq9HYJ= 
z^W(1TV->atu`}z=FVnnn4J;{8oYGU(j3WwLcF1oU$`<<^rMExgoi$hPY7h+gzO8&f z^x&OC`XYb4tCY^_Q6uaEAVhm=_u#1Sv5R8=rwC;{kDKc96_2B-;S`GsXLu{>PYoLu zvRvYwCehCWLogxfWHcgHdEcCUi-X;@hPq?lEzYr;ayc)U#gc3;7DY1m`OCxVkh(ynM1!086EPItuzv#N`S zCG~}u3--qaaVW4>zdaE?Zf6Qtm&TC1{Uu5Eae&B&JqyKm}N7j>PAVB z*39Pjv;LXOeY&=4HlJO7TAl`yRRhewx5r2JK)*v&-G2KzPmGJOR9fzvfBzbjGWZat zPW{Z3S)jjqcJ26R)d14Nh(iSXHDfB*Wpc@^*!~V%$Y*ohF>reu*8ATH{L%yL~ HpcnuEY<*`z literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/rows/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/rows/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..f0334b79bae36f8312e7e049d7d9d8dfb9a5f79b GIT binary patch literal 28 kcmYc;N@ieSU}CU+ke<~0d&&1-r@h}Em0fa9jFEFc0HF;GbN~PV literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/rows/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/raw/gnomad_exomes_38.ht/rows/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..39e342b07a50da36fb8eb45f55541cd48b2f7963 GIT binary patch literal 2318 zcmV+p3GwzHiwFP!000000PUJfbKABSfd7kD?O5)OA8}>v?AW=J8MjTWG>c|D7>EQV zTvGrIlB(5={`)-u0Ro2vUE8(1Xq!bwICz`~4ql)6Z6xT5ji#dyB4zh1z5a4BD_C+k z8jr{=+l;1PN4ZSu63R5sS`kbDr;(cyM%RZlO^l%kujw&Be`8k#+I3EkY1LFcgHn-Z(@zMPnHe=~&Zf z={G)&HTH3NC4G=$gzFXHV*EQTNm=9^f_pAy&ZroxlBfL)HUR5k5k?NAqbCAZW3xmbrqCg4ZxD(33w?=c2_fzgxhsbiO@;Vj2iSHFwCmI z%kx!16R&Me6{n&i%~0(UzJX~fD^jvGQ!KQj?YNQ^&q&5-RpS;fX6ceImavax1dcEt z9c(_fB8X{Ip%2rJ$PCt)&&}y>I$}i-utDQN34yLU3@DtXUO(;jfdb)!_Y6PjiVI0-!c#x% zdPn5G*b4=P6OJb|Z<8wtIwz^11X*h&!CHjY-6C!;Q*h%6kxx1FH>;%?Acf1ye>%s=TI~I zu8J!`dy@i(YQ?w+XWEI-Y&&5Pj!v`&vpwiim2EcGg6lB}FJ&gxs%O~X(sLC_PRll6 zaMs+F3YKJ9=M_HoB4;Luk~)Xg;{(^c2-c9hifZ3^#q`b4p1XoA=wR)3YIyE0Oy5{ZLJPA|HEM(IuF+gcnhrK| zj~&vxK)TSPeFUztxtoUelvvVcH}D49F;Q22#a(fD^G&_(CF}xnp(tz+=;nP_gU7lO zM{sdxgjzq6&vnkGi}2oIXZL2T>&&}t&n1<)G~R$KIeyY>?2$0a3EU}tg}!CL 
zdqcv-r7;a$*Ed$zt5|MwSp=i-tit0KGT-P6AChjj8WgTAC`5?xO8@`ef3M#yk0#%@ z2(XFU%sf&p0Bl7Tp0(^OKxYHwtb_e9>w5l(EsAUG^f>F~hHT&-a=pQNnCkVQUerYo z)`q!Sqc^~6!&Gg+>)mMa=&nod5V)JM%4dy*cA?B%w>+u~NL?S)ML1oH=>{lW0O^`J z3kV$sbRnO^crHrkB5*Fw=0Y?#Oy*)>4jF8zTo1~1ncTxjZkWg+AcuKe49E3oTpz{_ zvbYe%#VH)#Yyt#_LelcLp47rOG>j#0QRrr~UqAO+_$Tf9d_-*t~7*b zVJ8}TSnA&d_@^a+$4N1gv?I4dc(fb826$13iV!90L!t;H8bU-55uy$t>hYl{9zt|z zUS}Coov+QVF3%4RC+EIuJW7^4PlG1sxlC^16hMvUS~%l>K!cOFZ;wv;-F_>p)O3Af zKL{tIt8UM&{s&GNMA7l#gOi!h?G(#D(rO9vAAb0cX|APkXl3T3x<`F!hrX8Ls?=AS z`{p}32=Oi3glIcx3pLYSeWkY@+smahg9;(xVQl~G7?$1bPvin)sNOqfXjSj_NBTM5JJuwMe z_fg{Y-R?Twb@}OO8gpg9k-uZ+LeQT1#{m-qA00f(cBBV~XHfN?>yMhJUq`yI|7QR6 z4L$vFo=*0UPL3!0$8UZ--k+S#)BPiQa-6=M9G;x8L-P&v|E@JCI+wLbb^mW8^8(e5 zHD64_k0ygbYqC@aV2;b+(e&Wp^b{@v?MKz1|1&grH-Z@-!Q=Lk!%Pk)lf%RQIA>-Y oE|R=X*{rb(9A~ZUQme>lUb3zBg;tf)s-c z;HUoE*l3rnw0`p}gvl$)pl%&$iR1>9)3P$o5e<3>wk+mEF^k3n9 zP^FL*9MT&iwn%*_?RM`)^&4sPT449)!nYIv)lvMiIGk*`TrOXC#YKpJ=-DW3-8Kdg zq;3-S8+o)b6p1s>1#m8P<;W7Pne#PnajknB;M&;pxmm(2fi%)MmtC={y3Jh1nU0$J z(dNZq8+#7%d_KM%gdmogN5{Smbl#f1)7IeCUEy~nqJO-ac4cFCr$zVc{h%xm%Szbo zVr{JBw(Z)Ze>_=%8)+}Wyve)WlZwQrel*s^emwSJuT%m9L}^ewDR0ma>iS@S29J%b zU=D`sK2nQI?D?N(I^Z^S)i}VsIVb_c{SMM+3RL}Y>#Bn5?Zz=D}65$6(saWGDDKu0p87$HPNCJ+fEMnr;4 zQj$D{ofTg3Rf9_;zddUv@4OdHF0qBdIX<LwTR%5J$9u8_i}kk?=tXAm>}GPoMSl*QT*l{A zaOJ5+v!<{7JnZ^VcMj?7$f9IN+2LsoOQ@ukoZqg3pBoc1@XsSxE23o*S0AiDn4Coi zF@^zT;60Qv2=zLTD)g1a$XFQmycDE3+VSJ}(>n{HO`%tvDab~Q8AnORb46zoO zDBNr?{s&u}^7U^E2f4cG^VzjEp@^IhG(E6IA?^P!8Eq9DlwtWOSC8a#E+^MAMqB0# zqh=KoPOTFD>I_@PYwlI%5K7Gxl`*l^z{N4^2zK4t!U|zNG$8sApnfu|vflFAw(-(U u-3YaIy1a0km_3{Va6s_T>b*ln#_e<=m;wy|00000001bpFa00@0RRB85sb(H literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht/.README.txt.crc b/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht/.README.txt.crc new file mode 100644 index 0000000000000000000000000000000000000000..92a8b63a47138500ae622f65dac5e7edb2d4a70b GIT binary patch literal 12 
TcmYc;N@ieSU}A_YT~!4D5)lI} literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht/._SUCCESS.crc b/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht/._SUCCESS.crc new file mode 100644 index 0000000000000000000000000000000000000000..3b7b044936a890cd8d651d349a752d819d71d22c GIT binary patch literal 8 PcmYc;N@ieSU}69O2$TUk literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..79fda196232b9a6657ffec419797152361fa7261 GIT binary patch literal 24 gcmYc;N@ieSU}A{Kxxw-NMO@+AX_?|HuiI<_0A=3^_5c6? literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht/README.txt b/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht/README.txt new file mode 100644 index 000000000..59c689541 --- /dev/null +++ b/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht/README.txt @@ -0,0 +1,3 @@ +This folder comprises a Hail (www.hail.is) native Table or MatrixTable. 
+ Written with version 0.2.130-bea04d9c79b5 + Created at 2024/11/13 15:44:06 \ No newline at end of file diff --git a/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht/_SUCCESS b/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht/_SUCCESS new file mode 100644 index 000000000..e69de29bb diff --git a/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht/globals/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht/globals/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..036f69746b82c9e7092527f601994bd7307adbad GIT binary patch literal 16 XcmYc;N@ieSU}DJLykOoMe{XjHCHVy; literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht/globals/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht/globals/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..40495e5c5673c97c5eef102eb4223d2e882d4b75 GIT binary patch literal 776 zcmV+j1NZzNiwFP!000000Hs!KZ<{a>{x5vmR3>fh+D&|0nslnvsco3{VG*(nJ`%TJ zQ`1dmgj)1kJ$58vG>KQlpm{L)rW9SxdOnyK zow%TCBV@(D(4rA^%mbeWmGRjZT~xUoFhrfKq`G@~(OyH`6{9!+>&NM)!P5F9EtVuH zjZFJkih0wUpL0QUB2}kS>&OLz>~zC3ZcwAP?Sr2h*5N#}3E^D&f(c~e99AgDDRN~_JRe%8tc1`xt zT&?P}TuX8;#~q-!0mHpYaIf*3OK&c_x#Z?@TQg^e*{UJm*5}!sOiKOk`}BHo@#THw z@JVI6mr3ejwM5bkFL;ecx^JK^w?h?eHdMPr@qeD^(uB(ryJ9;e(E+)O@q_Gegin_i zs*aq!v)v7mzh{4k=-y-GAb1}P`Y1gRqld(HM;aW)WtxI=zHRO%qfh5wM?018B@3Rh zj>#yK$unl^8O2r@^1oJvT%6Vu6aDo?g0ocy{L0Hz!am;e9( literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht/globals/parts/part-0 b/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht/globals/parts/part-0 new file mode 100644 index 0000000000000000000000000000000000000000..8318eee0b03c5ca223652e155b6ace139af17e19 GIT binary patch literal 2599 zcmV+?3fT1$3IG5IF#rH4wJ-f(2rwm30IF8p6F)HKQ3p|X!|qBc?M_&A6>s*0YeJkH 
z{9;D14YDPJY)L9@6ER8x`hFus2F`N;tuZZeC)h5hf(xWjg&9fPDM|q*0U!ZDvM}Jk za9t7D;A4bT0#~FG_`hD+>3xOxud6ZkFGUyYLQxP_FAw@6lAi_a0}6b82f(ygLu%|6 zA<3e}@|aRgUiGoDeqLdu{)t~%X?$ShoM6|#qKo9f$U?fp>g`P^Feiq}hA|@c!27a> zE!0=lqvVK-p9&rW{)JyzDE9XD_5cC@sJExB5u7+~?vs+LeAW0}OduYg;2YvC!D)Fm<0B=Mh z7Euze{||m4FZ8)bT5{Hh{xONNwtMi4I^G(s?auT;Nt6I-J>TfGZTexLI~~kieYV4A zk}`)+G&A>e_{0BS1DU#~x$EtsO*=8`&0E}Y<{H|u^L6)A2ZO?D?or%kO|0h5TikKx z8rre*b@x+ey@OI$8hj`d1F4upnK&q&y1Y63b9yO0+baiMK!@ znWk~ltN8y+HQ+a~8eA1l;{XWDj;Tz~vcLi0Y-&d4-T2Azn$xp@o+M!rMwnx&0iOj^ zwV(|Q4^4C#HCrjDhWsc@(6gPbnmfsk%vn1ac@TYiWXHGZz$if17ne~?X zHAdDu$J4JucI>B?v7Z`8)UQQ;bewH;oQXht4``iMq_If3XMWz&=_RqR_eY+pfKq2Y2M52ZYJQ{bEfWT?s~gI-eRtURLr9;f47@f zd~@$o@)mcTxrTP^eBESnIINKL^6IkCwZ)~Zu;rB$B9p=jUR+&WTuLrw7gsWfOa{A@ zQHU8d6qtH><^N47NF%wrd>vhw>JQ$Tgc!kRme9bFu=s>m&@ zV{<>#cAlzbvtw@EoyCqh>N-mtoy}(lF*Y|XZ#Y6YX5}X4dCdhoGC5{${F-4M&BRI^ zH*TjLjydx%tuin&2`zW^i=)ya@a;Jtw#ESw4cnIbmq_RrUFIlFE8h zb)2hsJH<7QYRY;)b)2Di>%=uSX1#^HC1$;g9cL!qE^&>ARzaX}{X)e&j+!ve*S#~% z-HzU_UZ(ChXz;u}gUNcIiTV{zn!NqgX6Upb({YxwMhvvqY2CDCcsr-+S2h5C9plCL zD-tT^SD?^2Uzf`?_dC7aE}6Qk*x|iBPyL!C>+J;US1V8E_ER^J`c=!6x{foJC`zDx zOG~9?!rLrVze=(Cb;;AOM}DPZZuKN|zHWhO?gaF94P@%3YC7=t3}tNkDkr9=7a+CN zaRvZ2g!TdLR9ob2fT)j}dZO{4iCcF!iNCf{F(1Oi`MNWluM0QL-H6_9T&C_II^^CS zg1V?uw1!$DA|j%oAV~`8q6-j)LrK<3N)Hl1Q4|M)AP0gJ!x)88L_{QznUP3H#5t63 zh7SG&ln=ybT*6TBwq8f`6|%3%aRW_bx;pa3$W1uhf>f^lVn8l9CLt-zH50GOk35ZC zEe60rH#r(_cyONFj8B#to%E$9{Y)bNBA6W*AMsAljCsFo${><9kPOT>)zlP&Js5)@ z=P-^~1OO8+3=eF02@wb6E}F#TlIP0Vqy7>#ahB{e-`n|VwX7(pMZDFa$vM=aVh`a|blt)?%m!@7UMQG_@T z;t4&#=Ybdv;2eNio?H-_Pm4*gEto7@!%8a9oL|JQME1@SyITQ(`?jGq;0xoW5kqWe27kDZ#Co%wE5 zHSjW%7}b~E76Ww#(6r~})^4LJo!L_}3a*>MlNqiR$@k~I z&Z#sz*>!VOA7hRA*41$Ikj=+yM}xaxT$A+!V6Vk^rQT-hg=u?=KU%h1f zw@V;TTjE|-bSgVvEb&DMt|WvKvRJ5OHc6%Q=n6^_r<~N(eih@MNet+Tu#gL-*2}s4 zNSfb?$&PlD^@%b#Gi@{ye}sh+b7zt!?+h=F{FisQ4RnCtu($U%)?*mGU0f#9_Q%rs{I75?Cxn9<7>g zx#~jLCq;3vh2FC3Tqfh9=i#e^E{|=|c}WxjYJ;a*xg2-Ze7MuY;Ix)nxMd z=dk*FdK3xbz{T-Ts4UDnv)3|uxv@9Gb6^x6js$fO$Xu1}l8l`BJt 
zMk*0QZZdZyWUU%>-m${9@@0e}z?BE!1+#u)`L;Z11aS2@-kvoH;WP&%C<9S>ViB#U zc;kE`Ni2z2^+k+o29YHDAbwRTk|H$eR|$vr#kDsb!|;_*M7+f)XKL^e_Er4gutjO2=LzLXa&!}k+>JMRm~9Gs8%#r$xKP96K^hrS-%}|@^=Qvx^K$b&@-;@(f*APX- zYdfewn&6A!i@2(CnZED}9sw3aLj3s!If|s2f<;O+f)bE1x@uAbDhDNf(1~O0+9YwZ zJaZ;4gV~c3Lo92_@tDE)OTs+{XO|e3k?7O0000004TLD J{U87V008tT=Xn4C literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht/index/part-0-deb2219c-0ebe-4343-ba3c-143f95c4b24a.idx/.index.crc b/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht/index/part-0-deb2219c-0ebe-4343-ba3c-143f95c4b24a.idx/.index.crc new file mode 100644 index 0000000000000000000000000000000000000000..00c4847e354b4e1aaa61b09a79290a34709dd32a GIT binary patch literal 12 TcmYc;N@ieSU}Dgp;`jmp5n=-; literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht/index/part-0-deb2219c-0ebe-4343-ba3c-143f95c4b24a.idx/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht/index/part-0-deb2219c-0ebe-4343-ba3c-143f95c4b24a.idx/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..7b9ae4ad7c263def2614a8404535d91bce0b3b5b GIT binary patch literal 12 TcmYc;N@ieSU}Bi|S|t401pbL6`!;PM|a+NDYz-07(G} A!vFvP literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht/index/part-0-deb2219c-0ebe-4343-ba3c-143f95c4b24a.idx/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht/index/part-0-deb2219c-0ebe-4343-ba3c-143f95c4b24a.idx/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..5f7a34128fb937f0389896b2ad710b51382ba266 GIT binary patch literal 185 zcmV;q07m~GiwFP!0000009B5`3c@fDME_+^3OQ6;s^%tw9uyQ6FXADs+a?$iNw#1p z{dc$CyetDV^LBb@jKv#6+dM z5-t?FntJQ&F3Q7rC%f}T=ZmRO)}qBI*CD8M=O7aw0|-#gc~ZnL{~@F2_Lw5bnE>OS n_&F)k9QbK=gZFf7nut6j&zVF_tvW)k^uzE490YGC&j0`bbSY6H literal 0 HcmV?d00001 diff --git 
a/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..fad17957bf57bb190ddb35595c05e3e0ad36fe88 GIT binary patch literal 1625 zcmV-f2B!HRiwFP!000000M%IEbKE8n{$K8?ol!GRoFDbQIkuB}nmG2ICw=hTFbE-u z27wI7_SJL$dlyL{kmT5H>W4nu%<=KD>@Lf)tNTUHI7K&9X(q*D`S#uZt6!5vi5VZj z#omioFJA8N{gP6=_c}Wu?+(&8i)5i_P8Ah|YNRMfH8v%X2NWmm@AlMqDW4K*?+Z3a*9o(D5_`%TpdwjF`bBh zN+8S){5a)QFI!p3b+-J#B{pwgCpAt{h41{BiCm)37b`5jt(z)_*I!mKa4J$oK^;*X zNt);JDoA76cvq|rZp zTpgAB?|$C&Y(;g$jpKUfqgsLP#K1W_X937@O1XEQ-?sO@My#r(G8oP{I)Pr@fjI02 z)0^bxS}SA}W&(!G_fm3-MIxldgszw|G(#HZI5iSZ*D-MRr~ePyl~|X7vRd3A(ioZh zNJ?3+b5b#ZG?9urSMKZzCNc=ur?1&1xuVw)bO!epjpEK)x9Lb$_8qb=or~XE)(~iq zkZ?gCVdOY}f{cu5z|y8YTpuGayGy>&?edD-I7#|ju}PG?bHG zs;ZVP0Ar3-;fB#pvU=W~y2Jiq?^ro~xbNDdk0rKIITMeTdGN+>Vn+u)qBjXA%HOGs z61nvwO^d9lgX<}3JaFgIYZn@iX53q>QN`|P=7qscM%o0@|F=mwnN4BduH3$-B~}@N z=N8iRklR`*=U>*Rm#1G((B%STei` zXb?5K-cm-x+0E_5tbj-4yQ5rFW)N2k$EhCUKBIaG7*gbr z-EgkP1VAFjtRB^5SrsoM+}1>_7(M<}Wc}TwkQ!nU770brvF7DMBSKS06D#&0+K!~)%3b)TiMY?>`W9;J#@kXDuVUZxJ-tI z2@@0eiB;ePr;4koB9lS)NzPc#uF)%Vb%I$$2g8 zFQ7$>-1JaHcXcLFtTz9|PHe9x!h~i1K*y$ID7F}Z$Ec4HGEhk-FbXwlf#E6vbu@~| zGPBM66J2bZ*7n0fw>X1)&cw*PEcI>1;4<(fy&Jw_z!eVN-wxX>)C+C_y{%HoBTfe) zEgLo5w6YK+5LOm9nuLQo93YwJnmX&_kb&%?1|7M~gTE-}xX=;M7MLMon}LVIuap8tQ#RZB^vY&LQFXIc%_Bv)s(=(!%i$QRwPhrnQyk76h?82sg@>Pj1u* z2Hhul>=rqka45eV?5Z8-G#fha;o+I|kVl-=HLMup$M1dD;`_YgwZ~4*j(afZ8h^tz zZ!gw5a>PRoY0yu8zS!%_6mvpgQtiB!}Im)`Gr2T;GzZyIK4p4df^mQ#6 z7D1F4W}~tDbP@_UcT-30%vhM?U%lK|IZD!G2B5vN?+hV)*uVx~17-39&|Z XJCN3|Z$EsuOLu<*QMfUeaTfpp?MX2j literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht/rows/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht/rows/.metadata.json.gz.crc new file mode 100644 index 
0000000000000000000000000000000000000000..66088fe5b61a0209afa100587cfe0f196110d939 GIT binary patch literal 28 kcmYc;N@ieSU}88c6F&E~nDJ^!QA5X5`SIJ!UhP>40DEZ)s{jB1 literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht/rows/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_37.ht/rows/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..edc8483058164b79ffbdb705a246e15249a8a256 GIT binary patch literal 2451 zcmV;E32gQsiwFP!000000NtA1Z`(K$!2cIN-Qk?Qd^O$r&G~3AKsRmUpa)zL2$V$G z+(?unQrUP5|L-@H_)W@5nspzx4}m!x4rfL~iu_gIClpn9GMik|96#dx`s?|k!P((t zI!W&EdNTVqDeJ6b&_-p6OUx#>(3RCW&V=G*_FY-HSvNR8(;c9Ng$R27b#aOqUhuZd z`1ju2%$8&|HJDbS9MkYxn9;BLnIyLQogyJq9smNgz;pBdCYQ3XoB;l<%_g_ zI~84yLts$GQKdt+P?&3C#9{bdG$oz``U5uHK_)@##bc$s&NY15iE$WnhmEkTqg z9I912FqE7tqX{Z`qKk2e`47aBiont_23H_r+@9h-H@-J0P5Qhm0ii{m{PyJ<(Yr-g zdGf`lYfo}X(-z~L(B)Z;i^A)@#wh<>v$(PM3s3OnJaEGkoPSn<@)l865zcX*Fqpl> z9M-4~u|T9uR;bOBf|MM$%yjjJK7U$V-dujYNN%p@msbfiE4c5menmb7GjA> zNnbG}OKkgE(UH;winGOq%DHDA<-aG!T-=1v@!N41L{HqtQ*-Di?Cr?WqXiPXlibi*?Cpb)+`x?`1lzH-jiRCrR6ykF#C@v5k(Dd<_*%2bg4i4hLUL3d!yb&T-=MiOz*av(?C68#wG{cl>dBj4PCM^%OW64nRP`e;B zsx9Q9b||3*j*b`dP`l=vxEninGmN)f$n2nJRoOk*Yx%x6?-|NR9SO9RS@lKrB@a z8zaLHxW%?(#!}5mkrWv5&eizKTwjspDuL5c0+%9_t|#Jg-Lk8x?{Ie4LDXBf2loe$ zBwHc6w9X&SvmQ@di=g} z8vX-;BeA>_l#$##@gNxp&WR)wZ$DW{nWdfO6eW&(y~J9HpzSwZ{W=G~Uyys)YB=hb z#$X_a9fQzl#Cd8iR8~GA)A+GJHxGTV9XQXYZ};1qnxmJnb(Pkot2M|HR(GwnSXWD$ z*#=k32bMXl#%U#WQ827i;#C)HS*_)5Sr_*2q$tsnxsv|O?1;pzXS`gSx?GER;jRL= zy{so929d}FF@6im2g1Ra&-oiNi_8(eukInvywCo9M*!4^HXzGGnd`-g^^xdohz|@ zf&LF_%6?TY+~kbj&+WVQZ(sJ6&B)Dv>rHs;McTg5U(f&R+~>#8fP)ymep!6{WEsi2 zvS|r7X>XZ%q#Hr`9wg;+_})kH_a7gw&sIlo|Ec-iuo*1+Q?sOZS|quX#!eP>8i7tx zX6M*5cbGvO-e3lKaDVZxFHugn)0rfjdu?`lA~)n)0y)QGkK)c@!OLed)sLpfZ7ywd zVVld^Jk;iqwkT+mGuoU@id@X*Q?>xIS&Yp`Y$Jp%0Bl?N+BUrA(={Ki`E1QcYum`$ zI9PjxIUjOJ0a zQHVCm(4IhOEO^&ZX3B%oU?tj$V)i{d2m#q;5w90YO-U$t>8bwv! 
z4Rw2tp5j#7GohG_-hf~87~^b8Dv$Z^lPOj2BNaW0rMAau^C&rkAG)Sd7K~DThDCWO z$|X@gh;lg8GgGV~l*^!;@bXk&Gfc`9dy+j0J=x8HJcZ!N)^WL$>`~ZBHwAUF8jamh z%*jDcW_M|gj{-QEpOQD(f&n*Kk{*Y(Ho}IeER>Nz{*ILkN&CV7Lg;f~ zvg2CWOoVYb0r2-ke78LNHW7`z{kmp ztDDZcl(5(s_MbHVw+wIMtOjxu?oo?j-G9;U*BSpzW&U}`e>{88j(W+cytS*S`&U6G zgi|NU!0H;RsyqJM3y_+QA_$D(Hn+d=s{2}^^L=r ziYoOCrj#-R9?1&tRJH(Zj*b8yg}Ea^te7i~ng9tltD_)< zWTlu1dxlXA1~V*Zb&_D0SPNo|@!w4VYh#rzq%u0>{)Og#DBYbphB{>5Nai`Hk?~Fv z3ZIjl=yfErQ?;E67PS9zvn)c)_h-5w2M0v&n2?n>tftNfEBJ^r@n1eKNi{MBO68*!1f) zX$H>(0{)dDGp3Kv4oj_vXLVI`(PCxF6M6(l@AMe`gZQHe#`uP8OOlg?Ny8)@cGpErt|R(jCPD6!8gyG_quaq%yDVzWK)g z@4cxa4&v>@nCgN03K9-g`q2;2ovicsvdWSq$Jc0rXox7P-F@=}3Yk+sshW}AMR}>5 z=8Y8wj#_Q=V$2iyrbFFbWQv<Zn)pohN!rR;%AWqPPZSab0IJ*T~Ww%yc`s@AD^Rcv9`l}U>MyJjxRKnER z41;&5n&!75wh#fTHgE|8=a#2H?#B=^SGP7lFP@gx^>W4AtgRLql`b`*UrSQx^cj5; zjV?{Il$}s#v>9F6m8H<;bc$AU1)n zh`zd8UTx5~SF5!Rf3ezaIJ`~a1%JC*12(-x&03T4pWIVp%?*_@m!#1qlgVUSJ4jss zu&o!OoT8ehHXyrR?$s8IbpiH@vsqeKTX2Ezt(Y6eUUj+JF1HHnRJ~SvHuYQF&2qt6 z7T9ffdxc$*-Gas8uNKhtDXElHN>&4bNT{JetQ--MKtx1RkR<7m1~W7>#10Z5K`4}C zS}N&CBqEUr1tLiqWu!^50mP{WYwD0Pbw1zk_@aiCQNNz2QtoQjqVdIpZ55IYkAiWEav;V2T}Xu)X1XuxPfv^rWP3g$+*vwCBN{TFEVPr0*NpVu4Uz8y`u-Y%#2lR z0>dpt`4 zv9e{4Ynl<~2pfCw*xu9dJm!zxEmfA{^gwlHmp*$;vx!}dC9TI4?~6#;0Y+GsI=g9W zFJ?l5a}=vMz2nU0^zJ32yXP*>WRerY2YEom8Jcm@HQQwcfaQh&?bgg_zzKV%jWCJR z#9?PZ^}vJxsE@wvf$R%^Bn<5q^sQ!wUMXA#0BS7?NB{r; literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_38.ht/README.txt b/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_38.ht/README.txt new file mode 100644 index 000000000..36ee163df --- /dev/null +++ b/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_38.ht/README.txt @@ -0,0 +1,3 @@ +This folder comprises a Hail (www.hail.is) native Table or MatrixTable. 
+ Written with version 0.2.130-bea04d9c79b5 + Created at 2024/11/13 15:48:23 \ No newline at end of file diff --git a/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_38.ht/_SUCCESS b/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_38.ht/_SUCCESS new file mode 100644 index 000000000..e69de29bb diff --git a/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_38.ht/globals/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_38.ht/globals/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..ac1c84b6636d420267d0bbea20906a6d3bc77aa5 GIT binary patch literal 16 XcmYc;N@ieSU}D(!ZhqVQ{AU*cEZYYl literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_38.ht/globals/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_38.ht/globals/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..02b9777539a093c6b5d06faebbea6d1d4d7a0808 GIT binary patch literal 776 zcmV+j1NZzNiwFP!000000Iil?YuhjshX0Fi>%rEpW#q0|QW$Iu+Odn3U}Wjo(JD)> zB!_vy|9waCx8#iOBR7uUbKdhIn+G2wNeYNo(Pzy-1H1pac_@KiMsbupK^?6gcleJg z2Gr@%>ew5;)FrU@LjbeDNf^1?*ivMc~`^mP3#U${GHYHw3=|~hCdL^ zNh)S1)Xui+db}jn&68>X=JKe5EE`Ab`o>l%ksLr9d`f0MG9`@lq48`q96*G!DFvrM zco->(%$2~E$B4gDxvzyPLn}*-T#yaV>_Q(U>T(Yxk8o9cs^=5NQbb{UJwb10Rc2TkmkgU3axc= zzrEdj+D;CZSeUl-oQ{Tvj)nmT+lRM7Fg#)mU#+;kEMRV$)bnYXV{K8NVZUoNN7O>6 zfI$I$+VUZJX2Xos@tAXlI18-l=9-YES4PfEFyTXDnJzCeoMB44Uw4$}c`{2o&yr=5 zEHR|pdQOo3D13f&dO(l)@E$oj)0`W9hbcAs>OM1C-6O_%R&-?+NYOX-I5E_pP-1}( zechf8llzVho#Ki71MPi8ecZ0!T)tb6p*W&@F4!bx zEfjsiBD18_i4XiICfr6vBrFgSK@!PH9{x8o G3IG7v5qHV} literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_38.ht/globals/parts/.part-0.crc b/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_38.ht/globals/parts/.part-0.crc new file mode 100644 index 
0000000000000000000000000000000000000000..66bd45ca4b6c0879c9db506627ee25b4711b0c25 GIT binary patch literal 56 zcmV-80LTAha$^7h00ICs_x*Omvga5-OVXWy;RWlnBfWfk#L7+{agUE;4JYZn3PFrw Oa_Y)OETbf`sTfYX-x)ms literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_38.ht/globals/parts/part-0 b/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_38.ht/globals/parts/part-0 new file mode 100644 index 0000000000000000000000000000000000000000..e79fae86e53d339c8170c6aea944f5984083a433 GIT binary patch literal 6048 zcmV;R7hmX$7XSbz8rwVX^yVYGip&7$eE2%~*KsuVqp_BUyRh0(quTKKRp;m`;JsFMi}jHflL9yXups8#0sPRMCul&_#_RZeM!j3VYGiq8bXZr z>r^?gY&?wiHDTk1(S9Xoh%nlRaw?%tIgudF8SUdyl?Y zFxrQ*@ll<4JHu!{k}+@??ORd-*wCPj_UouepfK7u1dCG&qkRBQF;p0~oKP6;OF6wz ztDIb5dR%4OVYIJh%cQOmz-Gc|pO9*W2cvyG>c&I|qkS%09<|H1M3>rV|BjkOmhFYn zK9#MJdS(0K%51bhWuv27*^IE#NBccn5p^TNw4umCWel#kOvxDJeE8p00RZs7+-l^% z8sN&vpJVOvR|aU=<#Q@r{;G==Tt3$`+Fub+v2+`JZfCTQiZa^os*(!IXunfZQk`_D zmBp&H))Yr!^KzEWPl_QH^PlFMQ~K$ZJS2%{8Gm?c56WYY2UTk-?<@ALGYf^Pwboi| zt+m!#Ypu1`T5GMf)>><=wN^QLYZbrw%^!Q_oZIr)jJYbTojm{Xv+FSH(kEYW&K1e; zSeGB}>q$A8zQ)6!d?uCl%Kv=!&v{WL_BU0vQ-0j>ekLEU)TI8rbL*PQ?&m&Vrm*;F z@;N{E%EL|iae0bQzHR!P5`TNDEAms@>#gj6Tlbv%R`dYqBXV33(k2CK4l5euK$j^Oxs~gtrt86Hy zYO3;?$kfd-^XrM<_Ch8aiFd6eKN^3sspR74eZNognwA!lqltOH-`TNuJ?>$Wmm&_E z>__`;h8@)}_fD>=wQ^KdQaUwVj@spO{&$0q8u{b$!;CZj@@^lC%Z$MbX4IFz%Qsw| zXun7MDD%`i(alXPYcnlwjkXwsQ9Mn-99(gj6pj*8&W3L>x|4blZ^s29%5%S+E3WMV>>ff2!CJTcf1EJ%n)2qpw0!&8P#ke3pgoyNe%z=CEfp&-~0ENHem4Z(tD zsWPyjS?Y^9hIn|6*dTRCMJgnt4ylNT>o)f9QHNAir96?C@*laephGHxg2qjURK%nD z!Vp7(FaZJvSQ{`>sVy^fF2)Sdu&Kc32+0kc4ykB`vSI)ze59B%M3z3d$k8PR1Gdmo z0u)8KOgJhy2$FV=z;IKi4FnhLP*7*hj!_*nJ4JPNAaG!Zm7>DhqV%Aqa6DC@oTT`y zR90$L8tq$A?d)vZ#+_2DEIX?01cjBQ=QPC%X8;ctmX(xhrCM33qM~xeC$+JroU~Bs zS*2DyT2x$As3=x^R8~~0DLpKoC|Fjgo-L`<33W=XQmGxDO?Bd+RN3*_P}}LWgR0V# zTCFyo6biN4p|WvFaj7UyM)Y*?l10_S^{}`Az){)4aVP);xT1xN569!P(&8gV$_fh? 
zj+MoS%hDqNhrpI9sykG75CIVpy=NuCIoOsDvTaT;MlP;Vp}BDf*S6GEr&r)L*#IR!iNPC zu@VUuBQ_z*h=K+RjUFMk8N!fpN+FTd99N{~xZ!^KxKx-UCMa89>lvKQN>H4bWUq$`~3SX}k@Mw??}P8tt>f zYJvBCXNFzM0}hyB=QzHiSvbct+K(bYtwLp^{YITe`>QQg3YfXnXnz$zrDAc(Z8D7Z zA*o{2fNPV~GHN=E_AybTh0*>ZU3j?ZhtYl~^$j0J`E6zTv(-^TTG#oC zro}Fyp`O>sY`)!6m~n`CmoXWq_Inq}c+Z`^{Xb3W&xIrlGv1wd(7$tcq!yL6zP#ec zyR7-xnRnLWTy#31veqo$7Uf(^Tr24f8#GBsE$V7%Al_|k<#5;~G3a`*+q(1n^~BCuz~${=bg^` zdgYnBOJ0)vu9Kjs0;4Xe$bus_MqSdIHc4;TbRGE3TGZ9*U$!_`%e*t7Rvap8+5Bax zzg7oWROrk*likZn@&xJDMjUrb-1E69y)(Vd5>wr2m@(7VAiewzW!f6-$}`8IOk0Dm z7e|*sP;(N0zm{$de%gAMM(;(UFnjq)Yh>1CsXv-E&HLU$t*S8jt!i>dU z+jn+fFOFMkcS#J=ewuj=y(9*^ewrD-Uep7MUmVx*DTmtp^oGy*Eh)!hr?n?dj?s`; ze7!iX0gh*lyQ%4i`A5KUOT6b>p5gvDbm{W>{k`+Khp^iTB5W%sT_(q}CmXf7$|RFk>JXjYb0kl9|1W97kpQ_Ks;u z9239RB(F$aG~5~n!R%TpGs&xl<6`EWT_Od7h`!Gn_s9EZUu?|BUPxmwJ3+89D54-T zWr)nc%EV+tWMna+B%;`BDE1m76cY-7GaC956Z#Sp`f^F{lVU<&Nc?JILSN7nl{$wp zp)XhLF`+Lpp)YG<4^)(wfJ)e!vHRs4I+7kr9|MV|qnVn(K`Cot(Np+&S*wInDU{z90A0`yEo9L}66?{?e_% zjY$&IX4?BOOG|O$$hjMYb+G|CcY~c0m<#J7ZGd+1Cu92kE$8@({b#iQQc`+t@PBCS zGEpy-Qp$0%BsE-^BcmcJ14ECDiYR|-FL}KuX&5CXDr>r2Nr{4*h8gWs85oX-wAL~C zryMGVSgh@yFX#9Bp6X6%TBMw&MNVm&ydG`3t{&Ig#n z%|jhQsH=xMf{gZgs3Qo4g|JIqoQ%^nNotVtr+5F>Ag^5}dP(m|{IlmWrj-Ber~IDe z&*}2=^wUTcljHR zgM}H>OPb|#`eBfMnC^Bkl#_p2Ji|6)e(JC`N0(Z3g}I;z4CSDvl>Dwcb%il10^{J>T+@mx|w~p|GrR#@ik+VY!%|{kshs*=V zu9g9OB$kL|R*dui63C4N2--0+%0WhraJ!G~S@aFZBqm$+qRuvJiGC?$Znd8?>e>cC zOPTpZB$T*4JE?Hq35$wpn3_mmAgfhn!;T`bBVCT;E)63(3Ti za#Meiv=4SH@u~%8;bvzu3yWW&m(|ebnCJ$S1m}J%RKpaP)Ck&#tg12hCPh*3vM&$# zCZuX2vc(-k>LR8F8Zc@$6o7W3jdV@JuG$!i7)_V*ZO1>J z@6?(5O9qzZ8|=3=2xWOm75SK8LSC|4d=8}{SAO9vr|6KgZ4nxoD7%YYYt#BVQ18NLqYeJ~;}bWsg$UG&2#$LpTEHniw@$i~RQb`QG2-7IMd%+AP(zFv0D! 
z_fu<=z@7cqX!20G(@{_JfL`bwjfjQ~{oEW{Az6Q$qO!u2A6{olQj)z+Fpqy&^OR`z zL?U`c$1R3O(Xri#a?cMa8;Z4ncjS&7;gXN!2Q$Rrt5OucGt2NkkH^aQpL24y!~DnW z#_=iO4jSVz;^;QDz$`JWYl($;7B1F$UMvyF`037+#d}`1l3EtChsW{CiiI|~(8s2# z*FOO2LFQ0;kU?sg(uK_{9jQIJfQTms$e0i~Yp?mm1Gd3}uf!)Z3jM=J&G=C>&r@NM zZwvQCKV(`wvnI!_iz#uZ#}- zH@j)ZNmxLugrpqoD;`Qzb+!PX9Y{Hy2wIxVEs7DyMf;@^sM=r|3*sH-%0$@wEqHOz z8O@ff8N=%Z(NRcu+x~;iA>_=PFB$~1&|F~dGhEyy!nkz}aZddHD&hX{c0pA_z*WhO zux9gu3L=pfai5}VsQ%I8&jleOiVy+IILD(spG%cqG$R93c|ova)-2$#aH&gJAI(Dv z1kkcE=Kz`TCNXAPQ$C=Qjd;!hFyduWcXJ z7BB?}-&G!D(i9#BBvA-|GyWlosqp`Lw-s^`aK@-?!=_<}R_!4U9#x2}~lNBp9ySGaXXU#2nFq9drm4T~PJJjmrW3^*01tpUX6K zY)1MnM5pTk4Kp=r6knvedIREI-=FEKvNbD)`O45M)Re2S8mignCPY_ntQ>>h2rlI7{-Vb&wYA?SNG$6SFjmC!$^u?IFKGyXN`ohE#rOXPWLcrl zNss@;b$3CGg725HrcWBEvIWciqN?Hn?21CcZo!*`#;zH{~a(7XD4a8NfeC!zAi2V+Fr4i`X9 zj`03P3y`=Dm?<3-hY<==5Y_x*2pco&EBe<$z_apagBoS^>ZYI31*X@dwLAsHq6g0}ToJlf zrV& zJPJlUYEg`)=#Bz|>`KwQ=}?~1RE8fqGTxn~ymp-&$Xv$^_V9F~Hc6nbM=;AA zDcOtUgl}$(=M9zciy#~8o6H-w-Fpr|y^S`u6WXNOtim2-5NVASZ0FQm^sr!h2n1@; z7-+}&%)QQV^HKySn30BkoM1}1uY%8jrog}OxQeYR?VSNJ!nz77%Wn%|TFbL+RZ(QhJaLf8TkxpY^$A0RqAf?`Lp2@dP(O55X~7{!C7;uE2RF{L1)tMSxDA@ysQOvD z0-p2dZNpZawH|3=LneCvayK}wb0S3rmR6!tU{3R_pMh$=(_UzeWqvBV7<7^w2Sls3 zJ)zpYqYBouBAG-3dN|WY8NqJt8D$u9x>Vd?IDPN1_t-i19(&H6$IiL)w0G=0_KrQz zorARGYe;p88lo;yhpb7|kTpqlL``y*&;lk|V0fctu^T`NM3N=oXCz`bKo5wdHo#`| z!hQf05J^IS&!EBHfD<5+FaVpefHkJ_AIIT7#{m3fk@Vnt6we8;9l?}(SdNk<0=B5{3fL literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_38.ht/index/part-0-a3c7b21c-f8dd-4d21-948b-3746f5229729.idx/.metadata.json.gz.crc b/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_38.ht/index/part-0-a3c7b21c-f8dd-4d21-948b-3746f5229729.idx/.metadata.json.gz.crc new file mode 100644 index 0000000000000000000000000000000000000000..1ebb40475832fe624a5f861dbde3c00c6394b5eb GIT binary patch literal 12 TcmYc;N@ieSU}E6*>pltq5OD(J literal 0 HcmV?d00001 diff --git 
a/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_38.ht/index/part-0-a3c7b21c-f8dd-4d21-948b-3746f5229729.idx/index b/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_38.ht/index/part-0-a3c7b21c-f8dd-4d21-948b-3746f5229729.idx/index new file mode 100644 index 0000000000000000000000000000000000000000..1e7534f237115620ab05ab338d298309883d2455 GIT binary patch literal 71 zcmb1VU|3ux8J!svfMUWx3|1mM6R3a@$U{{F E0IC@bD*ylh literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_38.ht/index/part-0-a3c7b21c-f8dd-4d21-948b-3746f5229729.idx/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_38.ht/index/part-0-a3c7b21c-f8dd-4d21-948b-3746f5229729.idx/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..172607a22a25fc75d8e6556b90df6cfebbf5d65d GIT binary patch literal 185 zcmV;q07m~GiwFP!0000009B5`3c@fDME_+^3OQ6;OU+FLJt!zDUc^JJ+a?$iNw#1p z{dc$CyetDV^LBb@jKwR4XuJbimRV5=cm(BYZCtxiHTeLt$d;u5ji%KEaxUP7>5|?E z63!I5ntJQ&F3SCQC)?9n=hajwYtdqqs}NMWagd4k0R$-LJSpOr|B%shJ4_MeOn`Au n{2Uc&4*WE`!FxJ1O+=oNFPKD3eRPCe>4)J9c<2+M&j0`bP##cr literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_38.ht/metadata.json.gz b/v03_pipeline/var/test/reference_datasets/raw/gnomad_genomes_38.ht/metadata.json.gz new file mode 100644 index 0000000000000000000000000000000000000000..8e094fef25e616956b8eb8d6ee6ce67096478611 GIT binary patch literal 1570 zcmV+-2Hp7|iwFP!000000PR>yZ`?Kz{x5rK0X0bCysX>$ks_$mB)e%YhCy(tkwk5a5UPX`t!hZx$l zsFSHq+CgaGq2;$#AZV7K+*z&#p2@aOO>ok;F7q z2++9)a4o2=!g8PCxma*|NuvrFrPI;+YuF>cV)pf!F+J{Y*q zoWJ5nHaPiKQ!br(TdKJ`*x>gN%0cS4*qru_1}Gd4Q&5({kU?8!m5RA&f_aP z8nz(jplJnXK=~YRsjQSh*8)B@zO8pcwewBI#>fNE@X{VgS7O)JAZPeK+iG**i^B}7 zp`mjzj?(UIt>%I<1J@c_btoTN(ihk$9u7=)Pz|C@*G$QhR{^Q8fHCdn8yhNt7!bhb zF7JH>cDQXvR_DvJx92fv)WWD}y5}?O@=KvuXEIj{m>pM{RyVz(RMC8L?%M04JT6x_LIO}O>#tLZhUYYrp 
z^~tnvuHUU9?Gx2!@52$TNn!=A5<~oIE*QaBU8JDTq24g9}VoikLW5<-H;nqSy4ymeM#5Id9`b3mCQDy<X6INvlkZ!^GmL>M?XZp3V(D;sgy> zzhC$B1*`GKgz;8(S;{nT-3UZVRHyXZG~<}Uk=++J4OJ&7?%w@ObT(YM;W9YzV@LTF z)njz-vlr##v|Bl^CnJ|DfKbz^-(#SD#4n0dmm-w$A|9&iN<6OSj#H^H?QyOivWATt zSs`&MPVDo*SWL(|TTPf%G0ZC86JR$OQ_{%HbGhvK#VNhj)O)iXcd)W zMW-j@mu|XJ}e*y#Q=Huoj& z$Zt)^uugI(j&P$&rJI>O_f=28nV=@b2OIyj(O6Wb6EN1y>pixP(}5kg>Q zZ7}6?16advp%@j^SVyYTw29_jcT^AuN2I#;Yp38RzCj&2`3A)Zr{AFNJFjRTTEpmr zeQ+NhN&^m&532$5mEMs56W~9b0PfLga`pl6!~Fxf6;-I_f&6l>KiHb_i$#%EgL`sQ zdh*BZjrlQGm z5iR16biY`ASQH{@YbcYVparcL*HD#+lqRa-V)faK@NQqx^rfwUR#b_g=E=pH)O6Fx zHfcWh!>xQtB6!2{6|`cU$bD3anl(&tki6#2(Xxy5)T@g#DarnWZg?GKlHNcoBXvWb zy<81_qqA45i&upp&B<|e_KT73L@EB<=c1utO3NE9@2;a+DuF<$^kmW}?ltxM^`dw*@i5}F@oA_rI6G!47K{~tkUFlb zf`M6UAqq-(Bx{yV*Mb!dm3EDu*b2rof&BSuSA>7uw56+m``g(isVXTx z4bh-P=cWs2cSTA1M%0sjo3&N2H{&|m{Cef8-xGQM#>7{2wuyb6pa^Q=;$#AlR%DBv z3NlY$t*#QDudKpQoYNVqHtYD^+4}Ngth}o7^@s9cg0`S$JTGViIb#>8k%ZSJYZ{uy zfof55=1L_iirOm*F8Z8fiPwV{`*=saI8a)=-CX{9?i6EuD=AH3KF9@{Wv=E{;;esv zwTaLAypCU=2f{|xZyF*SuTFbIRbB8jTmgL=Ivua;qFY)a4TfDMt{VyJ+GL?THC|Ow zuBD7uZR3?7)FE+L0#@>x-n5h_{{A$XgsX)fM)i8cFibh!)>v7Q#Lad~BqO|uhoNku zUc)lgOLc5eS{~HRqC*aGniAzsfx8~W|wr_4UY;YxB z=OpDMM%wlgFBhTpxQN?}6wJ7X=Tip#`)nBoNJU8)4>};-+HrmNg5&9@eoxiMY(a9T z0c;Ow@%nPLqg;6R1K54`MO0meYnpy{qA`P}M z>FHjG%>Nx@TY5=&m?r8xwOpZZEx6>S|S1=Fi>)%{POyPW1^wziA(PGFYM%?%j{mlo*`@FJ1kur{P znLoL~X*o)8xJ4GmKlOOKMJ?5hvSjesSh<;Ykc}dj1QHe9b8lKw#HtcolHc?~H63oh zRl$?yw6@Ozd%e7lR3&~g^jEZN(7i5|=F(i+cw7)T;7z%U(&~{XbQU&`ZQ&zL2)E}) zoe&a({1JZ-sP6&w%>s(#RL2sU&xrMF=@Bfq6N>UNCIr*_a|hmsBCaI7_w?P{r#d7} zk)Ya1wdI?Kuu8`MUT;O}e&u9K!$wc8h> z@Uh@CdV`5IUNWqbWF=3KJuodmb)AhFexNk!nzchS8KL2#9outEY*@ zLs~Y_AzKK;A(VgKn|hA5VEgp*P0ulUj_8z{Cqla09YE;sX@@wF@dMI$LV=LHe^w)j z8fX>)-A#b7YWg7<#6Q+=e?@=p_Ksm4+JO5lWf=MV#X66Hk8VD^krD2>oA%somydb$ z44U%j(%5jzcs#^o>Kx%20UpeE#&`$OoiB8-MLU@6AY;?58O?K`Av)$X>tE2iV0eyS z9B|H*=J;(e%+WQ5<=CP^IbNai3kKvMj^j2P=@eEJ$Nh|v{vN~eB2+B-xOyJ=>d2|fRF@12@vlzcNMvslaOZj1D 
z@SnyHZNo#&;6HvJWQGYo`5plOwg7OrBLq4g$SfaiJd9m4FvI5<2w_YKMgU;U@r%3g z#fV-^;6;#KAat>9vy95ohdGxQ>%--XwX2F2$&MAN*JNFY_Gn0PygwgD=8dWS^xdf z7k!k3zNTTaRu>HSk;zB8_nDd37HU=x^+8h?>8qYmgK#e3nL70KX6U->5UO%^cvhfm zuhRa$T+3=T(P^a3KP39#ip)vz3k1#AMN>bYq&yryeYmQdIE_b(J&nRa@~{y%%poak zWNBEVdj_F39|hjj?6J{flkXp^!)fFd$wGjK3|^ppNszQN@1I|XnCso ze_r(OL1ozRaVh*5A{b^(cJdI^F%i629X@?}blAD?PhtLd(BS0)R(t_Z%`Y70lT7n<358T8Q{VI_f8{U#(I1YxkgfYM?SK6*{9HfO zAN8L-LYfxwlRbmpaI`;o*1zuc2l@Da;`|4`e=+U{(Ba%7Qy&mNMjU^@(0_2sx0&mJ zx%Tkn+OOGm+$iopwextm)<6X>4xa8HbX&7wO2l#{hMbXZ|2-Qbq-(N*AgkEq^xuoZ zW~y(@m``Y}**{g>FVy8H0OC50+Em9)q|CK__5Ty6Qzz^K;?BDFTJV_mP zPLnKBI8;nNM{1Px{@>kKDhEiLx*yhs-Zu-YuIg_r9J3qN<@G!fUW!5=;pF_P_6(?>7v?f8q+b*1xsw8 zaY3!(B~d{j(MF}Fr4fx~8Sea5S*k0oY;{L7XDCytxr92I-tO>PYiSZ8DUM{Q6_H|z z%EB{3XzRA3WxRI|R9=jOBh;`ZVlM1c5&jfg=*7#JIcfJH)^D71}^ zLwE=qfR)4I5DYv5VWDmO1tDQyECaXi7yo=Pu!!{8Ki~cxhnu*4zx#|ypZ{m%JuC6t zbc~1bu!uYki;(e#jYdXtOdQ6r0U|bNACZPbqc}FOk$_Dff=^JN^L0)JNt4K;rRP#I zd2@3aoq|GTs||i~Q-%`B;aZ^J1!S~p9hIBU_(U05QR|*LU&6StYUi@Kbk15m#x-H< z;3~xcC@-1N<`QhhK0*1IQB0vczlDm7(!`IN;v_4LHJ))c(Vwq7zV zudUa+?V`{nV3<4=N7S4*G+IyT;SM|oiVc|CR4g0n)d`V#?QB93mB}K1a zQq8U}Em;E#&IlNwx)x}k(3(v!E%wR`D{Su4qH(Rr#JCtpIZ3$ko>?O#WiePSue8eQ z)lwMPyajf&e*_Md7txpB4Nl845uZA_SXE$&tfvXDVd3=$y!entT%?#oz>%UV2T&|| z%w^;$N*e`+4AJ@6!8&|%^j%ni-{!#^asZ0hnfuT8G|d?^WyXl912gBPOBN?jrdOzZ zflZ~v!6j+VQzBCwD_C5AXzfBf9ytM+2)z;YIKq*1$1J17JF+X{60{Zr`Mue0V?bMn zF?*n=Y%?2)jt<}$APM5vXu^oZNX-+)k_x?q&0nNqy@A`+*NUorl^fM}ii-V$uYxD| zUSIu6m1ZTf_#dt}Nvv$lY(j9KalO53olT&Xt?FE1vpuVpl~LyMIqRq#uV5;dvvZc0 zT2)wKg>Ck0TlL#=w@+x>ZW-4Jb~c|)=krG-#|534(P&RzUS5H6sF7^QMnpt{q$DXQ zqyy0gGBQM*4iex%I4p#+%+d^rh$N8|lt)=faghs%CG*COTFdKyAE+a+`)Mm@jyCv< zOXaSc=}=7lsbyooW?Lw59;ul!qr4fI#oBvsGd7&N*s|Ad2?WP(DJ0y|9{htT#X+V{ zSHiTG(-$!OP#ox+rJ#ckF$EOa$%pvQs@43p&uUa|J5kG~ z;0&T9fz{3sPly(OdTqpogU&y&D2T;?jP8b1v3A-oe&VnU1BLZ;8Npp#)n}=owg*32 z#lvu{@B6V|dW;M=qq;^1P15kT;UaXpdL^QhlHXQ-*~jf16*B&3wU(>8wBw*0#X&Hn>$QIM_h5g@+0b0G^74! 
u%yaKd%OgH_DfSPmpN@KfVW_st$>o3dP5})700000001bpFa00@0RR9uKSkC6 literal 0 HcmV?d00001 diff --git a/v03_pipeline/var/test/reference_datasets/raw/submission_summary.txt b/v03_pipeline/var/test/reference_datasets/raw/submission_summary.txt new file mode 100644 index 000000000..425f99c23 --- /dev/null +++ b/v03_pipeline/var/test/reference_datasets/raw/submission_summary.txt @@ -0,0 +1,100 @@ +##Overview of interpretation, phenotypes, observations, and methods reported in each current submission +##Explanation of the columns in this report +#VariationID: the identifier assigned by ClinVar and used to build the URL, namely https://ncbi.nlm.nih.gov/clinvar/VariationID +#ClinicalSignificance: the germline classification on this submitted record +#DateLastEvaluated: the last date the classification on this record was evaluated by the submitter +#Description: an optional free text description comment describing the rationale for the classification +#SubmittedPhenotypeInfo: the name(s) or identifier(s) submitted as the condition for the classification +#ReportedPhenotypeInfo: the MedGen identifier/name combinations that the submitted condition for the classification maps to. 'na' means there is no public identifer in MedGen for the condition. +#ReviewStatus: the level of review for this submitted record; see http//www.ncbi.nlm.nih.gov/clinvar/docs/variation_report/#review_status +#CollectionMethod: the method by which the submitter collected the data for the classification; see https://www.ncbi.nlm.nih.gov/clinvar/docs/spreadsheet/#collection +#OriginCounts: the allele origin reported by the submitter and the number of observations for each origin. ‘na’ means that the number of observations was not provided by the submitter. 
+#Submitter: the submitter of this record +#SCV: the accession and current version assigned by ClinVar to this submitted record +#SubmittedGeneSymbol: the gene symbol reported in this submitted record +#ExplanationOfInterpretation: more details if the germline classification (ClinicalSignificance) is 'other' or 'drug response' +#SomaticClinicalImpact: the somatic classification of clinical impact on this submitted record +#Oncogenicity: the somatic classification of oncogenicity on this submitted record +#VariationID ClinicalSignificance DateLastEvaluated Description SubmittedPhenotypeInfo ReportedPhenotypeInfo ReviewStatus CollectionMethod OriginCounts Submitter SCV SubmittedGeneSymbol ExplanationOfInterpretation SomaticClinicalImpact Oncogenicity +2 Pathogenic - - OMIM:613647 C3150901:Hereditary spastic paraplegia 48 criteria provided, single submitter clinical testing unknown:2 Paris Brain Institute, Inserm - ICM SCV001451119.1 - - - - +2 Pathogenic Jun 29, 2010 - SPASTIC PARAPLEGIA 48, AUTOSOMAL RECESSIVE C3150901:Hereditary spastic paraplegia 48 no assertion criteria provided literature only germline:na OMIM SCV000020155.3 AP5Z1 - - - +3 Pathogenic Jun 29, 2010 - SPASTIC PARAPLEGIA 48 C3150901:Hereditary spastic paraplegia 48 no assertion criteria provided literature only germline:na OMIM SCV000020156.5 AP5Z1 - - - +4 Uncertain significance Jun 29, 2015 - RECLASSIFIED - VARIANT OF UNKNOWN SIGNIFICANCE C4551772:Galloway-Mowat syndrome 1 no assertion criteria provided literature only germline:na OMIM SCV000020157.2 ZNF592 - - - +5 Pathogenic Dec 30, 2019 Variant summary: FOXRED1 c.694C>T (p.Gln232X) results in a premature termination codon, predicted to cause a truncation of the encoded protein or absence of the protein due to nonsense mediated decay, which are commonly known mechanisms for disease. 
At least one publication reports experimental evidence that this variant affects mRNA splicing as evidenced by analysis of patient cDNA showing occasional skipping of exon 6, resulting in a transcript predicted to lack 40 internal residues (Calvo_2010). The variant allele was found at a frequency of 1.2e-05 in 251184 control chromosomes. c.694C>T has been reported in the literature in at-least one individual affected with Leigh syndrome (example, Calvo_2010). At least one publication reports experimental evidence evaluating an impact on protein function. The most pronounced variant effect results in defects in human mitochondrial complex I biogenesis (Formosa_2015). One clinical diagnostic laboratory has submitted clinical-significance assessments for this variant to ClinVar after 2014 without evidence for independent evaluation and classified the variant as pathogenic. Based on the evidence outlined above, the variant was classified as pathogenic. MedGen:C0023264 C0023264:Leigh syndrome criteria provided, single submitter clinical testing germline:na Women's Health and Genetics/Laboratory Corporation of America, LabCorp SCV001363290.1 FOXRED1 - - - +5 Pathogenic Dec 07, 2017 The Q232X variant in the FOXRED1 gene has been reported previously in Leigh syndrome, in an affected individual who was compound heterozygous for the Q232X variant and another FOXRED1 variant (Calvo et al., 2010). This variant is predicted to cause loss of normal protein function either through protein truncation or nonsense-mediated mRNA decay. The Q232X variant is not observed at a significant frequency in large population cohorts (Lek et al., 2016). We interpret Q232X as a pathogenic variant. Not Provided C3661900:not provided criteria provided, single submitter clinical testing germline:na GeneDx SCV000680696.2 FOXRED1 - - - +5 Pathogenic Oct 31, 2022 The FOXRED1 c.694C>T variant is predicted to result in premature protein termination (p.Gln232*). 
This variant was reported in individuals with mitochondrial complex I deficiency (Calvo et al. 2010. PubMed ID: 20818383, supplementary data; Formosa et al. 2015. PubMed ID: 25678554; Apatean et al. 2019. PubMed ID: 30723688). This variant is reported in 0.0040% of alleles in individuals of African descent in gnomAD (http://gnomad.broadinstitute.org/variant/11-126145284-C-T). Nonsense variants in FOXRED1 are expected to be pathogenic. This variant is interpreted as pathogenic. FOXRED1-related condition na:FOXRED1-related disorder criteria provided, single submitter clinical testing germline:na PreventionGenetics, part of Exact Sciences SCV004119439.1 FOXRED1 - - - +5 Pathogenic Oct 01, 2010 - MITOCHONDRIAL COMPLEX I DEFICIENCY, NUCLEAR TYPE 19 C4748791:Mitochondrial complex 1 deficiency, nuclear type 19 no assertion criteria provided literature only germline:na OMIM SCV000020158.5 FOXRED1 - - - +5 Pathogenic Dec 01, 2023 This sequence change creates a premature translational stop signal (p.Gln232*) in the FOXRED1 gene. It is expected to result in an absent or disrupted protein product. Loss-of-function variants in FOXRED1 are known to be pathogenic (PMID: 20818383, 20858599). This variant is present in population databases (rs267606829, gnomAD 0.003%). This premature translational stop signal has been observed in individual(s) with Leigh syndrome (PMID: 20818383). ClinVar contains an entry for this variant (Variation ID: 5). For these reasons, this variant has been classified as Pathogenic. 
MedGen:CN517202 C3661900:not provided criteria provided, single submitter clinical testing germline:na Labcorp Genetics (formerly Invitae), Labcorp SCV002982300.2 FOXRED1 - - - +5 Pathogenic Mar 29, 2022 - OMIM:618241 C4748791:Mitochondrial complex 1 deficiency, nuclear type 19 criteria provided, single submitter clinical testing unknown:na Fulgent Genetics, Fulgent Genetics SCV002793147.1 - - - - +6 Pathogenic Oct 01, 2010 - MITOCHONDRIAL COMPLEX I DEFICIENCY, NUCLEAR TYPE 19 C4748791:Mitochondrial complex 1 deficiency, nuclear type 19 no assertion criteria provided literature only germline:na OMIM SCV000020159.5 FOXRED1 - - - +7 Pathogenic Sep 01, 2017 This variant has been previously reported as disease-causing and was found once in our laboratory in trans with a missense variant in an 18-year-old male with mitochondrial disease OMIM:252010 C1838979:Mitochondrial complex I deficiency criteria provided, single submitter clinical testing germline:na Baylor Genetics SCV000245520.3 NUBPL - - - +7 Pathogenic Apr 23, 2013 - MITOCHONDRIAL COMPLEX I DEFICIENCY, NUCLEAR TYPE 21 C4748792:Mitochondrial complex 1 deficiency, nuclear type 21 no assertion criteria provided literature only germline:na OMIM SCV000020160.6 NUBPL - - - +9 Pathogenic Dec 09, 2019 NM_000410.3(HFE):c.845G>A(C282Y) is classified as pathogenic in the context of HFE-associated hereditary hemochromatosis. Please note that clinical symptoms are uncommon in C282Y homozygotes. Sources cited for classification include the following: PMID 9162021, 9356458, 8931958, 9341868, 9462220 and 11812557. Classification of NM_000410.3(HFE):c.845G>A(C282Y) is based on the following criteria: This is a well-established pathogenic variant in the literature that has been observed more frequently in patients with clinical diagnoses than in healthy populations. Please note: this variant was assessed in the context of healthy population screening. 
OMIM:235200 C3469186:Hemochromatosis type 1 criteria provided, single submitter clinical testing unknown:na Myriad Genetics, Inc. SCV001194044.2 HFE - - - +9 Pathogenic Dec 01, 2015 - Hereditary cancer-predisposing syndrome C0027672:Hereditary cancer-predisposing syndrome criteria provided, single submitter clinical testing germline:na Vantari Genetics SCV000267038.1 HFE - - - +9 Pathogenic Sep 03, 2024 The HFE c.845G>A variant is predicted to result in the amino acid substitution p.Cys282Tyr. In patients with transferrin-iron saturation higher than 45%, presence of the c.845G>A (p.Cys282Tyr) variant is useful in confirmation of hereditary hemochromatosis diagnosis as individuals homozygous for the variant represent 80% of cases (Bacon et al. 2011. PubMed ID: 21452290; Alexander and Kowdley. 2009. PubMed ID: 19444013; Kowdley et al. 2012. PubMed ID: 22395570). The c.845G>A (p.Cys282Tyr) variant is incompletely penetrant with ~35% of individuals homozygous for the variant having normal ferritin levels (Bacon et al. 2011. PubMed ID: 21452290). This variant is interpreted as pathogenic. HFE-related condition na:HFE-related disorder no assertion criteria provided clinical testing germline:na PreventionGenetics, part of Exact Sciences SCV004120883.3 HFE - - - +9 Pathogenic Jul 21, 2020 - OMIM:235200 C3469186:Hemochromatosis type 1 criteria provided, single submitter clinical testing germline:na Genomic Medicine Lab, University of California San Francisco SCV002576300.1 HFE - - - +9 Pathogenic May 15, 2023 - OMIM:235200 C3469186:Hemochromatosis type 1 criteria provided, single submitter clinical testing germline:2 New York Genome Center SCV003925227.2 - - - - +9 not provided - Variant identified in multiple participants and classified as Pathogenic. GenomeConnect assertions are reported exactly as they appear on the patient-provided report from the testing laboratory. GenomeConnect staff make no attempt to reinterpret the clinical significance of the variant. 
MedGen:C0018995 C0018995:Bronze diabetes no classification provided phenotyping only unknown:24 GenomeConnect, ClinGen SCV000607202.5 HFE - - - +9 Pathogenic Sep 12, 2023 The HFE c.845G>A (p.Cys282Tyr) variant has been reported in the homozygous or compound heterozygous state in many individuals affected with hereditary hemochromatosis and is considered the most common cause of hereditary hemochromatosis (Barton JC and Edwards CQ, PMID: 20301613). Studies show penetrance rates of severe iron overload to be as high as 35% and severe liver disease in 9–24% among male p.Cys282Tyr homozygotes (Grosse SD et al., PMID: 28771247). This variant has been reported in the ClinVar database as a germline pathogenic variant by many submitters. Computational predictors indicate that the variant is damaging, evidence that correlates with impact to HFE function. In support of these predictions, a homozygous mouse model showed postnatal iron loading and in vitro functional studies have shown that the variant causes reduced function (Ali-Rahmani F et al., PMID: 21243428; Boucherma R et al., PMID: 22531912; Levy JE et al., PMID: 10381492). Based on available information and the ACMG/AMP guidelines for variant interpretation (Richards S et al., PMID: 25741868), this variant is classified as pathogenic with reduced penetrance. OMIM:235200 C3469186:Hemochromatosis type 1 criteria provided, single submitter clinical testing germline:na Clinical Genomics Laboratory, Washington University in St. Louis SCV004177020.1 - - - - +9 Pathogenic Jun 07, 2022 PS3, PM3_Very Strong, PP3 HFE-related disorder na:HFE-related disorder criteria provided, single submitter clinical testing germline:na Greenwood Genetic Center Diagnostic Laboratories, Greenwood Genetic Center SCV002568182.1 HFE - - - +9 Pathogenic Jul 21, 2023 The variant NM_000410.4:c.845G>A (chr6:26092913) in HFE was detected in 7331 heterozygotes and 264 homozygotes out of 58K WGS Icelanders (MAF= 6,775%). 
Following imputation in a set of 166K Icelanders (710 imputed homozygotes) we observed an association with hemochromatosis under a recessive model using 2403 cases and 240747 controls (OR= 50.27, P= 2.69e-212). This variant has been reported multiple times in ClinVar as pathogenic. Based on ACMG criteria (PS3, PS4, PP1, PP4, PP5) this variant classifies as pathogenic. OMIM:235200 C3469186:Hemochromatosis type 1 no assertion criteria provided research germline:710 deCODE genetics, Amgen SCV004022244.1 HFE - - - +9 Pathogenic May 11, 2018 - Human Phenotype Ontology:HP:0000707;Human Phenotype Ontology:HP:0000708;Human Phenotype Ontology:HP:0000759;Human Phenotype Ontology:HP:0002027;Human Phenotype Ontology:HP:0009830;Human Phenotype Ontology:HP:0010461;Human Phenotype Ontology:HP:0012531 C0000737:Abdominal pain;C0004941:Atypical behavior;C0030193:Pain;C0031117:Peripheral neuropathy;C0497552:Abnormality of the nervous system;C4023819:Abnormality of the male genitalia;C4025831:Abnormal peripheral nervous system morphology criteria provided, single submitter clinical testing germline:1 Knight Diagnostic Laboratories, Oregon Health and Sciences University SCV001448752.1 HFE - - - +9 Pathogenic Jan 05, 2022 The c.845G>A;p.(Cys282Tyr) missense variant has been observed in affected individual(s) and ClinVar contains an entry for this variant (ClinVar ID: 9; OMIM: 613609.0001; PMID: 20301613; 27659401; 26365338; 19084217; 11040194; 23953397; 26365338) - PS4. Well-established in vitro or in vivo functional studies support a damaging effect on the gene or gene product (PMID: 11040194; 23953397; 9162021; 9356458) - PS3_moderate. The variant is located in a mutational hot spot and/or critical and well-established functional domain (Immunoglobulin C1-set domain) - PM1. 
The p.(Cys282Tyr) was detected in trans with a pathogenic variant (PMID: 15507752; 17384005; 26244503; 25850353; 25277871; 24401005; 23953397; 32153640; 11478530; 26365338) - PM3_very strong The variant co-segregated with disease in multiple affected family members (PMID: 32153640; 11478530) - PP1. Multiple lines of computational evidence support a deleterious effect on the gene or gene product - PP3. In summary, the currently available evidence indicates that the variant is pathogenic. MedGen:C0392514 C0392514:Hereditary hemochromatosis criteria provided, single submitter clinical testing germline:2 DASA SCV002061285.1 HFE - - - +9 Pathogenic Sep 25, 2024 Criteria applied: PS3,PM3_STR,PP3,PP4 OMIM:235200 C3469186:Hemochromatosis type 1 criteria provided, single submitter clinical testing unknown:na Institute of Human Genetics, University of Leipzig Medical Center SCV002044430.3 HFE - - - +9 Pathogenic Sep 15, 2021 - not provided C3661900:not provided criteria provided, single submitter clinical testing germline:na Institute of Medical Genetics and Applied Genomics, University Hospital Tübingen SCV001905583.1 - - - - +9 not provided - Variant reported in multiple Invitae PIN participants by multiple clinical testing laboratories. Variant interpreted as Pathogenic by all laboratories and reported most recently on 11/20/2019 by Illumina and 6/19/2020 by Invitae. GenomeConnect-Invitae Patient Insights Network assertions are reported exactly as they appear on the patient-provided report from the testing laboratory. Registry team members make no attempt to reinterpret the clinical significance of the variant. Phenotypic details are available under supporting information. 
MedGen:C0392514 C0392514:Hereditary hemochromatosis no classification provided phenotyping only unknown:2 GenomeConnect - Invitae Patient Insights Network SCV001749341.1 HFE - - - +9 Pathogenic Jun 12, 2023 - Not provided C3661900:not provided criteria provided, single submitter clinical testing germline:32 Mayo Clinic Laboratories, Mayo Clinic SCV002525758.2 HFE - - - +9 Pathogenic Jun 24, 2019 - OMIM:235200 C3469186:Hemochromatosis type 1 criteria provided, single submitter clinical testing germline:na Centogene AG - the Rare Disease Company SCV002028313.1 HFE - - - +9 Pathogenic Jun 01, 2021 • The p.Cys282Tyr variant in the HFE gene has been identified in the homozygous state in approximately 60- 90% of individuals of European ancestry with HFE hemochromatosis, and in the compound heterozygous state with p.His63Asp in approximately 3-8% of individuals of European ancestry with HFE hemochromatosis (Barton and Edwards, 2018). • The p.Cys282Tyr variant is associated with a high penetrance for biochemical evidence of iron overload, but with a low penetrance for clinical manifestations of iron overload with studies reporting evidence of clinical disease present in as low as 2% and as high as 33% of p.Cys282Tyr homozygotes (Beutler et al., 2002; Whitlock et al., 2006). • Individuals heterozygous for the p.Cys282Tyr variant may demonstrate evidence of biochemical disease, including mildly elevated serum transferrin-iron saturation and serum ferritin concentration, but do not develop clinical manifestations of disease (Allen et al., 2008; Pedersen and Milman, 2009). • This variant has been identified in 7,435/128,950 European (non-Finnish) chromosomes (9,544/282,608 chromosomes overall) by the Genome Aggregation Database (http://gnomad.broadinstitute.org/). 
Although the p.Cys282Tyr variant is seen at a frequency greater than 5% in the general population, this variant is recognized as a common low-penetrant variant that is an exception to ACMG/AMP classification guidelines (Ghosh et al., 2018). • These data were assessed using the ACMG/AMP variant interpretation guidelines. In summary, there is sufficient evidence to classify the p.Cys282Tyr variant as pathogenic for autosomal recessive HFE hemochromatosis based on the information above. [ACMG evidence codes used: PS4; PP3] OMIM:235200 C3469186:Hemochromatosis type 1 criteria provided, single submitter clinical testing germline:1 Clinical Genomics Laboratory, Stanford Medicine SCV004803175.1 HFE - - - +9 Pathogenic Sep 23, 2021 - OMIM:235200 C3469186:Hemochromatosis type 1 no assertion criteria provided clinical testing germline:na Clinical Genetics Laboratory, University Hospital Schleswig-Holstein SCV002011713.1 HFE - - - +9 Pathogenic Mar 30, 2016 The c.845G>A (p.Cys282Tyr) missense variant is widely recognized as one of the two most common disease-causing variants in the HFE gene. Cys282Tyr homozygotes account for 80-85% of typical patients with Hereditary Hemochromatosis (HH). However, the majority of individuals who are homozygous for this variant do not develop the disease (GeneReviews, Kowdley et al., 2012; Ramrakhiani and Bacon, 1998; and Morrison et al., 2003). In summary, this variant c.845G>A (p.Cys282Tyr) meets our criteria for a Pathogenic classification. We have confirmed this finding in our laboratory using Sanger sequencing. 
OMIM:235200 C3469186:Hemochromatosis type 1 criteria provided, single submitter clinical testing germline:na Knight Diagnostic Laboratories, Oregon Health and Sciences University SCV000223934.2 HFE - - - +9 Pathogenic Sep 14, 2015 - OMIM:235200 C3469186:Hemochromatosis type 1 criteria provided, single submitter clinical testing germline:na Genetic Services Laboratory, University of Chicago SCV000151394.2 HFE - - - +9 Pathogenic Nov 26, 2015 - MedGen:C0392514 C3469186:Hemochromatosis type 1 criteria provided, single submitter clinical testing germline:15 Blueprint Genetics SCV000206975.3 HFE - - - +9 Pathogenic Jul 25, 2019 - OMIM:235200 C3469186:Hemochromatosis type 1 criteria provided, single submitter clinical testing unknown:18 Equipe Genetique des Anomalies du Developpement, Université de Bourgogne SCV000883106.1 HFE - - - +9 other May 31, 2018 - not provided C3661900:not provided criteria provided, single submitter clinical testing germline:124 Eurofins Ntd Llc (ga) SCV000230091.5 HFE Variant classified as "other reportable" - variant is clinically benign (not associated with disease) but is reported when observed (e.g. pseudodeficiency alleles). 
- - +9 Pathogenic Jul 01, 2024 Common pathogenic variant associated with hereditary hemochromatosis (PMID: 23953397, 8696333); Published functional studies demonstrate a damaging effect as C282Y results in a protein that does not reach the cell surface and is subject to accelerated degradation (PMID: 21243428, 9356458); This variant is associated with the following publications: (PMID: 9356458, 23792061, 32153640, 34490613, 26474245, 29969830, 23953397, 19084217, 19159930, 19271219, 20117027, 19176287, 24604426, 12707220, 22693327, 19258483, 20031541, 20031565, 9836708, 23121079, 20640879, 20946107, 22531912, 23178241, 20099304, 22611049, 20669231, 19820015, 21785125, 23222517, 21514009, 19429178, 22209421, 23281741, 20560808, 17450498, 8696333, 26501199, 27661980, 27659401, 26365338, 25916738, 27153395, 25767899, 11903355, 29555771, 30291871, 30374069, 15254010, 31019283, 31028937, 31640930, 29301508, 25287020, 32189932, 31447099, 30145563, 31980526, 26893171, 32228506, 34426522, 9630070, 9674544, 11336458, 11478530, 11531973, 11976822, 9382962, 10520044, 32641076, 11565552, 9858243, 19912313, 10792295, 11181289, 10090890, 11500063, 11189980, 32874917, 37937776, 27816425, 37443404, 38195192, 28399358, 29145899, 35499102, 27784128, 21243428) Not Provided C3661900:not provided criteria provided, single submitter clinical testing germline:na GeneDx SCV000329362.9 HFE - - - +9 Pathogenic Feb 02, 2022 The HFE c.845G>A (p.Cys282Tyr) missense variant results in the substitution of cysteine at amino acid position 282 to tyrosine. This variant is one of the two most common and well-studied pathogenic variants associated with HFE hemochromatosis. Approximately 60-90% of individuals of European ancestry with HFE hemochromatosis are homozygous for the variant and between 3-8% of individuals are compound heterozygous (Feder et al. 1996; Morrison et al. 2003; Gallego et al. 2015; Press et al. 2016; Barton and Edwards 2018). 
Disease penetrance for c.845G>A variant carriers is variable (Beutler et al. 2002; Pedersen et al. 2009; Gurrin et al. 2009), with homozygotes being at a greater risk for iron overload than compound heterozygotes (Gallego et al. 2015; Barton and Edwards 2018). The c.845G>A variant affects HFE protein activity by preventing the formation of a disulfide bridge in the alpha-3 domain, which impairs the beta-2-microglobulin interaction and prevents the protein from reaching the cell surface (Feder et al. 1997; Barton and Edwards 2018). The c.845G>A variant has a frequency of 5-7% in Caucasians (Press et al. 2016) and is reported at a frequency of 0.064660 in the European (non-Finnish) population (including 137 homozygotes) of the Genome Aggregation Database (version 3.1.2). This allele frequency is high but is consistent with estimates of disease prevalence. Based on the available evidence, the c.845G>A (p.Cys282Tyr) variant is classified as pathogenic for HFE hemochromatosis but is noted to have reduced penetrance. MedGen:C3469186 C3469186:Hemochromatosis type 1 criteria provided, single submitter clinical testing unknown:na Illumina Laboratory Services, Illumina SCV000461887.4 - - - - +9 Pathogenic Jun 05, 2017 The c.845G>A (p.Cys282Tyr) variant in the HFE gene in the homozygous state has been reported as a common cause of hereditary hemochromatosis with high penetrance of biochemically defined iron overload but low penetrance of clinically defined iron overload [OMIM:613609.0001; PMID 8896549, 10381492, 18199861]. This variant has been detected at high frequency in the ExAC population database (up to 5% in Europeans) (http://exac.broadinstitute.org/variant/6-26093141-G-A). Cysteine at amino acid position 282 of the HFE protein is highly conserved in mammals and computer-based algorithms predict this p.Cys282Tyr change to be deleterious. This variant is classified as pathogenic.
Apparent homozygosity of this variant may be caused by the presence of the mutant allele on both alleles of this individual, or the presence of a mutant allele on one allele and an exonic deletion on the opposite allele. Copy number variant (CNV) analysis or segregation analysis is necessary to assess the apparent homozygosity status of this variant. OMIM:235200 C3469186:Hemochromatosis type 1 criteria provided, single submitter clinical testing germline:na Human Genome Sequencing Center Clinical Lab, Baylor College of Medicine SCV000839959.1 HFE - - - +9 Pathogenic, low penetrance Jan 31, 2024 This sequence change replaces cysteine, which is neutral and slightly polar, with tyrosine, which is neutral and polar, at codon 282 of the HFE protein (p.Cys282Tyr). This variant is present in population databases (rs1800562, gnomAD 6%), and has an allele count higher than expected for a pathogenic variant. This is a common, low penetrance variant that is known to contribute to hemochromatosis when homozygous or present with a second pathogenic allele in HFE. As many as 90% of individuals of European descent who are affected with hemochromatosis are homozygous for this variant (PMID: 16132052, 26153218, 26365338). ClinVar contains an entry for this variant (Variation ID: 9). Advanced modeling of protein sequence and biophysical properties (such as structural, functional, and spatial information, amino acid conservation, physicochemical variation, residue mobility, and thermodynamic stability) performed at Invitae indicates that this missense variant is expected to disrupt HFE protein function with a positive predictive value of 80%. Experimental studies have shown that this missense change disrupts a disulfide bond in the α3 domain of the HFE protein and impairs interaction of HFE with beta2-microglobulin, resulting in a block in intracellular transport and loss of cell surface expression of the Cys282Tyr variant protein (PMID: 9162021, 9356458). 
In summary, this variant is reported to cause disease. However, as this variant is associated with a lower penetrance than other pathogenic alleles in the HFE gene, it has been classified as Pathogenic (low penetrance). MedGen:C0392514 C0392514:Hereditary hemochromatosis criteria provided, single submitter clinical testing germline:na Labcorp Genetics (formerly Invitae), Labcorp SCV000219175.11 HFE - - - +9 Pathogenic Apr 15, 2021 PS3, PP5, PS4, PM3 Cardiomyopathy C0878544:Cardiomyopathy criteria provided, single submitter clinical testing germline:na Clinical Genetics Laboratory, Region Ostergotland SCV001984982.1 HFE - - - +9 Pathogenic Jul 31, 2024 - not provided C3661900:not provided criteria provided, single submitter clinical testing germline:na Center for Genomic Medicine, Rigshospitalet, Copenhagen University Hospital SCV002550674.6 HFE - - - +9 risk factor Mar 04, 2020 HFE c.845G>A (p.Cys282Tyr) has been associated with increased risk for hemochromatosis. This variant has been observed in multiple ethnic backgrounds with highest frequencies in individuals of European ancestry (5.7%, Genome Aggregation Database (gnomAD); rs1800562) and is present in ClinVar (ID: 9). A large meta-analysis has reported an odds ratio of 1.2 [95% CI 0.8-1.6] for developing liver disease in heterozygous carriers (Ellervik 2007). In vitro and in vivo functional studies provide some evidence that this variant may impact protein function (Ali-Rahmani 2011, Boucherma 2012). In summary, this variant is uncertain risk allele for hemochromatosis in heterozygous state. HFE c.845G>A (p.Cys282Tyr) has been associated with increased risk for hemochromatosis. This variant has been observed in multiple ethnic backgrounds with highest frequencies in individuals of European ancestry (5.7%, Genome Aggregation Database (gnomAD); rs1800562) and is present in ClinVar (ID: 9). 
A large meta-analysis has reported an odds ratio of 3.9 [95% CI 1.9-8.1] for developing liver disease in homozygous carriers (Ellervik 2007). In vitro and in vivo functional studies provide some evidence that this variant may impact protein function (Ali-Rahmani 2011, Boucherma 2012). In summary, this variant is established risk allele for hemochromatosis in homozygous state. Orphanet:ORPHA79230 C0268060:Juvenile hemochromatosis criteria provided, single submitter clinical testing germline:104 Laboratory for Molecular Medicine, Mass General Brigham Personalized Medicine SCV000221190.4 HFE - - - +9 Pathogenic Feb 23, 2021 Variant summary: HFE c.845G>A (p.Cys282Tyr) results in a non-conservative amino acid change located in the Immunoglobulin C1-set domain (IPR003597) of the encoded protein sequence. Five of five in-silico tools predict a damaging effect of the variant on protein function. The variant allele was found at a frequency of 0.033 in 251236 control chromosomes in the gnomAD database, including 244 homozygotes. c.845G>A has been reported in the literature as the most common mutation found in individuals with Hemochromatosis Type 1, being identified as homozygous or compound heterozygous with another pathogenic variant in approximately 80-90% of reported cases, most frequently in individuals of European ancestry (e.g. Feder_1996, LeGac_2004, Beutler_2002, Yonal_2007, vanGemmeren_2015, deTayrac_2015, Zhang_2020). These data indicate that the variant is likely to be associated with disease, however the variant appears to have significantly reduced penetrance, as the majority of homozygous or compound heterozygous individuals with this variant do not exhibit clinical symptoms of the disorder despite some cases having elevated serum ferritin and transferrin saturation levels (e.g. Feder_1996, Beutler_2002, Yonal_2007). 
The mechanisms behind the variable expressivity of this variant are not known, but it has been proposed that other genetic variants could modify the phenotype exhibited by individuals who are homozygous for this variant (e.g. LeGac_2004, deTayrac_2015). In-vitro experimental evidence suggests that the Cys282Tyr-mutant protein has impaired intracellular trafficking and accelerated degradation compared to wild-type HFE (e.g. Waheed_1997) and that cells expressing the variant have altered expression of genes involved in sphingolipid metabolism (e.g. Ali-Rahmani_2011). In addition, an in-vivo study reported a loss of CD8+ T-cell tolerance to HFE in transgenic mice expressing the C282Y variant (e.g. Boucherma_2012) . Seventeen clinical diagnostic laboratories have submitted clinical-significance assessments for this variant to ClinVar after 2014 without evidence for independent evaluation. Sixteen of these laboratories cited the variant as pathogenic/likely pathogenic or as a risk factor for disease. Based on the evidence outlined above, the variant was classified as pathogenic with low penetrance for developing Hemochromatosis. MedGen:C3469186 C3469186:Hemochromatosis type 1 criteria provided, single submitter clinical testing germline:na Women's Health and Genetics/Laboratory Corporation of America, LabCorp SCV001519562.1 HFE - - - +9 Pathogenic Dec 12, 2023 - not provided C3661900:not provided criteria provided, single submitter clinical testing germline:na Clinical Genetics Laboratory, Skane University Hospital Lund SCV005198438.1 HFE - - - +9 Pathogenic Mar 08, 2022 - Not provided C3661900:not provided criteria provided, single submitter clinical testing germline:120 AiLife Diagnostics, AiLife Diagnostics SCV002502491.1 HFE - - - +9 Pathogenic Sep 27, 2021 The c.845G>A (p.C282Y) alteration is located in coding exon 4 of the HFE gene. 
This alteration results from a G to A substitution at nucleotide position 845, causing the cysteine (C) at amino acid position 282 to be replaced by a tyrosine (Y). Based on data from gnomAD, the A allele has an overall frequency of 3.38% (9544/282608) total alleles studied. The highest observed frequency was 5.77% (7435/128950) of European (non-Finnish) alleles. This alteration is the most common cause of hereditary hemochromatosis (Allen, 2008). In homozygous individuals, up to 50% may develop iron overload, with 10-33% developing hemochromatosis-associated morbidity (EASL, 2010). Men appear to have a higher risk for disease development than women. In homozygous men, 84% display elevated transferrin-iron saturation and 88% have elevated serum ferritin concentration. In comparison, fewer homozygous women have elevated transferrin-iron saturation and serum ferritin concentration (73% and 57%, respectively). However, when p.C282Y is compound heterozygous with another pathogenic alteration, disease penetrance is significantly lower (Adams, 1997). This amino acid position is highly conserved in available vertebrate species. Functional studies have shown that this alteration leads to impaired intracellular transport of the protein and degradation before reaching the cell surface (Feder, 1997; Waheed, 1997). This alteration is predicted to be deleterious by in silico analysis. Based on the available evidence, this alteration is classified as pathogenic. 
MedGen:C0950123 C0950123:Inborn genetic diseases criteria provided, single submitter clinical testing germline:na Ambry Genetics SCV003702847.3 HFE - - - +9 Pathogenic Dec 25, 2023 - OMIM:235200 C3469186:Hemochromatosis type 1 criteria provided, single submitter clinical testing unknown:na Baylor Genetics SCV001523198.4 - - - - +9 Pathogenic - The HFE p.Cys282Tyr variant is a common variant known to cause hereditary hemochromatosis; over 80% of hereditary hemochromatosis patients are homozygous for the p.C282Y variant (Feder_1996_PMID:8696333; Morrison_2003_PMID:12693884). The variant was identified in dbSNP (ID: rs1800562), in ClinVar (classified as pathogenic 13 times, likely pathogenic once and as a VUS once) and LOVD 3.0 (classified as pathogenic). The variant was identified in control databases in 9544 of 282608 chromosomes (276 homozygous) at a frequency of 0.033771 increasing the likelihood this could be a low frequency benign variant (Genome Aggregation Database Feb 27, 2017). The variant was observed in the following populations: European (non-Finnish) in 7435 of 128950 chromosomes (freq: 0.05766), Other in 281 of 7224 chromosomes (freq: 0.0389), European (Finnish) in 879 of 25108 chromosomes (freq: 0.03501), Latino in 494 of 35430 chromosomes (freq: 0.01394), Ashkenazi Jewish in 124 of 10366 chromosomes (freq: 0.01196), African in 260 of 24962 chromosomes (freq: 0.01042), South Asian in 68 of 30616 chromosomes (freq: 0.002221), and East Asian in 3 of 19952 chromosomes (freq: 0.00015). The p.Cys282 residue is conserved across mammals and other organisms, and four out of five computational analyses (PolyPhen-2, SIFT, AlignGVGD, BLOSUM, MutationTaster) suggest that the variant may impact the protein. Functional studies of the p.C282Y variant have demonstrated abnormal protein interaction, expression, processing and localization (Feder_1997_PMID:9162021; Waheed_1997_PMID:9356458). 
In summary, based on the above information this variant meets our laboratory’s criteria to be classified as pathogenic. not provided C3661900:not provided no assertion criteria provided clinical testing unknown:na Department of Pathology and Laboratory Medicine, Sinai Health System SCV001549492.1 HFE - - - +9 Pathogenic Nov 05, 2023 This sequence change in HFE is predicted to replace cysteine with tyrosine at codon 282, p.(Cys282Tyr). The cysteine residue is highly conserved (100 vertebrates, UCSC), and alters a critical cysteine residue involved in a disulfide bond in the Ig-like C2 type domain and prevents HFE protein presentation (PMID: 20301613). There is a large physicochemical difference between cysteine and tyrosine. The highest population minor allele frequency in the population database gnomAD v2.1 is 5.6% (7,345/128,950 alleles, 243 homozygotes) in the European non-Finnish population. This variant is reported as the common cause of HFE-related haemochromatosis. It has been reported in multiple individuals with haemochromatosis who were either homozygous or compound heterozygous for the variant (PMID: 19159930, 32153640, 11903354). The variant has been reported to segregate with haemochromatosis in multiple affected individuals from unrelated families (PMID: 10575540, 27518069). In vitro functional assays with limited validation showed a significant impairment to protein trafficking and accelerated protein degradation indicating that this variant impacts protein function (PMID: 9162021, 9356458). A transgenic mouse model for the variant showed an increased predisposition to iron loading (PMID: 10381492). Computational evidence predicts a deleterious effect for the missense substitution (REVEL = 0.872). Based on the classification scheme RMH Modified ACMG/AMP Guidelines v1.6.1, this variant is classified as PATHOGENIC. Following criteria are met: BS1, PM3_VeryStrong, PM1, PP1_Strong, PP3, PS3_Moderate. 
MONDO:MONDO:0021001 C3469186:Hemochromatosis type 1 criteria provided, single submitter clinical testing germline:na Molecular Genetics, Royal Melbourne Hospital SCV004812520.1 HFE - - - +9 Pathogenic Jun 30, 2022 - OMIM:235200 C3469186:Hemochromatosis type 1 criteria provided, single submitter clinical testing germline:16 MGZ Medical Genetics Center SCV002580992.1 HFE - - - +9 Pathogenic - The HFE c.845G>A (p.C282Y) variant is a pathogenic variant observed in 3.4% of the human population (gnomAD). Individuals that are homozygous for the p.C282Y variant have a greater risk of developing iron overload compared to individuals with compound heterozygous variants (i.e. c.845G>A p.C282Y and c.187C>G p.H63D in trans) (PMID: 20301613). MedGen:C3469186 C3469186:Hemochromatosis type 1 criteria provided, single submitter research germline:4 UNC Molecular Genetics Laboratory, University of North Carolina at Chapel Hill SCV001251531.1 HFE - - - +9 Pathogenic - - OMIM:235200 C3469186:Hemochromatosis type 1 no assertion criteria provided research unknown:na Genomics And Bioinformatics Analysis Resource, Columbia University SCV004024088.1 HFE - - - +9 Pathogenic Aug 01, 2024 HFE: PM3:Very Strong, PS3, PM2:Supporting MedGen:CN517202 C3661900:not provided criteria provided, single submitter clinical testing germline:95 CeGaT Center for Human Genetics Tuebingen SCV001246053.24 HFE - - - +9 Uncertain significance Apr 12, 2014 - Human Phenotype Ontology:HP:0000992;Human Phenotype Ontology:HP:0010473 C0151861:Porphyrinuria;C0349506:Cutaneous photosensitivity flagged submission clinical testing unknown:na Centre for Mendelian Genomics, University Medical Centre Ljubljana SCV000493004.1 HFE - - - +9 Pathogenic Jun 17, 2022 ACMG Criteria: PS3, PS4, PM3, PP1_M, PP5; Variant was found in compound heterozygous state with NM_000410.4:c.187C>G. 
OMIM:235200 C3469186:Hemochromatosis type 1 criteria provided, single submitter clinical testing germline:na Institute of Immunology and Genetics Kaiserslautern SCV005382109.1 - - - - +9 not provided - Variant classified as Pathogenic and reported on 05-24-2022 by GeneDx. Assertions are reported exactly as they appear on the patient provided laboratory report. GenomeConnect does not attempt to reinterpret the variant. The IDDRC-CTSA National Brain Gene Registry (BGR) is a study funded by the U.S. National Center for Advancing Translational Sciences (NCATS) and includes 13 Intellectual and Developmental Disability Research Center (IDDRC) institutions. The study is led by Principal Investigator Dr. Philip Payne from Washington University. The BGR is a data commons of gene variants paired with subject clinical information. This database helps scientists learn more about genetic changes and their impact on the brain and behavior. Participation in the Brain Gene Registry requires participation in GenomeConnect. More information about the Brain Gene Registry can be found on the study website - https://braingeneregistry.wustl.edu/. MedGen:C0392514 C0392514:Hereditary hemochromatosis no classification provided phenotyping only biparental:1 GenomeConnect - Brain Gene Registry SCV003931195.1 HFE - - - +9 Pathogenic May 20, 2023 The missense variant c.845G>A(p.Cys282Tyr) in HFE gene has been reported in homozygous state in multiple individuals affected with hemochromatosis (Porto G et. al., 2016; Gallego et. al., 2015). Experimental studies have shown that this missense change disrupts a disulfide bond in the α3 domain of the HFE protein and impairs interaction of HFE with beta2-microglobulin, resulting in a block in intracellular transport and loss of cell surface expression of the Cys282Tyr variant protein (Waheed et. al., 1997). The observed variant has allele frequency of 3.3% in gnomAD exomes database. 
This variant has been submitted to the ClinVar database as risk factor / Uncertain Significance / Pathogenic (multiple submissions). The reference amino acid change p.Cys282Tyr in HFE is predicted as conserved by GERP++ and PhyloP across 100 vertebrates. The amino acid Cys at position 282 is changed to a Tyr changing protein sequence and it might alter its composition and physico-chemical properties. For these reasons, this variant has been classified as Pathogenic. OMIM:235200 C3469186:Hemochromatosis type 1 criteria provided, single submitter clinical testing germline:na Neuberg Centre For Genomic Medicine, NCGM SCV005382430.1 HFE - - - +9 Pathogenic May 28, 2019 - OMIM:235200 C3469186:Hemochromatosis type 1 criteria provided, single submitter clinical testing unknown:na Mendelics SCV001137062.1 HFE - - - +9 Pathogenic Dec 21, 2023 Based on the classification scheme VCGS_Germline_v1.3.4, this variant is classified as Pathogenic. Following criteria are met: 0102 - Loss of function is a known mechanism of disease in this gene and is associated with haemochromatosis (MIM#235200). (I) 0106 - This gene is associated with autosomal recessive disease. (I) 0112 - The condition associated with this gene has incomplete penetrance. The highest biochemical and clinical penetrance has been reported in p.(Cys282Tyr) homozygotes (PMID: 20301613). (I) 0200 - Variant is predicted to result in a missense amino acid change from cysteine to tyrosine. (I) 0252 - This variant is homozygous. (I) 0307 - Variant is present in gnomAD at a frequency >=0.05 (v2: 8992 heterozygotes, 276 homozygotes). (SB) 0501 - Missense variant consistently predicted to be damaging by multiple in silico tools or highly conserved with a major amino acid change. (SP) 0600 - Variant is located in the annotated IgC MHC I alpha3 functional domain (NCBI). (I) 0801 - This variant has very strong previous evidence of pathogenicity in unrelated individuals. 
It has previously been described as pathogenic in multiple patients with haemochromatosis (ClinVar; PMIDs: 37260121, 9162021, 19159930); either in a homozygous state or in trans with NP_000401.1(HFE):p.(His63Asp). (SP) 1002 - This variant has moderate functional evidence supporting abnormal protein function. Functional analysis using transfected cell lines showed defects in HFE protein intracellular transport and cell surface expression (PMID: 9162021). (SP) 1208 - Inheritance information for this variant is not currently available in this individual. (I) Legend: (SP) - Supporting pathogenic, (I) - Information, (SB) - Supporting benign OMIM:235200 C3469186:Hemochromatosis type 1 criteria provided, single submitter clinical testing germline:na Victorian Clinical Genetics Services, Murdoch Childrens Research Institute SCV005086593.1 HFE - - - +9 Pathogenic Oct 25, 2022 - MedGen:CN235283 C3661900:not provided criteria provided, single submitter clinical testing germline:na ARUP Laboratories, Molecular Genetics and Genomics, ARUP Laboratories SCV003799234.2 - - - - +9 not provided - - OMIM:235200 C3469186:Hemochromatosis type 1 no classification provided literature only germline:na GeneReviews SCV000245793.3 - - - - +9 Pathogenic Mar 30, 2021 HFE NM_000410.3 exon 4 p.Cys282Tyr (c.845G>A): This variant has been reported in the literature in the homozygous or compound heterozygous state in many individuals with hereditary hemochromatosis (HH) (Allen 2008 PMID:18199861, Pederson 2009 PMID:19159930, Cezard 2014 PMID:23953397, Gallego 2015 PMID:26365338) and is reported to be the most common cause of HH (Le Gac 2005 PMID:16132052, Gallego 2015 PMID:26365338, Porto 2016 PMID:26153218). This variant is present in 3.3% (9544/282608) of total alleles in the Genome Aggregation Database, including 276 homozygotes (https://gnomad.broadinstitute.org/variant/6-26093141-G-A). 
Please note, disease causing variants may be present in control databases at low frequencies, reflective of the general population, carrier status, and/or variable expressivity. This variant is present in ClinVar, with several labs classifying this variant as pathogenic (Variation ID:9). Evolutionary conservation and computational predictive tools suggest that this variant may impact the protein. In addition, an in vivo mouse study showed postnatal iron loading in mice homozygous for this variant (Levy 1999 PMID:10381492), and in vitro functional studies have shown that the mutant protein is retained in the ER and is unable to interact with beta2-microglobulin (Feder 1997 PMID:9162021, Waheed 1997 PMID:9356458). However, these studies may not accurately represent in vivo biological function. In summary, this variant is classified as pathogenic based on the data above. OMIM:104300;OMIM:176100;OMIM:176200;OMIM:235200;OMIM:612635;OMIM:614193 C0162532:Variegate porphyria;C0268323:Familial porphyria cutanea tarda;C1863052:Alzheimer disease type 1;C2673520:Microvascular complications of diabetes, susceptibility to, 7;C3280096:Transferrin serum level quantitative trait locus 2;C3469186:Hemochromatosis type 1 criteria provided, single submitter clinical testing germline:na Center for Genomics, Ann and Robert H. 
Lurie Children's Hospital of Chicago SCV003920032.1 HFE - - - +9 Benign Jan 01, 2009 - HEMOCHROMATOSIS, TYPE 1 C3469186:Hemochromatosis type 1 no assertion criteria provided literature only germline:na OMIM SCV000020162.9 HFE - - - +10 Likely pathogenic Mar 26, 2024 - OMIM:235200 C3469186:Hemochromatosis type 1 criteria provided, single submitter clinical testing germline:na Center for Genomic Medicine, King Faisal Specialist Hospital and Research Center SCV004806939.1 - - - - +10 Pathogenic Mar 28, 2023 - OMIM:235200 C3469186:Hemochromatosis type 1 criteria provided, single submitter clinical testing biparental:na Genomic Medicine Lab, University of California San Francisco SCV004847117.1 HFE - - - +10 Pathogenic - The HFE c.187C>G (p.H63D) variant is a pathogenic variant seen in 10.8% of the human population in gnomAD. Individuals with the p.H63D variant are considered carriers of hemochromatosis, although this variant is associated with less severe iron overload and reduced penetrance compared to another pathogenic HFE variant, c.845G>A, p.C282Y (PMID: 19159930; 20301613). MedGen:C3469186 C3469186:Hemochromatosis type 1 criteria provided, single submitter research germline:11 UNC Molecular Genetics Laboratory, University of North Carolina at Chapel Hill SCV001251532.1 HFE - - - +10 Pathogenic May 13, 2021 • The p.His63Asp variant in the HFE gene has been identified in the homozygous state in approximately 1% of individuals of European ancestry with HFE hemochromatosis, and in the compound heterozygous state with p.Cys282Tyr in approximately 3-8% of individuals of European ancestry with HFE hemochromatosis (Barton and Edwards, 2018). • The p.His63Asp variant is described as a low-penetrant allele and is rarely associated with clinical disease in the homozygous or compound heterozygous state (Gochee et al., 2002; Gurrin et al., 2009). 
• Individuals heterozygous for the p.His63Asp variant may demonstrate evidence of biochemical disease, including mildly elevated serum transferrin-iron saturation and serum ferritin concentration, but do not develop clinical manifestations of disease (Allen et al., 2008; Pedersen and Milman, 2009). • This variant has been identified in 18,635/129,168 European (non-Finnish) chromosomes (30,592/282,844 chromosomes overall) by the Genome Aggregation Database (http://gnomad.broadinstitute.org/). Although the p.His63Asp variant is seen at a frequency greater than 5% in the general population, this variant is recognized as a common low-penetrant variant that is an exception to ACMG/AMP classification guidelines (Ghosh et al., 2018). • These data were assessed using the ACMG/AMP variant interpretation guidelines. In summary, there is sufficient evidence to classify the p.His63Asp variant as pathogenic for autosomal recessive HFE hemochromatosis based on the information above. [ACMG evidence codes used: PS4] OMIM:235200 C3469186:Hemochromatosis type 1 criteria provided, single submitter clinical testing germline:1 Clinical Genomics Laboratory, Stanford Medicine SCV004801387.1 HFE - - - +10 Pathogenic - The observed missense c.187C>G(p.His63Asp) variant in HFE gene has been reported previously in homozygous or compound heterozygous state in multiple individuals affected with hemochromatosis (Atkins et al., 2022), however, penetrance of the homozygous genotype is very low and is associated with variable phenotypes. Experimental studies have shown that this missense change affects HFE function (Tomatsu et al., 2003). This variant has been reported with the high allele frequency of 10.9% in the gnomAD Exomes. This variant has been submitted to the ClinVar database with Benign / Uncertain Significance / Risk factor / Pathogenic (multiple submitters). 
The amino acid His at position 63 is changed to a Asp changing protein sequence and it might alter its composition and physico-chemical properties. The amino acid change p.His63Asp in HFE is predicted as conserved by GERP++ and PhyloP across 100 vertebrates.Though the variant frequency is very high in the population, the variant is enriched in patints with HFE hemochromatosis as compared to the general population (Burke et al., 2000). For these reasons, this variant has been classified as a Pathogenic variant which acts as a risk factor for the development of the disease. OMIM:235200 C3469186:Hemochromatosis type 1 criteria provided, single submitter clinical testing germline:na Neuberg Centre For Genomic Medicine, NCGM SCV005061024.1 HFE - - - +10 Pathogenic Dec 21, 2021 ACMG classification criteria: PS3, PS4, PM3 OMIM:235200 C3469186:Hemochromatosis type 1 criteria provided, single submitter clinical testing germline:na Laboratorio de Genetica e Diagnostico Molecular, Hospital Israelita Albert Einstein SCV004183355.1 HFE - - - +10 Pathogenic Nov 25, 2023 - OMIM:235200 C3469186:Hemochromatosis type 1 criteria provided, single submitter clinical testing unknown:na Baylor Genetics SCV001523197.4 - - - - +10 Pathogenic Feb 01, 2024 HFE: PM3:Strong, PM1, PP4:Moderate, PS3:Moderate, PM2:Supporting MedGen:CN517202 C3661900:not provided criteria provided, single submitter clinical testing germline:118 CeGaT Center for Human Genetics Tuebingen SCV001154674.25 HFE - - - +10 Pathogenic Mar 25, 2022 - Not provided C3661900:not provided criteria provided, single submitter clinical testing germline:237 AiLife Diagnostics, AiLife Diagnostics SCV002502480.1 HFE - - - +10 Pathogenic May 23, 2023 - Not provided C3661900:not provided criteria provided, single submitter clinical testing germline:62 Mayo Clinic Laboratories, Mayo Clinic SCV001715880.3 HFE - - - +10 Pathogenic Feb 26, 2021 Variant summary: HFE c.187C>G (p.His63Asp) results in a non-conservative amino acid change 
located in the MHC class I-like antigen recognition-like domain (IPR011161) of the encoded protein sequence. Three of five in-silico tools predict a benign effect of the variant on protein function. The variant allele was found at a frequency of 0.11 in 251484 control chromosomes in the gnomAD database, including 1832 homozygotes. c.187C>G has been reported as a common disease variant in the literature in individuals affected with Hemochromatosis Type 1, in both homozygous and compound heterozygous states, but most frequently in trans with the most common disease variant c.845G>A (p.Cys282Tyr) (e.g. Feder_1996, Kelley_2014). These data indicate that the variant is likely to be associated with disease, however the variant appears to have very low penetrance, as the majority of homozygous or compound heterozygous individuals with this variant do not exhibit clinical symptoms of hemochromatosis despite some cases having elevated serum ferritin and transferrin saturation levels (e.g. Beutler_2002, Pedersen_2009). Several publications report experimental evidence evaluating an impact on protein function. While p.His63Asp was shown to have normal levels of association with beta2-globulin and expression of HFE on the cell surface in contrast to impairment observed in cells with the other common pathogenic variant p.Cys282Tyr (e.g. Waheed_1997), p.His63Asp was shown to induce ER-stress in-vitro and in a transgenic mouse model (e.g. Liu_2011). Transgenic mice expressing the murine equivalent of this variant were also reported to have increased iron storage and decreased levels of iron mobilization at 12 months of age (e.g. Nandar_2013). The variant has also been reported to alter the expression levels of several genes involved in sphingolipid metabolism (e.g. Ali-Rahmani_2011) and to affect cellular glutamate levels (e.g. Mitchell_2011). 
Sixteen clinical diagnostic laboratories have submitted clinical-significance assessments for this variant to ClinVar after 2014 without evidence for independent evaluation. Thirteen of these submitters report the variant as either Pathogenic or a risk factor for disease. Based on the evidence outlined above, the variant was classified as pathogenic with very low penetrance in association with Hemochromatosis. MedGen:C3469186 C3469186:Hemochromatosis type 1 criteria provided, single submitter clinical testing germline:na Women's Health and Genetics/Laboratory Corporation of America, LabCorp SCV001519563.1 HFE - - - +10 Pathogenic Dec 27, 2022 - OMIM:235200 C3469186:Hemochromatosis type 1 criteria provided, single submitter clinical testing germline:1 Institute of Human Genetics Munich, Klinikum Rechts Der Isar, TU München SCV004045959.1 HFE - - - +10 Pathogenic Jan 13, 2020 Contributing pathogenic variant when co-inherited with other pathogenic variants in HFE or PPOX genes, but not pathogenic alone, even in the homozygous state. OMIM:176200 C0162532:Variegate porphyria criteria provided, single submitter clinical testing germline:na Genetics and Molecular Pathology, SA Pathology SCV002556586.2 HFE - - - +10 Pathogenic Oct 09, 2023 - OMIM:235200 C3469186:Hemochromatosis type 1 no assertion criteria provided clinical testing germline:na Zotz-Klimas Genetics Lab, MVZ Zotz Klimas SCV004041642.1 - - - - +10 other Jun 26, 2018 - not provided C3661900:not provided criteria provided, single submitter clinical testing germline:348 Eurofins Ntd Llc (ga) SCV000227124.5 HFE Variant classified as "other reportable" ??? variant is clinically benign (not associated with disease) but is reported when observed (e.g. pseudodeficiency alleles). 
- - +10 Pathogenic May 15, 2023 - OMIM:235200 C3469186:Hemochromatosis type 1 criteria provided, single submitter clinical testing germline:2 New York Genome Center SCV004046529.2 - - - - +10 Pathogenic Jul 31, 2024 - not provided C3661900:not provided criteria provided, single submitter clinical testing germline:na Center for Genomic Medicine, Rigshospitalet, Copenhagen University Hospital SCV002568070.6 HFE - - - +10 Pathogenic Jan 24, 2024 - not provided C3661900:not provided criteria provided, single submitter clinical testing germline:na Clinical Genetics Laboratory, Skane University Hospital Lund SCV005198437.1 HFE - - - diff --git a/v03_pipeline/var/test/reference_data/test_hgmd.vcf b/v03_pipeline/var/test/reference_datasets/raw/test_hgmd.vcf similarity index 100% rename from v03_pipeline/var/test/reference_data/test_hgmd.vcf rename to v03_pipeline/var/test/reference_datasets/raw/test_hgmd.vcf diff --git a/v03_pipeline/var/test/reference_datasets/raw/test_mitomap.csv b/v03_pipeline/var/test/reference_datasets/raw/test_mitomap.csv new file mode 100644 index 000000000..24491c2c0 --- /dev/null +++ b/v03_pipeline/var/test/reference_datasets/raw/test_mitomap.csv @@ -0,0 +1,4 @@ +"Index","Locus Type","Locus","Associated Diseases","Allele","Position","aaΔ or RNA","Status ♣(Mitomap [ClinGen])","Last StatusUpdate" +"1","tRNA","MT-TF","MELAS / MM & EXIT","m.583G>A","583","tRNA Phe","Cfrm [VUS*]","2022.10.10" +"2","tRNA","MT-TF","Gitelman-like syndrome","m.591C>T","591","tRNA Phe","Cfrm [LP]","2024.07.22" +"3","tRNA","MT-TF","Maternally inherited epilepsy / mito tubulointerstitial kidney disease (MITKD) / Gitelman-like syndrome","m.616T>C","616","tRNA Phe","Cfrm [LP]","2022.06.13" From 8281b3fc0afecf13c7f4969f5af55b696c77e9c6 Mon Sep 17 00:00:00 2001 From: Benjamin Blankenmeister Date: Mon, 25 Nov 2024 14:53:11 -0500 Subject: [PATCH 2/2] remove compare globals --- .../lib/reference_data/compare_globals.py | 137 ------------------ 1 file changed, 137 deletions(-) delete 
mode 100644 v03_pipeline/lib/reference_data/compare_globals.py diff --git a/v03_pipeline/lib/reference_data/compare_globals.py b/v03_pipeline/lib/reference_data/compare_globals.py deleted file mode 100644 index bdaf367ae..000000000 --- a/v03_pipeline/lib/reference_data/compare_globals.py +++ /dev/null @@ -1,137 +0,0 @@ -import dataclasses - -import hail as hl - -from v03_pipeline.lib.logger import get_logger -from v03_pipeline.lib.model import ( - DatasetType, - ReferenceGenome, -) -from v03_pipeline.lib.reference_data.clinvar import parse_clinvar_release_date -from v03_pipeline.lib.reference_data.config import CONFIG -from v03_pipeline.lib.reference_data.dataset_table_operations import ( - get_all_select_fields, - get_enum_select_fields, - import_ht_from_config_path, -) - -logger = get_logger(__name__) - - -def clinvar_versions_equal( - ht: hl.Table, - reference_genome: ReferenceGenome, - dataset_type: DatasetType, -) -> bool: - dataset = 'clinvar_mito' if dataset_type == DatasetType.MITO else 'clinvar' - return hl.eval(ht.globals.versions[dataset]) == parse_clinvar_release_date( - CONFIG[dataset][reference_genome.v02_value]['source_path'], - ) - - -@dataclasses.dataclass -class Globals: - paths: dict[str, str] - versions: dict[str, str] - enums: dict[str, dict[str, list[str]]] - selects: dict[str, dict[str, hl.dtype]] - - def __getitem__(self, name: str): - return getattr(self, name) - - @classmethod - def from_dataset_configs( - cls, - reference_genome: ReferenceGenome, - datasets: list[str], - ): - paths, versions, enums, selects = {}, {}, {}, {} - for dataset in datasets: - dataset_config = CONFIG[dataset][reference_genome.v02_value] - dataset_ht = import_ht_from_config_path( - dataset_config, - dataset, - reference_genome, - ) - dataset_ht_globals = hl.eval(dataset_ht.globals) - paths[dataset] = dataset_ht_globals.path - versions[dataset] = dataset_ht_globals.version - enums[dataset] = dict(dataset_ht_globals.enums) - dataset_ht = dataset_ht.select( - 
**get_all_select_fields(dataset_ht, dataset_config), - ) - dataset_ht = dataset_ht.transmute( - **get_enum_select_fields(dataset_ht, dataset_config), - ) - selects[dataset] = { - k: v.dtype - for k, v in dict(dataset_ht.row).items() - if k not in set(dataset_ht.key) - } - return cls(paths, versions, enums, selects) - - @classmethod - def from_ht( - cls, - ht: hl.Table, - datasets: list[str], - ): - rdc_globals_struct = hl.eval(ht.globals) - paths = dict(rdc_globals_struct.paths) - versions = dict(rdc_globals_struct.versions) - # enums are nested structs - enums = {k: dict(v) for k, v in rdc_globals_struct.enums.items() if k in paths} - selects = {} - for dataset in datasets: - if dataset in ht.row: - # NB: handle an edge case (mito high constraint) where we annotate a bool from the reference dataset collection - selects[dataset] = ( - {k: v.dtype for k, v in dict(ht[dataset]).items()} - if isinstance(ht[dataset], hl.StructExpression) - else {} - ) - return cls(paths, versions, enums, selects) - - -def validate_selects_types( - ht1_globals: Globals, - ht2_globals: Globals, - dataset: str, -) -> None: - # Assert that all shared annotations have identical types - shared_selects = ( - ht1_globals['selects'][dataset].keys() - & ht2_globals['selects'].get(dataset).keys() - ) - mismatched_select_types = [ - (select, ht2_globals['selects'][dataset][select]) - for select in shared_selects - if ( - ht1_globals['selects'][dataset][select] - != ht2_globals['selects'][dataset][select] - ) - ] - if mismatched_select_types: - msg = f'Unexpected field types detected in {dataset}: {mismatched_select_types}' - raise ValueError(msg) - - -def get_datasets_to_update( - ht1_globals: Globals, - ht2_globals: Globals, - validate_selects: bool = True, -) -> list[str]: - datasets_to_update = set() - for field in dataclasses.fields(Globals): - if field.name == 'selects' and not validate_selects: - continue - datasets_to_update.update( - ht1_globals[field.name].keys() ^ 
ht2_globals[field.name].keys(), - ) - for dataset in ht1_globals[field.name].keys() & ht2_globals[field.name].keys(): - if field.name == 'selects': - validate_selects_types(ht1_globals, ht2_globals, dataset) - if ht1_globals[field.name][dataset] != ht2_globals[field.name][dataset]: - logger.info(f'{field.name} mismatch for {dataset}') - datasets_to_update.add(dataset) - return sorted(datasets_to_update)