variant lookup table has sample_type #4289
Merged
Binary file modified
BIN
+0 Bytes
(100%)
hail_search/fixtures/GRCh37/SNV_INDEL/lookup.ht/.README.txt.crc
Binary file not shown.
Binary file modified
BIN
+0 Bytes
(100%)
hail_search/fixtures/GRCh37/SNV_INDEL/lookup.ht/.metadata.json.gz.crc
Binary file not shown.
@@ -1,3 +1,3 @@
 This folder comprises a Hail (www.hail.is) native Table or MatrixTable.
 Written with version 0.2.128-eead8100a1c1
-Created at 2024/04/03 17:08:32
+Created at 2024/08/12 16:34:26
Binary file modified
BIN
+0 Bytes
(100%)
hail_search/fixtures/GRCh37/SNV_INDEL/lookup.ht/globals/.metadata.json.gz.crc
Binary file not shown.
Binary file modified
BIN
+24 Bytes
(110%)
hail_search/fixtures/GRCh37/SNV_INDEL/lookup.ht/globals/metadata.json.gz
Binary file not shown.
Binary file modified
BIN
+0 Bytes
(100%)
hail_search/fixtures/GRCh37/SNV_INDEL/lookup.ht/globals/parts/.part-0.crc
Binary file not shown.
Binary file modified
BIN
+6 Bytes
(110%)
hail_search/fixtures/GRCh37/SNV_INDEL/lookup.ht/globals/parts/part-0
Binary file not shown.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
Binary file modified
BIN
+12 Bytes
(100%)
hail_search/fixtures/GRCh37/SNV_INDEL/lookup.ht/metadata.json.gz
Binary file not shown.
Binary file modified
BIN
+0 Bytes
(100%)
hail_search/fixtures/GRCh37/SNV_INDEL/lookup.ht/rows/.metadata.json.gz.crc
Binary file not shown.
Binary file modified
BIN
+2 Bytes
(100%)
hail_search/fixtures/GRCh37/SNV_INDEL/lookup.ht/rows/metadata.json.gz
Binary file not shown.
File renamed without changes.
File renamed without changes.
Binary file not shown.
Binary file modified
BIN
+0 Bytes
(100%)
hail_search/fixtures/GRCh38/MITO/lookup.ht/.metadata.json.gz.crc
Binary file not shown.
@@ -1,3 +1,3 @@
 This folder comprises a Hail (www.hail.is) native Table or MatrixTable.
 Written with version 0.2.128-eead8100a1c1
-Created at 2024/04/03 15:52:09
+Created at 2024/08/12 16:46:33
Binary file modified
BIN
+0 Bytes
(100%)
hail_search/fixtures/GRCh38/MITO/lookup.ht/globals/.metadata.json.gz.crc
Binary file not shown.
Binary file modified
BIN
+24 Bytes
(110%)
hail_search/fixtures/GRCh38/MITO/lookup.ht/globals/metadata.json.gz
Binary file not shown.
Binary file modified
BIN
+0 Bytes
(100%)
hail_search/fixtures/GRCh38/MITO/lookup.ht/globals/parts/.part-0.crc
Binary file not shown.
Binary file modified
BIN
+6 Bytes
(100%)
hail_search/fixtures/GRCh38/MITO/lookup.ht/globals/parts/part-0
Binary file not shown.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
Binary file modified
BIN
+12 Bytes
(100%)
hail_search/fixtures/GRCh38/MITO/lookup.ht/metadata.json.gz
Binary file not shown.
Binary file modified
BIN
+0 Bytes
(100%)
hail_search/fixtures/GRCh38/MITO/lookup.ht/rows/.metadata.json.gz.crc
Binary file not shown.
Binary file modified
BIN
-1 Byte
(100%)
hail_search/fixtures/GRCh38/MITO/lookup.ht/rows/metadata.json.gz
Binary file not shown.
File renamed without changes.
File renamed without changes.
Binary file modified
BIN
+0 Bytes
(100%)
hail_search/fixtures/GRCh38/SNV_INDEL/lookup.ht/.README.txt.crc
Binary file not shown.
Binary file modified
BIN
+0 Bytes
(100%)
hail_search/fixtures/GRCh38/SNV_INDEL/lookup.ht/.metadata.json.gz.crc
Binary file not shown.
@@ -1,3 +1,3 @@
 This folder comprises a Hail (www.hail.is) native Table or MatrixTable.
 Written with version 0.2.128-eead8100a1c1
-Created at 2024/04/03 17:00:55
+Created at 2024/08/12 17:01:09
Binary file modified
BIN
+0 Bytes
(100%)
hail_search/fixtures/GRCh38/SNV_INDEL/lookup.ht/globals/.metadata.json.gz.crc
Binary file not shown.
Binary file modified
BIN
+24 Bytes
(110%)
hail_search/fixtures/GRCh38/SNV_INDEL/lookup.ht/globals/metadata.json.gz
Binary file not shown.
Binary file modified
BIN
+0 Bytes
(100%)
hail_search/fixtures/GRCh38/SNV_INDEL/lookup.ht/globals/parts/.part-0.crc
Binary file not shown.
Binary file modified
BIN
+16 Bytes
(100%)
hail_search/fixtures/GRCh38/SNV_INDEL/lookup.ht/globals/parts/part-0
Binary file not shown.
Binary file removed
BIN
-12 Bytes
...Ch38/SNV_INDEL/lookup.ht/index/part-0-38581d1a-27f8-452f-9678-75225dfc64ab.idx/.index.crc
Binary file not shown.
Binary file removed
BIN
-12 Bytes
...DEL/lookup.ht/index/part-0-38581d1a-27f8-452f-9678-75225dfc64ab.idx/.metadata.json.gz.crc
Binary file not shown.
Binary file removed
BIN
-110 Bytes
...es/GRCh38/SNV_INDEL/lookup.ht/index/part-0-38581d1a-27f8-452f-9678-75225dfc64ab.idx/index
Binary file not shown.
Binary file removed
BIN
-184 Bytes
...NV_INDEL/lookup.ht/index/part-0-38581d1a-27f8-452f-9678-75225dfc64ab.idx/metadata.json.gz
Binary file not shown.
Binary file added
BIN
+12 Bytes
...Ch38/SNV_INDEL/lookup.ht/index/part-0-7710d56c-23d5-4437-8754-d3412cdf53bf.idx/.index.crc
Binary file not shown.
Binary file added
BIN
+12 Bytes
...DEL/lookup.ht/index/part-0-7710d56c-23d5-4437-8754-d3412cdf53bf.idx/.metadata.json.gz.crc
Binary file not shown.
Binary file added
BIN
+111 Bytes
...es/GRCh38/SNV_INDEL/lookup.ht/index/part-0-7710d56c-23d5-4437-8754-d3412cdf53bf.idx/index
Binary file not shown.
Binary file added
BIN
+184 Bytes
...NV_INDEL/lookup.ht/index/part-0-7710d56c-23d5-4437-8754-d3412cdf53bf.idx/metadata.json.gz
Binary file not shown.
Binary file modified
BIN
+12 Bytes
(100%)
hail_search/fixtures/GRCh38/SNV_INDEL/lookup.ht/metadata.json.gz
Binary file not shown.
Binary file modified
BIN
+0 Bytes
(100%)
hail_search/fixtures/GRCh38/SNV_INDEL/lookup.ht/rows/.metadata.json.gz.crc
Binary file not shown.
Binary file modified
BIN
+1 Byte
(100%)
hail_search/fixtures/GRCh38/SNV_INDEL/lookup.ht/rows/metadata.json.gz
Binary file not shown.
Binary file removed
BIN
-12 Bytes
...es/GRCh38/SNV_INDEL/lookup.ht/rows/parts/.part-0-38581d1a-27f8-452f-9678-75225dfc64ab.crc
Binary file not shown.
Binary file added
BIN
+12 Bytes
...es/GRCh38/SNV_INDEL/lookup.ht/rows/parts/.part-0-7710d56c-23d5-4437-8754-d3412cdf53bf.crc
Binary file not shown.
Binary file removed
BIN
-118 Bytes
...ixtures/GRCh38/SNV_INDEL/lookup.ht/rows/parts/part-0-38581d1a-27f8-452f-9678-75225dfc64ab
Binary file not shown.
Binary file added
BIN
+123 Bytes
...ixtures/GRCh38/SNV_INDEL/lookup.ht/rows/parts/part-0-7710d56c-23d5-4437-8754-d3412cdf53bf
Binary file not shown.
@@ -1,4 +1,4 @@
-import os
+from collections import defaultdict

 from aiohttp.web import HTTPNotFound
 import hail as hl
@@ -322,33 +322,30 @@ def _add_project_lookup_data(self, ht, annotation_fields, *args, **kwargs):
                 ).filter(hl.is_defined),
             )).filter(
                 lambda x: x[1].any(hl.is_defined)
-            ).starmap(lambda project_guid, family_indices: (
-                project_guid,
-                hl.dict(family_indices.map(lambda j: (lookup_ht.project_families[project_guid][j], True))),
+            ).starmap(lambda project_key, family_indices: (
+                project_key,
+                hl.dict(family_indices.map(lambda j: (lookup_ht.project_families[project_key][j], True))),
             ))), 1),
         )[0]

-        for project_guid, families in variant_projects.items():
-            if os.path.exists(self._get_table_path(f'projects/WES/{project_guid}.ht')):
-                sample_type = 'WES'
-            else:
-                sample_type = 'WGS'
-            for family_guid in families:
-                families[family_guid] = {sample_type: True}
-
         # Variant can be present in the lookup table with only ref calls, so is still not present in any projects
         if not variant_projects:
             raise HTTPNotFound()

+        new_variant_projects = defaultdict(lambda: defaultdict(dict))
+        for (project_guid, sample_type), families in variant_projects.items():
[Review comment] assuming you need to keep this. post processing,
+            for family_guid, value in families.items():
+                new_variant_projects[project_guid][family_guid][sample_type] = value
+
         annotation_fields.update({
             'familyGenotypes': lambda r: hl.dict(r.family_entries.map(
                 lambda entries: (entries.first().familyGuid, entries.filter(hl.is_defined).map(self._get_sample_genotype))
             )),
         })

-        logger.info(f'Looking up {self.DATA_TYPE} variant in {len(variant_projects)} projects')
+        logger.info(f'Looking up {self.DATA_TYPE} variant in {len(new_variant_projects)} projects')

-        return super()._add_project_lookup_data(ht, annotation_fields, project_samples=variant_projects, **kwargs)
+        return super()._add_project_lookup_data(ht, annotation_fields, project_samples=new_variant_projects, **kwargs)

     @staticmethod
     def _stat_has_non_ref(s):
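The post-processing step in this diff can be tried in isolation with plain Python. The sketch below assumes that the aggregated lookup result is keyed by (project_guid, sample_type) tuples, as the new loop implies. The function name regroup_by_project and the sample values are hypothetical, with the GUIDs borrowed from the review thread purely for illustration.

```python
from collections import defaultdict


def regroup_by_project(variant_projects):
    """Regroup {(project_guid, sample_type): {family_guid: value}}
    into {project_guid: {family_guid: {sample_type: value}}}."""
    regrouped = defaultdict(lambda: defaultdict(dict))
    for (project_guid, sample_type), families in variant_projects.items():
        for family_guid, value in families.items():
            regrouped[project_guid][family_guid][sample_type] = value
    return regrouped


# Hypothetical input: one project with WES only, one with both WES and WGS.
variant_projects = {
    ('R0001_1kg', 'WES'): {'F000002_2': True},
    ('R0003_test', 'WES'): {'F000011_11': True},
    ('R0003_test', 'WGS'): {'F000011_11': True},
}
new_variant_projects = regroup_by_project(variant_projects)

# The tuple-keyed dict has one entry per (project, sample type) pair,
# while the regrouped dict has one entry per project.
print(len(variant_projects), len(new_variant_projects))
```

This would also explain why the log line switches to len(new_variant_projects): the tuple-keyed dict counts a project once per sample type, while the regrouped dict counts each project once.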
I'm not 100% sure on the syntax here but I think you can update this to return the structure you want by changing this to the following:
This returns something like
{'R0001_1kg': {'WES': {'F000002_2': True}}, 'R0003_test': {'WES': {'F000011_11': True}, 'WGS': {'F000011_11': True}}}
but the expected structure is {project_guid: {family_guid: {sample_type: bool}}}. I've been trying to get that result from this block of Hail code but haven't figured it out yet.
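To make the mismatch concrete, here is a minimal plain-Python sketch that swaps the two inner levels of the returned dict to reach the expected shape. It is not the Hail expression discussed in this thread, and the helper name swap_inner_levels is hypothetical. The input is the example value quoted above.

```python
from collections import defaultdict


def swap_inner_levels(by_sample_type):
    """Turn {project_guid: {sample_type: {family_guid: value}}}
    into {project_guid: {family_guid: {sample_type: value}}}."""
    by_family = defaultdict(lambda: defaultdict(dict))
    for project_guid, sample_types in by_sample_type.items():
        for sample_type, families in sample_types.items():
            for family_guid, value in families.items():
                by_family[project_guid][family_guid][sample_type] = value
    # Convert to plain dicts so missing keys raise KeyError instead of
    # silently creating empty entries.
    return {project: dict(families) for project, families in by_family.items()}


returned = {
    'R0001_1kg': {'WES': {'F000002_2': True}},
    'R0003_test': {'WES': {'F000011_11': True}, 'WGS': {'F000011_11': True}},
}
expected = swap_inner_levels(returned)
print(expected)
```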
Okay I think the change would be
I switched the order of the project_samples dict and used your original suggestion here!