-
Notifications
You must be signed in to change notification settings - Fork 21
add sample_type to family and project table paths #842
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
) | ||
if hfs.exists(project_ht_path): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bpblanken I think what was happening before was that HailTableTask
was actually never complete()
if the project table didn't exist and that's what was causing the infinite loop?
yielding inside the sample_type loop works now if the project table exists.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔 definitely strange behavior! I think we do actually want this task to fail though if the project able isn't there right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so, all of the existing seqr projects will have project tables in only one sample_type directory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: raise error if there is not at least 1 project table! else the task will never complete!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
^ done
family_guid=family_guid, | ||
), | ||
) | ||
for sample_type in SampleType: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
potentially out of scope here, but I think we would actually want to be explicit about the sample type (or multiple sample types) we are deleting, so potentially we could delete one but not the other
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
talked yesterday, this was intentional so that we won't have to remember to run the dag twice for both sample types
@@ -21,21 +23,24 @@ def complete(self) -> bool: | |||
) | |||
|
|||
def run(self): | |||
project_table_task: luigi.Target = yield HailTableTask( | |||
project_table_path( | |||
for sample_type in SampleType: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above, I think ideally we would want to explicitly set which sample type(s) we are deleting
TODO: change path from v03 to 3.1, but keep reference data the same |
@@ -17,6 +17,7 @@ class Sex(Enum): | |||
class PipelineVersion(Enum): | |||
V02 = 'v02' | |||
V03 = 'v03' | |||
V3_1 = 'v3.1' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😆
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please lmk if you have a better idea
the naming curse
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚀
This will require a one-time migration to move all existing family tables into their sampletype subfolder. This won't be too bad as sample_type is already in the table globals.