feat(nlp): add a variety of --cohort-* args to filter notes
Three main new arguments:
* --cohort-csv: a CSV with a column like patient_id or note_ref
* --cohort-anon-csv: same but with anonymized IDs
* --cohort-athena-table: same but points at an Athena table
To support the Athena option, we have some other new args:
* --athena-workgroup: specify the workgroup to use
* --athena-database: specify the database to use
* --allow-large-cohort: if the table is gigantic, use it anyway
(this is here because we have a typo-guard in there - if you
accidentally point at the base observation table, we're gonna
stop you from downloading a terabyte of data)
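A rough sketch of the typo-guard idea, for illustration only. The function
name, the row-count check, and the threshold below are all assumptions, not
the actual implementation (the real guard may measure size differently):

```python
# Hypothetical threshold; the real limit is not stated in this commit message.
MAX_COHORT_ROWS = 1_000_000


def check_cohort_size(row_count: int, allow_large_cohort: bool = False) -> int:
    """Refuse suspiciously large cohort tables unless explicitly allowed."""
    if row_count > MAX_COHORT_ROWS and not allow_large_cohort:
        raise ValueError(
            f"Cohort table has {row_count} rows, which looks like a typo. "
            "Pass --allow-large-cohort to use it anyway."
        )
    return row_count
```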
You can also specify the database inside the table arg, separated by a
period (database.table). The workgroup can be specified via env var or CLI.
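To illustrate how those two lookups might fit together, here is a minimal
sketch. The helper names, the env var name EXAMPLE_ATHENA_WORKGROUP, and the
precedence rules (table arg beats --athena-database, CLI beats env var) are
assumptions made for the example, not confirmed behavior:

```python
import os


def split_table_arg(table_arg, default_database=None):
    """Split "database.table" into its parts, falling back to a default database."""
    database, sep, table = table_arg.rpartition(".")
    if not sep:
        # No period in the arg, so rely on the value from --athena-database.
        return default_database, table_arg
    return database, table


def resolve_workgroup(cli_value, env=None, env_var="EXAMPLE_ATHENA_WORKGROUP"):
    """Prefer the CLI value; otherwise look in the (hypothetical) env var."""
    env = os.environ if env is None else env
    return cli_value or env.get(env_var)
```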
If we find a docref/dxreport ID/ref column, we'll use that. Otherwise,
we'll use a patient ID column and grab all notes for those patients.
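The column preference above could be sketched like this. The candidate column
names (other than patient_id and note_ref, which are mentioned above) and the
function names are hypothetical, chosen only for the example:

```python
import csv
import io

# Hypothetical candidate names; note-level columns are checked first.
NOTE_COLUMNS = ("note_ref", "docref_id", "dxreport_id")
PATIENT_COLUMNS = ("patient_id",)


def pick_cohort_column(fieldnames):
    """Return ("note", column) if a note column exists, else ("patient", column)."""
    for name in NOTE_COLUMNS:
        if name in fieldnames:
            return "note", name
    for name in PATIENT_COLUMNS:
        if name in fieldnames:
            return "patient", name
    raise ValueError("No note or patient ID column found in cohort")


def read_cohort_ids(csv_text):
    """Read the chosen column from CSV text and return (kind, set of IDs)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    kind, column = pick_cohort_column(reader.fieldnames or [])
    return kind, {row[column] for row in reader if row[column]}
```

With both kinds of column present, the note column wins; with only
patient_id, every note for those patients would be selected.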
This cohort filtering replaces, rather than augments, the previous default
filtering of "final" status notes (i.e. skipping draft or superseded
notes). If the user is specifying the IDs manually, they must know what
they want, so we don't need to do the status check for them.
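As a sketch of that replace-not-augment behavior, assuming notes are plain
dicts with hypothetical "id" and "status" keys (the real code works on FHIR
resources and will differ):

```python
def select_notes(notes, cohort_note_ids=None):
    """Filter by cohort if one is given; otherwise keep only "final" notes."""
    if cohort_note_ids is not None:
        # An explicit cohort replaces the status check entirely.
        return [note for note in notes if note["id"] in cohort_note_ids]
    return [note for note in notes if note.get("status") == "final"]
```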