Draft
29 commits
93d5186
Add (hilariously) basic test coverage for Synthea.
karlmdavis Sep 8, 2020
5d243d3
Add initial documentation for Synthea sample data.
karlmdavis Oct 1, 2020
d42b0f2
Add test that renders Synthea data to FHIR.
karlmdavis Oct 1, 2020
4b5ad62
Updated the option and output directory names.
karlmdavis Nov 13, 2020
b2e3a98
Fix unit tests on synthetic data.
jawalonoski Mar 26, 2021
228b2d6
Fix assert message.
jawalonoski Mar 30, 2021
27b12f1
Merge Synthea NPI file (if present) to bfd server NPI config
hadleynet Mar 30, 2021
a5b2317
Remove hardcoded path to Synthea output dir, use environment var instead
hadleynet Mar 31, 2021
9b62411
Add Synthea outpatient file
hadleynet Apr 5, 2021
d556f72
Add support for Synthea PDE file
hadleynet Apr 15, 2021
62bb928
Add Synthea carrier file
hadleynet Apr 20, 2021
ad04f68
Add Synthea DME file import
hadleynet Apr 21, 2021
c9d0b1f
Load Synthea bene history file
hadleynet Apr 21, 2021
08632cc
Import Synthea HHA, Hospice, SNF.
jawalonoski Apr 29, 2021
828e870
Fix HHA field optional status.
jawalonoski May 4, 2021
3443136
Update Synthea test fixtures and generation to support export changes…
hadleynet Jun 10, 2021
76c7aa4
Comment out Synthea NPI file loading and the test asertions that requ…
hadleynet Jun 22, 2021
69e482a
Revert unit test.
jawalonoski Aug 9, 2021
9b9b51d
Address PR comments.
jawalonoski Aug 11, 2021
f85399e
Fix assertion to account for line items.
jawalonoski Aug 27, 2021
3b4b35d
Remove code related to loading of Synthea NPI file.
hadleynet Aug 31, 2021
3b7eb13
Make RifFileType.idColumn an Optional
hadleynet Sep 2, 2021
a70e84a
Fix unit test broken by addition of claim line items
hadleynet Sep 2, 2021
3fd281f
Run Synthea without mapping files in CI tests and reduce size of test…
hadleynet Sep 7, 2021
c45f73a
Revert some changes that are no longer required
hadleynet Sep 28, 2021
c6ec693
Synthea now only outputs a single beneficiary file but groups rows fo…
hadleynet Oct 7, 2021
6c597ee
Add a beneficiary_updates file for Synthea out which now separates IN…
hadleynet Nov 1, 2021
308027b
Add third Synthea bene test file for final updates
hadleynet Nov 3, 2021
de7a738
Switch back to one bene file per year of data
hadleynet May 31, 2022
@@ -449,3 +449,39 @@ It was created via the following process:
2. Randomly select 11 beneficiaries from the synthetic data set. Modify their field values (only) as needed to correspond to the values required by MCT, as detailed on [NGD/MBP/BB/MCT Test Case Data](https://jira.cms.gov/browse/BFD-326). Adjust identifiers `BENE_ID` so that they don't conflict/collide with any other data sets.
3. For each of those beneficiaries, randomly select Part D events from the synthetic data sample-mct-update-5-pde.txt set to associate with them. Modify their field values (only) as needed to correspond to the values required by MCT, as detailed on [NGD/MBP/BB/MCT Test Case Data](https://jira.cms.gov/browse/BFD-326).
4. Adjust all other identifiers (`BENE_ID`, `PDE_ID`, and `CLM_GRP_ID`) so that they don't conflict/collide with any other data sets.

### `SYNTHEA`: Synthetic Data Generated by Synthea

[Synthea](https://synthetichealth.github.io/synthea/) is an open source tool
for generating large volumes of realistic, but synthetic, health data.
It was created and is maintained by [MITRE](https://www.mitre.org/),
a not-for-profit organization that operates
federally funded research and development centers (FFRDCs).
The best way to think of Synthea is like this:
"what if we paid for a metric ton of academic papers on disease prevalence, progression, etc.
and then paid a bunch of scientist-engineers to turn those papers into code
to generate realistic health data based on models derived from the papers?"
That's Synthea: sure, it generates data,
but it's _how_ it generates the data that's the interesting, and tricky, part.

Synthea models a wide range of populations, health conditions, etc.
and generates a similarly wide range of FHIR data,
including `Patient` and `ExplanationOfBenefit` resources.
That's great, but not super useful for BFD: BFD _produces_ FHIR;
it doesn't need or _consume_ FHIR data as input.

In 2020, CMS engaged the Synthea team to add an output mode
that produced data in BFD's RIF input file formats.
The initial engagement ended with these accomplishments:

* Synthea added a `--exporter.bfd.export=true` option to produce RIF.
* It produces beneficiary RIF records:
    * These are valid records with all 192 columns and convert to FHIR without errors.
    * Most of those columns are optional. Synthea currently populates 21 of them.
    * Of those populated columns, 16 appear to have useful, realistic data.
* It produces inpatient claim RIF records:
    * These are valid records with all 272 columns and convert to FHIR without errors.
    * Most of those columns are optional. Synthea currently populates 52 of them.
    * Of those populated columns, 25 appear to have useful, realistic data.
* _Note: All column counts above are estimates produced by eyeballing the data;
  we need to get more accurate counts from the Synthea folks._
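
One way to replace the eyeballed counts above with measured ones
is to scan a generated RIF file and count the columns that contain at least one non-blank value.
The following is a minimal Python sketch, assuming that RIF files are pipe-delimited text with a header row.
The function name `count_populated_columns` and the sample rows are made up for illustration;
they are not part of BFD or Synthea.

```python
import csv
import io


def count_populated_columns(rif_text, delimiter="|"):
    """Return (total_columns, populated_columns) for RIF-style text.

    Assumption: the first row is a header and fields are pipe-delimited.
    A column counts as populated if any data row has a non-blank value in it.
    """
    reader = csv.reader(io.StringIO(rif_text), delimiter=delimiter)
    header = next(reader)
    populated = set()
    for row in reader:
        for name, value in zip(header, row):
            if value.strip():
                populated.add(name)
    return len(header), len(populated)


# Illustrative sample only: these rows are invented, not real Synthea output.
sample = (
    "BENE_ID|BENE_BIRTH_DT|BENE_RACE_CD|BENE_PTA_TRMNTN_CD\n"
    "-1|01-JAN-1950||\n"
    "-2|15-MAR-1948|1|\n"
)

total, populated = count_populated_columns(sample)
print(f"{populated} of {total} columns populated")
```

This only measures whether a column is populated.
Judging whether the values are useful and realistic would still need a manual review.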
8 changes: 8 additions & 0 deletions apps/bfd-model/bfd-model-rif-samples/pom.xml
@@ -41,6 +41,14 @@
<artifactId>metrics-core</artifactId>
</dependency>

<dependency>
<!-- Used in builds/tests to run and manage external processes. (Can't
be marked as test-scoped, as the build/test code needs to live in src/main/java.) -->
<groupId>org.zeroturnaround</groupId>
<artifactId>zt-exec</artifactId>
<version>1.12</version>
</dependency>

<dependency>
<!-- Used to run our unit and integration tests. -->
<groupId>junit</groupId>