Add CMS130 Synthea Module #43

sarahmcdougall · 2023-09-05T15:35:03Z

Summary

Adds a Synthea module that simulates the CMS130 (v10) measure on our Synthea fork.
NOTE: For the Connectathon, we want to generate groups of synthetic patients of varying sizes using this Synthea module. This PR covers only the module itself, not the patient generation. The patients will be generated once this PR is reviewed and merged.

New behavior

We now have a Synthea module that we can use to generate synthetic patients whose data aligns with the data required for the CMS130 Colorectal Cancer Screening measure.

Purpose: The September Connectathon will be supporting a bulk data track, focused on bulk data export. We are specifically interested in:
(1) measuring the scalability of the bulk export operation across different groups of patients, and
(2) determining whether we can configure _typeFilter queries that will result in patients falling into the appropriate measure populations (i.e. can we construct queries such that we are not exporting all the data on the server but that we are exporting just enough for successful measure calculation).
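
As a rough illustration of goal (2), the sketch below assembles a group-level `$export` URL that restricts the export with `_type` and `_typeFilter`. The base URL, group id, and ValueSet URL are hypothetical placeholders, and whether a given server supports `_typeFilter` (or the `:in` modifier) is something to confirm at the Connectathon rather than something this sketch assumes to be true.

```python
from urllib.parse import urlencode


def build_export_url(base_url, group_id, type_filters):
    """Build a Group-level bulk export URL.

    type_filters maps a resource type to a FHIR search query string,
    or to None when the type should be exported without a filter.
    """
    params = [("_type", ",".join(type_filters))]
    for resource_type, query in type_filters.items():
        if query:
            # One _typeFilter parameter per filtered type; urlencode
            # percent-encodes the embedded "?" and "=" characters.
            params.append(("_typeFilter", f"{resource_type}?{query}"))
    return f"{base_url}/Group/{group_id}/$export?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical server, group, and ValueSet URL, for illustration only.
    print(
        build_export_url(
            "https://example.org/fhir",
            "cms130-group",
            {
                "Patient": None,
                "Procedure": "code:in=http://example.org/fhir/ValueSet/colonoscopy",
            },
        )
    )
```

Repeating `_typeFilter` once per resource type avoids ambiguity with commas that may appear inside a search query.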

The colorectal cancer screening measure (CMS130) will be used as a specific use case at the Connectathon since it is both a CMS and HRSA measure.

Since the colorectal cancer screening measure is our use case, we want to create synthetic patient data that we can load onto servers (e.g. the Abacus bulk export server) and then export, so that we can achieve the purposes above.

Code changes

New JSON Synthea module. This module reuses some of the logic from the EXM130-8.0.000-r4.json Synthea module, but adds some handling for additional encounters, procedures, and measure population conditions.

At a high level, the logic contained in this module includes:

- Guards on age and measurement period: these ensure that the patient is in the correct age range and that we are in the correct measurement period before generating the necessary encounters/procedures. See the Synthea module builder wiki for more info on the different states included in this module.
- Encounter state for qualifying encounters: I included the ValueSet associated with each type of encounter, and one code system/value for each type of encounter. The distribution (70% qualifying encounter/20% Telehealth visit/10% no encounter) is arbitrary, but it should produce a wide variety of data while still ensuring that a sizable number of patients end up in the numerator and denominator.
- Procedure state for colonoscopy and other qualifying procedures: I included the ValueSet associated with each type of procedure, and one code system/value for each type of procedure. The distribution (70% procedure/30% no procedure) is arbitrary (for the same reasons as above).
- Attribute assignment for the relevant measure populations that the patient belongs to (e.g. the patient is a numerator patient if both the encounter and the procedure occurred).
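
The branch weights above imply rough population shares, which can be useful as a sanity check when reviewing generated cohorts. The sketch below is only back-of-the-envelope arithmetic, not the module itself. It assumes that the patient has already passed the age and measurement period guards, that a Telehealth visit counts as a qualifying encounter, and that the procedure branch is independent of the encounter branch. None of these assumptions is stated in the module, so adjust them if the JSON says otherwise.

```python
# Branch weights copied from the description of the module.
ENCOUNTER_WEIGHTS = {
    "qualifying_encounter": 0.70,
    "telehealth_visit": 0.20,
    "no_encounter": 0.10,
}
PROCEDURE_WEIGHTS = {"procedure": 0.70, "no_procedure": 0.30}


def expected_population_shares(telehealth_qualifies=True):
    """Expected share of guard-passing patients in each population.

    Assumption: the encounter and procedure branches are independent.
    """
    encounter_share = ENCOUNTER_WEIGHTS["qualifying_encounter"]
    if telehealth_qualifies:
        # Assumption: a Telehealth visit also satisfies the encounter check.
        encounter_share += ENCOUNTER_WEIGHTS["telehealth_visit"]
    # Numerator requires both the encounter and the procedure.
    numerator_share = encounter_share * PROCEDURE_WEIGHTS["procedure"]
    return {"denominator": encounter_share, "numerator": numerator_share}


if __name__ == "__main__":
    for name, share in expected_population_shares().items():
        print(f"{name}: {share:.2f}")
```

Under these assumptions, roughly 90% of guard-passing patients would reach the denominator and roughly 63% the numerator. Observed counts that differ widely from this would be worth a second look at the module transitions.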

Testing Guidance

Navigate to the Synthea Module Builder and upload the JSON included in this PR to see a visual of the module logic. Check that the module logic accurately captures the logic from the CMS130 measure.
Go into synthea.properties within synthea/src/main/resources and edit generate.log_patients.detail to be detailed instead of simple. This will allow you to see the states reached by the Synthea patient from the terminal. This is optional but gives a lot of good info for debugging.

From here, we will want to (1) try generating patients to make sure the module logic does not break at any point, and (2) check that we can actually generate patients for each measure population that have the appropriate resources. Here are some steps for doing so:

From the terminal, run ./run_synthea with -m "CMS130*" and however many patients you want to generate (I recommend keeping this number low when logging detailed results). Use straight quotes around the module pattern; curly quotes pasted from a document will not be interpreted as quotes by the shell. The results will be stored in output/fhir. (If you run into SSL errors, check out the internal docs for how to make the certs cooperate with Java.) You can add some additional flags to filter the results, which is helpful for testing.
- Example: ./run_synthea -m "CMS130*" -p 1 -k "src/main/resources/keep_modules/keep_ipp.json" -a 18-75 (the age flag is included to avoid timeout errors when Synthea cannot generate patients that meet the IPP conditions, and the keep module flag is used to return only patients in the IPP)
- If you have the detailed results enabled, you can check the terminal to see what states the patient reached.
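
Once bundles have been written to output/fhir, a small script can help with spot-checking them. The sketch below counts the Encounter and Procedure resources in each bundle whose start date falls inside the measurement period. It assumes the 2022 period used in the fqm-execution command in this description, and it does not check codes against the measure's ValueSets, so a nonzero count only means that some encounter or procedure occurred in the period, not that it qualifies for the measure.

```python
import json
from datetime import date
from pathlib import Path

# Assumption: calendar-year 2022 measurement period.
PERIOD_START = date(2022, 1, 1)
PERIOD_END = date(2022, 12, 31)


def resource_date(resource):
    """Return the start date of an Encounter or Procedure, or None."""
    if resource.get("resourceType") == "Encounter":
        raw = resource.get("period", {}).get("start")
    else:
        raw = resource.get("performedPeriod", {}).get("start") or resource.get(
            "performedDateTime"
        )
    # Keep only the YYYY-MM-DD part of the FHIR dateTime.
    return date.fromisoformat(raw[:10]) if raw else None


def in_period_counts(bundle):
    """Count Encounters and Procedures that start within the period."""
    counts = {"Encounter": 0, "Procedure": 0}
    for entry in bundle.get("entry", []):
        resource = entry.get("resource", {})
        resource_type = resource.get("resourceType")
        if resource_type in counts:
            when = resource_date(resource)
            if when and PERIOD_START <= when <= PERIOD_END:
                counts[resource_type] += 1
    return counts


if __name__ == "__main__":
    for path in sorted(Path("output/fhir").glob("*.json")):
        bundle = json.loads(path.read_text(encoding="utf-8"))
        if bundle.get("resourceType") == "Bundle":
            print(path.name, in_period_counts(bundle))
```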

* A Denom/IPP patient should have an encounter within the measurement period. A Numer patient should have a qualifying encounter and a colonoscopy (or equivalent) procedure within the measurement period. Open the patient bundles in output/fhir and check for these resources if the patient hit these states. (e.g. if a patient hit the "set_numerator" state, check that the patient's bundle contains the relevant Encounter and Procedure resources)
* Additionally, to test the accuracy of the module logic, run the patients through fqm-execution and check that the measure reports are accurate. Example terminal command:

npm run cli -- reports -m <CMS130 measure bundle> --patients-directory <patients directory from synthea> -o -s 2022-01-01 -e 2022-12-31 --report-type summary --debug

For example, if you generated 3 patients, where 2 reached only the denominator state and 1 reached the numerator state, the resulting measure report from fqm-execution should have a numerator count of 1 and a denominator count of 3, since a numerator patient is also counted in the denominator.
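
To compare counts like these without reading the report by hand, the population counts can be pulled out of the MeasureReport JSON. The sketch below assumes a standard FHIR R4 summary MeasureReport in which each group population carries a code and a count. The inline report is a made-up sample that mirrors the 3-patient example, not real fqm-execution output.

```python
def population_counts(measure_report):
    """Map each population code in a MeasureReport to its count.

    Assumption: the report has a single group; with several groups,
    later groups overwrite earlier ones in the returned dict.
    """
    counts = {}
    for group in measure_report.get("group", []):
        for population in group.get("population", []):
            codings = population.get("code", {}).get("coding", [])
            if codings:
                counts[codings[0]["code"]] = population.get("count", 0)
    return counts


def make_population(code, count):
    """Build one MeasureReport.group.population entry (sample data helper)."""
    return {"code": {"coding": [{"code": code}]}, "count": count}


# Hypothetical sample report shaped like the 3-patient example.
SAMPLE_REPORT = {
    "resourceType": "MeasureReport",
    "type": "summary",
    "group": [
        {
            "population": [
                make_population("initial-population", 3),
                make_population("denominator", 3),
                make_population("numerator", 1),
            ]
        }
    ],
}

if __name__ == "__main__":
    print(population_counts(SAMPLE_REPORT))
```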

Looking for feedback on:

- The CQL for the CMS130 measure has some logic along the lines of “colonoscopy must be performed 10 years or less before the measurement period” - how strict should we be with trying to capture this? Right now I believe the colonoscopy is performed after the start of the measurement period.
- Is there anything captured in the EXM130-8.0.000-r4.json Synthea module that should also be captured in this module but currently isn’t?

dehall · 2023-09-06T17:16:20Z