Add CMS130 Synthea Module #43

Merged
merged 6 commits into from
Sep 13, 2023

Conversation

sarahmcdougall

Summary

Adds a Synthea module that simulates the CMS130 (v10) measure on our Synthea fork.
NOTE - For the Connectathon we want to generate varying group sizes of synthetic patients using this Synthea module. This PR only covers the Synthea module generation, not the patient generation. The patients will be generated once this PR is reviewed and merged.

New behavior

We now have a Synthea module that we can use to generate synthetic patients whose data aligns with the data required for the CMS130 Colorectal Cancer Screening measure.

Purpose: The September Connectathon will be supporting a bulk data track, focused on bulk data export. We are specifically interested in:
(1) measuring the scalability of the bulk export operation across different groups of patients, and
(2) determining whether we can configure _typeFilter queries that will result in patients falling into the appropriate measure populations (i.e. can we construct queries such that we are not exporting all the data on the server but that we are exporting just enough for successful measure calculation).
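As a rough illustration of goal (2), here is a minimal sketch of how a group-level export request with `_type` and `_typeFilter` could be assembled. The server base URL and group id are hypothetical placeholders, and the value set OID is the encounter value set quoted later in this PR; whether a given server supports the `:in` modifier inside `_typeFilter` is an assumption that would need to be checked per server.

```python
from urllib.parse import quote, urlencode

# Hypothetical placeholders, not real endpoints or group ids.
BASE_URL = "https://bulk.example.org/fhir"
GROUP_ID = "cms130-patients"
# VSAC canonical prefix for value set URLs.
VALUE_SET_BASE = "http://cts.nlm.nih.gov/fhir/ValueSet/"


def build_export_url(base_url, group_id, type_filters):
    """Build a group-level $export URL restricted to the given resource types.

    type_filters maps a resource type to a FHIR search query without the
    leading "Type?", e.g. {"Encounter": "type:in=<value set URL>"}.
    """
    types = ",".join(type_filters)
    # Each _typeFilter entry has the form "<ResourceType>?<search query>".
    filters = ",".join(f"{rtype}?{query}" for rtype, query in type_filters.items())
    params = urlencode({"_type": types, "_typeFilter": filters}, quote_via=quote)
    return f"{base_url}/Group/{group_id}/$export?{params}"


if __name__ == "__main__":
    oid = "2.16.840.1.113883.3.464.1003.101.12.1025"
    print(build_export_url(BASE_URL, GROUP_ID, {"Encounter": f"type:in={VALUE_SET_BASE}{oid}"}))
```

Joining several filters with commas follows the original Bulk Data `_typeFilter` convention; servers that expect the parameter to be repeated instead would need a small change here.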

The colorectal cancer screening measure (CMS130) will be used as a specific use case at the Connectathon since it is both a CMS and HRSA measure.

Since the colorectal cancer screening measure is our use case, we want to create synthetic patient data that we can load onto servers (ex. the Abacus bulk export server) and export so that we can achieve the purposes above.

Code changes

New JSON Synthea module. This module reuses some of the logic from the EXM130-8.0.000-r4.json Synthea module, but adds some handling for additional encounters, procedures, and measure population conditions.

At a high level, the logic contained in this module includes:

  • Guards on age and measurement period - these ensure that the patient is of the correct age range and we are in the correct measurement period before generating the necessary encounters/procedures. See the Synthea module builder wiki for more info on the different states included in this module
  • Encounter state for qualifying encounters - I included the ValueSet associated with each type of encounter, and one code system/value for each type of encounter. The distribution (70% qualifying encounter/20% Telehealth visit/10% no encounter) is arbitrary but will hopefully ensure a wide array of data to be generated while still ensuring a sizable number of patients end up in the numerator and denominator
  • Procedure state for colonoscopy and other qualifying procedures - I included the ValueSet associated with each type of procedure, and one code system/value for each type of procedure. The distribution (70% procedure/30% no procedure) is arbitrary (for same reasons as above)
  • Assign attributes for the relevant measure populations that the patient belongs to (ex. the patient is a numerator patient if both the encounter and procedure occurred)
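To get a feel for what the transition weights above imply, here is a back-of-the-envelope sketch. It assumes the patient has already passed the guards, that a Telehealth visit counts as a qualifying encounter (toggleable below), and that the procedure weight applies to every patient who had an encounter. None of that is confirmed by the module itself, so treat the numbers as rough expectations only.

```python
# Transition weights quoted in the PR description.
P_QUALIFYING_ENCOUNTER = 0.70
P_TELEHEALTH_VISIT = 0.20
P_NO_ENCOUNTER = 0.10
P_PROCEDURE = 0.70


def expected_population_shares(telehealth_counts=True):
    """Expected share of guard-passing patients in each measure population.

    Assumption: a patient is in the denominator if any counted encounter
    occurred, and in the numerator if the procedure also occurred.
    """
    p_encounter = P_QUALIFYING_ENCOUNTER
    if telehealth_counts:
        p_encounter += P_TELEHEALTH_VISIT
    return {
        "denominator": p_encounter,
        "numerator": p_encounter * P_PROCEDURE,
    }


if __name__ == "__main__":
    for counts in (True, False):
        shares = expected_population_shares(telehealth_counts=counts)
        print(counts, {name: round(share, 2) for name, share in shares.items()})
```

Under these assumptions roughly 90% of guard-passing patients land in the denominator and roughly 63% in the numerator, which is consistent with the goal of a sizable numerator and denominator.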

Testing Guidance

  • Navigate to the Synthea Module Builder and upload the JSON included in this PR to see a visual of the module logic. Check that the module logic accurately captures the logic from the CMS130 measure.
  • Go into synthea.properties within synthea/src/main/resources and edit generate.log_patients.detail to be detailed instead of simple. This will allow you to see the states reached by the Synthea patient from the terminal. This is optional but gives a lot of good info for debugging.
    [Image: Screenshot 2023-09-05 at 11 23 31 AM]

From here, we will want to (1) try generating patients to make sure the module logic does not break at any point, and (2) check that we can actually generate patients for each measure population that have the appropriate resources. Here are some steps for doing so:

  • From the terminal, run ./run_synthea with -m "CMS130*" and -p set to however many patients you want to generate (I recommend keeping this number low when logging detailed results). The results will be stored in output/fhir. (If you run into SSL errors, check out the internal docs for how to make the certs cooperate with Java.) You can add some additional flags to filter the results, which is helpful for testing.
    • Example: ./run_synthea -m "CMS130*" -p 1 -k "src/main/resources/keep_modules/keep_ipp.json" -a 18-75 (the age flag avoids timeout errors when Synthea cannot generate patients that meet the IPP conditions, and the keep module flag returns only patients in the IPP)
    • If you have the detailed results enabled, you can check the terminal to see what states the patient reached.

[Image: Screenshot 2023-09-05 at 11 24 55 AM]
  • Denom/IPP patient should have an encounter within the measurement period. Numer patient should have a qualifying encounter and a colonoscopy (or equivalent) procedure within the measurement period. Check out the Patient resources in output/fhir and check for these resources if the patient hit these states. (ex. if a patient hit the "set_numerator" state, check that the Patient resource contains the relevant Encounter and Procedure resources)
  • Additionally, to test the accuracy of the module logic, run the patients through fqm-execution and check that the measure reports are accurate. Example terminal command:

npm run cli -- reports -m <CMS130 measure bundle> --patients-directory <patients directory from synthea> -o -s 2022-01-01 -e 2022-12-31 --report-type summary --debug

For example, if you generated 3 patients, where 2 reached only the denominator state and 1 reached the numerator state, the resulting measure report from fqm-execution should have a numerator count of 1 and a denominator count of 3 (numerator patients also count toward the denominator).
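The manual check of output/fhir described above could be partly scripted. The following is a minimal sketch, assuming each patient file is a FHIR Bundle whose Encounters carry `period.start` and whose Procedures carry `performedPeriod.start` or `performedDateTime`. It counts all Encounters and Procedures in the measurement period, not only those with qualifying codes, so it is a quick sanity check rather than a substitute for fqm-execution. The function names are illustrative.

```python
import json
from datetime import date
from pathlib import Path

# Measurement period used in the fqm-execution command above.
PERIOD_START = date(2022, 1, 1)
PERIOD_END = date(2022, 12, 31)


def resource_date(resource):
    """Return the start date of an Encounter or Procedure, or None if absent."""
    if resource.get("resourceType") == "Encounter":
        raw = resource.get("period", {}).get("start")
    else:
        raw = resource.get("performedDateTime") or resource.get("performedPeriod", {}).get("start")
    # Keep only the YYYY-MM-DD part of the FHIR dateTime.
    return date.fromisoformat(raw[:10]) if raw else None


def count_in_period(bundle):
    """Count Encounter and Procedure resources that start in the measurement period."""
    counts = {"Encounter": 0, "Procedure": 0}
    for entry in bundle.get("entry", []):
        resource = entry.get("resource", {})
        rtype = resource.get("resourceType")
        if rtype in counts:
            when = resource_date(resource)
            if when and PERIOD_START <= when <= PERIOD_END:
                counts[rtype] += 1
    return counts


def summarize_directory(directory):
    """Map each JSON file name in the directory to its in-period counts."""
    summary = {}
    for path in sorted(Path(directory).glob("*.json")):
        bundle = json.loads(path.read_text(encoding="utf-8"))
        summary[path.name] = count_in_period(bundle)
    return summary
```

Calling `summarize_directory("output/fhir")` after a Synthea run gives one line of counts per file; a patient that hit "set_numerator" would be expected to show at least one Encounter and one Procedure.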

Looking for feedback on:

  • The CQL for the CMS130 measure has some logic along the lines of “colonoscopy must be performed 10 years or less before the measurement period” - how strict should we be with trying to capture this? Right now I believe the colonoscopy is performed after the start of the measurement period
  • Is there anything captured in the EXM130-8.0.000-r4.json Synthea module that should also be captured in this module but currently isn’t?

"operator": ">=",
"quantity": 50,
"unit": "years",
"value": 0

I'm a little confused as to why this "value": 0 field is here and in a couple other logic conditions - I don't think it will break anything but I can't figure out where it came from, an Age condition doesn't need that field.

Author

I can't figure out where it came from either but it seems like it can safely be removed. I had re-used these logic conditions from a previously built module that was built using the Synthea Module Builder.
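If it helps with the cleanup, here is a small sketch that strips the stray field from a loaded module. It only touches conditions whose `condition_type` is "Age", since that is the case confirmed in this thread; the other logic conditions mentioned above would need to be checked by hand. The function name is illustrative.

```python
import json


def strip_age_value(node):
    """Recursively remove the unused "value" key from Age conditions, in place.

    Returns the number of keys removed.
    """
    removed = 0
    if isinstance(node, dict):
        if node.get("condition_type") == "Age" and "value" in node:
            del node["value"]
            removed += 1
        for child in node.values():
            removed += strip_age_value(child)
    elif isinstance(node, list):
        for child in node:
            removed += strip_age_value(child)
    return removed


if __name__ == "__main__":
    # Tiny stand-in for a module; a real run would json.load the module file.
    sample = {
        "allow": {
            "condition_type": "Age",
            "operator": ">=",
            "quantity": 50,
            "unit": "years",
            "value": 0,
        }
    }
    print(strip_age_value(sample), json.dumps(sample, indent=2))
```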

"type": "Terminal",
"name": "Terminal"
},
"Measurement_Period_Guard": {

The order in which these two guards are processed means that patients you don't expect can go through the module. The first guard will always let patients in at age 50: the module is called every week of the simulation, so they pass that guard the first time its logic returns true. The second guard then waits until year >= 2022. For example, a patient born in 1942 passes the first guard at age 50 and is 80 years old by 2022. Not a big deal if you always provide the -a flag to specify a target age range, but something to be aware of. I think the best way to ensure things work the way you intended is to move this year guard into the And of the age guard. (The UI won't let you do that, but you can just cut & paste the relevant object in the JSON.)
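To make the difference concrete, here is a toy simulation of the two arrangements. It steps one year at a time rather than weekly, and it assumes the age guard has an upper bound of 75; that bound is an assumption for illustration and is not shown in the quoted snippet. It is not Synthea's actual engine, only a sketch of the ordering effect.

```python
MIN_AGE = 50
MAX_AGE = 75  # assumed upper bound of the age guard (hypothetical)
PERIOD_YEAR = 2022


def age_ok(age):
    return MIN_AGE <= age <= MAX_AGE


def sequential_guards(birth_year, end_year=2023):
    """Age guard first, then year guard. Returns the age on clearing both, or None."""
    passed_age_guard = False
    for year in range(birth_year, end_year + 1):
        age = year - birth_year
        if not passed_age_guard and age_ok(age):
            passed_age_guard = True  # once passed, age is never re-checked
        if passed_age_guard and year >= PERIOD_YEAR:
            return age
    return None


def combined_guard(birth_year, end_year=2023):
    """Single guard with age and year in one And. Returns the age on clearing it, or None."""
    for year in range(birth_year, end_year + 1):
        age = year - birth_year
        if age_ok(age) and year >= PERIOD_YEAR:
            return age
    return None


if __name__ == "__main__":
    for birth_year in (1942, 1960):
        print(birth_year, sequential_guards(birth_year), combined_guard(birth_year))
```

With these assumptions, the patient born in 1942 clears the sequential guards at age 80 but never clears the combined guard, while a patient born in 1960 clears both arrangements at age 62.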

"condition_type": "And",
"conditions": [
{
"condition_type": "Gender",

Is this condition supposed to be here and in the other transition option below? I don't see a reference to gender in the measure

Author

Good catch - no, there is no reference to gender and so it should not be included. Accidentally kept this in during testing and it should be removed

"name": "Qualifying_Encounter",
"direct_transition": "Encounter_Occurred",
"codes": [
{

The codes list is a little misleading because only the first code is generally meaningful. In some resource types you might see all of the codes in the output but for Encounters and Procedures as used in this module only the first is relevant. The value_set field means a different code/display may be picked but still only the first entry in the list will show up in the output FHIR. Unfortunately if you want to exercise all these codes you'll either need an overarching value set (probably not possible) or to create separate states for all of them.

Author

Good point - thanks for the explanation on the codes list! I had assumed it was a list from which a code would be chosen at random and applied to the output FHIR.

For our use case (generating synthetic patient data at the Connectathon to use with Bulk Export), I don't think we necessarily need to exercise all the codes, although it would be nice so that we could test out different _typeFilter queries on the codes. Since there are potential licensing issues, I think for now it's fine to just use the first code. Maybe future work could look into creating separate states and doing more in-depth querying on them
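For that possible future work, here is a rough sketch of how separate states could be generated rather than hand-written. It assumes the usual module state shape (an Encounter state with `encounter_class`, `codes`, and `direct_transition`, plus a Simple state with a `distributed_transition`). The state names, the even weighting, and the "ambulatory" class are illustrative choices, not something taken from this module.

```python
def split_codes_into_states(codes, next_state, encounter_class="ambulatory"):
    """Create one Encounter state per code plus a chooser state that fans out to them.

    Each generated state lists exactly one code, so that code is the one
    written to the output. Weights are split evenly across the codes.
    """
    states = {}
    transitions = []
    for code in codes:
        name = f"Encounter_{code['system']}_{code['code']}"
        states[name] = {
            "type": "Encounter",
            "encounter_class": encounter_class,
            "codes": [code],
            "direct_transition": next_state,
        }
        transitions.append({"transition": name, "distribution": 1 / len(codes)})
    states["Choose_Encounter_Code"] = {
        "type": "Simple",
        "distributed_transition": transitions,
    }
    return states


if __name__ == "__main__":
    import json

    # Placeholder codes for illustration only.
    example_codes = [
        {"system": "EXAMPLE", "code": "A1", "display": "Placeholder visit A"},
        {"system": "EXAMPLE", "code": "B2", "display": "Placeholder visit B"},
    ]
    print(json.dumps(split_codes_into_states(example_codes, "Encounter_Occurred"), indent=2))
```

The generated states would still need to be merged into the module's "states" object and wired in place of the current Qualifying_Encounter state, and the licensing question raised in this review applies to whichever codes are passed in.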

Comment on lines 88 to 99
{
"system": "CPT",
"code": "99395",
"display": "Preventive Care Services - Established Office Visit, 18 and Up",
"value_set": "2.16.840.1.113883.3.464.1003.101.12.1025"
},
{
"system": "CPT",
"code": 99385,
"display": "Preventive Care Services-Initial Office Visit, 18 and Up",
"value_set": "2.16.840.1.113883.3.464.1003.101.12.1023"
}

Unless you really need to exercise all options from the measure I'd suggest removing these - licensing is a question and I don't know if this small set constitutes fair use.

Comment on lines 115 to 120
{
"system": "CPT",
"code": "98970",
"display": "Virtual Encounter",
"value_set": "2.16.840.1.113883.3.464.1003.101.12.1089"
},

Same note about removing this unless you need full "code coverage"

"value_set": "2.16.840.1.113883.3.464.1003.198.12.1010"
},
{
"system": "LOINC",

I'm not sure if this actually matters for the measure evaluation. Some of these do look like Procedures, but some seem like they should be Observations (the ones with LOINC codes). For example, I see in the measure criteria [["Laboratory Test, Performed": "FIT DNA"]](https://ecqi.healthit.gov/mcw/2022/ecqm-dataelement/laboratorytestperformedfitdna.html). If a Procedure resource can meet that criteria then no problem, but it seems like it may have to be an Observation instead.

@elsaperelli elsaperelli left a comment


After a lot of Synthea learning, I don't feel totally qualified to say this, but this looks good to me!

I did all of the suggested testing and that all works, so looks like we will be able to create some patients for the connectathon!

@sarahmcdougall sarahmcdougall merged commit 7c748a1 into abacus-dev Sep 13, 2023
4 of 5 checks passed
@sarahmcdougall sarahmcdougall deleted the cms130-connectathon branch September 13, 2023 17:58
3 participants