This is a working PoC of using SQLMesh to generate an OMOP 5.4 CDM from Synthea synthetic data.
- Clone this repository
- Create a python virtual environment and activate it
- Run `pip install -r ./requirements.txt`
- Run `python ./bootstrap.py`. The following steps are executed:
  - Creates a DuckDB database using the information in `config.yaml` and creates the `./data/synthea` and `./data/vocab` folders.
  - Downloads the latest Synthea 100 sample patients CSV data and loads it into the database.
  - Pauses execution until the user downloads or copies the Athena vocabulary data zip file into `./data/vocab` and presses Enter to continue.
  - Loads the vocabulary into the database.
  - Prints the table names in the database.
- You are all set to get started. Run `sqlmesh ui` for next steps.
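
The pause step in the bootstrap sequence above can be sketched roughly as follows. This is an illustration only, not the code in `bootstrap.py`: the function name `wait_for_vocab_zip` and its `prompt` parameter are hypothetical, and the sketch assumes that the zip is extracted in place and that the vocabulary arrives as CSV files inside it.

```python
import zipfile
from pathlib import Path
from typing import Callable, List


def wait_for_vocab_zip(
    vocab_dir: Path,
    prompt: Callable[[str], str] = input,
) -> List[Path]:
    """Pause until a zip file is present in vocab_dir, then extract it.

    Hypothetical helper for illustration; not taken from bootstrap.py.
    Returns the CSV files found under vocab_dir after extraction.
    """
    vocab_dir.mkdir(parents=True, exist_ok=True)

    while True:
        # Block until the user confirms the file has been copied.
        prompt(f"Copy the Athena vocabulary zip into {vocab_dir} and press Enter: ")
        archives = sorted(vocab_dir.glob("*.zip"))
        if archives:
            break
        print(f"No zip file found in {vocab_dir}, please try again.")

    # Assumption: only the first zip (alphabetically) is used.
    with zipfile.ZipFile(archives[0]) as archive:
        archive.extractall(vocab_dir)

    return sorted(vocab_dir.rglob("*.csv"))
```

Calling `wait_for_vocab_zip(Path("./data/vocab"))` would block on the Enter prompt; passing a different `prompt` callable makes the step scriptable, for example in an unattended run.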
Please do not hesitate to fork, create a PR, raise an issue, or get involved in any other way.
DISCLAIMER: The following are all pre-alpha proof-of-concepts with absolutely no guarantees. In fact, running any of these in your data warehouse without guardrails may try to kill your cat. Read more about guardrails here.