Split homepage stuff into background. Start overview page
Showing 8 changed files with 739 additions and 95 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,54 @@ | ||
--- | ||
title: '<div><img src="ohdsi40x40.png"></img> OHDSI GIS WG </div>' | ||
output: | ||
html_document: | ||
toc: TRUE | ||
toc_depth: 3 | ||
toc_float: | ||
collapsed: false | ||
--- | ||
|
||
<br> | ||
|
||
# **Problem Space** | ||
|
||
Geospatial data come in a variety of source formats with no universal, interoperable standard representation. While working with a single geospatial dataset presents some technical challenges, the complexity of using geospatial data begins to compound when variables from disparate datasets are needed for a single analysis. As ad hoc solutions are introduced to deal with the increasing technical complexity, the reproducibility of these studies plummets. | ||
|
||
In order to enable reproducible studies that incorporate place-related data with longitudinal patient-level data, it is first necessary to develop functionality to automate retrieval and transformation of publicly available place-related data into a standard representation. | ||
|
||
# **Goals** | ||
|
||
The overarching goal of the OHDSI GIS Workgroup is to develop software and standards for incorporating place-based data into the OMOP CDM and to share this work with the OHDSI community. We pursue this goal through development in the four key areas below. | ||
|
||
### 1) Infrastructure | ||
- Functional Metadata Catalog | ||
- Harmonized geospatial data model | ||
|
||
### 2) Functionality | ||
- Geocoding | ||
- Accessing geospatial datasets | ||
|
||
### 3) Vocabulary | ||
- GIS specific concepts | ||
- SDoH vocabulary | ||
- Toxin vocabulary | ||
|
||
### 4) Extensions | ||
- Exposure_occurrence | ||
|
||
<br> | ||
|
||
## **Publications/Presentations** | ||
OHDSI GIS Infrastructure (2023 OHDSI Symposium [Poster](https://www.ohdsi.org/2023showcase-19/), [Abstract](https://www.ohdsi.org/wp-content/uploads/2023/10/19-zollovenecek-BriefReport.pdf)) | ||
|
||
Toxins Vocabulary (2023 OHDSI Symposium [Poster](https://www.ohdsi.org/2023showcase-11/), [Abstract](https://www.ohdsi.org/wp-content/uploads/2023/10/Talapova-Polina_A_Toxin_Vocabulary_for_the_OMOP_CDM_2023symposium-Polina-Talapova.pdf)) | ||
|
||
OHDSI GIS Workgroup October 10, 2023 Update ([video](https://youtu.be/QZY-slWdsMs?si=UxMT39rqWTAW_AMx&t=2178)) | ||
|
||
OHDSI GIS Workgroup 2023 Objectives and Key Results ([video](https://www.youtube.com/watch?v=bd0htzO1hx4&t=1103s), [slides](https://www.ohdsi.org/wp-content/uploads/2023/02/GIS-OKRs2023Q1.pdf)) | ||
|
||
|
||
OHDSI GIS Workgroup August 23, 2022 Update ([video](https://youtu.be/d8jIAm9cssM)) | ||
<br> | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,31 @@ | ||
--- | ||
title: '<div><img src="ohdsi40x40.png"></img> OHDSI GIS WG </div>' | ||
output: | ||
html_document: | ||
toc: TRUE | ||
toc_depth: 3 | ||
toc_float: | ||
collapsed: false | ||
--- | ||
|
||
# **OHDSI GIS Gaia Overview** | ||
|
||
**Gaia** refers to the amalgamation of infrastructure, software, standards, tools, and the overall workflow that the OHDSI GIS Workgroup has developed to assist researchers with integrating place-based datasets into their patient-based health database and subsequent analyses. | ||
|
||
**Gaia** includes multiple major elements: | ||
- gaiaCatalog: a functional metadata catalog containing references to publicly-hosted geospatial datasets and instructions for their download and standardization | ||
- gaiaCore: a PostGIS database for managing harmonized data sources, the dockerized DeGauss geocoding tool, and gaiaR, an R package for managing interactions between gaiaCore, gaiaCatalog, and any of the Gaia "extensions" | ||
- Extensions: a broad suite of software packages that are powered by gaiaCore. The most relevant of these packages is gaiaOhdsi, an R package that contains operations specific to interacting with an OMOP CDM or external OHDSI software. Other examples of extensions are the gaiaVis tools, which provide a set of visualizations for data in gaiaCore. | ||
|
||
## Purpose | ||
|
||
What is the purpose of Gaia? Why are we doing all of this? | ||
|
||
Gaia provides a standardized, automated, reproducible, and easily shareable means for integrating place-based datasets into a database of longitudinal patient health data. | ||
|
||
The *simplest case* for Gaia is a single researcher looking to leverage place-based data. After standing up a local or cloud instance of gaiaCore, the researcher has access to a wealth of curated geospatial data sources, ranging from environmental toxin data to one of many Social Determinants of Health indexes derived from US Census data. Instead of spending the countless hours typically required to munge multiple disparate geospatial datasets, the researcher can use functions from the gaiaR package to load datasets into their PostGIS database in a harmonized geospatial data format. Datasets spanning many domains, years, and regions are then available in a single PostGIS database, to which the researcher can connect using the software of their choice to perform ad hoc exploratory data analyses, create visualizations, or even power their own geospatial applications. | ||
|
||
Taking this scenario a step further, a researcher with an established OMOP CDM database may wish to incorporate a subset of geospatial variables into their CDM database alongside their patient health data. The steps necessary to perform this ingestion, which requires geocoding of patient addresses and a spatiotemporal join, are all handled by gaiaCore and the gaiaOhdsi extension. The DeGauss geocoder, a lightweight geocoder that operates fully locally to ensure that patient information is not transmitted, is easily used through a gaiaR wrapper. Standardized spatiotemporal joins from the gaiaOhdsi extension relate patient addresses to polygon, line, or point geometries. Once the place-based data has been transformed into patient-level information, it is ready to be inserted into the CDM extension table "exposure_occurrence". The DDL and insert scripts for this table are also contained in the gaiaOhdsi extension. Once the data has been added to the CDM, it can be used to create cohort definitions, develop predictive models, and feed all relevant external OHDSI tooling. | ||
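
To make the idea of a spatiotemporal join concrete, here is a minimal, self-contained Python sketch. It is not the gaiaOhdsi implementation (which is an R package working against PostGIS), and every name in it (`AddressPeriod`, `PlaceObservation`, `spatiotemporal_join`, and the output keys) is hypothetical. It assumes each patient address has already been geocoded to a point with a residency period, that each place-based value is attached to a simple polygon with a validity period, and that a match requires both spatial containment and overlapping dates.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (longitude, latitude)


@dataclass
class AddressPeriod:
    """A geocoded patient address and the dates the patient lived there (hypothetical)."""
    person_id: int
    location: Point
    start: date
    end: date


@dataclass
class PlaceObservation:
    """A place-based value tied to a polygon and a validity period (hypothetical)."""
    variable: str
    value: float
    polygon: List[Point]  # vertices in order; closing vertex not repeated
    start: date
    end: date


def point_in_polygon(pt: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: count polygon edges crossed by a ray going right from pt."""
    x, y = pt
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through pt can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def spatiotemporal_join(
    addresses: List[AddressPeriod], observations: List[PlaceObservation]
) -> List[Dict]:
    """Return one person-level row per address/observation pair that matches
    in both space (point inside polygon) and time (date ranges overlap)."""
    rows = []
    for a in addresses:
        for o in observations:
            overlap_start = max(a.start, o.start)
            overlap_end = min(a.end, o.end)
            if overlap_start > overlap_end:
                continue  # no shared time window
            if not point_in_polygon(a.location, o.polygon):
                continue  # address lies outside this geometry
            rows.append({
                "person_id": a.person_id,
                "variable": o.variable,
                "value": o.value,
                "start_date": overlap_start,
                "end_date": overlap_end,
            })
    return rows
```

Each returned row is person-level rather than place-level, which is the shape of data the paragraph above describes loading into an exposure table. A production join would instead rely on spatial indexes and geometry operations in the database, and would also need to handle line and point geometries.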
|
||
Finally, Gaia enables standardized and reproducible workflows for federated data networks and studies. The process highlighted above (retrieving and harmonizing geospatial datasets, performing spatiotemporal joins to transform place-based data into person-level information, inserting that information into an OMOP CDM, and defining cohorts) is fully reproducible. Each step of the process contains detailed, structured metadata focused on the provenance of source data and the rationale for transformation methods. By scripting and containerizing an entire Gaia workflow, the process of pairing place-based data with patient data, often handled using undocumented ad hoc methods unique to single sites, can be packaged and shipped across an entire network with minimal effort. | ||
|