From 86a2facac143189a294c4403af35e8750364448e Mon Sep 17 00:00:00 2001 From: Kyle Zollo-Venecek Date: Sat, 25 Nov 2023 13:43:24 -0500 Subject: [PATCH] Start to redo docs Split homepage stuff into background. Start overview page --- docs/background.html | 603 +++++++++++++++++++++++++++++++++++++++++++ docs/dataModel.html | 19 +- docs/index.html | 78 ++---- rmd/_site.yml | 8 +- rmd/background.Rmd | 54 ++++ rmd/dataModel.Rmd | 4 + rmd/index.Rmd | 37 +-- rmd/overview.Rmd | 31 +++ 8 files changed, 739 insertions(+), 95 deletions(-) create mode 100644 docs/background.html create mode 100644 rmd/background.Rmd create mode 100644 rmd/overview.Rmd diff --git a/docs/background.html b/docs/background.html new file mode 100644 index 00000000..1294b91b --- /dev/null +++ b/docs/background.html @@ -0,0 +1,603 @@ + + + + + + + + + + + + + +background.knit + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + +
+
+
+
+
+ +
+ + + + + + + + + +


+
+

Problem Space

+

Geospatial data come in a variety of source formats with no +universal, interoperable standard representation. While working with a +single geospatial dataset presents some technical challenges, the +complexity of using geospatial data begins to compound when variables from +disparate datasets are needed for a single analysis study. As ad-hoc +solutions are introduced to deal with the increasing technical +complexity, the reproducibility of these studies plummets.

+

In order to enable reproducible studies that incorporate +place-related data with longitudinal patient-level data, it is first +necessary to develop functionality to automate retrieval and +transformation of publicly available place-related data into a standard +representation.

+
+
+

Goals

+

The overarching goal of the OHDSI GIS Workgroup is to develop +software and standards for incorporating place-based data into the OMOP +CDM and present this back to the OHDSI community. We reach these goals +through development in the four key areas below.

+
+

1) Infrastructure

+
    +
  • Functional Metadata Catalog
  • +
  • Harmonized geospatial data model
  • +
+
+
+

2) Functionality

+
    +
  • Geocoding
  • +
  • Accessing geospatial datasets
  • +
+
+
+

3) Vocabulary

+
    +
  • GIS specific concepts
  • +
  • SDoH vocabulary
  • +
  • Toxin vocabulary
  • +
+
+
+

4) Extensions

+
    +
  • Exposure_occurrence
  • +
+


+
+
+

Publications/Presentation

+

OHDSI GIS Infrastructure (2023 OHDSI Symposium Poster, Abstract)

+

Toxins Vocabulary (2023 OHDSI Symposium Poster, Abstract)

+

OHDSI GIS Workgroup October 10, 2023 Update (video)

+

OHDSI GIS Workgroup 2023 Objectives and Key Results (video, +slides)

+

OHDSI GIS Workgroup August 23, 2022 Update (video)

+
+
+ + + +
+
+ +
+ + + + + + + + + + + + + + + + diff --git a/docs/dataModel.html b/docs/dataModel.html index 804193bd..4bef0abb 100644 --- a/docs/dataModel.html +++ b/docs/dataModel.html @@ -325,11 +325,25 @@ -
OMOP GIS
+
OHDSI GIS
+ +
+

CDM Extension Tables

exposure_occurrence

diff --git a/docs/index.html b/docs/index.html index 2dc18235..d881d662 100644 --- a/docs/index.html +++ b/docs/index.html @@ -439,71 +439,25 @@

-
-

Problem Space and Goals

-

Geospatial data come in a variety of source formats, with no -universal, interoperable standard representation. In order to meet our -goal of enabling studies of place-related data in conjunction with -longitudinal patient-level data, it is first necessary to develop -functionality to automate retrieval and transformation of publicly -available place-related data into a standard representation. The GIS -workgroup aims to create a suite of tools built around a lightweight -“repository of functionality” to incorporate geospatial analyses into -the familiar research workflows within the OHDSI ecosystem.

-


-
-

Meeting Schedule

-

The GIS General and Development subgroups meet on alternating Fridays -at 9 AM ET (meeting -link)

-

General meetings focus on use cases, integration with OHDSI tools, -and larger project context. Development meetings provide a time for -software developers to collaborate or share their recent work.

-

We encourage folks with an interest in leveraging Place-related data -in their research to join or collaborate with the GIS workgroup. While -anyone is welcome to join either of the meetings, consider attending the -subgroup meeting that best aligns with your goals: If you would like to -propose a use case, contribute domain expertise, or learn how the work -from this group could augment your current research, consider joining us -at a General meeting. If you are interested in contributing to this -project on the technical side (code, documentation, unit testing) -consider joining us at a Development meeting.

-


+
+

Mission

+

The OHDSI GIS workgroup aims to introduce new infrastructure, +tooling, and CDM extension tables that will allow researchers to +seamlessly incorporate place-based datasets into their OMOP-shaped +databases to enhance their evidence-generating workflows.

-
-

Roadmap

-

TODO

+
+

Meeting Schedule

+

The GIS General and Development subgroups meet on alternating +Fridays at 9 AM ET

+

(General +meeting link)

+

(Development +meeting link)


-
-

Get Involved

-

The first iteration of our gaia software and -variable library is now functional! See the installation -page to get started with the gaiaCore R package and gaiaDB database. -From there, feel free to build your own tools that interface with gaiaDB -and enhance the collection of datasets by adding your own (see TODO). We -encourage anyone who adds data sources to a local gaiaDB instance to -also create a pull request for them to be added to the full gaiaDB.

-

You can also request that a data source be added to gaiaDB by filling -out a data source request form here

-

Note: For the time being, all software and data sources are still in -active development, subject to unannounced changes, and should be -considered “unstable”. There is currently no versioning of software or -data sources, though there are thoughts to implement Semantic Versioning -(SemVer) and Universally Unique -Identifiers (UUIDs) -

-
-
-

Publications/Presentation

-

OHDSI GIS Workgroup 2022 Objectives and Key Results (video)

-
-
diff --git a/rmd/_site.yml b/rmd/_site.yml index 7ca30f42..890c4409 100644 --- a/rmd/_site.yml +++ b/rmd/_site.yml @@ -8,11 +8,17 @@ output: in_header: "favicon.html" exclude: ["extras"] navbar: - title: '
OMOP GIS
' + title: '
OHDSI GIS
' right: - icon: fa-github href: https://github.com/OHDSI/GIS left: + - text: "Home" + icon: fa-home + href: index.html + - text: "Background" + icon: fa-circle-info + href: background.html - text: "Applications" icon: fa-landmark href: applications.html diff --git a/rmd/background.Rmd b/rmd/background.Rmd new file mode 100644 index 00000000..ec234825 --- /dev/null +++ b/rmd/background.Rmd @@ -0,0 +1,54 @@ +--- +title: '
OHDSI GIS WG
' +output: + html_document: + toc: TRUE + toc_depth: 3 + toc_float: + collapsed: false +--- + +
+ +# **Problem Space** + +Geospatial data come in a variety of source formats with no universal, interoperable standard representation. While working with a single geospatial dataset presents some technical challenges, the complexity of using geospatial data begins to compound when variables from disparate datasets are needed for a single analysis study. As ad-hoc solutions are introduced to deal with the increasing technical complexity, the reproducibility of these studies plummets. + +In order to enable reproducible studies that incorporate place-related data with longitudinal patient-level data, it is first necessary to develop functionality to automate retrieval and transformation of publicly available place-related data into a standard representation. + +# **Goals** + +The overarching goal of the OHDSI GIS Workgroup is to develop software and standards for incorporating place-based data into the OMOP CDM and present this back to the OHDSI community. We reach these goals through development in the four key areas below. + +### 1) Infrastructure +- Functional Metadata Catalog +- Harmonized geospatial data model + +### 2) Functionality +- Geocoding +- Accessing geospatial datasets + +### 3) Vocabulary +- GIS specific concepts +- SDoH vocabulary +- Toxin vocabulary + +### 4) Extensions +- Exposure_occurrence + +
+ +## **Publications/Presentation** +OHDSI GIS Infrastructure (2023 OHDSI Symposium [Poster](https://www.ohdsi.org/2023showcase-19/), [Abstract](https://www.ohdsi.org/wp-content/uploads/2023/10/19-zollovenecek-BriefReport.pdf)) + +Toxins Vocabulary (2023 OHDSI Symposium [Poster](https://www.ohdsi.org/2023showcase-11/), [Abstract](https://www.ohdsi.org/wp-content/uploads/2023/10/Talapova-Polina_A_Toxin_Vocabulary_for_the_OMOP_CDM_2023symposium-Polina-Talapova.pdf)) + +OHDSI GIS Workgroup October 10, 2023 Update ([video](https://youtu.be/QZY-slWdsMs?si=UxMT39rqWTAW_AMx&t=2178)) + +OHDSI GIS Workgroup 2023 Objectives and Key Results ([video](https://www.youtube.com/watch?v=bd0htzO1hx4&t=1103s), [slides](https://www.ohdsi.org/wp-content/uploads/2023/02/GIS-OKRs2023Q1.pdf)) + + +OHDSI GIS Workgroup August 23, 2022 Update ([video](https://youtu.be/d8jIAm9cssM)) +
+ + diff --git a/rmd/dataModel.Rmd b/rmd/dataModel.Rmd index f6cff5e8..601d1437 100644 --- a/rmd/dataModel.Rmd +++ b/rmd/dataModel.Rmd @@ -50,6 +50,10 @@ for(tb in tables) { cat("## **Backbone**\n\n") } + if(tb == 'exposure_occurrence'){ + cat("## **CDM Extension**\n\n") + } + cat("###", tb, "{.tabset .tabset-pills} \n\n") tableInfo <- subset(tableSpecs, gaiaTableName == tb) diff --git a/rmd/index.Rmd b/rmd/index.Rmd index 98964d27..f7272a57 100644 --- a/rmd/index.Rmd +++ b/rmd/index.Rmd @@ -9,41 +9,16 @@ output: --- -# **Problem Space and Goals** +# **Mission** -Geospatial data come in a variety of source formats, with no universal, interoperable standard representation. In order to meet our goal of enabling studies of place-related data in conjunction with longitudinal patient-level data, it is first necessary to develop functionality to automate retrieval and transformation of publicly available place-related data into a standard representation. The GIS workgroup aims to create a suite of tools built around a lightweight "repository of functionality" to incorporate geospatial analyses into the familiar research workflows within the OHDSI ecosystem. +The OHDSI GIS workgroup aims to introduce new infrastructure, tooling, and CDM extension tables that will allow researchers to seamlessly incorporate place-based datasets into their OMOP-shaped databases to enhance their evidence-generating workflows. -
- -## **Meeting Schedule** - -The GIS General and Development subgroups meet on alternating Fridays at 9 AM ET ([meeting link](https://teams.microsoft.com/l/meetup-join/19%3a83e90982136c4665aa5f74a6ce292e39%40thread.tacv2/1647286365762?context=%7b%22Tid%22%3a%22a30f0094-9120-4aab-ba4c-e5509023b2d5%22%2c%22Oid%22%3a%2222effa56-2c2a-408b-9cfd-25cfe976bb49%22%7d)) - -General meetings focus on use cases, integration with OHDSI tools, and larger project context. Development meetings provide a time for software developers to collaborate or share their recent work. - -We encourage folks with an interest in leveraging Place-related data in their research to join or collaborate with the GIS workgroup. While anyone is welcome to join either of the meetings, consider attending the subgroup meeting that best aligns with your goals: If you would like to propose a use case, contribute domain expertise, or learn how the work from this group could augment your current research, consider joining us at a General meeting. If you are interested in contributing to this project on the technical side (code, documentation, unit testing) consider joining us at a Development meeting. - -
- -## **Roadmap** +# **Meeting Schedule** -TODO +The GIS General and Development subgroups meet on alternating **Fridays at 9 AM ET** -
- -## **Get Involved** - -The first iteration of our **gaia** software and variable library is now functional! See the [installation](https://ohdsi.github.io/GIS/installation.html) page to get started with the gaiaCore R package and gaiaDB database. From there, feel free to build your own tools that interface with gaiaDB and enhance the collection of datasets by adding your own (see TODO). We encourage anyone who adds data sources to a local gaiaDB instance to also create a pull request for them to be added to the full gaiaDB. - -You can also request that a data source be added to gaiaDB by filling out a data source request form [here](https://github.com/OHDSI/GIS/issues/new?assignees=&labels=data+request&template=data_request_template.yaml&title=%5BData+Source+Request%5D%3A+) +([General meeting link](https://teams.microsoft.com/l/meetup-join/19%3af4d776830f504bf3827fe4309156a3c6%40thread.tacv2/1680892162259?context=%7b%22Tid%22%3a%22a30f0094-9120-4aab-ba4c-e5509023b2d5%22%2c%22Oid%22%3a%22df92a21b-7fa5-4c86-9f81-9f6ec39e042c%22%7d)) +([Development meeting link](https://teams.microsoft.com/l/meetup-join/19%3a83e90982136c4665aa5f74a6ce292e39%40thread.tacv2/1647286365762?context=%7b%22Tid%22%3a%22a30f0094-9120-4aab-ba4c-e5509023b2d5%22%2c%22Oid%22%3a%2222effa56-2c2a-408b-9cfd-25cfe976bb49%22%7d)) -Note: For the time being, all software and data sources are still in active development, subject to unannounced changes, and should be considered "unstable". There is currently no versioning of software or data sources, though there are thoughts to implement Semantic Versioning ([SemVer](https://semver.org/)) and Universally Unique Identifiers ([UUIDs](https://en.wikipedia.org/wiki/Universally_unique_identifier)) -
- -## **Publications/Presentation** - -OHDSI GIS Workgroup 2022 Objectives and Key Results ([video](https://youtu.be/d8jIAm9cssM))
- - diff --git a/rmd/overview.Rmd b/rmd/overview.Rmd new file mode 100644 index 00000000..d46d4881 --- /dev/null +++ b/rmd/overview.Rmd @@ -0,0 +1,31 @@ +--- +title: '
OHDSI GIS WG
' +output: + html_document: + toc: TRUE + toc_depth: 3 + toc_float: + collapsed: false +--- + +# **OHDSI GIS Gaia Overview** + +**Gaia** refers to the amalgamation of infrastructure, software, standards, tools, and the overall workflow that the OHDSI GIS Workgroup has developed to assist researchers with integrating place-based datasets into their patient-based health database and subsequent analyses. + +**Gaia** includes multiple major elements: +- gaiaCatalog: a functional metadata catalog containing references to publicly hosted geospatial datasets and instructions for their download and standardization +- gaiaCore: a PostGIS database for managing harmonized data sources, the dockerized DeGauss geocoding tool, and gaiaR, an R package for managing interactions between gaiaCore, gaiaCatalog, and any of the Gaia "extensions" +- Extensions: a broad suite of software packages that are powered by gaiaCore. The most relevant of these packages is gaiaOhdsi, an R package that contains operations specific to interacting with an OMOP CDM or external OHDSI software. Other examples of extensions are the gaiaVis tools, which provide a set of visualizations for data in gaiaCore. + +## Purpose + +What is the purpose of Gaia? Why are we doing all of this? + +Gaia provides a standardized, automated, reproducible, and easily shareable means for integrating place-based datasets into a database of longitudinal patient health data. + +The *simplest case* for Gaia is a single researcher looking to leverage place-based data. After standing up a local or cloud instance of gaiaCore, any researcher now has access to a wealth of curated sources of geospatial data ranging from environmental toxin data to one of many Social Determinants of Health Indexes derived from the US Census data. 
Instead of the countless hours of work typical of munging multiple disparate geospatial datasets, the researcher can simply use the functions from the gaiaR package to load datasets into their PostGIS database, all in a harmonized geospatial data format. They've now quickly enabled datasets across many domains, years, and regions in a single PostGIS database, to which they can connect using the software of their choice and begin performing ad hoc exploratory data analyses, creating visualizations, or even powering their own geospatial applications. + +Taking this scenario a step further, a researcher with an established OMOP CDM database may wish to incorporate a subset of geospatial variables into their CDM database alongside their patient health data. The steps necessary to perform this ingestion, which requires geocoding of patient addresses and a spatiotemporal join, are all handled by gaiaCore and the gaiaOhdsi extension. The DeGauss geocoder, a lightweight geocoder that operates fully locally to ensure that patient information is not transmitted, is easily utilized through a gaiaR wrapper. Standardized spatiotemporal joins from the gaiaOhdsi extension relate patient addresses to polygon, line, or point geometries. Once the place-based data has been transformed into patient-level information, it is ready to be inserted into the CDM extension table "exposure_occurrence". The DDL and insert scripts for this table are also contained in the gaiaOhdsi extension. Once the data has been added to the CDM, it can be used to create cohort definitions, develop predictive models, and generally be utilized by all relevant external OHDSI tooling. + +Finally, Gaia enables standardized and reproducible workflows for federated data networks and studies. 
The process highlighted above to retrieve and harmonize geospatial datasets, perform spatiotemporal joins to transform place-based data to person-level information, and insert person-level information into an OMOP CDM and define cohorts, is fully reproducible. Each step of the process contains detailed, structured metadata focused on provenance of source data and rationale for transformation methods. By scripting and containerizing an entire Gaia workflow, the process of pairing place-based data, often handled using undocumented ad-hoc methods unique to single sites, can be packaged and shipped across an entire network with minimal effort. +