Commit
Fix userGuidnce, etlConventions docs
kzollove committed Mar 1, 2024
1 parent bd978b4 commit 64e584e
Showing 4 changed files with 79 additions and 64 deletions.
126 changes: 70 additions & 56 deletions docs/gaia-datamodels.html
@@ -478,15 +478,9 @@ <h3 class="tabset tabset-pills">data_source</h3>
web-hosted entities. All source data in gaiaDB must be referenced in
this table.</p>
<p><strong>User Guide</strong></p>
<p>All records in this table are sources of geospatial data. They can be
sources of geometry data, such as point, line, or polygon, or they can
be attribute data with an identifier that relates them to geometry data,
such as a FIPS code or GEOID.</p>
<p>NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA</p>
<p><strong>ETL Conventions</strong></p>
<p>All sources of data that should be included in gaiaDB must have an
entry in this table. Geometry data sources require a “geom_spec”: a
lightweight transformation from the source data to the standardized
format, written in R and serialized as JSON.</p>
<p>NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA</p>
<table class="table table-condensed table-hover" style="font-size: 13px; margin-left: auto; margin-right: auto;">
<thead>
<tr>
@@ -930,22 +924,9 @@ <h3 class="tabset tabset-pills">variable_source</h3>
data source enabling downstream data integrations. All variables from
attribute source data must be catalogued in this table.</p>
<p><strong>User Guide</strong></p>
<p>All records in this table describe distinct variables from attribute
source data. For example, consider a weather dataset that is being added
to gaiaDB. First, the entire dataset is catalogued in the data_source
table. Then, the distinct variables of that dataset (temperature in
fahrenheit, temperature in celsius, inches of rain, wind direction,
etc.) each become a single record in this table. All records in this
table are related back to their parent source dataset via a foreign key
relationship to the data_source table. Many variable_source records can
be related to a single data_source record.</p>
<p>NA NA NA NA NA NA</p>
<p><strong>ETL Conventions</strong></p>
<p>Every individual variable from a source dataset must have an entry in
this table. Likewise, any source attribute dataset that gets included in
the data_source table will likely have many “children” in this table.
All records in this table contain an “attr_spec”: a lightweight
transformation of a single variable into the standardized table
format.</p>
<p>NA NA NA NA NA NA</p>
<table class="table table-condensed table-hover" style="font-size: 13px; margin-left: auto; margin-right: auto;">
<thead>
<tr>
@@ -1140,10 +1121,9 @@ <h3 class="tabset tabset-pills">attr_index</h3>
<p>A programmatically derived index table of all the attribute source
datasets included in the data_source table.</p>
<p><strong>User Guide</strong></p>
<p>This table can be (re)generated after new entries are added to the
data_source table by running the gaiaCore createIndices() function.</p>
<p>NA NA NA NA NA NA</p>
<p><strong>ETL Conventions</strong></p>
<p>Run the createIndices() function to (re)generate this table.</p>
<p>NA NA NA NA NA NA</p>
<table class="table table-condensed table-hover" style="font-size: 13px; margin-left: auto; margin-right: auto;">
<thead>
<tr>
@@ -1339,10 +1319,9 @@ <h3 class="tabset tabset-pills">geom_index</h3>
<p>A programmatically derived index table of all the geometry source
datasets included in the data_source table.</p>
<p><strong>User Guide</strong></p>
<p>This table can be (re)generated after new entries are added to the
data_source table by running the gaiaCore createIndices() function.</p>
<p>NA NA NA NA NA NA NA NA NA</p>
<p><strong>ETL Conventions</strong></p>
<p>Run the createIndices() function to (re)generate this table.</p>
<p>NA NA NA NA NA NA NA NA NA</p>
<table class="table table-condensed table-hover" style="font-size: 13px; margin-left: auto; margin-right: auto;">
<thead>
<tr>
@@ -1612,10 +1591,9 @@ <h3 class="tabset tabset-pills">attr_template</h3>
<p>This table is a template for the standardized attribute table that
get created.</p>
<p><strong>User Guide</strong></p>
<p>No action necessary. This table must simply exist (with no entries)
in the backbone schema.</p>
<p>NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA</p>
<p><strong>ETL Conventions</strong></p>
<p>No action necessary.</p>
<p>NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA</p>
<table class="table table-condensed table-hover" style="font-size: 13px; margin-left: auto; margin-right: auto;">
<thead>
<tr>
@@ -2065,10 +2043,9 @@ <h3 class="tabset tabset-pills">geom_template</h3>
<p>This table is a template for the standardized geometry tables that
get created.</p>
<p><strong>User Guide</strong></p>
<p>No action necessary. This table must simply exist (with no entries)
in the backbone schema.</p>
<p>NA NA NA NA NA NA NA</p>
<p><strong>ETL Conventions</strong></p>
<p>No action necessary.</p>
<p>NA NA NA NA NA NA NA</p>
<table class="table table-condensed table-hover" style="font-size: 13px; margin-left: auto; margin-right: auto;">
<thead>
<tr>
@@ -2289,12 +2266,9 @@ <h3 class="tabset tabset-pills">geom_omop_location</h3>
<p>This table contains identifier and text address from OMOP Location
table records along with their associated geocoded, point geometry.</p>
<p><strong>User Guide</strong></p>
<p>Populate this table from the OMOP Location table to facilitate
creation of CDM Extension tables.</p>
<p>NA NA NA</p>
<p><strong>ETL Conventions</strong></p>
<p>Use the geocodeAddresses() function as outlined in <a
href="https://ohdsi.github.io/GIS/ht-geocode.html"
class="uri">https://ohdsi.github.io/GIS/ht-geocode.html</a></p>
<p>NA NA NA</p>
<table class="table table-condensed table-hover" style="font-size: 13px; margin-left: auto; margin-right: auto;">
<thead>
<tr>
@@ -2412,11 +2386,9 @@ <h3 class="tabset tabset-pills">omop_location_history</h3>
<p><strong>Table Description</strong></p>
<p>This table is a copy of the OMOP Location_History table.</p>
<p><strong>User Guide</strong></p>
<p>Copy the OMOP Location_History table to Gaia to facilitate creation
of CDM Extension tables</p>
<p>NA NA NA NA NA NA</p>
<p><strong>ETL Conventions</strong></p>
<p>This table should be an exact duplicate of the OMOP Location_History
table.</p>
<p>NA NA NA NA NA NA</p>
<table class="table table-condensed table-hover" style="font-size: 13px; margin-left: auto; margin-right: auto;">
<thead>
<tr>
@@ -2616,9 +2588,53 @@ <h3 class="tabset tabset-pills">exposure_occurrence</h3>
transformations of data from Gaia and to interface with ATLAS and OHDSI
tool stack from an OMOP CDM database.</p>
<p><strong>User Guide</strong></p>
<p>NA</p>
<p>The unique key given to a social or environmental exposure for a
Person The LOCATION_ID of the Person for whom the exposure is
associated. The PERSON_ID of the Person for whom the exposure is
associated. The EXPOSURE_CONCEPT_ID field is recommended for primary use
in analyses, and must be used for network studies. This is the standard
concept mapped from the source value which represents a exposure. Use
this date to determine the start date of the exposure. NA Use this date
to determine the end date of the exposure. NA This field identifies the
origin of the exposure record (e.g. Census, EHR, Environmental data,
Geospatial data, Satellite imagery, GIS mapping, Sensor network, Mobile
device geolocation, LiDAR) This field can be used to determine the
spatiotemporal relationship between the source Exposure and the Person
This field can be used to determine the original source of place-based
exposure data This field houses the verbatim name of the original source
of place-based exposure data. NA NA NA NA The meaning of
Concept?4172703?for ?=? is identical to omission of a
OPERATOR_CONCEPT_ID value. Since the use of this field is rare, it?s
important when devising analyses to not to forget testing for the
content of this field for values different from =. This is the numerical
value of the Exposure, if available. If the raw data gives a categorial
result for exposures those values are captured and mapped to standard
concepts in the ?Exposure Value? domain. UNIT_SOURCE_VALUES should be
mapped to a Standard Concept in the Unit domain that best represents the
unit as given in the source data.</p>
<p><strong>ETL Conventions</strong></p>
<p>NA</p>
<p>Each derived instance of an exposure should be assigned this unique
key. NA NA The CONCEPT_ID to which the source exposure is mapped. This
mapping should be integrated into the variable_source record and
automatically populated in this record. The date range of the exposure
should represent the temporal overlap between the place-based exposure
data point and the LOCATION_ID’s location_history record. NA The date
range of the exposure should represent the temporal overlap between the
place-based exposure data point and the LOCATION_ID’s location_history
record. NA The CONCEPT_ID to which the exposure’s data source type is
mapped. This mapping should be integrated into the data_source record
and automatically populated in this record. The CONCEPT_ID to which the
relationship between the Exposure and the Person is mapped. This mapping
should be automatically populated in this record. The CONCEPT_ID to
which the exposure’s data source is mapped. This mapping should be
integrated into the data_source record and automatically populated in
this record. This name is mapped to a Standard Exposure Source Concept
and the original name is stored here for reference. NA NA NA NA NA This
value should be integrated into the variable_source record and
automatically populated in this record. This mapping should be
integrated into the variable_source record and automatically populated
in this record. This mapping should be integrated into the
variable_source record and automatically populated in this record.</p>
<table class="table table-condensed table-hover" style="font-size: 13px; margin-left: auto; margin-right: auto;">
<thead>
<tr>
@@ -2878,9 +2894,9 @@ <h3 class="tabset tabset-pills">exposure_occurrence</h3>
exposure_type_concept_id
</td>
<td style="text-align:left;">
This field can be used to determine the provenance of the Exposure
record, as in whether the exposure was from an ___________ or other
sources.
This field identifies the origin of the exposure record (e.g. Census,
EHR, Environmental data, Geospatial data, Satellite imagery, GIS
mapping, Sensor network, Mobile device geolocation, LiDAR)
</td>
<td style="text-align:left;">
The CONCEPT_ID to which the exposure’s data source type is mapped. This
@@ -3102,11 +3118,10 @@ <h3 class="tabset tabset-pills">exposure_occurrence</h3>
operator_concept_id
</td>
<td style="text-align:left;">
The meaning of Concept<a0>4172703<a0>for &lt;91&gt;=&lt;92&gt; is
identical to omission of a OPERATOR_CONCEPT_ID value. Since the use of
this field is rare, it&lt;92&gt;s important when devising analyses to
not to forget testing for the content of this field for values different
from =. </a0></a0>
The meaning of Concept?4172703?for ?=? is identical to omission of a
OPERATOR_CONCEPT_ID value. Since the use of this field is rare, it?s
important when devising analyses to not to forget testing for the
content of this field for values different from =.
</td>
<td style="text-align:left;">
</td>
@@ -3162,8 +3177,7 @@ <h3 class="tabset tabset-pills">exposure_occurrence</h3>
</td>
<td style="text-align:left;">
If the raw data gives a categorial result for exposures those values are
captured and mapped to standard concepts in the &lt;91&gt;Exposure
Value&lt;92&gt; domain.
captured and mapped to standard concepts in the ?Exposure Value? domain.
</td>
<td style="text-align:left;">
This mapping should be integrated into the variable_source record and
2 changes: 1 addition & 1 deletion docs/gaiaCore/pkgdown.yml
@@ -2,5 +2,5 @@ pandoc: '2.18'
pkgdown: 2.0.7
pkgdown_sha: ~
articles: {}
last_built: 2024-03-01T14:55Z
last_built: 2024-03-01T16:56Z

6 changes: 3 additions & 3 deletions inst/csv/gaia001fieldLevel.csv
@@ -76,15 +76,15 @@ exposure_occurrence,exposure_start_date,Yes,date,Use this date to determine the
exposure_occurrence,exposure_start_datetime,No,datetime,,,No,No,,,,,
exposure_occurrence,exposure_end_date,Yes,date,Use this date to determine the end date of the exposure.,The date range of the exposure should represent the temporal overlap between the place-based exposure data point and the LOCATION_ID's location_history record.,No,No,,,,,
exposure_occurrence,exposure_end_datetime,No,datetime,,,No,No,,,,,
exposure_occurrence,exposure_type_concept_id,Yes,integer,"This field can be used to determine the provenance of the Exposure record, as in whether the exposure was from an ___________ or other sources.",The CONCEPT_ID to which the exposure's data source type is mapped. This mapping should be integrated into the data_source record and automatically populated in this record.,No,Yes,concept,concept_id,Type Concept,,
exposure_occurrence,exposure_type_concept_id,Yes,integer,"This field identifies the origin of the exposure record (e.g. Census, EHR, Environmental data, Geospatial data, Satellite imagery, GIS mapping, Sensor network, Mobile device geolocation, LiDAR)",The CONCEPT_ID to which the exposure's data source type is mapped. This mapping should be integrated into the data_source record and automatically populated in this record.,No,Yes,concept,concept_id,Type Concept,,
exposure_occurrence,exposure_relationship_concept_id,Yes,integer,This field can be used to determine the spatiotemporal relationship between the source Exposure and the Person,The CONCEPT_ID to which the relationship between the Exposure and the Person is mapped. This mapping should be automatically populated in this record.,No,Yes,concept,concept_id,,,
exposure_occurrence,exposure_source_concept_id,No,integer,This field can be used to determine the original source of place-based exposure data,The CONCEPT_ID to which the exposure's data source is mapped. This mapping should be integrated into the data_source record and automatically populated in this record.,No,Yes,concept,concept_id,,,
exposure_occurrence,exposure_source_value,No,varchar(50),This field houses the verbatim name of the original source of place-based exposure data.,This name is mapped to a Standard Exposure Source Concept and the original name is stored here for reference.,No,No,,,,,
exposure_occurrence,exposure_relationship_source_value,No,varchar(50),,,No,No,,,,,
exposure_occurrence,dose_unit_source_value,No,varchar(50),,,No,No,,,,,
exposure_occurrence,quantity,No,integer,,,No,No,,,,,
exposure_occurrence,modifier_source_value,No,varchar(50),,,No,No,,,,,
exposure_occurrence,operator_concept_id,No,integer,"The meaning of Concept4172703for �=� is identical to omission of a OPERATOR_CONCEPT_ID value. Since the use of this field is rare, its important when devising analyses to not to forget testing for the content of this field for values different from =.",,No,Yes,concept,concept_id,,,
exposure_occurrence,operator_concept_id,No,integer,"The meaning of Concept?4172703?for ?=? is identical to omission of a OPERATOR_CONCEPT_ID value. Since the use of this field is rare, it?s important when devising analyses to not to forget testing for the content of this field for values different from =.",,No,Yes,concept,concept_id,,,
exposure_occurrence,value_as_number,No,float,"This is the numerical value of the Exposure, if available.",This value should be integrated into the variable_source record and automatically populated in this record.,No,No,,,,,
exposure_occurrence,value_as_concept_id,No,integer,If the raw data gives a categorial result for exposures those values are captured and mapped to standard concepts in the Exposure Value domain.,This mapping should be integrated into the variable_source record and automatically populated in this record.,No,Yes,concept,concept_id,,,
exposure_occurrence,value_as_concept_id,No,integer,If the raw data gives a categorial result for exposures those values are captured and mapped to standard concepts in the ?Exposure Value? domain.,This mapping should be integrated into the variable_source record and automatically populated in this record.,No,Yes,concept,concept_id,,,
exposure_occurrence,unit_concept_id,No,integer,UNIT_SOURCE_VALUES should be mapped to a Standard Concept in the Unit domain that best represents the unit as given in the source data.,This mapping should be integrated into the variable_source record and automatically populated in this record.,No,Yes,concept,concept_id,Unit,,
9 changes: 5 additions & 4 deletions rmd/gaia-datamodels.Rmd
@@ -65,12 +65,13 @@ for(tb in tables) {
tableInfo <- subset(tableSpecs, gaiaTableName == tb)
cat("**Table Description**\n\n",tableInfo[,"tableDescription"][[1]], "\n\n")
if(!isTRUE(tableInfo[,"userGuidance"][[1]]=="")){
cat("**User Guide**\n\n",tableInfo[,"userGuidance"][[1]],"\n\n")
fieldInfo <- subset(cdmSpecs, gaiaTableName == tb)
if(!isTRUE(fieldInfo[,"userGuidance"][[1]]=="")){
cat("**User Guide**\n\n",fieldInfo[,"userGuidance"][[1]],"\n\n")
}
if(!isTRUE(tableInfo[,"etlConventions"][[1]]=="")){
cat("**ETL Conventions**\n\n",tableInfo[,"etlConventions"][[1]],"\n\n")
if(!isTRUE(fieldInfo[,"etlConventions"][[1]]=="")){
cat("**ETL Conventions**\n\n",fieldInfo[,"etlConventions"][[1]],"\n\n")
}
loopTable <- subset(gaiaSpecsClean, `Gaia Table` == tb)
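
Note on the runs of NA in the regenerated HTML: the updated template prints the field-level userGuidance and etlConventions columns as they are, so fields without a value appear to surface as literal NA. A minimal sketch of one way to avoid that, assuming the column can be treated as a character vector. The helper collapseNotes and the toy fieldInfo data frame are hypothetical and not part of the repository:

```r
# Hypothetical helper (not in the repository): join a column of per-field
# notes into one string, skipping NA and empty entries.
collapseNotes <- function(notes) {
  notes <- as.character(notes)
  keep <- !is.na(notes) & nzchar(trimws(notes))
  paste(notes[keep], collapse = " ")
}

# Toy stand-in for the field-level spec; column names follow the diff above,
# the values are made up for illustration.
fieldInfo <- data.frame(
  gaiaTableName = c("attr_index", "attr_index", "attr_index"),
  userGuidance = c(NA, "Use this date to determine the start date.", ""),
  stringsAsFactors = FALSE
)

guide <- collapseNotes(fieldInfo[["userGuidance"]])

# Emit the heading only when at least one field has guidance text.
if (nzchar(guide)) {
  cat("**User Guide**\n\n", guide, "\n\n")
}
```

With a guard like this, a table whose fields carry no guidance would produce no User Guide heading at all instead of a paragraph of NA values. Whether the table-level text from tableSpecs should still be printed alongside the field-level text is a separate design decision that this sketch does not address.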