Problem
Sometimes source data has columns that cannot be mapped to one of the existing concepts, but we want to capture that information anyway. We should have a way to do this.
Proposed Solution
Add another data structure, called `additional_info`, to the extract config.
Each entry in this data structure will be a dict that represents a column that cannot be mapped:

    additional_info = [
        {
            "column_name": "marital_status",
            "column_type": "boolean",
            "notes": "Whether the participant in the study is married or not",
            "concept": "PARTICIPANT",
        },
        # ...
    ]

During the extract stage of the ingest pipeline, the `additional_info` structure will be evaluated for each extract config so that a table like the following can be built and written to disk as one of the outputs of the extract stage:

| source file | source column | source column type | concept | notes |
| --- | --- | --- | --- | --- |
| clinical.tsv | marital_status | boolean | PARTICIPANT | Whether the participant in the study is married or not |

This single table will contain an entry for every column in the ingest package that cannot be mapped.
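
A minimal sketch of how the extract stage might assemble that table, assuming the `additional_info` lists have already been collected per source file. The names `additional_info_by_file`, `build_unmapped_table`, and `write_unmapped_table` are hypothetical and not part of the ingest library; the TSV output format is also an assumption.

```python
import csv
import io

# Hypothetical input: source file name -> the additional_info list
# declared in the extract config for that file.
additional_info_by_file = {
    "clinical.tsv": [
        {
            "column_name": "marital_status",
            "column_type": "boolean",
            "notes": "Whether the participant in the study is married or not",
            "concept": "PARTICIPANT",
        },
    ],
}

HEADER = ["source file", "source column", "source column type", "concept", "notes"]


def build_unmapped_table(info_by_file):
    """Flatten every additional_info entry into one row per unmapped column."""
    rows = []
    for source_file, entries in info_by_file.items():
        for entry in entries:
            rows.append([
                source_file,
                entry["column_name"],
                entry["column_type"],
                entry["concept"],
                entry["notes"],
            ])
    return rows


def write_unmapped_table(rows, out):
    """Write the header and rows as tab-separated text to a file-like object."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(HEADER)
    writer.writerows(rows)


rows = build_unmapped_table(additional_info_by_file)
buffer = io.StringIO()  # stand-in for a file opened in the extract output directory
write_unmapped_table(rows, buffer)
```

Keeping one row per unmapped column, keyed by source file, lets entries from every extract config in the ingest package land in the same table.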