Semantic Mapping Representation
Collaborators: Bill Baumgartner, Nicole Vasilevsky
This page describes how semantically rich representations of the mappings were created utilizing the Resource Description Framework (RDF). This task consisted of several steps, guided by domain experts, to ensure that the resulting representations were both accurate and clinically meaningful.
The most important task was to develop semantic definitions for the OMOP2OBO mappings. To do this, logical definitions or representations of each of the mappings needed to be created. This required creating templates to represent the primary design patterns utilized by the mappings. Each pattern is built around different combinations of the Web Ontology Language (OWL) constructors owl:intersectionOf, owl:unionOf, and owl:complementOf. Examples of how each of these constructors (and combinations of them) is used are shown below.
Relevant GitHub Issues: issue #34
owl:complementOf
Class_Name: 'Skin appearance normal (OMOP_4021360)'
Class Expression Syntax: not('Abnormality of the skin')
New Triples:
omop2obo: <https://github.com/callahantiff/omop2obo/obo/ext/>
obo: <http://purl.obolibrary.org/obo/>
oboInOwl: <http://www.geneontology.org/formats/oboInOwl#>
owl: <http://www.w3.org/2002/07/owl#>
rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
rdfs: <http://www.w3.org/2000/01/rdf-schema#>
omop2obo:OMOP_4021360, oboInOwl:hasOBONamespace, OMOP2OBO
omop2obo:OMOP_4021360, oboInOwl:id, OMOP:4021360
omop2obo:OMOP_4021360, rdf:type, owl:Class
omop2obo:OMOP_4021360, rdfs:label, 'Skin appearance normal'
omop2obo:OMOP_4021360, owl:equivalentClass, ec1
ec1, rdf:type, owl:Restriction
ec1, owl:onProperty, obo:BFO_0000051 # has part
ec1, owl:someValuesFrom, ec_not
ec_not, rdf:type, owl:Class
## Abnormality of the skin
ec_not, owl:complementOf, obo:HP_0000951
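The complement pattern can be generated from a small template. The following is a minimal sketch, not the OMOP2OBO implementation: the function name `complement_pattern` is hypothetical, prefixed names are kept as plain strings, and the blank nodes reuse the placeholder names `ec1` and `ec_not` from the listing.

```python
def complement_pattern(omop_id, label, obo_class, has_part="obo:BFO_0000051"):
    """Return (subject, predicate, object) tuples for: has part some (not obo_class)."""
    subject = f"omop2obo:OMOP_{omop_id}"
    return [
        (subject, "oboInOwl:hasOBONamespace", "OMOP2OBO"),
        (subject, "oboInOwl:id", f"OMOP:{omop_id}"),
        (subject, "rdf:type", "owl:Class"),
        (subject, "rdfs:label", label),
        (subject, "owl:equivalentClass", "ec1"),
        # ec1 is the existential restriction on 'has part'
        ("ec1", "rdf:type", "owl:Restriction"),
        ("ec1", "owl:onProperty", has_part),
        ("ec1", "owl:someValuesFrom", "ec_not"),
        # ec_not is the anonymous class that negates the OBO class
        ("ec_not", "rdf:type", "owl:Class"),
        ("ec_not", "owl:complementOf", obo_class),
    ]


triples = complement_pattern("4021360", "Skin appearance normal", "obo:HP_0000951")
for s, p, o in triples:
    print(f"{s}, {p}, {o}")
```

In a real graph the placeholder node names would be replaced by fresh blank nodes per class so that restrictions from different mappings do not collide.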
owl:unionOf
Class_Name: 'Longitudinal deficiency of tibia AND/OR fibula (OMOP_434473)'
Class Expression Syntax: ('Abnormality of fibula morphology' or 'Abnormality of tibia morphology')
New Triples:
omop2obo: <https://github.com/callahantiff/omop2obo/obo/ext/>
obo: <http://purl.obolibrary.org/obo/>
oboInOwl: <http://www.geneontology.org/formats/oboInOwl#>
owl: <http://www.w3.org/2002/07/owl#>
rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
rdfs: <http://www.w3.org/2000/01/rdf-schema#>
omop2obo:OMOP_434473, oboInOwl:hasOBONamespace, OMOP2OBO
omop2obo:OMOP_434473, oboInOwl:id, OMOP:434473
omop2obo:OMOP_434473, rdfs:label, "Longitudinal deficiency of tibia AND/OR fibula"
omop2obo:OMOP_434473, rdf:type, owl:Class
omop2obo:OMOP_434473, owl:equivalentClass, ec1
ec1, rdf:type, owl:Restriction
ec1, owl:onProperty, obo:BFO_0000051 # has part
ec1, owl:someValuesFrom, ec_union
ec_union, rdf:type, owl:Class
## Abnormality of fibula morphology
ec_union, owl:unionOf, ec_union_member1
ec_union_member1, rdf:first, obo:HP_0002991
## Abnormality of tibia morphology
ec_union_member1, rdf:rest, ec_union_member2
ec_union_member2, rdf:first, obo:HP_0002992
ec_union_member2, rdf:rest, rdf:nil
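The union members above are stored as an RDF collection: a chain of nodes linked by rdf:first and rdf:rest and terminated by rdf:nil. A minimal sketch of how such a chain could be produced for any number of members follows; the helper name `rdf_list_triples` and the node naming scheme are assumptions made for illustration.

```python
def rdf_list_triples(members, node_prefix):
    """Encode members as an RDF collection; return (head_node, triples)."""
    if not members:
        return "rdf:nil", []
    # One list node per member, named after the listing's ec_union_member1, ...
    nodes = [f"{node_prefix}{i}" for i in range(1, len(members) + 1)]
    triples = []
    for i, (node, member) in enumerate(zip(nodes, members)):
        triples.append((node, "rdf:first", member))
        # The last node points at rdf:nil to close the list
        rest = nodes[i + 1] if i + 1 < len(nodes) else "rdf:nil"
        triples.append((node, "rdf:rest", rest))
    return nodes[0], triples


head, list_triples = rdf_list_triples(
    ["obo:HP_0002991", "obo:HP_0002992"], "ec_union_member"
)
union_triples = [
    ("ec_union", "rdf:type", "owl:Class"),
    ("ec_union", "owl:unionOf", head),
] + list_triples
```

The same helper would serve owl:intersectionOf, since both constructors take an RDF collection as their object.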
owl:intersectionOf
Class_Name: 'Abnormal cervical smear (OMOP_434165)'
Class Expression Syntax: ('Abnormal cell morphology' and 'Abnormality of the uterine cervix')
New Triples:
omop2obo: <https://github.com/callahantiff/omop2obo/obo/ext/>
obo: <http://purl.obolibrary.org/obo/>
oboInOwl: <http://www.geneontology.org/formats/oboInOwl#>
owl: <http://www.w3.org/2002/07/owl#>
rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
rdfs: <http://www.w3.org/2000/01/rdf-schema#>
omop2obo:OMOP_434165, oboInOwl:hasOBONamespace, OMOP2OBO
omop2obo:OMOP_434165, oboInOwl:id, OMOP:434165
omop2obo:OMOP_434165, rdfs:label, "Abnormal cervical smear"
omop2obo:OMOP_434165, rdf:type, owl:Class
omop2obo:OMOP_434165, owl:equivalentClass, ec1
ec1, rdf:type, owl:Restriction
ec1, owl:onProperty, obo:BFO_0000051 # has part
ec1, owl:someValuesFrom, ec_intersection1
ec_intersection1, rdf:type, owl:Class
## Abnormality of the uterine cervix
ec_intersection1, owl:intersectionOf, ec_intersection_member1
ec_intersection_member1, rdf:first, obo:HP_0012888
## Abnormal cell morphology
ec_intersection_member1, rdf:rest, ec_intersection_member2
ec_intersection_member2, rdf:first, obo:HP_0025461
ec_intersection_member2, rdf:rest, rdf:nil
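One way to sanity-check a set of generated triples is to walk the RDF collection back into the class expression syntax and compare it with the intended definition. The sketch below does this for the intersection example; the function names and the `labels` lookup are hypothetical, and only owl:intersectionOf and owl:unionOf are handled.

```python
def read_rdf_list(triples, head):
    """Follow rdf:first / rdf:rest links from head until rdf:nil."""
    first = {s: o for s, p, o in triples if p == "rdf:first"}
    rest = {s: o for s, p, o in triples if p == "rdf:rest"}
    members, node, seen = [], head, set()
    while node != "rdf:nil":
        if node in seen or node not in first or node not in rest:
            raise ValueError(f"malformed RDF list at {node!r}")
        seen.add(node)
        members.append(first[node])
        node = rest[node]
    return members


def class_expression(triples, class_node, labels):
    """Render an intersection or union class as ('A' and 'B') or ('A' or 'B')."""
    keywords = {"owl:intersectionOf": "and", "owl:unionOf": "or"}
    for s, p, o in triples:
        if s == class_node and p in keywords:
            names = [f"'{labels[m]}'" for m in read_rdf_list(triples, o)]
            return "(" + f" {keywords[p]} ".join(names) + ")"
    raise ValueError(f"{class_node!r} has no intersectionOf/unionOf triple")


triples = [
    ("ec_intersection1", "rdf:type", "owl:Class"),
    ("ec_intersection1", "owl:intersectionOf", "ec_intersection_member1"),
    ("ec_intersection_member1", "rdf:first", "obo:HP_0012888"),
    ("ec_intersection_member1", "rdf:rest", "ec_intersection_member2"),
    ("ec_intersection_member2", "rdf:first", "obo:HP_0025461"),
    ("ec_intersection_member2", "rdf:rest", "rdf:nil"),
]
labels = {
    "obo:HP_0012888": "Abnormality of the uterine cervix",
    "obo:HP_0025461": "Abnormal cell morphology",
}
expression = class_expression(triples, "ec_intersection1", labels)
print(expression)
```

The members come out in list order, which differs from the order written in the Class Expression Syntax line; because intersection and union are commutative, the two forms are logically equivalent.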
Once the experiments described above were complete, a more complex representation was created that spans all ontologies for a given mapping, rather than creating ontology-specific mappings. Mappings that spanned multiple ontologies required additional content not currently included in each mapping, as well as new patterns specific to each clinical domain. Additional detail is included below, describing the steps taken to add mapping categories and evidence.
Ontologies: HPO, MONDO
Assumptions
- All classes created in OMOP2OBO namespace using the original OMOP concept identifier
- All mappings for Concepts Used in Practice (annotated with a mapping category other than Unmapped) that had at least 1 HPO or MONDO annotation
- All phenotypes are rdfs:subClassOf phenotypic abnormality (HP_0000118)
- All diseases are rdfs:subClassOf disease or disorder (MONDO_0000001)
- Use owl:equivalentClass for all 1:1 mappings where the HPO and MONDO annotations represented the same concept
- Use RO relations for all mappings where the HPO and MONDO annotations represent different diseases/phenotypes
Intra-Ontology Relations:
- HPO - phenotype of - MONDO
- MONDO - has phenotype - HPO
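The choice between owl:equivalentClass and the RO relations listed above can be sketched as a single decision. This is only an illustration under stated assumptions: the function name, the `same_concept` flag, and the placeholder class names are hypothetical, and the relations are written with their labels rather than as full OWL restrictions.

```python
def link_hpo_mondo(hpo_class, mondo_class, same_concept):
    """Return the linking statements for one HPO/MONDO annotation pair."""
    if same_concept:
        # 1:1 mapping where both annotations describe the same concept
        return [(hpo_class, "owl:equivalentClass", mondo_class)]
    # Different disease/phenotype: relate the two classes in both directions
    return [
        (hpo_class, "phenotype of", mondo_class),
        (mondo_class, "has phenotype", hpo_class),
    ]


equivalent = link_hpo_mondo("HPO_CLASS", "MONDO_CLASS", same_concept=True)
related = link_hpo_mondo("HPO_CLASS", "MONDO_CLASS", same_concept=False)
```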
Mapping Combinations
The following 3 mapping patterns were created:
Ontologies: CHEBI, PRO, NCBITaxon, VO
Assumptions
- All classes created in OMOP2OBO namespace using the original OMOP concept identifier
- For all Concepts Used in Clinical Practice drugs and ingredients with at least 1 CHEBI annotation
- Additional annotations are added to connect each ingredient to its RxNorm drug
- All new OMOP2OBO classes for drug ingredients are rdfs:subClassOf chemical entity (CHEBI_24431)
Intra-Ontology Relations:
Drugs and Ingredients
- DRUG - has component - INGREDIENT
Ingredients
- CHEBI - has part - PRO
- CHEBI - has part - VO
- CHEBI - in taxon - NCBITaxon
- PRO - in taxon - NCBITaxon
- VO - in taxon - NCBITaxon
Mapping Combinations
The following 13 mapping patterns were created:
Class Construction Heuristics
- Assigning NCBITaxon from spreadsheet:
  - If PRO and VO don't have a taxon assignment → NCBITaxon to both
  - If only PRO and it doesn't have a taxon assignment → NCBITaxon to PRO
  - If only VO and it doesn't have a taxon assignment → NCBITaxon to VO
  - If CHEBI → NCBITaxon to CHEBI
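The taxon heuristics can be read as a small decision procedure. The sketch below is one possible reading, not the project's code: it assumes the CHEBI rule applies only when neither PRO nor VO is annotated, and the function name and arguments are hypothetical.

```python
def taxon_targets(present, already_has_taxon=()):
    """Return the ontologies whose class should receive the NCBITaxon annotation.

    present: ontologies annotated on the mapping, e.g. {"CHEBI", "PRO"}
    already_has_taxon: ontologies whose class already carries a taxon assignment
    """
    done = set(already_has_taxon)
    protein_or_vaccine = [o for o in ("PRO", "VO") if o in present]
    if protein_or_vaccine:
        # PRO and/or VO receive the taxon unless they already have one
        return [o for o in protein_or_vaccine if o not in done]
    if "CHEBI" in present:
        # Assumed fallback: the taxon attaches to CHEBI when it stands alone
        return ["CHEBI"]
    return []


def taxon_triples(present, taxon, already_has_taxon=()):
    """Build 'in taxon' statements for every target ontology."""
    return [(o, "in taxon", taxon) for o in taxon_targets(present, already_has_taxon)]


both = taxon_targets({"CHEBI", "PRO", "VO"})
only_pro = taxon_targets({"CHEBI", "PRO"})
only_chebi = taxon_targets({"CHEBI"})
```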
Ontologies: HPO, CHEBI, UBERON, NCBITaxon, CL, PRO
Assumptions
- All classes created in OMOP2OBO namespace using the original OMOP concept identifier
- For all lab test results (annotated with a mapping category other than Unmapped) with at least 1 HPO annotation and at least 1 UBERON annotation
- Because the mappings are to lab test results and the LOINC code assigned to each test is known, additional annotations were added to connect each lab test result to its LOINC measurement_concept_id. To do this, each original OMOP concept was appended with the result type (i.e., _NORMAL, _LOW, _HIGH or _NEGATIVE, _POSITIVE)
- All new OMOP2OBO classes for measurements are rdfs:subClassOf phenotypic abnormality (HP_0000118)
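The result-type suffixing described in the assumptions can be sketched as follows. The identifier layout `OMOP_<concept id>_<RESULT>` and the concept identifier `1234567` are assumptions made for illustration; only the five suffixes come from the text.

```python
RESULT_TYPES = ("NORMAL", "LOW", "HIGH", "NEGATIVE", "POSITIVE")


def result_class_id(concept_id, result_type):
    """Append the result type to the original OMOP concept identifier."""
    suffix = result_type.upper()
    if suffix not in RESULT_TYPES:
        raise ValueError(f"unknown result type: {result_type!r}")
    return f"OMOP_{concept_id}_{suffix}"


def lab_test_triples(concept_id, result_types):
    """Connect a lab test to each of its result classes via 'has output'."""
    test = f"OMOP_{concept_id}"
    return [(test, "has output", result_class_id(concept_id, r)) for r in result_types]


triples = lab_test_triples("1234567", ["NORMAL", "LOW", "HIGH"])
for s, p, o in triples:
    print(s, p, o)
```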
Intra-Ontology Relations:
Lab Test and Results
- LAB TEST - has output - LAB TEST RESULT
Lab Test Results
- HPO - has part that occurs in - UBERON
- HPO - has part - CL
- HPO - has part - PRO
- HPO - has part - CHEBI
- UBERON - in taxon - NCBITaxon
- CL - in taxon - NCBITaxon
- PRO - in taxon - NCBITaxon
- CHEBI - in taxon - NCBITaxon
Mapping Combinations
The following 20 mapping patterns were created. Note that a dashed line is used to indicate multiple patterns. There are two special cases of the patterns shown below: (1) IgE antibody tests and (2) IgA, IgD, IgG, and IgM (i.e., Antibody, but not IgE). These specific patterns are also demonstrated below.
Class Construction Heuristics
- Assigning NCBITaxon from spreadsheet:
  - If only PRO and it doesn't have a taxon assignment → NCBITaxon to PRO
  - If CHEBI → NCBITaxon to CHEBI
  - All UBERON → NCBITaxon_9606
  - All CL → NCBITaxon_9606
For the initial release, mapping categories and evidence are represented as class annotations, similar to how synonyms and dbxrefs are annotated to Open Biomedical Ontology Foundry ontology classes. Each annotation includes metadata for the original OMOP concepts, the original OBO concepts, the OMOP Common Data Model (CDM) version used, the ontology version dates, and the URL for the current OMOP2OBO release. Examples for each major type of evidence are shown below:
Mapping categories added as class annotation.
Evidence can come in the following forms:
- OBO DbXRef to OMOP Source Code
  - OBO_DbXRef-OMOP_CONCEPT_SOURCE_CODE:xxxxxxx
  - OBO_DbXRef-OMOP_ANCESTOR_SOURCE_CODE:xxxxxxx
- OBO Label to OMOP Synonym or Label
  - OBO_LABEL-OMOP_CONCEPT_LABEL:xxxxxxx
  - OBO_LABEL-OMOP_ANCESTOR_LABEL:xxxxxxx
  - OBO_LABEL-OMOP_CONCEPT_SYNONYM:xxxxxxx
  - OBO_LABEL-OMOP_ANCESTOR_SYNONYM:xxxxxxx
- OBO Synonym to OMOP Synonym or Label
  - OBO_hasSynonymType-OMOP_CONCEPT_LABEL:xxxxxxx
  - OBO_hasSynonymType-OMOP_ANCESTOR_LABEL:xxxxxxx
  - OBO_hasSynonymType-OMOP_CONCEPT_SYNONYM:xxxxxxx
  - OBO_hasSynonymType-OMOP_ANCESTOR_SYNONYM:xxxxxxx
- Concept Similarity Score → CONCEPT_SIMILARITY:OBO_URI_x.x
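The evidence strings listed above follow a regular shape, so they can be split into their parts before being turned into annotations. The parser below is a minimal sketch that assumes the exact layouts shown (an `OBO_<field>-OMOP_<level>_<field>:<value>` form and a `CONCEPT_SIMILARITY:<OBO URI>_<score>` form); the function name and the returned dictionary keys are hypothetical.

```python
def parse_evidence(evidence):
    """Split one evidence string into its OBO side, OMOP side, and value."""
    kind, sep, value = evidence.partition(":")
    if not sep or not value:
        raise ValueError(f"missing ':' separator in {evidence!r}")
    if kind == "CONCEPT_SIMILARITY":
        # The score is the text after the final underscore of the value
        obo_uri, _, score = value.rpartition("_")
        return {"type": "similarity", "obo": obo_uri, "score": float(score)}
    obo_side, dash, omop_side = kind.partition("-")
    if not dash or not obo_side.startswith("OBO_") or not omop_side.startswith("OMOP_"):
        raise ValueError(f"unrecognized evidence layout: {evidence!r}")
    # OMOP side is <CONCEPT|ANCESTOR>_<SOURCE_CODE|LABEL|SYNONYM>
    level, _, field = omop_side[len("OMOP_"):].partition("_")
    return {
        "type": "match",
        "obo_field": obo_side[len("OBO_"):],
        "omop_level": level,
        "omop_field": field,
        "value": value,
    }


dbxref = parse_evidence("OBO_DbXRef-OMOP_CONCEPT_SOURCE_CODE:ABC_1234567")
similarity = parse_evidence("CONCEPT_SIMILARITY:OBO_URI_1.0")
```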
REPRESENTATION
DbXRef Example 1: OBO_DbXRef-OMOP_CONCEPT_SOURCE_CODE:ABC_1234567
Pattern for all DbXref evidence to an OMOP concept.
class_id SKOS:exactMatch ABC_1234567
BNode owl:annotatedSource class_id
BNode owl:annotatedProperty SKOS:exactMatch
BNode owl:annotatedTarget ABC_1234567
BNode oboInOwl:source "Mapping Category"
BNode oboInOwl:source OBO_xxxxxxx
BNode oboInOwl:source "OBO:version date"
BNode oboInOwl:source OMOP_xxxxxxx
BNode oboInOwl:source "OMOP: common data model v5.0"
BNode oboInOwl:source http://omop2obo/wikiv1
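Each evidence pattern on this page has the same shape: one asserted triple plus a blank node that points back at it and carries the metadata. A minimal sketch of a builder for that shape is given below. The function name is hypothetical, and the rdf:type owl:Axiom line is an addition not shown in the listings; it is included because the OWL 2 RDF mapping types annotation nodes that way.

```python
def annotated_axiom(subject, prop, target, sources, bnode="BNode"):
    """Return the asserted triple plus its annotation-axiom triples."""
    triples = [
        (subject, prop, target),
        # Not in the listings: OWL 2 types the annotation node as owl:Axiom
        (bnode, "rdf:type", "owl:Axiom"),
        (bnode, "owl:annotatedSource", subject),
        (bnode, "owl:annotatedProperty", prop),
        (bnode, "owl:annotatedTarget", target),
    ]
    # Every piece of metadata is attached with oboInOwl:source
    triples += [(bnode, "oboInOwl:source", source) for source in sources]
    return triples


example = annotated_axiom(
    "class_id",
    "SKOS:exactMatch",
    "ABC_1234567",
    [
        '"Mapping Category"',
        "OBO_xxxxxxx",
        '"OBO:version date"',
        "OMOP_xxxxxxx",
        '"OMOP: common data model v5.0"',
        "http://omop2obo/wikiv1",
    ],
)
for s, p, o in example:
    print(s, p, o)
```

The other evidence types differ only in the property, the target, and the list of sources passed in.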
DbXRef Example 2: OBO_DbXRef-OMOP_ANCESTOR_SOURCE_CODE:ABC_1234567
Pattern for all DbXref evidence that includes an OMOP concept ancestor.
class_id oboInOwl:hasDbXref ABC_1234567
BNode owl:annotatedSource class_id
BNode owl:annotatedProperty oboInOwl:hasDbXref
BNode owl:annotatedTarget ABC_1234567
BNode oboInOwl:source "Mapping Category"
BNode oboInOwl:source OBO_xxxxxxx
BNode oboInOwl:source "OBO:version date"
BNode oboInOwl:source OMOP_xxxxxxx
BNode oboInOwl:source "OMOP: common data model v5.0"
BNode oboInOwl:source http://omop2obo/wikiv1
Label Example: OBO_LABEL-OMOP_CONCEPT_LABEL:xxxxxxx
All OBO-OMOP label matches (even those to concept ancestors) will utilize SKOS:exactMatch since this type of match only happens when the OBO and OMOP strings match exactly.
class_id SKOS:exactMatch OMOP_1234567
BNode owl:annotatedSource class_id
BNode owl:annotatedProperty SKOS:exactMatch
BNode owl:annotatedTarget OMOP_1234567
BNode oboInOwl:source "Mapping Category"
BNode oboInOwl:source OBO_xxxxxxx
BNode oboInOwl:source "OBO:version date"
BNode oboInOwl:source "LABEL STRING"
BNode oboInOwl:source "OMOP: common data model v5.0"
BNode oboInOwl:source http://omop2obo/wikiv1
OBO Synonym Example: OBO_hasSynonymType-OMOP_CONCEPT_LABEL:xxxxxxx
This is the pattern for all OBO synonym matches. Note that this example uses a generic oboInOwl:hasSynonymType; the actual axioms will use the specific types recorded from each matched ontology.
class_id oboInOwl:hasSynonymType "Synonym string"
BNode owl:annotatedSource class_id
BNode owl:annotatedProperty oboInOwl:hasSynonymType
BNode owl:annotatedTarget "Synonym string"
BNode oboInOwl:source "Mapping Category"
BNode oboInOwl:source OBO_xxxxxxx
BNode oboInOwl:source "OBO:version date"
BNode oboInOwl:source OMOP_xxxxxxx
BNode oboInOwl:source "OMOP: common data model v5.0"
BNode oboInOwl:source http://omop2obo/wikiv1
Similarity Example: CONCEPT_SIMILARITY:OBO_URI_1.0
The pattern for all cosine similarity-generated evidence uses the RO property 'is evidence with support from' (RO_0002614) with the NCIT class Cosine Distance Method (NCIT_C27662). The metadata sources are also extended to include the similarity score float value.
class_id obo:RO_0002614 NCIT_C27662
BNode owl:annotatedSource class_id
BNode owl:annotatedProperty RO_0002614
BNode owl:annotatedTarget NCIT_C27662
BNode oboInOwl:source "Cosine similarity score of x.x derived from applying a Bag-Of-Words TF-IDF vector space model to all available OMOP and OBO labels and synonyms"
BNode oboInOwl:source "Mapping Category"
BNode oboInOwl:source OBO_xxxxxxx
BNode oboInOwl:source "OBO:version date"
BNode oboInOwl:source OMOP_xxxxxxx
BNode oboInOwl:source "OMOP: common data model v5.0"
BNode oboInOwl:source http://omop2obo/wikiv1
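The similarity annotation refers to a cosine score computed over a Bag-of-Words TF-IDF model. The sketch below shows the general technique in plain Python; it is not the OMOP2OBO implementation, and the whitespace tokenizer, the smoothed IDF formula, and the sample labels are assumptions chosen to keep the example short.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Build one sparse TF-IDF vector (token -> weight) per document."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each token appears
    doc_freq = Counter(token for tokens in tokenized for token in set(tokens))
    # Smoothed inverse document frequency (an assumed, common variant)
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}
    vectors = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        vectors.append({t: term_freq[t] * idf[t] for t in term_freq})
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b.get(token, 0.0) for token, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


labels = [
    "abnormal cervical smear",
    "Abnormal cervical smear",
    "abnormality of tibia morphology",
]
vectors = tfidf_vectors(labels)
same = cosine(vectors[0], vectors[1])
different = cosine(vectors[0], vectors[2])
```

Labels that differ only in case score as identical here, while labels sharing no tokens score zero; the score recorded in the annotation would sit between those extremes.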