TRAPI validation issues with MolePro #878

Open
colleenXu opened this issue Oct 1, 2024 · 8 comments
Labels
external Requires fixes to an external service

Comments

@colleenXu
Collaborator

colleenXu commented Oct 1, 2024

Related to #865

Noticed in https://arax.ci.transltr.io/?r=c961069f-36da-4369-a141-3aad9234e5ca.
I downloaded BTE's response (bte-ci-pf2-validationProblem.json.zip) and ran TRAPI validation on it locally with a notebook.

(TRAPI "orange" errors don't prevent the ARS from using BTE's response. The "red" critical errors do prevent the ARS from using it.)

Error: invalid KL/AT values

These edges use unspecified as the knowledge_level / agent_type value. The valid value to use is not_provided.

Examples

See edges:

  • 59616f9c131b5816c63ee3fb4d109ee6 (KL)
  • 3c92b53ccab80dca45b5aa9cef449e18 (both)
		Error

		* Knowledge Graph Edge Knowledge Level:
		=> The indicated 'knowledge_level' slot value is invalid for the given edge

			$ global
				# unspecified
				- context: 
					CHEBI:45713[biolink:SmallMolecule]--biolink:affects->NCBIGene:216[biolink:Gene]
					CHEBI:45713[biolink:SmallMolecule]--biolink:affects->NCBIGene:3553[biolink:Gene]
					CHEBI:45713[biolink:SmallMolecule]--biolink:affects->NCBIGene:3315[biolink:Gene]
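
To find the affected edges without reading the whole validation summary, a minimal sketch like the following can scan a TRAPI knowledge graph for KL/AT attributes that carry the invalid value. The function name and the two example edges are hypothetical and only illustrate the "(KL)" and "(both)" cases listed above. This is not how the validator itself works.

```python
from typing import Dict, List

# The two attribute types discussed in this issue.
KL_AT_TYPES = ("biolink:knowledge_level", "biolink:agent_type")


def find_kl_at_with_value(
    knowledge_graph: dict, bad_value: str = "unspecified"
) -> Dict[str, List[str]]:
    """Map edge ID -> KL/AT attribute type IDs whose value equals bad_value."""
    flagged: Dict[str, List[str]] = {}
    for edge_id, edge in knowledge_graph.get("edges", {}).items():
        hits = [
            attr["attribute_type_id"]
            for attr in edge.get("attributes") or []
            if attr.get("attribute_type_id") in KL_AT_TYPES
            and attr.get("value") == bad_value
        ]
        if hits:
            flagged[edge_id] = hits
    return flagged


# Hypothetical edges: one with only an invalid KL, one with both invalid.
example_kg = {
    "edges": {
        "edge-a": {
            "attributes": [
                {"attribute_type_id": "biolink:knowledge_level", "value": "unspecified"},
                {"attribute_type_id": "biolink:agent_type", "value": "not_provided"},
            ]
        },
        "edge-b": {
            "attributes": [
                {"attribute_type_id": "biolink:knowledge_level", "value": "unspecified"},
                {"attribute_type_id": "biolink:agent_type", "value": "unspecified"},
            ]
        },
    }
}

for edge_id, bad_types in find_kl_at_with_value(example_kg).items():
    print(edge_id, bad_types)
```

Running it on `message.knowledge_graph` from the downloaded response should list the same edges the validator reports, assuming the response follows the usual TRAPI layout.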

Warning: multiple primary knowledge sources

According to Jackson's post below, this is coming from MolePro, not BTE.
Unsure if this is from BTE's handling (merging KG edges?) or originally from MolePro

I can also find this same edge when I query BTE in non-creative mode for CHEBI:45713 -(affects)-> NCBIGene:5743.

Edge from KP that triggers this warning

                "3c92b53ccab80dca45b5aa9cef449e18": {
                    "predicate": "biolink:affects",
                    "subject": "CHEBI:45713",
                    "object": "NCBIGene:5743",
                    "attributes": [
                        {
                            "attribute_source": "infores:pharos",
                            "attribute_type_id": "biolink:knowledge_level",
                            "attributes": [],
                            "original_attribute_name": "biolink:knowledge_level",
                            "value": "unspecified",
                            "value_type_id": "string"
                        },
                        {
                            "attribute_source": "infores:pharos",
                            "attribute_type_id": "biolink:agent_type",
                            "attributes": [],
                            "original_attribute_name": "biolink:agent_type",
                            "value": "unspecified",
                            "value_type_id": "string"
                        },
                        {
                            "attribute_type_id": "biolink:publications",
                            "value": [
                                "PMID:18487053",
                                "PMID:26850006",
                                "PMID:20527891",
                                "PMID:17604631"
                            ],
                            "value_type_id": "linkml:Uriorcurie"
                        }
                    ],
                    "sources": [
                        {
                            "resource_id": "infores:chembl",
                            "resource_role": "primary_knowledge_source",
                            "upstream_resource_ids": [],
                            "source_record_urls": []
                        },
                        {
                            "resource_id": "infores:gtopdb",
                            "resource_role": "primary_knowledge_source",
                            "upstream_resource_ids": [],
                            "source_record_urls": []
                        },
                        {
                            "resource_id": "infores:molepro",
                            "resource_role": "aggregator_knowledge_source",
                            "upstream_resource_ids": [
                                "infores:chembl",
                                "infores:gtopdb"
                            ],
                            "source_record_urls": []
                        },
                        {
                            "resource_id": "infores:biothings-explorer",
                            "resource_role": "aggregator_knowledge_source",
                            "upstream_resource_ids": [
                                "infores:molepro"
                            ]
                        }
                    ]
                },
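
The warning can be reproduced on a single edge by counting the sources whose resource_role is primary_knowledge_source. Below is a minimal sketch. The helper name is hypothetical, and the `sources` list is trimmed from the edge above (empty `upstream_resource_ids` and `source_record_urls` fields are dropped for brevity).

```python
from typing import List


def primary_sources(edge: dict) -> List[str]:
    """Return the resource IDs that the edge marks as primary knowledge sources."""
    return [
        source["resource_id"]
        for source in edge.get("sources") or []
        if source.get("resource_role") == "primary_knowledge_source"
    ]


# Sources trimmed from edge 3c92b53ccab80dca45b5aa9cef449e18 above.
edge = {
    "sources": [
        {"resource_id": "infores:chembl", "resource_role": "primary_knowledge_source"},
        {"resource_id": "infores:gtopdb", "resource_role": "primary_knowledge_source"},
        {
            "resource_id": "infores:molepro",
            "resource_role": "aggregator_knowledge_source",
            "upstream_resource_ids": ["infores:chembl", "infores:gtopdb"],
        },
        {
            "resource_id": "infores:biothings-explorer",
            "resource_role": "aggregator_knowledge_source",
            "upstream_resource_ids": ["infores:molepro"],
        },
    ]
}

primaries = primary_sources(edge)
if len(primaries) > 1:
    print("multiple primary knowledge sources:", primaries)
```

Both primaries appear in MolePro's upstream_resource_ids, which is consistent with the edge arriving from MolePro already in this shape.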

@tokebe
Member

tokebe commented Oct 2, 2024

The query graph in question:

{
  "message": {
    "query_graph": {
      "nodes": {
        "sn": {
          "ids": [
            "CHEBI:45713"
          ],
          "categories": [
            "biolink:SmallMolecule",
            "biolink:ChemicalEntity"
          ]
        },
        "on": {
          "ids": [
            "NCBIGene:2739"
          ],
          "categories": [
            "biolink:Gene",
            "biolink:Protein"
          ]
        },
        "un": {
          "categories": [
            "biolink:NamedThing"
          ]
        }
      },
      "edges": {
        "e2": {
          "subject": "sn",
          "object": "on",
          "predicates": [
            "biolink:related_to"
          ],
          "knowledge_type": "inferred"
        },
        "e0": {
          "subject": "sn",
          "object": "un",
          "predicates": [
            "biolink:related_to"
          ],
          "knowledge_type": "inferred"
        },
        "e1": {
          "subject": "un",
          "object": "on",
          "predicates": [
            "biolink:related_to"
          ],
          "knowledge_type": "inferred"
        }
      }
    }
  }
}

@tokebe
Member

tokebe commented Oct 2, 2024

I've confirmed that MolePro is returning edges with multiple primary sources:

curl -X POST \
"https://molepro-trapi.ci.transltr.io/molepro/trapi/v1.5/query" \
-H "Content-Type: application/json" \
-d '{"message": { "query_graph": { "nodes": { "n0": { "ids": ["CHEBI:45713"] }, "n1": { "ids": ["NCBIGene:5743"] } }, "edges": { "e0": { "subject": "n0", "object": "n1", "predicates": ["biolink:affects"] } } } }}' 
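
For anyone who prefers to repeat this from a notebook, here is a rough Python equivalent of the curl call using only the standard library. The function names are hypothetical. The URL and query are the ones from the curl command. The request is not sent on import, so call `post_trapi_query` explicitly.

```python
import json
import urllib.request

# Endpoint taken from the curl command above.
MOLEPRO_URL = "https://molepro-trapi.ci.transltr.io/molepro/trapi/v1.5/query"


def build_one_hop_query(subject_id: str, object_id: str, predicate: str) -> dict:
    """Build the same one-hop TRAPI query that the curl command sends."""
    return {
        "message": {
            "query_graph": {
                "nodes": {
                    "n0": {"ids": [subject_id]},
                    "n1": {"ids": [object_id]},
                },
                "edges": {
                    "e0": {
                        "subject": "n0",
                        "object": "n1",
                        "predicates": [predicate],
                    }
                },
            }
        }
    }


def post_trapi_query(url: str, query: dict, timeout: float = 60.0) -> dict:
    """POST the query as JSON and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


query = build_one_hop_query("CHEBI:45713", "NCBIGene:5743", "biolink:affects")
# To actually send it (requires network access to the CI endpoint):
# response = post_trapi_query(MOLEPRO_URL, query)
# edges = response["message"]["knowledge_graph"]["edges"]
```

The returned edges can then be inspected for more than one source with resource_role primary_knowledge_source.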

@colleenXu colleenXu added the external Requires fixes to an external service label Oct 3, 2024
@colleenXu
Collaborator Author

New Warning

MolePro is using qualifier values that aren't in the biolink model. See 10/20 automated test run https://arax.ci.transltr.io/?r=2dbf14eb-e50c-47d4-9e25-fb487df1d86a

	* Knowledge Graph Edge Qualifiers Qualifier Value:
		=> The 'qualifier_type_id' for edge has unresolved 'qualifier_value'
		        $ infores:drugbank -> infores:molepro -> infores:biothings-explorer
				# induction
				- edge_id | qualifier_type_id: 
					CHEBI:9943[biolink:SmallMolecule]--biolink:affects->NCBIGene:9429[biolink:Gene] | biolink:causal_mechanism_qualifier
					CHEBI:17026[biolink:SmallMolecule]--biolink:affects->NCBIGene:9429[biolink:Gene] | biolink:causal_mechanism_qualifier
			$ infores:ctd -> infores:molepro -> infores:biothings-explorer
				# susceptibility
				- edge_id | qualifier_type_id: 
					CHEBI:4784[biolink:SmallMolecule]--biolink:affects->NCBIGene:7422[biolink:Gene] | biolink:object_aspect_qualifier
					CHEBI:91408[biolink:SmallMolecule]--biolink:affects->NCBIGene:3569[biolink:Gene] | biolink:object_aspect_qualifier
					CHEBI:50140[biolink:MolecularMixture]--biolink:affects->NCBIGene:7099[biolink:Gene] | biolink:object_aspect_qualifier
				# alternative_form
				- edge_id | qualifier_type_id: 
					CHEBI:68534[biolink:SmallMolecule]--biolink:affects->NCBIGene:367[biolink:Gene] | biolink:object_form_or_variant_qualifier
					CHEBI:15756[biolink:SmallMolecule]--biolink:affects->NCBIGene:7494[biolink:Gene] | biolink:object_form_or_variant_qualifier
					CHEBI:52172[biolink:SmallMolecule]--biolink:affects->NCBIGene:7494[biolink:Gene] | biolink:object_form_or_variant_qualifier
				# analog
				- edge_id | qualifier_type_id: 
					CHEBI:2038[biolink:SmallMolecule]--biolink:affects->NCBIGene:367[biolink:Gene] | biolink:subject_form_or_variant_qualifier
					CHEBI:78543[biolink:SmallMolecule]--biolink:affects->NCBIGene:2045[biolink:Gene] | biolink:subject_form_or_variant_qualifier

@codewarrior2000

Are you referring to the Biolink Model documented here? These are the qualifiers listed in Biolink Model that MolePro uses:
- subject form or variant qualifier
- subject part qualifier
- subject derivative qualifier
- subject aspect qualifier
- subject context qualifier
- subject direction qualifier
- object form or variant qualifier
- object part qualifier
- object aspect qualifier
- object context qualifier
- object direction qualifier
- causal mechanism qualifier
- anatomical context qualifier
- qualified predicate
- species context qualifier

@colleenXu
Collaborator Author

colleenXu commented Oct 22, 2024

The TRAPI validation summary doesn't say. It only shows the unexpected qualifier values, not the attribute_type_ids you listed. It seems to be using biolink-model 4.2.1.

Based on my digging in the ARAX-UI/BTE-response I linked:

@codewarrior2000

codewarrior2000 commented Oct 22, 2024

Why are we looking at the biolink enums while the qualifiers are listed here?

@colleenXu
Collaborator Author

Let's communicate just on this issue, rather than here and on Slack.

Pasted from Slack:

There are multiple places in biolink-model where the qualifiers are set to particular value enums.
Like here and here
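
To make the enum point concrete, here is a minimal sketch of the kind of check I mean: each qualifier_type_id maps to a set of permitted values, and any qualifier_value outside that set is reported. The function name is hypothetical, and the allowed set below is an illustrative placeholder, not the real enum from biolink-model. In practice it would have to be loaded from the model version the validator uses.

```python
from typing import Dict, List, Set, Tuple


def unresolved_qualifiers(
    edge: dict, allowed: Dict[str, Set[str]]
) -> List[Tuple[str, str]]:
    """Return (qualifier_type_id, qualifier_value) pairs not in the allowed sets.

    Qualifier types without an entry in `allowed` are not checked.
    """
    problems: List[Tuple[str, str]] = []
    for qualifier in edge.get("qualifiers") or []:
        type_id = qualifier.get("qualifier_type_id")
        value = qualifier.get("qualifier_value")
        if type_id in allowed and value not in allowed[type_id]:
            problems.append((type_id, value))
    return problems


# Placeholder subset for illustration only; NOT the full biolink-model enum.
allowed_values = {
    "biolink:causal_mechanism_qualifier": {"inhibition"},
}

# Qualifier taken from the validation output earlier in this issue.
edge = {
    "subject": "CHEBI:9943",
    "predicate": "biolink:affects",
    "object": "NCBIGene:9429",
    "qualifiers": [
        {
            "qualifier_type_id": "biolink:causal_mechanism_qualifier",
            "qualifier_value": "induction",
        }
    ],
}

print(unresolved_qualifiers(edge, allowed_values))
```

This only covers qualifiers backed by a fixed enum. Whether the validator applies the same logic to every qualifier type is an assumption.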

@codewarrior2000

codewarrior2000 commented Oct 23, 2024

So, the ARAX Validation Results that @colleenXu copied and pasted into this issue on October 21st omitted the Edge Qualifier values from monarchinitiative and automat-robokop. The unabridged results for qualifier values (shown below) suggest that the validator itself could be the problem, since it also reports other Translator components as providing an unresolved 'qualifier_value'.

(from the 10/20 automated test run https://arax.ci.transltr.io/?r=2dbf14eb-e50c-47d4-9e25-fb487df1d86a)

* Knowledge Graph Edge Qualifiers Qualifier Value:
		=> The 'qualifier_type_id' for edge has unresolved 'qualifier_value'
			$ infores:hpo-annotations -> infores:monarchinitiative -> infores:automat-robokop -> infores:biothings-explorer
				# HP:0040282
				- edge_id | qualifier_type_id: 
					MONDO:0015564[biolink:Disease]--biolink:has_phenotype->HP:0001824[biolink:PhenotypicFeature] | biolink:frequency_qualifier
					MONDO:0015564[biolink:Disease]--biolink:has_phenotype->HP:0002729[biolink:PhenotypicFeature] | biolink:frequency_qualifier
					MONDO:0015564[biolink:Disease]--biolink:has_phenotype->HP:0100721[biolink:PhenotypicFeature] | biolink:frequency_qualifier
					MONDO:0015564[biolink:Disease]--biolink:has_phenotype->HP:0025142[biolink:PhenotypicFeature] | biolink:frequency_qualifier
					MONDO:0015564[biolink:Disease]--biolink:has_phenotype->HP:0002027[biolink:PhenotypicFeature] | biolink:frequency_qualifier
					MONDO:0015564[biolink:Disease]--biolink:has_phenotype->HP:0012378[biolink:PhenotypicFeature] | biolink:frequency_qualifier
					MONDO:0015564[biolink:Disease]--biolink:has_phenotype->MONDO:0002280[biolink:Disease] | biolink:frequency_qualifier
				# HP:0040283
				- edge_id | qualifier_type_id: 
					MONDO:0015564[biolink:Disease]--biolink:has_phenotype->HP:0030157[biolink:PhenotypicFeature] | biolink:frequency_qualifier
					MONDO:0015564[biolink:Disease]--biolink:has_phenotype->HP:0000952[biolink:PhenotypicFeature] | biolink:frequency_qualifier
					MONDO:0015564[biolink:Disease]--biolink:has_phenotype->HP:0025066[biolink:PhenotypicFeature] | biolink:frequency_qualifier
					MONDO:0015564[biolink:Disease]--biolink:has_phenotype->HP:0002017[biolink:PhenotypicFeature] | biolink:frequency_qualifier
					MONDO:0015564[biolink:Disease]--biolink:has_phenotype->HP:0008940[biolink:PhenotypicFeature] | biolink:frequency_qualifier
					MONDO:0015564[biolink:Disease]--biolink:has_phenotype->HP:0012735[biolink:PhenotypicFeature] | biolink:frequency_qualifier
					MONDO:0015564[biolink:Disease]--biolink:has_phenotype->HP:0003270[biolink:PhenotypicFeature] | biolink:frequency_qualifier
					MONDO:0015564[biolink:Disease]--biolink:has_phenotype->HP:0031500[biolink:PhenotypicFeature] | biolink:frequency_qualifier
					MONDO:0015564[biolink:Disease]--biolink:has_phenotype->MONDO:0004335[biolink:Disease] | biolink:frequency_qualifier
				# HP:0040284
				- edge_id | qualifier_type_id: 
					MONDO:0015564[biolink:Disease]--biolink:has_phenotype->HP:0012050[biolink:PhenotypicFeature] | biolink:frequency_qualifier
					MONDO:0015564[biolink:Disease]--biolink:has_phenotype->HP:0002094[biolink:PhenotypicFeature] | biolink:frequency_qualifier
					MONDO:0015564[biolink:Disease]--biolink:has_phenotype->HP:0000790[biolink:PhenotypicFeature] | biolink:frequency_qualifier
					MONDO:0015564[biolink:Disease]--biolink:has_phenotype->MONDO:0005201[biolink:Disease] | biolink:frequency_qualifier
					MONDO:0015564[biolink:Disease]--biolink:has_phenotype->MONDO:0003329[biolink:Disease] | biolink:frequency_qualifier
					MONDO:0015564[biolink:Disease]--biolink:has_phenotype->MONDO:0004565[biolink:Disease] | biolink:frequency_qualifier
					MONDO:0015564[biolink:Disease]--biolink:has_phenotype->MONDO:0002049[biolink:Disease] | biolink:frequency_qualifier
					MONDO:0015564[biolink:Disease]--biolink:has_phenotype->MONDO:0009692[biolink:Disease] | biolink:frequency_qualifier
					MONDO:0015564[biolink:Disease]--biolink:has_phenotype->MONDO:0001106[biolink:Disease] | biolink:frequency_qualifier
				# HP:0040281
				- edge_id | qualifier_type_id: 
					MONDO:0015564[biolink:Disease]--biolink:has_phenotype->HP:0002716[biolink:PhenotypicFeature] | biolink:frequency_qualifier
			$ infores:drugbank -> infores:molepro -> infores:biothings-explorer
				# induction
				- edge_id | qualifier_type_id: 
					CHEBI:9943[biolink:SmallMolecule]--biolink:affects->NCBIGene:9429[biolink:Gene] | biolink:causal_mechanism_qualifier
					CHEBI:17026[biolink:SmallMolecule]--biolink:affects->NCBIGene:9429[biolink:Gene] | biolink:causal_mechanism_qualifier
			$ infores:ctd -> infores:molepro -> infores:biothings-explorer
				# susceptibility
				- edge_id | qualifier_type_id: 
					CHEBI:4784[biolink:SmallMolecule]--biolink:affects->NCBIGene:7422[biolink:Gene] | biolink:object_aspect_qualifier
					CHEBI:91408[biolink:SmallMolecule]--biolink:affects->NCBIGene:3569[biolink:Gene] | biolink:object_aspect_qualifier
					CHEBI:50140[biolink:MolecularMixture]--biolink:affects->NCBIGene:7099[biolink:Gene] | biolink:object_aspect_qualifier
				# alternative_form
				- edge_id | qualifier_type_id: 
					CHEBI:68534[biolink:SmallMolecule]--biolink:affects->NCBIGene:367[biolink:Gene] | biolink:object_form_or_variant_qualifier
					CHEBI:15756[biolink:SmallMolecule]--biolink:affects->NCBIGene:7494[biolink:Gene] | biolink:object_form_or_variant_qualifier
					CHEBI:52172[biolink:SmallMolecule]--biolink:affects->NCBIGene:7494[biolink:Gene] | biolink:object_form_or_variant_qualifier
				# analog
				- edge_id | qualifier_type_id: 
					CHEBI:2038[biolink:SmallMolecule]--biolink:affects->NCBIGene:367[biolink:Gene] | biolink:subject_form_or_variant_qualifier
					CHEBI:78543[biolink:SmallMolecule]--biolink:affects->NCBIGene:2045[biolink:Gene] | biolink:subject_form_or_variant_qualifier
