test alternate ordering of templates for Drug - treats - Disease creative mode query #758

Open
andrewsu opened this issue Nov 3, 2023 · 2 comments

Comments

@andrewsu
Member

andrewsu commented Nov 3, 2023

@mbrush noted:

I have noted that every time I see a BTE-reasoned prediction, all of the support paths seem to be of the form Drug - treats -> Phenotype -phenotype_of-> Disease (or this with an additional subclass_of edge as a third hop). I haven't come across any more molecular/mechanistic paths behind BTE predictions.

For Drug - treats - Disease creative mode queries, the template list that BTE uses is defined in templateGroups.json:

[
  {
    "name": "Drug treats Disease",
    "subject": ["Drug", "SmallMolecule", 
                "ChemicalEntity", "ComplexMolecularMixture", "MolecularMixture"
               ],
    "predicate": ["treats", "ameliorates"],
    "object": ["Disease", "PhenotypicFeature",
               "DiseaseOrPhenotypicFeature"
              ],
    "templates": [
      "Chem-treats-DoP.json",
      "Chem-treats-PhenoOfDisease.json",
      "Chem-regulates,affects-Gene-biomarker,associated_condition-DoP.json"
    ]
  },

So the phenotype-based template is the second template executed (after direct treats edges), and it appears that BTE very often fills its entire answer list with entries from this template, so the other templates in our template library are not used.
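To make the suspected mechanism concrete, here is a minimal Python sketch. It is not BTE's actual code: `fill_answers`, the per-template result counts, and the cap of 500 are all hypothetical, chosen only to show how a shared cap that is filled in template order can starve later templates.

```python
from collections import Counter


def fill_answers(template_results, max_results):
    """Fill a capped answer list by visiting templates in list order.

    Hypothetical model: each template contributes as many results as
    still fit, and execution stops once the cap is reached.
    """
    answers = []
    for template, results in template_results:
        room = max_results - len(answers)
        if room <= 0:
            break  # cap reached: remaining templates never run
        answers.extend((template, r) for r in results[:room])
    return answers


# Hypothetical result counts per template (not measured data).
template_results = [
    ("Chem-treats-DoP.json", [f"direct-{i}" for i in range(20)]),
    ("Chem-treats-PhenoOfDisease.json", [f"pheno-{i}" for i in range(2000)]),
    (
        "Chem-regulates,affects-Gene-biomarker,associated_condition-DoP.json",
        [f"gene-{i}" for i in range(50)],
    ),
]

answers = fill_answers(template_results, max_results=500)
counts = Counter(template for template, _ in answers)
for template, _ in template_results:
    print(template, counts[template])
```

Under these assumed numbers the gene-based template contributes nothing, because the phenotype template alone exhausts the remaining room.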

In this issue, I propose that we systematically test the performance of each template (and some subset of template combinations) using the Benchmarks tool. Some systematic testing will give us a more data-driven basis for the selection and ordering of templates that BTE uses.
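As a rough sketch of what "each template and some subset of template combinations" could look like, the Python below enumerates every non-empty subset of the three templates in every order. The helper name `candidate_configs` is hypothetical, and the step that would run each configuration through the Benchmarks tool is deliberately left out, since its interface is not described here.

```python
from itertools import combinations, permutations

# Template file names taken from the templateGroups.json excerpt above.
TEMPLATES = [
    "Chem-treats-DoP.json",
    "Chem-treats-PhenoOfDisease.json",
    "Chem-regulates,affects-Gene-biomarker,associated_condition-DoP.json",
]


def candidate_configs(templates):
    """Yield every non-empty subset of templates, in every possible order."""
    for size in range(1, len(templates) + 1):
        for subset in combinations(templates, size):
            yield from permutations(subset)


configs = list(candidate_configs(TEMPLATES))
for config in configs:
    print(" > ".join(config))
```

For three templates this is a small, exhaustive list; with a larger template library the number of orderings grows quickly, so a sampled or hand-picked subset would likely be needed.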

@mbrush

mbrush commented Nov 3, 2023

I do think that we want to showcase results based on ALL of the BTE templates. From my testing, the ONLY template I see being used is the Chem-treats-Pheno-of-Disease one. The reason for this as I understand it is that the other templates BTE has created never get used because the phenotype-based one executes first, and fills up/times out the results before other templates can be executed. As a consequence, the only support paths we ever see from BTE in the UI are instances of this phenotype-based template.

Naively, I would propose the simplest solution: increase the limit on how many support paths can be returned, so that the query finds all the phenotype-based paths and can still execute the other templates and return support paths based on them. But I suspect there may be performance/timeout issues with this.

Alternatively, you could limit the number of results from the phenotype-based template in a way that leaves room for executing and returning paths based on the other templates. I suspect that you will find many fewer results from these other templates - based on what I know about knowledge sources serving the data needed for them, and what I've seen from other reasoners that employ similar templates/rules. But I do think these other templates represent convincing additional evidence that would be critical to surface when it is available.
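One way to picture this alternative is a per-template quota on top of the overall cap. The Python below is only a sketch under assumed numbers: `fill_with_quota`, the cap values, and the result counts are hypothetical and do not reflect how BTE is implemented, and execution time (the timeout side of the problem) is not modeled.

```python
from collections import Counter


def fill_with_quota(template_results, max_results, per_template_cap):
    """Fill a capped answer list, limiting how much any one template adds."""
    answers = []
    for template, results in template_results:
        room = max_results - len(answers)
        if room <= 0:
            break  # overall cap reached
        take = min(room, per_template_cap)
        answers.extend((template, r) for r in results[:take])
    return answers


# Hypothetical result counts per template (not measured data).
template_results = [
    ("Chem-treats-DoP.json", [f"direct-{i}" for i in range(20)]),
    ("Chem-treats-PhenoOfDisease.json", [f"pheno-{i}" for i in range(2000)]),
    (
        "Chem-regulates,affects-Gene-biomarker,associated_condition-DoP.json",
        [f"gene-{i}" for i in range(50)],
    ),
]

answers = fill_with_quota(template_results, max_results=500, per_template_cap=200)
counts = Counter(template for template, _ in answers)
for template, _ in template_results:
    print(template, counts[template])
```

With these assumed counts, the phenotype template is held to 200 results and the gene-based template's 50 results all make it into the answer list.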

I think the performance tests Andrew suggests would be a good place to start to assess the feasibility of these possible solutions, and/or surface alternative approaches to address the issue.

@colleenXu
Collaborator

colleenXu commented Dec 4, 2023

Adding some context:
