Combinatorial explosion in the number of answers returned to a query #33

karafecho · 2022-10-25T21:13:59Z

This issue is to formally report a known Translator issue, namely, a tendency for answer sets to explode combinatorially with certain types of queries.

For instance, during the October 2022 QotM, Translator team members found that querying for direct connections between ATP1A3 and chemical entities or diseases yields a reasonable number of results; however, when intermediary genes and pathways are added, the answer sets explode and become unmanageable.

Example from comment posted by @colleenXu here:

"Not sure how to get from ATP1A3 -> related genes -> ChemicalEntity, Procedure, Treatment in a way that doesn't explode / become unmanageable

Pathways / BiologicalProcessOrActivity...caused explosions since they were linked to pathways that had lots of genes"
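To make the multiplication concrete, here is a minimal sketch on a toy graph. The node names, the fan-out numbers, and the helpers `build_toy_graph` and `count_paths` are all hypothetical and chosen only for illustration; they are not real Translator data or APIs.

```python
from collections import defaultdict

def count_paths(edges, start, hops):
    """Count walks of exactly `hops` directed edges starting at `start`."""
    adjacency = defaultdict(set)
    for subject, obj in edges:
        adjacency[subject].add(obj)
    frontier = {start: 1}  # node -> number of walks ending at that node
    for _ in range(hops):
        next_frontier = defaultdict(int)
        for node, n_walks in frontier.items():
            for neighbor in adjacency[node]:
                next_frontier[neighbor] += n_walks
        frontier = next_frontier
    return sum(frontier.values())

def build_toy_graph(n_pathways, genes_per_pathway, chemicals_per_gene):
    """Hypothetical fan-out: ATP1A3 -> pathways -> genes -> chemicals."""
    edges = []
    for p in range(n_pathways):
        pathway = f"pathway{p}"
        edges.append(("ATP1A3", pathway))
        for g in range(genes_per_pathway):
            gene = f"gene{p}_{g}"
            edges.append((pathway, gene))
            for c in range(chemicals_per_gene):
                edges.append((gene, f"chem{p}_{g}_{c}"))
    return edges

# Illustrative numbers only: 5 pathways, each with 200 genes, each gene with 10 chemicals.
edges = build_toy_graph(n_pathways=5, genes_per_pathway=200, chemicals_per_gene=10)
one_hop = count_paths(edges, "ATP1A3", 1)    # pathways reached directly
three_hop = count_paths(edges, "ATP1A3", 3)  # pathway -> gene -> chemical answers
print(one_hop, three_hop)
```

With these assumed fan-outs, the one-hop query has 5 answers while the three-hop query has 5 × 200 × 10 = 10,000, which is the kind of growth a pathway with many genes produces.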

sierra-moxon · 2023-01-27T17:50:27Z

From TAQA:

This is a huge issue; we probably need to break it down.
UI is working on fixes here, where you could eliminate a node and reduce the hairball.

from Sharat: four issues to be broken down into

filter controls for the user - UI group
merging records - already happening at different levels (ARS, agents, etc.).
grouping records - travel up the ontology and give it to me above. - is this a UI issue? Andy: will be working on it for sure (we need other input)
scoring records - still need some way of bringing the best to the top. - O&O
user workflows - can we help the user refine their query (e.g. if two ARAs return the same result, etc.)?
- have a formal way to communicate this (all ARAs do it the same way) (information content < x, for example)
- ask the TRAPI folks for a way to return the "cap"
- ask the UI to be able to return the "cap" to the user
- pagination to the architecture group for discussion
big "hub" nodes are taken into account in the ARA - this is a tunable parameter.
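One way to picture the "cap" idea from the notes above is the sketch below. It is only an illustration under stated assumptions: the result dictionaries, the `node_ic` and `score` fields, and the function `apply_information_content_cap` are hypothetical, and nothing here reflects an agreed TRAPI or ARA mechanism.

```python
def apply_information_content_cap(results, min_ic, max_results):
    """Drop results that pass through low-information (hub-like) nodes,
    keep the top-scoring remainder, and report what the cap removed."""
    # A result is kept only if every node on it meets the information content threshold.
    kept = [r for r in results if min(r["node_ic"]) >= min_ic]
    kept.sort(key=lambda r: r["score"], reverse=True)
    page = kept[:max_results]
    cap_info = {
        "min_information_content": min_ic,
        "max_results": max_results,
        "dropped_low_ic": len(results) - len(kept),
        "truncated": len(kept) - len(page),
    }
    return page, cap_info

# Hypothetical results; IDs, scores, and information content values are made up.
results = [
    {"id": "r1", "score": 0.90, "node_ic": [80.0, 65.0]},
    {"id": "r2", "score": 0.80, "node_ic": [80.0, 20.0]},  # passes through a hub-like node
    {"id": "r3", "score": 0.70, "node_ic": [75.0, 70.0]},
    {"id": "r4", "score": 0.95, "node_ic": [90.0, 60.0]},
]
page, cap_info = apply_information_content_cap(results, min_ic=50.0, max_results=2)
print([r["id"] for r in page], cap_info)
```

Returning `cap_info` alongside the results is the point of the sketch: the user (or the UI) can see that a cap was applied and how many answers it hid, instead of silently receiving a shortened list.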

big picture -> deep dive is important

from Chris B: perhaps another issue here is: this is a known query with many results - sorry. Or, can we filter/sort our way out of this one? - is this doable? Suggest to the user that they tighten this up. Here are the common predicates associated with the answers you're getting back, can we try to help the user write a better query?
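The suggestion to show users the common predicates in their answers could be sketched roughly as follows. The edge dictionaries and the function `summarize_predicates` are hypothetical, and the predicate values are only illustrative examples, not output from a real query.

```python
from collections import Counter

def summarize_predicates(answer_edges, top_n=3):
    """Return the most frequent predicates among the answer edges."""
    counts = Counter(edge["predicate"] for edge in answer_edges)
    return counts.most_common(top_n)

# Hypothetical answer edges; a real answer set would be far larger.
answer_edges = (
    [{"predicate": "biolink:participates_in"}] * 6
    + [{"predicate": "biolink:interacts_with"}] * 3
    + [{"predicate": "biolink:treats"}] * 1
)
predicate_summary = summarize_predicates(answer_edges)
for predicate, count in predicate_summary:
    print(f"{predicate}: {count} edges")
```

A summary like this could accompany a "too many results" message, so the user can see which predicates dominate the answer set and restrict the query to the ones they care about.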

from Sharat: agree; this is the best we can do, here are ways to tighten it up.
work on user workflows for two big queries.
Andy is interested in more brainstorming on this; UI needs direction.
In the end, there is one place where the quality of the results is the measurement (either UI/O&O or someplace we could get it).

Andrew: Big "hub" nodes are taken into account in the Normalized Google Distance (which is used in scoring by BTE and ARAX) - this is a tunable parameter.
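For reference, the standard Normalized Google Distance formula can be sketched as below. This is the textbook definition, not necessarily how BTE or ARAX compute it; the occurrence counts are made-up numbers, and treating them as something like literature co-occurrence counts is an assumption.

```python
import math

def normalized_google_distance(count_x, count_y, count_xy, total):
    """Textbook NGD from occurrence counts.

    count_x, count_y: number of documents mentioning each term
    count_xy: number of documents mentioning both terms
    total: total number of documents in the corpus
    """
    if count_xy == 0:
        return math.inf  # terms never co-occur
    log_x, log_y = math.log(count_x), math.log(count_y)
    log_xy = math.log(count_xy)
    return (max(log_x, log_y) - log_xy) / (math.log(total) - min(log_x, log_y))

# Illustrative counts only: the same co-occurrence count, but in the second case
# one term is a "hub" that appears in very many documents.
specific_pair = normalized_google_distance(100, 100, 50, 1_000_000)
hub_pair = normalized_google_distance(100, 100_000, 50, 1_000_000)
print(specific_pair, hub_pair)
```

Because the numerator uses the larger of the two individual counts, a very frequent hub term pushes the distance up for the same number of co-occurrences, so edges through hubs tend to be scored as weaker associations.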

sharatisrani · 2023-02-14T23:32:30Z

For the case of grouping records, O&O has a tracking issue at NCATSTranslator/Ordering-Organizing#15, with a few additional comments.
For the case of scoring records, O&O has a tracking issue at NCATSTranslator/Ordering-Organizing#6

sharatisrani · 2023-07-13T22:03:49Z

This is a major issue, but how likely is it to bite us for the September release?

karafecho · 2023-07-14T17:32:11Z

I think this is being addressed as described here and recorded here.

karafecho added the "O&O issue" (ordering & organizing issue) label Oct 25, 2022

sierra-moxon mentioned this issue Jan 4, 2023

Combinatorial explosion in answers #78

Closed

sierra-moxon assigned sharatisrani Feb 16, 2023

sierra-moxon added this to the October - 2023 milestone Jun 1, 2023

sierra-moxon closed this as completed May 16, 2024
