# implement configurable cap on number of entities being tracked #324

For longer and/or open-ended queries, the number of entities being tracked by BTE can grow absurdly large. These cases may contribute to out-of-memory errors and server instability. As one possible solution, we could implement a configurable cap on the number of entities being tracked by BTE. If that cap is exceeded at any point in the execution, BTE could respond with an error and exit gracefully.

#323 may contain a possible example query to test.
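As a rough sketch of how such a cap might work (everything here — the `BTE_ENTITY_CAP` environment variable, `EntityCapExceededError`, `trackEntities` — is a hypothetical illustration, not BTE's actual implementation):

```typescript
// Hypothetical sketch of a configurable entity cap; not the actual
// bte_trapi_query_graph_handler implementation.

// Cap is configurable via an (assumed) environment variable, defaulting to 1000.
const ENTITY_CAP: number = Number(process.env.BTE_ENTITY_CAP ?? 1000);

class EntityCapExceededError extends Error {
  constructor(count: number, cap: number) {
    super(
      `Query aborted: ${count} entities are being tracked, ` +
        `exceeding the configured cap of ${cap}.`
    );
    this.name = "EntityCapExceededError";
  }
}

// Called after each edge of the query graph is executed: merge the newly
// found entity IDs into the tracked set, then fail fast if the cap is hit.
function trackEntities(tracked: Set<string>, newIds: Iterable<string>): void {
  for (const id of newIds) tracked.add(id);
  if (tracked.size > ENTITY_CAP) {
    throw new EntityCapExceededError(tracked.size, ENTITY_CAP);
  }
}
```

Presumably the query handler would catch this error and translate it into a TRAPI error response rather than letting the process crash.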
## Comments

Several options were explored; exactly what the "some limit" is depended on which option was chosen and on what was happening in the query specified in #323.

---
Will be addressed in biothings/bte_trapi_query_graph_handler#53.

---
PR has been merged and deployed. Closing...

---
@marcodarko to add a sample query where this threshold will be triggered, and a sample output showing the error message.

---
I believe Marco is still working on this issue (quote from Slack).

---
**Other examples**

Note that I'm using a local api list (removes pending biothings apis except for clinical risk kp api / multiomics wellness) for all of these examples.

**Example 1**

This 1-hop query returns just over 1000 IDs (1060). Therefore, we expect an error to be triggered if we add another hop that uses those 1060 IDs as input. This does happen; the returned response has the query in it.

**Example 2**

Other queries that correctly trigger the exception are any Workflow B.1 queries with an e03 predict edge, since the number of genes is too large to use as input to another step. Note that this is likely to fail at an earlier edge if the full api list is used. An example of a query that fails is Demo B.1.

A related query to B.1 would previously crash our programs because the computer/server would run out of memory. It now correctly fails. The query:

```json
{
  "message": {
    "query_graph": {
      "nodes": {
        "n0": {
          "ids": ["MONDO:0005359", "SNOMEDCT:197354009"],
          "categories": ["biolink:DiseaseOrPhenotypicFeature"]
        },
        "n1": { "categories": ["biolink:DiseaseOrPhenotypicFeature"] },
        "n2": { "categories": ["biolink:Gene"] },
        "n3": { "categories": ["biolink:Drug", "biolink:SmallMolecule"] }
      },
      "edges": {
        "e01": {
          "subject": "n0",
          "object": "n1",
          "predicates": ["biolink:has_real_world_evidence_of_association_with"]
        },
        "e02": {
          "subject": "n2",
          "object": "n1",
          "predicates": ["biolink:gene_associated_with_condition"]
        },
        "e03": {
          "subject": "n3",
          "object": "n2",
          "predicates": ["biolink:affects", "biolink:interacts_with"]
        }
      }
    }
  }
}
```

**Example 3**

This query seems to run fully (doesn't hit the error). I believe that's correct because of the filtering down that happens with intersections (Explain style). The query:

```json
{
  "message": {
    "query_graph": {
      "nodes": {
        "n0": {
          "ids": ["CHEMBL.COMPOUND:CHEMBL1431"],
          "categories": ["biolink:SmallMolecule"]
        },
        "n1": { "categories": ["biolink:Protein"] },
        "n2": { "categories": ["biolink:Protein"] },
        "n3": {
          "ids": ["UniProtKB:P02794", "UniProtKB:P02792"],
          "categories": ["biolink:Protein"]
        }
      },
      "edges": {
        "e0": { "subject": "n0", "object": "n1" },
        "e1": { "subject": "n1", "object": "n2" },
        "e2": { "subject": "n2", "object": "n3" }
      }
    }
  }
}
```
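As a toy illustration of that intersection effect (the IDs and the `intersect` helper below are hypothetical; this is not BTE's actual code): in an Explain-style query, candidates for a middle node must be reachable from both pinned ends, so the candidate sets are intersected and the tracked entity count shrinks rather than grows.

```typescript
// Toy illustration: intersecting candidate sets shrinks the set of tracked
// entities, while chaining open-ended hops only grows it.
function intersect(a: Set<string>, b: Set<string>): Set<string> {
  return new Set([...a].filter((id) => b.has(id)));
}

// Hypothetical candidates for a middle node, expanded from each pinned end:
const fromN0 = new Set(["UniProtKB:P02794", "UniProtKB:P02792", "UniProtKB:P69905"]);
const fromN3 = new Set(["UniProtKB:P02792", "UniProtKB:P69905", "UniProtKB:P68871"]);

console.log(intersect(fromN0, fromN3));
// Set(2) { 'UniProtKB:P02792', 'UniProtKB:P69905' }
```

---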
Perhaps we could fail earlier in the process: sometimes, before the failure point, BTE takes a while with ID resolution because there are >60,000 IDs to send to the ID resolver. See #338 (comment).

---
Closing this. After discussion with Andrew, I'll clarify #338 and we'll see how things progress. If needed, we could add a cap (BTE would return a failure) related to ID resolution, using a multiplier of this entity cap (e.g. 10,000, i.e. 10 × the 1,000 entity cap).
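A sketch of what that earlier failure might look like (the multiplier value, names, and check location are all assumptions drawn from the comment above, not an actual implementation):

```typescript
// Hypothetical pre-check before calling the ID resolver: fail fast when the
// number of IDs to resolve exceeds a multiple of the entity cap
// (10 * 1000 = 10,000, per the suggestion above).
const ENTITY_CAP = 1000;
const ID_RESOLUTION_MULTIPLIER = 10;

function checkIdResolutionCap(ids: string[]): void {
  const cap = ID_RESOLUTION_MULTIPLIER * ENTITY_CAP;
  if (ids.length > cap) {
    throw new Error(
      `Query aborted before ID resolution: ${ids.length} IDs to resolve ` +
        `exceeds the cap of ${cap}.`
    );
  }
}
```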