
implement configurable cap on number of entities being tracked #324

Closed

andrewsu opened this issue Oct 20, 2021 · 9 comments

@andrewsu (Member)

For longer and/or open-ended queries, the number of entities being tracked by BTE can grow absurdly high. These cases may contribute to out-of-memory errors and server instability. As one possible solution, we could implement a configurable cap on the number of entities being tracked by BTE. If that cap is exceeded at any point in the execution, BTE could respond with an error and gracefully exit.

#323 may contain a possible example query to test.
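For illustration, a minimal sketch of what such a configurable cap could look like. This is not BTE's actual implementation; the `BTE_ENTITY_CAP` variable, `EntityLimitError` class, and `assertUnderCap` function are hypothetical names. The error text mirrors the message that shows up in the examples later in this thread.

```typescript
// Hypothetical sketch of a configurable entity cap; names are illustrative,
// not BTE's actual code.
const ENTITY_CAP: number = Number(process.env.BTE_ENTITY_CAP ?? 1000);

class EntityLimitError extends Error {
  constructor(edgeID: string, cap: number) {
    // Mirrors the error text observed later in this thread:
    // "Max number of entities exceeded (1000) in 'e1'"
    super(`Max number of entities exceeded (${cap}) in '${edgeID}'`);
    this.name = "EntityLimitError";
  }
}

// Called at each step of query-graph execution with the entity IDs
// currently tracked for an edge; throwing lets BTE abort gracefully.
function assertUnderCap(edgeID: string, entityIDs: Set<string>): void {
  if (entityIDs.size > ENTITY_CAP) {
    throw new EntityLimitError(edgeID, ENTITY_CAP);
  }
}
```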

@andrewsu andrewsu added this to the 2021-11 feature/code freeze milestone Oct 20, 2021
@marcodarko marcodarko self-assigned this Oct 20, 2021
@colleenXu (Collaborator) commented Oct 20, 2021

The options explored were (a sketch of both check points follows the list):

  1. Based on input to an edge: when the edge manager is deciding the next edge to execute, if the "next best" edge has more than some limit of IDs in its input node, stop execution and return an error saying that a step in the query had more than the limit of IDs as input, so the query was too large.
  2. Based on output of an edge: when the sub-queries are being executed, if one returns more than some limit of IDs (after the api-response-transform / ID-resolution work?), stop execution and return an error saying that a step in the query had more than the limit of IDs as output, so the query was too large.
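A minimal sketch of where each check would sit, under the assumption of a simplified edge-execution flow; `QEdge`, `checkInput`, and `checkOutput` are hypothetical names used only to contrast the two options, not BTE's real API.

```typescript
// Hypothetical names; this only illustrates the two placements above.
const LIMIT = 1000; // "some limit"

interface QEdge {
  id: string;
  inputIDs: Set<string>;
}

// Option 1: input-side check, before the edge manager executes the next edge.
function checkInput(edge: QEdge): void {
  if (edge.inputIDs.size > LIMIT) {
    throw new Error(
      `Step '${edge.id}' had more than ${LIMIT} IDs as input, so the query was too large`,
    );
  }
}

// Option 2: output-side check, after sub-queries return (post
// api-response-transform / ID resolution).
function checkOutput(edgeID: string, outputIDs: Set<string>): void {
  if (outputIDs.size > LIMIT) {
    throw new Error(
      `Step '${edgeID}' had more than ${LIMIT} IDs as output, so the query was too large`,
    );
  }
}
```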

@colleenXu (Collaborator)

Exactly what "some limit" should be depended on which option was chosen and on what was happening in the query specified in #323.

@marcodarko (Contributor)

Will be addressed in biothings/bte_trapi_query_graph_handler#53

@colleenXu (Collaborator)

PR has been merged and deployed. Closing...

@andrewsu (Member, Author)

@marcodarko to add a sample query where this threshold will be triggered, and a sample output showing the error message

@colleenXu colleenXu reopened this Oct 28, 2021
@colleenXu (Collaborator) commented Oct 28, 2021

I believe Marco is still working on this issue (quote from Slack):

> I'm actually gonna make some changes to the entity max solution, I realized it was getting invoked at the wrong place so it wasn't always checked... so fixing that but also how the error is thrown, I don't think I can send a 200 code error (not sure if possible actually)
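As an aside, one common way to surface such an error with a non-200 status is an error-handling middleware. A sketch assuming an Express-style server follows; the response shape simply mirrors the example output later in this thread, and none of this is BTE's actual code.

```typescript
import express from "express";

const app = express();

// Hypothetical error middleware: map an entity-cap error to HTTP 500
// instead of burying it inside a 200 response. Express treats a
// four-argument middleware as an error handler.
app.use(
  (
    err: Error,
    req: express.Request,
    res: express.Response,
    next: express.NextFunction,
  ) => {
    res.status(500).json({
      // Echo an empty TRAPI message plus the error, as in the example below.
      message: {
        knowledge_graph: { nodes: {}, edges: {} },
        results: [],
      },
      status: 500,
      description: `Error: ${err.message}`,
    });
  },
);
```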

@colleenXu (Collaborator) commented Oct 29, 2021

Other examples

Note that I'm using a local API list (it removes pending BioThings APIs except for the Clinical Risk KP API and Multiomics Wellness) for all of these examples...

Example 1

This 1-hop query returns just over 1000 IDs (1060)...
{
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                    "ids": ["NCBIGene:7157"],
		            "categories":["biolink:Gene"]
                },
                "n1": {
                    "categories": ["biolink:Disease"]
                }
            },
            "edges": {
                "e0": {
                    "subject": "n0",
                    "object": "n1"
                }
            }
        }
    }
}

Therefore, we expect an error to be triggered if we add another hop that uses those 1060 IDs as input. This does happen...

The returned response (which echoes the query):
{
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                    "ids": [
                        "NCBIGene:7157"
                    ],
                    "categories": [
                        "biolink:Gene"
                    ]
                },
                "n1": {
                    "categories": [
                        "biolink:Disease"
                    ]
                },
                "n2": {
                    "categories": [
                        "biolink:PhenotypicFeature"
                    ]
                }
            },
            "edges": {
                "e0": {
                    "subject": "n0",
                    "object": "n1"
                },
                "e1": {
                    "subject": "n1",
                    "object": "n2"
                }
            }
        },
        "knowledge_graph": {
            "nodes": {},
            "edges": {}
        },
        "results": []
    },
    "status": 500,
    "description": "Error: Max number of entities exceeded (1000) in 'e1'"
}

Example 2

Other queries that correctly trigger the exception are any Workflow B.1 queries with an e03 predict edge, since the number of genes is too large to use as input to another step. Note that this is likely to fail at an earlier edge if the full API list is used...

An example of a query that fails is Demo B.1


{
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                     "ids": ["MONDO:0005359", "SNOMEDCT:197354009"],
                     "categories": ["biolink:DiseaseOrPhenotypicFeature"]
                },
                "n1": {
                    "categories": ["biolink:DiseaseOrPhenotypicFeature"]
                },
                "n2": {
                    "categories": ["biolink:Gene"]
                },
                "n3": {
                    "categories": ["biolink:Drug"]
                }
            },
            "edges": {
                "e01": {
                    "subject": "n0",
                    "object": "n1",
                    "predicates": ["biolink:has_real_world_evidence_of_association_with"]
                },
                "e02": {
                    "subject": "n2",
                    "object": "n1",
                    "predicates": ["biolink:gene_associated_with_condition"]
                },
                "e03": {
                    "subject": "n3",
                    "object": "n2",
                    "predicates": ["biolink:affects"]
                }
            }
        }
    }
}

A query related to B.1 previously crashed our programs because the server would run out of memory. It now correctly fails...

The query:

```
{
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                    "ids": ["MONDO:0005359", "SNOMEDCT:197354009"],
                    "categories": ["biolink:DiseaseOrPhenotypicFeature"]
                },
                "n1": {
                    "categories": ["biolink:DiseaseOrPhenotypicFeature"]
                },
                "n2": {
                    "categories": ["biolink:Gene"]
                },
                "n3": {
                    "categories": ["biolink:Drug", "biolink:SmallMolecule"]
                }
            },
            "edges": {
                "e01": {
                    "subject": "n0",
                    "object": "n1",
                    "predicates": ["biolink:has_real_world_evidence_of_association_with"]
                },
                "e02": {
                    "subject": "n2",
                    "object": "n1",
                    "predicates": ["biolink:gene_associated_with_condition"]
                },
                "e03": {
                    "subject": "n3",
                    "object": "n2",
                    "predicates": ["biolink:affects", "biolink:interacts_with"]
                }
            }
        }
    }
}
```

Example 3

This query seems to run fully (it doesn't hit the error). I believe that's correct because of the filtering-down that happens with intersections (Explain-style).

The query:

```
{
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                    "ids": ["CHEMBL.COMPOUND:CHEMBL1431"],
                    "categories": ["biolink:SmallMolecule"]
                },
                "n1": {
                    "categories": ["biolink:Protein"]
                },
                "n2": {
                    "categories": ["biolink:Protein"]
                },
                "n3": {
                    "ids": ["UniProtKB:P02794", "UniProtKB:P02792"],
                    "categories": ["biolink:Protein"]
                }
            },
            "edges": {
                "e0": {
                    "subject": "n0",
                    "object": "n1"
                },
                "e1": {
                    "subject": "n1",
                    "object": "n2"
                },
                "e2": {
                    "subject": "n2",
                    "object": "n3"
                }
            }
        }
    }
}
```

@colleenXu (Collaborator) commented Oct 29, 2021

Perhaps we could fail earlier in the process. Sometimes, before the failure point, BTE takes a while with ID resolution because there are >60,000 IDs to send to the ID resolver... see #338 (comment).

What do you think, @andrewsu @newgene ?

@colleenXu (Collaborator)

Closing this.

After discussion with Andrew, I'll clarify #338 and we'll see how things progress. If needed, we could add a cap related to ID resolution (BTE would return a failure), set as a multiplier of this entity cap (e.g. 10,000, i.e. 10 × the 1,000 entity cap)...
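A rough sketch of that idea, assuming the 1,000 entity cap and a 10× multiplier; `assertResolvable` and the constants are hypothetical names for illustration only.

```typescript
// Hypothetical sketch of a cap on IDs sent to the ID resolver,
// expressed as a multiple of the entity cap (10 * 1000 = 10,000).
const ENTITY_CAP = 1000;
const RESOLVER_CAP = 10 * ENTITY_CAP;

function assertResolvable(idsToResolve: string[]): void {
  if (idsToResolve.length > RESOLVER_CAP) {
    // Fail before sending a huge batch (e.g. >60,000 IDs) to the
    // ID resolver, rather than spending a long time resolving IDs
    // for a query the entity cap would reject anyway.
    throw new Error(
      `Too many IDs to resolve (${idsToResolve.length} > ${RESOLVER_CAP})`,
    );
  }
}
```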
