Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

is something happening that makes BTE queries take a long time? #338

Closed
colleenXu opened this issue Oct 27, 2021 · 6 comments
Closed

is something happening that makes BTE queries take a long time? #338

colleenXu opened this issue Oct 27, 2021 · 6 comments

Comments

@colleenXu
Copy link
Collaborator

I've noticed that since switching from GET to POST, queries seem to take longer on the ID resolution step. Discussion on this topic includes:

  • comparing the previous GET method to the current POST method to confirm whether this observation is true
  • playing with the batch size of IDs sent in each POST query to the resolver, to find a batch query size that runs as fast or quicker than the GET method (and if this doesn't seem possible, we would likely follow up with the SRI team about this -- and maybe revert to GET?)
@colleenXu
Copy link
Collaborator Author

@newgene discussed assigning this to @ericz1803

@colleenXu
Copy link
Collaborator Author

colleenXu commented Oct 29, 2021

Note that this may be happening with very large numbers of IDs to resolve...

I'm noticing that the query below is having a lot of IDs to resolve in the DiseaseOrPhenotypicFeature -> Gene edge...and this is taking ~10 min to do...

  bte:call-apis:query Total number of results returned for this query is 66451 +293ms
  bte:call-apis:query Start to use id resolver module to annotate output ids. +0ms
An example of a query that fails is Demo B.1

Link

{
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                     "ids": ["MONDO:0005359", "SNOMEDCT:197354009"],
                     "categories": ["biolink:DiseaseOrPhenotypicFeature"]
                },
                "n1": {
                    "categories": ["biolink:DiseaseOrPhenotypicFeature"]
                },
                "n2": {
                    "categories": ["biolink:Gene"]
                },
                "n3": {
                    "categories": ["biolink:Drug"]
                }
            },
            "edges": {
                "e01": {
                    "subject": "n0",
                    "object": "n1",
                    "predicates": ["biolink:has_real_world_evidence_of_association_with"]
                },
                "e02": {
                    "subject": "n2",
                    "object": "n1",
                    "predicates": ["biolink:gene_associated_with_condition"]
                },
                "e03": {
                    "subject": "n3",
                    "object": "n2",
                    "predicates": ["biolink:affects"]
                }
            }
        }
    }
}

@colleenXu colleenXu changed the title explore ID resolver post: batch size vs time to return explore ID resolver: is something happening that makes BTE queries take a long time? Nov 1, 2021
@colleenXu
Copy link
Collaborator Author

From Eric's comments, this is not an issue with how fast it takes the SRI service to return. Instead, there seems to be something else going on...

@colleenXu colleenXu changed the title explore ID resolver: is something happening that makes BTE queries take a long time? is something happening that makes BTE queries take a long time? Nov 1, 2021
@colleenXu
Copy link
Collaborator Author

moving off of @ericz1803 @andrewsu since there is more going on here.... related to #346 / edge managing / results assembly

@andrewsu
Copy link
Member

andrewsu commented Nov 18, 2021

quick note that I see this behavior on Query D.6 (described in NCATSTranslator/minihackathons#115) as well on both prod and local. This is the last update on the console (at least 30 mins ago):

  bte:call-apis:query query completes. +1ms
  bte:call-apis:query Total number of results returned for this query is 376432 +4m
  bte:call-apis:query Start to use id resolver module to annotate output ids. +1ms

UPDATE: hmm, BTE eventually progressed off of the seemingly-stalled state above. This is what happened (first three lines repeated from above):

  bte:call-apis:query query completes. +1ms
  bte:call-apis:query Total number of results returned for this query is 376432 +4m
  bte:call-apis:query Start to use id resolver module to annotate output ids. +1ms
  bte:biomedical-id-resolver:SRI SRI resolved type 'Protein' doesn't match input semantic type 'Gene' for curie 'UMLS:C0017817'. Adding entry for 'Gene'. +1h
  bte:biomedical-id-resolver:SRI SRI resolved type 'Protein' doesn't match input semantic type 'Gene' for curie 'UMLS:C0289507'. Adding entry for 'Gene'. +1ms
  bte:biomedical-id-resolver:SRI SRI resolved type 'Protein' doesn't match input semantic type 'Gene' for curie 'UMLS:C0027280'. Adding entry for 'Gene'. +1ms
  bte:biomedical-id-resolver:SRI SRI resolved type 'Protein' doesn't match input semantic type 'Gene' for curie 'UMLS:C0143327'. Adding entry for 'Gene'. +0ms
  bte:biomedical-id-resolver:SRI SRI resolved type 'Protein' doesn't match input semantic type 'Gene' for curie 'UMLS:C0065011'. Adding entry for 'Gene'. +0ms
  bte:biomedical-id-resolver:SRI SRI resolved type 'Protein' doesn't match input semantic type 'Gene' for curie 'UMLS:C0084210'. Adding entry for 'Gene'. +0ms
  bte:biomedical-id-resolver:SRI SRI resolved type 'Protein' doesn't match input semantic type 'Gene' for curie 'UMLS:C0118522'. Adding entry for 'Gene'. +0ms
  bte:biomedical-id-resolver:SRI SRI resolved type 'Protein' doesn't match input semantic type 'Gene' for curie 'UMLS:C0528561'. Adding entry for 'Gene'. +0ms
  bte:biomedical-id-resolver:SRI SRI resolved type 'Protein' doesn't match input semantic type 'Gene' for curie 'UMLS:C0038838'. Adding entry for 'Gene'. +1ms

[SNIP lots of similar lines invoking biomedical-id-resolver]

  bte:call-apis:query id annotation completes +34m
  bte:call-apis:query Query completes +0ms
  bte:biothings-explorer-trapi:batch_edge_query BTEEdges are successfully queried.... +1h
  bte:biothings-explorer-trapi:batch_edge_query Filtering out any "undefined" items in (376432) results +0ms
  bte:biothings-explorer-trapi:batch_edge_query Total number of results is (376432) +36ms
  bte:biothings-explorer-trapi:cache_handler Start to cache query results. +1h

<--- Last few GCs --->

[2023:0x4c48c00]  4982119 ms: Mark-sweep 1956.5 (2082.1) -> 1940.5 (2082.1) MB, 2841.3 / 0.2 ms  (average mu = 0.094, current mu = 0.025) allocation failure scavenge might not succeed
[2023:0x4c48c00]  4985248 ms: Mark-sweep 1956.3 (2082.1) -> 1940.7 (2082.4) MB, 3056.3 / 0.3 ms  (average mu = 0.058, current mu = 0.023) allocation failure scavenge might not succeed


<--- JS stacktrace --->

FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
 1: 0xb12b00 node::Abort() [/home/asu/.nvm/versions/node/v16.0.0/bin/node]
 2: 0xa2fe25 node::FatalError(char const*, char const*) [/home/asu/.nvm/versions/node/v16.0.0/bin/node]
 3: 0xcf8a9e v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [/home/asu/.nvm/versions/node/v16.0.0/bin/node]
 4: 0xcf8e17 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [/home/asu/.nvm/versions/node/v16.0.0/bin/node]
 5: 0xee2d65  [/home/asu/.nvm/versions/node/v16.0.0/bin/node]
 6: 0xee38ac  [/home/asu/.nvm/versions/node/v16.0.0/bin/node]
 7: 0xef17e1 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [/home/asu/.nvm/versions/node/v16.0.0/bin/node]
 8: 0xef4d3c v8::internal::Heap::AllocateRawWithRetryOrFailSlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [/home/asu/.nvm/versions/node/v16.0.0/bin/node]
 9: 0xeb8be5 v8::internal::Factory::AllocateRaw(int, v8::internal::AllocationType, v8::internal::AllocationAlignment) [/home/asu/.nvm/versions/node/v16.0.0/bin/node]
10: 0xeb1754 v8::internal::FactoryBase<v8::internal::Factory>::AllocateRawWithImmortalMap(int, v8::internal::AllocationType, v8::internal::Map, v8::internal::AllocationAlignment) [/home/asu/.nvm/versions/node/v16.0.0/bin/node]
11: 0xeb3a01 v8::internal::FactoryBase<v8::internal::Factory>::NewRawTwoByteString(int, v8::internal::AllocationType) [/home/asu/.nvm/versions/node/v16.0.0/bin/node]
12: 0x112b035 v8::internal::String::SlowFlatten(v8::internal::Isolate*, v8::internal::Handle<v8::internal::ConsString>, v8::internal::AllocationType) [/home/asu/.nvm/versions/node/v16.0.0/bin/node]
13: 0xd047a0 v8::String::WriteUtf8(v8::Isolate*, char*, int, int*, int) const [/home/asu/.nvm/versions/node/v16.0.0/bin/node]
14: 0xbf9b43 node::StringBytes::Write(v8::Isolate*, char*, unsigned long, v8::Local<v8::Value>, node::encoding, int*) [/home/asu/.nvm/versions/node/v16.0.0/bin/node]
15: 0xca3aa6 void node::crypto::Decode<node::crypto::Hash>(v8::FunctionCallbackInfo<v8::Value> const&, void (*)(node::crypto::Hash*, v8::FunctionCallbackInfo<v8::Value> const&, char const*, unsigned long)) [/home/asu/.nvm/versions/node/v16.0.0/bin/node]
16: 0xd5f70b  [/home/asu/.nvm/versions/node/v16.0.0/bin/node]
17: 0xd60bac  [/home/asu/.nvm/versions/node/v16.0.0/bin/node]
18: 0xd61226 v8::internal::Builtin_HandleApiCall(int, unsigned long*, v8::internal::Isolate*) [/home/asu/.nvm/versions/node/v16.0.0/bin/node]
19: 0x160c579  [/home/asu/.nvm/versions/node/v16.0.0/bin/node]

@colleenXu
Copy link
Collaborator Author

This has been tracked down and resolved by using a new query algorithm. Discussed in issues #359 . Closing

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants