Query performance issue in inclusion resolver for hasManyThrough #8250

0x0aNL · 2022-01-19T22:24:32Z

Describe the bug

Similar to #8074 , the hasManyThrough inclusion resolver suffers from a heavily degraded performance. It will fetch all through models for a given set of source models using a single query, but afterwards it will perform a query for each source model (or set of through models). For example, when querying 100 Items and including its Category has-many-through relation, it will perform 100 Category queries instead of just 1.

Adding a layer of has-many relations, e.g. also including all Items for all Categorys will have an exponential effect: 100 Items in 10 Categorys will result in 1000 queries, instead of 2.

In the linked branch hasManyThroughPerformanceIssue, the TodoList example has been modified with an added Category model, set up with a bi-directional hasManyThrough relation Todo <-> Category <-> Todo. There's a test added which includes todos with categories with todos: it should execute 3 queries, but fails (see npm test). After applying the patch from the PR, it succeeds.

Logs

No response

Additional information

My proposal (PR included) is to:

collect all target model keys from the through models
perform a single findByForeignKeys call
do some magic to adhere to scope limit and order
"rearrange" the result in the proper format and return

Possible issues:

scope magic (see PR)

Notes:

Doesn't fix the "issue" of re-querying an Item in an inclusion of depth 3 with bi-directional hasManyThrough (e.g., when finding Item with relation Category with relation Item, all Items will be queried twice)

Anecdotal result: a single .find with a bunch of included relations went from 295 queries in 421 ms to 7 queries in 101 ms.

Reproduction

https://github.com/0x0aNL/loopback-next/tree/hasManyThroughPerformanceIssue

The text was updated successfully, but these errors were encountered:

ivanvaccari · 2022-01-20T10:00:07Z

Following this one, same issue for me. For comparison, the lb3 executing the same query takes about 150ms, while lb4 takes an average of 2s.

My case is then more complex than this because involves datasource providers, and for each relation resolved, it create a datasource instance. This grow exponentially with nested relations.

0x0aNL added the bug label Jan 19, 2022

0x0aNL mentioned this issue Jan 19, 2022

Fix: hasManyThrough inclusion performance #8251

Merged

8 tasks

dhmlau closed this as completed in #8251 Oct 16, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Query performance issue in inclusion resolver for hasManyThrough #8250

Query performance issue in inclusion resolver for hasManyThrough #8250

0x0aNL commented Jan 19, 2022 •

edited

Loading

ivanvaccari commented Jan 20, 2022

Query performance issue in inclusion resolver for hasManyThrough #8250

Query performance issue in inclusion resolver for hasManyThrough #8250

Comments

0x0aNL commented Jan 19, 2022 • edited Loading

Describe the bug

Logs

Additional information

Reproduction

ivanvaccari commented Jan 20, 2022

0x0aNL commented Jan 19, 2022 •

edited

Loading