Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

repository v3.7.3: inefficient inclusion lookup when ANY scope props defined #8074

Closed
allanpaiste opened this issue Nov 10, 2021 · 6 comments · Fixed by #8100
Closed

repository v3.7.3: inefficient inclusion lookup when ANY scope props defined #8074

allanpaiste opened this issue Nov 10, 2021 · 6 comments · Fixed by #8100
Assignees
Labels
bug Performance Issues related to runtime performance regression

Comments

@allanpaiste
Copy link

allanpaiste commented Nov 10, 2021

Previously inclusion lookup was done with a single query that used inq filter that involves every fkValue where now (since 3.7.3) the lookup is performed with find per fkValue.

The new implementation itself states (in code comment) that it is used for very specific cases, but there are no checks implemented in code to validate that such case (as stated in the comment that you can see in the code) was actually encountered.

if (isScopeSet && Array.isArray(fkValues) && !useScopeFilterGlobally) {
// since there is a scope, there could be a where filter, a limit, an order
// and we should run the scope in multiple queries so we can respect the
// scope filter params
const findPromises = fkValues.map(fk => {
const where = {[fkName]: fk} as unknown as Where<Target>;
let localScope = cloneDeep(scope);
// combine where clause to scope filter
localScope = new FilterBuilder(localScope).impose({where}).filter;
return targetRepository.find(localScope, options);
});

This issue report stems from a discovery where certain inclusion lookup that previously produced 1 query now produces 494 instead - which DID result in very noticeable performance impact.

Steps to reproduce

  1. Define two models and define both repositories: modelA, modelB
  2. Define a relation between the two models (where modelA can have multiple modelB's)
  3. create FIVE modelA
  4. create ONE modelB's for each modelA instance
  5. perform a lookup query for modelA with included relation:
{
  include: [
    {
      relation: 'modelB',
      scope: {
        fields: ['fieldA', 'fieldB'],
      },
    },
  ],
}

Current Behavior

Loopback ends up doing 5 separate FIND queries for the relations.

Expected Behavior

Loopback does a single FIND query for the relations with inq: [1, 2, 3, 4, 5]

Additional information

The steps/example given uses small number of entities, but you can probably see how this issue escalates when you are dealing with thousands of records.

I don't see any reason why the implementation that caused this issue should be used for anything than in situation where developer defines a 'limit' for specific scope, and even then I would question if this should be something that gets implemented without a huge warning label attached to it as it introduces serious performance degradation.

The fact that this was introduced in a PATCH semver is also troubling as it diminishes the trust where one expects fix releases to introduce fixes rather than introducing code logic branches with completely new execution strategy

Proposed fix

  1. choose that execution path only on the existence of scope.limit
  2. introduce different strategies used for dealing with applying that limit (can be made dependent on the overall item count when limit is not applied).
  3. release this as a feature release instead of a fix release
  4. log a warning (or even better: implement linter that targets such use-cases). when the execution path is chosen for developer to be very aware that they just potentially caused a performance degradation.

Recommendation for others who encounter this

Lock your @loopback/repository to 3.7.2 until this issue gets fixed.

Related Issues

#6832 (introduced the issue)

@allanpaiste allanpaiste changed the title repository v3.7.3: inclusion lookup done in an inefficient manner repository v3.7.3: inefficient inclusion lookup when ANY scope properties defined Nov 10, 2021
@allanpaiste allanpaiste changed the title repository v3.7.3: inefficient inclusion lookup when ANY scope properties defined repository v3.7.3: inefficient inclusion lookup when ANY scope props defined Nov 10, 2021
@collaorodrigo7
Copy link
Contributor

Thanks for the detail explanation @allanpaiste. You are right, if the scope has only a fields filter, we should not need to do the per-fk filter.
Initially I thought where filter will still need the per-fk filter, but now I do not think we need it there, and neither we need it for order, do we? Because it will be the same to apply the where or order to the entire set and then just filter them to assign it to the parent model, than to query each by fk and apply the where and order.
So, we should only do per-fk filter if the filter has a limit, and skip?
@achrinza what do you think?

If that's right I can push a PR later this week to limit this to only limit, and skip

@achrinza achrinza added Performance Issues related to runtime performance regression labels Nov 11, 2021
@achrinza
Copy link
Member

@collaorodrigo7 If memory serves, I don't think there's an equivalent to "skip" in SQL left joins. So the check should only apply to LIMIT relation filters.

@allanpaiste
Copy link
Author

Any updates on this @collaorodrigo7 ? :)

@allanpaiste
Copy link
Author

@collaorodrigo7 I'd take it's a one-liner fix where we just add a conditional around that logic-branching? Yes?

@achrinza
Copy link
Member

achrinza commented Dec 1, 2021

The fix should be a one-liner fix, though I've not checked on it yet. I'll try to timebox this week to fix the issue, but if you can open a PR earlier, I can help review it with the other mantainers.

@achrinza achrinza self-assigned this Dec 1, 2021
achrinza added a commit that referenced this issue Dec 1, 2021
fixes: #8074
Signed-off-by: Rifa Achrinza <[email protected]>
achrinza added a commit that referenced this issue Dec 1, 2021
fixes: #8074
Signed-off-by: Rifa Achrinza <[email protected]>
@collaorodrigo7
Copy link
Contributor

Sorry guys, I had to take a short time off and couldn't take a look at this before leaving.
Thank you @achrinza

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Performance Issues related to runtime performance regression
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants