feat(elastic): improve indexing performance #833

Merged


hendrik-advantitge
Contributor

As promised, here is a PR with some performance improvements for the Elasticsearch plugin.

  • Returned FacetValues are now filtered by Channel, since FacetValues became ChannelAware. I fixed this by making the findByIds() function ChannelAware, which makes it inconsistent with the other functions. I'm open to suggestions here.
  • The connectivity check on startup now retries a configurable number of times. Hosting platforms perform maintenance tasks that often involve restarting nodes, and Elasticsearch starts up much more slowly than Vendure, so the worker would start without subscribing to the indexing jobs.
  • During indexing, the Product and its variants are queried separately. This greatly reduces the worker's memory usage.
  • Indexing itself is much simpler. Every indexing job now just performs a delete and insert on all Channels.
  • During reindexing, if the dropIndices parameter is false, each product is deleted just before it is inserted again. This allows admins to reindex while the shop is live. Previously, a reindex would clear the index entirely before inserting everything.
  • Batching for reindexing has been removed.
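
For readers following along, the startup retry described in the second bullet could look roughly like the sketch below. This is not the code from this PR: the option names `connectionAttempts` and `connectionAttemptInterval` and the helper `waitForConnection` are assumptions made for illustration, and `ping` stands in for whatever call checks that the Elasticsearch node is reachable.

```typescript
// Assumed option names for illustration; the plugin's actual options may differ.
interface RetryOptions {
    connectionAttempts: number; // how many times to try before giving up
    connectionAttemptInterval: number; // delay between attempts, in milliseconds
}

const sleep = (ms: number): Promise<void> =>
    new Promise<void>(resolve => setTimeout(resolve, ms));

/**
 * Calls `ping` until it succeeds or the attempts are exhausted.
 * Resolves to the number of the attempt that succeeded;
 * rejects with an error describing the last failure otherwise.
 */
async function waitForConnection(
    ping: () => Promise<unknown>,
    options: RetryOptions,
): Promise<number> {
    let lastError: unknown;
    for (let attempt = 1; attempt <= options.connectionAttempts; attempt++) {
        try {
            await ping();
            return attempt;
        } catch (e) {
            lastError = e;
            // Only wait if another attempt is still to come.
            if (attempt < options.connectionAttempts) {
                await sleep(options.connectionAttemptInterval);
            }
        }
    }
    throw new Error(
        `Could not connect after ${options.connectionAttempts} attempts: ${String(lastError)}`,
    );
}
```

With something like this in place, the worker would only subscribe to the indexing jobs once `waitForConnection` resolves, so a slow Elasticsearch restart delays the subscription instead of silently skipping it.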

I'm aware this could be cleaned up further, but if that's okay with you, I'll leave that to you for now.

@michaelbromley
Member

Great, thanks! Looks like you need to include @types/lodash in devDependencies - that seems to be what is making the CI runs fail.

I'll review properly later today 👍

@hendrik-advantitge
Contributor Author

I see some tests are failing for search. Many of them seem to concern the order of results, which appears to be hard-coded in the tests. Others concern the count of FacetValues. Perhaps the tests have not been updated for the ChannelAware FacetValues? I'm not sure.

@michaelbromley
Member

I'll investigate when I review the changes later. Perhaps the ordering is affected by the new method of indexing?

@hendrik-advantitge
Contributor Author

That is very likely. It remains to be determined whether the order in these tests matters.

@michaelbromley
Member

The order is only important if there is an explicit sort order given in the search query.
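
One way a test could reflect this is to compare results without regard to order whenever the query has no explicit sort. The helper below is a hypothetical sketch, not part of the existing test suite; the name `sameItemsIgnoringOrder` and the use of plain string IDs are assumptions.

```typescript
/**
 * Returns true when both arrays contain the same IDs with the same
 * multiplicity, regardless of order. Intended for assertions on search
 * results where the query does not specify a sort order.
 */
function sameItemsIgnoringOrder(actual: string[], expected: string[]): boolean {
    if (actual.length !== expected.length) {
        return false;
    }
    // Sort copies so the caller's arrays are left untouched.
    const a = actual.slice().sort();
    const b = expected.slice().sort();
    return a.every((value, i) => value === b[i]);
}
```

Tests that do pass an explicit sort would still compare the arrays element by element, since in that case the order is part of the expected behaviour.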

@Ctx() ctx: RequestContext,
@Parent() parent: Omit<SearchResponse, 'facetValues'>,
): Promise<Array<{ facetValue: FacetValue; count: number }>> {
const facetValueIds = parent.items.map(item => item.facetValueIds).flat();
Member

I believe this change to the way facetValues are counted is responsible for some of the test failures. It does not replicate the old behaviour: previously, it would return the facetValue counts for the entire result set. Now it returns only the counts for the current page of results.
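
To make the difference concrete, here is a small in-memory sketch. It is not the plugin's implementation (which queries Elasticsearch); the `SearchItem` shape, the `countFacetValues` helper and the sample data are all invented for illustration.

```typescript
// Hypothetical, simplified shape of a search hit.
interface SearchItem {
    productId: string;
    facetValueIds: string[];
}

/** Counts how many items carry each facet value ID. */
function countFacetValues(items: SearchItem[]): Record<string, number> {
    const counts: Record<string, number> = {};
    for (const item of items) {
        // Count each facet value at most once per item.
        const uniqueIds = item.facetValueIds.filter((id, i, arr) => arr.indexOf(id) === i);
        for (const id of uniqueIds) {
            counts[id] = (counts[id] || 0) + 1;
        }
    }
    return counts;
}

// Everything that matches the search, before pagination.
const allMatches: SearchItem[] = [
    { productId: 'p1', facetValueIds: ['red', 'cotton'] },
    { productId: 'p2', facetValueIds: ['red'] },
    { productId: 'p3', facetValueIds: ['blue', 'cotton'] },
];

// The first page only (equivalent to take: 2, skip: 0).
const page = allMatches.slice(0, 2);

const pageCounts = countFacetValues(page); // counts from the current page only
const totalCounts = countFacetValues(allMatches); // counts over the entire result set
```

Here `pageCounts` has no entry for `blue` and undercounts `cotton`, while `totalCounts` covers every match. Deriving the counts from `parent.items` corresponds to the first case, since `parent.items` holds only the current page.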

Contributor Author

I see. We have been running this revised plugin for quite some time now but hadn't noticed, as we only use this for the items in the current result set. Now I also understand why there were two calls to the Elastic instance before. It should be relatively straightforward to put the old system back in place.

Member

Alright, I'll revert that part 👍

@michaelbromley merged commit 9807478 into vendure-ecommerce:master Apr 22, 2021
@michaelbromley
Member

This is now tidied up and merged. Thank you for your work on this!

@thomas-advantitge deleted the feature/elastic-performance branch April 22, 2021 18:41