[Azure Search] Catalog2AzureSearch performance seems worse than Catalog2Lucene against packages with many versions #7265

Closed
scottbommarito opened this issue Jun 13, 2019 · 1 comment

@scottbommarito
Contributor

Earlier today I ran a test on DEV by bulk-updating all 10,000 versions of a package that has 10,000 versions.

The results looked roughly like this:
image.png

(Db2Catalog's cursor is off-screen, but it started running at 22:10.)

As you can see, Global finished in roughly 30 minutes and China finished in roughly 75.

Global Catalog2Registration took roughly 14 minutes to finish.
Global Catalog2AzureSearch took roughly 8-12 minutes to finish.

China Catalog2Registration took roughly 11 minutes to finish.
China Catalog2AzureSearch took roughly 36-38 minutes to finish!!!

In the past, Catalog2Lucene performed much better against these bulk update events, taking only 5-6 minutes in both regions. Additionally, the slowness of China Catalog2AzureSearch seems extremely suspicious.

Ideally, Catalog2AzureSearch should perform just as fast as Catalog2Lucene did.

@joelverhagen
Member

We haven't done any load testing to determine the scale we need for our Azure Search resource. I was hoping to let query volume dictate this.

That being said, we could easily increase the parallel catalog leaf downloads to match that behavior in Catalog2Lucene. Currently Catalog2AzureSearch is using 4 parallel workers for both catalog leaf downloads and document pushes. It would be easy to download the catalog leafs with a higher degree of parallelism.
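
To illustrate the idea of giving the two stages separate limits, here is a rough sketch in Python. It is not the Catalog2AzureSearch code (which is C#); `run_pipeline`, `download_leaf`, `push_document`, and the default of 32 download workers are all hypothetical names and numbers picked for illustration. Only the push limit of 4 comes from the current behavior described above.

```python
import asyncio


async def run_pipeline(leaf_urls, download_leaf, push_document,
                       download_parallelism=32, push_parallelism=4):
    """Download catalog leaves and push documents with independent limits.

    download_leaf and push_document are caller-supplied coroutines
    (hypothetical stand-ins for the real download and index-push steps).
    """
    # One gate per stage, so raising the download limit does not
    # also raise the number of concurrent pushes.
    download_gate = asyncio.Semaphore(download_parallelism)
    push_gate = asyncio.Semaphore(push_parallelism)

    async def process(url):
        async with download_gate:
            leaf = await download_leaf(url)
        async with push_gate:
            await push_document(leaf)

    # Start one task per leaf; the semaphores cap how many are
    # actually downloading or pushing at any moment.
    await asyncio.gather(*(process(url) for url in leaf_urls))
```

The point of the two gates is that downloads can fan out widely while pushes to the search service stay at the existing limit, so any change in throughput can be attributed to the download side alone.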

Nice find!

joelverhagen added a commit to NuGet/NuGet.Services.Metadata that referenced this issue Jun 15, 2019
joelverhagen added a commit to NuGet/NuGet.Services.Metadata that referenced this issue Jun 15, 2019
@joelverhagen joelverhagen changed the title Azure Search performance seems worse than Catalog to Lucene against packages with many versions [AzureSearch] Catalog2AzureSearch performance seems worse than Catalog2Lucene against packages with many versions Jun 15, 2019
joelverhagen added a commit to NuGet/NuGet.Services.Metadata that referenced this issue Jun 17, 2019
@joelverhagen joelverhagen self-assigned this Jun 17, 2019
@joelverhagen joelverhagen added this to the S154 - 2019.06.03 milestone Jun 17, 2019
@joelverhagen joelverhagen changed the title [AzureSearch] Catalog2AzureSearch performance seems worse than Catalog2Lucene against packages with many versions [Azure Search] Catalog2AzureSearch performance seems worse than Catalog2Lucene against packages with many versions Jun 20, 2019