This repository has been archived by the owner on Aug 2, 2022. It is now read-only.

Performance tuning/Recommendations #177

Merged
merged 8 commits into from
Jul 28, 2020

Member

@vamshin vamshin commented Jul 24, 2020

Issue #, if available:
#64

Description of changes:
Adds Performance tuning/Recommendations to improve indexing/search performance.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Member

@jmazanec15 jmazanec15 left a comment

From this, it looks like headers that do not have a space between the `#` markers and the title do not render as headers.

Still need to review more, will continue this week.

### Warm up

The graphs are constructed during indexing, but they are loaded into memory only during the first search. Lucene searches each segment sequentially (so, for k-NN, each segment returns up to k nearest neighbors of the query point), then aggregates the results and ranks them by score (a higher score means a better result).
Member

Is there documentation for this?

Member

Interesting, makes sense thanks
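
For readers following this thread, the per-segment search described in the quoted Warm up paragraph can be sketched roughly as below. This is a toy model, not Lucene's or the k-NN plugin's code: the function names are hypothetical, each segment is assumed to be a plain list of precomputed `(score, doc_id)` pairs, and no graph traversal is modelled.

```python
import heapq


def search_segment(segment, k):
    """Return up to k (score, doc_id) pairs with the highest scores in one segment."""
    return heapq.nlargest(k, segment)


def search_index(segments, k):
    """Visit each segment in turn, then merge all candidates and rank them by score."""
    candidates = []
    for segment in segments:  # segments are searched one after another
        candidates.extend(search_segment(segment, k))
    # Higher score means a better result, so keep the k largest overall.
    return heapq.nlargest(k, candidates)


# Hypothetical data: two segments with precomputed scores.
segments = [
    [(0.9, "a"), (0.2, "b"), (0.5, "c")],
    [(0.8, "d"), (0.7, "e")],
]
top = search_index(segments, k=2)
```

Each segment contributes at most k candidates, so the final merge ranks at most k times the number of segments entries.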

To avoid this latency penalty during your first queries, use the warmup API on the indices you want to search. The API looks like this:

GET /_opendistro/_knn/warmup/index1,index2,index3?pretty
Member

Minor: wrap in code block

Member Author

Done
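
As an illustration of the warmup call quoted in this thread, here is a minimal Python sketch that builds the same `GET` request with the standard library. The helper name and the `http://localhost:9200` host are assumptions for the example; only the `/_opendistro/_knn/warmup/...?pretty` path comes from the quoted diff.

```python
from urllib.parse import quote
from urllib.request import Request


def build_warmup_request(indices, host="http://localhost:9200"):
    """Build a GET request for the k-NN warmup endpoint (host is an assumed default)."""
    # Index names are joined with commas, as in the quoted example.
    names = ",".join(quote(name, safe="") for name in indices)
    return Request(f"{host}/_opendistro/_knn/warmup/{names}?pretty", method="GET")


req = build_warmup_request(["index1", "index2", "index3"])
# To send it against a running cluster: urllib.request.urlopen(req)
```

The request is only constructed here, so the sketch runs without a cluster; sending it is left as a commented step.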


The following steps can help improve indexing performance, especially when you plan to index a large number of vectors at once.

1 Disable refresh interval (Default = 1 sec)
Member

Minor: add "." after numbering

Member Author

Done
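
The quoted step does not show the request itself. A minimal sketch, assuming the standard Elasticsearch index settings endpoint (`PUT /<index>/_settings`) with `refresh_interval` set to `"-1"` to disable refreshes, might look like this. The helper name, the `index1` index, and the `http://localhost:9200` host are hypothetical.

```python
import json
from urllib.request import Request


def build_refresh_interval_request(index, interval, host="http://localhost:9200"):
    """Build a PUT request that updates index.refresh_interval (host is assumed)."""
    body = json.dumps({"index": {"refresh_interval": interval}}).encode("utf-8")
    return Request(
        f"{host}/{index}/_settings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Disable refreshes before a bulk load, then restore the 1-second default afterwards.
disable = build_refresh_interval_request("index1", "-1")
restore = build_refresh_interval_request("index1", "1s")
# To send either request: urllib.request.urlopen(disable)
```

Restoring the interval after the bulk load matters, because newly indexed documents do not become searchable until a refresh happens.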

Member

@jmazanec15 jmazanec15 left a comment

Looks good to me. Thanks for adding this @vamshin!

@vamshin vamshin merged commit ac83e49 into opendistro-for-elasticsearch:master Jul 28, 2020
@jmazanec15 jmazanec15 added the Documentation Improvements or additions to documentation label Aug 25, 2020