Performance tuning/Recommendations #177

vamshin · 2020-07-24T21:19:15Z

Issue #, if available:
#64

Description of changes:
Adds Performance tuning/Recommendations to improve indexing/search performance.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

PerformanceTuning.md

jmazanec15

From this it looks like all headers that do not have a space between them and the title do not show up as headers.

Still need to review more, will continue this week.

PerformanceTuning.md

jmazanec15 · 2020-07-27T23:49:57Z

PerformanceTuning.md

+
+### Warm up
+
+The graphs are constructed during indexing, but they are loaded into memory during the first search. The way search works in Lucene is that each segment is searched sequentially (so, for k-NN, each segment returns up to k nearest neighbors of the query point) and the results are aggregated together and ranked based on the score of each result (higher score --> better result). 


Is there documentation for this?

PerformanceTuning.md

jmazanec15 · 2020-07-28T19:00:29Z

PerformanceTuning.md

+
+### Warm up
+
+The graphs are constructed during indexing, but they are loaded into memory during the first search. The way search works in Lucene is that each segment is searched sequentially (so, for k-NN, each segment returns up to k nearest neighbors of the query point) and the results are aggregated together and ranked based on the score of each result (higher score --> better result). 


Interesting, makes sense thanks
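
For readers following this thread, the per-segment behaviour described in the hunk above can be sketched as follows. This is only an illustration of "each segment returns up to k hits, then results are ranked by score"; it is not Lucene or k-NN plugin code. The function name `merge_segment_results` and the `(doc_id, score)` tuples are hypothetical, and each segment's list is assumed to be sorted best-first already.

```python
import heapq


def merge_segment_results(segment_results, k):
    """Merge per-segment hits into a global top-k (hypothetical helper).

    segment_results: list of lists of (doc_id, score) tuples, one list per
    segment, each assumed to be sorted by descending score.
    """
    # Each segment contributes at most k candidates.
    candidates = (hit for hits in segment_results for hit in hits[:k])
    # Higher score means better result, so keep the k largest scores.
    return heapq.nlargest(k, candidates, key=lambda hit: hit[1])


# Three segments searched one after another, then aggregated.
segments = [
    [("a", 0.91), ("b", 0.40)],
    [("c", 0.75), ("d", 0.10)],
    [("e", 0.88)],
]
top = merge_segment_results(segments, k=3)
print(top)
```

The sketch also shows why the number of segments matters: every segment has to be visited before the final ranking can be produced.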

PerformanceTuning.md

jmazanec15 · 2020-07-28T21:19:32Z

PerformanceTuning.md

+
+In order to avoid this latency penalty during your first queries, a user should use the warmup API on the indices they want to search. The API looks like this:
+
+GET /_opendistro/_knn/warmup/index1,index2,index3?pretty


Minor: wrap in code block
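
As a rough sketch of how a client might issue the warmup call quoted in the hunk above, the following builds the request with the Python standard library. The endpoint path comes from the diff; the host `http://localhost:9200` and the helper name `build_warmup_request` are assumptions for illustration only, and the request is built but not sent.

```python
from urllib.parse import quote
from urllib.request import Request


def build_warmup_request(host, indices):
    """Build a GET request for the warmup endpoint (hypothetical helper)."""
    if not indices:
        raise ValueError("at least one index name is required")
    # Index names are joined with commas, as in the example from the diff.
    index_list = ",".join(quote(name, safe="") for name in indices)
    url = f"{host.rstrip('/')}/_opendistro/_knn/warmup/{index_list}?pretty"
    return Request(url, method="GET")


# Assumed local cluster address; adjust for a real deployment.
req = build_warmup_request("http://localhost:9200", ["index1", "index2", "index3"])
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it against a running cluster.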

jmazanec15 · 2020-07-28T21:20:09Z

PerformanceTuning.md

+
+The following steps could help improve indexing performance especially when you plan to index large number of vectors at once. 
+
+1 Disable refresh interval  (Default = 1 sec)


Minor: add "." after numbering
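
The first step in the hunk above (disabling the refresh interval) could look roughly like this from a client. The sketch assumes the standard Elasticsearch index settings endpoint (`PUT /<index>/_settings`) with `refresh_interval` set to `-1` to disable refreshes and `1s` to restore the default mentioned in the diff. The host, the index name `index1`, and the helper name are hypothetical; the requests are built but not sent.

```python
import json
from urllib.request import Request


def build_refresh_interval_request(host, index, interval):
    """Build a PUT request that sets index.refresh_interval (hypothetical helper)."""
    body = json.dumps({"index": {"refresh_interval": interval}}).encode("utf-8")
    return Request(
        f"{host.rstrip('/')}/{index}/_settings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Before a bulk load: disable periodic refreshes.
disable = build_refresh_interval_request("http://localhost:9200", "index1", "-1")
# After the bulk load: restore the 1 second default.
restore = build_refresh_interval_request("http://localhost:9200", "index1", "1s")
print(disable.get_method(), disable.full_url, disable.data.decode("utf-8"))
```

Restoring the interval afterwards matters, since newly indexed documents do not become searchable while refreshes are disabled.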

jmazanec15

Looks good to me. Thanks for adding this @vamshin!

vamshin added 3 commits July 20, 2020 13:04

synced from master

465e1f0

Merge branch 'master' of github.com:opendistro-for-elasticsearch/k-NN

e29432b

add performance tuning doc

4633095

jmazanec15 reviewed Jul 24, 2020


vamshin added 2 commits July 24, 2020 22:54

incorporated comments

a7a429f

incorporated comments

330076a

jmazanec15 reviewed Jul 27, 2020


jmazanec15 reviewed Jul 28, 2020


vamshin added 2 commits July 28, 2020 12:35

incorporated comments

5568717

incorporated comments

d0e889c

jmazanec15 reviewed Jul 28, 2020


incorporated comments

296a368

jmazanec15 approved these changes Jul 28, 2020


vamshin merged commit ac83e49 into opendistro-for-elasticsearch:master Jul 28, 2020

jmazanec15 added the Documentation label (Improvements or additions to documentation) Aug 25, 2020



