Vector efficient math operations for KNN distance functions #617

Closed
Tradunsky opened this issue Nov 5, 2022 · 5 comments

@Tradunsky

Tradunsky commented Nov 5, 2022

Is your feature request related to a problem? Please describe.
Exact k-NN distance functions use suboptimal vector math operations. This can add compute time when calculating distances for high-dimensional vectors.

Describe the solution you'd like
Do you think it is possible to use the Java Vector API for the math operations in the distance functions?
For example, the following cosine similarity function, which iterates over each element of both vectors and multiplies them one pair at a time:

public static float cosinesimilOptimized(float[] queryVector, float[] inputVector, float normQueryVector) {

could use a vectorized fused multiply-add instead.

If the Java version required by the Vector API is a concern, this could also start with Math.fma().
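To make the Math.fma() suggestion concrete, here is a minimal sketch of a cosine similarity loop that accumulates with fused multiply-add. The class name `CosineFmaSketch`, the helpers `cosineFma` and `norm`, and the choice to return 0 for a zero-length vector are all assumptions made for illustration; this is not the plugin's actual code. The sketch also assumes the third argument is the plain Euclidean norm of the query vector, which may differ from what the existing function expects.

```java
// Hypothetical sketch, not the k-NN plugin's implementation.
final class CosineFmaSketch {

    // Cosine similarity using Math.fma (available since Java 9).
    // Assumption: queryNorm is the Euclidean norm of queryVector, precomputed once per query.
    static float cosineFma(float[] queryVector, float[] inputVector, float queryNorm) {
        if (queryVector.length != inputVector.length) {
            throw new IllegalArgumentException("vector dimensions differ");
        }
        float dot = 0f;
        float inputNormSq = 0f;
        for (int i = 0; i < queryVector.length; i++) {
            // fma(a, b, c) computes a * b + c with a single rounding step.
            dot = Math.fma(queryVector[i], inputVector[i], dot);
            inputNormSq = Math.fma(inputVector[i], inputVector[i], inputNormSq);
        }
        double denominator = queryNorm * Math.sqrt(inputNormSq);
        // Assumption: treat similarity with a zero vector as 0 rather than NaN.
        if (denominator == 0.0) {
            return 0f;
        }
        return (float) (dot / denominator);
    }

    // Euclidean norm of a vector, accumulated the same way.
    static float norm(float[] vector) {
        float sumSq = 0f;
        for (float x : vector) {
            sumSq = Math.fma(x, x, sumSq);
        }
        return (float) Math.sqrt(sumSq);
    }
}
```

A caller would compute `norm(query)` once and reuse it for every candidate vector. Whether this is faster than the plain multiply-and-add loop depends on the JIT and the CPU: on hardware without an FMA instruction, Math.fma() may fall back to a much slower software path, so it would need benchmarking before adoption.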

Describe alternatives you've considered
ANN would be an option if pre-filtering were supported.

Additional context
I much appreciate the OpenSearch work on KNN. Demand for it is growing as ML workloads increasingly rely on vector-supporting databases for distributed computation.

@anasalkouz anasalkouz transferred this issue from opensearch-project/OpenSearch Nov 7, 2022
@vamshin vamshin added the "good first issue" label Nov 7, 2022
@dblock
Member

dblock commented Nov 7, 2022

Where would we want to configure which distance function to use? Index level options?

@vamshin
Member

vamshin commented Nov 7, 2022

@dblock this will apply to all of the https://opensearch.org/docs/latest/search-plugins/knn/knn-score-script/ functions and will not be part of the index options. It will be an optimization of our existing vector calculations for brute-force/exact k-NN search.

@vamshin
Member

vamshin commented Jan 4, 2023

Looks like we need OpenSearch core to move to OpenJDK 19 to use Math.fma(). We will prioritize this once core is on that version.

@Tradunsky
Author

Linking here for visibility:
apache/lucene#12091

This is also implemented in Elasticsearch, where it cut latencies by at least 50% in my case.

@jmazanec15
Member

Addressed in #1699
