Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor kNN codec related classes #582

Merged

Conversation

martin-gaievski
Copy link
Member

@martin-gaievski martin-gaievski commented Oct 14, 2022

Signed-off-by: Martin Gaievski [email protected]

Description

Refactor codec related classes, main idea is to simplify changes related to Lucene/core version bump in future.

Introduce codec version class that abstracts context that is specific to Lucene version, current version of codec is defined in this version class.

Check List

  • All tests pass
  • Commits are signed as per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@martin-gaievski martin-gaievski added Refactoring Improve the design, structure, and implementation while preserving its functionality backport 2.x labels Oct 14, 2022
@codecov-commenter
Copy link

codecov-commenter commented Oct 14, 2022

Codecov Report

Merging #582 (ab960e0) into main (6d77882) will increase coverage by 0.43%.
The diff coverage is 73.61%.

@@             Coverage Diff              @@
##               main     #582      +/-   ##
============================================
+ Coverage     83.60%   84.04%   +0.43%     
- Complexity     1026     1034       +8     
============================================
  Files           148      148              
  Lines          4233     4224       -9     
  Branches        373      373              
============================================
+ Hits           3539     3550      +11     
+ Misses          518      498      -20     
  Partials        176      176              
Impacted Files Coverage Δ
...ec/KNN920Codec/KNN920PerFieldKnnVectorsFormat.java 50.00% <33.33%> (+46.15%) ⬆️
.../knn/index/codec/BasePerFieldKnnVectorsFormat.java 57.69% <57.69%> (ø)
...ec/KNN940Codec/KNN940PerFieldKnnVectorsFormat.java 75.00% <66.66%> (+16.66%) ⬆️
...rg/opensearch/knn/index/codec/KNNCodecVersion.java 80.76% <80.76%> (ø)
...earch/knn/index/codec/KNN910Codec/KNN910Codec.java 100.00% <100.00%> (+14.28%) ⬆️
...earch/knn/index/codec/KNN920Codec/KNN920Codec.java 90.90% <100.00%> (+10.90%) ⬆️
...earch/knn/index/codec/KNN940Codec/KNN940Codec.java 100.00% <100.00%> (ø)
...rg/opensearch/knn/index/codec/KNNCodecService.java 100.00% <100.00%> (ø)
... and 2 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@martin-gaievski martin-gaievski force-pushed the refactor-knn-codecs branch 2 times, most recently from 49f71d4 to f853c7b Compare October 17, 2022 17:32
@martin-gaievski martin-gaievski marked this pull request as ready for review October 18, 2022 17:30
@martin-gaievski martin-gaievski requested a review from a team October 18, 2022 17:30
public KnnVectorsFormat getKnnVectorsFormatForField(final String field) {
if (isKnnVectorFieldType(field) == false) {
log.debug(
String.format(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need to use String.format here.

Suggested change
String.format(
log.debug("Initialize KNN vector format for field [{}] with default params [max_connections] = \"{}\" and [beam_width] = \"{}\"", field, defaultMaxConnections, defaultBeamWidth);

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

int maxConnections = getMaxConnections(params);
int beamWidth = getBeamWidth(params);
log.debug(
String.format(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to use String.format

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack


/**
* Extends the Codec to support a new file format for KNN index
* based on the mappings.
*
*/
public final class KNN910Codec extends FilterCodec {

private static final String KNN910 = "KNN910Codec";
private static final KNNCodecVersion version = KNNCodecVersion.V_9_1_0;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For static final variable, the naming convention should be VERSION I think.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree, good catch


private static final String KNN920 = "KNN920Codec";

private static final KNNCodecVersion version = KNNCodecVersion.V_9_2_0;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
private static final KNNCodecVersion version = KNNCodecVersion.V_9_2_0;
private static final KNNCodecVersion VERSION = KNNCodecVersion.V_9_2_0;

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack


public class KNN940Codec extends FilterCodec {
private static final String KNN940 = "KNN940Codec";
private static final KNNCodecVersion version = KNNCodecVersion.V_9_4_0;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
private static final KNNCodecVersion version = KNNCodecVersion.V_9_4_0;
private static final KNNCodecVersion VERSION = KNNCodecVersion.V_9_4_0;

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

@@ -16,15 +16,15 @@
import org.opensearch.knn.index.codec.KNNFormatFacade;

public class KNN940Codec extends FilterCodec {
private static final KNNCodecVersion version = KNNCodecVersion.V_9_4_0;
private static final KNNCodecVersion VERSION = KNNCodecVersion.V_9_4_0;
private final KNNFormatFacade knnFormatFacade;
private final PerFieldKnnVectorsFormat perFieldKnnVectorsFormat;

/**
* No arg constructor that uses Lucene94 as the delegate
*/
public KNN940Codec() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FAR: why not pass KNNCodecVersion as parameter?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have to have default constructor as per SPI contract, all codec classes are defined in corresponding registry file. In such case approach with parameter will not work

Copy link
Member

@jmazanec15 jmazanec15 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me. Thanks @martin-gaievski!

@martin-gaievski martin-gaievski merged commit 3d0a9d7 into opensearch-project:main Oct 19, 2022
@opensearch-trigger-bot
Copy link
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-2.x 2.x
# Navigate to the new working tree
cd .worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-582-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 3d0a9d7ed6b1c609a9ab1ed2eff897dbe05fca63
# Push it to GitHub
git push --set-upstream origin backport/backport-582-to-2.x
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-582-to-2.x.

martin-gaievski added a commit that referenced this pull request Oct 19, 2022
* Refactor codec related classes, create KNNCodecVersion abstraction

Signed-off-by: Martin Gaievski <[email protected]>
(cherry picked from commit 3d0a9d7)
martin-gaievski added a commit that referenced this pull request Oct 19, 2022
* Refactor codec related classes, create KNNCodecVersion abstraction

Signed-off-by: Martin Gaievski <[email protected]>
(cherry picked from commit 3d0a9d7)
martin-gaievski added a commit that referenced this pull request Oct 20, 2022
* Refactor codec related classes, create KNNCodecVersion abstraction

Signed-off-by: Martin Gaievski <[email protected]>
@heemin32 heemin32 added 2.4.0 v2.4.0 'Issues and PRs related to version v2.4.0' and removed 2.4.0 labels Nov 2, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x Refactoring Improve the design, structure, and implementation while preserving its functionality v2.4.0 'Issues and PRs related to version v2.4.0'
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants