Change VectorReaderListener to expect number array #416

Merged
jmazanec15 merged 2 commits into opensearch-project:main on Jun 3, 2022

Conversation

jmazanec15
Member

Description

Refactors VectorReaderListener onResponse to expect arrays of Number type from the search result instead of Double type. This allows training to handle cases where the source vectors are stored as integer arrays. Adds a test case to confirm that it can handle the Integer type. Cleans up tests in the VectorReaderTest class.
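To illustrate why expecting Number rather than Double matters, here is a minimal sketch. It is not the actual VectorReaderListener code: the class name `NumberVectorSketch` and the helpers `toFloatArray` and `doubleCastFails` are hypothetical, and it assumes the vector arrives from the search result as a list of boxed values.

```java
import java.util.Arrays;
import java.util.List;

public class NumberVectorSketch {

    // Hypothetical helper: converts a list of boxed numbers into a float[].
    // Number is the common supertype of Integer, Long, Float and Double,
    // so the same cast works however the source vector was stored.
    static float[] toFloatArray(List<?> source) {
        float[] vector = new float[source.size()];
        for (int i = 0; i < vector.length; i++) {
            vector[i] = ((Number) source.get(i)).floatValue();
        }
        return vector;
    }

    // Hypothetical helper: reports whether casting each element to Double
    // throws, which is what happens when the elements are Integers.
    static boolean doubleCastFails(List<?> source) {
        try {
            for (Object element : source) {
                Double unused = (Double) element;
            }
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Object> integers = Arrays.asList(1, 2, 3);
        List<Object> doubles = Arrays.asList(1.0, 2.0, 3.0);

        System.out.println(doubleCastFails(integers)); // true
        System.out.println(doubleCastFails(doubles));  // false

        // Both representations convert to the same float vector.
        System.out.println(Arrays.toString(toFloatArray(integers))); // [1.0, 2.0, 3.0]
        System.out.println(Arrays.toString(toFloatArray(doubles)));  // [1.0, 2.0, 3.0]
    }
}
```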

Credit to @martin-gaievski for the suggested fix.

Issues Resolved

#415

Check List

  • New functionality includes testing.
  • All tests pass
  • Commits are signed as per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Refactors VectorReaderListener onResponse to expect arrays of Number
type from search result instead of Double type. Adds test case to
confirm that it can handle Integer type. Cleans up tests in
VectorReaderTest class.

Signed-off-by: John Mazanec <[email protected]>
@jmazanec15 added the Bug Fixes, backport 1.x, backport 1.2, backport 1.3, and backport 2.0 labels Jun 2, 2022
@jmazanec15 requested a review from a team June 2, 2022 23:47

for (int j = 0; j < DEFAULT_DIMENSION; j++) {
    vector[j] = random.nextInt();
}
Member

nit: you can do the same but with less code via streams:

Integer[] array = random.ints(size, lowBound, highBound).boxed().toArray(Integer[]::new);
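A self-contained sketch of the stream-based suggestion above, for comparison with the loop. The class name `RandomIntegerArraySketch`, the method `randomVector`, and the bounds used in `main` are assumptions for illustration; the real test may use different names and values.

```java
import java.util.Arrays;
import java.util.Random;

public class RandomIntegerArraySketch {

    // Builds a boxed Integer[] of the given size with values in
    // [lowBound, highBound), using Random.ints instead of an explicit loop.
    static Integer[] randomVector(Random random, int size, int lowBound, int highBound) {
        return random.ints(size, lowBound, highBound)
            .boxed()
            .toArray(Integer[]::new);
    }

    public static void main(String[] args) {
        Random random = new Random();
        // Illustrative dimension and bounds only.
        Integer[] vector = randomVector(random, 4, -10, 10);
        System.out.println(Arrays.toString(vector));
    }
}
```

Note that the upper bound is exclusive, so `highBound` itself is never generated.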

Member Author

Thanks, will update.


// Create list of random vectors and ingest
Random random = new Random();
List<Integer[]> vectors = new ArrayList<>();
Member

Is it possible to have a mix of lists of different types? If yes, I think we need a test case with mixed docs.

Member Author

Oh interesting. Hadn't thought of that. I can add this case.

Collaborator

Before adding the test case, can you please validate whether we even allow this kind of vector as input?

Member Author

It'd be like if someone indexed 2 docs, one with [1, 2, 3] and the other with [1.0, 2.0, 3.0].

Member Author

On second thought, I don't think this needs to be added. Given that we process each hit's vector separately, I don't see the need to add a case for when the hits have different representations.
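A small sketch of that reasoning, under the assumption that each hit's vector is converted independently through a Number cast. The names `MixedHitsSketch`, `toFloatArray`, and `readAll` are hypothetical and do not come from the plugin.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MixedHitsSketch {

    // Hypothetical per-hit conversion: each element is only required to be a Number.
    static float[] toFloatArray(List<?> source) {
        float[] vector = new float[source.size()];
        for (int i = 0; i < vector.length; i++) {
            vector[i] = ((Number) source.get(i)).floatValue();
        }
        return vector;
    }

    // Converts every hit on its own, so one hit's element type
    // has no effect on how the next hit is handled.
    static List<float[]> readAll(List<List<?>> hits) {
        List<float[]> vectors = new ArrayList<>();
        for (List<?> hit : hits) {
            vectors.add(toFloatArray(hit));
        }
        return vectors;
    }

    public static void main(String[] args) {
        List<List<?>> hits = new ArrayList<>();
        hits.add(Arrays.asList(1, 2, 3));       // doc indexed as [1, 2, 3]
        hits.add(Arrays.asList(1.0, 2.0, 3.0)); // doc indexed as [1.0, 2.0, 3.0]

        for (float[] vector : readAll(hits)) {
            System.out.println(Arrays.toString(vector)); // [1.0, 2.0, 3.0] for both hits
        }
    }
}
```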

Signed-off-by: John Mazanec <[email protected]>

codecov-commenter commented Jun 3, 2022

Codecov Report

Merging #416 (0bab5ca) into main (dbe32fc) will not change coverage.
The diff coverage is 100.00%.

@@            Coverage Diff            @@
##               main     #416   +/-   ##
=========================================
  Coverage     84.01%   84.01%           
  Complexity      911      911           
=========================================
  Files           130      130           
  Lines          3879     3879           
  Branches        359      359           
=========================================
  Hits           3259     3259           
  Misses          458      458           
  Partials        162      162           
Impacted Files Coverage Δ
...java/org/opensearch/knn/training/VectorReader.java 83.56% <100.00%> (ø)

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@jmazanec15 merged commit 7735351 into opensearch-project:main Jun 3, 2022
@opensearch-trigger-bot
Contributor

The backport to 1.x failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-1.x 1.x
# Navigate to the new working tree
cd .worktrees/backport-1.x
# Create a new branch
git switch --create backport/backport-416-to-1.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 77353512c1f15e0dc996428a982941a7ee3036fb
# Push it to GitHub
git push --set-upstream origin backport/backport-416-to-1.x
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-1.x

Then, create a pull request where the base branch is 1.x and the compare/head branch is backport/backport-416-to-1.x.

@opensearch-trigger-bot
Contributor

The backport to 1.2 failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-1.2 1.2
# Navigate to the new working tree
cd .worktrees/backport-1.2
# Create a new branch
git switch --create backport/backport-416-to-1.2
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 77353512c1f15e0dc996428a982941a7ee3036fb
# Push it to GitHub
git push --set-upstream origin backport/backport-416-to-1.2
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-1.2

Then, create a pull request where the base branch is 1.2 and the compare/head branch is backport/backport-416-to-1.2.

@opensearch-trigger-bot
Contributor

The backport to 1.3 failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-1.3 1.3
# Navigate to the new working tree
cd .worktrees/backport-1.3
# Create a new branch
git switch --create backport/backport-416-to-1.3
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 77353512c1f15e0dc996428a982941a7ee3036fb
# Push it to GitHub
git push --set-upstream origin backport/backport-416-to-1.3
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-1.3

Then, create a pull request where the base branch is 1.3 and the compare/head branch is backport/backport-416-to-1.3.

opensearch-trigger-bot bot pushed a commit that referenced this pull request Jun 3, 2022
Refactors VectorReaderListener onResponse to expect arrays of Number
type from search result instead of Double type. Adds test case to
confirm that it can handle Integer type. Cleans up tests in
VectorReaderTest class.

Signed-off-by: John Mazanec <[email protected]>
(cherry picked from commit 7735351)
jmazanec15 added a commit to jmazanec15/k-NN-1 that referenced this pull request Jun 3, 2022
…t#416)

Refactors VectorReaderListener onResponse to expect arrays of Number
type from search result instead of Double type. Adds test case to
confirm that it can handle Integer type. Cleans up tests in
VectorReaderTest class.

Signed-off-by: John Mazanec <[email protected]>
(cherry picked from commit 7735351)
jmazanec15 added a commit to jmazanec15/k-NN-1 that referenced this pull request Jun 6, 2022
…t#416)

Refactors VectorReaderListener onResponse to expect arrays of Number
type from search result instead of Double type. Adds test case to
confirm that it can handle Integer type. Cleans up tests in
VectorReaderTest class.

Signed-off-by: John Mazanec <[email protected]>
(cherry picked from commit 7735351)
Labels
Bug Fixes, backport 1.x, backport 1.2, backport 1.3, backport 2.0, v2.1.0