Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add hybrid search with rescore IT #1066

Conversation

will-hwang
Copy link
Contributor

@will-hwang will-hwang commented Jan 7, 2025

Description

Currently, main branch doesn't contain HybridSearchWithRescoreIT test cases in both restart-upgrade and rolling-upgrade test suites, which are present in 2.x branch. This discrepancy causes failures to changes made to 2.x (e.g: https://github.com/opensearch-project/neural-search/actions/runs/12644267233)

This PR adds the missing commits to main branch between #917 and manually created backport (#924)

Related Issues

Resolves #[Issue number to be closed when this PR is merged]
#1065

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Copy link

codecov bot commented Jan 7, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 79.77%. Comparing base (2ecd32c) to head (eb2cb20).

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #1066      +/-   ##
============================================
+ Coverage     79.71%   79.77%   +0.05%     
- Complexity     1110     1113       +3     
============================================
  Files            87       87              
  Lines          3846     3846              
  Branches        646      646              
============================================
+ Hits           3066     3068       +2     
  Misses          531      531              
+ Partials        249      247       -2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@heemin32 heemin32 added skip-changelog backport 2.x Label will add auto workflow to backport PR to 2.x branch labels Jan 7, 2025
Copy link
Collaborator

@heemin32 heemin32 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@martin-gaievski
Copy link
Member

martin-gaievski commented Jan 7, 2025

test for rescore for rolling upgrade is already part of he main https://github.com/opensearch-project/neural-search/blob/main/qa/rolling-upgrade/src/test/java/org/opensearch/neuralsearch/bwc/HybridSearchIT.java#L71-L72, added in #917.
Main reason why we have a separate class for rescore test in 2.x is to be able to block is for lower versions, where rescore is not supported. Such blockage and filtering is easy to do if you have a separate test class. At the same time it's not needed for main because in BWC we need to support only latest 2.x (2.18 as of today) and that version supports rescore.

What you need to do is to add rescore part to the restart BWC test, that's it.

@heemin32
Copy link
Collaborator

heemin32 commented Jan 7, 2025

test for rescore for rolling upgrade is already part of he main https://github.com/opensearch-project/neural-search/blob/main/qa/rolling-upgrade/src/test/java/org/opensearch/neuralsearch/bwc/HybridSearchIT.java#L71-L72, added in #917. Main reason why we have a separate class for rescore test in 2.x is to be able to block is for lower versions, where rescore is not supported. Such blockage and filtering is easy to do if you have a separate test class. At the same time it's not needed for main because in BWC we need to support only latest 2.x (2.18 as of today) and that version supports rescore.

What you need to do is to add rescore part to the restart BWC test, that's it.

  1. The test is included in a separate class only in the 2.x branch to simplify blocking for lower versions.
  2. Blocking is not required in the main branch.

Despite these reasons, I prefer to keep the main and 2.x branches as similar as possible for easier maintenance, unless there are technical constraints preventing it.

Copy link
Member

@martin-gaievski martin-gaievski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please check my comments, I leaving up to you if want to have a separate class just for rescore as in 2.x, or keep it as part of the https://github.com/opensearch-project/neural-search/blob/main/qa/rolling-upgrade/src/test/java/org/opensearch/neuralsearch/bwc/HybridSearchIT.java, the wayt it's done in main. Just pick one of the approaches.
You need to drop unnecessary checks for version in both restart and rolling upgrade, this will be main only change

@@ -102,6 +102,7 @@ task testAgainstOldCluster(type: StandaloneRestIntegTestTask) {
if (versionsBelow2_15.any { ext.neural_search_bwc_version.startsWith(it) }){
filter {
excludeTestsMatching "org.opensearch.neuralsearch.bwc.NeuralSparseTwoPhaseProcessorIT.*"
excludeTestsMatching "org.opensearch.neuralsearch.bwc.HybridSearchWithRescoreIT.*"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this check is not needed in main, latest 2.x which is 2.18 already supports rescore.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer to keep it so that the code looks same between main and 2.x

@@ -166,6 +167,7 @@ task testAgainstNewCluster(type: StandaloneRestIntegTestTask) {
if (versionsBelow2_15.any { ext.neural_search_bwc_version.startsWith(it) }){
filter {
excludeTestsMatching "org.opensearch.neuralsearch.bwc.NeuralSparseTwoPhaseProcessorIT.*"
excludeTestsMatching "org.opensearch.neuralsearch.bwc.HybridSearchWithRescoreIT.*"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

@@ -101,6 +101,7 @@ task testAgainstOldCluster(type: StandaloneRestIntegTestTask) {
if (versionsBelow2_15.any { ext.neural_search_bwc_version.startsWith(it) }){
filter {
excludeTestsMatching "org.opensearch.neuralsearch.bwc.NeuralSparseTwoPhaseProcessorIT.*"
excludeTestsMatching "org.opensearch.neuralsearch.bwc.HybridSearchWithRescoreIT.*"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above, 2.18 supports rescore in hybrid not need to exclude test class

@heemin32
Copy link
Collaborator

heemin32 commented Jan 7, 2025

You need to drop unnecessary checks for version in both restart and rolling upgrade, this will be main only change

@martin-gaievski I would like to have version check in main so that the code will looks same between main and 2.x even if the check is not needed for main

@martin-gaievski
Copy link
Member

test for rescore for rolling upgrade is already part of he main https://github.com/opensearch-project/neural-search/blob/main/qa/rolling-upgrade/src/test/java/org/opensearch/neuralsearch/bwc/HybridSearchIT.java#L71-L72, added in #917. Main reason why we have a separate class for rescore test in 2.x is to be able to block is for lower versions, where rescore is not supported. Such blockage and filtering is easy to do if you have a separate test class. At the same time it's not needed for main because in BWC we need to support only latest 2.x (2.18 as of today) and that version supports rescore.
What you need to do is to add rescore part to the restart BWC test, that's it.

  1. The test is included in a separate class only in the 2.x branch to simplify blocking for lower versions.
  2. Blocking is not required in the main branch.

Despite these reasons, I prefer to keep the main and 2.x branches as similar as possible for easier maintenance, unless there are technical constraints preventing it.

then in this PR we need to drop the part for rescore from the https://github.com/opensearch-project/neural-search/blob/main/qa/rolling-upgrade/src/test/java/org/opensearch/neuralsearch/bwc/HybridSearchIT.java, so classes in both branches are similar.
We don't need version checks in main, it doesn't have any effect on the test execution because main suppose to be BWC against 2.18 and above only

@heemin32
Copy link
Collaborator

heemin32 commented Jan 7, 2025

We don't need version checks in main, it doesn't have any effect on the test execution because main suppose to be BWC against 2.18 and above only

Yes, but it would be ideal to keep the code consistent between main and 2.x. Consistency simplifies maintenance and eliminates the need to figure out why the code differs between the two branches.

@martin-gaievski
Copy link
Member

We don't need version checks in main, it doesn't have any effect on the test execution because main suppose to be BWC against 2.18 and above only

Yes, but it would be ideal to keep the code consistent between main and 2.x. Consistency simplifies maintenance and eliminates the need to figure out why the code differs between the two branches.

I don't agree, I think it actually makes it harder to understand and maintain, like having old code commented instead of just removing it after it's not longer needed. It's like the OPS activities after version release, checks that are no longer valid can be safely removed

@heemin32
Copy link
Collaborator

heemin32 commented Jan 7, 2025

We don't need version checks in main, it doesn't have any effect on the test execution because main suppose to be BWC against 2.18 and above only

Yes, but it would be ideal to keep the code consistent between main and 2.x. Consistency simplifies maintenance and eliminates the need to figure out why the code differs between the two branches.

I don't agree, I think it actually makes it harder to understand and maintain, like having old code commented instead of just removing it after it's not longer needed. It's like the OPS activities after version release, checks that are no longer valid can be safely removed

If you see https://github.com/opensearch-project/neural-search/blob/main/qa/rolling-upgrade/build.gradle, we are still keeping version check in rolling upgrade in main branch instead of removing them for many tests.

@will-hwang will-hwang closed this Jan 7, 2025
@will-hwang will-hwang reopened this Jan 7, 2025
@martin-gaievski
Copy link
Member

martin-gaievski commented Jan 7, 2025

We don't need version checks in main, it doesn't have any effect on the test execution because main suppose to be BWC against 2.18 and above only

Yes, but it would be ideal to keep the code consistent between main and 2.x. Consistency simplifies maintenance and eliminates the need to figure out why the code differs between the two branches.

I don't agree, I think it actually makes it harder to understand and maintain, like having old code commented instead of just removing it after it's not longer needed. It's like the OPS activities after version release, checks that are no longer valid can be safely removed

If you see https://github.com/opensearch-project/neural-search/blob/main/qa/rolling-upgrade/build.gradle, we are still keeping version check in rolling upgrade in main branch instead of removing them for many tests.

and this isn't correct in my opinion, those should be cleared as part of the post-release activities. Typical sequence should be: add code and version check in main, do backport, release 2.x version, do cleanup. This cleanup doesn't happen for whatever reason. Why we should follow the broken pattern if we know it's not correct.

I still think we need to drop the version checks, I'm fine with the java code for tests in this PR.

@will-hwang
Copy link
Contributor Author

will-hwang commented Jan 7, 2025

We don't need version checks in main, it doesn't have any effect on the test execution because main suppose to be BWC against 2.18 and above only

Yes, but it would be ideal to keep the code consistent between main and 2.x. Consistency simplifies maintenance and eliminates the need to figure out why the code differs between the two branches.

I don't agree, I think it actually makes it harder to understand and maintain, like having old code commented instead of just removing it after it's not longer needed. It's like the OPS activities after version release, checks that are no longer valid can be safely removed

If you see https://github.com/opensearch-project/neural-search/blob/main/qa/rolling-upgrade/build.gradle, we are still keeping version check in rolling upgrade in main branch instead of removing them for many tests.

and this isn't correct in my opinion, those should be cleared as part of the post-release activities. Typical sequence should be: add code and version check in main, do backport, release 2.x version, do cleanup. This cleanup doesn't happen for whatever reason. Why we should follow the broken pattern if we know it's not correct.

I still think we need to drop the version checks, I'm fine with the java code for tests in this PR.

what is the practice followed in other teams? i think there should be a set guideline. For example, keeping the version check for all 2.x upgrades, and clean them all at once once there's an update to 3.x. etc

@will-hwang will-hwang force-pushed the add_hybrid_query_sort_with_rescore branch from 2519574 to 9cb49ea Compare January 7, 2025 23:56
@will-hwang will-hwang force-pushed the add_hybrid_query_sort_with_rescore branch from 05bed51 to 017e2c7 Compare January 8, 2025 00:30
@martin-gaievski martin-gaievski removed the backport 2.x Label will add auto workflow to backport PR to 2.x branch label Jan 8, 2025
Copy link
Member

@martin-gaievski martin-gaievski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for addressing all comments

@heemin32 heemin32 merged commit 97cf25c into opensearch-project:main Jan 8, 2025
40 checks passed
@martin-gaievski
Copy link
Member

with this PR we can close #951

@heemin32 heemin32 added the backport 2.x Label will add auto workflow to backport PR to 2.x branch label Jan 8, 2025
@opensearch-trigger-bot
Copy link
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-2.x 2.x
# Navigate to the new working tree
cd .worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-1066-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 97cf25c31f31f9817bedd44bf4b8ce980bcf0d63
# Push it to GitHub
git push --set-upstream origin backport/backport-1066-to-2.x
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-1066-to-2.x.

will-hwang added a commit to will-hwang/neural-search that referenced this pull request Jan 8, 2025
* add hybrid search with rescore IT

Signed-off-by: will-hwang <[email protected]>

* remove rescore in hybrid search IT

Signed-off-by: will-hwang <[email protected]>

* remove previous version checks in build file

Signed-off-by: will-hwang <[email protected]>

* removing version checks only in rolling upgrade tests

Signed-off-by: will-hwang <[email protected]>

* remove newly added tests in restart test

Signed-off-by: will-hwang <[email protected]>

* Revert "remove newly added tests in restart test"

This reverts commit 0987831.

Signed-off-by: will-hwang <[email protected]>

---------

Signed-off-by: will-hwang <[email protected]>
(cherry picked from commit 97cf25c)
will-hwang added a commit to will-hwang/neural-search that referenced this pull request Jan 8, 2025
* add hybrid search with rescore IT

Signed-off-by: will-hwang <[email protected]>

* remove rescore in hybrid search IT

Signed-off-by: will-hwang <[email protected]>

* remove previous version checks in build file

Signed-off-by: will-hwang <[email protected]>

* removing version checks only in rolling upgrade tests

Signed-off-by: will-hwang <[email protected]>

* remove newly added tests in restart test

Signed-off-by: will-hwang <[email protected]>

* Revert "remove newly added tests in restart test"

This reverts commit 0987831.

Signed-off-by: will-hwang <[email protected]>

---------

Signed-off-by: will-hwang <[email protected]>
(cherry picked from commit 97cf25c)
will-hwang added a commit to will-hwang/neural-search that referenced this pull request Jan 8, 2025
* add hybrid search with rescore IT

Signed-off-by: will-hwang <[email protected]>

* remove rescore in hybrid search IT

Signed-off-by: will-hwang <[email protected]>

* remove previous version checks in build file

Signed-off-by: will-hwang <[email protected]>

* removing version checks only in rolling upgrade tests

Signed-off-by: will-hwang <[email protected]>

* remove newly added tests in restart test

Signed-off-by: will-hwang <[email protected]>

* Revert "remove newly added tests in restart test"

This reverts commit 0987831.

Signed-off-by: will-hwang <[email protected]>

---------

Signed-off-by: will-hwang <[email protected]>
(cherry picked from commit 97cf25c)
Signed-off-by: will-hwang <[email protected]>
will-hwang added a commit to will-hwang/neural-search that referenced this pull request Jan 8, 2025
* add hybrid search with rescore IT

Signed-off-by: will-hwang <[email protected]>

* remove rescore in hybrid search IT

Signed-off-by: will-hwang <[email protected]>

* remove previous version checks in build file

Signed-off-by: will-hwang <[email protected]>

* removing version checks only in rolling upgrade tests

Signed-off-by: will-hwang <[email protected]>

* remove newly added tests in restart test

Signed-off-by: will-hwang <[email protected]>

* Revert "remove newly added tests in restart test"

This reverts commit 0987831.

Signed-off-by: will-hwang <[email protected]>

---------

Signed-off-by: will-hwang <[email protected]>
(cherry picked from commit 97cf25c)
Signed-off-by: will-hwang <[email protected]>
heemin32 pushed a commit that referenced this pull request Jan 9, 2025
* add hybrid search with rescore IT (#1066)

* add hybrid search with rescore IT

Signed-off-by: will-hwang <[email protected]>

* remove rescore in hybrid search IT

Signed-off-by: will-hwang <[email protected]>

* remove previous version checks in build file

Signed-off-by: will-hwang <[email protected]>

* removing version checks only in rolling upgrade tests

Signed-off-by: will-hwang <[email protected]>

* remove newly added tests in restart test

Signed-off-by: will-hwang <[email protected]>

* Revert "remove newly added tests in restart test"

This reverts commit 0987831.

Signed-off-by: will-hwang <[email protected]>

---------

Signed-off-by: will-hwang <[email protected]>
(cherry picked from commit 97cf25c)
Signed-off-by: will-hwang <[email protected]>

* use the old contructor for neural query builder temporarily

Signed-off-by: will-hwang <[email protected]>

* use old contrutor for neural search builder temporarily

Signed-off-by: will-hwang <[email protected]>

* add version check for backport 2.x

Signed-off-by: will-hwang <[email protected]>

---------

Signed-off-by: will-hwang <[email protected]>
martin-gaievski pushed a commit that referenced this pull request Jan 10, 2025
* add hybrid search with rescore IT

Signed-off-by: will-hwang <[email protected]>

* remove rescore in hybrid search IT

Signed-off-by: will-hwang <[email protected]>

* remove previous version checks in build file

Signed-off-by: will-hwang <[email protected]>

* removing version checks only in rolling upgrade tests

Signed-off-by: will-hwang <[email protected]>

* remove newly added tests in restart test

Signed-off-by: will-hwang <[email protected]>

* Revert "remove newly added tests in restart test"

This reverts commit 0987831.

Signed-off-by: will-hwang <[email protected]>

---------

Signed-off-by: will-hwang <[email protected]>
martin-gaievski pushed a commit that referenced this pull request Jan 13, 2025
* add hybrid search with rescore IT

Signed-off-by: will-hwang <[email protected]>

* remove rescore in hybrid search IT

Signed-off-by: will-hwang <[email protected]>

* remove previous version checks in build file

Signed-off-by: will-hwang <[email protected]>

* removing version checks only in rolling upgrade tests

Signed-off-by: will-hwang <[email protected]>

* remove newly added tests in restart test

Signed-off-by: will-hwang <[email protected]>

* Revert "remove newly added tests in restart test"

This reverts commit 0987831.

Signed-off-by: will-hwang <[email protected]>

---------

Signed-off-by: will-hwang <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x Label will add auto workflow to backport PR to 2.x branch skip-changelog
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants