Use ReplicationFailedException instead of OpensearchException in Repl… #4725

Merged
merged 9 commits into from
Oct 13, 2022

Contributor

@ayushKataria ayushKataria commented Oct 10, 2022

…icationTarget

Signed-off-by: Ayush Kataria [email protected]

Description

Uses the narrower ReplicationFailedException instead of OpenSearchException in ReplicationTarget and ReplicationListener.
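To illustrate what narrowing the exception type buys, here is a minimal sketch. The classes below are hypothetical stand-ins written for this example, not the real OpenSearch classes, and the listener interface is simplified to a single callback. The point it tries to show is that once the callback accepts only the narrower type, passing a broader exception is rejected at compile time.

```java
import java.util.ArrayList;
import java.util.List;

final class NarrowListenerSketch {

    // Hypothetical stand-in for the broad base exception type.
    static class BaseException extends RuntimeException {
        BaseException(String message) {
            super(message);
        }
    }

    // Hypothetical stand-in for the narrower replication failure type.
    static class ReplicationFailedException extends BaseException {
        ReplicationFailedException(String message) {
            super(message);
        }
    }

    // Simplified listener: the failure callback accepts only the narrower type.
    interface ReplicationListener {
        void onFailure(ReplicationFailedException e, boolean sendShardFailure);
    }

    // Feeds each failure to a listener and returns the messages it observed.
    static List<String> collectFailures(List<ReplicationFailedException> failures) {
        List<String> seen = new ArrayList<>();
        ReplicationListener listener = (e, sendShardFailure) -> seen.add(e.getMessage());
        for (ReplicationFailedException failure : failures) {
            listener.onFailure(failure, false);
        }
        // listener.onFailure(new BaseException("too broad"), false); // would not compile
        return seen;
    }

    public static void main(String[] args) {
        List<ReplicationFailedException> failures = List.of(
            new ReplicationFailedException("replication failed")
        );
        System.out.println(collectFailures(failures));
    }
}
```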

Issues Resolved

[List any issues this PR will resolve]

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@ayushKataria ayushKataria requested review from a team and reta as code owners October 10, 2022 13:13
Signed-off-by: Ayush Kataria <[email protected]>
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

Signed-off-by: Ayush Kataria <[email protected]>
Signed-off-by: Ayush Kataria <[email protected]>
Signed-off-by: Ayush Kataria <[email protected]>

Member

@mch2 mch2 left a comment

Thanks for the change @ayushKataria. Some minor comments on what I think is causing the test failures.

new CancellableThreads.ExecutionCancelledException("replication was canceled reason [" + reason + "]");
final ReplicationFailedException executionCancelledException = new ReplicationFailedException(
"replication was canceled reason [" + reason + "]"
);
notifyListener(executionCancelledException, false);
Member

With notifyListener now accepting a ReplicationFailedException, we should wrap the cancellationException in ReplicationFailedException.

        cancellableThreads.setOnCancel((reason, beforeCancelEx) -> {
            // This method only executes when cancellation is triggered by this node and caught by a call to checkForCancel,
            // SegmentReplicationSource does not share CancellableThreads.
            final CancellableThreads.ExecutionCancelledException executionCancelledException =
                new CancellableThreads.ExecutionCancelledException("replication was canceled reason [" + reason + "]");
            notifyListener(new ReplicationFailedException("Segment replication failed to complete", executionCancelledException), false);
            throw executionCancelledException;
        });

There is some special logic above in SegmentReplicationTarget that overrides notifyListener to unwrap the failure and mark it as cancelled. However, this block never triggers if the cancellation originates from the primary, which is why that special logic is not in this setOnCancel block.
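The wrap-then-unwrap pattern described in this comment can be sketched roughly as follows. This is only an illustration under stated assumptions: the exception classes are hypothetical stand-ins for CancellableThreads.ExecutionCancelledException and ReplicationFailedException, `findCause` is a made-up helper that walks the cause chain in the spirit of ExceptionsHelper.unwrap (the real helper may behave differently), and the stage is modelled as a plain string.

```java
import java.util.Optional;

final class CancellationWrapSketch {

    // Hypothetical stand-in for CancellableThreads.ExecutionCancelledException.
    static class ExecutionCancelledException extends RuntimeException {
        ExecutionCancelledException(String message) {
            super(message);
        }
    }

    // Hypothetical stand-in for ReplicationFailedException.
    static class ReplicationFailedException extends RuntimeException {
        ReplicationFailedException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Walks the cause chain and returns the first throwable of the requested type.
    // The depth limit guards against cyclic cause chains.
    static <T extends Throwable> Optional<T> findCause(Throwable t, Class<T> type) {
        int depth = 0;
        for (Throwable cur = t; cur != null && depth < 10; cur = cur.getCause(), depth++) {
            if (type.isInstance(cur)) {
                return Optional.of(type.cast(cur));
            }
        }
        return Optional.empty();
    }

    // Decides the stage from the failure alone; no cast or re-wrap of the exception is needed.
    static String stageFor(ReplicationFailedException e) {
        return findCause(e, ExecutionCancelledException.class).isPresent() ? "CANCELLED" : "FAILED";
    }

    public static void main(String[] args) {
        ReplicationFailedException wrapped = new ReplicationFailedException(
            "Segment replication failed to complete",
            new ExecutionCancelledException("replication was canceled reason [shard closed]")
        );
        System.out.println(stageFor(wrapped)); // prints CANCELLED

        ReplicationFailedException other = new ReplicationFailedException(
            "Segment replication failed to complete",
            new IllegalStateException("unrelated failure")
        );
        System.out.println(stageFor(other)); // prints FAILED
    }
}
```

Because the listener receives the wrapper unchanged, the cancellation remains reachable through `getCause()` for any caller that wants to inspect it.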

Contributor Author

Ahh, thanks for the help. For some reason I was not able to figure out the trigger exactly.

// Cancellations still are passed to our SegmentReplicationListner as failures, if we have failed because of cancellation
// update the stage.
final Throwable cancelledException = ExceptionsHelper.unwrap(e, CancellableThreads.ExecutionCancelledException.class);
if (cancelledException != null) {
state.setStage(SegmentReplicationState.Stage.CANCELLED);
listener.onFailure(state(), (CancellableThreads.ExecutionCancelledException) cancelledException, sendShardFailure);
listener.onFailure(state(), new ReplicationFailedException(indexShard, cancelledException), sendShardFailure);
Member

A nit - I don't think we need to cast or re-wrap the exception here.

    @Override
    public void notifyListener(ReplicationFailedException e, boolean sendShardFailure) {
        // Cancellations still are passed to our SegmentReplicationListner as failures, if we have failed because of cancellation
        // update the stage.
        if (ExceptionsHelper.unwrap(e, CancellableThreads.ExecutionCancelledException.class) != null) {
            state.setStage(SegmentReplicationState.Stage.CANCELLED);
        }
        listener.onFailure(state(), e, sendShardFailure);
    }

* @param e exception that encapsulates the failure
* @param sendShardFailure indicates whether to notify the master of the shard failure
*/
public void fail(RecoveryFailedException e, boolean sendShardFailure) {
Member

With RecoveryFailedException extending ReplicationFailedException I don't think we need to override here in RecoveryTarget?

Contributor Author

Yeah, makes sense. This is redundant now, so will remove it.
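A small sketch of why the override becomes redundant, assuming only what the comment above states (that RecoveryFailedException extends ReplicationFailedException). The classes here are hypothetical stand-ins and `lastFailure` is an invented field used only to make the call observable: a method declared on the parent with the parent exception type already accepts the subclass exception, so the subclass target inherits a usable `fail`.

```java
final class InheritedFailSketch {

    // Hypothetical stand-in for ReplicationFailedException.
    static class ReplicationFailedException extends RuntimeException {
        ReplicationFailedException(String message) {
            super(message);
        }
    }

    // Hypothetical stand-in; extends the type above, as described in the review comment.
    static class RecoveryFailedException extends ReplicationFailedException {
        RecoveryFailedException(String message) {
            super(message);
        }
    }

    // Simplified parent target that declares fail() once, against the parent exception type.
    static class ReplicationTarget {
        String lastFailure; // invented for this sketch, records what fail() received

        void fail(ReplicationFailedException e, boolean sendShardFailure) {
            lastFailure = e.getClass().getSimpleName() + ":" + sendShardFailure;
        }
    }

    // No fail(RecoveryFailedException, boolean) override here: the inherited method already applies.
    static class RecoveryTarget extends ReplicationTarget {
    }

    public static void main(String[] args) {
        RecoveryTarget target = new RecoveryTarget();
        target.fail(new RecoveryFailedException("recovery failed"), true);
        System.out.println(target.lastFailure); // prints RecoveryFailedException:true
    }
}
```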

@codecov-commenter

codecov-commenter commented Oct 11, 2022

Codecov Report

Merging #4725 (f6c25fe) into main (d15795a) will decrease coverage by 0.00%.
The diff coverage is 46.69%.

@@             Coverage Diff              @@
##               main    #4725      +/-   ##
============================================
- Coverage     70.66%   70.65%   -0.01%     
- Complexity    57578    57594      +16     
============================================
  Files          4661     4669       +8     
  Lines        276662   276870     +208     
  Branches      40325    40346      +21     
============================================
+ Hits         195501   195624     +123     
- Misses        64926    64949      +23     
- Partials      16235    16297      +62     
Impacted Files Coverage Δ
.../delete/DeleteDecommissionStateRequestBuilder.java 0.00% <0.00%> (ø)
...apshots/restore/RestoreSnapshotRequestBuilder.java 15.78% <0.00%> (-0.88%) ⬇️
.../org/opensearch/client/support/AbstractClient.java 32.61% <0.00%> (-0.32%) ⬇️
.../java/org/opensearch/common/util/FeatureFlags.java 50.00% <ø> (ø)
...ateway/TransportNodesListGatewayStartedShards.java 51.00% <0.00%> (-0.68%) ⬇️
...h/index/shard/RemoveCorruptedShardDataCommand.java 81.51% <ø> (ø)
...java/org/opensearch/index/shard/StoreRecovery.java 67.88% <ø> (+0.01%) ⬆️
...h/index/store/InMemoryRemoteSnapshotDirectory.java 0.00% <0.00%> (ø)
...in/java/org/opensearch/indices/IndicesService.java 69.65% <0.00%> (-0.25%) ⬇️
...arch/indices/recovery/RecoveryFailedException.java 100.00% <ø> (ø)
... and 514 more

Member

@mch2 mch2 left a comment

LGTM! Thanks for the change @ayushKataria!

@mch2 mch2 merged commit 6bae823 into opensearch-project:main Oct 13, 2022
ashking94 pushed a commit to ashking94/OpenSearch that referenced this pull request Nov 7, 2022
opensearch-project#4725)

* Use ReplicationFailedException instead of OpensearchException in ReplicationTarget

Signed-off-by: Ayush Kataria <[email protected]>

* CHANGELOG.md updated

Signed-off-by: Ayush Kataria <[email protected]>

* test fixes

Signed-off-by: Ayush Kataria <[email protected]>

* spotless fix

Signed-off-by: Ayush Kataria <[email protected]>

* spotless fix

Signed-off-by: Ayush Kataria <[email protected]>

* fixes for failing test as suggested in PR comments

Signed-off-by: Ayush Kataria <[email protected]>

Signed-off-by: Ayush Kataria <[email protected]>
@dreamer-89 dreamer-89 added the backport 2.x Backport to 2.x branch label Jan 20, 2023
@opensearch-trigger-bot
Copy link
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-4725-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 6bae823cefad7373a748ce833ce75d80f48b380b
# Push it to GitHub
git push --set-upstream origin backport/backport-4725-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-4725-to-2.x.

dreamer-89 pushed a commit to dreamer-89/OpenSearch that referenced this pull request Jan 20, 2023
opensearch-project#4725)

* Use ReplicationFailedException instead of OpensearchException in ReplicationTarget

Signed-off-by: Ayush Kataria <[email protected]>

* CHANGELOG.md updated

Signed-off-by: Ayush Kataria <[email protected]>

* test fixes

Signed-off-by: Ayush Kataria <[email protected]>

* spotless fix

Signed-off-by: Ayush Kataria <[email protected]>

* spotless fix

Signed-off-by: Ayush Kataria <[email protected]>

* fixes for failing test as suggested in PR comments

Signed-off-by: Ayush Kataria <[email protected]>

Signed-off-by: Ayush Kataria <[email protected]>
dreamer-89 pushed a commit to dreamer-89/OpenSearch that referenced this pull request Jan 20, 2023
opensearch-project#4725)

* Use ReplicationFailedException instead of OpensearchException in ReplicationTarget

Signed-off-by: Ayush Kataria <[email protected]>

* CHANGELOG.md updated

Signed-off-by: Ayush Kataria <[email protected]>

* test fixes

Signed-off-by: Ayush Kataria <[email protected]>

* spotless fix

Signed-off-by: Ayush Kataria <[email protected]>

* spotless fix

Signed-off-by: Ayush Kataria <[email protected]>

* fixes for failing test as suggested in PR comments

Signed-off-by: Ayush Kataria <[email protected]>

Signed-off-by: Ayush Kataria <[email protected]>
Signed-off-by: Suraj Singh <[email protected]>
dreamer-89 added a commit that referenced this pull request Jan 20, 2023
…icationTarget (#5955)

* Use ReplicationFailedException instead of OpensearchException in Repl… (#4725)

* Use ReplicationFailedException instead of OpensearchException in ReplicationTarget

Signed-off-by: Ayush Kataria <[email protected]>

* CHANGELOG.md updated

Signed-off-by: Ayush Kataria <[email protected]>

* test fixes

Signed-off-by: Ayush Kataria <[email protected]>

* spotless fix

Signed-off-by: Ayush Kataria <[email protected]>

* spotless fix

Signed-off-by: Ayush Kataria <[email protected]>

* fixes for failing test as suggested in PR comments

Signed-off-by: Ayush Kataria <[email protected]>

Signed-off-by: Ayush Kataria <[email protected]>
Signed-off-by: Suraj Singh <[email protected]>

* Update SegmentReplicationListener to use ReplicationFailedException

Signed-off-by: Suraj Singh <[email protected]>

* Spotless fix

Signed-off-by: Suraj Singh <[email protected]>

Signed-off-by: Ayush Kataria <[email protected]>
Signed-off-by: Suraj Singh <[email protected]>
Co-authored-by: Ayush Kataria <[email protected]>
dreamer-89 added a commit to dreamer-89/OpenSearch that referenced this pull request Jan 20, 2023
kotwanikunal pushed a commit that referenced this pull request Jan 20, 2023
kotwanikunal pushed a commit that referenced this pull request Jan 25, 2023