Implement Segment replication Backpressure #6563
Gradle Check (Jenkins) Run Completed with:
Codecov Report
@@ Coverage Diff @@
##               main    #6563     +/-   ##
============================================
- Coverage     71.22%   70.68%   -0.55%
+ Complexity    59521    59129     -392
============================================
  Files          4803     4808       +5
  Lines        283208   283449     +241
  Branches      40842    40868      +26
============================================
- Hits         201712   200348    -1364
- Misses        65266    66608    +1342
- Partials      16230    16493     +263
... and 460 files with indirect coverage changes
server/src/main/java/org/opensearch/index/SegmentReplicationPressureService.java
4,
1,
Setting.Property.Dynamic,
Setting.Property.NodeScope
Should this (and the other settings) be index scoped (IndexScope)?
This was my thinking initially, but I felt it would get a bit difficult to manage. I think we can start with node scope and extend to index scope if the need arises?
Maybe we can rename the setting constants here to reflect the cluster or node scope, e.g. index.segrep.pressure.checkpoint.limit -> node.segrep....?
Ack, good catch. I've just removed the index. prefix, e.g. segrep.pressure.checkpoint.limit
This PR introduces new mechanisms to keep track of the current replicas within a replication group and apply backpressure if they fall too far behind. Writes will be rejected under the following conditions:
1. More than half (default setting) of the replication group is 'stale'. Defined by setting MAX_ALLOWED_STALE_SHARDS.
2. A replica is stale if it is behind more than MAX_INDEXING_CHECKPOINTS, default 4 AND its current replication lag is over MAX_REPLICATION_TIME_SETTING, default 5 minutes.
This PR intentionally implements rejections only for index operations, allowing other TransportWriteActions to succeed, TransportResyncReplicationAction and RetentionLeaseSyncAction. Blocking these requests will fail recoveries as new nodes are added.
Signed-off-by: Marc Handalian <[email protected]>
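The staleness rule in the commit message above can be sketched in plain Java. This is only an illustration under stated assumptions, not the code from the PR: the class ReplicaState, the method names, and the choice to compute the stale fraction over the list of replicas passed in are all hypothetical; only the default values (4 checkpoints, 5 minutes, more than half) come from the text.

```java
import java.time.Duration;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; names below are hypothetical, not OpenSearch APIs.
final class SegRepPressureSketch {
    // Defaults quoted in the commit message.
    static final int MAX_INDEXING_CHECKPOINTS = 4;
    static final Duration MAX_REPLICATION_TIME = Duration.ofMinutes(5);
    static final double MAX_ALLOWED_STALE_SHARDS = 0.5; // "more than half"

    // Hypothetical snapshot of how far one replica is behind the primary.
    static final class ReplicaState {
        final int checkpointsBehind;
        final Duration replicationLag;

        ReplicaState(int checkpointsBehind, Duration replicationLag) {
            this.checkpointsBehind = checkpointsBehind;
            this.replicationLag = replicationLag;
        }
    }

    // A replica is stale only if BOTH limits are exceeded.
    static boolean isStale(ReplicaState replica) {
        return replica.checkpointsBehind > MAX_INDEXING_CHECKPOINTS
            && replica.replicationLag.compareTo(MAX_REPLICATION_TIME) > 0;
    }

    // Assumption: the stale fraction is computed over the given replicas.
    static boolean shouldRejectWrites(List<ReplicaState> replicas) {
        if (replicas.isEmpty()) {
            return false;
        }
        long stale = replicas.stream().filter(SegRepPressureSketch::isStale).count();
        return (double) stale / replicas.size() > MAX_ALLOWED_STALE_SHARDS;
    }

    public static void main(String[] args) {
        ReplicaState lagging = new ReplicaState(5, Duration.ofMinutes(6));
        ReplicaState healthy = new ReplicaState(1, Duration.ofSeconds(10));
        System.out.println(shouldRejectWrites(Arrays.asList(lagging, lagging, healthy)));
        System.out.println(shouldRejectWrites(Arrays.asList(lagging, healthy)));
    }
}
```

Note that a replica many checkpoints behind but with a short lag is not counted as stale in this sketch, which matches the AND in condition 2.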
Force-pushed a rebase from main.
Spotless check failure.
The backport to
To backport manually, run these commands in your terminal:
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-6563-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 6babc087f7bd5774b97e67f9e386187fe0db3ecb
# Push it to GitHub
git push --set-upstream origin backport/backport-6563-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/backport-2.x
Then, create a pull request where the
* Add Segment Replication backpressure.
* Add changelog
* Fix test class to match naming conventions.
* PR feedback.
* Change setting keys to remove index scope.
Signed-off-by: Marc Handalian <[email protected]>
* Implement Segment replication Backpressure (#6563)
* Fix Xcontent imports.
Signed-off-by: Marc Handalian <[email protected]>
Description
This PR is a re-cut of #6520 that includes implementing backpressure and removes the addition of these metrics to NodeStats API. This is to make it easier to see in this PR how this will be used to implement pressure. The metrics additions will be in a separate change.
This PR adds backpressure for index operations when Segment Replication is enabled, to prevent lagging replicas from falling too far behind. Writes will be rejected under the following conditions:
1. More than half (default setting) of the replication group is 'stale', as defined by the setting MAX_ALLOWED_STALE_SHARDS.
2. A replica is stale if it is more than MAX_INDEXING_CHECKPOINTS behind (default 4) AND its current replication lag is over MAX_REPLICATION_TIME_SETTING (default 5 minutes).
This PR intentionally implements rejections only for index operations, allowing other TransportWriteActions, such as TransportResyncReplicationAction and RetentionLeaseSyncAction, to succeed.
Blocking these requests would fail recoveries as new nodes are added.
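The decision to reject only index operations can be pictured as a small gate in front of the write path. The following is a hedged sketch, not the PR's implementation: the WriteKind enum, the RejectedWriteException class, and the admit method are hypothetical names introduced for illustration, and the boolean flag stands in for whatever pressure check the service actually performs.

```java
// Illustrative sketch only; all names below are hypothetical.
final class WriteGateSketch {
    // Hypothetical classification of write actions mentioned in the description.
    enum WriteKind { INDEX, RESYNC, RETENTION_LEASE_SYNC }

    // Hypothetical exception type signalling a rejected write.
    static final class RejectedWriteException extends RuntimeException {
        RejectedWriteException(String message) {
            super(message);
        }
    }

    // Only index operations are subject to backpressure; resync and
    // retention lease sync must pass so that recoveries can complete.
    static boolean isSubjectToBackpressure(WriteKind kind) {
        return kind == WriteKind.INDEX;
    }

    // Throws if the group is under pressure and the operation is an index write.
    static void admit(WriteKind kind, boolean groupUnderPressure) {
        if (groupUnderPressure && isSubjectToBackpressure(kind)) {
            throw new RejectedWriteException("rejecting " + kind + ": replicas are too far behind");
        }
    }

    public static void main(String[] args) {
        admit(WriteKind.RESYNC, true);               // passes even under pressure
        admit(WriteKind.RETENTION_LEASE_SYNC, true); // passes even under pressure
        try {
            admit(WriteKind.INDEX, true);
        } catch (RejectedWriteException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Keeping the gate keyed on the operation kind, rather than on the shared write-action base class, is what lets recovery-related requests through while client indexing is pushed back.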
Issues Resolved
#4478
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.