Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Making shard copy count a multiple of attribute count #3462

Merged
merged 4 commits into from
Jul 29, 2022

Conversation

gbbafna
Copy link
Collaborator

@gbbafna gbbafna commented May 27, 2022

Description

A boolean cluster level setting cluster.routing.allocation.awareness.balance which is false by default . When true, we would validate that total copies is always a multiple of awareness attribute value count . If not, we will throw a validation exception. If there are multiple awareness attributes, the balance needs to ensure that largest varying attribute of awareness_attribute is equally balance. For ex, if there are 2 Awareness Attributes, zones and rack ids, each having 2 possible values , total copies will to be multiple of 2.

Tested

  1. Create/Update Index APIs
  2. Snapshot restore
  3. Index Template creation.

Issues Resolved

#3461

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@gbbafna gbbafna requested review from a team and reta as code owners May 27, 2022 10:10
@gbbafna gbbafna changed the title Making all copies a multiple of attribute count Making shard copy count a multiple of attribute count May 27, 2022
@opensearch-ci-bot
Copy link
Collaborator

❌   Gradle Check failure 5870d5aca952b773ac4632ebf44f0093bc3193bb
Log 5628

Reports 5628

@opensearch-ci-bot
Copy link
Collaborator

❌   Gradle Check failure 7fb9e069ea6c1bdcc109505f33366645630cd018
Log 5853

Reports 5853

@dblock dblock requested a review from andrross June 13, 2022 23:40
@dblock
Copy link
Member

dblock commented Jun 13, 2022

@gbbafna Are you trying to finish this? Care to explain a bit more what this accomplishes above?

@gbbafna gbbafna marked this pull request as draft June 14, 2022 04:00
@gbbafna
Copy link
Collaborator Author

gbbafna commented Jun 14, 2022

@gbbafna Are you trying to finish this? Care to explain a bit more what this accomplishes above?

Hi @dblock , yes, I am working on this.

Please check the details in #3461 . Let me know on which part I need to give more context on .

@opensearch-ci-bot
Copy link
Collaborator

❌   Gradle Check failure 0f3ee95a0fa74f3c764aef072ab98f0855e03575
Log 5975

Reports 5975

@opensearch-ci-bot
Copy link
Collaborator

❌   Gradle Check failure db0cc2be50e27d2eafe05a1a8d88cca16610d159
Log 6020

Reports 6020

@opensearch-ci-bot
Copy link
Collaborator

❌   Gradle Check failure 870d6dc03ce34ae13537cc45c036cddb04512878
Log 6021

Reports 6021

@gbbafna gbbafna force-pushed the replica-count branch 2 times, most recently from 1e40a15 to 3328553 Compare June 15, 2022 09:15
@opensearch-ci-bot
Copy link
Collaborator

✅   Gradle Check success 332855374dbfcf2924bfb1c990ba1772a57ee0cc
Log 6023

Reports 6023

@opensearch-ci-bot
Copy link
Collaborator

✅   Gradle Check success 41e8297a97cbd3bd96c71675e10e99d19fad49b0
Log 6063

Reports 6063

@gbbafna gbbafna marked this pull request as ready for review June 17, 2022 04:12
@gbbafna gbbafna requested a review from Bukhtawar June 17, 2022 06:04
Comment on lines 1170 to 1180
int awarenessAttributes = this.awarenessReplicaBalance.awarenessAttributes();
if (awarenessAttributes > 1) {
if ((settings.getAsInt(IndexMetadata.SETTING_NUMBER_OF_REPLICAS, INDEX_NUMBER_OF_REPLICAS_SETTING.getDefault(Settings.EMPTY))
+ 1) % awarenessAttributes != 0) {
validationErrors.add("expected total copies needs to be a multiple of total awareness attributes " + awarenessAttributes);
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we also need to mention which attribute zone/rack etc?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. It will add the complexity of returning attribute as well from awarenessAttributes function , just for logging purpose. So I prefer not doing it . On seeing the error message with and without the attribute, user would anyways need to revisit awareness attributes values . Do you think this behavior is okay ?

Comment on lines 202 to 203
if (awarenessAttributes > 1) {
if ((updatedNumberOfReplicas + 1) % awarenessAttributes != 0) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we put the multiple logic as a Predicate in the AwarenessReplicaBalance

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. But we still need to expose awarenessAttributes() for the validation exception .

@gbbafna gbbafna requested a review from Bukhtawar June 27, 2022 07:50
@opensearch-ci-bot
Copy link
Collaborator

✅   Gradle Check success 8b401e5901a19de0bd5c36e7ac0798318e9255aa
Log 6358

Reports 6358

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@Bukhtawar
Copy link
Collaborator

Would be great to see rollover tests as well.

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@gbbafna
Copy link
Collaborator Author

gbbafna commented Jul 29, 2022

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

Signed-off-by: Gaurav Bafna <[email protected]>
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@Bukhtawar Bukhtawar merged commit 82ff463 into opensearch-project:main Jul 29, 2022
@Bukhtawar Bukhtawar added the backport 2.x Backport to 2.x branch label Jul 29, 2022
@opensearch-trigger-bot
Copy link
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-2.x 2.x
# Navigate to the new working tree
cd .worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-3462-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 82ff463fd1bc07c241c8674b72182f2875a60425
# Push it to GitHub
git push --set-upstream origin backport/backport-3462-to-2.x
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-3462-to-2.x.

gbbafna added a commit to gbbafna/OpenSearch that referenced this pull request Aug 1, 2022
…ject#3462)

* Making all the copies a multiple of attribute count

Signed-off-by: Gaurav Bafna <[email protected]>
(cherry picked from commit 82ff463)
gbbafna added a commit to gbbafna/OpenSearch that referenced this pull request Aug 2, 2022
…ject#3462)

* Making all the copies a multiple of attribute count

Signed-off-by: Gaurav Bafna <[email protected]>
(cherry picked from commit 82ff463)
Bukhtawar pushed a commit that referenced this pull request Aug 2, 2022
* [Backport 2.x] Making all the copies a multiple of attribute count

Signed-off-by: Gaurav Bafna <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x Backport to 2.x branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants