Primary weight factor should swap primary and replica at the same time in small clusters in some cases #8060

Closed
kkewwei opened this issue Jun 14, 2023 · 6 comments
Labels
distributed framework enhancement Enhancement or improvement to existing feature or request

Comments

@kkewwei
Contributor

kkewwei commented Jun 14, 2023

Is your feature request related to a problem? Please describe.
I was very much looking forward to the primary weight factor introduced in #6017, but it does not seem to work in a small cluster. For example:
index1 settings:

 "number_of_replicas": "2",
 "number_of_shards": "3"

The cluster has three nodes (node0, node1, node2), and index1 is allocated as follows:
          node0  node1  node2
shard0    p      r      r
shard1    p      r      r
shard2    p      r      r
All 3 primary shards are on node0.

After applying the following setting:

PUT _cluster/settings
{
    "persistent": {
        "cluster.routing.allocation.balance.prefer_primary":"true"
    }
}

No primary rebalancing takes place, contrary to what I expected. The reason is that a copy of each shard is already allocated to every candidate target node.
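To make the situation concrete, here is a minimal Python sketch of the allocation above. It is a toy model, not OpenSearch code: `can_relocate` and `legal_primary_moves` are hypothetical helpers, and the only rule modelled is the assumption that a node may hold at most one copy of a given shard.

```python
# Toy model of the allocation above; this is NOT OpenSearch code.
# allocation[shard][node] is "p" (primary) or "r" (replica).
allocation = {
    "shard0": {"node0": "p", "node1": "r", "node2": "r"},
    "shard1": {"node0": "p", "node1": "r", "node2": "r"},
    "shard2": {"node0": "p", "node1": "r", "node2": "r"},
}
nodes = ["node0", "node1", "node2"]


def can_relocate(allocation, shard, target):
    """Assumed same-shard rule: a node may hold at most one copy of a shard."""
    return target not in allocation[shard]


def legal_primary_moves(allocation, nodes):
    """List every (shard, source, target) primary relocation the rule permits."""
    moves = []
    for shard, copies in allocation.items():
        source = next(n for n, role in copies.items() if role == "p")
        for target in nodes:
            if target != source and can_relocate(allocation, shard, target):
                moves.append((shard, source, target))
    return moves


print(legal_primary_moves(allocation, nodes))  # prints []
```

With three nodes and three copies per shard, every node already holds a copy of every shard, so the list of permitted primary relocations is empty. Adding a fourth, empty node to `nodes` would make relocations possible in this model.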

Describe the solution you'd like
There are good reasons to rebalance primary shards. When a primary needs to be rebalanced and the target node already holds a replica of the same shard, should we swap the primary and replica roles at the same time?
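The proposed swap can be sketched in a few lines of Python. This is only an illustration of the idea under stated assumptions, not how OpenSearch implements or would implement it: `swap_primary` and `primaries_per_node` are hypothetical helpers, and the swap is modelled as a pure role exchange with no data movement.

```python
from collections import Counter

# Toy model of the index1 allocation from this issue; NOT OpenSearch code.
allocation = {
    "shard0": {"node0": "p", "node1": "r", "node2": "r"},
    "shard1": {"node0": "p", "node1": "r", "node2": "r"},
    "shard2": {"node0": "p", "node1": "r", "node2": "r"},
}


def primaries_per_node(allocation):
    """Count how many primaries each node holds."""
    counts = Counter()
    for copies in allocation.values():
        for node, role in copies.items():
            if role == "p":
                counts[node] += 1
    return counts


def swap_primary(allocation, shard, target):
    """Hypothetical role swap: promote the replica on target, demote the current primary."""
    copies = allocation[shard]
    if copies.get(target) != "r":
        raise ValueError(f"{target} holds no replica of {shard}")
    source = next(n for n, role in copies.items() if role == "p")
    copies[source], copies[target] = "r", "p"


swap_primary(allocation, "shard1", "node1")
swap_primary(allocation, "shard2", "node2")
print(dict(primaries_per_node(allocation)))  # prints {'node0': 1, 'node1': 1, 'node2': 1}
```

Two role swaps are enough to spread the three primaries evenly in this model, and no node ever holds two copies of the same shard.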

In addition, I'm a little confused about why we do not introduce cluster.primary.shard.balance.constraint in rebalancing. The feature seems useful in certain scenarios.

@kkewwei kkewwei added enhancement Enhancement or improvement to existing feature or request untriaged labels Jun 14, 2023
@Bukhtawar
Collaborator

cc: @dreamer-89

@dreamer-89
Member

Thank you @kkewwei for opening this issue.

There are good reasons to rebalance primary shards. When a primary needs to be rebalanced and the target node already holds a replica of the same shard, should we swap the primary and replica roles at the same time?

There are a few limitations today that prevent primary shard balancing. One of them prevents a primary shard from moving when the target node already contains a replica copy of the same shard. This check is known as SameShardAllocationDecider. We have an open issue to fix this, tracked in #6481. If you are interested, please feel free to pick it up and work on it.

In addition, I'm a little confused about why we do not introduce cluster.primary.shard.balance.constraint in rebalancing. The feature seems useful in certain scenarios.

We actually do use cluster.primary.shard.balance.constraint during shard rebalancing. We use the same constants for both allocation and rebalance actions. This change was introduced in PR #6422.

@dreamer-89
Member

dreamer-89 commented Jun 15, 2023

@kkewwei: Closing this issue in favour of #6481. We can continue the discussion there.

@kkewwei
Contributor Author

kkewwei commented Jun 15, 2023

@dreamer-89 sorry to reopen the issue.
I'm still a little confused about how cluster.primary.shard.balance.constraint is used during shard rebalancing. I tested a local scenario, and it doesn't seem to work:
index* settings:

 "number_of_replicas": "1",
 "number_of_shards": "1"

The cluster has three nodes (node0, node1, node2). Each index* has one primary and one replica, allocated as follows:
          node0  node1  node2
index0    p      r
index1    p             r
index2           p      r
node0 holds two primary shards even with cluster.routing.allocation.balance.prefer_primary=true.

In RebalanceConstraints, we don't set cluster.primary.shard.balance.constraint
https://github.com/opensearch-project/OpenSearch/blob/main/server/src/main/java/org/opensearch/cluster/routing/allocation/RebalanceConstraints.java#L32

Should we add cluster.primary.shard.balance.constraint to RebalanceConstraints, or am I missing something?

@kkewwei
Contributor Author

kkewwei commented Jun 16, 2023

@dreamer-89 please confirm this when you have time.

@dreamer-89
Member

dreamer-89 commented Jun 16, 2023

Thank you @kkewwei for the comment, the deep dive, and for sharing your use case in detail.

Today, primary shard rebalancing is performed at the index level, i.e. primary shards belonging to the same index are distributed equally across nodes. The use case you shared above relates to primary balance across all indices. This is a separate problem that we have yet to solve; it is tracked in #6642.
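The difference between the two notions of balance can be illustrated with the three-index scenario from this thread. The sketch below is a toy model, not the OpenSearch constraint logic: `is_balanced` uses an assumed criterion (no node holds more than the ceiling of the average primary count), and all helper names are hypothetical.

```python
import math
from collections import Counter, defaultdict

nodes = ["node0", "node1", "node2"]

# (index, node, role) for the three single-shard indices described above.
copies = [
    ("index0", "node0", "p"), ("index0", "node1", "r"),
    ("index1", "node0", "p"), ("index1", "node2", "r"),
    ("index2", "node1", "p"), ("index2", "node2", "r"),
]


def per_index_primaries(copies):
    """Primary count per node, tracked separately for each index."""
    counts = defaultdict(Counter)
    for index, node, role in copies:
        if role == "p":
            counts[index][node] += 1
    return counts


def cluster_primaries(copies):
    """Primary count per node across all indices."""
    return Counter(node for _, node, role in copies if role == "p")


def is_balanced(counter, nodes):
    """Assumed criterion: no node exceeds ceil(total primaries / node count)."""
    total = sum(counter.values())
    return max(counter.values(), default=0) <= math.ceil(total / len(nodes))


for index, counter in per_index_primaries(copies).items():
    print(index, is_balanced(counter, nodes))      # True for every index
print("cluster", is_balanced(cluster_primaries(copies), nodes))  # False
```

Each index is balanced on its own because it has only one primary, yet node0 still holds two of the three primaries in the cluster. A per-index check therefore sees nothing to fix in this layout.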

5 participants