Add global and index level blocks to IndexSettings #35695
Pinging @elastic/es-distributed
@tlrx can you elaborate more about the failure? Sampling the cluster state may indeed not reflect the cluster state that was used to update the shards, but what are the steps that make this a problem? I looked at the linked issue and I don't see a clear description there either.
@bleskes sorry, I should have written down what Yannick and I discussed via another channel. The test that failed in #35597 is a rolling upgrade test of a 3-node cluster where each node is upgraded sequentially and tests are executed after each node upgrade. During the upgrade of a node the cluster has no master and the cluster state contains a "no master" global block. At some point during the test, a replica shard is promoted to primary. The incoming cluster state is processed by the The primary-replica synchronization is executed using a Before #35332 and because the resync action skips the reroute phase, the blocks were just never checked. I talked to @ywelsch about this and it reminded us of a conversation we had weeks ago when we discussed the necessity to make cluster blocks from the incoming cluster state visible to the
Thank you @tlrx. I agree that it will be good to expose the blocks from something the shard owns. That said, I think we need to be careful about what they mean exactly and when we expose them. For example, I think index settings are updated before the shard updates its internal state when a setting is changed. We want to be careful not to re-introduce the cluster-state-from-the-future problem we used to have, where we have to worry about the cluster state not being applied after we sample it. Did you guys discuss these semantics? Finally, regarding the specific issue here, I wonder if resync should honour blocks. It's a tricky thing: we are making sure shards are in sync, which is a good thing, even if the user marks the index as read only. We don't really need the master block as we validate the primary terms. Am I missing something?
We talked via another channel and we decided to keep the current behavior for now in order to not reintroduce a cluster-state-from-the-future problem and not break the current semantics. We decided to make it more obvious that the resync action should not be blocked at all because of its internal nature, and I opened #35795 for this.
This pull request adds a new `getIndexBlocks()` method to the `IndexSettings` class. This method returns a `ClusterBlocks` object that can be used to retrieve the current global level and index level blocks set on the `IndexShard` object which the index settings belong to.

While the purpose of such a method was discussed via another channel a few weeks ago, it resurfaced recently after the merge of #35332, in which we added a check for global/index blocks in the primary action of transport replication actions. This change caused some tests to fail on CI (see #35597): the `TransportResyncReplicationAction` failed and the replica was never promoted to primary before the test timed out. The resync failed because the primary action in `TransportResyncReplicationAction` checks blocks using the cluster state from the `ClusterService`, which is not yet updated and in the case of this test still contains a global "no master" block, whereas it should check against the blocks from the incoming cluster state that is not yet applied.

This pull request changes the `IndicesClusterStateService` so that blocks are updated and propagated to the `IndexSettings`. After #35332 has been recommitted (it was reverted to allow CI to pass), a follow-up PR will change how blocks are checked in `TransportReplicationAction` so that it uses blocks from `indexShard.indexSettings().getIndexBlocks()`.
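
To make the timing issue in the description easier to follow, here is a minimal, self-contained sketch of the idea. It does not use the Elasticsearch classes: `Blocks`, `AppliedState`, `Settings` and `simulate()` are hypothetical stand-ins invented for illustration, and only the method name `getIndexBlocks()` comes from the description. It models a block check that runs while an incoming cluster state is still being applied, once against the last applied state and once against the blocks already propagated to the index settings.

```java
import java.util.EnumSet;
import java.util.Set;

// Toy model only: none of these types are the real Elasticsearch classes.
public class IndexBlocksSketch {
    enum Block { NO_MASTER, INDEX_READ_ONLY }

    /** Immutable set of blocks, loosely modelled on the ClusterBlocks object mentioned above. */
    static final class Blocks {
        private final EnumSet<Block> blocks = EnumSet.noneOf(Block.class);
        Blocks(Set<Block> initial) { blocks.addAll(initial); }
        boolean has(Block b) { return blocks.contains(b); }
    }

    /** Hypothetical stand-in for the last applied cluster state, as seen through a cluster service. */
    static final class AppliedState {
        volatile Blocks blocks;
        AppliedState(Blocks b) { blocks = b; }
    }

    /** Hypothetical stand-in for per-index settings that carry the blocks pushed by the state applier. */
    static final class Settings {
        private volatile Blocks indexBlocks;
        Settings(Blocks b) { indexBlocks = b; }
        void updateIndexBlocks(Blocks b) { indexBlocks = b; }
        Blocks getIndexBlocks() { return indexBlocks; }
    }

    /** Returns {blockedViaAppliedState, blockedViaIndexSettings} as observed mid-application. */
    static boolean[] simulate() {
        // Before the update: no master, so both views carry the global block.
        Blocks noMaster = new Blocks(EnumSet.of(Block.NO_MASTER));
        AppliedState applied = new AppliedState(noMaster);
        Settings settings = new Settings(noMaster);

        // A new cluster state arrives after a master is elected: the global block is gone.
        Blocks incoming = new Blocks(EnumSet.noneOf(Block.class));

        // Step 1: the applier propagates the incoming blocks to the index settings.
        settings.updateIndexBlocks(incoming);

        // Step 2: the resync starts here, before the incoming state counts as applied.
        boolean viaApplied = applied.blocks.has(Block.NO_MASTER);
        boolean viaSettings = settings.getIndexBlocks().has(Block.NO_MASTER);

        // Step 3: only now does the applied state catch up.
        applied.blocks = incoming;
        return new boolean[] { viaApplied, viaSettings };
    }

    public static void main(String[] args) {
        boolean[] result = simulate();
        System.out.println("blocked via applied state:  " + result[0]); // true
        System.out.println("blocked via index settings: " + result[1]); // false
    }
}
```

In this toy model the check against the applied state still sees the stale "no master" block, while the check against the index settings does not. The same ordering is what raises the cluster-state-from-the-future concern in the conversation: the settings can expose blocks from a state that has not finished being applied.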