Relax ShardFollowTasksExecutor validation #60054

Merged 1 commit on Jul 27, 2020

@dnhatn dnhatn commented Jul 22, 2020

If a primary shard of a follower index is being relocated, we fail to create a follow task. This validation is too restrictive. Instead, we should only ensure that all primaries of the follower index are active.

Closes #59625
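
The difference between the two checks can be illustrated with a minimal sketch. It assumes a simplified model in which a shard counts as active when it is either started or relocating; the class, enum, and method names below are hypothetical and are not the actual Elasticsearch classes.

```java
// Simplified, hypothetical model of shard routing states; not the actual Elasticsearch classes.
class ShardStateSketch {
    enum State { UNASSIGNED, INITIALIZING, STARTED, RELOCATING }

    // Strict check: only a shard that is started (and not relocating) passes.
    static boolean started(State s) {
        return s == State.STARTED;
    }

    // Relaxed check: a relocating shard is still counted as active.
    static boolean active(State s) {
        return s == State.STARTED || s == State.RELOCATING;
    }

    // Mirrors the shape of the relaxed validation: reject only when the primary is not active.
    static void validate(State primary) {
        if (active(primary) == false) {
            throw new IllegalArgumentException("primary shard of follower index is not active: " + primary);
        }
    }

    public static void main(String[] args) {
        // Print how each state fares under the strict and the relaxed check.
        for (State s : State.values()) {
            System.out.println(s + " started=" + started(s) + " active=" + active(s));
        }
    }
}
```

In this model, a relocating primary fails the strict `started` check but passes the relaxed `active` check, which is the case the description refers to.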

@dnhatn dnhatn added the >bug, :Distributed Indexing/CCR, v8.0.0, v6.8.12, and v7.9.1 labels Jul 22, 2020
@dnhatn dnhatn requested a review from ywelsch July 22, 2020 15:34
@elasticmachine
Collaborator

Pinging @elastic/es-distributed (:Distributed/CCR)

@elasticmachine elasticmachine added the Team:Distributed (Obsolete) label Jul 22, 2020
IndexRoutingTable routingTable = clusterState.getRoutingTable().index(params.getFollowShardId().getIndex());
if (routingTable.shard(params.getFollowShardId().id()).primaryShard().started() == false) {
throw new IllegalArgumentException("Not all copies of follow shard are started");
final IndexRoutingTable routingTable = clusterState.getRoutingTable().index(params.getFollowShardId().getIndex());
Member Author

I am not sure if we should remove this validation. Should we be able to create a follow task while some follower shards are temporarily unavailable?

Contributor

Yeah, I'm also not sure what the point of this validation is at this specific point in the code (maybe @martijnvg can help with that). To be resilient, "put follow" and "resume follow" are forced to implement retry behavior upstream.
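
The kind of upstream retry mentioned in this comment can be pictured with a minimal sketch. It is an assumption for illustration only: the class name, the `Validation` interface, and the retry policy are hypothetical and do not reflect how "put follow" or "resume follow" are actually implemented.

```java
// Hypothetical caller-side retry; all names and the retry policy are assumptions for illustration.
class FollowRetrySketch {
    // Stand-in for a validation step that throws IllegalArgumentException when it fails.
    interface Validation { void run(); }

    // Runs the validation up to maxAttempts times and returns the number of attempts used.
    // Rethrows the last validation failure once the attempts are exhausted.
    static int retryUntilValid(Validation validation, int maxAttempts, long backoffMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IllegalArgumentException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                validation.run();
                return attempt;
            } catch (IllegalArgumentException e) {
                last = e;
                if (attempt < maxAttempts) {
                    try {
                        // Fixed pause between attempts; a real caller might use a different policy.
                        Thread.sleep(backoffMillis);
                    } catch (InterruptedException ie) {
                        // Restore the interrupt flag and stop retrying.
                        Thread.currentThread().interrupt();
                        throw new IllegalStateException("interrupted while waiting to retry", ie);
                    }
                }
            }
        }
        throw last;
    }
}
```

A stricter validation makes such a loop run more often, since a temporary condition like a relocating primary is reported as a failure that the caller has to wait out.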

Contributor

@ywelsch ywelsch left a comment

Let's merge this, and get some follow-up clarification on the raised points.

@dnhatn
Member Author

dnhatn commented Jul 27, 2020

Thanks Yannick.

@dnhatn dnhatn merged commit 5d8a6bb into elastic:master Jul 27, 2020
@dnhatn dnhatn deleted the relax-ccr-validate branch July 27, 2020 13:26
dnhatn added a commit that referenced this pull request Jul 27, 2020
If a primary shard of a follower index is being relocated, then we
will fail to create a follow-task. This validation is too restrictive.
We should ensure that all primaries of the follower index are active
instead.

Closes #59625
dnhatn added a commit that referenced this pull request Jul 27, 2020
dnhatn added a commit that referenced this pull request Jul 28, 2020
dnhatn added a commit that referenced this pull request Aug 11, 2020
dnhatn added a commit to dnhatn/elasticsearch that referenced this pull request Aug 20, 2021
elasticsearchmachine pushed a commit that referenced this pull request Aug 20, 2021
arteam added a commit to arteam/elasticsearch that referenced this pull request Aug 20, 2021
dnhatn added a commit to dnhatn/elasticsearch that referenced this pull request Aug 20, 2021
elasticsearchmachine pushed a commit that referenced this pull request Aug 21, 2021
* Remove AwaitsFix CcrRollingUpgradeIT (#76765)

Fixed in #60054

* Always enable soft-deletes in CcrRollingUpgradeIT (#76786)
arteam added a commit that referenced this pull request Aug 22, 2021
* Remove all @AwaitsFix CcrRollingUpgradeIT

See #76765 and #60054

* Use the wait_for_active_shards adopted from ES 8

* Update CcrRollingUpgradeIT.java

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Labels
>bug, :Distributed Indexing/CCR, Team:Distributed (Obsolete), v6.8.12, v7.9.1, v8.0.0-alpha1
Development

Successfully merging this pull request may close these issues.

CcrRollingUpgradeIT fails on CI
4 participants