Add assertBusy for testMappingVersionAfterDynamicMappingUpdate #38579

Closed

dakrone
Member

@dakrone dakrone commented Feb 7, 2019

This assertBusy is necessary. When indexing a document requires a mapping
update, the document is tried and rejected (because a mapping update is
required), and a mapping update is sent off. The document is then immediately
retried to see whether the mapping change has already taken effect; if it has,
indexing does not wait for the next cluster state before moving ahead. In very
rare cases this immediate retry succeeds, so the indexing request completes
(because it was successful) before the new cluster state has been fully
propagated. In that case the test needs to wait: the mapping version will
eventually be updated, it just hasn't been updated yet.
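
To illustrate the kind of waiting this relies on, here is a minimal stand-alone sketch of a poll-until-the-assertion-passes helper. It is not the Elasticsearch test framework's implementation: the class `BusyAssert`, the `Check` interface, the backoff values, and the `AtomicLong` standing in for the mapping version are all assumptions made for the example.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-alone helper that retries an assertion until it passes or times out. */
final class BusyAssert {

    /** An assertion body that may throw AssertionError until the condition holds. */
    @FunctionalInterface
    interface Check {
        void run() throws Exception;
    }

    /**
     * Re-runs the check, sleeping between attempts, until it passes or the
     * timeout elapses. On timeout the most recent AssertionError is rethrown.
     */
    static void assertBusy(Check check, long maxWait, TimeUnit unit) throws Exception {
        long deadline = System.nanoTime() + unit.toNanos(maxWait);
        long sleepMillis = 1;
        while (true) {
            try {
                check.run();
                return; // the assertion passed
            } catch (AssertionError e) {
                if (System.nanoTime() >= deadline) {
                    throw e; // give up and surface the last failure
                }
            }
            Thread.sleep(sleepMillis);
            sleepMillis = Math.min(sleepMillis * 2, 100); // capped backoff (assumed values)
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the mapping version held in the cluster state (assumption).
        AtomicLong mappingVersion = new AtomicLong(1);
        long previousVersion = mappingVersion.get();

        // Simulate the new cluster state being applied shortly after indexing returns.
        Thread applier = new Thread(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            mappingVersion.incrementAndGet();
        });
        applier.start();

        // A plain assertion here could fail; polling tolerates the short delay.
        assertBusy(() -> {
            if (mappingVersion.get() != previousVersion + 1) {
                throw new AssertionError("mapping version not updated yet");
            }
        }, 10, TimeUnit.SECONDS);

        applier.join();
        System.out.println("mapping version is now " + mappingVersion.get());
    }
}
```

The point of the sketch is only that the assertion is eventually true, so retrying it for a bounded time makes the test deterministic without changing what it checks.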

The first mapping-update-necessary check:

if (result.getResultType() == Engine.Result.Type.MAPPING_UPDATE_REQUIRED) {

Followed immediately by the mapping update:

mappingUpdater.accept(result.getRequiredMappingUpdate());

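And then the immediate retry:

https://github.com/elastic/elasticsearch/blob/622a7f1e207a552af56fec993045286abc3839e9/server/src/main/java/org/elasticsearch/action/bulk/TransportShardBulkAction.java#L499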

If the immediate retry fails (about 99.99% of the time), the context is
marked as needing to wait for a new cluster state before proceeding:

if (result.getResultType() == Engine.Result.Type.MAPPING_UPDATE_REQUIRED) {
    // double mapping update. We assume that the successful mapping update wasn't yet processed on the node
    // and retry the entire request again.
    context.markAsRequiringMappingUpdate();

In the rare case (about 0.01% of runs) the immediate retry succeeds, causing the test to fail.
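
The three possible paths through this flow can be laid out in a small toy model. This is only a sketch of the control flow described above, not the real TransportShardBulkAction: `MappingRetryModel`, `Outcome`, and the scripted list of results are hypothetical names introduced for illustration.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Toy model of the try / update mapping / immediate retry flow (hypothetical names). */
final class MappingRetryModel {

    enum ResultType { SUCCESS, MAPPING_UPDATE_REQUIRED }

    enum Outcome {
        INDEXED_DIRECTLY,           // no mapping update was needed
        INDEXED_ON_IMMEDIATE_RETRY, // rare path: request completes without waiting
        WAIT_FOR_CLUSTER_STATE      // common path: wait for the new cluster state
    }

    /**
     * Runs one indexing attempt. The scripted results supply what the first try
     * and, if needed, the immediate retry would return.
     */
    static Outcome index(List<ResultType> scriptedResults) {
        Deque<ResultType> results = new ArrayDeque<>(scriptedResults);

        ResultType result = results.remove(); // first try
        if (result != ResultType.MAPPING_UPDATE_REQUIRED) {
            return Outcome.INDEXED_DIRECTLY;
        }
        // The mapping update would be submitted here (omitted in this model).

        result = results.remove(); // immediate retry
        if (result == ResultType.MAPPING_UPDATE_REQUIRED) {
            // The update has not been processed on this node yet.
            return Outcome.WAIT_FOR_CLUSTER_STATE;
        }
        return Outcome.INDEXED_ON_IMMEDIATE_RETRY;
    }

    public static void main(String[] args) {
        System.out.println(index(Arrays.asList(
            ResultType.MAPPING_UPDATE_REQUIRED, ResultType.MAPPING_UPDATE_REQUIRED)));
        System.out.println(index(Arrays.asList(
            ResultType.MAPPING_UPDATE_REQUIRED, ResultType.SUCCESS)));
    }
}
```

Only the `INDEXED_ON_IMMEDIATE_RETRY` path lets the request return before the mapping version is visible in the cluster state, which is the window the test trips over.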

I was able to reproduce this failure about once every 10,000 test runs. With
the assertBusy in place I ran the test 100,000 times with no failures.

Resolves #38428

@dakrone dakrone added the >test, :Distributed Indexing/CRUD, v7.0.0, v6.7.0, v8.0.0, and v7.2.0 labels Feb 7, 2019
@elasticmachine
Collaborator

Pinging @elastic/es-distributed

@jakelandis jakelandis self-requested a review February 11, 2019 15:20
createIndex("test", client().admin().indices().prepareCreate("test").addMapping("type"));
final ClusterService clusterService = getInstanceFromNode(ClusterService.class);
final long previousVersion = clusterService.state().metaData().index("test").getMappingVersion();
client().prepareIndex("test", "type", "1").setSource("field", "text").get();
assertThat(clusterService.state().metaData().index("test").getMappingVersion(), equalTo(1 + previousVersion));
// This assertBusy is necessary. When a mapping update is needed as a document is indexed,
Contributor

Thanks for the detailed message, TIL. However, I think the commit message is sufficient for the detailed explanation. Perhaps a shorter version like:
//ensure that cluster state has been updated

Also, nitpick: s/mapping conflict/required mapping update

Contributor

@jakelandis jakelandis left a comment

LGTM with or without the detailed message. I have a preference to shorten or remove it, but it is your call.

dakrone added a commit to dakrone/elasticsearch that referenced this pull request Feb 13, 2019
Prior to this commit, when an indexing operation resulted in an
`Engine.Result.Type.MAPPING_UPDATE_REQUIRED`, TransportShardBulkAction
immediately retried the indexing operation to see if it succeeded. If the
retry succeeded, the context did not wait until the mapping update had
propagated through the cluster state before finishing the indexing.

In some of our tests we rely on mappings being available as soon as they've been
introduced by a document that was indexed successfully. By removing the immediate
retry we always wait for this to be the case.

Resolves elastic#38428
Supersedes elastic#38579
Relates to elastic#38711
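
As a rough illustration of why removing the immediate retry makes a plain assertion safe, the sketch below models an indexing call that always blocks until the mapping update has been applied. It is a toy under stated assumptions, not the code from the referenced commit: `WaitForMappingModel`, the `CountDownLatch`, and the background "applier" thread are hypothetical stand-ins for the mapping update and cluster state application.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Toy model: indexing always waits for the mapping update (hypothetical names). */
final class WaitForMappingModel {

    // Stand-in for the mapping version held in the cluster state (assumption).
    private final AtomicLong mappingVersion = new AtomicLong(1);

    long mappingVersion() {
        return mappingVersion.get();
    }

    /** Indexes a document with a new field and returns only once the mapping change is applied. */
    void indexWithDynamicMapping() throws InterruptedException {
        CountDownLatch applied = new CountDownLatch(1);

        // Stand-in for submitting the mapping update; it is applied asynchronously.
        Thread applier = new Thread(() -> {
            mappingVersion.incrementAndGet();
            applied.countDown();
        });
        applier.start();

        // No immediate retry: always wait for the update before continuing.
        if (!applied.await(10, TimeUnit.SECONDS)) {
            throw new IllegalStateException("mapping update was not applied in time");
        }
        // The indexing operation would be retried here, now that the mapping exists.
    }

    public static void main(String[] args) throws InterruptedException {
        WaitForMappingModel model = new WaitForMappingModel();
        long previousVersion = model.mappingVersion();

        model.indexWithDynamicMapping();

        // No polling needed: the version is already visible when indexing returns.
        if (model.mappingVersion() != previousVersion + 1) {
            throw new AssertionError("mapping version was not incremented");
        }
        System.out.println("mapping version is now " + model.mappingVersion());
    }
}
```

In this model the wait moves from the test into the indexing path, which is the trade described in the commit message.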
@dakrone
Member Author

dakrone commented Feb 14, 2019

Closing this in favor of #38873

@dakrone dakrone closed this Feb 14, 2019
dakrone added a commit that referenced this pull request Feb 14, 2019
Prior to this commit, when an indexing operation resulted in an
`Engine.Result.Type.MAPPING_UPDATE_REQUIRED`, TransportShardBulkAction
immediately retried the indexing operation to see if it succeeded. If the
retry succeeded, the context did not wait until the mapping update had
propagated through the cluster state before finishing the indexing.

In some of our tests we rely on mappings being available as soon as they've been
introduced by a document that was indexed successfully. By removing the immediate
retry we always wait for this to be the case.

Resolves #38428
Supersedes #38579
Relates to #38711