Add assertBusy for testMappingVersionAfterDynamicMappingUpdate #38579
This assertBusy is necessary. When a document being indexed requires a mapping update, the document is tried and rejected (due to mapping conflict), a mapping update is sent off, and the document is then *immediately* retried to see whether the mapping change has occurred quickly enough. If it has, indexing does not wait for the next cluster state before moving ahead. In very rare cases this immediate retry succeeds, so the indexing request completes (because it was successful) before the new cluster state has been fully propagated. In that case we need to wait, because the mapping version will eventually be updated; it just hasn't been updated *yet*.

The first mapping-update-necessary check:
https://github.com/elastic/elasticsearch/blob/622a7f1e207a552af56fec993045286abc3839e9/server/src/main/java/org/elasticsearch/action/bulk/TransportShardBulkAction.java#L487

Followed immediately by the mapping update:
https://github.com/elastic/elasticsearch/blob/622a7f1e207a552af56fec993045286abc3839e9/server/src/main/java/org/elasticsearch/action/bulk/TransportShardBulkAction.java#L490

And then the *immediate* retry:
https://github.com/elastic/elasticsearch/blob/622a7f1e207a552af56fec993045286abc3839e9/server/src/main/java/org/elasticsearch/action/bulk/TransportShardBulkAction.java#L499

In the event the immediate retry fails (99.9999% of the time), the context is marked as needing to wait for a new cluster state before proceeding:
https://github.com/elastic/elasticsearch/blob/622a7f1e207a552af56fec993045286abc3839e9/server/src/main/java/org/elasticsearch/action/bulk/TransportShardBulkAction.java#L501-L504

In the 0.0001% case, the immediate retry succeeds, causing the test to fail.

I was able to reproduce this bug about once every 10,000 tests. With the awaitsFix I ran this 100,000 times with no failures.

Resolves elastic#38428
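For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of a busy-wait assertion applied to a version counter that is updated slightly after the triggering call returns. It is not the Elasticsearch test framework's `assertBusy`: the names `BusyAssertSketch` and `assertEventually`, the backoff values, and the simulated `mappingVersion` counter are all assumptions made for illustration.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

class BusyAssertSketch {

    // Hypothetical stand-in for a test framework's assertBusy: re-run the
    // assertion with a growing sleep between attempts until it passes or the
    // timeout elapses, then rethrow the most recent failure.
    static void assertEventually(Runnable assertion, long timeout, TimeUnit unit) {
        final long deadline = System.nanoTime() + unit.toNanos(timeout);
        long sleepMillis = 1;
        while (true) {
            try {
                assertion.run();
                return; // assertion held, stop polling
            } catch (AssertionError e) {
                if (System.nanoTime() - deadline >= 0) {
                    throw e; // out of time: surface the most recent failure
                }
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                throw new AssertionError("interrupted while waiting", ie);
            }
            sleepMillis = Math.min(sleepMillis * 2, 100);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Simulated mapping version; a real test would read it from cluster state.
        final AtomicLong mappingVersion = new AtomicLong(1);
        final long previousVersion = mappingVersion.get();

        // Simulated cluster state publication that lands shortly after indexing returns.
        Thread publisher = new Thread(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            mappingVersion.incrementAndGet();
        });
        publisher.start();

        // A plain assertion here could fail; polling tolerates the short delay.
        assertEventually(() -> {
            if (mappingVersion.get() != previousVersion + 1) {
                throw new AssertionError("mapping version not updated yet");
            }
        }, 10, TimeUnit.SECONDS);

        publisher.join();
        System.out.println("mapping version observed: " + mappingVersion.get());
    }
}
```

The polling only masks the delay in the test; it does not change when the indexing request itself returns.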
Pinging @elastic/es-distributed
createIndex("test", client().admin().indices().prepareCreate("test").addMapping("type"));
final ClusterService clusterService = getInstanceFromNode(ClusterService.class);
final long previousVersion = clusterService.state().metaData().index("test").getMappingVersion();
client().prepareIndex("test", "type", "1").setSource("field", "text").get();
assertThat(clusterService.state().metaData().index("test").getMappingVersion(), equalTo(1 + previousVersion));
// This assertBusy is necessary. When a mapping update is needed as a document is indexed,
Thanks for the detailed message, TIL. However, I think the commit message is sufficient for the detailed explanation. Perhaps a shorter version like:
//ensure that cluster state has been updated
Also, nitpick: s/mapping conflict/required mapping update
LGTM with or without the detailed message. I have a preference to shorten or remove it, but it is your call.
Prior to this commit, when an indexing operation resulted in an `Engine.Result.Type.MAPPING_UPDATE_REQUIRED`, TransportShardBulkAction immediately retried the indexing operation to see if it succeeded. If it did, the context did not wait until the mapping update had propagated through the cluster state before finishing the indexing.

In some of our tests we rely on mappings being available as soon as they've been introduced in a document that indexed correctly. By removing the immediate retry we always wait for this to be the case.

Resolves elastic#38428
Supersedes elastic#38579
Relates to elastic#38711
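The difference between the two flows can be sketched with a toy model. This is not the code in TransportShardBulkAction: the class `MappingRetrySketch`, its `Result` enum, the step labels, and the `Supplier`-based indexing attempt are all hypothetical, chosen only to show which steps run in each variant as the commit message describes them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

class MappingRetrySketch {

    // Hypothetical, simplified outcome of one indexing attempt.
    enum Result { SUCCESS, MAPPING_UPDATE_REQUIRED }

    // Flow with the immediate retry: wait for a new cluster state only if
    // the retry still needs the mapping update.
    static List<String> withImmediateRetry(Supplier<Result> attempt) {
        List<String> steps = new ArrayList<>();
        steps.add("index");
        if (attempt.get() == Result.MAPPING_UPDATE_REQUIRED) {
            steps.add("submit-mapping-update");
            steps.add("retry");
            if (attempt.get() == Result.MAPPING_UPDATE_REQUIRED) {
                steps.add("wait-for-cluster-state");
                steps.add("retry");
                attempt.get();
            }
        }
        return steps;
    }

    // Flow without the immediate retry: always wait before retrying.
    static List<String> withoutImmediateRetry(Supplier<Result> attempt) {
        List<String> steps = new ArrayList<>();
        steps.add("index");
        if (attempt.get() == Result.MAPPING_UPDATE_REQUIRED) {
            steps.add("submit-mapping-update");
            steps.add("wait-for-cluster-state");
            steps.add("retry");
            attempt.get();
        }
        return steps;
    }

    public static void main(String[] args) {
        // The rare case: the retry right after the mapping update succeeds.
        Iterator<Result> rare =
            Arrays.asList(Result.MAPPING_UPDATE_REQUIRED, Result.SUCCESS).iterator();
        System.out.println(withImmediateRetry(rare::next));
        // prints [index, submit-mapping-update, retry]

        Iterator<Result> same =
            Arrays.asList(Result.MAPPING_UPDATE_REQUIRED, Result.SUCCESS).iterator();
        System.out.println(withoutImmediateRetry(same::next));
        // prints [index, submit-mapping-update, wait-for-cluster-state, retry]
    }
}
```

In the model, the rare case under the first flow finishes without ever reaching the wait step, which is the window the failing test observed.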
Closing this in favor of #38873