[test] Fix IndexShardTests#testScheduledRefresh #110312
After flushing the shard, we only make sure that the refresh call is propagated to the shard engine, but we can't be sure that the call actually ends up in a shard refresh. The call in `InternalEngine#refresh` can return `false` if we couldn't acquire the lock on `ElasticsearchDirectoryReader` because it's already being refreshed. We can wrap the call in `assertBusy` to retry it and make sure that the shard eventually gets refreshed. Resolves #101008
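A minimal, self-contained sketch of the retry idea described above. Everything in it is hypothetical: `FlakyEngine` merely simulates a refresh that returns `false` a few times (as if another refresh held the lock), and the local `assertBusy` is a simplified stand-in written for this illustration, not the helper from the Elasticsearch test framework.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class RefreshRetrySketch {

    /** Hypothetical engine stand-in whose refresh is skipped for the first few calls. */
    static class FlakyEngine {
        private final AtomicInteger attempts = new AtomicInteger();
        private final int skippedAttempts;

        FlakyEngine(int skippedAttempts) {
            this.skippedAttempts = skippedAttempts;
        }

        /** Returns false while the simulated lock is "held", true afterwards. */
        boolean refresh() {
            return attempts.incrementAndGet() > skippedAttempts;
        }

        int attempts() {
            return attempts.get();
        }
    }

    /** Simplified retry helper (assumption: retries on AssertionError until the timeout). */
    static void assertBusy(Runnable assertion, long timeout, TimeUnit unit) {
        long deadline = System.nanoTime() + unit.toNanos(timeout);
        long sleepMillis = 1;
        while (true) {
            try {
                assertion.run();
                return; // the assertion finally held
            } catch (AssertionError e) {
                if (System.nanoTime() >= deadline) {
                    throw e; // give up and surface the last failure
                }
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                throw new AssertionError("interrupted while waiting", ie);
            }
            sleepMillis = Math.min(sleepMillis * 2, 50); // back off, capped at 50 ms
        }
    }

    public static void main(String[] args) {
        FlakyEngine engine = new FlakyEngine(3);
        // Retry the refresh until it actually happens instead of asserting on one call.
        assertBusy(() -> {
            if (!engine.refresh()) {
                throw new AssertionError("refresh was skipped");
            }
        }, 10, TimeUnit.SECONDS);
        System.out.println("refreshed after " + engine.attempts() + " attempts");
    }
}
```

The point of the pattern is that a single `false` result is treated as "try again" rather than as a test failure, so only a refresh that never succeeds within the timeout fails the test.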
Pinging @elastic/es-distributed (Team:Distributed)
But this test concerns the |
server/src/test/java/org/elasticsearch/index/shard/IndexShardTests.java
server/src/main/java/org/elasticsearch/index/shard/IndexShard.java
This reverts commit a37a174.
logger.info("--> scheduledRefresh(future5)");
ensureNoPendingScheduledRefresh();
If there's some spurious refresh being run, we cannot be sure it is at this exact spot.
Maybe a better approach would be to get the number of external refreshes before future5 and assert that, after the scheduled refresh, it has been incremented by 1.
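To illustrate the suggestion, here is a small sketch of the before/after counter check. `CountingShard`, `externalRefresh()` and `externalRefreshCount()` are hypothetical names used only for this example; they are not the real `IndexShard` API, and the real test would read the count from the shard's refresh stats instead.

```java
import java.util.concurrent.atomic.AtomicLong;

public class RefreshCountSketch {

    /** Hypothetical shard stand-in that only counts external refreshes. */
    static class CountingShard {
        private final AtomicLong externalRefreshes = new AtomicLong();

        void externalRefresh() {
            externalRefreshes.incrementAndGet();
        }

        long externalRefreshCount() {
            return externalRefreshes.get();
        }
    }

    /** Runs the action and returns how many external refreshes happened while it ran. */
    static long refreshesCausedBy(CountingShard shard, Runnable action) {
        long before = shard.externalRefreshCount(); // baseline taken right before the step
        action.run();
        return shard.externalRefreshCount() - before;
    }

    public static void main(String[] args) {
        CountingShard shard = new CountingShard();
        shard.externalRefresh(); // an earlier refresh does not affect the delta

        long delta = refreshesCausedBy(shard, shard::externalRefresh);
        if (delta != 1) {
            throw new AssertionError("expected exactly one refresh, got " + delta);
        }
        System.out.println("external refreshes during the step: " + delta);
    }
}
```

Comparing against a baseline makes the assertion independent of refreshes that ran earlier in the test; a refresh that runs concurrently between the two reads would still change the delta.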
@kingherc That's a great idea. I've replaced the hack of blocking the refresh thread pool with a check of the refresh stats.
I would like to understand why the test was failing. In the logs of the test failures you can see that the flush is executed after the assertion trips, so I'm not convinced that the flush is the issue here.
@elasticmachine update branch
@elasticmachine update branch
LGTM
LGTM. Please handle the two comments I mentioned before merging.
primary.scheduledRefresh(future5);
assertTrue(future5.actionGet()); // make sure we refresh once the shard is inactive
primary.scheduledRefresh(ActionListener.noop());
// We can't check whether scheduledRefresh returns true because it races with a potential
Since @fcofdez approved, I am also fine with the current state, and we can see whether any issues with other concurrent refreshes come up in the future.
I believe this comment may not be up to date now. Since we assertBusy above that the flush has happened, the scheduled refresh here will probably also be true. I'd just remove the comment to avoid confusion.
@@ -3925,11 +3924,16 @@ public void testScheduledRefresh() throws Exception {
logger.info("--> ensure search idle");
assertTrue(primary.isSearchIdle());
assertTrue(primary.searchIdleTime() >= TimeValue.ZERO.millis());
long periodicFlushesBefore = primary.flushStats().getPeriodic();
Also, can you change the comment above,
while shard is search active and ensure scheduleRefresh(...) makes documen visible:
? The shard was search idle at that point, and that's why the scheduled refresh is false there.
@kingherc Sorry about the auto merge, I will address your comments in a follow-up PR!
* First scheduledRefresh returns false because search is idle
* Remove the comment about the inability to control the result of scheduleRefresh
Follow-up for elastic#110312