-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Expose translog stats in ReadOnlyEngine #43752
Conversation
Pinging @elastic/es-distributed |
@elasticmachine run elasticsearch-ci/docbldesx |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I left two minor comments.
--- | ||
"Translog stats on closed indices": | ||
- skip: | ||
version: " - 7.2.99" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should skip this test until 7.9.99 then adjust to this value after backporting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
weird that this did not fail the tests? Perhaps BWC is disabled?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pfff... I should have seen this. And yes, bwc tests were disabled at that time. I merged master and bwc are enabled again.
@@ -119,6 +121,7 @@ public ReadOnlyEngine(EngineConfig config, SeqNoStats seqNoStats, TranslogStats | |||
reader = wrapReader(reader, readerWrapperFunction); | |||
searcherManager = new SearcherManager(reader, searcherFactory); | |||
this.docsStats = docsStats(lastCommittedSegmentInfos); | |||
this.translogStats = translogStats != null ? translogStats : translogStats(config, lastCommittedSegmentInfos); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we assert that if translogStats
is null then obtainLock
is true? This ensures that we don't open two translog instances at the same time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
++ especially as opening the translog here writes a new checkpoint file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a very good suggestion, thanks
--- | ||
"Translog stats on closed indices": | ||
- skip: | ||
version: " - 7.2.99" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
weird that this did not fail the tests? Perhaps BWC is disabled?
- do: | ||
cluster.health: | ||
expand_wildcards: all | ||
wait_for_no_initializing_shards: true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is only needed for the backport.
Alternatively you could use wait_for_active_shards = 1 on the indices.close action
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right
server/src/main/java/org/elasticsearch/index/engine/ReadOnlyEngine.java
Outdated
Show resolved
Hide resolved
@@ -119,6 +121,7 @@ public ReadOnlyEngine(EngineConfig config, SeqNoStats seqNoStats, TranslogStats | |||
reader = wrapReader(reader, readerWrapperFunction); | |||
searcherManager = new SearcherManager(reader, searcherFactory); | |||
this.docsStats = docsStats(lastCommittedSegmentInfos); | |||
this.translogStats = translogStats != null ? translogStats : translogStats(config, lastCommittedSegmentInfos); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
++ especially as opening the translog here writes a new checkpoint file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Don't forget to add the wait_for_active_shards condition on the 7.x backport.
The test IndexShardIT.testIndexCanChangeCustomDataPath() fails on 7.x and 7.3 because the translog cannot be recovered. While I can't reproduce the issue, I think it has been introduced in #43752 which changed ReadOnlyEngine so that it opens the translog in its constructor in order to load the translog stats. This opening writes a new checkpoint file, but because 7.x/7.3 does not wait for shards to be started after being closed, the test immediately starts to copy shard files to a new directory and possibly does not copy all the required translog files. By waiting for the shards to be started after being closed, we ensure that the shards (and engines) have been correctly initialized and that the translog checkpoint file is not currently being written. closes #43964
The test IndexShardIT.testIndexCanChangeCustomDataPath() fails on 7.x and 7.3 because the translog cannot be recovered. While I can't reproduce the issue, I think it has been introduced in #43752 which changed ReadOnlyEngine so that it opens the translog in its constructor in order to load the translog stats. This opening writes a new checkpoint file, but because 7.x/7.3 does not wait for shards to be started after being closed, the test immediately starts to copy shard files to a new directory and possibly does not copy all the required translog files. By waiting for the shards to be started after being closed, we ensure that the shards (and engines) have been correctly initialized and that the translog checkpoint file is not currently being written. closes #43964
This pull request changes
ReadOnlyEngine
so that it loads stats from the translog when the engine is created. It also adds tests to check that translog stats are correctly exposed in Stats API for closed and frozen indices.