Use exact numDocs in synced-flush and metadata snapshot #30228
Since elastic#29458, we use a searcher to calculate the number of documents for commit stats. Sadly, that approach is flawed: the searcher might no longer point to the last commit if it has been refreshed. This commit uses SoftDeletesDirectoryReaderWrapper to exclude soft-deleted documents from the numDocs of a SegmentInfos. I chose to modify the method Lucene#getNumDocs so that we can read a store metadata snapshot correctly without opening an engine. Relates elastic#29458
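To make the difference between the two counts concrete, here is a minimal toy model. It does not use Lucene or Elasticsearch; `ToySegment` and `ToyCommit` are hypothetical names introduced only for illustration. It assumes a "fast" count that subtracts only hard deletes from each segment's maxDoc, and an "exact" count that visits every document and also skips soft-deleted ones.

```java
// Toy model only: these classes are hypothetical and are not Lucene or Elasticsearch APIs.
final class ToySegment {
    final boolean[] hardDeleted; // true if the doc was hard-deleted
    final boolean[] softDeleted; // true if the doc carries the soft-deletes marker

    ToySegment(boolean[] hardDeleted, boolean[] softDeleted) {
        if (hardDeleted.length != softDeleted.length) {
            throw new IllegalArgumentException("per-doc arrays must have the same length");
        }
        this.hardDeleted = hardDeleted;
        this.softDeleted = softDeleted;
    }

    int maxDoc() {
        return hardDeleted.length;
    }
}

final class ToyCommit {
    /** Cheap count: maxDoc minus hard deletes. Soft-deleted docs are still counted. */
    static int fastNumDocs(ToySegment[] segments) {
        int numDocs = 0;
        for (ToySegment segment : segments) {
            int delCount = 0;
            for (boolean deleted : segment.hardDeleted) {
                if (deleted) {
                    delCount++;
                }
            }
            numDocs += segment.maxDoc() - delCount;
        }
        return numDocs;
    }

    /** Exact count: visits every doc and skips both hard- and soft-deleted ones. */
    static int exactNumDocs(ToySegment[] segments) {
        int numDocs = 0;
        for (ToySegment segment : segments) {
            for (int doc = 0; doc < segment.maxDoc(); doc++) {
                if (!segment.hardDeleted[doc] && !segment.softDeleted[doc]) {
                    numDocs++;
                }
            }
        }
        return numDocs;
    }
}
```

In this model the two counts agree whenever no document is soft-deleted, and the exact count is the more expensive one because it touches every document.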
Pinging @elastic/es-distributed
It might be incorrect if forceMerge happens
@elasticmachine test this please
I wonder if we should do this. Classically, the number of docs was always the total docs in the segments. I'm not saying this isn't debatable, but I'm also not sure it merits the breakage price. Also, we want this to go into 6.x. Instead of this, maybe we should add another field to the stats to reflect soft deletes? Then synced flush is free to do what it thinks is right, and we have more information.
left one comment
*/
public static int getNumDocs(SegmentInfos info) {
I think we should keep this as it is. Rather, we should do this conditionally in the engine, only when soft-deletes are enabled. Also, please make sure that we only load this once and cache it. We don't commit very often and commit stats are fetched rarely, so we can compute this when it's needed and cache it per commit.
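The "load once and cache per commit" idea could be sketched roughly as follows. This is a standalone illustration, not engine code: `PerCommitNumDocsCache` is a hypothetical class, keying the cache by commit generation is an assumption, and the expensive count is passed in as a plain function.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.LongToIntFunction;

// Hypothetical helper: caches an expensive numDocs computation per commit generation.
final class PerCommitNumDocsCache {
    private final ConcurrentHashMap<Long, Integer> byGeneration = new ConcurrentHashMap<>();
    private final LongToIntFunction expensiveCounter;
    // Exposed only so the sketch can show how often the expensive path runs.
    final AtomicInteger computations = new AtomicInteger();

    PerCommitNumDocsCache(LongToIntFunction expensiveCounter) {
        this.expensiveCounter = expensiveCounter;
    }

    /** Returns the cached count for this commit, computing it on first use. */
    int numDocs(long commitGeneration) {
        return byGeneration.computeIfAbsent(commitGeneration, gen -> {
            computations.incrementAndGet();
            return expensiveCounter.applyAsInt(gen);
        });
    }
}
```

A real implementation would also need to evict entries for commits that have been deleted; the sketch leaves that out.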
I understand your concerns. However, we get numDocs from a commit not only when sealing an index but also when verifying sync_id in peer recovery. In peer recovery, we read numDocs directly from the Store because the target shard has not opened its engine yet. Thus, fixing this in the engine might not be enough. (See elasticsearch/server/src/main/java/org/elasticsearch/indices/recovery/RecoverySourceHandler.java, line 319 in 45c6c20.)
Since we don't seal an index if numDocs are different, is it ok to remove the numDocs check at recovery time?
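The point about not sealing when numDocs differ can be illustrated with a small toy sketch. It is not the SyncedFlushService code: `ToySyncedFlush`, `copiesToSeal`, and the copy names are hypothetical, and it assumes that a copy receives the sync id only if its numDocs matches the primary's.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Toy model only: not the Elasticsearch synced-flush implementation.
final class ToySyncedFlush {
    /** Returns the copies whose numDocs equals the primary's; only these would be sealed. */
    static List<String> copiesToSeal(int primaryNumDocs, Map<String, Integer> numDocsByCopy) {
        List<String> sealed = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : numDocsByCopy.entrySet()) {
            if (entry.getValue() == primaryNumDocs) {
                sealed.add(entry.getKey());
            }
        }
        return sealed;
    }
}
```

If the count includes soft-deleted documents on one copy but not on another, copies holding identical live documents would fail this comparison, which is why an exact count matters here.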
@dnhatn Simon will be a better judge, but I think we need a better utility method like |
@bleskes Yes, we should do it.
@s1monw I took a different approach for this. Can you please have a look? Thank you.
@elasticmachine retest this please
I am ok with changing the semantics here. I think we are trying to make something work that doesn't necessarily give us enough information. I think we should keep it the way it is and deprecate it in 6.x? I really wonder what the purpose of this stat is compared to other stats that have a live numDocs that is accurate.
the way it is in master
I mean, for instance, docStats here: https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-stats.html
# Conflicts: # server/src/main/java/org/elasticsearch/index/engine/InternalEngine.java
I've updated the PR to restore the CommitStats and made synced-flush use an exact numDocs only when soft-deletes are enabled. There are still some issues that I am not sure about.
Please have a look and let me know your thoughts. Thank you!
// The primary shard may need the "exact" numDocs to verify if the commit has syncId.
final boolean softDeleteEnabled = recoveryTarget.indexShard().indexSettings().isSoftDeleteEnabled();
if (softDeleteEnabled && Strings.hasText(snapshot.getSyncId())) {
final List<IndexCommit> commits = DirectoryReader.listCommits(recoveryTarget.store().directory());
This needs to use the safe commit. That is the commit the engine will use (this is probably an existing bug).
Thx @dnhatn. I like where this is going. IMO neither synced flush nor recovery is a frequent operation. We should just use exact numbers all the time and everywhere. Keep it simple. If soft deletes are not enabled, it's the same as the "fast" numDocs (and we can assert for that).
LGTM
* Unlike {@link #getNumDocs(SegmentInfos)} this method returns a numDocs that always excludes soft-deleted docs.
* This method is expensive thus prefer using {@link #getNumDocs(SegmentInfos)} unless an exact numDocs is required.
*/
public static int getExactNumDocs(IndexCommit commit) throws IOException {
++
final SegmentInfos segmentInfos = Lucene.readSegmentInfos(commitRef.getIndexCommit());
final int numDocs;
if (indexShard.indexSettings().isSoftDeleteEnabled()) {
numDocs = Lucene.getExactNumDocs(commitRef.getIndexCommit());
You can call getExactNumDocs all the time. It has no overhead if soft-deletes are not there.
@@ -341,7 +342,8 @@ public void phase1(final IndexCommit snapshot, final Supplier<Integer> translogO
recoverySourceSyncId.equals(recoveryTargetSyncId);
if (recoverWithSyncId) {
final long numDocsTarget = request.metadataSnapshot().getNumDocs();
final long numDocsSource = recoverySourceMetadata.getNumDocs();
final boolean softDeletesEnabled = shard.indexSettings().isSoftDeleteEnabled();
final long numDocsSource = softDeletesEnabled ? Lucene.getExactNumDocs(snapshot) : recoverySourceMetadata.getNumDocs();
Same here: just use getExactNumDocs.
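For readers following the recovery side, the check in the hunk above can be modeled in isolation. The sketch below is a toy: `ToyRecoveryCheck` and `canSkipFileCopy` are hypothetical names, and throwing on a numDocs mismatch is an assumption about what follows the comparison, since the hunk does not show it.

```java
// Toy model only: not the RecoverySourceHandler implementation.
final class ToyRecoveryCheck {
    /**
     * Returns true when source and target share a sync id and agree on numDocs,
     * in which case copying segment files could be skipped.
     */
    static boolean canSkipFileCopy(String sourceSyncId, String targetSyncId,
                                   long numDocsSource, long numDocsTarget) {
        boolean recoverWithSyncId = sourceSyncId != null && sourceSyncId.equals(targetSyncId);
        if (!recoverWithSyncId) {
            return false;
        }
        if (numDocsSource != numDocsTarget) {
            // Assumption: a mismatch under the same sync id is treated as an error.
            throw new IllegalStateException("same sync id but numDocs differ: source ["
                + numDocsSource + "], target [" + numDocsTarget + "]");
        }
        return true;
    }
}
```

Both sides have to count documents the same way: if one side includes soft-deleted documents and the other excludes them, the check fails even though the copies hold the same live documents.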
# Conflicts: # server/src/main/java/org/elasticsearch/indices/flush/SyncedFlushService.java
@elasticmachine test this please
@bleskes I've updated this PR to always have an exact numDocs when capturing the store metadata snapshot. Would you please have a look? Thank you!
LGTM. Thanks Nhat.
This test suite fails due to Lucene#getExactNumDocs. Relates #30228
Since #29458, we use a searcher to calculate the number of documents for commit stats. Sadly, that approach is flawed: the searcher might no longer point to the last commit if it has been refreshed. As synced-flush requires an exact numDocs to work correctly, we have to exclude all soft-deleted docs. This commit makes synced-flush stop using CommitStats and instead read an exact numDocs directly from an index commit. Relates #29458 Relates #29530
This PR adapts/utilizes recent enhancements in Lucene-7.4: - Replaces exactNumDocs by the soft-deletes count in SegmentCommitInfo. This enhancement allows us to back out changes introduced in #30228. - Always configure the soft-deletes field in IWC
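Once a per-segment soft-deletes count is available in the commit metadata, the exact count no longer needs a reader. A minimal sketch of that arithmetic follows; `ToySegmentCommitInfo` and `ToyNumDocs` are hypothetical stand-ins rather than the Lucene classes, and the field names are assumptions chosen to mirror the description.

```java
// Toy model only: a stand-in for per-segment commit metadata, not the Lucene class.
final class ToySegmentCommitInfo {
    final int maxDoc;       // total docs ever written to the segment
    final int delCount;     // hard-deleted docs
    final int softDelCount; // soft-deleted docs recorded at commit time

    ToySegmentCommitInfo(int maxDoc, int delCount, int softDelCount) {
        this.maxDoc = maxDoc;
        this.delCount = delCount;
        this.softDelCount = softDelCount;
    }
}

final class ToyNumDocs {
    /** Exact live-doc count computed from metadata alone, without visiting any document. */
    static int numDocs(ToySegmentCommitInfo[] infos) {
        int numDocs = 0;
        for (ToySegmentCommitInfo info : infos) {
            numDocs += info.maxDoc - info.delCount - info.softDelCount;
        }
        return numDocs;
    }
}
```

With counts stored this way, the cheap metadata-based count and the exact count coincide, which is what would make a separate exact-count method unnecessary.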