-
Notifications
You must be signed in to change notification settings - Fork 24.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add index commit id to searcher #63963
Conversation
server/src/main/java/org/elasticsearch/common/lucene/index/ElasticsearchDirectoryReader.java
Outdated
Show resolved
Hide resolved
I am labeling this WIP, although it's ready as I am not sure the approach in the PR. The way we compose the searcher id might be too aggressive. If we need to improve search resiliency for searchable snapshots only, then we can use the commit id instead. |
Pinging @elastic/es-distributed (:Distributed/Engine) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe there are two goals we could have with this id:
- The ability to repeat the fetch phase if it fails. Here we need identical doc-ids on the replica.
- The ability to continue to use a PIT despite failures in the cluster. Here we do not need identical doc-ids.
I wonder if the second part is better served by maxSeqNo
instead whenever gcp==maxSeqNo
? Plus primary term for correctness.
It might be two completely separate ids and thus handled in another PR. I only wanted to raise this here to gather consensus that we cannot come up with a single id or failover scheme that can serve both purposes.
// are always different although they can have the identical segment files. | ||
final MessageDigest md = MessageDigests.sha256(); | ||
for (SegmentCommitInfo sci : segmentInfos) { | ||
final byte[] segmentId = sci.getId(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not clear to me that this is valid across different shard copies. The id generation starts somewhere random and then increments. I acknowledge the risk is small and I did not dig deeply into whether this increases the risk of collissions over using more standard uuid generation.
We (Jim, Francisco, and I) discussed this PR and agreed to add two IDs (i.e., commitID and seqID) to searchers. I implemented the commitID using an external UUID that is generated when an engine flushes (see 2f410eb). However, I found that using an external UUID isn't better than the id of an index commit. Hence, I re-implemented it using the id of an index commit. Would you please take a look? Thank you! |
We flush after copying segment files in peer recovery to associate a new translog UUID, and this flush changes the commit id. We can avoid this limitation by having an external commit id, and change it whenever the InternalEngine performs a flush. I implemented this, but I reverted it as it helps only this situation. I am happy to bring it back if you prefer. |
Is the purpose of the change not that we can fail over a search to another replica copy containing the same data? I suppose the approach taken here could work for searchable snapshots, but it would be nice to have an approach that also works for shards that already went through file based peer recovery and did not see any changes since (like most warm indices today). Or is there a good reason that we cannot/should not handle that scenario? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Yes, that's the original purpose of this PR.
Yes, it works with searchable snapshots.
The problem is that we flush to create a new index commit to associate with a new local translog on replicas after performing a file-based recovery. This generates a new commit_id, although the content of the index commit doesn't change. To circumvent this, we can have an external commit id that doesn't change in file-based recoveries (please see 5ec5781). |
Chatted with @dnhatn about the main issue of the external commit id: relying on lucene not merging when associating a new translog with the commit. In order to continue this path of development, let us stick to the approach you have here (using the commit id of the segment infos), or revisit your original proposal of relying on individual segment commit-ids. |
Thanks everyone for reviewing. I am merging this PR as is and will revisit the previous approach later. |
This change assigns the id of an index commit to a searcher, so we can retry search requests on another shard copy if they have the same index commit.
This change assigns the id of an index commit to a searcher, so we can retry search requests on another shard copy if they have the same index commit.
The commit id introduced in elastic#63963 does not work well with searchable snapshots as we create a new index commit when restoring from snapshots. This change revises an approach that generates an id using the ids of the segments of an index commit. Relates elastic#63963
This commit adds an id, which is composed of the ids of segment files, to ElasticsearchDirectoryReader. With this id, we can retry search requests on another shard copy if its latest "snapshot" has the same segment files as the failing shard.