CCR: Aborted document is exposed in Lucene changes #32269
Comments
Pinging @elastic/es-distributed
I see a solution that brings a deleted doc back to life iff it was soft-deleted. However, this approach may be fragile for stale documents (soft-deleted before indexing) because it relies on the order in which fields are processed. If the soft-deletes field of a doc has already been processed when that doc is aborted, we can't exclude that document from liveDocs. Another option is to expose hard liveDocs in Lucene.
I think we can expose the actual hard deletes at the SegmentReader level. That is the best solution IMO.
As a workaround, we can fix this specific issue by loading the original liveDocs if there are any deletions. This is safe in this case since we write deletes to disk on flush, so if there is an aborted doc we will have at least one hard-deleted doc. We can then do this for the segment reader in question:

    SegmentReader reader = ...;
    SegmentCommitInfo si = reader.getSegmentInfo();
    // Read the on-disk (hard) liveDocs only when the segment has hard deletes
    Bits hardLiveDocs = si.getDelCount() != 0
        ? si.info.getCodec().liveDocsFormat().readLiveDocs(si.info.dir, si, IOContext.READONCE)
        : null;

@dnhatn WDYT?
Today when reading operation history in Lucene, we read all documents. However, if indexing a document is aborted, IndexWriter will hard-delete it; we, therefore, need to exclude that document from Lucene history. This commit makes sure that we exclude aborted documents by using the hard liveDocs of a SegmentReader if there are deletes. Closes #32269
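To illustrate the idea in the commit description, here is a minimal, self-contained sketch of the filtering step. It does not use Lucene: `java.util.BitSet` stands in for Lucene's `Bits`, and the class and method names (`HistoryDocFilter`, `visibleDocs`) are hypothetical, not taken from the actual change. It assumes that a `null` hard liveDocs means the segment has no hard deletes, and that soft-deleted documents remain set in the hard liveDocs and so stay visible in history.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical stand-in for the history reader's per-segment filtering.
class HistoryDocFilter {

    /**
     * Returns the doc IDs of one segment that should be exposed as history.
     *
     * @param maxDoc       number of documents in the segment
     * @param hardLiveDocs bit set where a set bit means "not hard-deleted";
     *                     null means the segment has no hard deletes
     */
    public static List<Integer> visibleDocs(int maxDoc, BitSet hardLiveDocs) {
        List<Integer> visible = new ArrayList<>();
        for (int docId = 0; docId < maxDoc; docId++) {
            // Aborted docs are hard-deleted, so their bit is cleared and they are skipped.
            if (hardLiveDocs == null || hardLiveDocs.get(docId)) {
                visible.add(docId);
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        BitSet hardLiveDocs = new BitSet(4);
        hardLiveDocs.set(0, 4);   // docs 0..3 start out live
        hardLiveDocs.clear(2);    // assume doc 2 was aborted and hard-deleted
        System.out.println(visibleDocs(4, hardLiveDocs)); // prints [0, 1, 3]
        System.out.println(visibleDocs(4, null));         // prints [0, 1, 2, 3]
    }
}
```

In the real change, the bit set would come from the segment's liveDocs format, as in the workaround snippet earlier in this thread.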
Fixed by #32333.
The CCR branch started failing frequently after merging #31007. Some CI instances:
These failures can be explained as follows:
The problem is that we read aborted documents which should never be exposed. This might be a critical problem in CCR and Lucene rollbacks.
/cc @s1monw and @bleskes