[BUG] [Segment Replication] Error computing max seqNo from SegmentInfos. #6588

Closed
mch2 opened this issue Mar 8, 2023 · 0 comments · Fixed by #6594
Labels: bug, distributed framework

mch2 commented Mar 8, 2023

Describe the bug
Segment replication relies on computing the max seqNo on primary shards so that the seqNo sent to replicas accurately reflects what is in the SegmentInfos. However, this computation breaks after deletes are merged away, because the computed maxSeqNo then moves backward.

  1> java.lang.AssertionError: null
  1>    at org.opensearch.index.engine.Engine.getMaxSeqNoFromSearcher(Engine.java:317) ~[main/:?]
  1>    at org.opensearch.index.engine.Engine.getMaxSeqNoFromSegmentInfos(Engine.java:291) ~[main/:?]
  1>    at org.opensearch.index.shard.RemoteStoreRefreshListener.uploadSegmentInfosSnapshot(RemoteStoreRefreshListener.java:182) ~[main/:?]
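
To illustrate why the value can move backward, here is a minimal toy model. It is not OpenSearch or Lucene code: `MaxSeqNoAfterMerge`, `Op`, and the helper methods are hypothetical names, and the model only assumes that a delete carries its own seqNo and that a merge can drop both the deleted document and the delete marker.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: hypothetical types, not OpenSearch or Lucene classes.
final class MaxSeqNoAfterMerge {

    // One operation stored in the segments, with the seqNo it was assigned.
    static final class Op {
        final long seqNo;
        final boolean survivesMerge; // false for deleted docs and delete markers

        Op(long seqNo, boolean survivesMerge) {
            this.seqNo = seqNo;
            this.survivesMerge = survivesMerge;
        }
    }

    // Max seqNo over everything currently present; -1 if nothing is present.
    static long maxSeqNo(List<Op> ops) {
        long max = -1L;
        for (Op op : ops) {
            max = Math.max(max, op.seqNo);
        }
        return max;
    }

    // A merge that drops deleted documents together with their delete markers.
    static List<Op> mergeAwayDeletes(List<Op> ops) {
        List<Op> merged = new ArrayList<>();
        for (Op op : ops) {
            if (op.survivesMerge) {
                merged.add(op);
            }
        }
        return merged;
    }

    // Index three docs (seqNo 0..2), then delete the second one (delete gets seqNo 3).
    static List<Op> sample() {
        return Arrays.asList(
            new Op(0, true),
            new Op(1, false), // the doc that is later deleted
            new Op(2, true),
            new Op(3, false)  // the delete marker
        );
    }

    public static void main(String[] args) {
        List<Op> before = sample();
        List<Op> after = mergeAwayDeletes(before);
        System.out.println("max seqNo before merge: " + maxSeqNo(before));
        System.out.println("max seqNo after merge:  " + maxSeqNo(after));
    }
}
```

In this model the highest seqNo belongs to the delete, so once the merge removes it, a max computed from the remaining documents is lower than the value computed earlier.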

This functionality was introduced so that replicas receive an accurate seqNo, which is used to:

  1. Pass an accurate seqNo to setLocalCheckpointOfSafeCommit, which determines where the xlog (translog) should be trimmed after a commit is received.
  2. Update the replica's processed seqNo.

To Reproduce

  1. Start a cluster with segment replication (SR) enabled.
  2. Index some documents.
  3. Delete a portion of the documents.
  4. Trigger a force merge.

Expected behavior
Replicas should be sent an accurate seqNo as computed by the primary. One way to avoid this error is to send only the max seqNo that primaries write to the commit user data on commit. At the refresh level, this means the replica's processed seqNo would not update.
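
A rough sketch of that alternative, reading the max seqNo from the commit user data instead of recomputing it from the searcher. This is an illustration under assumptions, not the fix that landed: the class and method names are hypothetical, the user data is modeled as a plain `Map<String, String>`, and the key name `max_seq_no` and the `-1` fallback are assumed here rather than taken from the OpenSearch source.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not an OpenSearch class.
final class CommitUserDataSeqNo {

    // Assumed key under which the primary records the max seqNo at commit time.
    static final String MAX_SEQ_NO_KEY = "max_seq_no";

    // Assumed sentinel meaning "no operations recorded".
    static final long NO_OPS_PERFORMED = -1L;

    // Returns the max seqNo stored in the commit user data, or the sentinel if absent.
    static long maxSeqNoFromUserData(Map<String, String> userData) {
        String value = userData.get(MAX_SEQ_NO_KEY);
        return value == null ? NO_OPS_PERFORMED : Long.parseLong(value);
    }

    public static void main(String[] args) {
        Map<String, String> userData = new HashMap<>();
        userData.put(MAX_SEQ_NO_KEY, "41");
        // The stored value changes only when a new commit rewrites the user data,
        // so a refresh or a merge that drops deletes does not move it backward.
        System.out.println("max seqNo from commit: " + maxSeqNoFromUserData(userData));
    }
}
```

The trade-off is the one noted above: because the value changes only on commit, a replica's processed seqNo would stay fixed between commits even as refreshes publish new segments.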
