-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Segment Replication] Update getSegmentInfosSnapshot() logic to return SegmentInfos with highest generation number #4288
Conversation
Gradle Check (Jenkins) Run Completed with:
|
server/src/main/java/org/opensearch/index/engine/InternalEngine.java
Outdated
Show resolved
Hide resolved
/** | ||
* Returns generation number (_N in segment_N) from last commit on disk. | ||
*/ | ||
public long getLastCommitGeneration() throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need this method on the store? can we call
SegmentInfos.getLastCommitGeneration(store.directory());
directly from InternalEngine?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I created new method for consistency purpose as to keep disk(store) related calls on Store
...er/src/internalClusterTest/java/org/opensearch/indices/replication/SegmentReplicationIT.java
Outdated
Show resolved
Hide resolved
server/src/test/java/org/opensearch/index/engine/InternalEngineTests.java
Outdated
Show resolved
Hide resolved
server/src/test/java/org/opensearch/index/engine/InternalEngineTests.java
Outdated
Show resolved
Hide resolved
SegmentInfos sisDisk = new SegmentInfos(org.apache.lucene.util.Version.LATEST.major); | ||
sisDisk.setNextWriteGeneration(memoryGen + 1); | ||
|
||
when(store.getLastCommitGeneration()).thenReturn(sisDisk.getGeneration()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than mocking here, you should be able to invoke SegmentInfos.commit and create a new _N file with a higher generation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for sharing this, I wasn't aware of this method.
I tried SegmentInfos.commit
but it also updated the in-memory copy of SegmentInfos i.e. both sisDisk & sisInMemory have same generation number. So, for tests, need to use mocks to make on-disk copy to have higher segment generation.
server/src/test/java/org/opensearch/index/engine/InternalEngineTests.java
Outdated
Show resolved
Hide resolved
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
Gradle Check (Jenkins) Run Completed with:
|
This comment was marked as outdated.
This comment was marked as outdated.
|
||
// This method simulates behaviour of on-disk SegmentInfos having lower segment generation number compared to | ||
// actual in-memory SegmentInfos and verified that engine.getSegmentInfosSnapshot returns in-memory SegmentInfos | ||
public void testGetSegmentInfosWithHighestGenInMemory() throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-> This test case will have both in-memory and on-disk segmentInfos on same segment generation number but not lower on on-disk as it is mentioned (or purpose of this test case). In this test case we are not incrementing or decrementing segment generation of either on-disk on in-memory, so both will be same. So we are not testing the case here that we actually want to.
-> Is this test really necessary? Will there be a scenario ever with on-disk segmentInfos having lower segment generation number than in-memory ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Looks like comment is outdated. Updated.
- Yes, this is useful to ensure in memory copy gets picked up (when segment generation is higher/equal compared to on-disk copy's segment generation). Yes, this scenario is happening during replica promotion.
…n SegmentInfos having highest generation number Signed-off-by: Suraj Singh <[email protected]>
Signed-off-by: Suraj Singh <[email protected]>
… number as on-disk copy Signed-off-by: Suraj Singh <[email protected]>
Gradle Check (Jenkins) Run Completed with:
|
Inconsistent on-disk & in-memory generation number in SegmentInfos is not expected state. Need to think through the problem again and identify cause of mis-match in on-disk & in-memory copies of SegmentInfos. If there is a way to identify mis-match (and possibly fix it) would be the ideal solution here. Converting it to draft for now. CC @mch2 |
Signed-off-by: Suraj Singh [email protected]
Description
The PR updates the logic getSegmentInfosSnapshot() to return SegmentInfos copy (In memory Vs On-disk) with highest generation number. The logic is copied from #4178 (comment) with minor tweaks and unit tests.
Issues Resolved
#4178
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.