Use local checkpoint to calculate min translog gen for recovery #52841

dnhatn · 2020-02-26T18:33:39Z

Today we use the translog_generation of the safe commit as the minimum
required translog generation for recovery. This approach has a
limitation: we can't clean up the translog unless we flush. Reopening
an already recovered engine creates a new empty translog, and we leave
it there until we force a flush.

This commit removes the translog_generation commit tag and uses the
local checkpoint of the safe commit to calculate the minimum required
translog generation for recovery instead.

Closes #49970
Backport of #51905
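
To illustrate the idea in the description, here is a minimal, self-contained sketch of how a minimum required generation could be derived from the local checkpoint of the safe commit. It is not the Elasticsearch implementation: `MinTranslogGenSketch`, `Generation`, and both method names are hypothetical, and each translog generation is assumed to record only the highest sequence number it contains.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; class, field, and method names are assumptions for illustration.
class MinTranslogGenSketch {

    /** Stand-in for one translog file: its generation and the highest seqno it holds. */
    static final class Generation {
        final long generation;
        final long maxSeqNo;

        Generation(long generation, long maxSeqNo) {
            this.generation = generation;
            this.maxSeqNo = maxSeqNo;
        }
    }

    /**
     * Returns the oldest generation that may contain an operation with a
     * sequence number >= seqNo, or currentGeneration if no reader does.
     */
    static long minGenerationForSeqNo(List<Generation> readers, long currentGeneration, long seqNo) {
        long minGen = currentGeneration;
        for (Generation reader : readers) {
            if (seqNo <= reader.maxSeqNo) {
                minGen = Math.min(minGen, reader.generation);
            }
        }
        return minGen;
    }

    /** Recovery must replay every operation above the safe commit's local checkpoint. */
    static long minRequiredGenerationForRecovery(List<Generation> readers,
                                                 long currentGeneration,
                                                 long localCheckpointOfSafeCommit) {
        return minGenerationForSeqNo(readers, currentGeneration, localCheckpointOfSafeCommit + 1);
    }

    public static void main(String[] args) {
        List<Generation> readers = Arrays.asList(
            new Generation(1, 9), new Generation(2, 19), new Generation(3, 29));
        // Safe commit covers seqnos up to 14: generation 2 is the oldest one still needed.
        System.out.println(minRequiredGenerationForRecovery(readers, 4, 14));
        // Safe commit covers everything: only the current (empty) generation is needed.
        System.out.println(minRequiredGenerationForRecovery(readers, 4, 29));
    }
}
```

In the second call no older generation is required, so under this scheme those generations could be cleaned up without a flush having to rewrite a commit tag.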


Since elastic#51905, we skip translog recovery if the local checkpoint of the safe commit equals the global checkpoint. This change adjusts the test not to create a new snapshot in that case.

Closes elastic#52221
Relates elastic#51905

Since elastic#51905, we use the local checkpoint of the safe commit to calculate the number of uncommitted operations in the translog stats. If a periodic flush triggered by afterWriteOperation completes before we sync the translog, then the last commit is not safe. We also need to sync the translog through the Engine rather than directly on the translog so that we can advance the safe commit.

Relates elastic#51905
Closes elastic#52223


dnhatn · 2020-02-26T20:18:42Z

@elasticmachine test this please

dnhatn and others added 8 commits February 26, 2020 12:04

Separate translog from index deletion conditions (elastic#52556)

73e9d30

Separates the translog from the index deletion conditions (allowing the translog to be cleaned up more eagerly), and avoids taking the write lock on the translog if no clean-up is actually necessary.
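
The "avoid the write lock if no clean-up is necessary" part can be pictured with a check-then-lock pattern. The following is only a sketch under assumptions: `TranslogTrimSketch`, its fields, and `trimUnreferenced` are hypothetical names, generations are modelled as plain numbers, and the counter exists purely to make the behaviour observable.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch; not the actual Translog class.
class TranslogTrimSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final Deque<Long> readerGenerations = new ArrayDeque<>(); // oldest first
    int writeLockAcquisitions = 0; // illustration only

    TranslogTrimSketch(long... generations) {
        for (long generation : generations) {
            readerGenerations.addLast(generation);
        }
    }

    /** Removes generations below minRequiredGen and returns how many were removed. */
    int trimUnreferenced(long minRequiredGen) {
        // Cheap check under the read lock: is there anything to trim at all?
        lock.readLock().lock();
        try {
            Long oldest = readerGenerations.peekFirst();
            if (oldest == null || oldest >= minRequiredGen) {
                return 0; // nothing to clean up, so the write lock is never taken
            }
        } finally {
            lock.readLock().unlock();
        }
        // Something may be trimmable: take the write lock and re-check,
        // because the state can change between the two lock sections.
        lock.writeLock().lock();
        try {
            writeLockAcquisitions++;
            int removed = 0;
            while (!readerGenerations.isEmpty() && readerGenerations.peekFirst() < minRequiredGen) {
                readerGenerations.removeFirst();
                removed++;
            }
            return removed;
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

With generations 1, 2, and 3, calling `trimUnreferenced(1)` returns 0 without touching the write lock, while `trimUnreferenced(3)` takes it once and removes generations 1 and 2.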

Fix testRestoreLocalHistoryFromTranslog (elastic#52441)

3c42d64

Asserts that no new operations have been added to the translog since we re-opened the engine.

Relates elastic#51905
Closes elastic#52410

Fix testResyncAfterPrimaryPromotion (elastic#52615)

9e12124

Adjusts the assertion, as we might eagerly clean up the translog during resync since elastic#52556.

Relates elastic#52556
Closes elastic#52598

Fix testSeqNoCollision (elastic#52588)

ceb7967

Adjusts the assertion, as we trim the translog more eagerly since elastic#52556.

Relates elastic#52556
Closes elastic#52148

Add more assertions to testMaybeFlush (elastic#52792)

e94a201

We haven't been able to reproduce this test failure or figure out its cause. This commit adds more assertions so we can narrow the scope.

Relates elastic#52223

dnhatn added backport test-forwards and removed test-forwards labels Feb 26, 2020

dnhatn closed this Feb 26, 2020

dnhatn deleted the 7x-seqno-tlog-policy branch February 26, 2020 21:01
