Update WAL corruption test so that it fails without fix #9942

akankshamahajan15 · 2022-05-04T00:27:56Z

Summary: For a non-TransactionDB with avoid_flush_during_recovery = true, RocksDB avoids flushing data from the WAL to L0 for all column families where possible. As a result, not all column families can advance their log_numbers, and min_log_number_to_keep won't change.
For a TransactionDB (.allow_2pc), even with the flush, there may be old WAL files that must not be deleted because they can contain data of uncommitted transactions, so min_log_number_to_keep won't change either.
If we persist a new MANIFEST with advanced log_numbers for some column families, then during a second crash after persisting the MANIFEST, RocksDB will see some column families' log_numbers larger than the corrupted WAL, and the "column family inconsistency" error will be hit, causing recovery to fail.
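
To make the failure mode concrete, here is a minimal toy model in C++. It is not RocksDB code: it assumes the simplified rule stated above, namely that recovery is inconsistent when any column family's persisted log_number is larger than the number of the corrupted WAL. The names `ColumnFamilyState` and `FindInconsistentColumnFamily` are hypothetical and exist only for this sketch.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Toy stand-in for the per-column-family state recorded in the MANIFEST.
// Hypothetical type for illustration; not a RocksDB class.
struct ColumnFamilyState {
  std::string name;
  uint64_t log_number;  // this CF no longer needs WALs numbered below this
};

// Returns a pointer to the first column family whose log_number is larger
// than the corrupted WAL's number, or nullptr if every column family is
// still at or behind the corrupted WAL.
const ColumnFamilyState* FindInconsistentColumnFamily(
    const std::vector<ColumnFamilyState>& cfs, uint64_t corrupted_wal_number) {
  for (const auto& cf : cfs) {
    if (cf.log_number > corrupted_wal_number) {
      return &cf;
    }
  }
  return nullptr;
}
```

In this model, a MANIFEST that records log_number 15 for `test_cf` while replay stopped at corrupted WAL 12 is reported as inconsistent, which corresponds to the "SST file is ahead of WALs in CF test_cf" failures in the log below.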

This PR updates the unit tests to emulate the errors; the tests fail without the fix.

Error:

[ RUN      ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.CrashDuringRecovery/0
db/corruption_test.cc:1190: Failure
DB::Open(options, dbname_, cf_descs, &handles, &db_)
Corruption: SST file is ahead of WALs in CF test_cf
[  FAILED  ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.CrashDuringRecovery/0, where GetParam() = (true, false) (91 ms)
[ RUN      ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.CrashDuringRecovery/1
db/corruption_test.cc:1190: Failure
DB::Open(options, dbname_, cf_descs, &handles, &db_)
Corruption: SST file is ahead of WALs in CF test_cf
[  FAILED  ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.CrashDuringRecovery/1, where GetParam() = (false, false) (92 ms)
[ RUN      ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.CrashDuringRecovery/2
db/corruption_test.cc:1190: Failure
DB::Open(options, dbname_, cf_descs, &handles, &db_)
Corruption: SST file is ahead of WALs in CF test_cf
[  FAILED  ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.CrashDuringRecovery/2, where GetParam() = (true, true) (95 ms)
[ RUN      ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.CrashDuringRecovery/3
db/corruption_test.cc:1190: Failure
DB::Open(options, dbname_, cf_descs, &handles, &db_)
Corruption: SST file is ahead of WALs in CF test_cf
[  FAILED  ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.CrashDuringRecovery/3, where GetParam() = (false, true) (92 ms)
[ RUN      ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.TxnDbCrashDuringRecovery/0
db/corruption_test.cc:1354: Failure
TransactionDB::Open(options, txn_db_opts, dbname_, cf_descs, &handles, &txn_db)
Corruption: SST file is ahead of WALs in CF default
[  FAILED  ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.TxnDbCrashDuringRecovery/0, where GetParam() = (true, false) (94 ms)
[ RUN      ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.TxnDbCrashDuringRecovery/1
db/corruption_test.cc:1354: Failure
TransactionDB::Open(options, txn_db_opts, dbname_, cf_descs, &handles, &txn_db)
Corruption: SST file is ahead of WALs in CF default
[  FAILED  ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.TxnDbCrashDuringRecovery/1, where GetParam() = (false, false) (97 ms)
[ RUN      ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.TxnDbCrashDuringRecovery/2
db/corruption_test.cc:1354: Failure
TransactionDB::Open(options, txn_db_opts, dbname_, cf_descs, &handles, &txn_db)
Corruption: SST file is ahead of WALs in CF default
[  FAILED  ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.TxnDbCrashDuringRecovery/2, where GetParam() = (true, true) (94 ms)
[ RUN      ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.TxnDbCrashDuringRecovery/3
db/corruption_test.cc:1354: Failure
TransactionDB::Open(options, txn_db_opts, dbname_, cf_descs, &handles, &txn_db)
Corruption: SST file is ahead of WALs in CF default
[  FAILED  ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.TxnDbCrashDuringRecovery/3, where GetParam() = (false, true) (91 ms)
[ RUN      ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.CrashDuringRecoveryWithFlush/0
db/corruption_test.cc:1483: Failure
DB::Open(options, dbname_, cf_descs, &handles, &db_)
Corruption: SST file is ahead of WALs in CF default
[  FAILED  ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.CrashDuringRecoveryWithFlush/0, where GetParam() = (true, false) (93 ms)
[ RUN      ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.CrashDuringRecoveryWithFlush/1
db/corruption_test.cc:1483: Failure
DB::Open(options, dbname_, cf_descs, &handles, &db_)
Corruption: SST file is ahead of WALs in CF default
[  FAILED  ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.CrashDuringRecoveryWithFlush/1, where GetParam() = (false, false) (94 ms)
[ RUN      ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.CrashDuringRecoveryWithFlush/2
db/corruption_test.cc:1483: Failure
DB::Open(options, dbname_, cf_descs, &handles, &db_)
Corruption: SST file is ahead of WALs in CF default
[  FAILED  ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.CrashDuringRecoveryWithFlush/2, where GetParam() = (true, true) (90 ms)
[ RUN      ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.CrashDuringRecoveryWithFlush/3
db/corruption_test.cc:1483: Failure
DB::Open(options, dbname_, cf_descs, &handles, &db_)
Corruption: SST file is ahead of WALs in CF default
[  FAILED  ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.CrashDuringRecoveryWithFlush/3, where GetParam() = (false, true) (93 ms)
[----------] 12 tests from CorruptionTest/CrashDuringRecoveryWithCorruptionTest (1116 ms total)

Test Plan: Not needed

riversand963 · 2022-05-09T17:49:37Z

iiuc, you had another repro by corrupting the last WAL?

akankshamahajan15 · 2022-05-09T18:44:06Z

> iiuc, you had another repro by corrupting the last WAL?

I updated that one. I realized the unit tests were incorrect because I was corrupting the synced WALs. In these unit tests, the unsynced WALs are corrupted. Let me know if you feel we need to cover anything else; I can add new unit tests covering those missing scenarios.

akankshamahajan15 · 2022-05-10T21:15:18Z

Once it's reviewed and accepted, I will mark the tests as DISABLED and land it. I will re-enable the tests in the PR with the fix.

riversand963

Thanks @akankshamahajan15 for adding the tests.

I think we should still test both values of avoid_flush_during_recovery (true and false) in the last re-open. In the re-open with error injection, we always set it to true.

db/corruption_test.cc

akankshamahajan15 · 2022-05-11T02:02:26Z

> Thanks @akankshamahajan15 for adding the tests.
>
> I think we should still test both values of avoid_flush_during_recovery (true and false) in the last re-open. In the re-open with error injection, we always set it to true.

OK, that makes sense. I also tried avoid_flush_during_recovery = false during injection and it was crashing, but setting it to false in the last reopen should work. Thanks for the review. I will address the comments.

riversand963

Thanks @akankshamahajan15 for adding the tests.
One comment: for each test in this PR, can we add some verification logic after the last open to make sure no existing data is lost?

akankshamahajan15 · 2022-05-11T19:46:54Z

> Thanks @akankshamahajan15 for adding the tests. One comment: for each test in this PR, can we add some verification logic after the last open to make sure no existing data is lost?

Sure. I will add the verification in the PR with the fix.

facebook-github-bot · 2022-05-11T19:55:12Z

@akankshamahajan15 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot · 2022-05-11T19:56:43Z

@akankshamahajan15 has updated the pull request. You must reimport the pull request before landing.

facebook-github-bot · 2022-05-11T20:01:45Z

@akankshamahajan15 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

…g recovery (#9922)

Summary: For a non-TransactionDB with avoid_flush_during_recovery = true, RocksDB avoids flushing data from the WAL to L0 for all column families where possible. As a result, not all column families can advance their log_numbers, and min_log_number_to_keep won't change. For a TransactionDB (.allow_2pc), even with the flush, there may be old WAL files that must not be deleted because they can contain data of uncommitted transactions, so min_log_number_to_keep won't change either. If we persist a new MANIFEST with advanced log_numbers for some column families, then during a second crash after persisting the MANIFEST, RocksDB will see some column families' log_numbers larger than the corrupted WAL, and the "column family inconsistency" error will be hit, causing recovery to fail.

As a solution, RocksDB will persist the new MANIFEST after successfully syncing the new WAL. If a future recovery starts from the new MANIFEST, it means the new WAL was successfully synced. Due to the sentinel empty write batch at the beginning, kPointInTimeRecovery of the WAL is guaranteed to go after this point. If a future recovery starts from the old MANIFEST, it means writing the new MANIFEST failed, and we won't have the "SST ahead of WAL" error.

Currently, RocksDB DB::Open() may create and write to two new MANIFEST files even before recovery succeeds. This PR buffers the edits in a structure and writes to a new MANIFEST after recovery is successful.

Pull Request resolved: #9922

Test Plan:
1. Update unit tests to fail without this change
2. make crash_test -j

Branch with unit test and no fix: #9942, to keep track of the unit test (without fix)

Reviewed By: riversand963

Differential Revision: D36043701

Pulled By: akankshamahajan15

fbshipit-source-id: 5760970db0a0920fb73d3c054a4155733500acd9
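
The ordering described in the commit message above (buffer the edits during recovery, and persist them to the MANIFEST only after the new WAL has been synced) can be sketched as a toy model in C++. This is an illustration under stated assumptions, not the actual change in #9922: `Edit` and `RecoveryBuffer` are hypothetical names, the MANIFEST is modeled as a plain vector, and the WAL sync result is passed in as a boolean.

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one MANIFEST record; not RocksDB's VersionEdit.
struct Edit {
  std::string column_family;
  uint64_t new_log_number;
};

// Toy recovery context that buffers edits instead of writing them right away.
class RecoveryBuffer {
 public:
  // Records an edit in memory only; nothing is persisted yet.
  void Add(Edit edit) { pending_.push_back(std::move(edit)); }

  // Appends the buffered edits to `manifest` only if the new WAL was synced.
  // If the sync failed, `manifest` is left untouched and the buffered edits
  // are dropped, so a later recovery would start from the old MANIFEST.
  bool Commit(bool new_wal_synced, std::vector<Edit>* manifest) {
    if (!new_wal_synced) {
      pending_.clear();
      return false;
    }
    for (auto& edit : pending_) {
      manifest->push_back(std::move(edit));
    }
    pending_.clear();
    return true;
  }

 private:
  std::vector<Edit> pending_;
};
```

The point of the design is that advanced log_numbers never become visible in the persisted state unless the WAL they refer to is already durable, so the persisted state cannot run ahead of the WALs in this model.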

facebook-github-bot added the CLA Signed label May 4, 2022

akankshamahajan15 force-pushed the fail_without_fix branch from 83517f5 to 8cdea4c Compare May 4, 2022 01:16

akankshamahajan15 mentioned this pull request May 4, 2022

Persist the new MANIFEST after successfully syncing the new WAL during recovery #9922

Closed

akankshamahajan15 force-pushed the fail_without_fix branch from 8cdea4c to da8d41b Compare May 9, 2022 01:20

akankshamahajan15 changed the title ~~Update test so that it fails without changes~~ Update WAL corruption test so that it fails without fix May 9, 2022

akankshamahajan15 requested review from riversand963 and ajkr May 9, 2022 01:27

akankshamahajan15 force-pushed the fail_without_fix branch 2 times, most recently from b039819 to cc5e522 Compare May 10, 2022 21:10

riversand963 reviewed May 11, 2022

akankshamahajan15 force-pushed the fail_without_fix branch 2 times, most recently from 8c9069d to d5d170b Compare May 11, 2022 16:05

akankshamahajan15 requested a review from riversand963 May 11, 2022 16:06

riversand963 approved these changes May 11, 2022

akankshamahajan15 added 2 commits May 11, 2022 12:54

Update WAL corruption test

9458489

Summary: Update the unit tests to fail if an unsynced WAL is corrupted, for a non-transaction DB with avoid_flush_during_recovery = true and for a TransactionDB with allow_2pc.

Addressed comments

25bb393

akankshamahajan15 force-pushed the fail_without_fix branch from d5d170b to c54b51a Compare May 11, 2022 19:54

DISABLED the tests

8822742

akankshamahajan15 force-pushed the fail_without_fix branch from c54b51a to 8822742 Compare May 11, 2022 19:56

facebook-github-bot closed this in 6442a62 May 11, 2022
