Skip to content

Commit

Permalink
Protect existing files in `FaultInjectionTest{Env,FS}::ReopenWritable…
Browse files Browse the repository at this point in the history
…File()` (#8995)

Summary:
`FaultInjectionTest{Env,FS}::ReopenWritableFile()` functions were accidentally deleting WALs from previous `db_stress` runs causing verification to fail. They were operating under the assumption that `ReopenWritableFile()` would delete any existing file. It was a reasonable assumption considering the `{Env,FileSystem}::ReopenWritableFile()` documentation stated that would happen. The only problem was neither the implementations we offer nor the "real" clients in RocksDB code followed that contract. So, this PR updates the contract as well as fixing the fault injection client usage.

The fault injection change exposed that `ExternalSSTFileBasicTest.SyncFailure` was relying on a fault injection `Env` dropping unsynced data written by a regular `Env`. I changed that test to make its `SstFileWriter` use fault injection `Env`, and also implemented `LinkFile()` in fault injection so the unsynced data is tracked under the new name.

Pull Request resolved: #8995

Test Plan:
- Verified it fixes the following failure:

```
$ ./db_stress --clear_column_family_one_in=0 --column_families=1 --db=/dev/shm/rocksdb_crashtest_whitebox --delpercent=5 --expected_values_dir=/dev/shm/rocksdb_crashtest_expected --iterpercent=0 --key_len_percent_dist=1,30,69 --max_key=100000 --max_key_len=3 --nooverwritepercent=1 --ops_per_thread=1000 --prefixpercent=0 --readpercent=60 --reopen=0 --target_file_size_base=1048576 --test_batches_snapshots=0 --write_buffer_size=1048576 --writepercent=35 --value_size_mult=33 -threads=1
...
$ ./db_stress --avoid_flush_during_recovery=1 --clear_column_family_one_in=0 --column_families=1 --db=/dev/shm/rocksdb_crashtest_whitebox --delpercent=5 --destroy_db_initially=0 --expected_values_dir=/dev/shm/rocksdb_crashtest_expected --iterpercent=10 --key_len_percent_dist=1,30,69 --max_bytes_for_level_base=4194304 --max_key=100000 --max_key_len=3 --nooverwritepercent=1 --open_files=-1 --open_metadata_write_fault_one_in=8 --open_write_fault_one_in=16 --ops_per_thread=1000 --prefix_size=-1 --prefixpercent=0 --readpercent=50 --sync=1 --target_file_size_base=1048576 --test_batches_snapshots=0 --write_buffer_size=1048576 --writepercent=35 --value_size_mult=33 -threads=1
...
Verification failed for column family 0 key 000000000000001300000000000000857878787878 (1143): Value not found: NotFound:
Crash-recovery verification failed :(
...
```

- `make check -j48`

Reviewed By: ltamasi

Differential Revision: D31495388

Pulled By: ajkr

fbshipit-source-id: 7886ccb6a07cb8b78ad7b6c1c341ccf40bb68385
  • Loading branch information
ajkr committed Oct 11, 2021
1 parent ffd4e96 commit 2df8905
Show file tree
Hide file tree
Showing 10 changed files with 203 additions and 52 deletions.
1 change: 1 addition & 0 deletions HISTORY.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@
## Unreleased
### Bug Fixes
* Fix `DisableManualCompaction()` to cancel compactions even when they are waiting on automatic compactions to drain due to `CompactRangeOptions::exclusive_manual_compactions == true`.
* Fix contract of `Env::ReopenWritableFile()` and `FileSystem::ReopenWritableFile()` to specify any existing file must not be deleted or truncated.

## 6.25.1 (2021-09-28)
### Bug Fixes
Expand Down
2 changes: 1 addition & 1 deletion db/external_sst_file_basic_test.cc
Original file line number Diff line number Diff line change
Expand Up @@ -1147,7 +1147,7 @@ TEST_F(ExternalSSTFileBasicTest, SyncFailure) {
}

Options sst_file_writer_options;
sst_file_writer_options.env = env_;
sst_file_writer_options.env = fault_injection_test_env_.get();
std::unique_ptr<SstFileWriter> sst_file_writer(
new SstFileWriter(EnvOptions(), sst_file_writer_options));
std::string file_name =
Expand Down
11 changes: 8 additions & 3 deletions db_stress_tool/db_stress_gflags.cc
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,10 @@ static bool ValidateUint32Range(const char* flagname, uint64_t value) {
return true;
}

DEFINE_uint64(seed, 2341234, "Seed for PRNG");
DEFINE_uint64(seed, 2341234,
"Seed for PRNG. When --nooverwritepercent is "
"nonzero and --expected_values_dir is nonempty, this value "
"must be fixed across invocations.");
static const bool FLAGS_seed_dummy __attribute__((__unused__)) =
RegisterFlagValidator(&FLAGS_seed, &ValidateUint32Range);

Expand Down Expand Up @@ -453,7 +456,8 @@ DEFINE_string(
"provided and non-empty, the DB state will be verified against these "
"values after recovery. --max_key and --column_family must be kept the "
"same across invocations of this program that use the same "
"--expected_values_path.");
"--expected_values_path. See --seed and --nooverwritepercent for further "
"requirements.");

DEFINE_bool(verify_checksum, false,
"Verify checksum for every block read from storage");
Expand Down Expand Up @@ -644,7 +648,8 @@ static const bool FLAGS_delrangepercent_dummy __attribute__((__unused__)) =

DEFINE_int32(nooverwritepercent, 60,
"Ratio of keys without overwrite to total workload (expressed as "
" a percentage)");
"a percentage). When --expected_values_dir is nonempty, must "
"keep this value constant across invocations.");
static const bool FLAGS_nooverwritepercent_dummy __attribute__((__unused__)) =
RegisterFlagValidator(&FLAGS_nooverwritepercent, &ValidateInt32Percent);

Expand Down
11 changes: 6 additions & 5 deletions include/rocksdb/env.h
Original file line number Diff line number Diff line change
Expand Up @@ -266,11 +266,12 @@ class Env {
std::unique_ptr<WritableFile>* result,
const EnvOptions& options) = 0;

// Create an object that writes to a new file with the specified
// name. Deletes any existing file with the same name and creates a
// new file. On success, stores a pointer to the new file in
// *result and returns OK. On failure stores nullptr in *result and
// returns non-OK.
// Create an object that writes to a file with the specified name.
// `WritableFile::Append()`s will append after any existing content. If the
// file does not already exist, creates it.
//
// On success, stores a pointer to the file in *result and returns OK. On
// failure stores nullptr in *result and returns non-OK.
//
// The returned file will only be accessed by one thread at a time.
virtual Status ReopenWritableFile(const std::string& /*fname*/,
Expand Down
11 changes: 6 additions & 5 deletions include/rocksdb/file_system.h
Original file line number Diff line number Diff line change
Expand Up @@ -309,11 +309,12 @@ class FileSystem {
std::unique_ptr<FSWritableFile>* result,
IODebugContext* dbg) = 0;

// Create an object that writes to a new file with the specified
// name. Deletes any existing file with the same name and creates a
// new file. On success, stores a pointer to the new file in
// *result and returns OK. On failure stores nullptr in *result and
// returns non-OK.
// Create an object that writes to a file with the specified name.
// `FSWritableFile::Append()`s will append after any existing content. If the
// file does not already exist, creates it.
//
// On success, stores a pointer to the file in *result and returns OK. On
// failure stores nullptr in *result and returns non-OK.
//
// The returned file will only be accessed by one thread at a time.
virtual IOStatus ReopenWritableFile(
Expand Down
3 changes: 3 additions & 0 deletions tools/db_crashtest.py
Original file line number Diff line number Diff line change
Expand Up @@ -78,6 +78,9 @@
"max_key": 100000000,
"max_write_buffer_number": 3,
"mmap_read": lambda: random.randint(0, 1),
# Setting `nooverwritepercent > 0` is only possible because we do not vary
# the random seed, so the same keys are chosen by every run for disallowing
# overwrites.
"nooverwritepercent": 1,
"open_files": lambda : random.choice([-1, -1, 100, 500000]),
"optimize_filters_for_memory": lambda: random.randint(0, 1),
Expand Down
92 changes: 75 additions & 17 deletions utilities/fault_injection_env.cc
Original file line number Diff line number Diff line change
Expand Up @@ -288,7 +288,7 @@ Status FaultInjectionTestEnv::NewWritableFile(
// again then it will be truncated - so forget our saved state.
UntrackFile(fname);
MutexLock l(&mutex_);
open_files_.insert(fname);
open_managed_files_.insert(fname);
auto dir_and_name = GetDirAndName(fname);
auto& list = dir_to_new_files_since_last_sync_[dir_and_name.first];
list.insert(dir_and_name.second);
Expand All @@ -302,17 +302,49 @@ Status FaultInjectionTestEnv::ReopenWritableFile(
if (!IsFilesystemActive()) {
return GetError();
}
Status s = target()->ReopenWritableFile(fname, result, soptions);

bool exists;
Status s, exists_s = target()->FileExists(fname);
if (exists_s.IsNotFound()) {
exists = false;
} else if (exists_s.ok()) {
exists = true;
} else {
s = exists_s;
exists = false;
}

if (s.ok()) {
result->reset(new TestWritableFile(fname, std::move(*result), this));
// WritableFileWriter* file is opened
// again then it will be truncated - so forget our saved state.
UntrackFile(fname);
MutexLock l(&mutex_);
open_files_.insert(fname);
auto dir_and_name = GetDirAndName(fname);
auto& list = dir_to_new_files_since_last_sync_[dir_and_name.first];
list.insert(dir_and_name.second);
s = target()->ReopenWritableFile(fname, result, soptions);
}

// Only track files we created. Files created outside of this
// `FaultInjectionTestEnv` are not eligible for tracking/data dropping
// (for example, they may contain data a previous db_stress run expects to
// be recovered). This could be extended to track/drop data appended once
// the file is under `FaultInjectionTestEnv`'s control.
if (s.ok()) {
bool should_track;
{
MutexLock l(&mutex_);
if (db_file_state_.find(fname) != db_file_state_.end()) {
// It was written by this `Env` earlier.
assert(exists);
should_track = true;
} else if (!exists) {
// It was created by this `Env` just now.
should_track = true;
open_managed_files_.insert(fname);
auto dir_and_name = GetDirAndName(fname);
auto& list = dir_to_new_files_since_last_sync_[dir_and_name.first];
list.insert(dir_and_name.second);
} else {
should_track = false;
}
}
if (should_track) {
result->reset(new TestWritableFile(fname, std::move(*result), this));
}
}
return s;
}
Expand All @@ -330,7 +362,7 @@ Status FaultInjectionTestEnv::NewRandomRWFile(
// again then it will be truncated - so forget our saved state.
UntrackFile(fname);
MutexLock l(&mutex_);
open_files_.insert(fname);
open_managed_files_.insert(fname);
auto dir_and_name = GetDirAndName(fname);
auto& list = dir_to_new_files_since_last_sync_[dir_and_name.first];
list.insert(dir_and_name.second);
Expand Down Expand Up @@ -394,17 +426,43 @@ Status FaultInjectionTestEnv::RenameFile(const std::string& s,
return ret;
}

Status FaultInjectionTestEnv::LinkFile(const std::string& s,
const std::string& t) {
if (!IsFilesystemActive()) {
return GetError();
}
Status ret = EnvWrapper::LinkFile(s, t);

if (ret.ok()) {
MutexLock l(&mutex_);
if (db_file_state_.find(s) != db_file_state_.end()) {
db_file_state_[t] = db_file_state_[s];
}

auto sdn = GetDirAndName(s);
auto tdn = GetDirAndName(t);
if (dir_to_new_files_since_last_sync_[sdn.first].find(sdn.second) !=
dir_to_new_files_since_last_sync_[sdn.first].end()) {
auto& tlist = dir_to_new_files_since_last_sync_[tdn.first];
assert(tlist.find(tdn.second) == tlist.end());
tlist.insert(tdn.second);
}
}

return ret;
}

void FaultInjectionTestEnv::WritableFileClosed(const FileState& state) {
MutexLock l(&mutex_);
if (open_files_.find(state.filename_) != open_files_.end()) {
if (open_managed_files_.find(state.filename_) != open_managed_files_.end()) {
db_file_state_[state.filename_] = state;
open_files_.erase(state.filename_);
open_managed_files_.erase(state.filename_);
}
}

void FaultInjectionTestEnv::WritableFileSynced(const FileState& state) {
MutexLock l(&mutex_);
if (open_files_.find(state.filename_) != open_files_.end()) {
if (open_managed_files_.find(state.filename_) != open_managed_files_.end()) {
if (db_file_state_.find(state.filename_) == db_file_state_.end()) {
db_file_state_.insert(std::make_pair(state.filename_, state));
} else {
Expand All @@ -415,7 +473,7 @@ void FaultInjectionTestEnv::WritableFileSynced(const FileState& state) {

void FaultInjectionTestEnv::WritableFileAppended(const FileState& state) {
MutexLock l(&mutex_);
if (open_files_.find(state.filename_) != open_files_.end()) {
if (open_managed_files_.find(state.filename_) != open_managed_files_.end()) {
if (db_file_state_.find(state.filename_) == db_file_state_.end()) {
db_file_state_.insert(std::make_pair(state.filename_, state));
} else {
Expand Down Expand Up @@ -485,6 +543,6 @@ void FaultInjectionTestEnv::UntrackFile(const std::string& f) {
dir_to_new_files_since_last_sync_[dir_and_name.first].erase(
dir_and_name.second);
db_file_state_.erase(f);
open_files_.erase(f);
open_managed_files_.erase(f);
}
} // namespace ROCKSDB_NAMESPACE
6 changes: 4 additions & 2 deletions utilities/fault_injection_env.h
Original file line number Diff line number Diff line change
Expand Up @@ -175,6 +175,8 @@ class FaultInjectionTestEnv : public EnvWrapper {
virtual Status RenameFile(const std::string& s,
const std::string& t) override;

virtual Status LinkFile(const std::string& s, const std::string& t) override;

// Undef to eliminate clash on Windows
#undef GetFreeSpace
virtual Status GetFreeSpace(const std::string& path,
Expand Down Expand Up @@ -237,13 +239,13 @@ class FaultInjectionTestEnv : public EnvWrapper {
SetFilesystemActiveNoLock(active, error);
error.PermitUncheckedError();
}
void AssertNoOpenFile() { assert(open_files_.empty()); }
void AssertNoOpenFile() { assert(open_managed_files_.empty()); }
Status GetError() { return error_; }

private:
port::Mutex mutex_;
std::map<std::string, FileState> db_file_state_;
std::set<std::string> open_files_;
std::set<std::string> open_managed_files_;
std::unordered_map<std::string, std::set<std::string>>
dir_to_new_files_since_last_sync_;
bool filesystem_active_; // Record flushes, syncs, writes
Expand Down
Loading

0 comments on commit 2df8905

Please sign in to comment.