Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for MANIFEST tailing #4710

Closed
wants to merge 1 commit into from

Conversation

riversand963
Copy link
Contributor

@riversand963 riversand963 commented Nov 22, 2018

This depends on #4602 and #4745
Test plan (to be updated)

$make clean && make -j32 all check
$./db_secondary_test

All tests much pass.

@riversand963 riversand963 added the WIP Work in progress label Nov 22, 2018
@riversand963 riversand963 self-assigned this Nov 22, 2018
@riversand963 riversand963 force-pushed the ro_db_manifest branch 2 times, most recently from d620c76 to 376391c Compare November 29, 2018 18:33
@riversand963 riversand963 force-pushed the ro_db_manifest branch 8 times, most recently from 7c909c7 to 97c83a3 Compare December 20, 2018 22:48
@riversand963 riversand963 changed the title [WIP] Add support for MANIFEST tailing Add support for MANIFEST tailing Dec 20, 2018
@riversand963 riversand963 removed the WIP Work in progress label Dec 20, 2018
@riversand963 riversand963 requested a review from siying December 21, 2018 17:33
tracer_->Get(column_family, key);
}
}
SuperVersion* super_version = cfd->GetSuperVersion();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this SuperVersion go away later? Do we need to do the thread-local super version trick?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. If the secondary calls InstallSuperVersion (in another thread), this superversion can go away.

ColumnFamilyHandle* column_family) {
auto cfh = reinterpret_cast<ColumnFamilyHandleImpl*>(column_family);
auto cfd = cfh->cfd();
SuperVersion* super_version = cfd->GetSuperVersion()->Ref();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a data race here? Do we need to do the thread-local super version trick?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. Let me take a look.

return Status::OK();
}

Status DBImplSecondary::TryCatchUpWithPrimary() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Who is supposed to call this function? I didn't find it called, except in a unit test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently it's called only in unit test (#4820). I'm thinking about adding a separate thread calling this function in a future PR.

// It's possible that we have finished reading the current MANIFEST, and
// the primary has created a new MANIFEST.
log::Reader::Reporter* reporter = reader->GetReporter();
s = MaybeSwitchManifest(reporter, manifest_reader);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, this is complicated... If you asked me before, I would say we don't support manifest switch for now. Anyway since it is already implemented...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Let's keep it for now.

// the primary has created a new MANIFEST.
log::Reader::Reporter* reporter = reader->GetReporter();
s = MaybeSwitchManifest(reporter, manifest_reader);
reader = manifest_reader->get();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, so far I didn't see how this safely handles manfiest switching. Handling a new manifest, we need to start with empty Version. Maybe I miss something but I don't see it is the way handled.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not keep Versions across different MANIFESTs, I assume you are referring to VersionBuilders which actually save (buffer) the changes of one or multiple VersionEdits?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed with @siying offline. For this PR, we are not going to worry about MANIFEST switching.

@riversand963 riversand963 force-pushed the ro_db_manifest branch 3 times, most recently from 78fa5b2 to c750120 Compare December 28, 2018 19:04
Test plan (to be updated)
```
$make clean && make -j32 all check
$./db_secondary_test
```

All tests must pass.
@riversand963
Copy link
Contributor Author

Moved to #4899

facebook-github-bot pushed a commit that referenced this pull request Mar 26, 2019
Summary:
This PR allows RocksDB to run in single-primary, multi-secondary process mode.
The writer is a regular RocksDB (e.g. an `DBImpl`) instance playing the role of a primary.
Multiple `DBImplSecondary` processes (secondaries) share the same set of SST files, MANIFEST, WAL files with the primary. Secondaries tail the MANIFEST of the primary and apply updates to their own in-memory state of the file system, e.g. `VersionStorageInfo`.

This PR has several components:
1. (Originally in #4745). Add a `PathNotFound` subcode to `IOError` to denote the failure when a secondary tries to open a file which has been deleted by the primary.

2. (Similar to #4602). Add `FragmentBufferedReader` to handle partially-read, trailing record at the end of a log from where future read can continue.

3. (Originally in #4710 and #4820). Add implementation of the secondary, i.e. `DBImplSecondary`.
3.1 Tail the primary's MANIFEST during recovery.
3.2 Tail the primary's MANIFEST during normal processing by calling `ReadAndApply`.
3.3 Tailing WAL will be in a future PR.

4. Add an example in 'examples/multi_processes_example.cc' to demonstrate the usage of secondary RocksDB instance in a multi-process setting. Instructions to run the example can be found at the beginning of the source code.
Pull Request resolved: #4899

Differential Revision: D14510945

Pulled By: riversand963

fbshipit-source-id: 4ac1c5693e6012ad23f7b4b42d3c374fecbe8886
jwasinger pushed a commit to jwasinger/rocksdb that referenced this pull request Apr 1, 2019
Summary:
This PR allows RocksDB to run in single-primary, multi-secondary process mode.
The writer is a regular RocksDB (e.g. an `DBImpl`) instance playing the role of a primary.
Multiple `DBImplSecondary` processes (secondaries) share the same set of SST files, MANIFEST, WAL files with the primary. Secondaries tail the MANIFEST of the primary and apply updates to their own in-memory state of the file system, e.g. `VersionStorageInfo`.

This PR has several components:
1. (Originally in facebook#4745). Add a `PathNotFound` subcode to `IOError` to denote the failure when a secondary tries to open a file which has been deleted by the primary.

2. (Similar to facebook#4602). Add `FragmentBufferedReader` to handle partially-read, trailing record at the end of a log from where future read can continue.

3. (Originally in facebook#4710 and facebook#4820). Add implementation of the secondary, i.e. `DBImplSecondary`.
3.1 Tail the primary's MANIFEST during recovery.
3.2 Tail the primary's MANIFEST during normal processing by calling `ReadAndApply`.
3.3 Tailing WAL will be in a future PR.

4. Add an example in 'examples/multi_processes_example.cc' to demonstrate the usage of secondary RocksDB instance in a multi-process setting. Instructions to run the example can be found at the beginning of the source code.
Pull Request resolved: facebook#4899

Differential Revision: D14510945

Pulled By: riversand963

fbshipit-source-id: 4ac1c5693e6012ad23f7b4b42d3c374fecbe8886
vagogte pushed a commit to vagogte/rocksdb that referenced this pull request Jun 18, 2019
Summary:
This PR allows RocksDB to run in single-primary, multi-secondary process mode.
The writer is a regular RocksDB (e.g. an `DBImpl`) instance playing the role of a primary.
Multiple `DBImplSecondary` processes (secondaries) share the same set of SST files, MANIFEST, WAL files with the primary. Secondaries tail the MANIFEST of the primary and apply updates to their own in-memory state of the file system, e.g. `VersionStorageInfo`.

This PR has several components:
1. (Originally in facebook#4745). Add a `PathNotFound` subcode to `IOError` to denote the failure when a secondary tries to open a file which has been deleted by the primary.

2. (Similar to facebook#4602). Add `FragmentBufferedReader` to handle partially-read, trailing record at the end of a log from where future read can continue.

3. (Originally in facebook#4710 and facebook#4820). Add implementation of the secondary, i.e. `DBImplSecondary`.
3.1 Tail the primary's MANIFEST during recovery.
3.2 Tail the primary's MANIFEST during normal processing by calling `ReadAndApply`.
3.3 Tailing WAL will be in a future PR.

4. Add an example in 'examples/multi_processes_example.cc' to demonstrate the usage of secondary RocksDB instance in a multi-process setting. Instructions to run the example can be found at the beginning of the source code.
Pull Request resolved: facebook#4899

Differential Revision: D14510945

Pulled By: riversand963

fbshipit-source-id: 4ac1c5693e6012ad23f7b4b42d3c374fecbe8886
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants