Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cli: Add flag to choose pebble as storage engine #41453

Merged
merged 1 commit into from
Oct 16, 2019

Conversation

itsbilal
Copy link
Member

@itsbilal itsbilal commented Oct 8, 2019

This change adds a --storage-engine CLI flag, which has options
rocksdb and pebble. Specifying pebble ensures all stores created
on this node, including the temp engine, use pebble instead of rocksdb.

This is a per-node flag instead of a per-store flag to eliminate cases
where we have to maintain two caches, one for rocksdb instances and
one for pebble instances.

Also remove --temp-engine flag, which is now unnecessary.

Release note (cli change): Add command-line argument --storage-engine
to specify type of storage engine to use. Options are rocksdb (default)
and pebble.

@itsbilal itsbilal requested a review from a team as a code owner October 8, 2019 21:20
@itsbilal itsbilal self-assigned this Oct 8, 2019
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@itsbilal
Copy link
Member Author

itsbilal commented Oct 8, 2019

Dependency graph of related PRs:

#41199 ---------------------  #41452  --------------------------- #41455
(pebble engine)       \      (pebble MVCC scanner)              (MVCC test refactor)
                       \        
                        \
                         ----- #41453
                            (CLI changes)
                            (this PR)

Copy link
Collaborator

@petermattis petermattis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajkr, @itsbilal, @petermattis, and @sumeerbhola)


pkg/base/config.go, line 153 at r2 (raw file):

// Type implements the pflag.Value interface.
func (e *EngineType) Type() string { return "string" }

I'm not sure what Type() is used for, but I think it is supposed to distinguish from other flag types. Perhaps this should be "engine".


pkg/server/config.go, line 496 at r2 (raw file):

				// TODO(itsbilal): Tune these options, and allow them to be overridden
				// in the spec (similar to the existing spec.RocksDBOptions and others).
				pebbleOpts := &pebble.Options{

You don't seem to be specifying cache. Am I missing where that is done?

@knz
Copy link
Contributor

knz commented Oct 9, 2019

Not reviewing the code yet, I have two questions and a remark:

  1. how did you motivate the choice to make this flag global, instead of per-store? We document that a node can use multiple stores, is there a reason not to enable different stores from using different storage engines? (Like we already enable in-memory stores as a separate store type)

  2. You mention that it only works for non-temporary stores. So that means that temporary stores still use rocksdb? What was the rationale for that? (The rationale should be detailed in the commit message)

  3. This is a user-facing change (and a prominent one at that) and thus should be detailed in a release note.

@petermattis
Copy link
Collaborator

how did you motivate the choice to make this flag global, instead of per-store? We document that a node can use multiple stores, is there a reason not to enable different stores from using different storage engines? (Like we already enable in-memory stores as a separate store type)

A per-store flag is possible, but then we need to figure out how to allocate the block cache between Pebble and RocksDB. My preference is to avoid this complication and just make the flag apply to all stores. Actually trying to share the block cache between Pebble and RocksDB is a non-starter due to technical difficulty.

You mention that it only works for non-temporary stores. So that means that temporary stores still use rocksdb? What was the rationale for that? (The rationale should be detailed in the commit message)

Oh, good point. We should make this apply to temporary stores too!

@knz
Copy link
Contributor

knz commented Oct 9, 2019

A per-store flag is possible, but then we need to figure out how to allocate the block cache between Pebble and RocksDB. My preference is to avoid this complication and just make the flag apply to all stores. Actually trying to share the block cache between Pebble and RocksDB is a non-starter due to technical difficulty

This is a good explanation. It should be detailed in the commit message too.

@itsbilal
Copy link
Member Author

itsbilal commented Oct 9, 2019

how did you motivate the choice to make this flag global, instead of per-store? We document that a node can use multiple stores, is there a reason not to enable different stores from using different storage engines? (Like we already enable in-memory stores as a separate store type)

Peter has already answered this, but yes, the goal is to eliminate double-caching. I had originally implemented a per-store flag but was advised to go with this approach.

There's a separate flag for temp stores that's already in master and 19.2, --temp-engine. Given we were already creating a new cache for that in rocksdb (and would be doing the same in pebble), we don't lose out much if the temp and non-temp storage engines don't match.

There's probably more value in having more fine-grained control of each store that way, right? Instead of having --storage-engine override whatever was specified for --temp-engine (or eliminating the latter altogether). I'm fine with it either way, just want to know what would be the better way forward.

@petermattis
Copy link
Collaborator

There's probably more value in having more fine-grained control of each store that way, right? Instead of having --storage-engine override whatever was specified for --temp-engine (or eliminating the latter altogether). I'm fine with it either way, just want to know what would be the better way forward.

I had forgotten about --temp-engine. I think we should have one flag that controls all use of Pebble vs RocksDB.

@itsbilal itsbilal force-pushed the pebble-node-flag branch 4 times, most recently from 9feb314 to 63d5297 Compare October 10, 2019 15:40
Copy link
Member Author

@itsbilal itsbilal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated commit message, added a release note, fixed cache instantiation (discussed in #41199), and removed --temp-storage. PTAL!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajkr, @petermattis, and @sumeerbhola)


pkg/base/config.go, line 153 at r2 (raw file):

Previously, petermattis (Peter Mattis) wrote…

I'm not sure what Type() is used for, but I think it is supposed to distinguish from other flag types. Perhaps this should be "engine".

It seems like we expect it to be a primitive type everywhere else. Other similar Type() functions on CLI argument types seem to return either string or int or something of that nature.


pkg/server/config.go, line 496 at r2 (raw file):

Previously, petermattis (Peter Mattis) wrote…

You don't seem to be specifying cache. Am I missing where that is done?

Done. Cache is instantiated here now.

@knz
Copy link
Contributor

knz commented Oct 10, 2019

Updated commit message, added a release note

💯

Copy link
Collaborator

@petermattis petermattis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @ajkr, @itsbilal, and @sumeerbhola)


pkg/base/config.go, line 153 at r2 (raw file):

Previously, itsbilal (Bilal Akhtar) wrote…

It seems like we expect it to be a primitive type everywhere else. Other similar Type() functions on CLI argument types seem to return either string or int or something of that nature.

It doesn't have to be a primitive type. The spf13/pflag library we use internally uses types like intSlice and stringArray. I'm not quite sure what the purpose of this is, though. The docs provide no clue and it only seems to be used for special formatting. Fine to leave as "string".

This change adds a --storage-engine CLI flag, which has options
`rocksdb` and `pebble`. Specifying `pebble` ensures all stores created
on this node, including the temp engine, use pebble instead of rocksdb.

This is a per-node flag instead of a per-store flag to eliminate cases
where we have to maintain two caches, one for rocksdb instances and
one for pebble instances.

Also remove --temp-engine flag, which is now unnecessary.

Release note (cli change): Add command-line argument --storage-engine
to specify type of storage engine to use. Options are rocksdb (default)
and pebble.
@itsbilal
Copy link
Member Author

Thanks for the review! Did a no-change rebase, going to merge when CI is green. Though the flag will practically be useless until #41452 lands.

@petermattis
Copy link
Collaborator

Thanks for the review! Did a no-change rebase, going to merge when CI is green. Though the flag will practically be useless until #41452 lands.

You might want to hold off on merging, or perhaps add a commit which disables support for pebble, otherwise you'll be fielding questions about why pebble doesn't work.

@itsbilal
Copy link
Member Author

Makes sense. I'll hold off on merging until #41452 lands then. Thanks!

@itsbilal
Copy link
Member Author

bors r+

craig bot pushed a commit that referenced this pull request Oct 16, 2019
41452: engine: Port MVCCScanner to Go, implement MVCCScan for pebble r=itsbilal a=itsbilal

This change ports the MVCC Scanner from libroach to go, for use
with Pebble's iterators. One half of the work involved in #39674.
Also updates pebbleIterator to call it in pebbleIterator.MVCC*.

First commit is #41199 , which should merge soon.

Release note: None

41453: cli: Add flag to choose pebble as storage engine  r=itsbilal a=itsbilal

This change adds a --storage-engine CLI flag, which has options
`rocksdb` and `pebble`. Specifying `pebble` ensures all stores created
on this node, including the temp engine, use pebble instead of rocksdb.

This is a per-node flag instead of a per-store flag to eliminate cases
where we have to maintain two caches, one for rocksdb instances and
one for pebble instances.

Also remove --temp-engine flag, which is now unnecessary.

Release note (cli change): Add command-line argument --storage-engine
to specify type of storage engine to use. Options are rocksdb (default)
and pebble.

Co-authored-by: Bilal Akhtar <[email protected]>
@craig
Copy link
Contributor

craig bot commented Oct 16, 2019

Build succeeded

@craig craig bot merged commit cb77d2b into cockroachdb:master Oct 16, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants