Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sstable: introduce objstorage interface #2267

Merged
merged 1 commit into from
Feb 1, 2023

Conversation

RaduBerinde
Copy link
Member

@RaduBerinde RaduBerinde commented Jan 23, 2023

This change introduces objstorage.Provider which replaces uses of vfs.FS / vfs.File in sstable code.

We pull out the write-buffering and read-ahead logic into the Provider implementation. This simplifies the sstable reader code considerably.

The provider will be extended to support shared storage, along the following lines:

  • define a new SharedObjectStore interface
  • the Provider stores a SharedObjectStore instance (alongside the local vfs.FS)
  • the Provider maintains internally a mapping from FileNum to backend type (local vs shared). Methods like OpenForReading use the mapping to talk to the correct backend.
  • Create needs to be augmented with extra information (e.g. LSM level?) to allow the provider to decide where a new object is created

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@RaduBerinde
Copy link
Member Author

See #2254 for comments on an earlier version of this.

Copy link
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(still reading)

Reviewed 5 of 36 files at r1.
Reviewable status: 5 of 36 files reviewed, 4 unresolved discussions (waiting on @itsbilal and @RaduBerinde)


-- commits line 18 at r1:
We can just expect the caller to specify whether it wants it created locally or shareable.
The input to that decision can include what key span the caller expects to write to this sstable, so it is probably best not to complicate the interface and keep the decision making in the caller.


objstorage/provider.go line 24 at r1 (raw file):

	st Settings

	// TODO(radu): to we support shared storage as well, this object will need to

should this say: when we support ...


objstorage/readahead.go line 17 at r1 (raw file):

	maxReadaheadSize     = 256 << 10 /* 256KB */
)

I ignored this file assuming this is code movement. Is that a fair assumption?


objstorage/vfs_readable.go line 160 at r1 (raw file):

	size int64

	rh NoopReadaheadHandle

Isn't our current code able to do readahead (not the OS-level one) without having a file descriptor?
We now seem to be calling newGenericFileReadable(file) if there isn't a file descriptor.

@RaduBerinde
Copy link
Member Author

objstorage/vfs_readable.go line 160 at r1 (raw file):

Previously, sumeerbhola wrote…

Isn't our current code able to do readahead (not the OS-level one) without having a file descriptor?
We now seem to be calling newGenericFileReadable(file) if there isn't a file descriptor.

Not as far as I can tell - the only thing the (non-OS-level) readahead code does is call Prefetch which requires a descriptor.

Copy link
Member Author

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 5 of 36 files reviewed, 4 unresolved discussions (waiting on @itsbilal and @sumeerbhola)


-- commits line 18 at r1:

Previously, sumeerbhola wrote…

We can just expect the caller to specify whether it wants it created locally or shareable.
The input to that decision can include what key span the caller expects to write to this sstable, so it is probably best not to complicate the interface and keep the decision making in the caller.

Done.


objstorage/provider.go line 24 at r1 (raw file):

Previously, sumeerbhola wrote…

should this say: when we support ...

Done.


objstorage/readahead.go line 17 at r1 (raw file):

Previously, sumeerbhola wrote…

I ignored this file assuming this is code movement. Is that a fair assumption?

Yes.

Copy link
Member

@itsbilal itsbilal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this is mostly in great shape! Just some minor comments.

Reviewed 29 of 36 files at r1, 2 of 2 files at r2, 1 of 1 files at r3, all commit messages.
Reviewable status: all files reviewed, 6 unresolved discussions (waiting on @RaduBerinde and @sumeerbhola)


table_cache.go line 927 at r3 (raw file):

func (v *tableCacheValue) load(meta *fileMetadata, c *tableCacheShard, dbOpts *tableCacheOpts) {
	// Try opening the file first.
	provider := objstorage.New(objstorage.DefaultSettings(dbOpts.fs, dbOpts.dirname))

it should be a pretty simple change to move this provider to dbOpts ?


objstorage/vfs_readable.go line 90 at r3 (raw file):

var readaheadHandlePool = sync.Pool{
	New: func() interface{} {
		i := &readaheadHandle{}

Since these will only exist in the context of singleLevelIterator, which is already pooled, thoughts on making this an exported struct, adding it to the singleLevelIterator struct and doing some mild interface-no-nos by having an InitForReadable(r Readable) instead of the current readable.NewReadaheadHandle() ? It's less clean but reduces one pool. It would necessitate more "does this file support readahead, if not just do nothing / relay the read as-is" if-statements in the methods below because this handle would have to handle all readables, and that alone is enough of a reason to prefer the current approach, so feel free to dismiss this suggestion. Just throwing the idea around since I don't have much insight into the overhead of pools.

Copy link
Member Author

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 6 unresolved discussions (waiting on @itsbilal and @sumeerbhola)


table_cache.go line 927 at r3 (raw file):

Previously, itsbilal (Bilal Akhtar) wrote…

it should be a pretty simple change to move this provider to dbOpts ?

Yes, I'm working on a follow-up change that does this (and removes fs and dirname, updating the rest of the code to use the provider).


objstorage/vfs_readable.go line 90 at r3 (raw file):

Previously, itsbilal (Bilal Akhtar) wrote…

Since these will only exist in the context of singleLevelIterator, which is already pooled, thoughts on making this an exported struct, adding it to the singleLevelIterator struct and doing some mild interface-no-nos by having an InitForReadable(r Readable) instead of the current readable.NewReadaheadHandle() ? It's less clean but reduces one pool. It would necessitate more "does this file support readahead, if not just do nothing / relay the read as-is" if-statements in the methods below because this handle would have to handle all readables, and that alone is enough of a reason to prefer the current approach, so feel free to dismiss this suggestion. Just throwing the idea around since I don't have much insight into the overhead of pools.

Pools are very fast in the steady-state - each CPU maintains its own cache of objects so the fast path is very fast. I doubt removing the pool would be noticeable here.

I did ponder whether to make the handle a struct.. In the end it feels like as we will add one more implementation for shared storage, we'd benefit from having an interface.. In any case, it's easy to change this later if we change our mind.

Copy link
Member

@itsbilal itsbilal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: though I'm not sure if @sumeerbhola has more stuff to suggest here.

Reviewed 5 of 5 files at r4, all commit messages.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @sumeerbhola)


table_cache.go line 927 at r3 (raw file):

Previously, RaduBerinde wrote…

Yes, I'm working on a follow-up change that does this (and removes fs and dirname, updating the rest of the code to use the provider).

Sounds good! Looking forward to that cleanup then.


objstorage/vfs_readable.go line 90 at r3 (raw file):

Previously, RaduBerinde wrote…

Pools are very fast in the steady-state - each CPU maintains its own cache of objects so the fast path is very fast. I doubt removing the pool would be noticeable here.

I did ponder whether to make the handle a struct.. In the end it feels like as we will add one more implementation for shared storage, we'd benefit from having an interface.. In any case, it's easy to change this later if we change our mind.

Makes sense, let's stick with this approach then

Copy link
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 24 of 36 files at r1, 2 of 2 files at r2, 1 of 1 files at r3, 5 of 5 files at r4, all commit messages.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @RaduBerinde)


sstable/reader.go line 543 at r4 (raw file):

//
// TODO(radu, sumeer): we should use a ReadaheadHandle here, separate from the
// ReadHandle for the data blocks. Especially for the compaction case, where we

nit: s/ReadHandle/ReadaheadHandle/

This PR will make this TODO trivial since we can simply create a new ReadaheadHandle via NewReadaheadHandle().

Copy link
Member Author

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTRs! I consider this ready to merge but I will hold off on the merge for now to prevent interfering with the work in #2271.

Reviewable status: 35 of 36 files reviewed, 4 unresolved discussions (waiting on @itsbilal and @sumeerbhola)


sstable/reader.go line 543 at r4 (raw file):

Previously, sumeerbhola wrote…

nit: s/ReadHandle/ReadaheadHandle/

This PR will make this TODO trivial since we can simply create a new ReadaheadHandle via NewReadaheadHandle().

Fixed.

@RaduBerinde
Copy link
Member Author

Updated. The only recent change was removing the fd from fileReadable; first commit is #2293.

itsbilal added a commit to itsbilal/pebble that referenced this pull request Feb 1, 2023
This change adds a vfs.SharedStorage interface that can be
implemented by blob storage drivers. The objstorage.Provider
coming in cockroachdb#2267 will largely be responsible for storing
state of files' locations (i.e. shared or local), and calling into
vfs.SharedStorage as necessary.

This change also adds an InstanceID uint32 to make the life of
the shared storage implementation easier in disambiguating
its own files from that of others.
This change introduces `objstorage.Provider` which replaces uses of
`vfs.FS` / `vfs.File` in sstable code.

We pull out the write-buffering and read-ahead logic into the Provider
implementation. This simplifies the sstable reader code considerably.

The provider will be extended to support shared storage, along the
following lines:
 - define a new `SharedObjectStore` interface
 - the Provider stores a `SharedObjectStore` instance
 - the Provider maintains internally a mapping from `FileNum` to
   backend type (local vs shared). Methods like `OpenForReading` use
   the mapping to talk to the correct backend.
 - `Create` needs to be augmented with extra information to choose
    where a new object is created (local vs shared)
@RaduBerinde RaduBerinde merged commit f2e950c into cockroachdb:master Feb 1, 2023
@RaduBerinde RaduBerinde deleted the objstorage-take-two branch February 1, 2023 17:23
itsbilal added a commit to itsbilal/pebble that referenced this pull request Feb 1, 2023
This change adds a vfs.SharedStorage interface that can be
implemented by blob storage drivers. The objstorage.Provider
coming in cockroachdb#2267 will largely be responsible for storing
state of files' locations (i.e. shared or local), and calling into
vfs.SharedStorage as necessary.
itsbilal added a commit to itsbilal/pebble that referenced this pull request Feb 6, 2023
This change adds a objstorage.shared.Storage interface that can be
implemented by blob storage drivers. The objstorage.Provider
coming in cockroachdb#2267 will largely be responsible for storing
state of files' locations (i.e. shared or local), and calling into
Storage as necessary.
itsbilal added a commit to itsbilal/pebble that referenced this pull request Feb 6, 2023
This change adds a objstorage.shared.Storage interface that can be
implemented by blob storage drivers. The objstorage.Provider
coming in cockroachdb#2267 will largely be responsible for storing
state of files' locations (i.e. shared or local), and calling into
Storage as necessary.
itsbilal added a commit that referenced this pull request Feb 7, 2023
This change adds a objstorage.shared.Storage interface that can be
implemented by blob storage drivers. The objstorage.Provider
coming in #2267 will largely be responsible for storing
state of files' locations (i.e. shared or local), and calling into
Storage as necessary.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

4 participants