-
Notifications
You must be signed in to change notification settings - Fork 466
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
vfs: add SyncData,SyncTo,Preallocate to vfs.File #2271
Conversation
286ad40
to
bce3671
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 24 files reviewed, 1 unresolved discussion (waiting on @jbowens)
vfs/vfs.go
line 34 at r2 (raw file):
Attributes() FileAttributes
Is there an assumption being made here that all the real implementation of these IO causing methods is provided by the baseline File that can then be wrapped as many times as needed. That is, a wrapper can't implement something like Preallocate
itself by writing to the file, or do some additional fiddling with metadata (imagining a world where encryptedFile
occasionally needs to update some metadata). Without this assumption would we be back in a situation where the ordering of wrappers matters? Or is the only requirement that everything that does IO needs to be explicitly listed in the interface. I suspect it is the latter:
- if the
diskHealthCheckingFile
wrapper is above something in this interface that does additional IO, it will capture the latency of this compound operation. - if the
diskHealthCheckingFile
wrapper is below that something that does a compound operation, since those will be expressed in terms of the simpler operations listed here, they will get individually timed.
Is my understanding correct?
And do we need to say that if some implementation of File
spins up some asynchronous background activity then we do need that that wrapper use a File
that has already been wrapped with diskHealthCheckingFile
?
I suppose even with this constraint, the fact that all we need is a partial order over the wrappings is an improvement over requiring that syncingFile
directly wrap diskHealthCheckingFile
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 24 files reviewed, 1 unresolved discussion (waiting on @sumeerbhola)
vfs/vfs.go
line 34 at r2 (raw file):
Or is the only requirement that everything that does IO needs to be explicitly listed in the interface. I suspect it is the latter
Yeah, that matches my understanding.
And do we need to say that if some implementation of File spins up some asynchronous background activity then we do need that that wrapper use a File that has already been wrapped with diskHealthCheckingFile?
Yeah, I think that's right. I'd love to protect against that somehow, but it seems really tricky. In crdb_test/invariant builds, we could add a VFS
at the top and bottom of the stack. All files created at the bottom of the stack should also be recorded at the top of the stack.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 24 files reviewed, 4 unresolved discussions (waiting on @jbowens and @sumeerbhola)
-- commits
line 2 at r2:
Can you include some context (what do we want to achieve)?
vfs/default_linux.go
line 36 at r2 (raw file):
// Assert that linuxFile and linuxDir implement vfs.File. var ( _ File = linuxDir{}
It's almost always a bad idea to have an interface be implemented by a bare (non-empty) struct (as opposed to a pointer to the struct). In general, Go needs to make a copy of the object each time it makes a call to a non-pointer-receiver method through the interface. Setting an interface value to an instance of the struct always incurs an allocation; it can lead to subtle cases where you wouldn't normally allocate, likeif dir, ok := f.(linuxDir); ok { some_func_with_file_argument(dir) }
By the way, if you really want to avoid an indirection (doesn't seem important here), you can do type linuxDir os.File
and manually implement the vfs.File wrappers.
vfs/vfs.go
line 33 at r2 (raw file):
io.Writer Attributes() FileAttributes
This needs some documentation. When is it acceptable to return empty attributes? (as some implementations here do)
c355abb
to
05b9229
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 24 files reviewed, 2 unresolved discussions (waiting on @sumeerbhola)
Previously, RaduBerinde wrote…
Can you include some context (what do we want to achieve)?
Done.
vfs/default_linux.go
line 36 at r2 (raw file):
Previously, RaduBerinde wrote…
It's almost always a bad idea to have an interface be implemented by a bare (non-empty) struct (as opposed to a pointer to the struct). In general, Go needs to make a copy of the object each time it makes a call to a non-pointer-receiver method through the interface. Setting an interface value to an instance of the struct always incurs an allocation; it can lead to subtle cases where you wouldn't normally allocate, like
if dir, ok := f.(linuxDir); ok { some_func_with_file_argument(dir) }
By the way, if you really want to avoid an indirection (doesn't seem important here), you can do
type linuxDir os.File
and manually implement the vfs.File wrappers.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Speaking to the brittleness of the previous approach, #2255 which merged today also would've broken disk-stall detection if it weren't already broken, because it changed the signature of timeDiskOp
which the syncing file had a dependency on.
Reviewable status: 0 of 24 files reviewed, 2 unresolved discussions (waiting on @sumeerbhola)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you've already started reviewing this PR, please feel free to complete it here. Otherwise, please start on the 22.2 backport: #2276. I'll be focusing on updating the 22.2 backport right now, and will incorporate comments from both PRs on that branch.
Reviewable status: 0 of 24 files reviewed, 2 unresolved discussions (waiting on @sumeerbhola)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 24 files reviewed, 9 unresolved discussions (waiting on @jbowens and @sumeerbhola)
internal/errorfs/errorfs.go
line 335 at r5 (raw file):
func (f *errorFile) Attributes() vfs.FileAttributes { return vfs.FileAttributes{}
Why not return file.Attributes()
? If the underlying file supports SyncTo
, we do too.
vfs/default_linux.go
line 1 at r5 (raw file):
// Copyright 2019 The LevelDB-Go and Pebble Authors. All rights reserved. Use
[nit] 2023
vfs/default_unix.go
line 1 at r5 (raw file):
// Copyright 2014 The LevelDB-Go and Pebble Authors. All rights reserved. Use
[nit] 2023
vfs/default_windows.go
line 1 at r5 (raw file):
// Copyright 2014 The LevelDB-Go and Pebble Authors. All rights reserved. Use
[nit] 2023
vfs/syncing_file.go
line 168 at r5 (raw file):
if f.fileAttrs.CanSyncTo { // Even if NoSyncOnClose is set, since the file supports CanSyncTo
When and why NoSyncOnClose
is used? I can't find any production code (in pebble or cockroach) which sets it.
It's strange that we still want to SyncTo
when that flag is set. If this wasn't the case, it would be cleaner to change the semantics of SyncTo()
: it syncs up to the offset, even if it has to internally do a "full" Sync()
to achieve that (similar to SyncData
). By the way the linux implementation here already does this. Then we don't need to check the capability, we can just call SyncTo
in all cases (we could probably remove Attributes()
altogether)
vfs/vfs.go
line 33 at r2 (raw file):
Previously, RaduBerinde wrote…
This needs some documentation. When is it acceptable to return empty attributes? (as some implementations here do)
File attributes has an existing meaning (permissions, size, date, etc). How about using Capabilities
?
vfs/vfs.go
line 54 at r5 (raw file):
// SyncTo may perform data syncing asynchronously in the background. See // vfs.NewSyncingFile for more information on its intended use. SyncTo(length int64) error
The underlying syscall supports an arbitrary range, I don't think we should fix the start offset at 0 in the interface. (maybe it would even be a tiny bit faster to specify an offset that was already synced?)
vfs/vfs.go
line 204 at r5 (raw file):
for oserror.IsExist(err) { if removeErr := os.Remove(name); removeErr != nil && !oserror.IsNotExist(removeErr) { return wrapOSFile(osFile), errors.WithStack(removeErr)
This is an error case (and can be below as well). How is wrapOSFile
supposed to handle a nil
argument? Currently it returns a valid File
that would panic on most calls. I guess that's more of less what happened before as well.
In any case, we should document the nil handling for wrapOSFile
(consider making wrapOSFile
a non-OS dependent wrapper and documenting it there, and have it call out to a wrapOSFileImpl()
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 24 files reviewed, 5 unresolved discussions (waiting on @RaduBerinde and @sumeerbhola)
internal/errorfs/errorfs.go
line 335 at r5 (raw file):
Previously, RaduBerinde wrote…
Why not return
file.Attributes()
? If the underlying file supportsSyncTo
, we do too.
Good catch, this was just oversight.
vfs/syncing_file.go
line 168 at r5 (raw file):
Previously, RaduBerinde wrote…
When and why
NoSyncOnClose
is used? I can't find any production code (in pebble or cockroach) which sets it.It's strange that we still want to
SyncTo
when that flag is set. If this wasn't the case, it would be cleaner to change the semantics ofSyncTo()
: it syncs up to the offset, even if it has to internally do a "full"Sync()
to achieve that (similar toSyncData
). By the way the linux implementation here already does this. Then we don't need to check the capability, we can just callSyncTo
in all cases (we could probably removeAttributes()
altogether)
It's never used, but it probably should be used in temp engines. Here's the context: #873.
Agreed, the interface would be much simpler without it. I was tempted to drop support for it altogether given that it's currently unused and the present crunch.
Regarding the strangeness of SyncTo
-ing when it's set, the goal of the setting is to avoid any synchronous syncing of dirty blocks. In use cases like the temp engines where we don't want to wait because persistence doesn't matter, we still want the latency-smoothing effect of requesting a sync of the tail.
vfs/vfs.go
line 33 at r2 (raw file):
Previously, RaduBerinde wrote…
File attributes has an existing meaning (permissions, size, date, etc). How about using
Capabilities
?
Yeah, this was unnecessarily general. Done.
vfs/vfs.go
line 54 at r5 (raw file):
Previously, RaduBerinde wrote…
The underlying syscall supports an arbitrary range, I don't think we should fix the start offset at 0 in the interface. (maybe it would even be a tiny bit faster to specify an offset that was already synced?)
I'd like to keep the changes as light as we can to resolve the disk stall issue (which is admittedly still not very light). We can revisit this afterwards. I'm not sure why the syncing file only ever issued sync_file_range
calls beginning at the beginning of the file. It does seem a little peculiar.
vfs/vfs.go
line 204 at r5 (raw file):
Previously, RaduBerinde wrote…
This is an error case (and can be below as well). How is
wrapOSFile
supposed to handle anil
argument? Currently it returns a validFile
that would panic on most calls. I guess that's more of less what happened before as well.In any case, we should document the nil handling for
wrapOSFile
(consider makingwrapOSFile
a non-OS dependent wrapper and documenting it there, and have it call out to awrapOSFileImpl()
)
Done.
Previously, jbowens (Jackson Owens) wrote…
I think we could still make SyncTo call Sync if it doesn't support it, and call SyncTo unconditionally here. We would still get the behavior we want in practice (on linux). |
a6efe4b
to
1dd543c
Compare
Expand the vfs.File interface to expose SyncData, SyncTo and Preallocate as first-class methods. Previously, the file created through vfs.NewSyncingFile would perform analagous I/O operations (eg, `Fdatasync`, `sync_file_range`, `Fallocate`) outside the context of the interface. This easily allowed the accidental loss of the disk-health checking over these operations by minor tweaks to the interface of the disk-health checking implementation or by adding intermediary VFS wrappers between the disk-health checking FS and the syncing file. See cockroachdb/cockroach#94373 where the introduction of an additional VFS wrapper resulted in these I/O operations being uncovered by disk-stall detection.
Incorporated the remaining feedback from #2276, so this is good for another look. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 24 files reviewed, 3 unresolved discussions (waiting on @jbowens and @sumeerbhola)
TFTR! |
Forward-port of #2276.
Expand the vfs.File interface to expose SyncData, SyncTo and Preallocate as
first-class methods. Previously, the file created through vfs.NewSyncingFile
would perform analagous I/O operations (eg,
Fdatasync
,sync_file_range
,Fallocate
) outside the context of the interface. This easily allowed theaccidental loss of the disk-health checking over these operations by minor
tweaks to the interface of the disk-health checking implementation or by adding
intermediary VFS wrappers between the disk-health checking FS and the syncing
file.
See cockroachdb/cockroach#94373 where the introduction of an additional VFS
wrapper resulted in these I/O operations being uncovered by disk-stall
detection.