pebble: race between MemFS.Close and MemFS.Remove #1236

Open · bananabrick opened this issue Aug 24, 2021 · 6 comments

@bananabrick
Contributor

bananabrick commented Aug 24, 2021

The tableCache on master has a release loop that closes files (https://github.com/cockroachdb/pebble/blob/master/table_cache.go#L173), but it doesn't seem like we wait for a file to be closed before removing it.

  • The MemFS keeps a reference count for each file, which is decremented on MemFS.Close (see the sketch after this list).
  • MemFS.Remove errors out if we try to remove a file with a ref count > 0.
  • So I'm seeing a race where the releaseLoop function in the table cache only attempts to close a file after MemFS.Remove has been called, which prevents the file from being deleted.
  • Moreover, it seems as if we don't really do anything with the error returned from d.opts.Cleaner.Clean in d.deleteObsoleteFile. The error gets logged here: https://github.com/cockroachdb/pebble/blob/master/compaction.go#L2872, but we don't do anything to rectify it.
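
To make the race concrete, here is a minimal, self-contained sketch of the semantics described above (toy types and names, not Pebble's actual vfs.MemFS): Close decrements a per-file reference count, and Remove refuses to delete a file whose count is still positive.

	// Toy illustration of the MemFS semantics described above (hypothetical
	// names; not Pebble's actual vfs.MemFS): Close decrements a per-file
	// reference count, and Remove fails while the count is positive.
	package main

	import (
		"errors"
		"fmt"
		"sync"
	)

	type memFS struct {
		mu   sync.Mutex
		refs map[string]int // open handles per file
	}

	func (fs *memFS) open(name string) {
		fs.mu.Lock()
		defer fs.mu.Unlock()
		fs.refs[name]++
	}

	func (fs *memFS) close(name string) {
		fs.mu.Lock()
		defer fs.mu.Unlock()
		fs.refs[name]--
	}

	func (fs *memFS) remove(name string) error {
		fs.mu.Lock()
		defer fs.mu.Unlock()
		if fs.refs[name] > 0 {
			return errors.New("open file cannot be removed")
		}
		delete(fs.refs, name)
		return nil
	}

	func main() {
		fs := &memFS{refs: map[string]int{}}
		fs.open("000001.sst")
		// The race: Remove runs before the release loop has closed the file.
		fmt.Println(fs.remove("000001.sst")) // open file cannot be removed
		fs.close("000001.sst")
		fmt.Println(fs.remove("000001.sst")) // <nil>: succeeds once closed
	}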

Jira issue: PEBBLE-220

@petermattis
Collaborator

Where is MemFS.Remove being called on a still-open sstable? That is definitely a bug if it is occurring, but I'm not seeing the path that causes it from your description. doDeleteObsoleteFiles() calls tableCache.evict(), which calls tableCacheShard.evict(), which synchronously calls tableCacheValue.release(), which calls sstable.Reader.Close(), which calls File.Close(). There is a comment in tableCacheShard.evict() that the close needs to happen synchronously:

		// NB: This is equivalent to tableCacheShard.releaseNode(), but we perform
		// the tableCacheNode.release() call synchronously below to ensure the
		// sstable file descriptor is closed before returning. Note that
		// tableCacheShard.releasing needs to be incremented while holding
		// tableCacheShard.mu in order to avoid a race with Close()
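
For illustration, a simplified, self-contained sketch of that synchronous path (toy types; not the actual tableCacheShard code): the node is unlinked under the shard mutex and released before evict returns, so the file is closed by the time the caller goes on to remove it.

	// Simplified sketch of the synchronous eviction path described above
	// (toy types; not Pebble's actual tableCacheShard).
	package main

	import (
		"fmt"
		"sync"
	)

	type value struct {
		closeFile func() // closes the underlying sstable file
	}

	func (v *value) release() { v.closeFile() }

	type shard struct {
		mu    sync.Mutex
		nodes map[uint64]*value
	}

	func (s *shard) evict(fileNum uint64) {
		s.mu.Lock()
		v := s.nodes[fileNum]
		delete(s.nodes, fileNum)
		s.mu.Unlock()

		if v != nil {
			// Synchronous: File.Close runs before evict returns, so a
			// subsequent MemFS.Remove sees the file already closed.
			v.release()
		}
	}

	func main() {
		s := &shard{nodes: map[uint64]*value{
			1: {closeFile: func() { fmt.Println("file closed") }},
		}}
		s.evict(1)
		fmt.Println("safe to remove the sstable now")
	}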

	Moreover, it seems as if we don't really do anything with the error returned from d.opts.Cleaner.Clean in d.deleteObsoleteFile. The error gets logged here: https://github.com/cockroachdb/pebble/blob/master/compaction.go#L2872, but we don't do anything to rectify it.

Yeah, there is a TODO about this:

	// TODO(peter): need to handle this error, probably by re-adding the
	// file that couldn't be deleted to one of the obsolete slices map.
	err := d.opts.Cleaner.Clean(d.opts.FS, fileType, path)
	if oserror.IsNotExist(err) {
		return
	}

The suggestion in the TODO seems reasonable. The thinking on why this wasn't urgently addressed is that on startup we'll capture any obsolete file that wasn't previously deleted. Still, it would be worthwhile to address the TODO.
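
A self-contained sketch of the retry approach the TODO suggests (toy types and names; not Pebble's actual DB or Cleaner): when a delete fails because the file is still open, push the path back onto the obsolete queue so the next cleanup pass tries again.

	// Sketch of the TODO's suggestion (toy types; not Pebble's actual
	// deleteObsoleteFile): a failed delete is re-queued instead of being
	// logged and forgotten.
	package main

	import (
		"errors"
		"fmt"
	)

	var errStillOpen = errors.New("open file cannot be removed")

	type cleaner struct {
		stillOpen map[string]bool // files the table cache hasn't closed yet
		obsolete  []string        // queue of files awaiting deletion
	}

	func (c *cleaner) clean(path string) error {
		if c.stillOpen[path] {
			return errStillOpen
		}
		fmt.Println("deleted", path)
		return nil
	}

	func (c *cleaner) deleteObsoleteFiles() {
		pending := c.obsolete
		c.obsolete = nil
		for _, path := range pending {
			if err := c.clean(path); err != nil {
				// Re-queue so the next pass retries after the table
				// cache has closed the file.
				c.obsolete = append(c.obsolete, path)
			}
		}
	}

	func main() {
		c := &cleaner{
			stillOpen: map[string]bool{"000001.sst": true},
			obsolete:  []string{"000001.sst", "000002.sst"},
		}
		c.deleteObsoleteFiles()
		fmt.Println("re-queued:", c.obsolete) // [000001.sst]

		delete(c.stillOpen, "000001.sst") // the release loop closes it
		c.deleteObsoleteFiles()
		fmt.Println("re-queued:", c.obsolete) // []
	}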

@bananabrick
Contributor Author

bananabrick commented Aug 25, 2021

@petermattis

In tableCacheShard.evict, we only call v.release if v != nil. Unfortunately, in the code path where we call tableCacheShard.addNode -> tableCacheShard.evictNodes -> tableCacheShard.runHandCold -> tableCacheShard.clearNode -> tableCacheShard.unrefValue, we end up adding the value to the releasingCh and setting the node's value to nil.

When the file is removed later through a call to doDeleteObsoleteFiles, the releaseLoop hasn't necessarily closed the file yet.

The reason the evict call doesn't end up calling release on the value is that the value has already been set to nil.
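
Here is a toy reproduction of that ordering (hypothetical names; not the actual table cache code): clearNode hands the value to a background releasing channel and nils out the node, so the later evict finds nothing to release synchronously, and Remove can run before the background loop has closed the file.

	// Toy reproduction of the race described above (hypothetical names;
	// not Pebble's actual table cache).
	package main

	import (
		"fmt"
		"time"
	)

	func main() {
		releasingCh := make(chan string, 1)
		closed := make(chan struct{})

		// Stand-in for tableCacheShard.releaseLoop: closes files
		// asynchronously, some time after they are enqueued.
		go func() {
			for name := range releasingCh {
				time.Sleep(10 * time.Millisecond)
				fmt.Println("releaseLoop: closed", name)
				close(closed)
			}
		}()

		// clearNode/unrefValue: enqueue the value and nil the node's
		// reference, so evict has nothing left to release synchronously.
		name := "000001.sst"
		value := &name
		releasingCh <- *value
		value = nil

		// evict: v == nil, so the synchronous close path is skipped.
		if value == nil {
			fmt.Println("evict: value already cleared; nothing to close")
		}

		// doDeleteObsoleteFiles -> MemFS.Remove may run right now, while
		// the file is still open; with Windows semantics that errors.
		select {
		case <-closed:
			fmt.Println("Remove: file already closed, delete succeeds")
		default:
			fmt.Println("Remove: file still open, delete fails")
		}
		<-closed // observe the eventual close before exiting
	}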

@petermattis
Collaborator

@bananabrick Ah, that's a great find. I'm not sure what to do here. We may need a secondary map of evicted sstables that we're waiting to close. Even worse, if you evict, re-open, and then evict again, there may be multiple file handles open for the same sstable simultaneously. That's not really a problem, except that we need to wait for all of them to close. Note that the MemFS behavior is meant to emulate the Windows behavior; on Unix it isn't a problem to delete a file that is still open.
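
A rough sketch of that idea (hypothetical; not an actual Pebble design): keep a count of not-yet-closed handles per evicted sstable, and have the eviction path wait until the count drains to zero before removing the file. Since the same sstable can be evicted, re-opened, and evicted again, the count can exceed one, so the wait must cover all outstanding handles.

	// Sketch of a secondary "pending closes" map (hypothetical design;
	// not actual Pebble code).
	package main

	import (
		"fmt"
		"sync"
	)

	type pendingCloses struct {
		mu      sync.Mutex
		cond    *sync.Cond
		pending map[uint64]int // fileNum -> handles not yet closed
	}

	func newPendingCloses() *pendingCloses {
		p := &pendingCloses{pending: map[uint64]int{}}
		p.cond = sync.NewCond(&p.mu)
		return p
	}

	// add records a handle that has been queued for asynchronous close.
	func (p *pendingCloses) add(fileNum uint64) {
		p.mu.Lock()
		p.pending[fileNum]++
		p.mu.Unlock()
	}

	// done is called by the release loop after a handle actually closes.
	func (p *pendingCloses) done(fileNum uint64) {
		p.mu.Lock()
		if p.pending[fileNum]--; p.pending[fileNum] == 0 {
			delete(p.pending, fileNum)
			p.cond.Broadcast()
		}
		p.mu.Unlock()
	}

	// wait blocks until every outstanding handle for fileNum has closed,
	// making it safe to Remove the file even under Windows semantics.
	func (p *pendingCloses) wait(fileNum uint64) {
		p.mu.Lock()
		for p.pending[fileNum] > 0 {
			p.cond.Wait()
		}
		p.mu.Unlock()
	}

	func main() {
		p := newPendingCloses()
		p.add(1) // first evict queues a handle
		p.add(1) // the sstable was re-opened and evicted again

		go func() {
			p.done(1)
			p.done(1) // release loop closes both handles
		}()

		p.wait(1)
		fmt.Println("all handles closed; safe to remove file 000001.sst")
	}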

@bananabrick
Contributor Author

It seems like Windows support isn't a priority, but we do have some tests that seem to rely on us closing files. I've run into similar issues while trying to write more tests for the shared table cache.

I can look into this more carefully next week.

@bananabrick bananabrick self-assigned this Aug 25, 2021
@petermattis
Collaborator

Windows support isn't a huge priority, but Pebble does make an attempt at it. This bug is currently the only one I'm aware of with regard to Windows.

jbowens added a commit to jbowens/pebble that referenced this issue Oct 27, 2022
Default to Unix semantics in MemFS. The Windows semantics are left as
configurable for unit tests that seek to ensure we don't remove open
files, when possible.

There are existing cases where we cannot satisfy Windows semantics, and
Go does not support opening files with the appropriate
`FILE_SHARE_DELETE` permission bit (see golang/go#34681,
golang/go#32088). The MemFS's implementation of Windows semantics is
useful for ensuring we don't regress in the cases where we can satisfy
Windows semantics.

Close cockroachdb#2064.
Informs cockroachdb#1236.
jbowens added a commit that referenced this issue Oct 27, 2022
jbowens added a commit to jbowens/pebble that referenced this issue Oct 16, 2023
jbowens added a commit that referenced this issue Oct 17, 2023

All three carried the same commit message as the Oct 27, 2022 commit above.

We have marked this issue as stale because it has been inactive for
18 months. If this issue is still relevant, removing the stale label
or adding a comment will keep it active. Otherwise, we'll close it
in 10 days to keep the issue queue tidy. Thank you for your
contribution to Pebble!
