-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
storage/pebbleiter: guard against double-closed Iterators #90859
Conversation
c572aca
to
456dd3f
Compare
642dce5
to
4485497
Compare
4485497
to
6102977
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the future, would it also make sense to run some of our roachtests with these new assertions enabled?
Reviewed 14 of 14 files at r1, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @jbowens)
pkg/storage/pebbleiter/BUILD.bazel
line 7 at r1 (raw file):
# gazelle:exclude gen-crdb_test_on.go # keep
Is this a necessary build directive? Same for the keep
below in the go_test
target.
pkg/storage/pebbleiter/BUILD.bazel
line 19 at r1 (raw file):
) REMOVE_GO_BUILD_CONSTRAINTS = "cat $< | grep -v '//go:build' | grep -v '// +build' > $@"
I don't understand the build system enough to be able to tell at a glance why this is needed. Worth a comment? Or is this common enough knowledge for people that speak Bazel?
pkg/storage/pebbleiter/crdb_test_off.go
line 22 at r1 (raw file):
// MaybeWrap returns the provided Iterator, unmodified. func MaybeWrap(iter *pebble.Iterator) Iterator {
Do these inline? If they don't, do we care about the overhead?
pkg/storage/pebbleiter/crdb_test_on.go
line 31 at r1 (raw file):
// AssertionIter wraps a *pebble.Iterator with assertion checking. type AssertionIter struct {
nit: does this need to be exported?
6102977
to
7e098bf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 4 of 14 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @jbowens and @nicktrav)
pkg/storage/pebbleiter/crdb_test_off.go
line 22 at r1 (raw file):
Previously, nicktrav (Nick Travers) wrote…
Do these inline? If they don't, do we care about the overhead?
worth adding // gcassert:inline
7e098bf
to
7df8fba
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @nicktrav, @rickystewart, and @sumeerbhola)
pkg/storage/pebbleiter/BUILD.bazel
line 7 at r1 (raw file):
Previously, nicktrav (Nick Travers) wrote…
Is this a necessary build directive? Same for the
keep
below in thego_test
target.
I believe so, but I copied this from the buildutil
package.
pkg/storage/pebbleiter/BUILD.bazel
line 19 at r1 (raw file):
Previously, nicktrav (Nick Travers) wrote…
I don't understand the build system enough to be able to tell at a glance why this is needed. Worth a comment? Or is this common enough knowledge for people that speak Bazel?
I couldn't tell you what they do either, but I lifted them from the only other build-tag-constrained file. Maybe @rickystewart knows.
pkg/storage/pebbleiter/crdb_test_off.go
line 22 at r1 (raw file):
Previously, sumeerbhola wrote…
worth adding
// gcassert:inline
The production one does, and done.
pkg/storage/pebbleiter/crdb_test_on.go
line 31 at r1 (raw file):
Previously, nicktrav (Nick Travers) wrote…
nit: does this need to be exported?
unexported
7df8fba
to
e4200a9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r2, 7 of 7 files at r3, 1 of 1 files at r4, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @rickystewart)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks fine to me, the comment is more food for thought than a suggestion you need to take.
// by the Apache License, Version 2.0, included in the file | ||
// licenses/APL.txt. | ||
|
||
//go:build !crdb_test || crdb_test_off |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's worth considering whether eschewing the conditional compilation and just sticking with a straightforward wrapping design might be fine. For example, something like:
import "github.com/cockroachdb/cockroach/pkg/util/buildutil"
type Iterator struct {
iter *pebble.Iterator
closed bool
}
func Wrap(iter *pebble.Iterator) Iterator {
return Iterator { iter: iter }
}
func (i *Iterator) Clone(cloneOpts pebble.CloneOptions) (*Iterator, error) { ... }
func (i *Iterator) Close() error {
if buildutil.CrdbTestBuild {
if i.closed {
panic("pebble.Iterator already closed")
}
err := i.iter.Close()
i.closed = true
return err
} else {
return i.iter.Close()
}
}
I imagine your thinking is that allocating the extra struct is not a cost we should pay if we're not running a test build, and maybe you're right. However I wonder if the extra cost (another little struct and maybe one additional indirection for Clone() and Close()) is not minimal enough that we can just eat it. I don't know enough about the implementation to say so -- maybe these iterators are opened and closed in a really hot loop and we can't afford taking any extra time.
Another thing to consider: if we just decide to keep the wrapping in non-test builds, then we could have an extra guardrail, right?
func (i *Iterator) Close() error {
if i.closed {
if buildutil.CrdbTestBuild {
panic(...)
} else {
return fmt.Errorf("iterator has already been closed") // can attach extra data to the error here too
}
}
err := i.iter.Close()
i.closed = true
return err
}
We can avoid panicking immediately in this case but presumably throwing an error if we're doing something that's definitely illegal is better than just continuing on to likely corrupt data?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re: the last bit -- if we're going to keep track of whether each individual iterator is closed to warn about double-closes, I also wonder whether that logic makes more sense as a part of the pebble.Iterator
itself. That struct could carry around the closed bool
and give a sensible error if the method is called more than once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However I wonder if the extra cost (another little struct and maybe one additional indirection for Clone() and Close()) is not minimal enough that we can just eat it. I don't know enough about the implementation to say so -- maybe these iterators are opened and closed in a really hot loop and we can't afford taking any extra time.
This layer is particularly performance sensitive and any additional allocations would be prohibitive.
Re: the last bit -- if we're going to keep track of whether each individual iterator is closed to warn about double-closes, I also wonder whether that logic makes more sense as a part of the pebble.Iterator itself. That struct could carry around the closed bool and give a sensible error if the method is called more than once.
Since both storage.pebbleIterator
and pebble.Iterator
are pooled to reduce allocations, the iterators structs may already be in use again by the time Close
is called a second time. We can't distinguish between a Close
issued by the new user, versus a double-Close
by the old user.
It does seem like it'd be feasible to maintain monotonically increasing counts of opens
and closes
and assert that closes
never exceeds opens
, but the ability to make crdb_test
-gated assertions of pkg/storage
's use of pebble.Iterator
is generally useful. We could move some of those assertions into pebble.Iterator
, but they still need to be gated on crdb_test
.
@rickystewart would it be feasible to add the Pebble invariants
build tag when the Cockroach crdb_test
build tag is set? That could allow us to take advantage of the existing Pebble invariant assertions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(As mentioned, don't let this comment block you, this is fine to be checked-in as is.)
@rickystewart would it be feasible to add the Pebble invariants build tag when the Cockroach crdb_test build tag is set? That could allow us to take advantage of the existing Pebble invariant assertions.
It is possible, but it might result in a lot of stuff being re-compiled for developers when swapping between builds and tests if they happen to be using pebble
. You can file a follow-up issue to discuss this if you like.
e4200a9
to
cfef6c4
Compare
Replace direct uses of the `*pebble.Iterator` type with a `pebbleiter.Iterator` type definition. In production builds, this type is a simple type definition for `*pebble.Iterator` which suffers no overhead. In `crdb_test` builds, the `pebbleiter.Iterator` is defined as a `*pebbleiter.AssertionIter` type that can assert proper use of the `*pebble.Iterator` type and its contract. Initially, use this new assertion iterator to ensure that a `*pebble.Iterator` is never double-closed, which may cause the Iterator to be added to a sync pool twice leading to races. This type of assertion cannot be done by `*pebble.Iterator`, because the iterator may already be reused by the time the second Close is run. In the future, I'd like to also introduce assertions that pkg/storage always checks validity before calling HasPointAndRange and RangeKeyChanged (see cockroachdb/pebble#2205). Release note: None Epic: None Close cockroachdb#90643.
cfef6c4
to
96c8a75
Compare
Backport cockroachdb#90859, cockroachdb#96685 and cockroachdb#97222 to mangle the buffers returned by Pebble iterators to ensure MVCC and Cockroach code respects the Pebble iterator memory lifetimes. The RangeBounds() and RangeKeys() validity assertions are omitted, since 22.2 allowed these methods to be invoked on an invalid iterator or an iterator positioned at a point key. Release note: None Release justification: Non-production code changes
Backport cockroachdb#90859, cockroachdb#96685 and cockroachdb#97222 to mangle the buffers returned by Pebble iterators to ensure MVCC and Cockroach code respects the Pebble iterator memory lifetimes. The RangeBounds() and RangeKeys() validity assertions are omitted, since 22.2 allowed these methods to be invoked on an invalid iterator or an iterator positioned at a point key. Release note: None Release justification: Non-production code changes
Replace direct uses of the
*pebble.Iterator
type with apebbleiter.Iterator
type definition. In production builds, this type is a simple type definition for
*pebble.Iterator
which suffers no overhead. Incrdb_test
builds, thepebbleiter.Iterator
is defined as a*pebbleiter.AssertionIter
type that canassert proper use of the
*pebble.Iterator
type and its contract.Initially, use this new assertion iterator to ensure that a
*pebble.Iterator
is never double-closed, which may cause the Iterator to be added to a sync pool
twice leading to races. This type of assertion cannot be done by
*pebble.Iterator
, because the iterator may already be reused by the time thesecond Close is run.
In the future, I'd like to also introduce assertions that pkg/storage always
checks validity before calling HasPointAndRange and RangeKeyChanged (see
cockroachdb/pebble#2205).
Release note: None
Epic: None
Close #90643.