storage/pebbleiter: guard against double-closed Iterators #90859

jbowens · 2022-10-28T18:19:33Z

Replace direct uses of the *pebble.Iterator type with a pebbleiter.Iterator
type definition. In production builds, this type is a simple type definition for
*pebble.Iterator which suffers no overhead. In crdb_test builds, the
pebbleiter.Iterator is defined as a *pebbleiter.AssertionIter type that can
assert proper use of the *pebble.Iterator type and its contract.

Initially, use this new assertion iterator to ensure that a *pebble.Iterator
is never double-closed, which may cause the Iterator to be added to a sync pool
twice leading to races. This type of assertion cannot be done by
*pebble.Iterator, because the iterator may already be reused by the time the
second Close is run.

In the future, I'd like to also introduce assertions that pkg/storage always
checks validity before calling HasPointAndRange and RangeKeyChanged (see
cockroachdb/pebble#2205).

Release note: None
Epic: None

Close #90643.

cockroach-teamcity · 2022-10-28T18:19:42Z

This change is

nicktrav

Nice!

In the future, would it also make sense to run some of our roachtests with these new assertions enabled?

Reviewed 14 of 14 files at r1, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @jbowens)

pkg/storage/pebbleiter/BUILD.bazel line 7 at r1 (raw file):

# gazelle:exclude gen-crdb_test_on.go

# keep

Is this a necessary build directive? Same for the keep below in the go_test target.

pkg/storage/pebbleiter/BUILD.bazel line 19 at r1 (raw file):

)

REMOVE_GO_BUILD_CONSTRAINTS = "cat $< | grep -v '//go:build' | grep -v '// +build' > $@"

I don't understand the build system enough to be able to tell at a glance why this is needed. Worth a comment? Or is this common enough knowledge for people that speak Bazel?

pkg/storage/pebbleiter/crdb_test_off.go line 22 at r1 (raw file):

// MaybeWrap returns the provided Iterator, unmodified.
func MaybeWrap(iter *pebble.Iterator) Iterator {

Do these inline? If they don't, do we care about the overhead?

pkg/storage/pebbleiter/crdb_test_on.go line 31 at r1 (raw file):

// AssertionIter wraps a *pebble.Iterator with assertion checking.
type AssertionIter struct {

nit: does this need to be exported?

sumeerbhola

Reviewed 4 of 14 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @jbowens and @nicktrav)

pkg/storage/pebbleiter/crdb_test_off.go line 22 at r1 (raw file):

Previously, nicktrav (Nick Travers) wrote…

Do these inline? If they don't, do we care about the overhead?

worth adding // gcassert:inline

jbowens

Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @nicktrav, @rickystewart, and @sumeerbhola)

pkg/storage/pebbleiter/BUILD.bazel line 7 at r1 (raw file):

Previously, nicktrav (Nick Travers) wrote…

Is this a necessary build directive? Same for the keep below in the go_test target.

I believe so, but I copied this from the buildutil package.

pkg/storage/pebbleiter/BUILD.bazel line 19 at r1 (raw file):

Previously, nicktrav (Nick Travers) wrote…

I don't understand the build system enough to be able to tell at a glance why this is needed. Worth a comment? Or is this common enough knowledge for people that speak Bazel?

I couldn't tell you what they do either, but I lifted them from the only other build-tag-constrained file. Maybe @rickystewart knows.

pkg/storage/pebbleiter/crdb_test_off.go line 22 at r1 (raw file):

Previously, sumeerbhola wrote…

worth adding // gcassert:inline

The production one does, and done.

pkg/storage/pebbleiter/crdb_test_on.go line 31 at r1 (raw file):

Previously, nicktrav (Nick Travers) wrote…

nit: does this need to be exported?

unexported

nicktrav

Reviewed 1 of 1 files at r2, 7 of 7 files at r3, 1 of 1 files at r4, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @rickystewart)

rickystewart

Looks fine to me, the comment is more food for thought than a suggestion you need to take.

rickystewart · 2023-01-09T19:10:07Z

pkg/storage/pebbleiter/crdb_test_off.go

+// by the Apache License, Version 2.0, included in the file
+// licenses/APL.txt.
+
+//go:build !crdb_test || crdb_test_off


I think it's worth considering whether eschewing the conditional compilation and just sticking with a straightforward wrapping design might be fine. For example, something like:

import "github.com/cockroachdb/cockroach/pkg/util/buildutil" type Iterator struct { iter *pebble.Iterator closed bool } func Wrap(iter *pebble.Iterator) Iterator { return Iterator { iter: iter } } func (i *Iterator) Clone(cloneOpts pebble.CloneOptions) (*Iterator, error) { ... } func (i *Iterator) Close() error { if buildutil.CrdbTestBuild { if i.closed { panic("pebble.Iterator already closed") } err := i.iter.Close() i.closed = true return err } else { return i.iter.Close() } }

I imagine your thinking is that allocating the extra struct is not a cost we should pay if we're not running a test build, and maybe you're right. However I wonder if the extra cost (another little struct and maybe one additional indirection for Clone() and Close()) is not minimal enough that we can just eat it. I don't know enough about the implementation to say so -- maybe these iterators are opened and closed in a really hot loop and we can't afford taking any extra time.

Another thing to consider: if we just decide to keep the wrapping in non-test builds, then we could have an extra guardrail, right?

func (i *Iterator) Close() error { if i.closed { if buildutil.CrdbTestBuild { panic(...) } else { return fmt.Errorf("iterator has already been closed") // can attach extra data to the error here too } } err := i.iter.Close() i.closed = true return err }

We can avoid panicking immediately in this case but presumably throwing an error if we're doing something that's definitely illegal is better than just continuing on to likely corrupt data?

Re: the last bit -- if we're going to keep track of whether each individual iterator is closed to warn about double-closes, I also wonder whether that logic makes more sense as a part of the pebble.Iterator itself. That struct could carry around the closed bool and give a sensible error if the method is called more than once.

However I wonder if the extra cost (another little struct and maybe one additional indirection for Clone() and Close()) is not minimal enough that we can just eat it. I don't know enough about the implementation to say so -- maybe these iterators are opened and closed in a really hot loop and we can't afford taking any extra time.

This layer is particularly performance sensitive and any additional allocations would be prohibitive.

Re: the last bit -- if we're going to keep track of whether each individual iterator is closed to warn about double-closes, I also wonder whether that logic makes more sense as a part of the pebble.Iterator itself. That struct could carry around the closed bool and give a sensible error if the method is called more than once.

Since both storage.pebbleIterator and pebble.Iterator are pooled to reduce allocations, the iterators structs may already be in use again by the time Close is called a second time. We can't distinguish between a Close issued by the new user, versus a double-Close by the old user.

It does seem like it'd be feasible to maintain monotonically increasing counts of opens and closes and assert that closes never exceeds opens, but the ability to make crdb_test-gated assertions of pkg/storage's use of pebble.Iterator is generally useful. We could move some of those assertions into pebble.Iterator, but they still need to be gated on crdb_test.

@rickystewart would it be feasible to add the Pebble invariants build tag when the Cockroach crdb_test build tag is set? That could allow us to take advantage of the existing Pebble invariant assertions.

(As mentioned, don't let this comment block you, this is fine to be checked-in as is.)

@rickystewart would it be feasible to add the Pebble invariants build tag when the Cockroach crdb_test build tag is set? That could allow us to take advantage of the existing Pebble invariant assertions.

It is possible, but it might result in a lot of stuff being re-compiled for developers when swapping between builds and tests if they happen to be using pebble. You can file a follow-up issue to discuss this if you like.

Replace direct uses of the `*pebble.Iterator` type with a `pebbleiter.Iterator` type definition. In production builds, this type is a simple type definition for `*pebble.Iterator` which suffers no overhead. In `crdb_test` builds, the `pebbleiter.Iterator` is defined as a `*pebbleiter.AssertionIter` type that can assert proper use of the `*pebble.Iterator` type and its contract. Initially, use this new assertion iterator to ensure that a `*pebble.Iterator` is never double-closed, which may cause the Iterator to be added to a sync pool twice leading to races. This type of assertion cannot be done by `*pebble.Iterator`, because the iterator may already be reused by the time the second Close is run. In the future, I'd like to also introduce assertions that pkg/storage always checks validity before calling HasPointAndRange and RangeKeyChanged (see cockroachdb/pebble#2205). Release note: None Epic: None Close cockroachdb#90643.

Backport cockroachdb#90859, cockroachdb#96685 and cockroachdb#97222 to mangle the buffers returned by Pebble iterators to ensure MVCC and Cockroach code respects the Pebble iterator memory lifetimes. The RangeBounds() and RangeKeys() validity assertions are omitted, since 22.2 allowed these methods to be invoked on an invalid iterator or an iterator positioned at a point key. Release note: None Release justification: Non-production code changes

jbowens force-pushed the assertion-iter branch from c572aca to 456dd3f Compare October 28, 2022 21:30

jbowens self-assigned this Oct 31, 2022

jbowens force-pushed the assertion-iter branch 2 times, most recently from 642dce5 to 4485497 Compare January 5, 2023 17:07

jbowens marked this pull request as ready for review January 5, 2023 17:09

jbowens requested review from a team as code owners January 5, 2023 17:09

jbowens force-pushed the assertion-iter branch from 4485497 to 6102977 Compare January 5, 2023 17:11

nicktrav approved these changes Jan 5, 2023

View reviewed changes

jbowens force-pushed the assertion-iter branch from 6102977 to 7e098bf Compare January 5, 2023 17:55

sumeerbhola reviewed Jan 5, 2023

View reviewed changes

jbowens force-pushed the assertion-iter branch from 7e098bf to 7df8fba Compare January 9, 2023 16:54

jbowens commented Jan 9, 2023

View reviewed changes

jbowens requested a review from rickystewart January 9, 2023 16:54

jbowens force-pushed the assertion-iter branch from 7df8fba to e4200a9 Compare January 9, 2023 16:59

nicktrav reviewed Jan 9, 2023

View reviewed changes

rickystewart approved these changes Jan 9, 2023

View reviewed changes

jbowens force-pushed the assertion-iter branch from e4200a9 to cfef6c4 Compare January 10, 2023 16:05

jbowens force-pushed the assertion-iter branch from cfef6c4 to 96c8a75 Compare January 11, 2023 19:17

jbowens merged commit 8330bdb into cockroachdb:master Jan 12, 2023

jbowens deleted the assertion-iter branch January 12, 2023 03:12

jbowens mentioned this pull request Feb 23, 2023

release-22.2: storage: introduce test-build assertion iterator #97572

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

storage/pebbleiter: guard against double-closed Iterators #90859

storage/pebbleiter: guard against double-closed Iterators #90859

jbowens commented Oct 28, 2022 •

edited

Loading

cockroach-teamcity commented Oct 28, 2022

nicktrav left a comment

sumeerbhola left a comment

jbowens left a comment

nicktrav left a comment

rickystewart left a comment

rickystewart Jan 9, 2023

rickystewart Jan 9, 2023

jbowens Jan 9, 2023

rickystewart Jan 9, 2023

storage/pebbleiter: guard against double-closed Iterators #90859

storage/pebbleiter: guard against double-closed Iterators #90859

Conversation

jbowens commented Oct 28, 2022 • edited Loading

cockroach-teamcity commented Oct 28, 2022

nicktrav left a comment

Choose a reason for hiding this comment

sumeerbhola left a comment

Choose a reason for hiding this comment

jbowens left a comment

Choose a reason for hiding this comment

nicktrav left a comment

Choose a reason for hiding this comment

rickystewart left a comment

Choose a reason for hiding this comment

rickystewart Jan 9, 2023

Choose a reason for hiding this comment

rickystewart Jan 9, 2023

Choose a reason for hiding this comment

jbowens Jan 9, 2023

Choose a reason for hiding this comment

rickystewart Jan 9, 2023

Choose a reason for hiding this comment

jbowens commented Oct 28, 2022 •

edited

Loading