Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

go.mod: bump Pebble to 8801607debb0 #108864

Merged
merged 2 commits into from
Aug 21, 2023

Conversation

itsbilal
Copy link
Member

@itsbilal itsbilal commented Aug 16, 2023

Second commit does the wholesale addition of error return values to NewMVCCIterator and NewEngineIterator.

8801607d db: Add TestConcurrentExcise
ad242264 *: add error return value to NewIter{,WithContext}
eb4a8571 *: implement EventuallyFileOnlySnapshots

Release note: None.
Epic: None.

@itsbilal itsbilal requested review from a team as code owners August 16, 2023 19:38
@itsbilal itsbilal requested a review from a team August 16, 2023 19:38
@itsbilal itsbilal requested review from a team as code owners August 16, 2023 19:38
@itsbilal itsbilal requested review from jbowens, rhu713 and mgartner and removed request for a team August 16, 2023 19:38
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@itsbilal itsbilal force-pushed the bilal/pebble-master-8801607debb0 branch 4 times, most recently from d8d3c9f to 1a1753d Compare August 16, 2023 21:05
Copy link
Collaborator

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 4 files at r1, 5 of 69 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @itsbilal, @mgartner, and @rhu713)


pkg/kv/kvserver/replica_rangefeed.go line 408 at r2 (raw file):

		lowerBound, _ := keys.LockTableSingleKey(desc.StartKey.AsRawKey(), nil)
		upperBound, _ := keys.LockTableSingleKey(desc.EndKey.AsRawKey(), nil)
		iter, err := r.store.TODOEngine().NewEngineIterator(storage.IterOptions{

I'm a little worried about all these instances where we can't easily propagate an error return value. I don't want some refactoring to end up feeding one of these call sites an EFOS snapshot Reader, and there's a subtle, very rare panic waiting to happen.

Maybe we should add another storage package interface:

type ReaderWithMustIterators interface {
   Reader
   MustEngineIterator(r Reader) EngineIterator
   MustMVCCIterator(r Reader) MVCCIterator
}

And Readers for which iterator construction is infallible can implement this interface. Then in situations like this, we have type safety protection against accidentally threading an EFOS snapshot in its stead.

@itsbilal
Copy link
Member Author

pkg/kv/kvserver/replica_rangefeed.go line 408 at r2 (raw file):

Previously, jbowens (Jackson Owens) wrote…

I'm a little worried about all these instances where we can't easily propagate an error return value. I don't want some refactoring to end up feeding one of these call sites an EFOS snapshot Reader, and there's a subtle, very rare panic waiting to happen.

Maybe we should add another storage package interface:

type ReaderWithMustIterators interface {
   Reader
   MustEngineIterator(r Reader) EngineIterator
   MustMVCCIterator(r Reader) MVCCIterator
}

And Readers for which iterator construction is infallible can implement this interface. Then in situations like this, we have type safety protection against accidentally threading an EFOS snapshot in its stead.

I'm good with making that change, though the matching Pebble interface has already been updated even for paths that don't currently return errors. So if someone were to make a change to start returning an error in *DB.NewIter, and we have a panic in *Pebble.MustMVCCIterator in case of an error, we will still end up in the same place where we have a runtime panic and no direct type safety due to a mismatch in expectations between the Pebble struct in Cockroach and the DB struct in Pebble.

Copy link
Collaborator

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @itsbilal, @mgartner, and @rhu713)


pkg/kv/kvserver/replica_rangefeed.go line 408 at r2 (raw file):

Previously, itsbilal (Bilal Akhtar) wrote…

I'm good with making that change, though the matching Pebble interface has already been updated even for paths that don't currently return errors. So if someone were to make a change to start returning an error in *DB.NewIter, and we have a panic in *Pebble.MustMVCCIterator in case of an error, we will still end up in the same place where we have a runtime panic and no direct type safety due to a mismatch in expectations between the Pebble struct in Cockroach and the DB struct in Pebble.

Yes, but we're able to (independently of this change) propagate the Must* interface down into Pebble too. I also think it's more reasonable of an expectation that we (Storage) can maintain this contract at pkg/storage, than expecting all Cockroach engineers to understand which Readers are safe to ignore errors from and which are not.

Copy link
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jbowens, @mgartner, and @rhu713)


pkg/kv/kvserver/replica_rangefeed.go line 408 at r2 (raw file):

Previously, jbowens (Jackson Owens) wrote…

Yes, but we're able to (independently of this change) propagate the Must* interface down into Pebble too. I also think it's more reasonable of an expectation that we (Storage) can maintain this contract at pkg/storage, than expecting all Cockroach engineers to understand which Readers are safe to ignore errors from and which are not.

+1 to separating out these two interfaces.

@itsbilal itsbilal force-pushed the bilal/pebble-master-8801607debb0 branch from 1a1753d to cf99335 Compare August 17, 2023 17:25
Copy link
Member Author

@itsbilal itsbilal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jbowens, @mgartner, and @rhu713)


pkg/kv/kvserver/replica_rangefeed.go line 408 at r2 (raw file):

Previously, sumeerbhola wrote…

+1 to separating out these two interfaces.

This is done now.

@itsbilal itsbilal force-pushed the bilal/pebble-master-8801607debb0 branch from cf99335 to 353708a Compare August 17, 2023 17:44
Copy link
Collaborator

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 11 of 69 files at r2, 2 of 12 files at r3, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @mgartner, @rhu713, and @sumeerbhola)


pkg/kv/kvserver/replica_rangefeed.go line 408 at r2 (raw file):

Previously, itsbilal (Bilal Akhtar) wrote…

This is done now.

This interface assertion shouldn't be necessary, right? TODOEngine() returns an Engine which always must implement storage.ReaderWithMustIterators?

Now that I'm looking at this again, could we leave NewEngineIterator and NewMVCCIterator's Reader signatures as-is, and make the fallible NewIterator interface the exceptional? A FallibleReader or some such? This seems like it would reduce churn and eliminate boilerplate. If this is reasonable to you, sorry I didn't think of it earlier—I know these broad API refactors are annoying.

@itsbilal itsbilal force-pushed the bilal/pebble-master-8801607debb0 branch from 353708a to 5535349 Compare August 17, 2023 21:29
@itsbilal
Copy link
Member Author

pkg/kv/kvserver/replica_rangefeed.go line 408 at r2 (raw file):

Previously, jbowens (Jackson Owens) wrote…

This interface assertion shouldn't be necessary, right? TODOEngine() returns an Engine which always must implement storage.ReaderWithMustIterators?

Now that I'm looking at this again, could we leave NewEngineIterator and NewMVCCIterator's Reader signatures as-is, and make the fallible NewIterator interface the exceptional? A FallibleReader or some such? This seems like it would reduce churn and eliminate boilerplate. If this is reasonable to you, sorry I didn't think of it earlier—I know these broad API refactors are annoying.

I think we should stick with this and consider ReaderWithMustIterators as the exception rather than the norm. If anything we should move towards eliminating it (see point 6 below). Here are all the reasons why:

  1. We will end up using EFOS in more cases going forward anyway
  2. Future changes can introduce errors to existing iterator creation code paths
  3. Most uses of NewMVCCIterator already handle errors or have error return values, this change just closes the gap
  4. I've already done the bulk of the work of the refactor, and it'd be a shame to throw it away only to reintroduce it if we ever introduced errors elsewhere down the line
  5. We already have too many interfaces (eg. InternalWriter vs Writer),
  6. The only cases where we really need ReaderWithMustIterators are the MVCC incremental iterator (easily fixable with a mechanical refactor), and the rangefeed (a little more complex). I can see the rangefeed being the only longer-term user of MustMVCCIterator. Meanwhile we have 200+ uses of NewMVCCIterator that handle errors correctly.

I've made the TODOEngine change in this line.

@itsbilal itsbilal force-pushed the bilal/pebble-master-8801607debb0 branch from 5535349 to 630c317 Compare August 17, 2023 21:37
Copy link
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 4 files at r1, 33 of 69 files at r2, 8 of 12 files at r3, 2 of 2 files at r4, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @itsbilal, @jbowens, @mgartner, and @rhu713)


pkg/kv/kvserver/replica_evaluate.go line 110 at r4 (raw file):

	})
	if err != nil {
		log.Fatalf(context.TODO(), "error when creating iterator: %s", err)

there is a single caller of optimizePuts that can handle an error by wrapping it like pErr := kvpb.NewErrorWithTxn(err)


pkg/kv/kvserver/replica_rangefeed.go line 408 at r2 (raw file):

Previously, itsbilal (Bilal Akhtar) wrote…

I think we should stick with this and consider ReaderWithMustIterators as the exception rather than the norm. If anything we should move towards eliminating it (see point 6 below). Here are all the reasons why:

  1. We will end up using EFOS in more cases going forward anyway
  2. Future changes can introduce errors to existing iterator creation code paths
  3. Most uses of NewMVCCIterator already handle errors or have error return values, this change just closes the gap
  4. I've already done the bulk of the work of the refactor, and it'd be a shame to throw it away only to reintroduce it if we ever introduced errors elsewhere down the line
  5. We already have too many interfaces (eg. InternalWriter vs Writer),
  6. The only cases where we really need ReaderWithMustIterators are the MVCC incremental iterator (easily fixable with a mechanical refactor), and the rangefeed (a little more complex). I can see the rangefeed being the only longer-term user of MustMVCCIterator. Meanwhile we have 200+ uses of NewMVCCIterator that handle errors correctly.

I've made the TODOEngine change in this line.

I agree that error return should be the norm.
And now I don't see why we need these Must variants. I wonder whether it would be simpler with the following sequence of PRs:

  • Introduce the interface change and change all callers to handle the error, with a few panics where it isn't vanilla refactoring. This touches many files but merges quickly. The PR we are currently reviewing can be changed to be this. There are no actual error returns, so there is no worry about those panics being hit.
  • (potentially multiple small PRs) Fix all the panic cases by propagating error -- we can seek expertise of the various teams for these, if needed.
  • Bump Pebble and change the storage package to propagate the error from Pebble.

pkg/kv/kvserver/replica_rangefeed.go line 420 at r4 (raw file):

	// due to stopping, but before it enters the quiescing state, then the select
	// below will fall through to the panic.
	if err := p.Start(r.store.Stopper(), rtsIter); err != nil {

this may be an acceptable way to handle an error in the NewEngineIterator case too.


pkg/kv/kvserver/rangefeed/catchup_scan.go line 87 at r4 (raw file):

) *CatchUpIterator {
	return &CatchUpIterator{
		simpleCatchupIter: storage.NewMVCCIncrementalIterator(reader.(storage.ReaderWithMustIterators),

Can't this type assertion trigger a panic?
The only non-test caller is replica_rangefeed.go, which I think can handle an error using return future.MakeCompletedErrorFuture(err).


pkg/storage/mvcc.go line 2840 at r4 (raw file):

	// _expect_ to hit this since the RevertRange is only intended for non-live
	// key spans, but there could be an intent leftover.
	iter := NewMVCCIncrementalIterator(rw.(ReaderWithMustIterators), MVCCIncrementalIterOptions{

same panic question. And this callsite can handle an error, just like all the callers of NewMVCCIncrementalIterator in this file.


pkg/kv/kvserver/batcheval/cmd_export_test.go line 533 at r4 (raw file):

	}

	iter := storage.NewMVCCIncrementalIterator(reader.(storage.ReaderWithMustIterators), storage.MVCCIncrementalIterOptions{

can handle error

@itsbilal itsbilal force-pushed the bilal/pebble-master-8801607debb0 branch from 630c317 to 574170f Compare August 18, 2023 13:36
Copy link
Member Author

@itsbilal itsbilal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jbowens, @mgartner, @rhu713, and @sumeerbhola)


pkg/kv/kvserver/replica_evaluate.go line 110 at r4 (raw file):

Previously, sumeerbhola wrote…

there is a single caller of optimizePuts that can handle an error by wrapping it like pErr := kvpb.NewErrorWithTxn(err)

Done.


pkg/kv/kvserver/replica_rangefeed.go line 408 at r2 (raw file):

Previously, sumeerbhola wrote…

I agree that error return should be the norm.
And now I don't see why we need these Must variants. I wonder whether it would be simpler with the following sequence of PRs:

  • Introduce the interface change and change all callers to handle the error, with a few panics where it isn't vanilla refactoring. This touches many files but merges quickly. The PR we are currently reviewing can be changed to be this. There are no actual error returns, so there is no worry about those panics being hit.
  • (potentially multiple small PRs) Fix all the panic cases by propagating error -- we can seek expertise of the various teams for these, if needed.
  • Bump Pebble and change the storage package to propagate the error from Pebble.

This PR was only supposed to be 1+3 anyway, at least with the way it was at the start. I'm down to decouple the Pebble bump if that helps. The only non-vanilla-refactors, like I said, are NewMVCCIncrementalIterator and its ~20 callsites, and replica_{evaluate,rangefeed}.go (which is actually addressed now that this PR is taking on some of the work from point 2, so it's really only NewMVCCIncrementalIterator). I was planning on doing NewMVCCIncrementalIterator separately anyway as there are some mild complexities there (eg. CatchUpIterator), which would be the rest of 2.

I've added a TODO to NewMVCCIncrementalIterator. If that sounds good, let me know - I can split out the replica_{evaluate,rangefeed} stuff too out of this PR but otherwise I think it already is covering all the mechanical refactors that shouldn't need more attention from other teams.


pkg/kv/kvserver/replica_rangefeed.go line 420 at r4 (raw file):

Previously, sumeerbhola wrote…

this may be an acceptable way to handle an error in the NewEngineIterator case too.

Done.


pkg/kv/kvserver/rangefeed/catchup_scan.go line 87 at r4 (raw file):

Previously, sumeerbhola wrote…

Can't this type assertion trigger a panic?
The only non-test caller is replica_rangefeed.go, which I think can handle an error using return future.MakeCompletedErrorFuture(err).

That's correct, but my understanding was that we'd prefer a type-assertion panic when we send an EFOS down the wrong code path, as opposed to the much rarer "panic on an error in the wild". Given my comment above that explicitly calls out NewMVCCIncrementalIterator as something we should fix to return an error soon, I agree that we should be returning an error here.

I just didn't want to update all ~30 callsites of NewMVCCIncrementalIterator in the same PR with the hope this would land quickly and that can be the follow-up (hence the TODO in NewMVCCIncrementalIterator that I removed with the ReaderWithMustIterator change), but I can do it.


pkg/storage/mvcc.go line 2840 at r4 (raw file):

Previously, sumeerbhola wrote…

same panic question. And this callsite can handle an error, just like all the callers of NewMVCCIncrementalIterator in this file.

NewMVCCIncrementalIterator case as mentioned above


pkg/kv/kvserver/batcheval/cmd_export_test.go line 533 at r4 (raw file):

Previously, sumeerbhola wrote…

can handle error

Same as above

@itsbilal itsbilal force-pushed the bilal/pebble-master-8801607debb0 branch from 574170f to c602fe1 Compare August 18, 2023 14:21
Copy link
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing this

:lgtm:

Reviewed 4 of 76 files at r5, 42 of 72 files at r6, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @jbowens, @mgartner, and @rhu713)

8801607d db: Add TestConcurrentExcise
ad242264 *: add error return value to NewIter{,WithContext}
eb4a8571 *: implement EventuallyFileOnlySnapshots

Release note: None.
Now that the iterator creation return paths can return
errors from Pebble, thread that through the interface
and all callers. Mechanical but very large and widespread
change.

Epic: none

Release note: None
@itsbilal itsbilal force-pushed the bilal/pebble-master-8801607debb0 branch from c602fe1 to be542c7 Compare August 21, 2023 15:04
@itsbilal
Copy link
Member Author

TFTR!

bors r=sumeerbhola

@craig
Copy link
Contributor

craig bot commented Aug 21, 2023

Build succeeded:

@craig craig bot merged commit eaf8717 into cockroachdb:master Aug 21, 2023
itsbilal added a commit to itsbilal/cockroach that referenced this pull request Aug 22, 2023
Reader and ReadWriterWithMustIterators were added as a
stop-gap in cockroachdb#108864 to avoid having to refactor uses of
MVCCIncrementalIterator (and by extention, CatchUpIterator) all
in one mega PR. This change addresses some TODOs around moving
to `Reader` which has an error return value in the constructor
for iterators, and adds the relevant non-repetitive/straightforward
plumbing to thread this error through those non-mechanical code paths.

Epic: none

Release note: None
craig bot pushed a commit that referenced this pull request Aug 23, 2023
109283: storage,kv: Remove Read{er,Writer}WithMustIterators r=jbowens a=itsbilal

Reader and ReadWriterWithMustIterators were added as a stop-gap in #108864 to avoid having to refactor uses of MVCCIncrementalIterator (and by extention, CatchUpIterator) all in one mega PR. This change addresses some TODOs around moving to `Reader` which has an error return value in the constructor for iterators, and adds the relevant non-repetitive/straightforward plumbing to thread this error through those non-mechanical code paths.

Epic: none

Release note: None

Co-authored-by: Bilal Akhtar <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants