Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kvserver: prevent follower reads while a range is subsumed #50265

Merged
merged 3 commits into from
Aug 4, 2020

Conversation

aayushshah15
Copy link
Contributor

@aayushshah15 aayushshah15 commented Jun 16, 2020

Before this commit, during a merge, the RHS leaseholder’s store
could continue broadcasting (actionable) closed timestamp updates
even after it had been subsumed. This allowed the followers to be
able to serve follower reads past the subsumption time of RHS.
Additionally, after the merge, if the LHS had a lower closed timestamp
than the RHS, it could allow writes to the keyspan owned by RHS
at timestamps lower than the RHS’s max closed timestamp.

This commit fixes this bug by requiring that the followers catch up
to a LeaseAppliedIndex that belongs to the entry succeeding the
Subsume request.

Fixes #44878

Release note (bug fix): Fixed a rare bug that could cause actionable
closed timestamps to effectively regress over a given keyspan. This
could in turn lead to a serializability violation when using follower
reads. This was due to ill-defined interactions between range merges
and the closed timestamp subsystem.

@blathers-crl
Copy link

blathers-crl bot commented Jun 16, 2020

Thank you for contributing to CockroachDB. Please ensure you have followed the guidelines for creating a PR.

My owl senses detect your PR is good for review. Please keep an eye out for any test failures in CI.

I have added a few people who may be able to assist in reviewing:

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@blathers-crl blathers-crl bot added O-community Originated from the community X-blathers-triaged blathers was able to find an owner labels Jun 16, 2020
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@aayushshah15
Copy link
Contributor Author

I feel uneasy about having store.cfg exported, please let me know if I missed something cleaner.

In a future commit, I will add a test that forces a lease transfer during the merge critical state.

@aayushshah15 aayushshah15 removed the request for review from danhhz June 16, 2020 10:05
@blathers-crl
Copy link

blathers-crl bot commented Jun 16, 2020

Thank you for updating your pull request.

My owl senses detect your PR is good for review. Please keep an eye out for any test failures in CI.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

Copy link
Contributor

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Were there supposed to be one commits or two in here? I see what with a message looking like the squash of two.

Nit: there's a lot diffs created by the exporting of that Cfg guy. Moving that change to a dedicated commit would make reviewing easier.

Nit: the release note is too technical; users don't understand this. Also the tense is wrong (as per https://wiki.crdb.io/wiki/spaces/CRDB/pages/186548364/Release+notes) - say "Fixed a bug".

Release note (bug fix): Fixes a bug that allowed follower reads
during the critical state of a merge transaction.

I'd say

Fixed a rare bug causing transaction serializability violations when using follower reads.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15, @ajwerner, @andreimatei, @nvanbenschoten, and @tbg)


pkg/kv/kvserver/closed_timestamp_test.go, line 710 at r1 (raw file):

// done.
func setupClusterForClosedTSTestingWithCustomKnobs(
	ctx context.Context, t *testing.T, targetDuration time.Duration, knobs base.TestingKnobs,

might as well take a TestClusterArgs.


pkg/kv/kvserver/closed_timestamp_test.go, line 801 at r1 (raw file):

	repls []*kvserver.Replica,
) {
	defaultTestingKnobs := base.TestingKnobs{

Consider keeping a single setupClusterForClosedTSTesting() function by extracting defaultTestingKnobs into a aggressiveResolvedTSPushKnobs that everybody passed in. That way it'll be easier for tests to compose these default knobs with other additional knobs.


pkg/kv/kvserver/replica.go, line 1330 at r1 (raw file):

// neighbor is in its critical phase and, if so, arranges to block all requests
// until the merge completes.
func (r *Replica) maybeWatchForMerge(ctx context.Context, untrack closedts.ReleaseFunc) error {

pls comment the new param


pkg/kv/kvserver/replica.go, line 1361 at r1 (raw file):

		// Nothing more to do.
		r.mu.Unlock()
		return nil

don't we need to call untrack() with LAI+1 in this case?


pkg/kv/kvserver/replica.go, line 1370 at r1 (raw file):

	// arrived and now, which is rare but entirely legal.
	r.unquiesceLocked()
	// We prevent replicas from being able to serve follower reads on timestamps

I think this comment needs to spell things out more: say that we block follower reads by publishing a MLAI that's never going to come in the happy case, and we do that because the closed timestamps that are published from now on will not be reflected in the LHS's tscache after the merge.


pkg/kv/kvserver/replica.go, line 1372 at r1 (raw file):

	// We prevent replicas from being able to serve follower reads on timestamps
	// that fall between the subsumption time (FreezeStart) and the timestamp at
	// which the merge transaction commits or aborts (i.e. when a merge is in its

s/when a merge/the timestamp window representing the range's critical state

Also I don't see the syntagm "critical state" used elsewhere around here. Maybe say something more descriptive like "subsumed state".


pkg/kv/kvserver/replica.go, line 1382 at r1 (raw file):

	// has been subsumed are lease requests, which do not bump the LAI.
	//
	// See https://github.com/cockroachdb/cockroach/issues/44878 for discussion.

nit: I personally consider most such issue links to be just click bait. The comments should usually be sufficient for explaining what's going on; when they're not sufficient because we failed to write them properly, there's git blame.


pkg/kv/kvserver/replica_read.go, line 127 at r1 (raw file):

	log.Event(ctx, "executing read-only batch")

	untrack := r.maybeTrackProposal(ctx, ba)

can this down, move next to res.Local.ReleaseFunc = untrack ?


pkg/kv/kvserver/replica_read.go, line 130 at r1 (raw file):

		}
		br, res, pErr = evaluateBatch(ctx, kvserverbase.CmdIDKey(""), rw, rec, nil, ba, true /* readOnly */)

spurious diff


pkg/kv/kvserver/replica_read.go, line 150 at r1 (raw file):

		return nil, res, pErr
	}
	res.Local.ReleaseFunc = untrack

Is this indirection really needed? Can't we simply untrack here?
I see that there are some early returns in maybeWatchForMerge that don't untrack with LAI+1, but I can't tell which ones are expected. Plus one of them seems wrong to me.

If the indirection is necessary, can we at least bind all the arguments to the closure we put in res.Local.ReleaseFunc? They shouldn't change in between here and maybeWatchForMerge, should they? The closure could take a single bool about whether we're watching for the merge or not (a false would translate to all zeros to untrack()).


pkg/kv/kvserver/replica_read.go, line 154 at r1 (raw file):

}

func (r *Replica) maybeTrackProposal(

This funky method has one caller. Inline it there and then I won't have to ask for a better name and comments.


pkg/kv/kvserver/replica_read.go, line 160 at r1 (raw file):

		// We start tracking SubsumeRequests as part of our guarantee to never
		// broadcast a closed timestamp entry between when we evaluate the Subsume
		// request and the time that we run maybeWatchForMerge for the first time.

I don't quite understand the "for the first time" part. Consider axing it.


pkg/kv/kvserver/replica_read.go, line 162 at r1 (raw file):

		// request and the time that we run maybeWatchForMerge for the first time.
		// See comment in maybeWatchForMerge() for details.
		_, untrack := r.store.Cfg.ClosedTimestamp.Tracker.Track(ctx)

I think there's something missing here - the other return value from Track is a minimum timestamp for the (Subsume) request. I think we need to reject the Subsume request if its timestamp is above it. I think the comment above should spell out the important of Subsume's ts.


pkg/kv/kvserver/batcheval/result/result.go, line 69 at r1 (raw file):

	// Call maybeWatchForMerge.
	MaybeWatchForMerge bool
	// ReleaseFunc, when it's not the empty function, stores the callback returned

I think this needs more words; the read-only SubsumeRequest doesn't have a LeaseAppliedIndex.

Copy link
Contributor Author

@aayushshah15 aayushshah15 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15, @ajwerner, @andreimatei, @nvanbenschoten, and @tbg)


pkg/kv/kvserver/replica.go, line 1361 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

don't we need to call untrack() with LAI+1 in this case?

We only need to release this tracked proposal the first time we run maybeWatchForMerge (which should always be right after the subsume request evaluation unless there is a lease transfer). Any time we see that we've already started the mergeComplete channel, we should already have released this proposal then. This is what the comment below (in maybeTrackProposal) talks about, I'll make it clearer.


pkg/kv/kvserver/replica_read.go, line 127 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

can this down, move next to res.Local.ReleaseFunc = untrack ?

Starting to track the proposal after evaluation of the subsume request seemed problematic for the given scenario:
t1: RHS leaseholder begins evaluating the subsume request (i.e. t1 is our FreezeStart)
t2: The leaseholder's store broadcasts a closed timestamp that is above t1
t3: Now we begin tracking the proposal, but it's too late because we've already emitted a closed timestamp above FreezeStart

Does that make sense? Let me know if I'm missing something.

@nvanbenschoten
Copy link
Member

there's a lot diffs created by the exporting of that Cfg guy. Moving that change to a dedicated commit would make reviewing easier.

Big +1 on this. The main change here is small but subtle and difficult to prove the correctness of. It's a major burden for reviewers and future readers when such changes are mixed in with a refactor that touches 46 files.

// NB: The above statement relies on the invariant that the LAI that
// follows a Subsume request will be applied only after the merge aborts, as
// the RHS replicas will get destroyed if the merge successfully commits. This
// invariant is upheld because the only Raft proposals allowed after a range
// has been subsumed are lease requests, which do not bump the LAI.

Are we sure about this? I mentioned in #44878 (comment) that I'm not convinced this invariant is currently upheld. It relies on latching and that there are Requests that perform Raft proposals that do not acquire any latches that conflict with the SubsumeRequest request, such as ComputeChecksumRequests.

Given that this is so critical for correctness, I think it would be worth adding assertions around raft proposal logic to ensure that we uphold this invariant.

@aayushshah15
Copy link
Contributor Author

Sounds good, feel free to hold off on reviewing until I reorg it into multiple commits. I'll also add assertions around the invariant.

@tbg tbg removed their request for review June 25, 2020 09:55
@aayushshah15 aayushshah15 changed the title kvserver: prevent follower reads during range merge critical state kvserver: prevent follower reads while a range is subsumed Jun 25, 2020
@aayushshah15 aayushshah15 force-pushed the rangefeed_drop branch 4 times, most recently from f136ce0 to fd1c9fa Compare June 25, 2020 23:58
Copy link
Contributor Author

@aayushshah15 aayushshah15 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reorganized a bunch of stuff and added a new test that forces a lease transfer on the RHS during the merge (& ensures that our closed timestamp guarantees are upheld).

With regards to ComputeChecksumRequest: It turns out that the ComputeChecksum request does "acquire latches" even though it doesn't declare any keys. However, checkExecutionCanProceed only cares about the latchGuard being non-nil on the leaseholder. It almost seems like an unintentional consequence. Should we be more explicit about this? Sorry for not investigating this properly earlier. I added an assertion that blows up if it sees a proposal that would bump the LAI while the range is subsumed, along with an accompanying test.

This is RFAL.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15, @ajwerner, @andreimatei, and @nvanbenschoten)


pkg/kv/kvserver/closed_timestamp_test.go, line 710 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

might as well take a TestClusterArgs.

Done, let me know if I misunderstood you.


pkg/kv/kvserver/closed_timestamp_test.go, line 801 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

Consider keeping a single setupClusterForClosedTSTesting() function by extracting defaultTestingKnobs into a aggressiveResolvedTSPushKnobs that everybody passed in. That way it'll be easier for tests to compose these default knobs with other additional knobs.

Done


pkg/kv/kvserver/replica.go, line 1330 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

pls comment the new param

Removed the indirection.


pkg/kv/kvserver/replica.go, line 1370 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

I think this comment needs to spell things out more: say that we block follower reads by publishing a MLAI that's never going to come in the happy case, and we do that because the closed timestamps that are published from now on will not be reflected in the LHS's tscache after the merge.

Done, let me know if you still find it lacking.


pkg/kv/kvserver/replica.go, line 1372 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

s/when a merge/the timestamp window representing the range's critical state

Also I don't see the syntagm "critical state" used elsewhere around here. Maybe say something more descriptive like "subsumed state".

Done.


pkg/kv/kvserver/replica.go, line 1382 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

nit: I personally consider most such issue links to be just click bait. The comments should usually be sufficient for explaining what's going on; when they're not sufficient because we failed to write them properly, there's git blame.

Removed.


pkg/kv/kvserver/replica_read.go, line 150 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

Is this indirection really needed? Can't we simply untrack here?
I see that there are some early returns in maybeWatchForMerge that don't untrack with LAI+1, but I can't tell which ones are expected. Plus one of them seems wrong to me.

If the indirection is necessary, can we at least bind all the arguments to the closure we put in res.Local.ReleaseFunc? They shouldn't change in between here and maybeWatchForMerge, should they? The closure could take a single bool about whether we're watching for the merge or not (a false would translate to all zeros to untrack()).

You're correct that the indirection isn't really necessary since we don't need to untrack() in the lease acquisition path (which is what the initial check in maybeWatchForMerge is for)


pkg/kv/kvserver/replica_read.go, line 154 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

This funky method has one caller. Inline it there and then I won't have to ask for a better name and comments.

Done


pkg/kv/kvserver/replica_read.go, line 160 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

I don't quite understand the "for the first time" part. Consider axing it.

Done.


pkg/kv/kvserver/replica_read.go, line 162 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

I think there's something missing here - the other return value from Track is a minimum timestamp for the (Subsume) request. I think we need to reject the Subsume request if its timestamp is above it. I think the comment above should spell out the important of Subsume's ts.

Not applicable as discussed offline.


pkg/kv/kvserver/batcheval/result/result.go, line 69 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

I think this needs more words; the read-only SubsumeRequest doesn't have a LeaseAppliedIndex.

Removed this, see above.

@aayushshah15 aayushshah15 force-pushed the rangefeed_drop branch 2 times, most recently from b89c7c9 to 5f0b494 Compare June 26, 2020 13:32
@aayushshah15
Copy link
Contributor Author

The test that forces a lease transfer during the merge is flakey under contention because it relies on the merge txn not retrying 😞. Trying to fix that..

Copy link
Contributor

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

nit: I imagine I'm probably driving you crazy with this this stuff, but I think r4 and r5 should be squashed with r2. The testing really belongs with the commit fixing the issue. And the assertion is tiny, and also tied to the main thing this patch is doing. TestStoreRangeMergeCheckConsistencyAfterSubsumption is more random, that can be a separate commit indeed.

Btw, I also find it much better to rewrite all the commits every time I update a PR (like, rebase them all), so that reviewable shows the new commits as a contiguous block, instead of interspersing old commits and new commits.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15, @ajwerner, @andreimatei, and @nvanbenschoten)


pkg/kv/kvserver/client_merge_test.go, line 1534 at r5 (raw file):

// 2. Once a merge is aborted, pending (and future) requests will be allowed to
// be proposed. This is meant as a sanity check for the assertion at the
// end of Replica.propose().

pls put more words here describing this assertion


pkg/kv/kvserver/client_merge_test.go, line 1599 at r5 (raw file):

	case <-checkConsistencyResp:
		t.Fatalf("expected the consistency check to wait until the merge was complete")
	case <-time.After(1 * time.Second):

you probably want more than a second here and below; tests run on loaded machines, particularly under testrace.
Consider testutils.SucceedsSoon() - that's the go to for waiting a while


pkg/kv/kvserver/closed_timestamp_test.go, line 469 at r4 (raw file):

	t.Run("without lease transfer", func(t *testing.T) {
		ctx := context.Background()
		blockMergeTrigger := make(chan interface{}, 10) // headroom in case the merge txn retries

why interface{} and not hlc.Timestamp ?


pkg/kv/kvserver/replica_raft.go, line 337 at r5 (raw file):

	if maxLeaseIndex != 0 {
		r.mu.RLock()
		defer r.mu.RUnlock()

nit: I'd just put the RUnlock() below, without a defer. You want to hold this lock for as little as possible, and tie it with the mergeInProgressRLocked() call. Like, if there's more code added to this method, you don't want it running under the lock.
If the log.Fatal() is hit, the lock won't be release anyway.


pkg/kv/kvserver/replica_read.go, line 127 at r1 (raw file):

Previously, aayushshah15 (Aayush Shah) wrote…

Starting to track the proposal after evaluation of the subsume request seemed problematic for the given scenario:
t1: RHS leaseholder begins evaluating the subsume request (i.e. t1 is our FreezeStart)
t2: The leaseholder's store broadcasts a closed timestamp that is above t1
t3: Now we begin tracking the proposal, but it's too late because we've already emitted a closed timestamp above FreezeStart

Does that make sense? Let me know if I'm missing something.

ack


pkg/kv/kvserver/replica_read.go, line 126 at r2 (raw file):

	log.Event(ctx, "executing read-only batch")

	untrack := func(_ context.Context, _ ctpb.Epoch, _ roachpb.RangeID, _ ctpb.LAI) {}

I'd suggest simplifying this untrack() to only take a bool - whether or not the range has been subsumed, and closing over all the other arguments right here.

@aayushshah15
Copy link
Contributor Author

In addition to what I just posted in #51294, I also ran into some clock sync related Fatals while stressing kvnemesis.

  • I saw a run failing with this on server start.
clock synchronization error: this node is more than 500ms away from at least half of the known nodes (0 of 1 are within the offset)
  • And a run failing with this:
kvnemesis_test.go:67: error executing 'SET CLUSTER SETTING kv.rangefeed.enabled = true': pq: update-setting: remote wall time is too far ahead (970.559966ms) to be trustworthy

With this warning message in the traces:

 event:util/hlc/hlc.go:260 [n2,s2] backward time jump detected (-0.965141 seconds)

Here are the logs for these:
clock_out_of_sync_onboot.zip
backward_clock_jump.zip

The second one seems..really weird?

@aayushshah15
Copy link
Contributor Author

FWIW at this point, I believe I’m not going to find anything in these stress runs that falls under this PR. The exact issue this PR fixes is simulated by the targeted unit test. It’s easy to sink a lot of time looking into these nemeses failures so we should probably let it find stuff in the nightly runs & track them separately anyway.

Currently, RaftHeartbeatIntervalTicks is not configurable from a
TestCluster and we disallow RaftElectionTimeoutTicks to be less than or
equal to it. Thus, tests that use TestCluster and, for instance, sleep
for a node's liveness to expire take longer than they really need to.

Release note: None
Copy link
Member

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm_strong:

Reviewed 1 of 23 files at r14, 21 of 21 files at r15, 1 of 1 files at r16.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @aayushshah15, @ajwerner, and @andreimatei)


pkg/kv/kvserver/replica_closedts.go, line 37 at r6 (raw file):

because it gets interpreted as the range getting quiesced

This doesn't quite capture the conclusion of that conversation. Mind updating it to be a little more clear?


pkg/kv/kvserver/replica_write.go, line 156 at r12 (raw file):

Subsume() cmd_subsume.go

Is this missing an "in"?


pkg/kv/kvserver/batcheval/cmd_compute_checksum.go, line 39 at r16 (raw file):

a rare a closed timestamp

Delete the second "a".

Before this commit, during a merge, the RHS leaseholder’s store could
continue broadcasting (actionable) closed timestamp updates even after
it had been subsumed. This allowed the followers to be able to serve
follower reads past the subsumption time of RHS.  Additionally, after
the merge, if the LHS had a lower closed timestamp than the RHS, it
could allow writes to the keyspan owned by RHS at timestamps lower than
the RHS’s max closed timestamp.

This commit fixes this bug by requiring that the followers catch up to a
LeaseAppliedIndex that belongs to the entry succeeding the Subsume
request. It also adds necessary assertions around the invariant that
while a range is subsumed, nothing (except the merge txn being rolled
back) can bump its lease applied index.

Fixes cockroachdb#44878

Release note (bug fix): Fixed a rare bug that could cause actionable
closed timestamps to effectively regress over a given keyspan. This
could in turn lead to a serializability violation when using follower
reads. This was due to ill-defined interactions between range merges
and the closed timestamp subsystem.
This commit adds a "sanity check" test around the assertion at the
end of Replica.propose() that ensures that the lease applied index
of a subsumed range is never bumped.

Release note: None
Copy link
Contributor Author

@aayushshah15 aayushshah15 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the extremely detailed reviews @nvanbenschoten and @andreimatei

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @aayushshah15, @ajwerner, @andreimatei, and @nvanbenschoten)


pkg/kv/kvserver/replica_closedts.go, line 37 at r6 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
because it gets interpreted as the range getting quiesced

This doesn't quite capture the conclusion of that conversation. Mind updating it to be a little more clear?

Done.


pkg/kv/kvserver/replica_write.go, line 156 at r12 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Subsume() cmd_subsume.go

Is this missing an "in"?

Fixed.


pkg/kv/kvserver/batcheval/cmd_compute_checksum.go, line 39 at r16 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
a rare a closed timestamp

Delete the second "a".

Done.

@aayushshah15
Copy link
Contributor Author

bors r=nvanbenschoten,andreimatei

aayushshah15 added a commit to aayushshah15/cockroach that referenced this pull request Aug 4, 2020
We had merges disabled because of the bugs tracked in cockroachdb#44878, but those have
since been fixed by cockroachdb#46085 and cockroachdb#50265.

Release note: None
@craig
Copy link
Contributor

craig bot commented Aug 4, 2020

Build failed (retrying...):

@craig
Copy link
Contributor

craig bot commented Aug 4, 2020

Build succeeded:

@craig craig bot merged commit 6305432 into cockroachdb:master Aug 4, 2020
craig bot pushed a commit that referenced this pull request Aug 4, 2020
51305: geo,geoindex: use bounding box for geography coverings that are bad r=sumeerbhola a=sumeerbhola

This change uses a covererInterface implementation for geography
that notices when a covering is using top-level cells of all faces
and in that case uses the bounding box to compute the covering.

Also changed the bounding box calculation for geography shapes to
use only the points and not first construct s2.Regions. The latter
causes marginally bad shapes to continue to have bad coverings since
the bounding box also covers the whole earth.

Release note: None

51882: roachpb: panic when comparing a Lease to a non-lease r=andreimatei a=andreimatei

Release note: None

52146: sql: remove local execution of projectSetNode and implement ConstructProjectSet in the new factory r=yuzefovich a=yuzefovich

Depends on #52108.

**sql: remove local execution of projectSetNode**

We have project set processor which is always planned for
`projectSetNode`, so this commit removes the dead code of its local
execution. Additionally, it removes some unused fields and cleans up
cancellation check of the processor.

Release note: None

**sql: implement ConstructProjectSet in the new factory**

Addresses: #47473.

Release note: None

52320: kvserver: enable merges in kvnemesis r=aayushshah15 a=aayushshah15

We had merges disabled because of the bugs tracked in #44878, but those have
since been fixed by #46085 and #50265.

Release note: None

Co-authored-by: sumeerbhola <[email protected]>
Co-authored-by: Andrei Matei <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
Co-authored-by: Aayush Shah <[email protected]>
@aayushshah15 aayushshah15 deleted the rangefeed_drop branch August 4, 2020 18:14
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this pull request Feb 27, 2021
Needed for cockroachdb#61137.

This commit updates the manner through which lease transfers (through
`LeaseTransferRequest`) and range merges (through `SubsumeRequest`)
handle the "transfer of power" from their outgoing leaseholder to their
incoming leaseholder. Specifically, it updates the handling of these
requests in order to rationalize the interaction between their
evaluation and the closing of timestamps through the closed timestamp
side-transport. It is now clearer when and how these requests instruct
the outgoing leaseholder to relinquish its ability to advance the closed
timestamp and, as a result, now possible for the requests to query and
operate on the maximum closed timestamp published by the outgoing leaseholder.

For lease transfers, this commit begins by addressing an existing TODO
to push the revocation of the outgoing lease out of `AdminTransferLease`
and into the evaluation of `LeaseTransferRequest` through a new
`RevokeLease` method on the `EvalContext`. Once a lease is revoked, the
side-transport will no longer be able to advance the closed timestamp
under it. This was made possible by cockroachdb#59086 and was suggested by @tbg
during the code review. We generally like to keep replica state changes
out of "admin" requests themselves, which are intended to coordinate
changes through individual non-admin requests. Admin requests generally
don't even need to evaluate on a leaseholder (though they try to), so
having them perform any state changes is fragile.

For range merges, this commit removes the `MaybeWatchForMerge` flag from
the `LocalResult` returned by `SubsumeRequest` and replaces it with a
`WatchForMerge` method on the `EvalContext`. This allows the
`SubsumeRequest` to launch the range merge watcher goroutine during it
evaluation, which the side-transport checks for to see whether a
leaseholder can advance its closed timestamp. In doing so, the
`SubsumeRequest` is able to pause closed timestamps when it wants and is
also able to observe and return the maximum closed timestamp _after_ the
closed timestamp has stopped advancing. This is a stark improvement over
the approach with the original closed timestamp system, which required a
herculean effort in cockroachdb#50265 to make correct.

With these refactors complete, the closed timestamp side-transport
should have a much easier and safer time checking whether a given
leaseholder is able to advance its closed timestamp.

Release justification: Necessary for the safety of new functionality.
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this pull request Mar 1, 2021
Needed for cockroachdb#61137.

This commit updates the manner through which lease transfers (through
`LeaseTransferRequest`) and range merges (through `SubsumeRequest`)
handle the "transfer of power" from their outgoing leaseholder to their
incoming leaseholder. Specifically, it updates the handling of these
requests in order to rationalize the interaction between their
evaluation and the closing of timestamps through the closed timestamp
side-transport. It is now clearer when and how these requests instruct
the outgoing leaseholder to relinquish its ability to advance the closed
timestamp and, as a result, now possible for the requests to query and
operate on the maximum closed timestamp published by the outgoing leaseholder.

For lease transfers, this commit begins by addressing an existing TODO
to push the revocation of the outgoing lease out of `AdminTransferLease`
and into the evaluation of `LeaseTransferRequest` through a new
`RevokeLease` method on the `EvalContext`. Once a lease is revoked, the
side-transport will no longer be able to advance the closed timestamp
under it. This was made possible by cockroachdb#59086 and was suggested by @tbg
during the code review. We generally like to keep replica state changes
out of "admin" requests themselves, which are intended to coordinate
changes through individual non-admin requests. Admin requests generally
don't even need to evaluate on a leaseholder (though they try to), so
having them perform any state changes is fragile.

For range merges, this commit removes the `MaybeWatchForMerge` flag from
the `LocalResult` returned by `SubsumeRequest` and replaces it with a
`WatchForMerge` method on the `EvalContext`. This allows the
`SubsumeRequest` to launch the range merge watcher goroutine during it
evaluation, which the side-transport checks for to see whether a
leaseholder can advance its closed timestamp. In doing so, the
`SubsumeRequest` is able to pause closed timestamps when it wants and is
also able to observe and return the maximum closed timestamp _after_ the
closed timestamp has stopped advancing. This is a stark improvement over
the approach with the original closed timestamp system, which required a
herculean effort in cockroachdb#50265 to make correct.

With these refactors complete, the closed timestamp side-transport
should have a much easier and safer time checking whether a given
leaseholder is able to advance its closed timestamp.

Release justification: Necessary for the safety of new functionality.
craig bot pushed a commit that referenced this pull request Mar 1, 2021
61221: kv: sync lease transfers and range merges with closed timestamp side-transport r=nvanbenschoten a=nvanbenschoten

Needed for the safety of #61137.

This commit updates the manner through which lease transfers (through `LeaseTransferRequest`) and range merges (through `SubsumeRequest`) handle the "transfer of power" from their outgoing leaseholder to their incoming leaseholder. Specifically, it updates the handling of these requests in order to rationalize the interaction between their evaluation and the closing of timestamps through the closed timestamp side-transport. It is now clearer when and how these requests instruct the outgoing leaseholder to relinquish its ability to advance the closed timestamp and, as a result, now possible for the requests to query and operate on the maximum closed timestamp published by the outgoing leaseholder.

For lease transfers, this commit begins by addressing an existing TODO to push the revocation of the outgoing lease out of `AdminTransferLease` and into the evaluation of `LeaseTransferRequest` through a new `RevokeLease` method on the `EvalContext`. Once a lease is revoked, the side-transport will no longer be able to advance the closed timestamp under it. This was made possible by #59086 and was suggested by @tbg during the code review. We generally like to keep replica state changes out of "admin" requests themselves, which are intended to coordinate changes through individual non-admin requests. Admin requests generally don't even need to evaluate on a leaseholder (though they try to), so having them perform any state changes is fragile.

For range merges, this commit removes the `MaybeWatchForMerge` flag from the `LocalResult` returned by `SubsumeRequest` and replaces it with a `WatchForMerge` method on the `EvalContext`. This allows the `SubsumeRequest` to launch the range merge watcher goroutine during it evaluation, which the side-transport checks for to see whether a leaseholder can advance its closed timestamp. In doing so, the `SubsumeRequest` is able to pause closed timestamps when it wants and is also able to observe and return the maximum closed timestamp _after_ the closed timestamp has stopped advancing. This is a stark improvement over the approach with the original closed timestamp system, which required a herculean effort in #50265 to make correct.

With these refactors complete, the closed timestamp side-transport should have a much easier and safer time checking whether a given leaseholder is able to advance its closed timestamp.

Release justification: Necessary for the safety of new functionality.

61237: util/log: ensure that all channel logs are displayed with `-show-logs` r=tbg a=knz

When `-show-logs` is specified, the `log.Scope` becomes a no-op and
the default configuration in the `log` package is used. This is the
only time ever when the default configuration is used.

Prior to this patch, only the logging for the DEV channel would
make its way to the standard error (and the test output) in that
case. This was unfortunate, since the intent (as spelled out in a
comment already) was to display everything.

This patch fixes that.

Release justification: non-production code changes

Release note: None

Co-authored-by: Nathan VanBenschoten <[email protected]>
Co-authored-by: Raphael 'kena' Poss <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
O-community Originated from the community X-blathers-triaged blathers was able to find an owner
Projects
None yet
Development

Successfully merging this pull request may close these issues.

rangefeed: dropping events (in catch up scan?)
4 participants