kv: try next replica on RangeNotFoundError #31013
Conversation
Nice find! These kinds of issues are very tricky to track down.
Reviewed 1 of 1 files at r1, 1 of 1 files at r2, 1 of 1 files at r3, 10 of 10 files at r4, 1 of 1 files at r5, 2 of 2 files at r6.
Reviewable status: complete! 0 of 0 LGTMs obtained
pkg/kv/dist_sender.go, line 1117 at r3 (raw file):
// descriptor. Invalidate the cache and try again with the new
// metadata.
log.Eventf(ctx, "evicting range descriptor on %T and backoff for re-lookup: %+v", tErr, desc)
Should we make this a VEventf?
pkg/kv/dist_sender.go, line 1360 at r6 (raw file):
// These errors are likely to be unique to the replica that reported
// them, so no action is required before the next retry.
case *roachpb.RangeNotFoundError:
The effect of this change is that cases where all replicas return RangeNotFoundErrors will now result in a SendError once the transport is exhausted. I think that's ok because we treat them the same in sendPartialBatch, but I want to make sure that's your intention.
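To make the shape of that concrete, here is a minimal sketch of the behavior under discussion, not the actual DistSender code; the identifiers (replica, sendOne, errRangeNotFound) are made up for illustration:

```go
// Sketch: treat RangeNotFoundError-style failures as per-replica errors and
// fall through to a SendError-like result only once every replica in the
// transport has been tried.
package distsketch

import (
	"context"
	"errors"
	"fmt"
)

type replica struct{ storeID int }

// errRangeNotFound stands in for roachpb.RangeNotFoundError.
var errRangeNotFound = errors.New("range not found on store")

func sendToReplicasSketch(
	ctx context.Context,
	replicas []replica,
	sendOne func(context.Context, replica) error,
) error {
	var lastErr error
	for _, r := range replicas {
		err := sendOne(ctx, r)
		if err == nil {
			return nil // success on this replica; stop here
		}
		if errors.Is(err, errRangeNotFound) {
			// Unique to this replica: remember it and try the next one.
			lastErr = err
			continue
		}
		return err // other errors propagate immediately
	}
	// Transport exhausted: report a SendError-like failure so the caller
	// (sendPartialBatch) can evict the descriptor and retry.
	return fmt.Errorf("sending to all %d replicas failed; last error: %v",
		len(replicas), lastErr)
}
```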
pkg/kv/dist_sender.go, line 1362 at r6 (raw file):
case *roachpb.RangeNotFoundError:
	// The store we routed to doesn't have this replica. This can happen when
	// our descriptor is outright outdated, but it can also be caused a replica
"caused by"
pkg/kv/dist_sender.go, line 1371 at r6 (raw file):
// up (which may never happen if the target range is dormant).
if tErr.StoreID != 0 {
	cachedStoreID, found := ds.leaseHolderCache.Lookup(ctx, tErr.RangeID)
In your test failure, was the replica that the request continued to get routed to the leaseholder? I'm wondering if we should propagate the error immediately if we think it is the leaseholder but it returns a RangeNotFoundError.
pkg/kv/dist_sender.go, line 1374 at r6 (raw file):
match := cachedStoreID == tErr.StoreID
evicting := found && match
log.Eventf(ctx, "evicting leaseholder s%d for r%d after RangeNotFound: %t (found=%t, match=%t)", tErr.StoreID, tErr.RangeID, evicting, found, match)
Should this be moved a line down?
pkg/kv/dist_sender.go, line 1391 at r6 (raw file):
if replicas.FindReplica(lh.StoreID) == -1 {
	// Replace NotLeaseHolderError with RangeNotFoundError.
	br.Error = roachpb.NewError(roachpb.NewRangeNotFoundError(rangeID, curReplica.StoreID))
It's surprising that a NotLeaseHolderError will be converted to a RangeNotFoundError and propagated immediately but a RangeNotFoundError won't be propagated (see comment above). Should we change this to a SendError and remove the handling of RangeNotFoundErrors entirely from sendPartialBatch?
pkg/storage/store.go, line 144 at r5 (raw file):
Settings:   st,
AmbientCtx: log.AmbientContext{Tracer: st.Tracer},
Clock:      clock,
Is this a gofmt 1.10 vs 1.11 thing?
Reviewed 1 of 1 files at r2, 7 of 10 files at r4, 1 of 1 files at r5, 1 of 2 files at r6.
Reviewable status: complete! 0 of 0 LGTMs obtained
pkg/cmd/roachtest/election.go, line 60 at r6 (raw file):
// eagerly that multiple elections conflict with each other).
start = timeutil.Now()
buf, err := c.RunWithBuffer(ctx, c.l, c.Node(1), `env COCKROACH_CONNECT_TIMEOUT=120 ./cockroach sql --insecure -e "
Why so long? The point of this test is that the cluster should be able to recover quickly.
pkg/kv/dist_sender.go, line 1391 at r6 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
It's surprising that a NotLeaseHolderError will be converted to a RangeNotFoundError and propagated immediately but a RangeNotFoundError won't be propagated (see comment above). Should we change this to a SendError and remove the handling of RangeNotFoundErrors entirely from sendPartialBatch?
+1 (I think, haven't thought through this part fully)
Force-pushed aae8c5b to ab08379 (compare)
Force-pushed ab08379 to 8b083e1 (compare)
TFTRs!
Reviewable status: complete! 0 of 0 LGTMs obtained
pkg/cmd/roachtest/election.go, line 60 at r6 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
Why so long? The point of this test is that the cluster should be able to recover quickly.
All I know is that 5s is too short. I assume this is because Gossip needs a second to bootstrap (we start the processes asynchronously). I'll take this out and we can look at that separately.
pkg/kv/dist_sender.go, line 1117 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Should we make this a VEventf?
Sure, why not. Doubt it matters, but doesn't hurt.
pkg/kv/dist_sender.go, line 1360 at r6 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
The effect of this change is that cases where all replicas return RangeNotFoundErrors will now result in a SendError once the transport is exhausted. I think that's ok because we treat them the same in sendPartialBatch, but I want to make sure that's your intention.
Yes, I think that's fine.
pkg/kv/dist_sender.go, line 1362 at r6 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
"caused by"
Done.
pkg/kv/dist_sender.go, line 1371 at r6 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
In your test failure, was the replica that the request continued to get routed to the leaseholder? I'm wondering if we should propagate the error immediately if we think it is the leaseholder but it returns a RangeNotFoundError.
No, in my testing this code never fired. But if it were cached as leaseholder we'd try it first on every subsequent attempt. If the real leaseholder were, say, the second replica, we'd never evict the cache and always pay a round-trip to that first node.
The real fix here is updating the leaseholder cache on successful replies, but I want to keep this diff small.
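As a rough illustration of what that follow-up could look like, here is a toy sketch; the types (leaseCache, rangeID, storeID) and the gating condition are hypothetical stand-ins, not DistSender's actual leaseHolderCache API:

```go
// Sketch: after a successful reply from a request that can only have
// succeeded on the leaseholder, make the cache point at the replica that
// answered. Gated on a mismatch since in steady state the cache is already
// correct.
package distsketch

type (
	rangeID int64
	storeID int32
)

type leaseCache struct {
	m map[rangeID]storeID
}

func (c *leaseCache) lookup(id rangeID) (storeID, bool) {
	s, ok := c.m[id]
	return s, ok
}

func (c *leaseCache) update(id rangeID, s storeID) {
	c.m[id] = s
}

// maybeUpdateLeaseholder is called after a successful response.
// requiresLeaseholder says whether the batch could only have succeeded on
// the leaseholder (for example, because it contained writes).
func maybeUpdateLeaseholder(
	c *leaseCache, rid rangeID, respondent storeID, requiresLeaseholder bool,
) {
	if !requiresLeaseholder {
		return
	}
	// Avoid churn: only write when the cached entry actually disagrees.
	if cached, ok := c.lookup(rid); ok && cached == respondent {
		return
	}
	c.update(rid, respondent)
}
```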
pkg/kv/dist_sender.go, line 1374 at r6 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Should this be moved a line down?
This message was in the right place when I needed it (since I wanted to know what was going on) but now it seems right to move it. Done.
pkg/kv/dist_sender.go, line 1391 at r6 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
+1 (I think, haven't thought through this part fully)
Done.
pkg/storage/store.go, line 144 at r5 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Is this a gofmt 1.10 vs 1.11 thing?
Not sure, but CI doesn't like it, so removed.
Force-pushed 8b083e1 to 1db9737 (compare)
Reviewable status: complete! 0 of 0 LGTMs obtained
pkg/cmd/roachtest/election.go, line 60 at r6 (raw file):
Previously, tschottdorf (Tobias Schottdorf) wrote…
All I know is that 5s is too short. I assume this is because Gossip needs a second to bootstrap (we start the processes asynchronously). I'll take this out and we can look at that separately.
OK. We should probably increase the default connect timeout then. 5s is too short, since connections in the real world may need to wait for the lease timeout for the cluster to become live. (Also I see that that env var is defined as a String interpreted as integer seconds when it should probably be a Duration instead)
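On that last aside, a hedged sketch of accepting the env var as a Duration while staying compatible with a bare integer-seconds value; the function name and fallback policy are illustrative, not the actual CLI code:

```go
// Sketch: parse a connect-timeout env var as a time.Duration, falling back
// to interpreting a bare integer as seconds for backwards compatibility.
package distsketch

import (
	"os"
	"strconv"
	"time"
)

func connectTimeout(defaultTimeout time.Duration) time.Duration {
	v := os.Getenv("COCKROACH_CONNECT_TIMEOUT")
	if v == "" {
		return defaultTimeout
	}
	if d, err := time.ParseDuration(v); err == nil {
		return d // e.g. "120s" or "2m"
	}
	if secs, err := strconv.Atoi(v); err == nil {
		return time.Duration(secs) * time.Second // legacy "120"
	}
	return defaultTimeout
}
```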
pkg/storage/store.go, line 144 at r5 (raw file):
Previously, tschottdorf (Tobias Schottdorf) wrote…
Not sure, but CI doesn't like it, so removed.
Yes, it is gofmt 1.11. You can configure your editor to use cockroach/bin/gofmt or crlfmt to make sure it uses the right version.
Ack. Holding off on the merge since #31068 fails like clockwork on this PR in CI.
Force-pushed 9123dcf to 50bad5f (compare)
Force-pushed c0010ee to faea6f4 (compare)
This should be green soon. I added a commit that populates the leaseholder cache on successful responses, PTAL!
Reviewed 11 of 11 files at r7, 1 of 1 files at r8, 10 of 10 files at r9, 3 of 3 files at r10, 5 of 5 files at r11.
Reviewable status: complete! 0 of 0 LGTMs obtained
pkg/kv/dist_sender.go, line 1391 at r6 (raw file):
Previously, tschottdorf (Tobias Schottdorf) wrote…
Done.
The other half of this is that we no longer need a case for RangeNotFoundError in sendPartialBatch at all. Is there a blocker that's preventing us from removing that now?
pkg/kv/dist_sender.go, line 1371 at r11 (raw file):
// When a request that we know could only succeed on the leaseholder comes
// back as successful, make sure the leaseholder cache reflects this
// replica. In steady state, this should almost never be the case, and so we
This wording is a bit strange. This should almost always be the case, which is why we gate it, right?
pkg/kv/dist_sender.go, line 1386 at r11 (raw file):
// replica that has just been added but needs a snapshot to be caught up.
//
// We'll try other replicas which typically gives us the leaseholder, either
Glad we could remove that code.
pkg/kv/dist_sender.go, line 1396 at r11 (raw file):
// the range descriptor), but that's OK - this kind of error should be rare
// but when it happens, we want to populate the cache with something useful
// ASAP without waiting for another round trip.
Can we change the leaseHolder object to avoid this?
pkg/sql/logictest/testdata/logic_test/explain_analyze_plans, line 50 at r11 (raw file):
/5        NULL  6  {5}  5

# Verify data placement.
Expand on this comment a bit. Why does the output change?
Reviewable status: complete! 0 of 0 LGTMs obtained
pkg/sql/logictest/testdata/logic_test/explain_analyze_plans, line 50 at r11 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Expand on this comment a bit. Why does the output change?
It's a different table (kv vs kw).
And to answer @tschottdorf's original question in #31068 (comment), I imagine this makes the tests pass because it ensures that all the ranges/leases are in the DistSender's caches.
Reviewed 1 of 11 files at r7, 8 of 10 files at r9, 1 of 3 files at r10, 5 of 5 files at r11.
Reviewable status: complete! 0 of 0 LGTMs obtained
pkg/kv/dist_sender.go, line 433 at r11 (raw file):
}

type leaseHolderInfo struct {
These two things don't seem to fit together very well in one struct. I think it would make more sense to make them two separate parameters to sendToReplicas (or just recompute needsLeaseHolder on the fly from the batch when needed).
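To illustrate the "recompute on the fly" alternative, a toy sketch follows; the batch, request, and flag types are made up stand-ins for roachpb.BatchRequest, and the flag choice is an assumption:

```go
// Sketch: instead of threading a needsLeaseHolder flag around in a struct,
// derive it from the batch itself when it is needed.
package distsketch

type requestFlag int

const (
	isWrite requestFlag = 1 << iota
	isConsistentRead
)

type request struct{ flags requestFlag }

type batchRequest struct{ reqs []request }

// needsLeaseholder reports whether the batch can only succeed on the
// leaseholder, e.g. because it writes or performs consistent reads.
func (ba batchRequest) needsLeaseholder() bool {
	for _, r := range ba.reqs {
		if r.flags&(isWrite|isConsistentRead) != 0 {
			return true
		}
	}
	return false
}
```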
This was hiding the output if the invocation itself failed, which is when you wanted it most.

Release note: None
Release note: None
Release note: None
Previously, if a Batch RPC came back with a RangeNotFoundError, we would immediately stop trying to send to more replicas, evict the range descriptor, and start a new attempt after a back-off. This new attempt could end up using the same replica, so if the RangeNotFoundError persisted for some amount of time, so would the unsuccessful retries for requests to it as DistSender doesn't aggressively shuffle the replicas.

It turns out that there are such situations, and the election-after-restart roachtest spuriously hit one of them:

1. new replica receives a preemptive snapshot and the ConfChange
2. cluster restarts
3. now the new replica is in this state until the range wakes up, which may not happen for some time.
4. the first request to the range runs into the above problem

@nvanbenschoten: I think there is an issue to be filed about the tendency of DistSender to get stuck in unfortunate configurations.

Fixes cockroachdb#30613.

Release note (bug fix): Avoid repeatedly trying a replica that was found to be in the process of being added.
Whenever a successful response is received from an RPC that we know has to contact the leaseholder to succeed, update the leaseholder cache.

The immediate motivation for this is to be able to land the preceding commits, which greatly exacerbated (as in, added a much faster failure mode to)

```
make stress PKG=./pkg/sql/logictest TESTS=TestPlannerLogic/5node-dist/distsql_interleaved_join
```

However, the change is one we've wanted to make for a while; our caching and in particular the eviction of leaseholders has been deficient essentially ever since it was first introduced.

Touches cockroachdb#31068.

Release note: None
Force-pushed faea6f4 to 857b9c0 (compare)
Ok, ready for a final look. Happy with how this turned out.
Reviewable status: complete! 0 of 0 LGTMs obtained
pkg/cmd/roachtest/election.go, line 60 at r6 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
OK. We should probably increase the default connect timeout then. 5s is too short, since connections in the real world may need to wait for the lease timeout for the cluster to become live. (Also I see that that env var is defined as a String interpreted as integer seconds when it should probably be a Duration instead)
I'll keep this on the radar, but I'm going to leave it as-is on the off-chance that these timeouts were actually the same bug, hitting very early in one of the internal queries run by ./cockroach sql.
pkg/kv/dist_sender.go, line 1391 at r6 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
The other half of this is that we no longer need a case for RangeNotFoundError in sendPartialBatch at all. Is there a blocker that's preventing us from removing that now?
Good point, done.
pkg/kv/dist_sender.go, line 433 at r11 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
These two things don't seem to fit together very well in one struct. I think it would make more sense to make them two separate parameters to sendToReplicas (or just recompute needsLeaseHolder on the fly from the batch when needed).
Done.
pkg/kv/dist_sender.go, line 1371 at r11 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
This wording is a bit strange. This should almost always be the case, which is why we gate it, right?
Oops, fixed.
pkg/kv/dist_sender.go, line 1386 at r11 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Glad we could remove that code.
Done.
pkg/kv/dist_sender.go, line 1396 at r11 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Can we change the leaseHolder object to avoid this?
Done.
pkg/sql/logictest/testdata/logic_test/explain_analyze_plans, line 50 at r11 (raw file):
Previously, a-robinson (Alex Robinson) wrote…
It's a different table (kv vs kw). And to answer @tschottdorf's original question in #31068 (comment), I imagine this makes the tests pass because it ensures that all the ranges/leases are in the DistSender's caches.
Yeah, my only explanation here is that without this query, some ranges may just never have been touched and so the leaseholders aren't cached.
On the flip side, I don't know why this lease placement here is deterministic. Do these tests apply constraints or something like that? I mentioned this trick to @andreimatei and his eyebrow raised visibly, so perhaps he wants to comment.
(I'm not planning to block this PR on that, though).
Reviewed 8 of 10 files at r14, 1 of 3 files at r15, 6 of 6 files at r16.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale)
TFTR! bors r=nvanbenschoten,bdarnell
Reviewable status: complete! 1 of 0 LGTMs obtained
pkg/sql/logictest/testdata/logic_test/explain_analyze_plans, line 50 at r11 (raw file):
Previously, tschottdorf (Tobias Schottdorf) wrote…
Yeah, my only explanation here is that without this query, some ranges may just never have been touched and so the leaseholders aren't cached.
On the flip side, I don't know why this lease placement here is deterministic. Do these tests apply constraints or something like that? I mentioned this trick to @andreimatei and his eyebrow raised visibly, so perhaps he wants to comment.
(I'm not planning to block this PR on that, though).
why wouldn't the placement be deterministic? Doesn't the EXPERIMENTAL_RELOCATE do deterministic moves? And after that runs, I assume there's nothing to cause a change.
Reviewable status: complete! 1 of 0 LGTMs obtained
pkg/sql/logictest/testdata/logic_test/explain_analyze_plans, line 50 at r11 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
why wouldn't the placement be deterministic? Doesn't the EXPERIMENTAL_RELOCATE do deterministic moves? And after that runs, I assume there's nothing to cause a change.
Lease placement is deterministic because that's what EXPERIMENTAL_RELOCATE guarantees. It always transfers the lease to the first listed store, as specified in the original proposal https://github.com/cockroachdb/cockroach/blob/master/docs/RFCS/20170314_sql_split_syntax.md#3-alter-tableindex-experimental_relocate
Thanks for the clarifications, @a-robinson and @andreimatei.
Reviewable status: complete! 1 of 0 LGTMs obtained
pkg/sql/logictest/testdata/logic_test/explain_analyze_plans, line 50 at r11 (raw file):
Previously, a-robinson (Alex Robinson) wrote…
Lease placement is deterministic because that's what
EXPERIMENTAL_RELOCATE
guarantees. It always transfers the lease to the first listed store, as specified in the original proposal https://github.com/cockroachdb/cockroach/blob/master/docs/RFCS/20170314_sql_split_syntax.md#3-alter-tableindex-experimental_relocate
right, so there's no mystery here, is there?
these show experimental_ranges queries could be replaced with select count(1) for the purposes of populating the caches, but I don't know if that'd be better or worse for the test.
I'm mildly surprised that show experimental_ranges populates any caches, but I don't know how it's implemented.
Reviewed 8 of 10 files at r14, 1 of 3 files at r15, 6 of 6 files at r16.
Reviewable status: complete! 1 of 0 LGTMs obtained
Reviewable status: complete! 1 of 0 LGTMs obtained
pkg/sql/logictest/testdata/logic_test/explain_analyze_plans, line 50 at r11 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
right, so there's no mystery here, is there? these show experimental_ranges queries could be replaced with select count(1) for the purposes of populating the caches, but I don't know if that'd be better or worse for the test. I'm mildly surprised that show experimental_ranges populates any caches, but I don't know how it's implemented.
show ranges works for populating the caches because in order to populate the lease_holder column it has to send a LeaseInfo batch request to every range.
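A stripped-down sketch of that mechanism, for context only; the types and the per-range request function are hypothetical placeholders rather than the actual SHOW RANGES implementation:

```go
// Sketch: ask each range for its lease and record the leaseholder, which is
// how a SHOW ... RANGES-style query ends up warming the leaseholder cache.
package distsketch

import "context"

type rangeDesc struct{ rangeID int64 }

type leaseInfo struct{ leaseholderStoreID int32 }

func warmLeaseholderCache(
	ctx context.Context,
	ranges []rangeDesc,
	leaseInfoFor func(context.Context, rangeDesc) (leaseInfo, error),
	record func(rangeID int64, storeID int32),
) error {
	for _, d := range ranges {
		li, err := leaseInfoFor(ctx, d) // one LeaseInfo-style request per range
		if err != nil {
			return err
		}
		record(d.rangeID, li.leaseholderStoreID)
	}
	return nil
}
```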
31013: kv: try next replica on RangeNotFoundError r=nvanbenschoten,bdarnell a=tschottdorf

Previously, if a Batch RPC came back with a RangeNotFoundError, we would immediately stop trying to send to more replicas, evict the range descriptor, and start a new attempt after a back-off. This new attempt could end up using the same replica, so if the RangeNotFoundError persisted for some amount of time, so would the unsuccessful retries for requests to it as DistSender doesn't aggressively shuffle the replicas.

It turns out that there are such situations, and the election-after-restart roachtest spuriously hit one of them:

1. new replica receives a preemptive snapshot and the ConfChange
2. cluster restarts
3. now the new replica is in this state until the range wakes up, which may not happen for some time.
4. the first request to the range runs into the above problem

@nvanbenschoten: I think there is an issue to be filed about the tendency of DistSender to get stuck in unfortunate configurations.

Fixes #30613.

Release note (bug fix): Avoid repeatedly trying a replica that was found to be in the process of being added.

31187: roachtest: add synctest r=bdarnell a=tschottdorf

This new roachtest sets up a charybdefs on a single (Ubuntu) node and runs the `synctest` cli command against a nemesis that injects random I/O errors.

The synctest command is new. It simulates a Raft log and can be directed at a filesystem that is being hit with random failures. The workload essentially writes ascending keys (flushing each one to disk synchronously) until an I/O error occurs, at which point it re-opens the instance to verify that all persisted writes are still there. If the RocksDB instance was permanently corrupted, it switches to a new, pristine, directory.

This is used in the roachtest, but is also useful for manual use in user deployments in which we suspect there is a failure to persist data to disk.

This hasn't found anything, but it's fun to watch and also shows us a number of errors that we know and love from sentry.

Release note: None

31215: storage: deflake TestStoreRangeMergeWatcher r=tschottdorf a=benesch

This test could deadlock if the LHS replica on store2 was shut down before it processed the split at "b". Teach the test to wait for the LHS replica on store2 to process the split before blocking Raft traffic to it.

Fixes #31096.
Fixes #31149.
Fixes #31160.
Fixes #31167.

Release note: None

31217: importccl: add explicit default to mysql testdata timestamp r=dt a=dt

this makes the testdata work on mysql 8.0.2+, where the timestamp type no longer has the implicit defaults.

Release note: none.

31221: cluster: Create final cluster version for 2.1 r=bdarnell a=bdarnell

Release note: None

Co-authored-by: Tobias Schottdorf <[email protected]>
Co-authored-by: Nikhil Benesch <[email protected]>
Co-authored-by: David Taylor <[email protected]>
Co-authored-by: Ben Darnell <[email protected]>
Build succeeded
This is no longer flaky, perhaps as a result of cockroachdb#31013. Verified via:

$ roachprod create tobias-stress -n 10 --gce-machine-type=n1-standard-8 --local-ssd=false
$ make roachprod-stressrace PKG=./pkg/sql/distsqlplan TESTS=TestSpanResolverUsesCaches CLUSTER=tobias-stress
...
8431 runs so far, 0 failures, over 15m15s

Release note: None
31726: storage: unskip TestSpanResolverUsesCaches r=petermattis a=tschottdorf

This is no longer flaky, perhaps as a result of #31013. Verified via:

$ roachprod create tobias-stress -n 10 --gce-machine-type=n1-standard-8 --local-ssd=false
$ make roachprod-stressrace PKG=./pkg/sql/distsqlplan TESTS=TestSpanResolverUsesCaches CLUSTER=tobias-stress
...
8431 runs so far, 0 failures, over 15m15s

Release note: None

Co-authored-by: Tobias Schottdorf <[email protected]>
Previously, if a Batch RPC came back with a RangeNotFoundError, we would
immediately stop trying to send to more replicas, evict the range
descriptor, and start a new attempt after a back-off.
This new attempt could end up using the same replica, so if the
RangeNotFoundError persisted for some amount of time, so would the
unsuccessful retries for requests to it as DistSender doesn't aggressively
shuffle the replicas.
It turns out that there are such situations, and the election-after-restart
roachtest spuriously hit one of them:

1. new replica receives a preemptive snapshot and the ConfChange
2. cluster restarts
3. now the new replica is in this state until the range wakes up, which may not happen for some time.
4. the first request to the range runs into the above problem
@nvanbenschoten: I think there is an issue to be filed about the tendency
of DistSender to get stuck in unfortunate configurations.
Fixes #30613.
Release note (bug fix): Avoid repeatedly trying a replica that was found to
be in the process of being added.