Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage: eagerly GC replicas on raft transport errors #8172

Merged

Conversation

petermattis
Copy link
Collaborator

@petermattis petermattis commented Aug 1, 2016

When the Raft transport stream returns an error we can use that error as
an signal that the replica may need to be GC'd.

Suggested in #8130.


This change is Reviewable

@petermattis
Copy link
Collaborator Author

Review status: 0 of 1 files reviewed at latest revision, 1 unresolved discussion, some commit checks pending.


storage/replica.go, line 345 [r1] (raw file):

              r, toReplica, err)
          // TODO(peter): Find a better way of determining that this replica is
          // likely no longer a member of the range.

I've verified that this works, but it is fugly. Also, need to figure out how to test.


Comments from Reviewable

@tamird
Copy link
Contributor

tamird commented Aug 1, 2016

Test?

@petermattis
Copy link
Collaborator Author

petermattis commented Aug 1, 2016

@tamird Yes, a test is needed. Any advice on which existing storage test I should look to for inspiration?

@tbg
Copy link
Member

tbg commented Aug 1, 2016

ping #7489 since it's related. And when done, this change would also close #5789. LGTM once it's less fugly (I'm ok with string matching, but let's at least match on a better-defined string where the sender knows what the receiver does with it) and has a test.

@petermattis petermattis force-pushed the pmattis/gc-replica-on-error branch 2 times, most recently from 15931d6 to 2c84fac Compare August 1, 2016 15:34
@petermattis
Copy link
Collaborator Author

Ok, I made this less fugly. PTAL.

Still need to figure out how to test. TestReplicateRemovedNodeDisruptiveElection almost works, but the unreplication causes the replica to be GC'd during the normal change-replicas processing.

@tbg
Copy link
Member

tbg commented Aug 1, 2016

You could introduce a knob that omits the GC during normal processing.
I think we should also unwind some of the changes in #7533 and #7507.

@tbg
Copy link
Member

tbg commented Aug 1, 2016

LGTM to the "meat" which is currently here.

@@ -145,6 +149,10 @@ func updatesTimestampCache(r roachpb.Request) bool {
return updatesTimestampCacheMethods[m]
}

const replicaTooOld = "replica too old, discarding message"

var errReplicaTooOld = grpc.Errorf(codes.Aborted, replicaTooOld)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe just create those when you need them. You're not using them as a sentinel, so the fact that we use both the error and the error message is a bit confusing.

@bdarnell
Copy link
Contributor

bdarnell commented Aug 1, 2016

:lgtm:


Reviewed 4 of 4 files at r2.
Review status: all files reviewed at latest revision, 5 unresolved discussions, some commit checks pending.


build/check-style.sh, line 111 [r2] (raw file):

    grep -vE '\.pb\.gw\.go:[0-9]+: declaration of "?ctx"? shadows' | \
    grep -vE 'declaration of "?(pE|e)rr"? shadows' | \
    grep -vE '^(server/(serverpb/admin|serverpb/status|admin|status)|(ts|ts/tspb)/(server|timeseries)|storage/replica)\..*\go:[0-9]+: constant [0-9]+ not a string in call to Errorf'

Why do we need to do this for grpc.Errorf but go vet is fine with our logging functions that include ctx as their first parameter?


storage/replica.go, line 152 [r2] (raw file):

}

const replicaTooOld = "replica too old, discarding message"

Maybe say "sender replica" instead of just "replica", since I would otherwise expect error messages to refer to the recipient replica's state.


storage/replica.go, line 351 [r2] (raw file):

          // NB: We can't compare err == errReplicaTooOld because grpc.rpcError is
          // a pointer and the pointer comparison will fail.
          if grpc.Code(err) == codes.Aborted && grpc.ErrorDesc(err) == replicaTooOld {

You could define grpcErrorEqual as grpc.Code(a) == grpc.Code(b) && grpc.ErrorDesc(a) == grpc.ErrorDesc(b) so we don't need to expose the separate replicaTooOld constant or duplicate the Aborted constant.


Comments from Reviewable

@petermattis petermattis force-pushed the pmattis/gc-replica-on-error branch from 2c84fac to b9d4c95 Compare August 1, 2016 16:07
@petermattis
Copy link
Collaborator Author

Added a test which didn't need additional knobs. Stress tested it and everything seems stable.


Review status: 2 of 5 files reviewed at latest revision, 5 unresolved discussions, some commit checks pending.


build/check-style.sh, line 111 [r2] (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Why do we need to do this for grpc.Errorf but go vet is fine with our logging functions that include ctx as their first parameter?

I don't know, but it does. @tamird?

storage/replica.go, line 152 [r2] (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Maybe say "sender replica" instead of just "replica", since I would otherwise expect error messages to refer to the recipient replica's state.

Done.

storage/replica.go, line 154 [r2] (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

Maybe just create those when you need them. You're not using them as a sentinel, so the fact that we use both the error and the error message is a bit confusing.

Took Ben's suggestion.

storage/replica.go, line 351 [r2] (raw file):

Previously, bdarnell (Ben Darnell) wrote…

You could define grpcErrorEqual as grpc.Code(a) == grpc.Code(b) && grpc.ErrorDesc(a) == grpc.ErrorDesc(b) so we don't need to expose the separate replicaTooOld constant or duplicate the Aborted constant.

Done, though I did `err1.String() == err2.String()`.

Comments from Reviewable

@bdarnell
Copy link
Contributor

bdarnell commented Aug 1, 2016

LGTM


Reviewed 3 of 3 files at r3.
Review status: all files reviewed at latest revision, 6 unresolved discussions, some commit checks pending.


storage/client_raft_test.go, line 1603 [r3] (raw file):

  }

  // Stop node 3; while it is down remove the range from it. Since the ndoe is

s/ndoe/node/


storage/client_raft_test.go, line 1619 [r3] (raw file):

  mtc.restartStore(3)

  util.SucceedsSoon(t, func() error {

It would be nice if we could be more explicit that the new error check is the reason that the replica was removed.


util/grpcutil/grpc_util.go, line 52 [r3] (raw file):

      return err1 == err2
  }
  return err1.Error() == err2.Error()

This isn't GPRC-specific, so this probably isn't the right place for it. Or you could switch to the grpc-specific methods (which are slightly faster since they don't allocate)


Comments from Reviewable

@petermattis petermattis force-pushed the pmattis/gc-replica-on-error branch from b9d4c95 to fdecc0b Compare August 1, 2016 16:29
@petermattis
Copy link
Collaborator Author

Review status: 3 of 5 files reviewed at latest revision, 6 unresolved discussions, some commit checks pending.


storage/client_raft_test.go, line 1603 [r3] (raw file):

Previously, bdarnell (Ben Darnell) wrote…

s/ndoe/node/

Done.

storage/client_raft_test.go, line 1619 [r3] (raw file):

Previously, bdarnell (Ben Darnell) wrote…

It would be nice if we could be more explicit that the new error check is the reason that the replica was removed.

Any suggestions on how to do that? I suppose I can disable the replica scanner so that the GC action has to be driven by an event.

storage/replica.go, line 345 [r1] (raw file):

Previously, petermattis (Peter Mattis) wrote…

I've verified that this works, but it is fugly. Also, need to figure out how to test.

Done.

util/grpcutil/grpc_util.go, line 52 [r3] (raw file):

Previously, bdarnell (Ben Darnell) wrote…

This isn't GPRC-specific, so this probably isn't the right place for it. Or you could switch to the grpc-specific methods (which are slightly faster since they don't allocate)

Ah, I misread what `grpc.{Code,ErrorDesc}` do for non-grpc errors. Switched to using those methods.

Comments from Reviewable

@tamird
Copy link
Contributor

tamird commented Aug 1, 2016

:lgtm:


Reviewed 2 of 4 files at r2, 1 of 3 files at r3, 2 of 2 files at r4.
Review status: all files reviewed at latest revision, 13 unresolved discussions, some commit checks failed.


build/check-style.sh, line 111 [r2] (raw file):

Previously, petermattis (Peter Mattis) wrote…

I don't know, but it does. @tamird?

I haven't a clue. Note that in 1.7 go vet is also fine with _this_ case (I have a branch locally), so there might be a special case in vet.

storage/client_raft_test.go, line 1619 [r3] (raw file):

Previously, petermattis (Peter Mattis) wrote…

Any suggestions on how to do that? I suppose I can disable the replica scanner so that the GC action has to be driven by an event.

Looks like this is done.

storage/client_raft_test.go, line 1604 [r4] (raw file):

  // Stop node 3; while it is down remove the range from it. Since the node is
  // down it won't see the removal and clean up its replica.

and won't clean up its replica


storage/client_raft_test.go, line 1615 [r4] (raw file):

  mtc.waitForValues(roachpb.Key("a"), []int64{16, 16, 16, 5})

  // Restart the node 3. The removed range will start talking to the other

s/the//. s/range/replica/


storage/client_raft_test.go, line 1618 [r4] (raw file):

  // replicas and determine it needs to be GC'd.
  mtc.restartStore(3)
  mtc.stores[3].SetReplicaScannerDisabled(true)

don't you want to do this before restarting it?


storage/client_raft_test.go, line 1622 [r4] (raw file):

  util.SucceedsSoon(t, func() error {
      replica, err := mtc.stores[3].GetReplica(rangeID)
      if err != nil {

can this check for a specific whitelisted error?


storage/client_raft_test.go, line 1625 [r4] (raw file):

          return nil
      }
      return errors.Errorf("found %s", replica)

instead of "found" how about something more descriptive? this message will not be helpful in a year's time.


storage/replica.go, line 346 [r4] (raw file):

  r.raftSender = store.ctx.Transport.MakeSender(
      func(err error, toReplica roachpb.ReplicaDescriptor) {
          ctx := context.Background()

I think this might want to be context.TODO - should be possible to plumb it in from the transport.


storage/replica.go, line 354 [r4] (raw file):

              if err := r.store.replicaGCQueue.Add(r, 1.0); err != nil {
                  log.Errorf(ctx, "%s: unable to add to GC queue: %s", r, err)

unable to add %d


storage/replica.go, line 360 [r4] (raw file):

          if err != nil && !grpcutil.IsClosedConnection(err) {
              log.Warningf(ctx,
                  "%s: outgoing raft transport stream to %s closed by the remote: %v",

s/%v/%s/ for consistency with logging above.


Comments from Reviewable

@petermattis petermattis force-pushed the pmattis/gc-replica-on-error branch from fdecc0b to ee50e31 Compare August 1, 2016 17:55
@tamird
Copy link
Contributor

tamird commented Aug 1, 2016

Reviewed 2 of 2 files at r5.
Review status: all files reviewed at latest revision, 8 unresolved discussions.


storage/client_raft_test.go, line 1628 [r5] (raw file):

          return err
      }
      return errors.Errorf("found %s, waiting for it to be GC'd: %v", replica, err)

err is guaranteed to be nil here.


Comments from Reviewable

@petermattis petermattis force-pushed the pmattis/gc-replica-on-error branch from ee50e31 to 0de0aa1 Compare August 1, 2016 18:21
@petermattis
Copy link
Collaborator Author

Review status: 4 of 5 files reviewed at latest revision, 14 unresolved discussions, some commit checks pending.


storage/client_raft_test.go, line 1619 [r3] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

Looks like this is done.

Well, one part. Curious if Ben has something better.

storage/client_raft_test.go, line 1604 [r4] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

and won't clean up its replica

Done.

storage/client_raft_test.go, line 1615 [r4] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

s/the//. s/range/replica/

Done.

storage/client_raft_test.go, line 1618 [r4] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

don't you want to do this before restarting it?

I do, but I can't. `restartStore` creates a new store. I think this is the best I can do without plumbing a bunch of stuff to allow starting the store with the scanner disabled.

storage/client_raft_test.go, line 1622 [r4] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

can this check for a specific whitelisted error?

Done. Added an explicit check for "not found".

storage/client_raft_test.go, line 1625 [r4] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

instead of "found" how about something more descriptive? this message will not be helpful in a year's time.

Expanded the message.

storage/client_raft_test.go, line 1628 [r5] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

err is guaranteed to be nil here.

Done.

storage/replica.go, line 346 [r4] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

I think this might want to be context.TODO - should be possible to plumb it in from the transport.

Done.

storage/replica.go, line 354 [r4] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

unable to add %d

What's the `%d` for?

storage/replica.go, line 360 [r4] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

s/%v/%s/ for consistency with logging above.

Done.

Comments from Reviewable

@tamird
Copy link
Contributor

tamird commented Aug 1, 2016

Reviewed 1 of 1 files at r6.
Review status: all files reviewed at latest revision, 6 unresolved discussions, some commit checks pending.


storage/replica.go, line 354 [r4] (raw file):

Previously, petermattis (Peter Mattis) wrote…

What's the %d for?

repID, as above? alternatively, remove it from above.

Comments from Reviewable

@petermattis petermattis force-pushed the pmattis/gc-replica-on-error branch from 0de0aa1 to bda9196 Compare August 1, 2016 18:33
@bdarnell
Copy link
Contributor

bdarnell commented Aug 1, 2016

Reviewed 1 of 1 files at r6.
Review status: 4 of 5 files reviewed at latest revision, 5 unresolved discussions, some commit checks pending.


storage/client_raft_test.go, line 1619 [r3] (raw file):

Previously, petermattis (Peter Mattis) wrote…

Well, one part. Curious if Ben has something better.

Disabling the scanner is probably good enough. The only other way I could think of would be to expose counters or something for the different ways in which replicas could get added to the GC queue.

storage/client_raft_test.go, line 1618 [r4] (raw file):

Previously, petermattis (Peter Mattis) wrote…

I do, but I can't. restartStore creates a new store. I think this is the best I can do without plumbing a bunch of stuff to allow starting the store with the scanner disabled.

The plumbing shouldn't be too bad. The `multiTestContext` already has what you need, we just need a boolean in `StoreContext`. Then you could do
sc := storage.TestStoreContext()
sc.ScannerDisabled = true
mtc := multiTestContext{storeContext: &sc}
mtc.start(t, 4)

Comments from Reviewable

@petermattis petermattis force-pushed the pmattis/gc-replica-on-error branch from bda9196 to 8d2004e Compare August 1, 2016 19:28
@petermattis
Copy link
Collaborator Author

Review status: 2 of 7 files reviewed at latest revision, 5 unresolved discussions, some commit checks pending.


storage/client_raft_test.go, line 1618 [r4] (raw file):

Previously, bdarnell (Ben Darnell) wrote…

The plumbing shouldn't be too bad. The multiTestContext already has what you need, we just need a boolean in StoreContext. Then you could do

sc := storage.TestStoreContext()
sc.ScannerDisabled = true
mtc := multiTestContext{storeContext: &sc}
mtc.start(t, 4)
Done.

storage/replica.go, line 354 [r4] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

repID, as above? alternatively, remove it from above.

I'll add it to the error. No harm in being a bit more verbose (the same can be said for your original comment).

Comments from Reviewable

@petermattis petermattis force-pushed the pmattis/gc-replica-on-error branch from 8d2004e to 98d673d Compare August 1, 2016 19:29
@tamird
Copy link
Contributor

tamird commented Aug 1, 2016

Reviewed 1 of 1 files at r7, 3 of 4 files at r8.
Review status: 6 of 7 files reviewed at latest revision, 5 unresolved discussions, some commit checks pending.


storage/replica.go, line 354 [r4] (raw file):

Previously, petermattis (Peter Mattis) wrote…

I'll add it to the error. No harm in being a bit more verbose (the same can be said for your original comment).

Point taken!

Comments from Reviewable

@tamird
Copy link
Contributor

tamird commented Aug 1, 2016

Reviewed 1 of 4 files at r8.
Review status: all files reviewed at latest revision, 4 unresolved discussions, some commit checks failed.


Comments from Reviewable

@petermattis petermattis force-pushed the pmattis/gc-replica-on-error branch from 98d673d to cefda3b Compare August 1, 2016 20:02
When the Raft transport stream returns an error we can use that error as
a signal that the replica may need to be GC'd.

Suggested in cockroachdb#8130.
Fixes cockroachdb#5789.
@petermattis petermattis merged commit e3b70d0 into cockroachdb:master Aug 1, 2016
@petermattis petermattis deleted the pmattis/gc-replica-on-error branch August 1, 2016 20:19
@tamird tamird mentioned this pull request Aug 15, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants