Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cli: fix cockroach quit #26158

Merged
merged 1 commit into from
May 29, 2018
Merged

cli: fix cockroach quit #26158

merged 1 commit into from
May 29, 2018

Conversation

knz
Copy link
Contributor

@knz knz commented May 29, 2018

Informs/fixes #25870.
Fixes #26144.

This patch fixes the following:

  • the logic in doShutdown() aims to ignore errors caused by attempts
    connect to a server which is closing its gRPC channels, but was
    missing one case of such errors: during the initial check whether
    the node was running. This patch causes gRPC "closed connection"
    errors to become also ignored in that case.

  • previously if there was a transient gRPC error during a hard
    shutdown whereby the shutdown could still succeed, then cockroach quit would fail no matter what. This patch makes cockroach quit
    retry a hard shutdown.

  • the warning messages are now emitted on stderr (via log.Warningf)
    instead of stdout.

Release note (bug fix): fix a bug where cockroach quit would
erroneously fail even though the node already successfully shut down.

Release note (cli change): cockroach quit now emits warning message
on its standard error stream, not standard output.

@knz knz requested a review from tbg May 29, 2018 11:13
@knz knz requested a review from a team as a code owner May 29, 2018 11:13
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@tbg
Copy link
Member

tbg commented May 29, 2018

:lgtm:


Review status: all files reviewed at latest revision, all discussions resolved, all commit checks successful.


pkg/cli/start.go, line 1075 at r1 (raw file):

				return nil
			}
			if strings.Contains(err.Error(), "proceeding with hard shutdown") {

Could you extract this string constant so that it's shared with the other use? As is, it will rot too easily.


pkg/cli/start.go, line 1078 at r1 (raw file):

				// Something happened server-side that made the server decide on
				// a hard shutdown of its own. This is not an error, but still
				// inform the user.

Perhaps also mention that we don't fold this into IsClosedConnection because we don't know whether this error can only occur in that circumstance (don't think so).


Comments from Reviewable

This patch fixes the following:

- the logic in `doShutdown()` aims to ignore errors caused by attempts
  connect to a server which is closing its gRPC channels, but was
  missing one case of such errors: during the initial check whether
  the node was running. This patch causes gRPC "closed connection"
  errors to become also ignored in that case.

- previously if there was a transient gRPC error during a hard
  shutdown whereby the shutdown could still succeed, then `cockroach
  quit` would fail no matter what. This patch makes `cockroach quit`
  retry a hard shutdown.

- the warning messages are now emitted on stderr (via `log.Warningf`)
  instead of stdout.

Release note (bug fix): fix a bug where `cockroach quit` would
erroneously fail even though the node already successfully shut down.

Release note (cli change): `cockroach quit` now emits warning message
on its standard error stream, not standard output.
@knz knz force-pushed the 20180529-quit branch from 23238cb to 0d7603b Compare May 29, 2018 11:31
@knz
Copy link
Contributor Author

knz commented May 29, 2018

Review status: 0 of 1 files reviewed at latest revision, 2 unresolved discussions, all commit checks successful.


pkg/cli/start.go, line 1075 at r1 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

Could you extract this string constant so that it's shared with the other use? As is, it will rot too easily.

I made a mistake -- this string is not generated by the server. Solved the problem in a different way (retry hard shutdown).


pkg/cli/start.go, line 1078 at r1 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

Perhaps also mention that we don't fold this into IsClosedConnection because we don't know whether this error can only occur in that circumstance (don't think so).

n/a.


Comments from Reviewable

@knz
Copy link
Contributor Author

knz commented May 29, 2018

bors r+

@tbg
Copy link
Member

tbg commented May 29, 2018

Review status: all files reviewed at latest revision, all discussions resolved, some commit checks pending.


Comments from Reviewable

@craig
Copy link
Contributor

craig bot commented May 29, 2018

Build failed

@knz
Copy link
Contributor Author

knz commented May 29, 2018

bors failed on #26098, retrying

bors r+

@knz
Copy link
Contributor Author

knz commented May 29, 2018

bors r-

@craig
Copy link
Contributor

craig bot commented May 29, 2018

Canceled

@knz
Copy link
Contributor Author

knz commented May 29, 2018

(bors was stuck)

bors r+

@craig
Copy link
Contributor

craig bot commented May 29, 2018

Timed out

@knz
Copy link
Contributor Author

knz commented May 29, 2018

bors r+

@craig
Copy link
Contributor

craig bot commented May 29, 2018

Timed out (retrying...)

@@ -1045,6 +1045,9 @@ func doShutdown(ctx context.Context, c serverpb.AdminClient, onModes []int32) er
// out, or perhaps drops the connection while waiting). To that end, we first
// run a noop DrainRequest. If that fails, we give up.
if err := checkNodeRunning(ctx, c); err != nil {
if grpcutil.IsClosedConnection(err) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As @bdarnell asked on #26163, does this swallow the error if you run quit against a server that isn't running?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

answered to ben there. if the server does not exist the thing still fails.

@knz
Copy link
Contributor Author

knz commented May 29, 2018

bors r-

@knz
Copy link
Contributor Author

knz commented May 29, 2018

bors r+

@couchand
Copy link
Contributor

internal bors weirdness, retrying

bors r=knz

@craig
Copy link
Contributor

craig bot commented May 29, 2018

Build failed (retrying...)

craig bot pushed a commit that referenced this pull request May 29, 2018
26143: opt: Enable additional logic tests for opt configs r=andy-kimball a=andy-kimball

Enable more logic tests (orms -> snapshot_unrelated_update).
Fix various bugs and issues that were failing tests.


26158: cli: fix `cockroach quit` r=knz a=knz

Informs/fixes #25870.
Fixes  #26144.

This patch fixes the following:

- the logic in `doShutdown()` aims to ignore errors caused by attempts
  connect to a server which is closing its gRPC channels, but was
  missing one case of such errors: during the initial check whether
  the node was running. This patch causes gRPC "closed connection"
  errors to become also ignored in that case.

- previously if there was a transient gRPC error during a hard
  shutdown whereby the shutdown could still succeed, then `cockroach
  quit` would fail no matter what. This patch makes `cockroach quit`
  retry a hard shutdown.

- the warning messages are now emitted on stderr (via `log.Warningf`)
  instead of stdout.

Release note (bug fix): fix a bug where `cockroach quit` would
erroneously fail even though the node already successfully shut down.

Release note (cli change): `cockroach quit` now emits warning message
on its standard error stream, not standard output.

26165: roachtest: enable periodic heap prof output for kv/splits/nodes=3 r=tschottdorf a=petermattis

See #26081

Release note: None

26167: encryption: add cli command `cockroach gen encryption-key` r=windchan7 a=windchan7

Added cli command `cockroach gen encryption-key` to generate store keys for
encryption at rest.

Encryption CLI: #19783.
Release note: None

Co-authored-by: Andrew Kimball <[email protected]>
Co-authored-by: Raphael 'kena' Poss <[email protected]>
Co-authored-by: Peter Mattis <[email protected]>
Co-authored-by: Victor Chen <[email protected]>
@craig
Copy link
Contributor

craig bot commented May 29, 2018

Build succeeded

@craig craig bot merged commit 0d7603b into cockroachdb:master May 29, 2018
craig bot pushed a commit that referenced this pull request Jun 8, 2018
26163: release-2.0: cli: fix `cockroach quit` r=knz a=knz

Backport 1/1 commits from #26158.

This `cockroach quit` bug is likely to affect production users, so we need to back-port it.

/cc @cockroachdb/release


Co-authored-by: Raphael 'kena' Poss <[email protected]>
@knz knz deleted the 20180529-quit branch July 2, 2018 09:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants