Skip to content

Commit

Permalink
Browse files Browse the repository at this point in the history
…db#72731

72349: cloud,ccl: fix improperly wrapped errors r=HonoreDB a=rafiss

refs cockroachdb#42510

I'm working on a linter that detects errors that are not wrapped
correctly, and it discovered these.

Release note: None

72609: server: mention cli-side --drain-wait in corresponding server-side settings r=knz a=tbg

See (internal) https://github.com/cockroachlabs/support/issues/1318.

Release note: None


72703: roachprod: replace fatal logging by propagating errors r=[tbg,rail] a=healthy-pod

Fatal logging prevented errors from being properly propagated back
to the binary error wrapper.

Release note: None

72731: gitignore: ignore `roachprod logs` local log buffers r=rickystewart a=erikgrinaker

The `roachprod logs` command fetches cluster logs into the local
directory `<cluster>.logs` before displaying them. This patch ignores
those logs.

Release note: None

Co-authored-by: Rafi Shamim <[email protected]>
Co-authored-by: Tobias Grieger <[email protected]>
Co-authored-by: Ahmad Abedalqader <[email protected]>
Co-authored-by: Erik Grinaker <[email protected]>
  • Loading branch information
5 people committed Nov 16, 2021
5 parents 370d1ac + 4957e07 + 687f243 + 3d141a8 + 36de3eb commit 525322a
Show file tree
Hide file tree
Showing 26 changed files with 219 additions and 133 deletions.
3 changes: 3 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -39,3 +39,6 @@ build/Railroad.jar

# Per-user .bazelrc
/.bazelrc.user

# Local disk buffers for "roachprod logs" command
/*.logs
6 changes: 3 additions & 3 deletions docs/generated/settings/settings-for-tenants.txt
Original file line number Diff line number Diff line change
Expand Up @@ -64,9 +64,9 @@ server.oidc_authentication.provider_url string sets OIDC provider URL ({provide
server.oidc_authentication.redirect_url string https://localhost:8080/oidc/v1/callback sets OIDC redirect URL via a URL string or a JSON string containing a required `redirect_urls` key with an object that maps from region keys to URL strings (URLs should point to your load balancer and must route to the path /oidc/v1/callback)
server.oidc_authentication.scopes string openid sets OIDC scopes to include with authentication request (space delimited list of strings, required to start with `openid`)
server.rangelog.ttl duration 720h0m0s if nonzero, range log entries older than this duration are deleted every 10m0s. Should not be lowered below 24 hours.
server.shutdown.drain_wait duration 0s the amount of time a server waits in an unready state before proceeding with the rest of the shutdown process
server.shutdown.lease_transfer_wait duration 5s the amount of time a server waits to transfer range leases before proceeding with the rest of the shutdown process
server.shutdown.query_wait duration 10s the server will wait for at least this amount of time for active queries to finish
server.shutdown.drain_wait duration 0s the amount of time a server waits in an unready state before proceeding with the rest of the shutdown process (note that the --drain-wait parameter for cockroach node drain may need adjustment after changing this setting)
server.shutdown.lease_transfer_wait duration 5s the amount of time a server waits to transfer range leases before proceeding with the rest of the shutdown process (note that the --drain-wait parameter for cockroach node drain may need adjustment after changing this setting)
server.shutdown.query_wait duration 10s the server will wait for at least this amount of time for active queries to finish (note that the --drain-wait parameter for cockroach node drain may need adjustment after changing this setting)
server.time_until_store_dead duration 5m0s the time after which if there is no new gossiped information about a store, it is considered dead
server.user_login.timeout duration 10s timeout after which client authentication times out if some system range is unavailable (0 = no timeout)
server.web_session.auto_logout.timeout duration 168h0m0s the duration that web sessions will survive before being periodically purged, since they were last used
Expand Down
6 changes: 3 additions & 3 deletions docs/generated/settings/settings.html
Original file line number Diff line number Diff line change
Expand Up @@ -69,9 +69,9 @@
<tr><td><code>server.oidc_authentication.redirect_url</code></td><td>string</td><td><code>https://localhost:8080/oidc/v1/callback</code></td><td>sets OIDC redirect URL via a URL string or a JSON string containing a required `redirect_urls` key with an object that maps from region keys to URL strings (URLs should point to your load balancer and must route to the path /oidc/v1/callback) </td></tr>
<tr><td><code>server.oidc_authentication.scopes</code></td><td>string</td><td><code>openid</code></td><td>sets OIDC scopes to include with authentication request (space delimited list of strings, required to start with `openid`)</td></tr>
<tr><td><code>server.rangelog.ttl</code></td><td>duration</td><td><code>720h0m0s</code></td><td>if nonzero, range log entries older than this duration are deleted every 10m0s. Should not be lowered below 24 hours.</td></tr>
<tr><td><code>server.shutdown.drain_wait</code></td><td>duration</td><td><code>0s</code></td><td>the amount of time a server waits in an unready state before proceeding with the rest of the shutdown process</td></tr>
<tr><td><code>server.shutdown.lease_transfer_wait</code></td><td>duration</td><td><code>5s</code></td><td>the amount of time a server waits to transfer range leases before proceeding with the rest of the shutdown process</td></tr>
<tr><td><code>server.shutdown.query_wait</code></td><td>duration</td><td><code>10s</code></td><td>the server will wait for at least this amount of time for active queries to finish</td></tr>
<tr><td><code>server.shutdown.drain_wait</code></td><td>duration</td><td><code>0s</code></td><td>the amount of time a server waits in an unready state before proceeding with the rest of the shutdown process (note that the --drain-wait parameter for cockroach node drain may need adjustment after changing this setting)</td></tr>
<tr><td><code>server.shutdown.lease_transfer_wait</code></td><td>duration</td><td><code>5s</code></td><td>the amount of time a server waits to transfer range leases before proceeding with the rest of the shutdown process (note that the --drain-wait parameter for cockroach node drain may need adjustment after changing this setting)</td></tr>
<tr><td><code>server.shutdown.query_wait</code></td><td>duration</td><td><code>10s</code></td><td>the server will wait for at least this amount of time for active queries to finish (note that the --drain-wait parameter for cockroach node drain may need adjustment after changing this setting)</td></tr>
<tr><td><code>server.time_until_store_dead</code></td><td>duration</td><td><code>5m0s</code></td><td>the time after which if there is no new gossiped information about a store, it is considered dead</td></tr>
<tr><td><code>server.user_login.timeout</code></td><td>duration</td><td><code>10s</code></td><td>timeout after which client authentication times out if some system range is unavailable (0 = no timeout)</td></tr>
<tr><td><code>server.web_session.auto_logout.timeout</code></td><td>duration</td><td><code>168h0m0s</code></td><td>the duration that web sessions will survive before being periodically purged, since they were last used</td></tr>
Expand Down
1 change: 1 addition & 0 deletions pkg/ccl/backupccl/BUILD.bazel
Original file line number Diff line number Diff line change
Expand Up @@ -199,6 +199,7 @@ go_test(
"//pkg/sql/execinfra",
"//pkg/sql/execinfrapb",
"//pkg/sql/parser",
"//pkg/sql/pgwire/pgerror",
"//pkg/sql/randgen",
"//pkg/sql/rowenc",
"//pkg/sql/rowflow",
Expand Down
2 changes: 1 addition & 1 deletion pkg/ccl/backupccl/backup_processor.go
Original file line number Diff line number Diff line change
Expand Up @@ -423,7 +423,7 @@ func runBackupProcessor(
// TimeoutError improves the opaque `context deadline exceeded` error
// message so use that instead.
if errors.HasType(exportRequestErr, (*contextutil.TimeoutError)(nil)) {
return errors.Wrapf(exportRequestErr, "timeout: %s", exportRequestErr.Error())
return errors.Wrap(exportRequestErr, "export request timeout")
}
return errors.Wrapf(exportRequestErr, "exporting %s", span.span)
}
Expand Down
3 changes: 2 additions & 1 deletion pkg/ccl/backupccl/backup_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -60,6 +60,7 @@ import (
"github.com/cockroachdb/cockroach/pkg/sql/catalog/systemschema"
"github.com/cockroachdb/cockroach/pkg/sql/catalog/tabledesc"
"github.com/cockroachdb/cockroach/pkg/sql/execinfra"
"github.com/cockroachdb/cockroach/pkg/sql/pgwire/pgerror"
"github.com/cockroachdb/cockroach/pkg/sql/randgen"
"github.com/cockroachdb/cockroach/pkg/sql/sem/tree"
"github.com/cockroachdb/cockroach/pkg/sql/stats"
Expand Down Expand Up @@ -3183,7 +3184,7 @@ CREATE TYPE d.greeting AS ENUM ('hello', 'howdy', 'hi');
return errors.New("expected error, found none")
}
if !testutils.IsError(err, tc.expectedError[i]) {
return errors.Newf("expected error %v, found %v", tc.expectedError[i], err)
return errors.Newf("expected error %q, found %v", tc.expectedError[i], pgerror.FullError(err))
}
return nil
})
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -112,7 +112,7 @@ BACKUP TABLE d2.t TO 'nodelocal://0/d2-table';
exec-sql server=s2 user=testuser
SHOW BACKUP 'http://COCKROACH_TEST_HTTP_SERVER/'
----
pq: http storage file does not exist: error response from server: 404 Not Found "404 page not found\n": external_storage: file doesn't exist
pq: error response from server: 404 Not Found "404 page not found\n": http storage file does not exist: external_storage: file doesn't exist

# Test that implicit access is disallowed when the testing knob is not set.
new-server name=s3 share-io-dir=s1
Expand Down
4 changes: 2 additions & 2 deletions pkg/ccl/changefeedccl/helpers_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -104,7 +104,7 @@ func stripTsFromPayloads(payloads []cdctest.TestFeedMessage) ([]string, error) {
var value []byte
var message map[string]interface{}
if err := gojson.Unmarshal(m.Value, &message); err != nil {
return nil, errors.Newf(`unmarshal: %s: %s`, m.Value, err)
return nil, errors.Wrapf(err, `unmarshal: %s`, m.Value)
}
delete(message, "updated")
value, err := reformatJSON(message)
Expand All @@ -121,7 +121,7 @@ func extractUpdatedFromValue(value []byte) (float64, error) {
Updated string `json:"updated"`
}
if err := gojson.Unmarshal(value, &updatedRaw); err != nil {
return -1, errors.Newf(`unmarshal: %s: %s`, value, err)
return -1, errors.Wrapf(err, `unmarshal: %s`, value)
}
updatedVal, err := strconv.ParseFloat(updatedRaw.Updated, 64)
if err != nil {
Expand Down
4 changes: 2 additions & 2 deletions pkg/ccl/importccl/import_stmt_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -1433,7 +1433,7 @@ END
intoCols: "a",
typ: "PGCOPY",
contents: "randomvalue\n",
errString: "encountered error invalid input value for enum greeting",
errString: "invalid input value for enum greeting",
},
}

Expand Down Expand Up @@ -3011,7 +3011,7 @@ func TestImportIntoCSV(t *testing.T) {
`IMPORT INTO t (a, b) CSV DATA (%s) WITH disable_glob_matching`,
testFiles.filesUsingWildcard,
` WITH disable_glob_matching`,
"pq: (.+) no such file or directory",
"pq: (.+)no such file or directory: nodelocal storage file does not exist:",
},
} {
t.Run(tc.name, func(t *testing.T) {
Expand Down
2 changes: 1 addition & 1 deletion pkg/ccl/importccl/read_import_mysqlout.go
Original file line number Diff line number Diff line change
Expand Up @@ -254,7 +254,7 @@ func (d *delimitedConsumer) FillDatums(
if err != nil {
col := conv.VisibleCols[datumIdx]
return newImportRowError(
fmt.Errorf("error %s while parse %q as %s", err, col.GetName(), col.GetType().SQLString()),
errors.Wrapf(err, "error while parse %q as %s", col.GetName(), col.GetType().SQLString()),
string(data), rowNum)
}
}
Expand Down
8 changes: 5 additions & 3 deletions pkg/ccl/importccl/read_import_pgcopy.go
Original file line number Diff line number Diff line change
Expand Up @@ -333,9 +333,11 @@ func (p *pgCopyConsumer) FillDatums(
conv.Datums[i], err = rowenc.ParseDatumStringAs(conv.VisibleColTypes[i], *s, conv.EvalCtx)
if err != nil {
col := conv.VisibleCols[i]
return newImportRowError(fmt.Errorf(
"encountered error %s when attempting to parse %q as %s",
err.Error(), col.GetName(), col.GetType().SQLString()), data.String(), rowNum)
return newImportRowError(errors.Wrapf(
err,
"encountered error when attempting to parse %q as %s",
col.GetName(), col.GetType().SQLString(),
), data.String(), rowNum)
}
}
}
Expand Down
2 changes: 1 addition & 1 deletion pkg/ccl/multiregionccl/datadriven_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -571,7 +571,7 @@ func parseReplicasFromRange(

leaseHolder, err := tc.FindRangeLeaseHolder(desc, nil)
if err != nil {
return nil, errors.Newf("could not get leaseholder: %v", err)
return nil, errors.Wrap(err, "could not get leaseholder")
}
leaseHolderIdx := nodeIdToIdx(t, tc, leaseHolder.NodeID)
replicaMap[leaseHolderIdx] = replicaTypeLeaseholder
Expand Down
3 changes: 2 additions & 1 deletion pkg/cli/quit.go
Original file line number Diff line number Diff line change
Expand Up @@ -113,7 +113,8 @@ func doDrain(
})
if errors.HasType(err, (*contextutil.TimeoutError)(nil)) || grpcutil.IsTimeout(err) {
log.Infof(ctx, "drain timed out: %v", err)
err = errors.New("drain timeout")
err = errors.New("drain timeout, consider adjusting --drain-wait, especially under " +
"custom server.shutdown.{drain,query,lease_transfer}_wait cluster settings")
}
return
}
Expand Down
6 changes: 5 additions & 1 deletion pkg/cloud/amazon/s3_storage.go
Original file line number Diff line number Diff line change
Expand Up @@ -433,7 +433,11 @@ func (s *s3Storage) openStreamAt(
switch aerr.Code() {
// Relevant 404 errors reported by AWS.
case s3.ErrCodeNoSuchBucket, s3.ErrCodeNoSuchKey:
return nil, errors.Wrapf(cloud.ErrFileDoesNotExist, "s3 object does not exist: %s", err.Error())
return nil, errors.WithMessagef(
errors.Wrap(cloud.ErrFileDoesNotExist, "s3 object does not exist"),
"%s",
err.Error(),
)
}
}
return nil, errors.Wrap(err, "failed to get s3 object")
Expand Down
6 changes: 5 additions & 1 deletion pkg/cloud/azure/azure_storage.go
Original file line number Diff line number Diff line change
Expand Up @@ -156,7 +156,11 @@ func (s *azureStorage) ReadFileAt(
switch azerr.ServiceCode() {
// TODO(adityamaru): Investigate whether both these conditions are required.
case azblob.ServiceCodeBlobNotFound, azblob.ServiceCodeResourceNotFound:
return nil, 0, errors.Wrapf(cloud.ErrFileDoesNotExist, "azure blob does not exist: %s", err.Error())
return nil, 0, errors.WithMessagef(
errors.Wrap(cloud.ErrFileDoesNotExist, "azure blob does not exist"),
"%s",
err.Error(),
)
}
}
return nil, 0, errors.Wrap(err, "failed to create azure reader")
Expand Down
6 changes: 5 additions & 1 deletion pkg/cloud/gcp/gcs_storage.go
Original file line number Diff line number Diff line change
Expand Up @@ -196,7 +196,11 @@ func (g *gcsStorage) ReadFileAt(
// if file does not exist. Regardless why we couldn't open the stream
// (whether its invalid bucket or file doesn't exist),
// return our internal ErrFileDoesNotExist.
err = errors.Wrapf(cloud.ErrFileDoesNotExist, "gcs object does not exist: %s", err.Error())
err = errors.WithMessagef(
errors.Wrap(cloud.ErrFileDoesNotExist, "gcs object does not exist"),
"%s",
err.Error(),
)
}
return nil, 0, err
}
Expand Down
6 changes: 5 additions & 1 deletion pkg/cloud/httpsink/http_storage.go
Original file line number Diff line number Diff line change
Expand Up @@ -266,7 +266,11 @@ func (h *httpStorage) req(
_ = resp.Body.Close()
err := errors.Errorf("error response from server: %s %q", resp.Status, body)
if err != nil && resp.StatusCode == 404 {
err = errors.Wrapf(cloud.ErrFileDoesNotExist, "http storage file does not exist: %s", err.Error())
err = errors.WithMessagef(
errors.Wrap(cloud.ErrFileDoesNotExist, "http storage file does not exist"),
"%s",
err.Error(),
)
}
return nil, err
}
Expand Down
6 changes: 5 additions & 1 deletion pkg/cloud/nodelocal/nodelocal_storage.go
Original file line number Diff line number Diff line change
Expand Up @@ -144,7 +144,11 @@ func (l *localFileStorage) ReadFileAt(
// The local store returns a golang native ErrNotFound, whereas the remote
// store returns a gRPC native NotFound error.
if oserror.IsNotExist(err) || status.Code(err) == codes.NotFound {
return nil, 0, errors.Wrapf(cloud.ErrFileDoesNotExist, "nodelocal storage file does not exist: %s", err.Error())
return nil, 0, errors.WithMessagef(
errors.Wrap(cloud.ErrFileDoesNotExist, "nodelocal storage file does not exist"),
"%s",
err.Error(),
)
}
return nil, 0, err
}
Expand Down
4 changes: 3 additions & 1 deletion pkg/kv/kvserver/store.go
Original file line number Diff line number Diff line change
Expand Up @@ -175,7 +175,9 @@ var queueAdditionOnSystemConfigUpdateBurst = settings.RegisterIntSetting(
var leaseTransferWait = func() *settings.DurationSetting {
s := settings.RegisterDurationSetting(
leaseTransferWaitSettingName,
"the amount of time a server waits to transfer range leases before proceeding with the rest of the shutdown process",
"the amount of time a server waits to transfer range leases before proceeding with the rest of the shutdown process "+
"(note that the --drain-wait parameter for cockroach node drain may need adjustment "+
"after changing this setting)",
5*time.Second,
func(v time.Duration) error {
if v < 0 {
Expand Down
Loading

0 comments on commit 525322a

Please sign in to comment.