35936: server,log: increase the max sync durations r=bdarnell a=tbg

We know that the engine check fires during some tests ([clearrange], for
example). This puts us in an awkward position: on the one hand, not
being able to sync an engine in 10s is certainly going to cause lots
of terrible snowball effects which then eat up troubleshooting time,
but on the other hand we're not likely to fix all of the problems in
19.1.

For now, raise the engine sync limit significantly, from 10s to 120s.
Also raise the corresponding log partition time limit, from 10s to 30s,
though we've seen that fire only in rare cases that likely highlighted
a real I/O problem (or a severe case of being CPU bound).

[clearrange]: cockroachdb#34860 (comment)

Release note: None

Co-authored-by: Tobias Schottdorf <[email protected]>
craig[bot] and tbg committed Mar 19, 2019
2 parents 40eee2e + d0f758a commit 01f9662
Showing 2 changed files with 6 additions and 2 deletions.
4 changes: 3 additions & 1 deletion pkg/server/server_engine_health.go
@@ -26,7 +26,9 @@ import (
"github.com/cockroachdb/cockroach/pkg/util/timeutil"
)

-var maxSyncDuration = envutil.EnvOrDefaultDuration("COCKROACH_ENGINE_MAX_SYNC_DURATION", 10*time.Second)
+// maxSyncDuration is very conservatively set high due to known issues such as
+// https://github.com/cockroachdb/cockroach/issues/34860#issuecomment-469262019.
+var maxSyncDuration = envutil.EnvOrDefaultDuration("COCKROACH_ENGINE_MAX_SYNC_DURATION", 120*time.Second)

// startAssertEngineHealth starts a goroutine that periodically verifies that
// syncing the engines is possible within maxSyncDuration. If not,
4 changes: 3 additions & 1 deletion pkg/util/log/clog.go
@@ -47,7 +47,9 @@ import (
"github.com/petermattis/goid"
)

-var maxSyncDuration = envutil.EnvOrDefaultDuration("COCKROACH_LOG_MAX_SYNC_DURATION", 10*time.Second)
+// maxSyncDuration is set to a conservative value since this is a new mechanism.
+// In practice, even a fraction of that would indicate a problem.
+var maxSyncDuration = envutil.EnvOrDefaultDuration("COCKROACH_LOG_MAX_SYNC_DURATION", 30*time.Second)

const fatalErrorPostamble = `
