storage: propagate errors from contentionQueue, catch stalls in roachtest #37199

Merged 7 commits, Jul 8, 2019
roachtest: run kv/contention/nodes=4 with an extreme TxnLivenessThreshold

Informs #36089.

This commit bumps the TxnLivenessThreshold for clusters running
`kv/contention/nodes=4` to 10 minutes. This is large enough that if
a transaction is abandoned at any point, all other transactions that
conflict with it will wait on it for the full 10 minutes, and the test
will fail to meet its minimum QPS requirement.

Release note: None
nvanbenschoten committed Jul 8, 2019
commit 521da1a45169db9d87b8e05cc42cce1eb9b3196a
8 changes: 7 additions & 1 deletion pkg/cmd/roachtest/kv.go
@@ -157,7 +157,13 @@ func registerKVContention(r *testRegistry) {
Run: func(ctx context.Context, t *test, c *cluster) {
c.Put(ctx, cockroach, "./cockroach", c.Range(1, nodes))
c.Put(ctx, workload, "./workload", c.Node(nodes+1))
-c.Start(ctx, t, c.Range(1, nodes))
+
+// Start the cluster with an extremely high txn liveness threshold.
+// If requests ever get stuck on a transaction that was abandoned
+// then it will take 10m for them to get unstuck, at which point the
+// QPS threshold check in the test is guaranteed to fail.
+args := startArgs("--env=COCKROACH_TXN_LIVENESS_HEARTBEAT_MULTIPLIER=600")
+c.Start(ctx, t, args, c.Range(1, nodes))

// Enable request tracing, which is a good tool for understanding
// how different transactions are interacting.
6 changes: 4 additions & 2 deletions pkg/storage/txnwait/txnqueue.go
@@ -20,6 +20,7 @@ import (
"github.com/cockroachdb/cockroach/pkg/internal/client"
"github.com/cockroachdb/cockroach/pkg/roachpb"
"github.com/cockroachdb/cockroach/pkg/storage/engine/enginepb"
+"github.com/cockroachdb/cockroach/pkg/util/envutil"
"github.com/cockroachdb/cockroach/pkg/util/hlc"
"github.com/cockroachdb/cockroach/pkg/util/log"
"github.com/cockroachdb/cockroach/pkg/util/retry"
@@ -33,12 +34,13 @@ const maxWaitForQueryTxn = 50 * time.Millisecond

// TxnLivenessHeartbeatMultiplier specifies what multiple the transaction
// liveness threshold should be of the transaction heartbeat internval.
-const TxnLivenessHeartbeatMultiplier = 5
+var TxnLivenessHeartbeatMultiplier = envutil.EnvOrDefaultInt(
+	"COCKROACH_TXN_LIVENESS_HEARTBEAT_MULTIPLIER", 5)

// TxnLivenessThreshold is the maximum duration between transaction heartbeats
// before the transaction is considered expired by Queue. It is exposed and
// mutable to allow tests to override it.
-var TxnLivenessThreshold = TxnLivenessHeartbeatMultiplier * base.DefaultHeartbeatInterval
+var TxnLivenessThreshold = time.Duration(TxnLivenessHeartbeatMultiplier) * base.DefaultHeartbeatInterval

// ShouldPushImmediately returns whether the PushTxn request should
// proceed without queueing. This is true for pushes which are neither