…achdb#103343

102793: backupccl,kvserver: log failed ExportRequest trace on client and server r=irfansharif a=adityamaru

This change strives to improve observability around
backups that fail because of timed-out ExportRequests.
Currently, there is very little indication of what the request
was doing when the client cancelled the context after
the predetermined timeout window. With this change, we
now log the trace of the ExportRequest that failed. If
the ExportRequest was served locally, then the trace will be
part of the sender's tracing span. However, if the request
was served on a remote node, then the sender does not wait
for the server-side evaluation to notice the context cancellation.
To work around this, we also print the trace on the server side
if the request encountered a context cancellation and the
associated tracing span is not a no-op.

This change also adds a private cluster setting
`bulkio.backup.export_request_verbose_tracing` that, if set to true,
will send all backup ExportRequests with verbose tracing
enabled.

To debug a backup failing with a timed-out export request, we
can now:
- SET CLUSTER SETTING bulkio.backup.export_request_verbose_tracing = true;
- SET CLUSTER SETTING trace.snapshot.rate = '1m';

Once the backup times out, we can look at the logs
for the server-side and client-side ExportRequest traces
to understand where the request was stuck executing for so long.
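For orientation, here is a minimal, self-contained Go sketch of the client-side pattern the backup_processor.go diff below implements: run the request under a per-attempt timeout and, if the timeout fires, log whatever trace was collected. The types and helpers here (`fakeSpan`, `sendExport`) are invented stand-ins, not CockroachDB APIs.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"log"
	"strings"
	"time"
)

// fakeSpan stands in for a tracing span that accumulates a recording.
type fakeSpan struct{ events []string }

func (s *fakeSpan) record(ev string)  { s.events = append(s.events, ev) }
func (s *fakeSpan) recording() string { return strings.Join(s.events, "\n") }

// sendExport stands in for sending an ExportRequest; it just sleeps to
// simulate a slow evaluation.
func sendExport(ctx context.Context, sp *fakeSpan) error {
	sp.record("evaluating ExportRequest")
	select {
	case <-time.After(2 * time.Second):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

func main() {
	verbose := true // stands in for bulkio.backup.export_request_verbose_tracing
	sp := &fakeSpan{}
	if verbose {
		sp.record("verbose tracing enabled for this request")
	}
	ctx, cancel := context.WithTimeout(context.Background(), 100*time.Millisecond)
	defer cancel()
	if err := sendExport(ctx, sp); err != nil {
		if errors.Is(err, context.DeadlineExceeded) {
			// On timeout, surface the trace collected so far, as the real
			// change does with the span's recording.
			log.Printf("failed export request\n trace:\n%s", sp.recording())
		}
		fmt.Println("export request timeout:", err)
	}
}
```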

Fixes: cockroachdb#86047
Release note: None

103241: kv: decouple parallel commit auto-retries from refresh auto-retries r=arulajmani a=nvanbenschoten

Informs cockroachdb#100131.

This commit decouples the concept of a parallel commit auto-retry from that of a refresh auto-retry. This is important for weak isolation transactions, which can tolerate write skew and for which a failed parallel commit does not necessarily imply that the transaction needs to refresh its read spans in order to commit.

Parallel commit auto-retries are performed when a parallel committing batch is successful in its execution but fails to qualify for the implicit commit condition. In such cases, the EndTxn request (and just the EndTxn request) is re-issued to move the transaction directly to the COMMITTED state. This only implies the need for a refresh for serializable transactions. For these transactions, a preemptive refresh will be performed by the txnSpanRefresher before the EndTxn is re-issued.

Conversely, refresh auto-retries are performed when a batch fails to execute fully and returns an error. In such cases, the txnSpanRefresher will intercept the error, attempt to refresh, and then re-issue the entire original batch. Because we no longer make an attempt to determine which parts of a previously issued batch were successful (this was subtle and error-prone when it existed), we instead re-issue the entire batch and rely on idempotency.
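As a rough illustration of the distinction between the two retry paths, here is a small Go sketch. The `batchResult` type and `retryDecision` helper are invented for the sketch and do not correspond to actual kvcoord APIs.

```go
package main

import "fmt"

// batchResult is an invented summary of how a parallel-commit batch fared.
type batchResult struct {
	err              error // non-nil if the batch failed to execute fully
	staging          bool  // the transaction record was moved to STAGING
	implicitCommitOK bool  // all writes succeeded at the commit timestamp
	serializable     bool  // isolation level cannot tolerate write skew
}

// retryDecision names which auto-retry, if any, the client performs.
func retryDecision(res batchResult) string {
	if res.err != nil {
		// Refresh auto-retry: refresh the read spans, then re-issue the
		// entire original batch and rely on idempotency.
		return "refresh, then re-issue the full batch"
	}
	if res.staging && !res.implicitCommitOK {
		// Parallel commit auto-retry: only the EndTxn is re-issued to move
		// the transaction directly to COMMITTED. Serializable transactions
		// refresh preemptively before the EndTxn is re-issued.
		if res.serializable {
			return "preemptive refresh, then re-issue only the EndTxn"
		}
		return "re-issue only the EndTxn (no refresh under weak isolation)"
	}
	return "no auto-retry needed"
}

func main() {
	fmt.Println(retryDecision(batchResult{staging: true}))
	fmt.Println(retryDecision(batchResult{err: fmt.Errorf("write too old"), serializable: true}))
}
```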

To make this change, this commit moves the txnCommitter interceptor above the txnSpanRefresher interceptor in the TxnCoordSender interceptor stack. This means that the txnCommitter no longer needs to guard against re-issued batches, because all retries are handled below it. Instead, the txnSpanRefresher now needs to learn about the STAGING transaction status. The txnSpanRefresher was already taught about the STAGING status on successful batches in ef77322, and is taught about the STAGING status on errors here.

While the primary motivation of this change is to permit weak isolation transactions to perform parallel commit auto-retries without needing to refresh, it also has another benefit. Failed parallel commits no longer re-issue their entire batch on serialization failures. Instead, they just re-issue the EndTxn. As a result, this replaces and closes cockroachdb#93099. It also motivates the release note below.

Release note (performance improvement): Some large, long-running INSERT statements now perform less work during their commit phase and can run faster.

103278: kvserver: add expiration lease escape hatch r=erikgrinaker a=erikgrinaker

This patch adds `COCKROACH_DISABLE_EXPIRATION_LEASES_ONLY`, which can be used to hard-disable expiration-based leases, e.g. in cases where the lease extensions overload the cluster and prevent it from working, which would in turn prevent operators from disabling the setting.
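For illustration only, a minimal Go sketch of how an environment-variable escape hatch can override a cluster setting; this is not the actual kvserver implementation, and the setting name in the comment is an assumption.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// expirationLeasesOnly stands in for the cluster setting controlling
// expiration-based leases (assumed to be kv.expiration_leases_only.enabled).
var expirationLeasesOnly = true

// expirationLeasesEnabled applies the env-var escape hatch: when the variable
// parses as true, expiration-based leases are hard-disabled regardless of the
// cluster setting.
func expirationLeasesEnabled() bool {
	if v, err := strconv.ParseBool(os.Getenv("COCKROACH_DISABLE_EXPIRATION_LEASES_ONLY")); err == nil && v {
		return false
	}
	return expirationLeasesOnly
}

func main() {
	fmt.Println("use expiration-based leases:", expirationLeasesEnabled())
}
```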

Epic: none
Release note: None

103343: logictest: increase max-sql-memory from 192MiB to 256MiB r=yuzefovich a=yuzefovich

We have seen some rare flakes where the validation (after all the test queries are done) hits "memory budget exceeded" on the root monitor. To avoid this, let's just bump the root pool a little.

Fixes: cockroachdb#103244.

Release note: None

Co-authored-by: adityamaru <[email protected]>
Co-authored-by: Nathan VanBenschoten <[email protected]>
Co-authored-by: Erik Grinaker <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
5 people committed May 15, 2023
5 parents 19e5755 + d59a111 + 019d157 + ec5ac9f + 13d23f0 commit 7b86c7c
Showing 16 changed files with 400 additions and 142 deletions.
1 change: 1 addition & 0 deletions pkg/ccl/backupccl/BUILD.bazel
@@ -150,6 +150,7 @@ go_library(
"//pkg/util/syncutil",
"//pkg/util/timeutil",
"//pkg/util/tracing",
"//pkg/util/tracing/tracingpb",
"//pkg/util/uuid",
"@com_github_cockroachdb_errors//:errors",
"@com_github_cockroachdb_logtags//:logtags",
29 changes: 27 additions & 2 deletions pkg/ccl/backupccl/backup_processor.go
@@ -30,6 +30,7 @@ import (
"github.com/cockroachdb/cockroach/pkg/sql/rowenc"
"github.com/cockroachdb/cockroach/pkg/sql/rowexec"
"github.com/cockroachdb/cockroach/pkg/sql/types"
"github.com/cockroachdb/cockroach/pkg/util"
"github.com/cockroachdb/cockroach/pkg/util/admission/admissionpb"
"github.com/cockroachdb/cockroach/pkg/util/bulk"
"github.com/cockroachdb/cockroach/pkg/util/contextutil"
@@ -40,6 +41,7 @@ import (
"github.com/cockroachdb/cockroach/pkg/util/stop"
"github.com/cockroachdb/cockroach/pkg/util/timeutil"
"github.com/cockroachdb/cockroach/pkg/util/tracing"
"github.com/cockroachdb/cockroach/pkg/util/tracing/tracingpb"
"github.com/cockroachdb/errors"
"github.com/cockroachdb/logtags"
"github.com/cockroachdb/redact"
@@ -55,7 +57,7 @@ var (
time.Minute,
settings.NonNegativeDuration,
).WithPublic()
delayPerAttmpt = settings.RegisterDurationSetting(
delayPerAttempt = settings.RegisterDurationSetting(
settings.TenantWritable,
"bulkio.backup.read_retry_delay",
"amount of time since the read-as-of time, per-prior attempt, to wait before making another attempt",
@@ -82,6 +84,13 @@ var (
"split backup data on timestamps when writing revision history",
true,
)

sendExportRequestWithVerboseTracing = settings.RegisterBoolSetting(
settings.TenantWritable,
"bulkio.backup.export_request_verbose_tracing",
"send each export request with a verbose tracing span",
util.ConstantWithMetamorphicTestBool("export_request_verbose_tracing", false),
)
)

const (
@@ -378,7 +387,7 @@ func runBackupProcessor(
// We're okay with delaying this worker until then since we assume any
// other work it could pull off the queue will likely want to delay to
// a similar or later time anyway.
if delay := delayPerAttmpt.Get(&clusterSettings.SV) - timeutil.Since(span.lastTried); delay > 0 {
if delay := delayPerAttempt.Get(&clusterSettings.SV) - timeutil.Since(span.lastTried); delay > 0 {
timer.Reset(delay)
log.Infof(ctx, "waiting %s to start attempt %d of remaining spans", delay, span.attempts+1)
select {
Expand Down Expand Up @@ -427,13 +436,22 @@ func runBackupProcessor(
log.VEventf(ctx, 1, "sending ExportRequest for span %s (attempt %d, priority %s)",
span.span, span.attempts+1, header.UserPriority.String())
var rawResp kvpb.Response
var recording tracingpb.Recording
var pErr *kvpb.Error
requestSentAt := timeutil.Now()
exportRequestErr := contextutil.RunWithTimeout(ctx,
fmt.Sprintf("ExportRequest for span %s", span.span),
timeoutPerAttempt.Get(&clusterSettings.SV), func(ctx context.Context) error {
sp := tracing.SpanFromContext(ctx)
opts := make([]tracing.SpanOption, 0)
opts = append(opts, tracing.WithParent(sp))
if sendExportRequestWithVerboseTracing.Get(&clusterSettings.SV) {
opts = append(opts, tracing.WithRecording(tracingpb.RecordingVerbose))
}
ctx, exportSpan := sp.Tracer().StartSpanCtx(ctx, "backupccl.ExportRequest", opts...)
rawResp, pErr = kv.SendWrappedWithAdmission(
ctx, flowCtx.Cfg.DB.KV().NonTransactionalSender(), header, admissionHeader, req)
recording = exportSpan.FinishAndGetConfiguredRecording()
if pErr != nil {
return pErr.GoError()
}
@@ -453,6 +471,9 @@ func runBackupProcessor(
// TimeoutError improves the opaque `context deadline exceeded` error
// message so use that instead.
if errors.HasType(exportRequestErr, (*contextutil.TimeoutError)(nil)) {
if recording != nil {
log.Errorf(ctx, "failed export request for span %s\n trace:\n%s", span.span, recording)
}
return errors.Wrap(exportRequestErr, "export request timeout")
}
// BatchTimestampBeforeGCError is returned if the ExportRequest
@@ -467,6 +488,10 @@ func runBackupProcessor(
continue
}
}

if recording != nil {
log.Errorf(ctx, "failed export request %s\n trace:\n%s", span.span, recording)
}
return errors.Wrapf(exportRequestErr, "exporting %s", span.span)
}

41 changes: 26 additions & 15 deletions pkg/kv/kvclient/kvcoord/dist_sender_server_test.go
@@ -2060,6 +2060,7 @@ func TestTxnCoordSenderRetries(t *testing.T) {
expClientRestart bool // client-side txn restart
expServerRefresh bool // server-side refresh
expOnePhaseCommit bool // 1PC commits
expParallelCommitAutoRetry bool // parallel commit auto-retries
expFailure string // regexp pattern to match on error, if not empty
}
type testCase struct {
@@ -3045,6 +3046,7 @@ func TestTxnCoordSenderRetries(t *testing.T) {
expServerRefresh: true,
expClientRefreshSuccess: false,
expClientAutoRetryAfterRefresh: false,
expParallelCommitAutoRetry: false,
},
},
{
@@ -3062,10 +3064,12 @@ func TestTxnCoordSenderRetries(t *testing.T) {
// The Put to "c" will succeed with a forwarded timestamp. However, the
// txn has already staged on the other range at an earlier timestamp. As a
// result, it does not qualify for the implicit commit condition and
// requires a client-side refresh.
// requires a parallel commit auto-retry and preemptive client-side
// refresh.
allIsoLevels: &expect{
expClientRefreshSuccess: true,
expClientAutoRetryAfterRefresh: true,
expClientAutoRetryAfterRefresh: false,
expParallelCommitAutoRetry: true,
},
},
{
@@ -3149,6 +3153,7 @@ func TestTxnCoordSenderRetries(t *testing.T) {
expServerRefresh: true,
expClientRefreshSuccess: false,
expClientAutoRetryAfterRefresh: false,
expParallelCommitAutoRetry: false,
},
},
{
@@ -3165,11 +3170,12 @@ func TestTxnCoordSenderRetries(t *testing.T) {
// The Put to "c" will succeed after a server-side refresh. However, the
// txn has already staged on the other range at the pre-refresh timestamp.
// As a result, it does not qualify for the implicit commit condition and
// requires a client-side refresh.
// requires a parallel commit auto-retry.
allIsoLevels: &expect{
expServerRefresh: true,
expClientRefreshSuccess: true,
expClientAutoRetryAfterRefresh: true,
expClientRefreshSuccess: false,
expClientAutoRetryAfterRefresh: false,
expParallelCommitAutoRetry: true,
},
},
{
@@ -3184,12 +3190,13 @@ func TestTxnCoordSenderRetries(t *testing.T) {
return txn.CommitInBatch(ctx, b)
},
priorReads: true,
// The Put to "a" will fail, failing the parallel commit and forcing a
// client-side refresh.
// The Put to "a" will fail, failing the parallel commit with an error and
// forcing a client-side refresh and auto-retry of the full batch.
allIsoLevels: &expect{
expServerRefresh: false,
expClientRefreshSuccess: true,
expClientAutoRetryAfterRefresh: true,
expParallelCommitAutoRetry: false,
},
},
{
@@ -3205,11 +3212,12 @@ func TestTxnCoordSenderRetries(t *testing.T) {
},
priorReads: true,
// The Put to "c" will fail, failing the parallel commit and forcing a
// client-side refresh.
// parallel commit auto-retry and preemptive client-side refresh.
allIsoLevels: &expect{
expServerRefresh: false,
expClientRefreshSuccess: true,
expClientAutoRetryAfterRefresh: true,
expClientAutoRetryAfterRefresh: false,
expParallelCommitAutoRetry: true,
},
},
{
@@ -3387,11 +3395,12 @@ func TestTxnCoordSenderRetries(t *testing.T) {
// The cput to "c" will succeed after a server-side refresh. However, the
// txn has already staged on the other range at the pre-refresh timestamp.
// As a result, it does not qualify for the implicit commit condition and
// requires a client-side refresh.
// requires a parallel commit auto-retry.
allIsoLevels: &expect{
expServerRefresh: true,
expClientRefreshSuccess: true,
expClientAutoRetryAfterRefresh: true,
expClientRefreshSuccess: false,
expClientAutoRetryAfterRefresh: false,
expParallelCommitAutoRetry: true,
},
},
{
Expand Down Expand Up @@ -3435,11 +3444,12 @@ func TestTxnCoordSenderRetries(t *testing.T) {
// The cput to "c" will succeed after a server-side refresh. However, the
// txn has already staged on the other range at the pre-refresh timestamp.
// As a result, it does not qualify for the implicit commit condition and
// requires a client-side refresh.
// requires a parallel commit auto-retry.
allIsoLevels: &expect{
expServerRefresh: true,
expClientRefreshSuccess: true,
expClientAutoRetryAfterRefresh: true,
expClientRefreshSuccess: false,
expClientAutoRetryAfterRefresh: false,
expParallelCommitAutoRetry: true,
},
},
{
@@ -3615,6 +3625,7 @@ func TestTxnCoordSenderRetries(t *testing.T) {
require.Equal(t, exp.expServerRefresh, metrics.ServerRefreshSuccess.Count() != 0, "TxnMetrics.ServerRefreshSuccess")
require.Equal(t, exp.expClientRestart, metrics.Restarts.TotalSum() != 0, "TxnMetrics.Restarts")
require.Equal(t, exp.expOnePhaseCommit, metrics.Commits1PC.Count() != 0, "TxnMetrics.Commits1PC")
require.Equal(t, exp.expParallelCommitAutoRetry, metrics.ParallelCommitAutoRetries.Count() != 0, "TxnMetrics.ParallelCommitAutoRetries")
}
for _, tc := range testCases {
t.Run(tc.name, func(t *testing.T) {
21 changes: 12 additions & 9 deletions pkg/kv/kvclient/kvcoord/txn_coord_sender.go
@@ -163,8 +163,8 @@ type TxnCoordSender struct {
txnHeartbeater
txnSeqNumAllocator
txnPipeliner
txnSpanRefresher
txnCommitter
txnSpanRefresher
txnMetricRecorder
txnLockGatekeeper // not in interceptorStack array.
}
@@ -254,6 +254,7 @@ func newRootTxnCoordSender(
tcs.interceptorAlloc.txnCommitter = txnCommitter{
st: tcf.st,
stopper: tcs.stopper,
metrics: &tcs.metrics,
mu: &tcs.mu.Mutex,
}
tcs.interceptorAlloc.txnMetricRecorder = txnMetricRecorder{
@@ -274,18 +275,20 @@ func newRootTxnCoordSender(
// never generate transaction retry errors that could be avoided
// with a refresh.
&tcs.interceptorAlloc.txnPipeliner,
// The committer sits above the span refresher because it will
// never generate transaction retry errors that could be avoided
// with a refresh. However, it may re-issue parts of "successful"
// batches that failed to qualify for the parallel commit condition.
// These re-issued batches benefit from pre-emptive refreshes in the
// span refresher.
&tcs.interceptorAlloc.txnCommitter,
// The span refresher may resend entire batches to avoid transaction
// retries. Because of that, we need to be careful which interceptors
// sit below it in the stack.
// The span refresher sits below the committer, so it must be prepared
// to see STAGING transaction protos in responses and errors. It should
// defer to the committer for how to handle those.
&tcs.interceptorAlloc.txnSpanRefresher,
// The committer sits beneath the span refresher so that any
// retryable errors that it generates have a chance of being
// "refreshed away" without the need for a txn restart. Because the
// span refresher can re-issue batches, it needs to be careful about
// what parts of the batch it mutates. Any mutation needs to be
// idempotent and should avoid writing to memory when not changing
// it to avoid looking like a data race.
&tcs.interceptorAlloc.txnCommitter,
// The metrics recorder sits at the bottom of the stack so that it
// can observe all transformations performed by other interceptors.
&tcs.interceptorAlloc.txnMetricRecorder,
