Skip to content

Commit

Permalink
sql: improve stack trace for get-user-timeout timeouts
Browse files Browse the repository at this point in the history
Fixes #95794

The cause of the `get-user-timeout errors` is unknown.
Part of the problem is that the stack trace gets cut off at
```
  |   | github.com/cockroachdb/cockroach/pkg/sql.retrieveSessionInitInfoWithCache
  |   | 	github.com/cockroachdb/cockroach/pkg/sql/user.go:238
```
which does not explain what is actually being blocked.

The reason that the stack trace is cut off is that the timeout is initiated by
`contextutil.RunWithTimeout` which results in a "simple" (no stack trace)
`context.DeadlineExceeded` error.

`retrieveSessionInitInfoWithCache` is the first line in the stack trace because
it calls `errors.Wrap` on `context.DeadlineExceeded`.

To get a fuller stack trace, `context.DeadlineExceeded` must be wrapped
immediately (`errors.Wrap` or `errors.WithStack`) before it bubbles up.

Release note: None
  • Loading branch information
ecwall committed Jan 24, 2023
1 parent 24b7cf5 commit 60e3926
Show file tree
Hide file tree
Showing 2 changed files with 32 additions and 0 deletions.
13 changes: 13 additions & 0 deletions pkg/util/contextutil/context.go
Original file line number Diff line number Diff line change
Expand Up @@ -79,13 +79,26 @@ func wrap(ctx context.Context, cancel context.CancelFunc) (context.Context, cont
}
}

// ctxWithStacktrace overrides Err to annotate context.DeadlineExceeded and
// context.Canceled errors with a stacktrace.
// See: https://github.com/cockroachdb/cockroach/issues/95794
type ctxWithStacktrace struct {
context.Context
}

// Err implements the context.Context interface.
func (ctx *ctxWithStacktrace) Err() error {
return errors.WithStack(ctx.Context.Err())
}

// RunWithTimeout runs a function with a timeout, the same way you'd do with
// context.WithTimeout. It improves the opaque error messages returned by
// WithTimeout by augmenting them with the op string that is passed in.
func RunWithTimeout(
ctx context.Context, op string, timeout time.Duration, fn func(ctx context.Context) error,
) error {
ctx, cancel := context.WithTimeout(ctx, timeout)
ctx = &ctxWithStacktrace{Context: ctx}
defer cancel()
start := timeutil.Now()
err := fn(ctx)
Expand Down
19 changes: 19 additions & 0 deletions pkg/util/contextutil/context_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@ package contextutil

import (
"context"
"fmt"
"net"
"testing"
"time"
Expand Down Expand Up @@ -68,6 +69,24 @@ func TestRunWithTimeout(t *testing.T) {
}
}

func testFuncA(ctx context.Context) error {
return testFuncB(ctx)
}

func testFuncB(ctx context.Context) error {
<-ctx.Done()
return ctx.Err()
}

func TestRunWithTimeoutCtxWithStacktrace(t *testing.T) {
ctx := context.Background()
err := RunWithTimeout(ctx, "foo", 1, testFuncA)
require.Error(t, err)
stacktrace := fmt.Sprintf("%+v", err)
require.Contains(t, stacktrace, "testFuncB")
require.Contains(t, stacktrace, "testFuncA")
}

// TestRunWithTimeoutWithoutDeadlineExceeded ensures that when a timeout on the
// context occurs but the underlying error does not have
// context.DeadlineExceeded as its Cause (perhaps due to serialization) the
Expand Down

0 comments on commit 60e3926

Please sign in to comment.