-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
roachpb: improve AmbiguousResultError #65714
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @knz and @tbg)
pkg/roachpb/errors.go, line 656 at r1 (raw file):
s.Print(cause) if f, l, _, ok := errors.GetOneLineSource(cause); ok { s.Printf("; originated at %s:%d", redact.SafeString(f), redact.Safe(l))
This is the sort of line that I keep fighting with @knz about, so I'm curious if he likes it (I do).
Rafa, would you say that errors.GetOneLineSource()
should return a SafeString
and an int
type that implements SafeValue
instread of (string, int)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 9 of 13 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andreimatei, @knz, and @tbg)
pkg/ccl/backupccl/restore_data_processor_test.go, line 246 at r1 (raw file):
return nil } return roachpb.NewError(roachpb.NewAmbiguousResultErrorf(strconv.Itoa(int(r))))
Errorf("%v", int(r))
Generally using a computed string as format argument is bad style, and I think we have a linter against it now.
pkg/roachpb/errors.go, line 656 at r1 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
This is the sort of line that I keep fighting with @knz about, so I'm curious if he likes it (I do).
Rafa, would you say thaterrors.GetOneLineSource()
should return aSafeString
and anint
type that implementsSafeValue
instread of(string, int)
?
This entire SafeFormat
method makes me sad, but I'm also realizing this is not explained well enough elsewhere.
The logic to format and print errors is best done using a different interface, the errors.SafeFormatter
interface (not redact.SafeFormatter
). It has a different structure (because also a different purpose).
This means a SafeFormatError
method, and next to it for backward-compat with the fmt
package a method that redirects Format
to SafeFormatError
like this:
func (e *AmbiguousResultError) Format(s fmt.State, verb rune) { errors.FormatError(e, s, verb) }
I can help you build the SafeFormatError
method but tehre are a few examples in the errors
library already and perhaps in cockroachdb (and we'll need to document it.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andreimatei, @knz, and @tbg)
pkg/roachpb/errors.go, line 656 at r1 (raw file):
Previously, knz (kena) wrote…
This entire
SafeFormat
method makes me sad, but I'm also realizing this is not explained well enough elsewhere.The logic to format and print errors is best done using a different interface, the
errors.SafeFormatter
interface (notredact.SafeFormatter
). It has a different structure (because also a different purpose).This means a
SafeFormatError
method, and next to it for backward-compat with thefmt
package a method that redirectsFormat
toSafeFormatError
like this:func (e *AmbiguousResultError) Format(s fmt.State, verb rune) { errors.FormatError(e, s, verb) }I can help you build the
SafeFormatError
method but tehre are a few examples in theerrors
library already and perhaps in cockroachdb (and we'll need to document it.)
I'll give this a try, thanks Raphael. Will ping when done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andreimatei, @knz, and @tbg)
pkg/roachpb/errors.go, line 656 at r1 (raw file):
Previously, tbg (Tobias Grieger) wrote…
I'll give this a try, thanks Raphael. Will ping when done.
we can also sit together and I could watch over your shoulder, would help me write a doc on how to do it for other folk afterwards.
This error type held on to a string message, which is unsuitable for redaction. In practice, this error opaquely wraps an error, so make this transparent, which we can now do thanks to `cockroachdb/errors`. It turned out that the proto already had an unused field that was (once, but no longer) used in testing, so we can use that in production now. I chose to print the root error's file:line in the error as we typically don't print errors via `%+v` which is otherwise how these would be exposed. I expect that to be mildly useful to diagnose problems such as \cockroachdb#65701. I think there is room for a more general mechanism that exposes the root cause's location in general, though I am unsure how to best introduce it given that we usually observe errors through SQL, and on many possible paths especially for rare flakes such as the above, so no steps towards that are taken at this point. Release note: None
This is more idiomatic. Release note: None
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andreimatei, @knz, and @tbg)
pkg/roachpb/errors.go, line 656 at r1 (raw file):
Previously, knz (kena) wrote…
we can also sit together and I could watch over your shoulder, would help me write a doc on how to do it for other folk afterwards.
I tried this just now. I think it works mostly fine, but I'm confused that the wrapped errors in my unit test are no longer printed. This isn't what I had expected, though maybe it's the right behavior.
I recorded myself making this change in case it's helpful for doc writing: https://drive.google.com/file/d/16AAq38A5KeqShEAq5tQ08o4_cNByMG5G/view?usp=sharing (unfortunately the quality is heinous...)
Honestly just looking at an example of how this was done was probably all I needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r2.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andreimatei and @tbg)
pkg/roachpb/errors.go, line 656 at r1 (raw file):
Previously, tbg (Tobias Grieger) wrote…
I tried this just now. I think it works mostly fine, but I'm confused that the wrapped errors in my unit test are no longer printed. This isn't what I had expected, though maybe it's the right behavior.
I recorded myself making this change in case it's helpful for doc writing: https://drive.google.com/file/d/16AAq38A5KeqShEAq5tQ08o4_cNByMG5G/view?usp=sharing (unfortunately the quality is heinous...)
Honestly just looking at an example of how this was done was probably all I needed.
-
why doesn't
AmbiguousResultError
not behave as a regular error wrapper? i.e. implement anUnwrap
and let that return the cause, and let the error formatting code do its thing? -
if this was a proper wrapper, you wouldn't need to use
GetOneLineSource
at all here (and bother printing the "originated" thing). That comes for free when the full error is printed / reported to sentry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andreimatei and @tbg)
pkg/roachpb/errors.go, line 656 at r1 (raw file):
Previously, knz (kena) wrote…
why doesn't
AmbiguousResultError
not behave as a regular error wrapper? i.e. implement anUnwrap
and let that return the cause, and let the error formatting code do its thing?if this was a proper wrapper, you wouldn't need to use
GetOneLineSource
at all here (and bother printing the "originated" thing). That comes for free when the full error is printed / reported to sentry.
AmbiguousResultError
is intentionally opaquely wrapping the error. For example, if, hypothetically, an error marked as retriable gets wrapped in an AmbiguousResultError, we need to make sure that nobody "unwraps" past the ambiguous result with something likeerrors.Is(ambErr, someMarkerErr)
. At the same time, AmbiguousResultError is also an ErrorDetail, so we can't just by convention wrap it witherrors.Opaque
(at least not with the way we currently do*roachpb.Error
, I think). Hmm. Is the right thing to make sure that we populate thee.EncodedErr
field with anbarriers.Handled(err)
when we populate it? Then the ambiguous result would unwrap tobarriers.Handled(err)
which is fine. But not sure what the point of that is.
re 2:, I'm not sure I understand how this all works. Look at this example:
type myErr struct {
cause error
}
func (err *myErr) Unwrap() error {
return err.cause
}
func (err *myErr) Error() string {
return "myErr"
}
func TestFoo(t *testing.T) {
err := &myErr{
cause: &myErr{
cause: errors.New("i'm ambiguous"),
},
}
t.Error(err)
t.Errorf("%+v", err)
t.Errorf("%+v", errors.Wrap(err, "asd"))
}
This prints:
stopper_test.go:677: myErr
stopper_test.go:678: myErr
stopper_test.go:679: asd: i'm ambiguous
(1) attached stack trace
-- stack trace:
| github.com/cockroachdb/cockroach/pkg/util/stop_test.TestFoo
| /home/tobias/go/src/github.com/cockroachdb/cockroach/pkg/util/stop/stopper_test.go:679
| [...repeated from below...]
Wraps: (2) asd
Wraps: (3)
Wraps: (4)
Wraps: (5) attached stack trace
-- stack trace:
| github.com/cockroachdb/cockroach/pkg/util/stop_test.TestFoo
| /home/tobias/go/src/github.com/cockroachdb/cockroach/pkg/util/stop/stopper_test.go:674
| testing.tRunner
| /usr/local/go/src/testing/testing.go:1123
| runtime.goexit
| /usr/local/go/src/runtime/asm_amd64.s:1374
Wraps: (6) i'm ambiguous
Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *stop_test.myErr (4) *stop_test.myErr (5) *withstack.withStack (6) *errutil.leafError
This looks like unless I use a cockroachdb/errors
supplied wrapper I don't get any nice formatted output for %+v
, which makes sense since I didn't implement any special interface for formatting there.
Besides, I am looking for something here that I can return with the SQL error - when an ambiguous result bubbles up to the client, I would like to, for debugging purposes, log the innermost wrapped error's file:line because without that it's anyone's guess where that came from (and I don't think the output of %+v
is concise enough given that it's client-facing). This is similar to our woes around context canceled
, where unfortunately the innermost wrapper is a singleton & it's therefore harder to do this.
My hope is that you can tell me what I actually want here, and then I'll go and do that. :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andreimatei and @tbg)
pkg/roachpb/errors.go, line 656 at r1 (raw file):
Regarding point 1:
AmbiguousResultError is intentionally opaquely wrapping the error.
ok I did not understand this (or perhaps I had forgotten it) and it explains a lot about the remainder of your approach.
Is the right thing to make sure that we populate the e.EncodedErr field with an barriers.Handled(err) when we populate it? Then the ambiguous result would unwrap to barriers.Handled(err) which is fine. But not sure what the point of that is.
I don't know whether it would be "the right thing" but it would ensure that the error "under" the barrier would be rendered fully under %+v
.
Regarding point 2:
unless I use a cockroachdb/errors supplied wrapper I don't get any nice formatted output for %+v
you can either implement a FormatError()
method as discussed previously, or (e *yourType) Format(f fmt.Printer, r rune) { return errors.FormatError(e, f, r) }
and that will get you there (if that is not properly documented, we can talk)
I would like to, for debugging purposes, log the innermost wrapped error's file:line because without that it's anyone's guess where that came from (and I don't think the output of %+v is concise enough given that it's client-facing)
To be honest, this is more or less the first time that I've heard the requirement that we want to include a reference to the source code in the concise user-facing message (i.e. the one returned by Error()).
If that is truly a requirement, then at the same time I hardly see how that requirement would be unique to AmbiguousError - it feels to me that if you have a compelling reason why this is useful in this case, it's likely to also be useful in many other cases.
Then if it's really a requirement, then I also think that the behavior should be consistent across error types, and that the errors
library should take ownership for doing this somehow.
Then I'd like to understand what is truly required:
- does this need to be present in the string returned by
Error()
, which is going to be embarked in many places besides log messages; or - are we looking for a new formatting verb/flag, e.g.
%#v
or similar, which would print a short reference to the source code in addition to the error message but without details?
Let me know what you think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hey Tobi let's get this show back on the road, cause I want it. Actually, what I want is for AmbiguousResultError
to implement causer
so I can check whether it was caused by ctx deadline exceeded or canceled.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andreimatei and @tbg)
pkg/kv/kvclient/kvcoord/dist_sender.go, line 2032 at r2 (raw file):
default: if ambiguousError != nil { return nil, roachpb.NewAmbiguousResultError(errors.Wrap(ambiguousError, "propagate"))
I think the commits got mixed up and this won't compile without the 2nd commit
Mind taking this over @andreimatei? I'm happy to review but there's a pile of triage on my plate right now and I don't want to get sidetracked (even more) with actually writing this. |
This error type held on to a string message, which is unsuitable for
redaction. In practice, this error opaquely wraps an error, so make this
transparent, which we can now do thanks to
cockroachdb/errors
.It turned out that the proto already had an unused field that was (once,
but no longer) used in testing, so we can use that in production now.
I chose to print the root error's file:line in the error as we typically
don't print errors via
%+v
which is otherwise how these would beexposed. I expect that to be mildly useful to diagnose problems such as
#65701.
I think there is room for a more general mechanism that exposes the root
cause's location in general, though I am unsure how to best introduce it
given that we usually observe errors through SQL, and on many possible
paths especially for rare flakes such as the above, so no steps towards
that are taken at this point.
Release note: None