Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

*: demonstrate uses of the errors library #37765

Closed
wants to merge 0 commits into from

Conversation

knz
Copy link
Contributor

@knz knz commented May 23, 2019

This PR adapts various places in CockroachDB to demonstrate benefits of the errors library introduced in #36987 / #37121.

The main goals of this PR is to remove most usages of pgerror.Error, and replace them by regular error objects.

The benefit of this change is that all errors details are then preserved throughout distsql execution.

This is achieved by the following high level steps:

  1. the introduction of a new function pgerror.Flatten() that can convert a regular error into a *pgerror.Error.
    This step occurs in a commit in the PR near the beginning, called "pgerror: add Flatten() to convert errors into pgerror.Error"

    This function is meant to be used at "boundaries" only:

    • when sending an error packet to a SQL client
    • when reporting a distsql error to a v19.1 sql gateway
  2. changing the implementation of the pgerror.XXX() constructor functions to use the new library's constructors instead.
    This step is achieved by another commit in the PR near the end, called "pgwire, distsqlpb: let native errors flow through"

    From that point, most of the in-flight error objects inside CockroachDB are not instances of pgerror.Error any more.
    (The exception is a pgerror.Error object received on a distsql flow from a v19.1 node).
    This is thus also the moment where calls to pgerror.Flatten() really become necessary at the aforementioned boundaries.

The end result is that most of the "interior" of CockroachDB sees plain error objects and cannot expect a *pgerror.Error object any more.
The remaining commits in this PR are necessary to make these thing work together well:

  • the few commits towards the beginning, prior to step 1 are ancillary commits that ensure that SQL errors are produced where appropriate.

    (two actually belong to other PRs sql/row: use assertion failures instead of panics #37802 sem/tree: avoid deconstructing the error when parsing a datum #37806 and will disappear when those PRs are merged)

  • the commits in-between steps 1 and 2 changes the "interior" code to not rely on the presence of a pgerror.Error object in the causal chain.

    These commits must occur after step 1 because they require the presence of a "pg error code" which is computed by Flatten()'s companion function GetPGCode().

  • the commits after step 2 are mostly cosmetic and remove more dependencies on pgerror.Error in other places.

@knz knz requested review from a team May 23, 2019 10:04
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@knz knz force-pushed the 20190523-errlib-diffs branch from d11cad5 to 0f5c518 Compare May 23, 2019 10:21
@knz knz force-pushed the 20190523-errlib branch from 7dca358 to 3401c79 Compare May 23, 2019 10:35
@knz knz force-pushed the 20190523-errlib-diffs branch from 0f5c518 to 9e6c9cb Compare May 23, 2019 10:36
@knz knz requested a review from a team May 23, 2019 10:36
@knz knz force-pushed the 20190523-errlib-diffs branch from 9e6c9cb to 066cab0 Compare May 23, 2019 13:25
@knz knz requested review from a team May 23, 2019 13:25
@knz knz force-pushed the 20190523-errlib branch 2 times, most recently from ae7ea98 to 88c0dd5 Compare May 23, 2019 14:50
@knz knz force-pushed the 20190523-errlib-diffs branch from 066cab0 to 05b9ce8 Compare May 23, 2019 14:51
@knz knz requested review from a team May 23, 2019 14:51
@knz knz force-pushed the 20190523-errlib-diffs branch from 05b9ce8 to 55b6d56 Compare May 23, 2019 14:52
@knz knz force-pushed the 20190523-errlib-diffs branch from 55b6d56 to 0414b1e Compare May 23, 2019 16:40
@knz knz force-pushed the 20190523-errlib branch from 88c0dd5 to 3f2dbed Compare May 23, 2019 16:41
@knz knz force-pushed the 20190523-errlib-diffs branch from 0414b1e to 920e14c Compare May 23, 2019 17:00
@knz knz force-pushed the 20190523-errlib branch from 3f2dbed to bb8691e Compare May 23, 2019 17:21
@knz knz force-pushed the 20190523-errlib-diffs branch 3 times, most recently from adff1c1 to ee625e8 Compare May 23, 2019 17:57
@knz knz force-pushed the 20190523-errlib branch from bb8691e to 86acaf9 Compare May 24, 2019 11:41
@RaduBerinde
Copy link
Member

RaduBerinde commented May 25, 2019

I tried but I got stuck. There is an inversion of control in the error generation from the LALR parsing code. I'll take suggestions!

It's not a big deal if it's not easy. One idea would be to use WithDetailf instead of Wrapf("at or near"). That also makes more sense conceptually, as we are providing detail about the error.

By the way - WithDetailf is a bit awkward - it looks like we have to prepend a space or the regular error message is bad but then the extra space shows up over pgwire (e.g. DETAIL: at or near ",") [edit: there are two spaces before at, github doesn't show it right)

Another thing - when looking through the WithXX functions available, it was hard to understand whether the info I'm adding would be visible if you just print the error (in general, it's hard to know how the info would show up in various contexts, like printing or going over pgwire). It would be good to add to the comments for these.

Copy link
Contributor

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just timid review beginnings from me here and in #37121.
My excitement is high!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @knz)


pkg/sql/pgwire/pgerror/flatten.go, line 139 at r4 (raw file):

			return e.Code, true
		// Special roachpb errors get a special code.
		case roachpb.ClientVisibleRetryError:

this ClientVisibleRetryError interface and the fact that we're assigning a pg code to both errors that implement it gives me pause.
TransactionRestartWithProtoRefresh errors can make their way to the user and they need this error code.
UnhandledRetryableError do not make their way to the client, and so then one would think that they don't need a code. So why exactly are you giving it a code here? I thought it has to be because of DistSQL communication with 19.1 gateways, but that doesn't appear to be the case; I see in r14 that you're not using Flatten() for UnhandledRetryableError.

So I guess my question is - can we recognize TransactionRestartWithProtoRefresh and just use the SerializationFailure code for that?

// MakeDateFromUnixEpoch creates a Date from the number of days since the
// Unix epoch.
func MakeDateFromUnixEpoch(days int64) (Date, error) {
days, ok := arith.SubWithOverflow(days, unixEpochToPGEpoch)
if !ok || days <= math.MinInt32 || days >= math.MaxInt32 {
return Date{}, pgerror.New(pgerror.CodeDatetimeFieldOverflowError,
"date is out of range")
return Date{}, errDateOutOfRange
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we care that this breaks stack printing? That is, the error's stack is not from where it happened, but from where the variable was init'd.

@yuzefovich
Copy link
Member

sql/exec commit looks good to me.

@knz knz force-pushed the 20190523-errlib branch from 5e9a19a to 2750e62 Compare June 4, 2019 08:50
@knz knz force-pushed the 20190523-errlib-diffs branch from eddb962 to 53e49e4 Compare June 4, 2019 08:51
@knz knz force-pushed the 20190523-errlib branch from 2750e62 to f195b5d Compare June 4, 2019 09:46
@knz knz force-pushed the 20190523-errlib-diffs branch from 53e49e4 to d93ccf7 Compare June 4, 2019 09:47
Copy link
Contributor Author

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei and @mjibson)


pkg/sql/pgwire/pgerror/flatten.go, line 139 at r4 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

this ClientVisibleRetryError interface and the fact that we're assigning a pg code to both errors that implement it gives me pause.
TransactionRestartWithProtoRefresh errors can make their way to the user and they need this error code.
UnhandledRetryableError do not make their way to the client, and so then one would think that they don't need a code. So why exactly are you giving it a code here? I thought it has to be because of DistSQL communication with 19.1 gateways, but that doesn't appear to be the case; I see in r14 that you're not using Flatten() for UnhandledRetryableError.

So I guess my question is - can we recognize TransactionRestartWithProtoRefresh and just use the SerializationFailure code for that?

Good point. I changed the interface declarations in a separate PR: #37998
(An interface is needed here nonetheless because of a dependency cycle)


pkg/util/timeutil/pgdate/pgdate.go, line 118 at r15 (raw file):

Previously, mjibson (Matt Jibson) wrote…

Do we care that this breaks stack printing? That is, the error's stack is not from where it happened, but from where the variable was init'd.

Good point. Fixed.

@knz knz force-pushed the 20190523-errlib-diffs branch from d93ccf7 to 5fe37e7 Compare June 4, 2019 10:16
@maddyblue
Copy link
Contributor

LGTM

@knz
Copy link
Contributor Author

knz commented Jun 4, 2019

@andreimatei FYI this PR has just 3 remaining commits that need a review:

  • sql: use the errors library in the schema changer
  • *: remove most references to GetPGCause()
  • pgwire, distsqlpb: let native errors flow through

I'll be especially interested in your opinion on the first of these 3. The 3rd we have discussed a month or two ago so there is no surprise. The middle one is trivial/mechanical.

@knz knz force-pushed the 20190523-errlib branch from f195b5d to 55a823c Compare June 5, 2019 16:04
@knz knz force-pushed the 20190523-errlib-diffs branch from 5fe37e7 to 3b70f6d Compare June 5, 2019 16:05
Copy link
Contributor

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei, @knz, and @mjibson)


pkg/sql/schema_changer.go, line 195 at r26 (raw file):

	// The Backfill will grab a new timestamp to read at for the rest
	// of the backfill.
	// TODO(knz): this should really use errors.Is(). However until/unless

Have you considered doing gating the use of errors.Is() here on a cluster version check? That'll remind us to remove the old code next release.


pkg/sql/schema_changer.go, line 219 at r26 (raw file):

	// TODO(knz): This should really use the errors library with
	// a custom marker.
	if _, ok := errors.UnwrapAll(err).(errTableVersionMismatch); ok {

Why not errors.Is() here?
Please put more words in this comment; as written I don't think it'll mean much to people (it doesn't even mean much to me :P)


pkg/sql/schema_changer.go, line 840 at r26 (raw file):

			// As we are dismissing the error, go through the recording motions.
			// This ensures that any important error gets reported to Sentry, etc.
			sqltelemetry.RecordError(ctx, err, &sc.settings.SV)

do we still need the logging above if we're doing RecordError()? Seems to me that RecordError could log if it doesn't already.


pkg/sql/schema_changer.go, line 865 at r26 (raw file):

			log.Warningf(ctx, "while reasing schema change lease: %+v", err)
			// Go through the recording motions. See comment above.
			sqltelemetry.RecordError(ctx, err, &sc.settings.SV)

same about the logging above

Copy link
Contributor

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've looked at those 3 commits.
LGTM

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @knz and @mjibson)


pkg/cli/sql.go, line 1479 at r28 (raw file):

// decomposition in the first return value. If it is not, the function
// assembles a pgerror.Error with suitable Detail and Hint fields.
func (c *cliState) serverSideParse(sql string) (stmts []string, pgErr *pgerror.Error) {

I was gonna ask if this method could stop using pgErr but I see you've done that in a future PR :plause:


pkg/sql/distsqlpb/data.go, line 149 at r28 (raw file):

	resErr.FullError = &fullError

	// Now populate compatibility fields for 19.1 nodes.

consider gating this code on a cluster version check for 19.2


pkg/sql/distsqlpb/data.go, line 171 at r28 (raw file):

	if e.FullError != nil {
		// If there's a 19.1-forward full error, decode and use that.

you mean 19.2 here I think


pkg/sql/distsqlpb/data.proto, line 58 at r28 (raw file):

  // 19.1 nodes and further are expected to use full_error directly where
  // all error decorations are preserved.
  optional errorspb.EncodedError full_error = 3;

writing this field must have been satisfying :)


pkg/sql/pgwire/pgerror/errors.proto, line 52 at r28 (raw file):

  //   populated with the first key found in errors.GetTelemetryKeys().
  //
  // After 19.1 is outphased, this field can be removed.

nit: just say "TODO: remove in 19.3" here and elsewhere

@knz knz force-pushed the 20190523-errlib branch from 55a823c to 481c873 Compare June 9, 2019 08:12
@knz knz force-pushed the 20190523-errlib-diffs branch from 3b70f6d to 39d834a Compare June 9, 2019 08:12
@knz knz force-pushed the 20190523-errlib branch 2 times, most recently from 56bd140 to a280a4f Compare June 9, 2019 10:21
@knz knz force-pushed the 20190523-errlib-diffs branch from 39d834a to c6acd56 Compare June 9, 2019 10:21
@knz knz force-pushed the 20190523-errlib branch from a280a4f to b614153 Compare June 9, 2019 15:20
@knz knz force-pushed the 20190523-errlib-diffs branch from c6acd56 to f67019c Compare June 9, 2019 15:20
@knz knz force-pushed the 20190523-errlib branch from b614153 to 4737917 Compare June 10, 2019 11:20
@knz knz closed this Jun 10, 2019
@knz knz force-pushed the 20190523-errlib-diffs branch from f67019c to 4737917 Compare June 10, 2019 11:20
@knz
Copy link
Contributor Author

knz commented Jun 10, 2019

Somehow github has decided that it did not like this PR any more: I just force-pushed a new version and this auto-closed the PR. Now github prevents me from re-opening the PR because "there are no new commits". Yet the base branch and the new branch are clearly different.

No idea how to fix this, so I opened a new PR: #38127

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants