-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: flattening error objects into other errors erases telemetry data and user hints #42510
Comments
42509: sqlbase: link to issue #42508 in sequence-related errors r=knz a=knz (I intend to back-port this to 19.2 and perhaps 19.1.) Informs #42508. Informs #42510. This patch fixes two things. 1) When the backfiller encounters a sequence operation it fails with a hard error. This and consequences is described further in issue #42508. To inform the user better, this patch links the error message to that issue. Additionally, this also ensures that telemetry usage is reported for the unimplemented feature. For example: ``` [email protected]:39861/movr> alter table t add column y int default nextval('s'); pq: nextval(): unimplemented: cannot backfill such sequence operation HINT: You have attempted to use a feature that is not yet implemented. See: #42508 ``` and ``` > begin; alter table t add column y int default nextval('s'); commit; pq: transaction committed but schema change aborted with error: (0A000): nextval(): unimplemented: cannot backfill such sequence operation HINT: You have attempted to use a feature that is not yet implemented. See: #42508 -- Some of the non-DDL statements may have committed successfully, but some of the DDL statement(s) failed. Manual inspection may be required to determine the actual state of the database. -- See: #42061 ``` 2) The other thing. Prior to this patch, an error "under" a backfill would be flattened into error code 42P15 (invalid schema), even if the schema was valid and the error was really about something encountered during the backfill. This patch fixes it. For example, ``` ALTER TABLE shopping ADD COLUMN c int AS (quantity::int // 0) STORED ``` Would previously error out with code 42P15, whereas the true error is 22012 (division by zero) which is reported now. Release note (sql change): CockroachDB will now properly refer to issue #42508 in the error message hint when a client attempts to add a sequence-based column to an existing table, which is an unimplemented feature. Release note (sql change): CockroachDB will now report an more accurate error message, hint and error code if/when an error is encountered while adding a new column. Co-authored-by: Raphael 'kena' Poss <[email protected]>
We have marked this issue as stale because it has been inactive for |
there's at least one case remaining, so this is still current:
cc @rafiss |
@e-mbrown will be working in this As one of the items, we should change |
70274: sql: address some cases of flattened error objects r=rafiss,ajwerner a=e-mbrown refs #42510 There are some cases of error objects being flattened, which strips them of all detail such as stack traces and user-directed hints. This change address a few of the cases by wrapping the error object. Release note: None Co-authored-by: e-mbrown <[email protected]>
Using the linter in #71877 the following was reported. It seems like the size of the list is pretty manageable. Hopefully we can address these package-by-package and ask the respective team to review.
|
72351: jobs: fix improperly wrapped errors r=ajwerner a=rafiss refs #42510 I'm working on a linter that detects errors that are not wrapped correctly, and it discovered these. Release note: None Co-authored-by: Rafi Shamim <[email protected]>
72352: server,cli: fix improperly wrapped errors r=knz a=rafiss refs #42510 I'm working on a linter that detects errors that are not wrapped correctly, and it discovered these. Release note: None Co-authored-by: Rafi Shamim <[email protected]>
71985: sql: fix comment on constraint for tables in a schema r=e-mbrown a=RichardJCai Release note (sql change): Previously if you did comment on constraint on a table in a schema the command would succeed but the comment would not actually created. Now the comment is successfully created. 72350: sql/*: fix improperly wrapped errors r=ajwerner a=rafiss refs #42510 I'm working on a linter that detects errors that are not wrapped correctly, and it discovered these. Release note: None 72379: build: fix bazel invocation of stress in github-pull-request-make r=rickystewart a=stevendanna Every run of the stress and stressrace bazel CI jobs were failing with: [17:16:00][Run stress tests] /bin/bash: stress: command not found I haven't dug into the Git history enough to know why this was working before. Rather, I've just copied what the `dev` tool does for me and checked that the constructed `bazci` command does in fact work. Fixes #72321 Co-authored-by: richardjcai <[email protected]> Co-authored-by: Rafi Shamim <[email protected]> Co-authored-by: Steven Danna <[email protected]>
69614: spanconfig: introduce spanconfig.KVSubscriber r=irfansharif a=irfansharif KVSubscriber presents a consistent[^1] snapshot of a spanconfig.StoreReader that's incrementally maintained with changes made to the global span configurations state. The maintenance happens transparently; callers can subscribe to learn about what key spans may have seen a configuration change. After learning about a span update, consulting the embedded StoreReader would retrieve an up-to-date[^2] config for it. When a callback is first installed, it's invoked with the [min,max) span -- a shorthand to indicate that callers should consult the StoreReader for all spans of interest. Subsequent updates are of the more incremental kind. It's possible that the span updates received are no-ops, i.e. consulting the StoreReader for the given span would retrieve the last config observed for the span[^2]. ```go type KVSubscriber interface { StoreReader Subscribe(func(updated roachpb.Span)) } ``` Internally we maintain a rangefeed over the global store of span configurations (system.span_configurations), applying updates from it into an embedded spanconfig.Store. A read-only view of this data structure (spanconfig.StoreReader) is exposed as part of the KVSubscriber interface. Rangefeeds used as is don't offer any ordering guarantees with respect to updates made over non-overlapping keys, which is something we care about[^4]. For that reason we make use of a rangefeed buffer, accumulating raw rangefeed updates and flushing them out en-masse in timestamp order when the rangefeed frontier is bumped[^5]. If the buffer overflows (as dictated by the memory limit the KVSubscriber is instantiated with), the subscriber is wound down and an appropriate error is returned to the caller. When running into the errors above, it's safe for the caller to re-subscribe to effectively re-establish the underlying rangefeeds. When re-establishing a new rangefeed and populating a spanconfig.Store using the contents of the initial scan[^6], we wish to preserve the existing spanconfig.StoreReader. Discarding it would entail either blocking all external readers until a new spanconfig.StoreReader was fully populated, or presenting an inconsistent view of the spanconfig.Store that's currently being populated. For new rangefeeds what we do then is route all updates from the initial scan to a fresh spanconfig.Store, and once the initial scan is done, swap at the source for the exported spanconfig.StoreReader. During the initial scan, concurrent readers would continue to observe the last spanconfig.StoreReader if any. After the swap, it would observe the more up-to-date source instead. Future incremental updates will also target the new source. When this source swap occurs, we inform handlers of the need to possibly refresh their view of all configs. This commit also wires up the KVSubscriber into KV stores, replacing the use of the gossiped system config span (possible given the StoreReader interface, only happens if a testing flag/env var is set). [^1]: The contents of the StoreReader at t1 corresponds exactly to the contents of the global span configuration state at t0 where t0 <= t1. If the StoreReader is read from at t2 where t2 > t1, it's guaranteed to observe a view of the global state at t >= t0. [^2]: For the canonical KVSubscriber implementation, this is typically the closed timestamp target duration. [^3]: The canonical KVSubscriber implementation internally re-establishes feeds when errors occur, possibly re-transmitting earlier updates (usually through a lazy [min,max) span) despite possibly not needing to. We could do a bit better and diff the two data structures, emitting only targeted updates. [^4]: For a given key k, it's config may be stored as part of a larger span S (where S.start <= k < S.end). It's possible for S to get deleted and replaced with sub-spans S1...SN in the same transaction if the span is getting split. When applying these updates, we need to make sure to process the deletion event for S before processing S1...SN. [^5]: In our example above deleting the config for S and adding configs for S1...Nwe want to make sure that we apply the full set of updates all at once -- lest we expose the intermediate state where the config for S was deleted but the configs for S1...SN were not yet applied. [^6]: When tearing down the subscriber due to underlying errors, we could also surface a checkpoint to use the next time the subscriber is established. That way we can avoid the full initial scan over the span configuration state and simply pick up where we left off with our existing spanconfig.Store. Release note: None 71239: sql: improve historical descriptor look up efficiency r=jameswsj10 a=jameswsj10 Fixes #70692. The existing implementation for looking up old historical descriptors required multiple round trips to storage. This improvement requires only 1, at most 2, KV calls to storage by using a single ExportRequest. Release note (performance improvement): reduce kv calls when looking up old historical descriptors to 1 or at most 2. 72314: kvserver: trim state used from snapshots r=erikgrinaker a=tbg Snapshots contain a serialized copy of the `ReplicaState`. However, the snapshot itself contains that data already. Having two sources of data that may be interpreted differently can lead to problems, as we saw in [#72239]. This commit deprecates using the entire ReplicaState. Instead, we pick out the descriptor and ignore everything else. Instead of using the copy of the state to initialize the recipient's in-memory state, we now use a state loader. In 22.2 we can only send the descriptor but maybe we won't do that; for logging and debugging it's kind of nice to have everything present. Fixes #72222. [#72239]: #72239 Release note: None 72323: sql: fix excess privileges being created from default privileges. r=RichardJCai a=RichardJCai Release note (bug fix): Previously, when creating an object default privileges from users that were not the user creating the object would be added to the privileges of the object. This fix ensures only the relevant default privileges are applied. Resolves #72322 72346: roachtest: fix gorm roachtest r=rafiss a=RichardJCai Release note: None Fixes #72023 72352: server,cli: fix improperly wrapped errors r=knz a=rafiss refs #42510 I'm working on a linter that detects errors that are not wrapped correctly, and it discovered these. Release note: None 72366: roachtest: update ruby-pg blocklist for 22.1 r=rafiss a=ZhouXing19 Fixes #72316 Release note: None 72390: roachprod: avoid flaky test due to unused functions r=[RaduBerinde,stevendanna,rail] a=healthy-pod Merging #71660 trigerred a flaky test due to unused functions. This patch avoids that test by making use of / commenting out unused functions. Release note: None Co-authored-by: irfan sharif <[email protected]> Co-authored-by: jameswsj10 <[email protected]> Co-authored-by: Tobias Grieger <[email protected]> Co-authored-by: richardjcai <[email protected]> Co-authored-by: Rafi Shamim <[email protected]> Co-authored-by: Jane Xing <[email protected]> Co-authored-by: Ahmad Abedalqader <[email protected]>
72304: bazel: add `pkg/cmd/mirror` r=rail a=rickystewart This tool will be extended with functionality to mirror Go module ZIP's to cloud storage. For now, the mirroring functionality is missing, so we just generate the `DEPS.bzl` content in much the same way as `gazelle update-repos` does. (This doesn't change the content of `DEPS.bzl` in any way besides alphabetizing the contents.) Release note: None 72353: *: fix improperly wrapped errors r=otan,RaduBerinde,stevendanna a=rafiss refs #42510 I'm working on a linter that detects errors that are not wrapped correctly, and it discovered these. Release note: None 72417: sql: add unit tests for creating default privileges r=ajwerner a=RichardJCai Adding some unit test coverage so we don't hit bugs like this again. #72322 Release note: None Co-authored-by: Ricky Stewart <[email protected]> Co-authored-by: Rafi Shamim <[email protected]> Co-authored-by: richardjcai <[email protected]> Co-authored-by: Richard Cai <[email protected]>
70330: util/log: add buffer sink decorator r=knz a=rauchenstein Previously, only the file sink had buffering, and in that case it is built into the sink. It's important to add buffering to network sinks for various reasons -- reducing network chatter, and making the networking call itself asynchronous so the log call returns with very low latency. This change adds a buffering decorator so that buffering can be added to any log sink with little or no development effort, and allowing buffering to be configured in a uniform way. Release note (cli change): Add buffering to log sinks. This can be configured with the new "buffering" field on any log sink provided via the "--log" or "--log-config-file" flags. Release justification: This change is safe because it is a no-op without a configuration change specifically enabling it. 72353: *: fix improperly wrapped errors r=otan,RaduBerinde,stevendanna a=rafiss refs #42510 I'm working on a linter that detects errors that are not wrapped correctly, and it discovered these. Release note: None 72417: sql: add unit tests for creating default privileges r=ajwerner a=RichardJCai Adding some unit test coverage so we don't hit bugs like this again. #72322 Release note: None 72430: kvserver: use wrapper type for Store.mu.replicas r=erikgrinaker a=tbg This simplifies lots of callers and it will also make it easier to work on #72374, where this map will start containing more than one type as value. Release note: None 72432: roachprod: fix `roachprod start` ignoring --binary flag r=[rail,tbg] a=healthy-pod Merging #71660 introduced a bug where roachprod ignores --binary flag when running `roachprod start`. This patch reverts to the old way of setting config.Binary as a quick solution to the bug. Release note: None Fixes #72425 #72420 #72373 #72372 Co-authored-by: Jay Rauchenstein <[email protected]> Co-authored-by: Raphael 'kena' Poss <[email protected]> Co-authored-by: Rafi Shamim <[email protected]> Co-authored-by: richardjcai <[email protected]> Co-authored-by: Richard Cai <[email protected]> Co-authored-by: Tobias Grieger <[email protected]> Co-authored-by: Ahmad Abedalqader <[email protected]>
72348: kv: fix improperly wrapped errors r=arulajmani,nvanbenschoten a=rafiss refs #42510 I'm working on a linter that detects errors that are not wrapped correctly, and it discovered these. Release note: None 72692: server: tidy tenant testing client r=matthewtodd a=matthewtodd Pushing these assertions into the test client removes some of the line noise from the tests, making them easier to read. We also generalize the tenant http client to provide plain text as well as json responses. Release note: None 72717: server: add check for sql instance ID in tenant status server rpc handlers r=lindseyjin a=lindseyjin Previously, most RPC handlers in tenant_status.go did not check if the request was received by the status server before the sql pod had finished its instance ID initialization. This commit adds in those missing checks. Release note: None 72749: cmd/roachtest: fix `--cluster` for new `roachprod` r=ajwerner a=ajwerner The `roachprod list` command no longer takes a positional argument to filter the results. This commit adopts the new `--pattern` flag in `roachtest`. This is important when providing your own cluster for roachtest with the `--cluster` flag. Release note: None 72770: ui: create error Boundary component r=maryliag a=maryliag Previously, when there was an error on the pages on the console it would crash and nothing would load. This commit introduces an error boundary, making an error message show instead of the blank page. <img width="897" alt="Screen Shot 2021-11-15 at 1 47 39 PM" src="https://user-images.githubusercontent.com/1017486/141837360-eeba1e0b-5245-43cf-a0a8-e185f2cfa179.png"> Release note (ui change): New page for when something goes wrong and the page crashes. 72777: sql: disallow ALTER COLUMN TYPE for columns stored in secondary indexes r=ajwerner a=ajwerner It is broken. After the change, the index, after the change, refers to a column which is not in the table. Fixes #72718. Relates to #72771. Release note (sql change): The experimental `ALTER COLUMN TYPE` statement is no longer permitted when the column is stored in a secondary index. Prior to this change, that was the only sort of secondary index membership which was allowed, but the result of the operation was a subtly corrupted table. 72787: util: make FastIntMap return -1 when not found always r=ajwerner a=ajwerner Before this change, it would only return -1 if it had not outgrown the small map. This lead to the oddities uncovered in #72718. Release note: None Co-authored-by: Rafi Shamim <[email protected]> Co-authored-by: Matthew Todd <[email protected]> Co-authored-by: Lindsey Jin <[email protected]> Co-authored-by: Andrew Werner <[email protected]> Co-authored-by: Marylia Gutierrez <[email protected]>
72349: cloud,ccl: fix improperly wrapped errors r=HonoreDB a=rafiss refs #42510 I'm working on a linter that detects errors that are not wrapped correctly, and it discovered these. Release note: None 72609: server: mention cli-side --drain-wait in corresponding server-side settings r=knz a=tbg See (internal) https://github.com/cockroachlabs/support/issues/1318. Release note: None 72703: roachprod: replace fatal logging by propagating errors r=[tbg,rail] a=healthy-pod Fatal logging prevented errors from being properly propagated back to the binary error wrapper. Release note: None 72731: gitignore: ignore `roachprod logs` local log buffers r=rickystewart a=erikgrinaker The `roachprod logs` command fetches cluster logs into the local directory `<cluster>.logs` before displaying them. This patch ignores those logs. Release note: None Co-authored-by: Rafi Shamim <[email protected]> Co-authored-by: Tobias Grieger <[email protected]> Co-authored-by: Ahmad Abedalqader <[email protected]> Co-authored-by: Erik Grinaker <[email protected]>
I found this while looking for something else:
By flattening the error with
err.Error()
this constructor strips the error object of all its details, including stack traces but also user-directed hints, links to documentation issues and telemetry data.Other examples (non exhaustive):
There are multiple ways to fix it:
errors.Wrap[f]
when no need to add a pg codepgerror.WithCandidateCode
when need to add a pg code if there wasn't one already to start witherrors.Combine
orerrors.WithSecondaryError
to combine errors togetherI am fixing the first example at the top in #42509, but the other cases remain to be fixed.
cc @cockroachdb/sql-execution
The text was updated successfully, but these errors were encountered: