Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cluster-ui: handle partial response errors on the databases page #109245

Merged
merged 1 commit into from
Aug 31, 2023

Conversation

THardy98
Copy link

@THardy98 THardy98 commented Aug 22, 2023

Part of: #102386

Demos (Note: these demos show this same logic applied to both the database details and database table pages as well):
DB-Console

Prior to this change, any query error in the reponse payload for the
databases page would cause the page to render an error. This was
problematic as some queries used to power the databases page directly
query system tables, meaning only ADMIN users could access the page.
This change allows the databases page to handle network responses with
query errors, consequently allowing non-admin users to view the data
they are privy to.

On the databases page, two types of requests are made:

  • a single request to fetch all database names
  • a request to fetch the database details for each database name

The error handling for these requests has changed as such:

  • if we encounter a network or a non size-related query error when
    requesting database names, render a page error
  • if we encounter a 'max size' query error when requesting database
    names, render an alert that we're showing partial results
  • if we encounter any error requesting a database's details, render a
    Caution icon next to the database's name to indicate there was an
    issue getting results, the Caution icon has a tooltip providing a
    general explanation as to what the issue is
  • network errors when fetching database details provide no data for the
    database's table row, consequently the row of statistics for that
    database is unavailable, the network error message is provided in the
    Caution icon tooltip
  • query errors when fetching database details are scoped to the row
    cells for that query, which are unavailable
  • unavailable cells have a tooltip that highlight the error for that
    cell as well

Note: the change in planner.go allows users to see span statistics for any table, not just tables a user has SELECT privileges for. This is particularly useful for the databases page where users need to see the span statistics for a database, but only have SELECT privileges on a subset of tables. Regardless, span statistics are gated by VIEWACTIVITY, not SELECT privileges.

Release note (ui change): Allow non-admin users to view the databases
page.

@THardy98 THardy98 requested a review from a team August 22, 2023 15:31
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@THardy98 THardy98 force-pushed the db_page_partial_err branch from a6e02b6 to c20c703 Compare August 22, 2023 16:22
@THardy98 THardy98 requested a review from a team August 22, 2023 16:22
@THardy98 THardy98 force-pushed the db_page_partial_err branch 4 times, most recently from 25aebaa to afeb988 Compare August 23, 2023 13:17
@THardy98 THardy98 added the backport-23.1.x Flags PRs that need to be backported to 23.1 label Aug 23, 2023
@THardy98 THardy98 force-pushed the db_page_partial_err branch 3 times, most recently from b8d7945 to a54e3e4 Compare August 23, 2023 20:59
@THardy98 THardy98 force-pushed the db_page_partial_err branch from a54e3e4 to a99076d Compare August 24, 2023 14:54
THardy98 pushed a commit to THardy98/cockroach that referenced this pull request Aug 24, 2023
Part of: cockroachdb#102386

This change applies the same error handling ideas from cockroachdb#109245 to the
database details page, enabling non-admin users to use the database
details page and providing better transparency to data fetching issues.

Errors encountered while fetching table details can be viewed via the
tooltip provided by the `Caution` icon at the table's name.
`unavailable` cells also provide a tooltip that displays the error
impacting that exact cell.

Release note (ui change): Non-admin users are able to use the database
details page.
@THardy98 THardy98 force-pushed the db_page_partial_err branch 3 times, most recently from a47ccaf to 845c2b0 Compare August 25, 2023 20:31
THardy98 pushed a commit to THardy98/cockroach that referenced this pull request Aug 25, 2023
Part of: cockroachdb#102386

This change applies the same error handling ideas from cockroachdb#109245 to the
database table page, enabling non-admin users to use the database table
page and providing better transparency to data fetching issues.

`unavailable` fields provide a tooltip that displays the error impacting
that field.

Release note (ui change): Non-admin users are able to use the database
table page.
@THardy98 THardy98 force-pushed the db_page_partial_err branch from 845c2b0 to 61cdea0 Compare August 25, 2023 21:38
Copy link
Contributor

@maryliag maryliag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the demo I see the (empty) that we discussed offline about not using, and I don't see it on the code anymore, can you just confirm what it will show on those cases on the UI?
I also added these questions to the videos itself

Reviewed 7 of 12 files at r1, 1 of 1 files at r2, 4 of 4 files at r3, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @THardy98)


pkg/ui/workspaces/cluster-ui/src/api/sqlApi.ts line 220 at r3 (raw file):

    } else {
      // Otherwise, just log.
      console.error(

here you would log the permission errors, right? This will show up on our datadog and increase our error rate, but this is not necessarily an error if is just permission, because that is the expected behaviour, so you can change to console.warn() when is a permission error, so we can still see if necessary, but doesn't increase our errors rate

Copy link
Contributor

@maryliag maryliag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really great!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @THardy98)

@THardy98
Copy link
Author

THardy98 commented Aug 28, 2023

On the demo I see the (empty) that we discussed offline about not using, and I don't see it on the code anymore, can you just confirm what it will show on those cases on the UI?

Yes that is the case. I recorded the demo on another branch that didn't capture the empty -> unavailable change I made to this PR. Cells that are unavailable due to being empty with have a tooltip message saying Empty result

Part of: cockroachdb#102386

Prior to this change, any query error in the reponse payload for the
databases page would cause the page to render an error. This was
problematic as some queries used to power the databases page directly
query `system` tables, meaning only `ADMIN` users could access the page.
This change allows the databases page to handle network responses with
query errors, consequently allowing non-admin users to view the data
they are privy to.

On the databases page, two types of requests are made:
- a single request to fetch all database names
- a request to fetch the database details for each database name

The error handling for these requests has changed as such:
- if we encounter a network or a non size-related query error when
  requesting database names, render a page error

- if we encounter a 'max size' query error when requesting database
  names, render an alert that we're showing partial results

- if we encounter any error requesting a database's details, render a
  `Caution` icon next to the database's name to indicate there was an
issue getting results, the `Caution` icon has a tooltip providing a
general explanation as to what the issue is

- network errors when fetching database details provide no data for the
  database's table row, consequently the row of statistics for that
database is `unavailable`, the network error message is provided in the
`Caution` icon tooltip

- query errors when fetching database details are scoped to the row
  cells for that query, which are `unavailable`

- `unavailable` cells have a tooltip that highlight the error for that
  cell as well

Release note (ui change): Allow non-admin users to view the databases
page.
@THardy98 THardy98 force-pushed the db_page_partial_err branch from 61cdea0 to d77bee0 Compare August 28, 2023 14:17
Copy link
Author

@THardy98 THardy98 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @maryliag)


pkg/ui/workspaces/cluster-ui/src/api/sqlApi.ts line 220 at r3 (raw file):

Previously, maryliag (Marylia Gutierrez) wrote…

here you would log the permission errors, right? This will show up on our datadog and increase our error rate, but this is not necessarily an error if is just permission, because that is the expected behaviour, so you can change to console.warn() when is a permission error, so we can still see if necessary, but doesn't increase our errors rate

Good point, done.

@THardy98 THardy98 requested a review from maryliag August 29, 2023 13:42
Copy link
Contributor

@maryliag maryliag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 1 files at r4, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @THardy98)

@THardy98
Copy link
Author

TYFR

@THardy98
Copy link
Author

bors r+

@craig
Copy link
Contributor

craig bot commented Aug 31, 2023

Build succeeded:

@craig craig bot merged commit 06a506c into cockroachdb:master Aug 31, 2023
@blathers-crl
Copy link

blathers-crl bot commented Aug 31, 2023

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating merge commit from d77bee0 to blathers/backport-release-23.1-109245: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 23.1.x failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

THardy98 pushed a commit to THardy98/cockroach that referenced this pull request Sep 11, 2023
Part of: cockroachdb#102386

This change applies the same error handling ideas from cockroachdb#109245 to the
database details page, enabling non-admin users to use the database
details page and providing better transparency to data fetching issues.

Errors encountered while fetching table details can be viewed via the
tooltip provided by the `Caution` icon at the table's name.
`unavailable` cells also provide a tooltip that displays the error
impacting that exact cell.

Release note (ui change): Non-admin users are able to use the database
details page.
craig bot pushed a commit that referenced this pull request Sep 11, 2023
108175: kvserver: unskip lease preferences during outage  r=andrewbaptist a=kvoli

Previously, `TestLeasePreferenceDuringOutage` would force replication
queue processing of the test range, then assert that the range
up-replicated and lease transferred to a preferred locality.

This test was skipped, and two of the assumptions it relied on to pass
were no longer true.

After #85219, the replicate queue no longer
re-processes replicas. Instead, the queue requeues replicas after
processing, at the appropriate priority. This broke the test due to the
replicate queue being disabled, making the re-queue a no-op.

After #94023, the replicate queue no longer looked for lease transfers,
after processing a replication action. Combined with #85219, the queue
would now be guaranteed to not process both up-replication and lease
transfers from a single enqueue.

Update the test to not require a manual process, instead using a queue
range filter, which allows tests which disable automatic replication, to
still process filtered ranges via the various replica queues. Also,
ensure that the non-stopped stores are considered live targets, after
simulating an outage (bumping manual clocks, stopping servers) -- so
that the expected up-replication, then lease transfer can proceed.

Fixes: #88769
Release note: None

109432: cluster-ui: handle partial response errors on the database details page r=THardy98 a=THardy98

Part of: #102386

**Demos** (Note: these demos show this same logic applied to both the databases and database table pages as well):
DB-Console
- https://www.loom.com/share/5108dd655ad342f28323e72eaf68219c
- https://www.loom.com/share/1973383dacd7494a84e10bf39e5b85a3

This change applies the same error handling ideas from #109245 to the
database details page, enabling non-admin users to use the database
details page and providing better transparency to data fetching issues.

Errors encountered while fetching table details can be viewed via the
tooltip provided by the `Caution` icon at the table's name.
`unavailable` cells also provide a tooltip that displays the error
impacting that exact cell.

Release note (ui change): Non-admin users are able to use the database
details page.

110292: c2c: use seperate spanConfigEventStreamSpec in the span config event stream r=stevendanna a=msbutler

Previously, the spanConfigEventStream used a streamPartitionSpec, which contained a bunch of fields unecessary for span config streaming. This patch creates a new spanConfigEventStreamSpec which contains the fields only necessary for span config event streaming.

Informs #109059

Release note: None

110309: teamcity-trigger: ensure that `race` tag is only passed once r=healthy-pod a=healthy-pod

By running under `-race`, the go command defines the `race` build tag for us [1].

Previously, we defined it under `TAGS` to let the issue poster know that this is a failure under `race` and indicate that in the issue.

At the time, defining the tag twice didn't cause issues but after #109773, it led to build failures [2].

To reproduce locally:
```
bazel test -s --config=race pkg/util/ctxgroup:all --test_env=GOTRACEBACK=all --define gotags=bazel,gss,race
```

As a follow-up, we should find another way to let the issue poster know that a failure was running under `race`.

[1] https://go.dev/doc/articles/race_detector#Excluding_Tests
[2] #109994 (comment)

Epic: none
Release note: None

Co-authored-by: Austen McClernon <[email protected]>
Co-authored-by: Thomas Hardy <[email protected]>
Co-authored-by: Michael Butler <[email protected]>
Co-authored-by: healthy-pod <[email protected]>
blathers-crl bot pushed a commit that referenced this pull request Sep 11, 2023
Part of: #102386

This change applies the same error handling ideas from #109245 to the
database details page, enabling non-admin users to use the database
details page and providing better transparency to data fetching issues.

Errors encountered while fetching table details can be viewed via the
tooltip provided by the `Caution` icon at the table's name.
`unavailable` cells also provide a tooltip that displays the error
impacting that exact cell.

Release note (ui change): Non-admin users are able to use the database
details page.
THardy98 pushed a commit to THardy98/cockroach that referenced this pull request Sep 11, 2023
Part of: cockroachdb#102386

This change applies the same error handling ideas from cockroachdb#109245 to the
database details page, enabling non-admin users to use the database
details page and providing better transparency to data fetching issues.

Errors encountered while fetching table details can be viewed via the
tooltip provided by the `Caution` icon at the table's name.
`unavailable` cells also provide a tooltip that displays the error
impacting that exact cell.

Release note (ui change): Non-admin users are able to use the database
details page.
THardy98 pushed a commit to THardy98/cockroach that referenced this pull request Sep 11, 2023
Part of: cockroachdb#102386

This change applies the same error handling ideas from cockroachdb#109245 to the
database table page, enabling non-admin users to use the database table
page and providing better transparency to data fetching issues.

`unavailable` fields provide a tooltip that displays the error impacting
that field.

Release note (ui change): Non-admin users are able to use the database
table page.
THardy98 pushed a commit to THardy98/cockroach that referenced this pull request Sep 12, 2023
Part of: cockroachdb#102386

This change applies the same error handling ideas from cockroachdb#109245 to the
database table page, enabling non-admin users to use the database table
page and providing better transparency to data fetching issues.

`unavailable` fields provide a tooltip that displays the error impacting
that field.

Release note (ui change): Non-admin users are able to use the database
table page.
THardy98 pushed a commit to THardy98/cockroach that referenced this pull request Sep 13, 2023
Part of: cockroachdb#102386

This change applies the same error handling ideas from cockroachdb#109245 to the
database table page, enabling non-admin users to use the database table
page and providing better transparency to data fetching issues.

`unavailable` fields provide a tooltip that displays the error impacting
that field.

Release note (ui change): Non-admin users are able to use the database
table page.
craig bot pushed a commit that referenced this pull request Sep 14, 2023
109521: cluster-ui: handle partial response errors on the database table page r=THardy98 a=THardy98

Part of: #102386

**Demos** (Note: these demos show this same logic applied to both the databases and database details pages as well):
DB-Console
- https://www.loom.com/share/5108dd655ad342f28323e72eaf68219c
- https://www.loom.com/share/1973383dacd7494a84e10bf39e5b85a3

This change applies the same error handling ideas from #109245 to the
database table page, enabling non-admin users to use the database table
page and providing better transparency to data fetching issues.

`unavailable` fields provide a tooltip that displays the error impacting
that field.

Release note (ui change): Non-admin users are able to use the database
table page.

109788: docs-issue-generation: Create docs issues in Jira r=nickvigilante a=nickvigilante

Fixes DOC-7066

Additional fixes:

- Update the authorization scheme for GitHub to `Bearer`
- Divide `docs_issue_generation.go` into logical package files

Release note: None

110288: roachtest: codify longer ttl external storage buckets r=renatolabs a=msbutler

Previously, the only codified external buckets for roachtests to back up to was
the `cockroachdb-backup-testing` buckets in s3 and gcs which each had a ttl of
1 day. This low ttl is not suitable for roachtests that produce backups that
the test failure investigator may want to inspect. This patch codifies the new
`cockroachdb-backup-testing-long-ttl` buckets in s3 and gcs, which currently
have a ttl of 20 days, the same ttl that team city artifacts have.

This patch also points the c2c, backup-restore/mixed-version, and
disagg-rebalance roachtests to use these new buckets.

Note this PR only points roachtests that run in public TC environments to the
new buckets. A future PR will set the BACKUP_TESTING_BUCKET_LONG_TTL env var
for private roacttests to a new bucket with a longer ttl.

Epic: none

Release note: none


110610: parser/backup: fix normalization of detached = false r=dt a=dt

Release note: none.
Epic: none.

110638: roachtest: add c2c/multiregion/SameRegions/kv0 roachtest r=stevendanna a=msbutler

This patch adds a new c2c roachtest that spins up multiregion source and destination clusters, constrains the kv database to the us-east1-b region, and asserts that the replicated span configuration enforces the regional constraint on the destination cluster during replication.

Informs #109059

Release note: None

Co-authored-by: Thomas Hardy <[email protected]>
Co-authored-by: Nick Vigilante <[email protected]>
Co-authored-by: Michael Butler <[email protected]>
Co-authored-by: David Taylor <[email protected]>
THardy98 pushed a commit to THardy98/cockroach that referenced this pull request Sep 14, 2023
Part of: cockroachdb#102386

This change applies the same error handling ideas from cockroachdb#109245 to the
database table page, enabling non-admin users to use the database table
page and providing better transparency to data fetching issues.

`unavailable` fields provide a tooltip that displays the error impacting
that field.

Release note (ui change): Non-admin users are able to use the database
table page.
THardy98 pushed a commit to THardy98/cockroach that referenced this pull request Sep 15, 2023
Part of: cockroachdb#102386

This change applies the same error handling ideas from cockroachdb#109245 to the
database table page, enabling non-admin users to use the database table
page and providing better transparency to data fetching issues.

`unavailable` fields provide a tooltip that displays the error impacting
that field.

Release note (ui change): Non-admin users are able to use the database
table page.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-23.1.x Flags PRs that need to be backported to 23.1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants