Specifically catch the 504 Gateway Timeout error on the Statements/Transactions Pages, and show a more informative error message #78979

Closed
jocrl opened this issue Mar 29, 2022 · 0 comments · Fixed by #87153
Labels
A-sql-console-general: SQL Observability issues on the DB console spanning multiple areas. Includes Cockroach Cloud Console
A-sql-observability: Related to observability of the SQL layer
C-enhancement: Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)

Comments

@jocrl
Contributor

jocrl commented Mar 29, 2022

Currently, a 504 Gateway Timeout error on the Statements/Transactions Pages shows only a generic error message (the CSS alignment is already fixed in a later version).

image

The 504 Gateway Timeout is a common enough failure mode that we should show a specific error message for it. This would make it easier to distinguish the root cause of errors.
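
To make the request concrete, here is a minimal sketch of mapping a failed request to a status-specific message. Everything in it is an assumption for illustration: the `RequestFailure` shape, the `describeRequestFailure` function, and the message wording are hypothetical and are not taken from the CockroachDB code base.

```typescript
// Hypothetical types and names; not the DB Console's actual error handling.
interface RequestFailure {
  status?: number; // HTTP status code, present only if a response was received
  message: string; // raw error text from the failed request
}

const GATEWAY_TIMEOUT = 504;

// Returns a user-facing message, special-casing 504 so that timeouts
// can be told apart from other failures at a glance.
function describeRequestFailure(err: RequestFailure): string {
  if (err.status === GATEWAY_TIMEOUT) {
    // Placeholder wording; the real copy would be chosen by the UI team.
    return "The request timed out (504 Gateway Timeout). Try a shorter time interval or retry later.";
  }
  return `An unexpected error occurred: ${err.message}`;
}
```

A caller would pass whatever error object its fetch layer produces, for example `describeRequestFailure({ status: 504, message: "Gateway Timeout" })`. Keeping the mapping in one pure function makes it easy to add other status codes later.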

Jira issue: CRDB-14471

@jocrl jocrl added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) A-sql-observability Related to observability of the SQL layer A-sql-console-general SQL Observability issues on the DB console spanning multiple areas. Includes Cockroach Cloud Console labels Mar 29, 2022
@maryliag maryliag self-assigned this Aug 30, 2022
craig bot pushed a commit that referenced this issue Aug 31, 2022
…87158

85354: sql: notices for NotVisible Indexes r=wenyihu6 a=wenyihu6

The optimizer now supports creating invisible indexes as of this
[PR](#85794). An important use case
for not visible indexes is to test the effect of dropping an index by marking
the index invisible. However, there are certain cases where users cannot expect
dropping an index to behave exactly the same as marking an index invisible. More
specifically, NotVisible indexes may still be used behind the scenes to enforce
unique or foreign key constraint checks. In those cases, dropping the index might
behave differently from marking the index invisible. Prior to this commit, users
would not know about this without reading the documentation. This commit adds
user-friendly notices when users drop or change a not visible index
that might be helping with constraint checks.

There are two cases where we give this notice: 1. if the index is unique;
2. if the index is on a child table and may help with FK checks.

More details on how this decision was made are in
docs/RFCS/20220628_invisible_index.md.

Assists: #72576

See also: #85794

Release justification: low risk to the existing functionality; this commit just
adds notices.

Release note: none

86592: kvserver: rework memory allocation in replicastats r=kvoli a=kvoli

This patch removes some unused fields within the replica stats object.
It also opts to allocate all the memory needed upfront for a replica
stats object for better cache locality and less GC overhead.

This patch also removes locality tracking for the other throughput trackers
to reduce per-replica memory footprint.

resolves #85112

Release justification: low risk, lowers memory footprint to avoid oom.
Release note: None

87024: sql: Prevent primary region being same as secondary region r=rafiss a=e-mbrown

fixes #86879

We found that the primary region could be assigned the same region as the secondary region. This commit adds an error to prevent that.

Release justification: Low risk high benefit change to existing functionality
Release note: None

87110: ui: fixes to high contention copy in insight workload pages r=ericharmeling a=ericharmeling

Previously, the High Contention insight type was labeled
"High Contention Time", and the waiting transactions list
was labeled in the incorrect tense. This commit fixes those
typos.

Release justification: bug fix
Release note: None

87135: build: remove newly-added node_modules/ trees in ui-maintainer-clean r=rickystewart a=sjbarag

A few recent features [1, 2] introduced new node_modules/ trees for
dependencies, but didn't update the ui-maintainer-clean Make target to
remove them. This allowed those directories to leak between TeamCity
builds with Docker user permissions, preventing a `yarn install` in
those packages from properly laying out a `node_modules/.bin` directory
for executables like `tsc`. Remove the recently-introduced
`node_modules/` directories as part of `make ui-maintainer-clean`, to
restore a clean state between jobs.

[1] d28c072 (ui: add eslint-plugin-crdb package with custom eslint rules, 2022-05-27)
[2] c58279d (ui: reintroduce end-to-end UI tests with cypress, 2022-08-12)

Release justification: Non-production code changes

87149: sql: clean up physical planning for system tenant r=yuzefovich a=yuzefovich

This commit audits a couple of methods around the health and version of
DistSQL nodes that are used only for the system tenant, to make that
restriction more explicit. Additionally, it unexports the `NodeStatuses` map
from the planning context and removes some unnecessary short-circuiting
behavior around checking the node health and version (unnecessary
because we already short-circuit in
`checkInstanceHealthAndVersionSystem`).

Release justification: low-risk cleanup.

Release note: None

87153: ui: ux improvements on stmt details page r=maryliag a=maryliag

This commit adds a few improvements and bug fixes:

- Handles the case where we hit a timeout on statement details, so the
page no longer crashes and you can still see the time picker and
select a new time interval.

- Updates the error message to clarify that it was a timeout error, and
increases the timeout from 30s to 30m on the details endpoint.
Fixes #78979

- Updates the last error for statement details with the proper value.
Previously it used the error from the all-statements endpoint
instead of the one specific to that fingerprint ID.

- Adds a message when the page takes longer to load.

- Uses proper count formatting for the execution count.

Release justification: bug fixes and smaller improvements
Release note (ui change): Proper formatting of the execution count
on the Statement Details page.
Increased the timeout for the Statement Details page, which now shows
a proper timeout error when one occurs instead of
crashing the page.
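
The non-crashing behavior described for 87153 can be pictured with a small sketch: the time picker is rendered in every state, and only the page body changes between content, loading, and error states. The types and the `renderDetails` function below are hypothetical and only illustrate the idea. They are not the actual Statement Details component.

```typescript
// Hypothetical state and view shapes, for illustration only.
type DetailsState =
  | { status: "loading"; slow: boolean } // slow: the load has exceeded some threshold
  | { status: "loaded" }
  | { status: "failed"; timedOut: boolean };

type Body = "content" | "loading" | "slow-loading" | "timeout-error" | "generic-error";

interface DetailsView {
  showTimePicker: boolean;
  body: Body;
}

function renderDetails(state: DetailsState): DetailsView {
  // The time picker stays visible in every state, so a new interval
  // can always be selected, even after a timeout.
  const showTimePicker = true;
  switch (state.status) {
    case "loading":
      return { showTimePicker, body: state.slow ? "slow-loading" : "loading" };
    case "loaded":
      return { showTimePicker, body: "content" };
    case "failed":
      return { showTimePicker, body: state.timedOut ? "timeout-error" : "generic-error" };
  }
}
```

Modeling the page as a discriminated union means a timeout is just another state to render rather than an unhandled exception.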

87155: github-post: allow for finding the test in a parent directory of the pkg r=srosenberg,rail a=rickystewart

In some cases the Bazel test runner "incorrectly" reports the package
path for tests. For example, we have [issues](#85376) where the name of
the test is reported as `pkg/.../package/package_test` rather than
`pkg/.../package` as we might expect. I suspect this is confusing
`github-post` when it tries to find tests in the `package_test`
directory rather than the `package` directory.

We address this by allowing `github-post` to search up the directory
tree for the test rather than expecting it to be in one particular
directory.

Also update a repro command to use `dev test` rather than
`make stressrace`.

Closes #85420.

Release justification: Non-production code changes
Release note: None

87156: ci: disable sharding in random syntax tests r=srosenberg a=rickystewart

The different shards were trampling each other's test.json.txt,
preventing failures from being reported accurately.

Release justification: Non-production code changes
Release note: None

87158: sql: clean up node dialer fields r=yuzefovich a=yuzefovich

This commit removes the no-longer-used `nodeDialer` field (for SQL - KV
communication) and renames some of the similarly named fields to
`podNodeDialer` to indicate that it's only a SQL - SQL dialer.

Release justification: low-risk cleanup.

Release note: None

Co-authored-by: wenyihu3 <[email protected]>
Co-authored-by: Austen McClernon <[email protected]>
Co-authored-by: e-mbrown <[email protected]>
Co-authored-by: Eric Harmeling <[email protected]>
Co-authored-by: Sean Barag <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
Co-authored-by: Marylia Gutierrez <[email protected]>
Co-authored-by: Ricky Stewart <[email protected]>
@craig craig bot closed this as completed in 687fc95 Aug 31, 2022
maryliag added a commit to maryliag/cockroach that referenced this issue Aug 31, 2022
2 participants