Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: internal retries cause wacky statement timings #50108

Closed
solongordon opened this issue Jun 11, 2020 · 1 comment · Fixed by #53846
Closed

sql: internal retries cause wacky statement timings #50108

solongordon opened this issue Jun 11, 2020 · 1 comment · Fixed by #53846
Assignees
Labels
A-sql-executor SQL txn logic A-sql-ui Why is my query slow? C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.

Comments

@solongordon
Copy link
Contributor

Internal retries appear to screw up our accounting for statement timings. It seems like the total latency is attributed to all of the retries (or something along these lines), which can result in a large "Overhead" bucket and an absurdly large "Total Time" value. For example, this was the result of running SELECT crdb_internal.force_retry('5s') a single time.

Screen Shot 2020-06-11 at 1 05 31 PM

Screen Shot 2020-06-11 at 1 05 02 PM

@piyush-singh, do you have any thoughts about how internal retries should be accounted for in statement timings? It feels to me like this example should have been counted as only one execution of the statement, with the internal retry time categorized into its own bucket. Though I'm not sure how tricky it is to determine what time should count as "retry time".

@solongordon solongordon added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. A-sql-executor SQL txn logic A-sql-ui Why is my query slow? labels Jun 11, 2020
@solongordon solongordon self-assigned this Jun 11, 2020
@piyush-singh
Copy link

It feels to me like this example should have been counted as only one execution of the statement, with the internal retry time categorized into its own bucket.

That would make sense to me. These internal retries happen transparently to the end user, and the end user probably only cares about how long it took to execute the statement once.

craig bot pushed a commit that referenced this issue Sep 3, 2020
53800: server: defer gossip start to when absolutely needed r=irfansharif a=irfansharif

This was pulled out of #52526 to keep the diff there focussed on the
introduction of the RPC (and to see if that alone shaked out any
failures). That change lets us make this one, were we can defer gossip
start until right before we start up Node.

Release justification: low risk, high benefit change to existing functionality
Release note: None

53846: sql: fix inflated "overhead" in statement timings r=solongordon a=solongordon

Fixes #40675
Fixes #50108

Release justification: Low-risk, high-reward fix to existing
functionality.

Release note (admin ui change): In some cases, the Execution Stats page
would show a confusingly high Overhead latency for a statement. This
could happen due to multiple statements being parsed together or due to
statement execution being retried. To avoid this, we stop considering
the time between when parsing ends and execution begins when determining
service latency.

53906: roachtest: Unskip disk-full roachtest r=tbg a=itsbilal

The disk-full roachtest no longer fails now that Pebble is the
default storage engine. Remove the skipped marker.

Fixes #35328.

Release justification: Roachtest-only change.
Release note: None.

53907: geos: bump to include seg fault fix r=sumeerbhola a=otan

Resolves #53254.

Release justification: bug fix to new functionality

Release note: None

Co-authored-by: irfan sharif <[email protected]>
Co-authored-by: Solon Gordon <[email protected]>
Co-authored-by: Bilal Akhtar <[email protected]>
Co-authored-by: Oliver Tan <[email protected]>
@craig craig bot closed this as completed in 6371d37 Sep 3, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-sql-executor SQL txn logic A-sql-ui Why is my query slow? C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants