UI graphs showing more active statements than open transactions #70234

Closed
a-entin opened this issue Sep 15, 2021 · 1 comment
Labels
C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.

Comments

a-entin commented Sep 15, 2021

In DB Console -> Metrics -> SQL of a mostly idling cluster

In an idling cluster where practically all SQL is internal, the number of active statements appears to be greater than the number of open transactions.

[Screenshot: 2021-09-14_103236]

An open transaction can either be executing a statement or doing nothing. So at no point can the number of active (executing) statements exceed the number of open transactions.

Apparently some of the internal transactions are not properly reflected in the metric.
Per Yahor, for “Open SQL Transactions” we distinguish between external and internal txns (i.e. there are sql.txns.open and sql.txns.open.internal metrics), whereas for “Active SQL Statements” we don’t have this distinction. The execution engine is currently not aware of whether a query is external or internal, but the connExecutor (responsible for txn handling) is.

The cleanest and most helpful way to fix the issue might be to rename the current sql.txns.open to a new sql.txns.open.external and add a new sql.txns.open that would be the sum sql.txns.open.internal + sql.txns.open.external.
Or just change the graph so that it charts the current sql.txns.open + sql.txns.open.internal?
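
To illustrate the arithmetic behind either option, here is a minimal sketch. It assumes the two transaction gauges named above can be read as plain numbers from one metrics sample; the helper names and the sample values are hypothetical and not taken from the cluster in the screenshot.

```python
from typing import Mapping


def total_open_txns(sample: Mapping[str, float]) -> float:
    """Return external + internal open transactions for one metrics sample.

    Assumes sql.txns.open counts only external (user) transactions,
    as described in this issue.
    """
    return sample["sql.txns.open"] + sample["sql.txns.open.internal"]


def invariant_holds(sample: Mapping[str, float], active_statements: float) -> bool:
    """Check that active statements do not exceed the total open transactions."""
    return active_statements <= total_open_txns(sample)


# Hypothetical values for a mostly idle cluster: one user transaction,
# several internal ones, and an active-statement count covering both kinds.
sample = {"sql.txns.open": 1.0, "sql.txns.open.internal": 6.0}
active_statements = 4.0

# Compared against the external-only gauge, the chart looks inconsistent.
looks_consistent_today = active_statements <= sample["sql.txns.open"]

# Compared against the combined total, the expected invariant holds.
consistent_with_total = invariant_holds(sample, active_statements)

print(looks_consistent_today, consistent_with_total)
```

With the combined total, the comparison on the chart would be between like quantities, which is the point of both suggestions.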

Environment:

  • CockroachDB version 21.1.7

Additional context
What was the impact?

Astute customers who pay attention to details get slightly confused, specifically when the cluster has a low-concurrency user workload. Note that a low-concurrency user workload does not mean the cluster is not used. On the contrary, we ran into this because the background system jobs drove excessively high CPU usage, impacting the user workload.
So the ability to accurately account for external, internal, and total activity is essential for troubleshooting.

Jira issue: CRDB-9997

@a-entin a-entin added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Sep 15, 2021

THardy98 commented Mar 1, 2022

This issue has been resolved by #75815. We now only track external (i.e. user-initiated) open transactions & active statements.

@THardy98 THardy98 closed this as completed Mar 1, 2022