
sql: surface KV contention statistics #55243

Closed · 6 tasks done
asubiotto opened this issue Oct 6, 2020 · 6 comments
Labels
A-kv-observability, A-sql-observability, C-enhancement, meta-issue

Comments

asubiotto (Contributor) commented Oct 6, 2020

A high priority for the KV team this release (21.1) is to make it easier for users to understand what is happening with their queries at the KV layer. Since the SQL Execution team is also responsible for query observability, a goal for 21.1 is to surface contention statistics provided by the KV team.

At a general level, we aim to surface these contention statistics in two ways:

  1. On a per-statement/transaction level accessed through the transactions page.
  2. On a cluster level, using a “contention heatmap”.

Design Proposal

This section outlines the work needed on the backend. It does not explicitly lay out the frontend work, but uses the customer needs outlined above as a reference to guide it.

At a basic level, SQL Execution will have access to the following protobuf, emitted after completion of a statement:

type ContentionEvent struct {
  Key roachpb.Key         // the contended key
  Txn roachpb.Transaction // the “other” transaction, i.e. the transaction that the current transaction is contending with
  Duration time.Duration  // time spent contending on the key against that txn in that one instance
  Outcome Outcome         // enum describing the result of the conflict (e.g. whether the other txn was pushed)
}

There is more detail about how this protobuf will be produced in #55583.

Per-statement

These contention events will be used to enrich existing EXPLAIN ANALYZE plans. Any operator that sends KV requests will intercept these contention events, keep track of the cumulative time spent contending, and output that time as another execution stat. These contention seconds can be aggregated across a whole query in the same way as other top-level stats, and they yield the percentage of time the query spent contending with other transactions: the ratio of cumulative contention time to the query’s total cumulative time (cumulative execution time + cumulative disk time + cumulative network time). This stat will be surfaced through EXPLAIN ANALYZE and the respective statement’s page.

The operator will also store the IDs of the transactions the statement contended with. These will not be displayed in EXPLAIN ANALYZE; they will only be surfaced to the statements page, with enough information to display a list in which each row links to the page of the transaction that was contending with that statement. TODO(asubiotto): It’s unclear at this moment how to tie a transaction ID back to a transaction page, since the transactions in the admin UI seem to be keyed by a hash of statement strings, but I haven’t looked into it too much.
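
To make that bookkeeping concrete, here is a minimal Go sketch. Aside from ContentionEvent (defined above) and the assumption that roachpb.Transaction exposes a UUID-typed ID, all names here are hypothetical and not existing CockroachDB APIs.

import (
  "time"

  "github.com/cockroachdb/cockroach/pkg/util/uuid"
)

// contentionStats is a hypothetical per-operator accumulator for the
// ContentionEvents intercepted while issuing KV requests.
type contentionStats struct {
  cumContentionTime time.Duration
  // contendingTxns deduplicates the IDs of the transactions this statement
  // contended with, for display on the statements page.
  contendingTxns map[uuid.UUID]struct{}
}

// record folds a single contention event into the accumulator.
func (s *contentionStats) record(ev ContentionEvent) {
  s.cumContentionTime += ev.Duration
  if s.contendingTxns == nil {
    s.contendingTxns = make(map[uuid.UUID]struct{})
  }
  s.contendingTxns[ev.Txn.ID] = struct{}{}
}

// contentionPct returns the percentage of the query’s cumulative time
// (execution + disk + network) that was spent contending.
func contentionPct(contention, execution, disk, network time.Duration) float64 {
  total := execution + disk + network
  if total == 0 {
    return 0
  }
  return 100 * float64(contention) / float64(total)
}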

Global View

Global contention statistics show the operator the state of contention in a cluster. Since each statement provides a small piece of this puzzle, the contention events will be sent back to the gateway as metadata and processed by the DistSQL receiver on that gateway, similar to how the range cache is updated when a misplanned range is encountered.
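
For concreteness, a rough sketch of the shape this plumbing could take. The type, field, and function names below are hypothetical and do not reflect the actual DistSQL metadata APIs; the registry is the one sketched under Contention Map below.

import "github.com/cockroachdb/cockroach/pkg/roachpb"

// contentionMetadata is a hypothetical trailing-metadata payload carrying the
// contention events a processor observed while executing its part of the plan.
type contentionMetadata struct {
  Events []ContentionEvent
}

// distSQLReceiver stands in for the gateway-side receiver; only the field
// relevant to this sketch is shown.
type distSQLReceiver struct {
  contentionRegistry *contentionRegistry
}

// tableIDForKey is a placeholder for decoding the table ID from the key’s
// prefix; a real implementation would use the keys package.
func tableIDForKey(k roachpb.Key) uint32 {
  return 0 // placeholder
}

// pushMetadata sketches the gateway-side handling: contention events are
// folded into the node-local registry, analogous to how misplanned-range
// metadata is used to update the range cache.
func (r *distSQLReceiver) pushMetadata(meta interface{}) {
  if m, ok := meta.(*contentionMetadata); ok {
    for _, ev := range m.Events {
      r.contentionRegistry.addEvent(tableIDForKey(ev.Key), ev)
    }
  }
}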

Contention View

The contention view is primarily a time series metric that gives the operator the data to determine whether the cluster is suffering from contention, but no more. The goal of this view is to be a litmus test for whether further investigation is needed by looking at the contention map and specific statement contention data. The proposal is to surface the number of contended queries per second. This should be relatively simple to plumb down to the DistSQL receiver as a timeseries metric that is incremented whenever a query receives at least one contention event.
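
A minimal sketch of that counter, using a bare atomic in place of CockroachDB’s actual metric registration (not shown here): the DistSQL receiver bumps it once per query that saw at least one contention event, and graphing the counter as a rate yields contended queries per second.

import "sync/atomic"

// contendedQueryCount counts queries that observed at least one contention
// event; exported as a timeseries, its rate is “contended queries per second”.
var contendedQueryCount int64

// recordQueryContention is called once when a query finishes, with the number
// of contention events the query accumulated.
func recordQueryContention(numContentionEvents int) {
  if numContentionEvents > 0 {
    atomic.AddInt64(&contendedQueryCount, 1)
  }
}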

Contention Map

This map view is still pending design, but we can start work on the backend given the general idea. This “global” view should also be queryable via a virtual table. This kind of API is reminiscent of other features like SHOW QUERIES, where each server has a local view of its queries and there is a cluster-level API, available through both the DB Console and the SQL shell, to query the global state.

The proposal in this case is to keep an in-memory contention registry keyed by table ID/index ID pair that is updated by the DistSQL receiver when it receives contention events. When a StatusServer receives a request for a global contention view, it will broadcast a contention request to each node for its local contention view and merge the results with its own local view.

Diving a bit deeper, what should this contention registry look like? It depends on the questions we want to answer. At a high level, we want to be able to answer which tables are experiencing contention and allow the user to dive deeper into a table to understand what key/row range is experiencing contention and which transactions are responsible for this contention. Therefore, the proposal is a top-level map keyed by a table ID with a value struct that looks something like:

type tableContentionInfo struct {
    contentionEvents uint64                      // number of contention events recorded for this table
    cumulativeContentionTime time.Duration       // total time spent contending on this table
    orderedKeyContentionInfo []keyContentionInfo // per-key contention detail, kept ordered
}

type keyContentionInfo struct {
    key roachpb.Key                              // the contended key
    contendingTransactions []roachpb.Transaction // transactions observed contending on this key
}

This map cannot grow without bound, so an eviction policy needs to be put in place. As a simple starting point, we can keep track of the last contention event for a given table and use an LRU policy with a maximum size for the map. Within a given map entry, we could similarly use an LRU policy for keys.
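
A rough sketch of such a registry under those assumptions (names are hypothetical; the index-ID part of the key and the per-key LRU are omitted for brevity): each tableContentionInfo is wrapped with a last-event timestamp, and the table with the stalest timestamp is evicted once a size cap is reached.

import (
  "sync"
  "time"
)

const maxTrackedTables = 128 // illustrative cap on registry size

// registryEntry wraps the proposed tableContentionInfo with the timestamp of
// the last contention event, which drives LRU eviction.
type registryEntry struct {
  info      tableContentionInfo
  lastEvent time.Time
}

type contentionRegistry struct {
  mu     sync.Mutex
  tables map[uint32]*registryEntry // keyed by table ID
}

func (r *contentionRegistry) addEvent(tableID uint32, ev ContentionEvent) {
  r.mu.Lock()
  defer r.mu.Unlock()
  if r.tables == nil {
    r.tables = make(map[uint32]*registryEntry)
  }
  e, ok := r.tables[tableID]
  if !ok {
    if len(r.tables) >= maxTrackedTables {
      r.evictStalestLocked()
    }
    e = &registryEntry{}
    r.tables[tableID] = e
  }
  e.info.contentionEvents++
  e.info.cumulativeContentionTime += ev.Duration
  e.lastEvent = time.Now()
}

// evictStalestLocked drops the table whose last contention event is oldest.
func (r *contentionRegistry) evictStalestLocked() {
  var victim uint32
  var oldest time.Time
  first := true
  for id, e := range r.tables {
    if first || e.lastEvent.Before(oldest) {
      victim, oldest, first = id, e.lastEvent, false
    }
  }
  delete(r.tables, victim)
}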

Merging these structs should be relatively straightforward.
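
For instance, merging two per-table entries from different nodes just sums the counters and durations; the per-key slices are concatenated in this sketch, though a real merge would also de-duplicate and re-sort them.

// mergeTableContentionInfo combines one node's view of a table's contention
// with another node's view.
func mergeTableContentionInfo(a, b tableContentionInfo) tableContentionInfo {
  merged := tableContentionInfo{
    contentionEvents:         a.contentionEvents + b.contentionEvents,
    cumulativeContentionTime: a.cumulativeContentionTime + b.cumulativeContentionTime,
  }
  // Concatenate per-key detail; de-duplication and re-sorting omitted.
  merged.orderedKeyContentionInfo = append(
    append([]keyContentionInfo(nil), a.orderedKeyContentionInfo...),
    b.orderedKeyContentionInfo...)
  return merged
}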

Observations and remaining questions

When talking about a global contention view, we’ve talked about showing contended spans, but contention events only produce keys. I’ll follow up with KV to better define what we’re going to do here; otherwise, I’m guessing we can slightly modify the contention event and have the table reader or cfetcher annotate the contention proto with the span that was passed to the fetcher and produced the event.
We also need to figure out how to use transaction IDs to get richer transaction information.

Work Items

Out of scope

Getting information for currently hanging requests is out of scope. This will be exposed to users through a debug page by the KV team, so the information will be available. The proposal here is to surface, through execution metadata, the contention statistics that the KV team will provide. High contention is usually a performance problem, not a stuck-queries problem, so it will be enough for users to be able to visualize contention information for queries that have completed.

asubiotto added the C-enhancement, A-kv-observability, and A-sql-observability labels on Oct 6, 2020
asubiotto (Contributor, Author) commented:

cc @jordanlewis @tbg

asubiotto added the meta-issue label on Oct 6, 2020
tbg (Member) commented Oct 23, 2020

@glennfawcett mentioned interest in this at the per-stmt level as well. Was that considered?

awoods187 (Contributor) commented:

Yes. The requirement is to surface contention information both per statement and aggregated on a per-cluster basis. We haven’t conducted all of our discovery yet, but we will get increasingly specific.

tbg (Member) commented Oct 23, 2020

Just to clarify, you mean both per-stmt, per-txn, and global, right? As opposed to per-stmt xor per-txn.

asubiotto (Contributor, Author) commented:

Updated the issue with a draft design proposal.

asubiotto (Contributor, Author) commented:

I declare this meta-issue complete 🔨 We'll track the implementation of a UI global contention map separately, but a user can currently view all contention events through a virtual table thanks to @yuzefovich's latest work.
