sql: deadlock due to planning query for SHOW TABLE #46447

Closed
ajwerner opened this issue Mar 23, 2020 · 9 comments · Fixed by #46588

Comments

@ajwerner (Contributor) commented Mar 23, 2020

When planning a query we use a different transaction. We do this, at least in part, to ensure that reads of the schema performed during planning are cacheable. See here:

// Preparing needs a transaction because it needs to retrieve db/table
// descriptors for type checking.
// TODO(andrei): Needing a transaction for preparing seems non-sensical, as
// the prepared statement outlives the txn. I hope that it's not used for
// anything other than getting a timestamp.
txn := kv.NewTxn(ctx, ex.server.cfg.DB, ex.server.cfg.NodeID.Get())
ex.statsCollector.reset(&ex.server.sqlStats, ex.appStats, &ex.phaseTimes)
p := &ex.planner
ex.resetPlanner(ctx, p, txn, ex.server.cfg.Clock.PhysicalTime() /* stmtTS */)

Unfortunately, we do use this transaction for more than just a timestamp. This transaction makes its way to name resolution. An example of a problematic case is here:

dataSource, resName, err := d.catalog.ResolveDataSource(d.ctx, flags, &tn)

Imagine that we have Txn1 and Txn2:

Txn1: BEGIN;
Txn1: ALTER TABLE foo ...
Txn2: BEGIN;
Txn2: ALTER TABLE bar ...
Txn2: ALTER TABLE foo ... // blocks on Txn1
Txn1: SHOW COLUMNS FROM bar // blocks on Txn2 in planner txn

Had the planner been using the user's transaction, deadlock detection would have kicked in and we would have been fine. The problem here is that deadlock detection has no way to associate the planner txn with Txn1.
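For concreteness, here is a minimal reproduction sketch of the scenario above using two SQL connections. It assumes a local single-node cluster with tables `foo` and `bar` already created, and a driver that prepares statements via the extended protocol so the planning txn gets involved; the driver choice and column names are illustrative, not part of the original report.

```go
package main

import (
	"database/sql"
	"log"
	"time"

	_ "github.com/lib/pq" // any Postgres-wire driver; lib/pq is just an example
)

func main() {
	open := func() *sql.DB {
		db, err := sql.Open("postgres",
			"postgresql://root@localhost:26257/defaultdb?sslmode=disable")
		if err != nil {
			log.Fatal(err)
		}
		return db
	}
	db1, db2 := open(), open()

	// Txn1 lays down an intent on foo's descriptor.
	txn1, _ := db1.Begin()
	txn1.Exec(`ALTER TABLE foo ADD COLUMN c1 INT`)

	// Txn2 lays down an intent on bar, then blocks behind Txn1 on foo.
	txn2, _ := db2.Begin()
	txn2.Exec(`ALTER TABLE bar ADD COLUMN c1 INT`)
	go txn2.Exec(`ALTER TABLE foo ADD COLUMN c2 INT`) // blocks on Txn1
	time.Sleep(time.Second)                           // let Txn2 start blocking

	// Txn1 now prepares SHOW COLUMNS FROM bar. Planning reads bar's
	// descriptor on a separate internal txn, which blocks on Txn2 but is
	// not associated with Txn1, so deadlock detection never fires: hang.
	txn1.Query(`SHOW COLUMNS FROM bar`)
}
```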

The short-term fix is likely to follow the pattern of #46384 and #46170 and make the transaction used here high priority. Unlike in those cases, however, the result of this query is not necessarily going to be cached as a shared resource. In fact, if we do this, there's a chance that we introduce a livelock scenario where concurrent SHOW COLUMNS ... statements can hold off a schema change. I'm not certain that that's the worst thing; I think for now it's the right thing to do.

A better solution might be to explicitly relate the planner txn to Txn1 for deadlock detection. Perhaps something like a read-only child transaction which carries its parent in push requests. @nvanbenschoten I think you've been having thoughts in this area.


This is fallout from #46402.

@ajwerner (Contributor, Author)

cc @andreimatei

@jordanlewis (Member)

Should we try to get this done for 20.1, or is it too complex?

@ajwerner (Contributor, Author)

The easy fix is, well, 1-line sort of easy and it does seem to work. I’ll post the PR.

I would love some thoughts from Nathan. My best guess is that, yes, we should fix it with the high priority fix and then revisit for 20.2.

@andreimatei (Contributor)

I would not make that preparing txn high priority. That seems very weird and fragile to me. I'd look into using the session's txn instead (if we are in an explicit txn).

But I'm also curious to hear Nathan's thoughts on how easy/opportune it'd be to express dependencies between txns beyond the current dependencies that we track - one txn trying to push another one.

@ajwerner (Contributor, Author)

> I'd look into using the session's txn instead (if we are in an explicit txn).

I'll investigate what it would take to ensure that the results aren't cached. My current understanding is that that's the main problem. I wonder to what extent it'd be okay to cache the results because we should be hitting the in-memory table collection when retrieving modified resources.

@andreimatei (Contributor)

Well my assumption is that there's already no caching of anything for the troublesome schema lookups happening when planning the show columns. Is there?

@ajwerner (Contributor, Author)

> Well my assumption is that there's already no caching of anything for the troublesome schema lookups happening when planning the show columns. Is there?

No, but unless we do something invasive down in that code, we'd need to make the decision about which txn to use at a much higher level. I was hoping not to involve anything in delegate with which txn we're using. Ideally we'd just look at whether this is an explicit transaction inside of connExecutor.prepare().
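Roughly, that check could live at the top of `connExecutor.prepare()`. A minimal sketch of the idea, assuming a hypothetical `ex.sessionTxn()` accessor that returns the user's transaction only when the session is in an explicit txn (this is not the actual diff that later landed in #46588):

```go
// Sketch only: prefer the session's transaction when the user is in an
// explicit txn so that schema reads done while planning participate in
// deadlock detection; otherwise keep the old throwaway planning txn.
var txn *kv.Txn
if userTxn := ex.sessionTxn(); userTxn != nil { // hypothetical accessor
	txn = userTxn
} else {
	txn = kv.NewTxn(ctx, ex.server.cfg.DB, ex.server.cfg.NodeID.Get())
}
ex.statsCollector.reset(&ex.server.sqlStats, ex.appStats, &ex.phaseTimes)
p := &ex.planner
ex.resetPlanner(ctx, p, txn, ex.server.cfg.Clock.PhysicalTime() /* stmtTS */)
```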

@andreimatei (Contributor)

Right, absolutely. It's in connExecutor.prepare() that I'd change things.

What I was saying about the caching is that, hopefully, the d.catalog.ResolveDataSource() call never caches anything, or at least not for use across transactions.

@ajwerner (Contributor, Author)

> d.catalog.ResolveDataSource() call never caches anything, or at least not for use across transactions.

Yeah, that seems right. I went through and looked at the database cache and table descriptors, and it seems like the only caching that happens underneath planning is scoped to the planner's TableCollection. Let me try that out and see if anything breaks.

ajwerner added a commit to ajwerner/cockroach that referenced this issue Mar 26, 2020
Preparation of certain queries requires performing reads against the database.
If the user has already laid down intents, these reads may become part of a
dependency cycle. Prior to this commit, these reads would be on a different
transaction, and thus this cycle would not be detected by our deadlock
detection mechanism.

This change opts to use the user's transaction for planning if there is one,
and thus planning will properly interact with deadlock detection.

Fixes cockroachdb#46447.

Release justification: fixes for high-priority or high-severity bugs in
existing functionality

Release note: None
craig bot pushed a commit that referenced this issue Mar 31, 2020
45920: UI Telemetry for Statements r=dhartunian a=nathanstilwell

fixes #45506

- [x] Changing sort order (want to see which column and asc vs desc)
- [x] Search
- [x] Clicking to paginate
- [x] Diagnostic bundle activations (capturing high level statement performance information as part of the event is nice, e.g. latency or execution count; we should scrub the actual fingerprint)

Adds a tracking function to `analytics.ts` to send analytics payloads to Segment.io. I begin by adding tracking calls to interesting events on the Statements page of the Admin UI. The events being tracked are as follows:

### Table Sort
This event is fired when the sorting order is changed by clicking a column header on the Statements page. 

![statements-sort-order](https://user-images.githubusercontent.com/397448/77473575-0a419100-6dec-11ea-850d-8663af4ccc37.gif)
```
{
  userId: 'ac7aafbc-1a79-4a5b-bc60-c1221cf80e1e',
  event: 'Table Sort',
  properties: {
    columnName: 'Txn Type',
    pagePath: '/statements',
    sortDirection: 'desc',
    tableName: 'statements-table'
  }
}
```

### Search
This event is fired when the statements are filtered using a search term.

![statements-search](https://user-images.githubusercontent.com/397448/77474910-5097ef80-6dee-11ea-828c-43b387354327.gif)

```
{
  userId: 'ac7aafbc-1a79-4a5b-bc60-c1221cf80e1e',
  event: 'Search',
  properties: {
    numberOfResults: 17,
    pagePath: '/statements',
    searchTerm: 'system'
  }
}
```

### Paginate
This event is fired when the user interacts with pagination on the statements page.
![statements-pagination](https://user-images.githubusercontent.com/397448/77474995-7ae9ad00-6dee-11ea-948f-00fcc500438a.gif)

```
{
  userId: 'ac7aafbc-1a79-4a5b-bc60-c1221cf80e1e',
  event: 'Paginate',
  properties: {
    pagePath: '/statements/',
    selectedPage: 4
  }
}
```

### Diagnostics Activation
This event is tracked when a user clicks "Activate" on the diagnostics activation modal.
![statements-diagnostics-activation](https://user-images.githubusercontent.com/397448/77475062-95bc2180-6dee-11ea-96c6-d72c91915000.gif)

```
{
  userId: 'ac7aafbc-1a79-4a5b-bc60-c1221cf80e1e',
  event: 'Diagnostics Activation',
  properties: {
    fingerprint: 'SELECT blah, blah FROM blah.blah WHERE blah blah blah',
    pagePath: '/statements/'
  }
}
```


46190: ui: removed metric function `setDefaultTime` r=dhartunian a=elkmaster

Removed the setter that defaulted the graph time window to the age of the oldest node in the cluster, since we already have a default redux scale. This fixes the problem where, when using the time dropdown on the metrics page and clicking "10m", we got a result with a "6h" duration.

Resolves: #46145

Release justification: bug fixes and low-risk updates to new functionality

Release note (ui): The default timescale on the metrics page is always 10m; previously it defaulted to the age of the longest-running node.

46275: kvcoord: condense read spans when they exceed the memory limit r=andreimatei a=andreimatei

Before this patch, once a transaction exceeded the
kv.transaction.max_refresh_spans_bytes limit, it stopped tracking reads
and didn't attempt to refresh any more when pushed.
This patch makes the span refresher condense the spans when it runs out
of memory instead. So we'll get bigger spans and potentially false
conflicts, but at least we have a chance at refreshing. In particular,
a refresh will succeed if there are no writes anywhere.

The condensing is performed using the condensableSpanSet, as we do in
the pipeliner interceptor for the tracking of write intents. Internally,
it condenses spans within ranges that have lots of reads.
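As a toy illustration of the condensing idea (this is not the actual condensableSpanSet code; types are simplified and self-contained): read spans are tracked under a byte budget and, once the budget is exceeded, collapsed into a single covering span. Refreshes then check a wider region and may hit false conflicts, but the transaction keeps the ability to refresh.

```go
// Toy sketch, not the real condensableSpanSet: spans are tracked under a
// byte budget; exceeding it collapses them into one covering span.
type span struct{ key, endKey string }

type readSpans struct {
	spans []span
	bytes int
	limit int
}

func (r *readSpans) insert(sp span) {
	r.spans = append(r.spans, sp)
	r.bytes += len(sp.key) + len(sp.endKey)
	if r.bytes > r.limit {
		r.condense()
	}
}

// condense replaces all tracked spans with a single span covering them,
// trading refresh precision for bounded memory.
func (r *readSpans) condense() {
	covering := r.spans[0]
	for _, sp := range r.spans[1:] {
		if sp.key < covering.key {
			covering.key = sp.key
		}
		if sp.endKey > covering.endKey {
			covering.endKey = sp.endKey
		}
	}
	r.spans = []span{covering}
	r.bytes = len(covering.key) + len(covering.endKey)
}
```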

We've seen people run into kv.transaction.max_refresh_spans_bytes in the
past, so this should help many use cases. But in particular I've
written this patch because, without it, I'm scared about the effects of
20.1's reduction in the closed timestamp target duration to 3s from a
previous 30s. Every transaction writing something after having run for
longer than that will get pushed, so being able to refresh is getting
more important.

Fixes #46095

Release note (general change): Transactions reading a lot of data behave
better when exceeding the memory limit set by
kv.transaction.max_refresh_spans_bytes. Such transactions now attempt to
resolve the conflicts they run into instead of being forced to always
retry. Increasing kv.transaction.max_refresh_spans_bytes should no
longer be necessary for most workloads.

Release justification: fix for new "functionality" - the reduction in
the closed timestamp target duration.

46557: ui: Jobs / Statements description tooltip r=dhartunian a=elkmaster

Updated the job description tooltip to truncate at ~425 characters
Updated the tooltip to be 500px wide

Resolves: #46078

Release justification: bug fixes and low-risk updates to new functionality

Release note (ui): tooltips showing statements and jobs are limited in size for very long statements

46588: sql: use user transaction if we have one to prepare queries r=andreimatei a=ajwerner

Preparation of certain queries requires performing reads against the database.
If the user has already laid down intents, these reads may become part of a
dependency cycle. Prior to this commit, these reads would be on a different
transaction, and thus this cycle would not be detected by our deadlock
detection mechanism.

This change opts to use the user's transaction for planning if there is one,
and thus planning will properly interact with deadlock detection.

Fixes #46447.

Release justification: fixes for high-priority or high-severity bugs in
existing functionality

Release note: None

46700: sql: make aggregate builtins share the same memory account r=yuzefovich a=yuzefovich

Release justification: fixes for high-priority or high-severity
bugs in existing functionality (we could hit memory limit due to
accounting long before the available RAM is actually used up).

We recently fixed a couple of leaks in memory accounting by aggregate
builtins. It was done in the same way that other similar aggregates were
doing it - by instantiating a separate memory account for each aggregate
struct. However, when a single aggregate, say `anyNotNull`, wants to
store a single datum, say a `DInt` of size 8 bytes, growing its own
memory account actually makes a reservation of
`mon.DefaultPoolAllocation = 10240` bytes although we will only be using
8 bytes. This can result in "memory starvation" for OLAP-y queries
because we're likely to hit the `max-sql-memory` limit long before actual
memory usage gets close to it, due to such "overestimation" in the accounting.

This commit fixes this problem by making all aggregates that aggregate
a single datum (these are all aggregate builtins that perform memory
accounting, except for `arrayAgg`, which works with multiple datums)
share the same memory account (when non-nil), which is plumbed via
`tree.EvalContext` (this is unfortunate but, as always, seems like a
necessary evil). That account is instantiated by the `rowexec.aggregator`
and `rowexec.windower` processors. This is also acceptable from a
concurrency point of view because the processors run in a single
goroutine, so we will never have concurrent calls to grow/shrink this
shared memory account.

If for some reason the field for the shared memory account is nil in the
eval context, then we fall back to the old behavior of instantiating
a memory account for each aggregate builtin struct. A helper struct was
created (now embedded by all aggregate builtins in question)
that unifies the memory accounting.
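A schematic of that helper with simplified types (not the actual builtins code; `newPrivateAccount` is a hypothetical constructor and the usual `context` import is assumed): each single-datum aggregate embeds the helper and grows either the shared account plumbed through the eval context or, if none was provided, a lazily created private account.

```go
// Schematic sketch of the shared-account helper; types simplified.
type account interface {
	Grow(ctx context.Context, delta int64) error
}

type singleDatumAggregateBase struct {
	shared  account // shared across all aggregates of one processor; may be nil
	private account // lazily created fallback owned by this aggregate alone
	used    int64   // bytes this aggregate has already reserved
}

func (b *singleDatumAggregateBase) updateMemoryUsage(ctx context.Context, newSize int64) error {
	acc := b.shared
	if acc == nil {
		if b.private == nil {
			b.private = newPrivateAccount() // hypothetical fallback constructor
		}
		acc = b.private
	}
	// Reserve only the delta over what is already accounted for, so many
	// aggregates can share one account without each grabbing a large chunk.
	if err := acc.Grow(ctx, newSize-b.used); err != nil {
		return err
	}
	b.used = newSize
	return nil
}
```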

Fixes: #46664.

Release note: None (a follow-up to a recent PR).

46780: ui: Default sort by Execution Count column for Statements r=dhartunian a=koorosh

Resolves: #46427

Before, the default sort on the Statements page was the Latency column,
which was unintuitive.
Now the table is sorted by the Execution Count column.

Release note (admin ui change): Change default sorting column on Statements
page to Execution Count

Release justification: bug fixes and low-risk updates to new functionality

<img width="1421" alt="Screenshot 2020-03-31 at 14 56 52" src="https://user-images.githubusercontent.com/3106437/78023771-edfba200-735f-11ea-93e6-db77c2582a00.png">


46784: roachprod: Update Slack DM user lookup r=bobvawter a=bobvawter

This change switches to finding users by their email addresses.  It also logs
any DM-lookup failures instead of silently ignoring them.

X-Ref: https://github.com/cockroachdb/dev-inf/issues/65

Release note: None

Co-authored-by: Nathan Stilwell <[email protected]>
Co-authored-by: Vlad Los <[email protected]>
Co-authored-by: Andrei Matei <[email protected]>
Co-authored-by: Andrew Werner <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
Co-authored-by: Andrii Vorobiov <[email protected]>
Co-authored-by: Bob Vawter <[email protected]>
craig (bot) closed this as completed in 0b945bf on Mar 31, 2020
ajwerner added a commit to ajwerner/cockroach that referenced this issue Mar 31, 2020
ajwerner added a commit to ajwerner/cockroach that referenced this issue Apr 2, 2020