sql: deadlock due to planning query for SHOW TABLE #46447

Closed
ajwerner opened this issue Mar 23, 2020 · 9 comments · Fixed by #46588

Comments

@ajwerner (Contributor) commented Mar 23, 2020

When planning a query we use a different transaction. We do this, at least in part, to ensure that reads of the schema performed during planning are cacheable. See here:

// Preparing needs a transaction because it needs to retrieve db/table
// descriptors for type checking.
// TODO(andrei): Needing a transaction for preparing seems non-sensical, as
// the prepared statement outlives the txn. I hope that it's not used for
// anything other than getting a timestamp.
txn := kv.NewTxn(ctx, ex.server.cfg.DB, ex.server.cfg.NodeID.Get())
ex.statsCollector.reset(&ex.server.sqlStats, ex.appStats, &ex.phaseTimes)
p := &ex.planner
ex.resetPlanner(ctx, p, txn, ex.server.cfg.Clock.PhysicalTime() /* stmtTS */)

Unfortunately, we do use this transaction for more than just a timestamp. This transaction makes its way to name resolution. An example of a problematic case is here:

dataSource, resName, err := d.catalog.ResolveDataSource(d.ctx, flags, &tn)

Imagine that we have Txn1 and Txn2:

Txn1: BEGIN;
Txn1: ALTER TABLE foo ...
Txn2: BEGIN;
Txn2: ALTER TABLE bar ...
Txn2: ALTER TABLE foo ... // blocks on Txn1
Txn1: SHOW COLUMNS FROM bar // blocks on Txn2 in planner txn

Had the planner been using the user's transaction, deadlock detection would have kicked in and we would have been fine. The problem here is that deadlock detection has no way to associate the planner txn with Txn1.
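For concreteness, here is a minimal reproduction sketch of the scenario above using two SQL connections. It assumes a local single-node cluster with tables `foo` and `bar` already created, and a driver that prepares statements via the extended protocol so the planning txn gets involved; the driver choice and column names are illustrative, not part of the original report.

```go
package main

import (
	"database/sql"
	"log"
	"time"

	_ "github.com/lib/pq" // any Postgres-wire driver; lib/pq is just an example
)

func main() {
	open := func() *sql.DB {
		db, err := sql.Open("postgres",
			"postgresql://root@localhost:26257/defaultdb?sslmode=disable")
		if err != nil {
			log.Fatal(err)
		}
		return db
	}
	db1, db2 := open(), open()

	// Txn1 lays down an intent on foo's descriptor.
	txn1, _ := db1.Begin()
	txn1.Exec(`ALTER TABLE foo ADD COLUMN c1 INT`)

	// Txn2 lays down an intent on bar, then blocks behind Txn1 on foo.
	txn2, _ := db2.Begin()
	txn2.Exec(`ALTER TABLE bar ADD COLUMN c1 INT`)
	go txn2.Exec(`ALTER TABLE foo ADD COLUMN c2 INT`) // blocks on Txn1
	time.Sleep(time.Second)                           // let Txn2 start blocking

	// Txn1 now prepares SHOW COLUMNS FROM bar. Planning reads bar's
	// descriptor on a separate internal txn, which blocks on Txn2 but is
	// not associated with Txn1, so deadlock detection never fires: hang.
	txn1.Query(`SHOW COLUMNS FROM bar`)
}
```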

The short-term fix is likely to follow the pattern of #46384 and #46170 and make the transaction used here high priority. Unlike in those cases, however, the result of this query is not necessarily going to be cached as a shared resource. In fact, if we do this, there's a chance that we introduce a livelock scenario where concurrent SHOW COLUMNS ... statements can hold off a schema change. I'm not certain that that's the worst thing; I think for now it's the right thing to do.

A better solution might be to explicitly relate the planner txn to Txn1 for deadlock detection. Perhaps something like a read-only child transaction which carries its parent in push requests. @nvanbenschoten I think you've been having thoughts in this area.


This is fallout from #46402.

@ajwerner (Contributor, Author)

cc @andreimatei

@jordanlewis (Member)

Should we try to get this done for 20.1, or is it too complex?

@ajwerner (Contributor, Author)

The easy fix is, well, 1-line sort of easy and it does seem to work. I’ll post the PR.

I would love some thoughts from Nathan. My best guess is that, yes, we should fix it with the high priority fix and then revisit for 20.2.

@andreimatei (Contributor)

I would not make that preparing txn high priority. That seems very weird and fragile to me. I'd look into using the session's txn instead (if we are in an explicit txn).

But I'm also curious to hear Nathan's thoughts on how easy/opportune it'd be to express dependencies between txns beyond the current dependencies that we track - one txn trying to push another one.

@ajwerner (Contributor, Author)

> I'd look into using the session's txn instead (if we are in an explicit txn).

I'll investigate what it would take to ensure that the results aren't cached. My current understanding is that that's the main problem. I wonder to what extent it'd be okay to cache the results because we should be hitting the in-memory table collection when retrieving modified resources.

@andreimatei (Contributor)

Well my assumption is that there's already no caching of anything for the troublesome schema lookups happening when planning the show columns. Is there?

@ajwerner (Contributor, Author)

> Well my assumption is that there's already no caching of anything for the troublesome schema lookups happening when planning the show columns. Is there?

No, but unless we do something invasive down in that code, we'd need to make the decision about which txn to use at a much higher level. I was hoping not to involve anything in delegate with which txn we're using. Ideally we'd just look at whether this is an explicit transaction inside of connExecutor.prepare().
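Roughly, that check could live at the top of `connExecutor.prepare()`. A minimal sketch of the idea, assuming a hypothetical `ex.sessionTxn()` accessor that returns the user's transaction only when the session is in an explicit txn (this is not the actual diff that later landed in #46588):

```go
// Sketch only: prefer the session's transaction when the user is in an
// explicit txn so that schema reads done while planning participate in
// deadlock detection; otherwise keep the old throwaway planning txn.
var txn *kv.Txn
if userTxn := ex.sessionTxn(); userTxn != nil { // hypothetical accessor
	txn = userTxn
} else {
	txn = kv.NewTxn(ctx, ex.server.cfg.DB, ex.server.cfg.NodeID.Get())
}
ex.statsCollector.reset(&ex.server.sqlStats, ex.appStats, &ex.phaseTimes)
p := &ex.planner
ex.resetPlanner(ctx, p, txn, ex.server.cfg.Clock.PhysicalTime() /* stmtTS */)
```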

@andreimatei (Contributor)

Right, absolutely. It's in connExecutor.prepare() that I'd change things.

What I was saying about the caching is that, hopefully, the d.catalog.ResolveDataSource() call never caches anything, or at least not for use across transactions.

@ajwerner (Contributor, Author)

> d.catalog.ResolveDataSource() call never caches anything, or at least not for use across transactions.

Yeah, that seems right. I went through and looked at the database cache and table descriptors, and it seems like the only caching that happens underneath planning is scoped to the planner's TableCollection. Let me try that out and see if anything breaks.

ajwerner added a commit to ajwerner/cockroach that referenced this issue Mar 26, 2020
Preparation of certain queries requires performing reads against the database.
If the user has already laid down intents, these reads may become part of a
dependency cycle. Prior to this commit, these reads would be on a different
transaction, and thus this cycle would not be detected by our deadlock
detection mechanism.

This change opts to use the user's transaction for planning if there is one,
and thus planning will properly interact with deadlock detection.

Fixes cockroachdb#46447.

Release justification: fixes for high-priority or high-severity bugs in
existing functionality

Release note: None
craig bot pushed a commit that referenced this issue Mar 31, 2020
45920: UI Telemetry for Statements r=dhartunian a=nathanstilwell

fixes #45506

- [x] Changing sort order (want to see which column and asc vs desc)
- [x] Search
- [x] Clicking to paginate
- [x] Diagnostic bundle activations (capturing high level statement performance information as part of the event is nice, e.g. latency or execution count; we should scrub the actual fingerprint)

Adds a tracking function to `analytics.ts` to send analytics payloads to Segment.io. I begin by adding tracking calls to interesting events on the Statements page of the Admin UI. The events being tracked are as follows:

### Table Sort
This event is fired when the sorting order is changed by clicking a column header on the Statements page. 

![statements-sort-order](https://user-images.githubusercontent.com/397448/77473575-0a419100-6dec-11ea-850d-8663af4ccc37.gif)
```
{
  userId: 'ac7aafbc-1a79-4a5b-bc60-c1221cf80e1e',
  event: 'Table Sort',
  properties: {
    columnName: 'Txn Type',
    pagePath: '/statements',
    sortDirection: 'desc',
    tableName: 'statements-table'
  }
}
```

### Search
This event is fired when the statements are filtered using a search term.

![statements-search](https://user-images.githubusercontent.com/397448/77474910-5097ef80-6dee-11ea-828c-43b387354327.gif)

```
{
  userId: 'ac7aafbc-1a79-4a5b-bc60-c1221cf80e1e',
  event: 'Search',
  properties: {
    numberOfResults: 17,
    pagePath: '/statements',
    searchTerm: 'system'
  }
}
```

### Paginate
This event is fired when the user interacts with pagination on the statements page.
![statements-pagination](https://user-images.githubusercontent.com/397448/77474995-7ae9ad00-6dee-11ea-948f-00fcc500438a.gif)

```
{
  userId: 'ac7aafbc-1a79-4a5b-bc60-c1221cf80e1e',
  event: 'Paginate',
  properties: {
    pagePath: '/statements/',
    selectedPage: 4
  }
}
```

### Diagnostics Activation
This event is tracked when a user clicks "Activate" on the diagnostics activation modal.
![statements-diagnostics-activation](https://user-images.githubusercontent.com/397448/77475062-95bc2180-6dee-11ea-96c6-d72c91915000.gif)

```
{
  userId: 'ac7aafbc-1a79-4a5b-bc60-c1221cf80e1e',
  event: 'Diagnostics Activation',
  properties: {
    fingerprint: 'SELECT blah, blah FROM blah.blah WHERE blah blah blah',
    pagePath: '/statements/'
  }
}
```


46190: ui: removed metric function `setDefaultTime` r=dhartunian a=elkmaster

Removed the setter that defaulted the graph time window to the age of the oldest node in the cluster, since we already have a default redux scale. This fixes the problem where, when using the time dropdown on the metrics page and clicking "10m", we got a result with a "6h" duration.

Resolves: #46145

Release justification: bug fixes and low-risk updates to new functionality

Release note (ui): The default timescale on the metrics page is always 10m; previously it defaulted to the age of the longest-running node.

46275: kvcoord: condense read spans when they exceed the memory limit r=andreimatei a=andreimatei

Before this patch, once a transaction exceeded the
kv.transaction.max_refresh_spans_bytes limit, it stopped tracking reads
and didn't attempt to refresh any more when pushed.
This patch makes the span refresher condense the spans when it runs out
of memory instead. So we'll get bigger spans and potentially false
conflicts, but at least we have a chance at refreshing. In particular,
a refresh will succeed if there are no writes anywhere.

The condensing is performed using the condensableSpanSet, as we do in
the pipeliner interceptor for the tracking of write intents. Internally,
it condenses spans within ranges that have lots of reads.
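As a toy illustration of the condensing idea (this is not the actual condensableSpanSet code; types are simplified and self-contained): read spans are tracked under a byte budget and, once the budget is exceeded, collapsed into a single covering span. Refreshes then check a wider region and may hit false conflicts, but the transaction keeps the ability to refresh.

```go
// Toy sketch, not the real condensableSpanSet: spans are tracked under a
// byte budget; exceeding it collapses them into one covering span.
type span struct{ key, endKey string }

type readSpans struct {
	spans []span
	bytes int
	limit int
}

func (r *readSpans) insert(sp span) {
	r.spans = append(r.spans, sp)
	r.bytes += len(sp.key) + len(sp.endKey)
	if r.bytes > r.limit {
		r.condense()
	}
}

// condense replaces all tracked spans with a single span covering them,
// trading refresh precision for bounded memory.
func (r *readSpans) condense() {
	covering := r.spans[0]
	for _, sp := range r.spans[1:] {
		if sp.key < covering.key {
			covering.key = sp.key
		}
		if sp.endKey > covering.endKey {
			covering.endKey = sp.endKey
		}
	}
	r.spans = []span{covering}
	r.bytes = len(covering.key) + len(covering.endKey)
}
```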

We've seen people run into kv.transaction.max_refresh_spans_bytes in the
past, so this should help many use cases. But in particular I've
written this patch because, without it, I'm scared about the effects of
20.1's reduction in the closed timestamp target duration to 3s from a
previous 30s. Every transaction writing something after having run for
longer than that will get pushed, so being able to refresh is getting
more important.

Fixes #46095

Release note (general change): Transactions reading a lot of data behave
better when exceeding the memory limit set by
kv.transaction.max_refresh_spans_bytes. Such transactions now attempt to
resolve the conflicts they run into instead of being forced to always
retry. Increasing kv.transaction.max_refresh_spans_bytes should no
longer be necessary for most workloads.

Release justification: fix for new "functionality" - the reduction in
the closed timestamp target duration.

46557: ui: Jobs / Statements description tooltip r=dhartunian a=elkmaster

Updated the job description tooltip to truncate at ~425 characters
Updated the tooltip to be 500px wide

Resolves: #46078

Release justification: bug fixes and low-risk updates to new functionality

Release note (ui): tooltips showing statements and jobs are limited in size for very long statements

46588: sql: use user transaction if we have one to prepare queries r=andreimatei a=ajwerner

Preparation of certain queries requires performing reads against the database.
If the user has already laid down intents, these reads may become part of a
dependency cycle. Prior to this commit, these reads would be on a different
transaction, and thus this cycle would not be detected by our deadlock
detection mechanism.

This change opts to use the user's transaction for planning if there is one,
and thus planning will properly interact with deadlock detection.

Fixes #46447.

Release justification: fixes for high-priority or high-severity bugs in
existing functionality

Release note: None

46700: sql: make aggregate builtins share the same memory account r=yuzefovich a=yuzefovich

Release justification: fixes for high-priority or high-severity
bugs in existing functionality (we could hit memory limit due to
accounting long before the available RAM is actually used up).

We recently fixed a couple of leaks in memory accounting by aggregate
builtins. It was done in the same way that other similar aggregates were
doing it - by instantiating a separate memory account for each aggregate
struct. However, when a single aggregate, say `anyNotNull`, wants to
store a single datum, say a `DInt` of size 8 bytes, growing its own
memory account actually makes a reservation of
`mon.DefaultPoolAllocation = 10240` bytes although we will only be using
8 bytes. This can result in "memory starvation" for OLAP-y queries
because we're likely to hit the `max-sql-memory` limit long before actual
memory usage gets close to it, due to such "overestimation" in the accounting.

This commit fixes this problem by making all aggregates that aggregate
a single datum (these are all aggregate builtins that perform memory
accounting, except for `arrayAgg`, which works with multiple datums)
share the same memory account (when non-nil), which is plumbed via
`tree.EvalContext` (this is unfortunate but, as always, seems like a
necessary evil). That account is instantiated by the `rowexec.aggregator`
and `rowexec.windower` processors. This is also acceptable from a
concurrency point of view because the processors run in a single
goroutine, so we will never have concurrent calls to grow/shrink this
shared memory account.

If for some reason the field for the shared memory account is nil in the
eval context, then we fall back to the old behavior of instantiating
a memory account for each aggregate builtin struct. A helper struct was
created (now embedded by all aggregate builtins in question)
that unifies the memory accounting.
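A schematic of that helper with simplified types (not the actual builtins code; `newPrivateAccount` is a hypothetical constructor and the usual `context` import is assumed): each single-datum aggregate embeds the helper and grows either the shared account plumbed through the eval context or, if none was provided, a lazily created private account.

```go
// Schematic sketch of the shared-account helper; types simplified.
type account interface {
	Grow(ctx context.Context, delta int64) error
}

type singleDatumAggregateBase struct {
	shared  account // shared across all aggregates of one processor; may be nil
	private account // lazily created fallback owned by this aggregate alone
	used    int64   // bytes this aggregate has already reserved
}

func (b *singleDatumAggregateBase) updateMemoryUsage(ctx context.Context, newSize int64) error {
	acc := b.shared
	if acc == nil {
		if b.private == nil {
			b.private = newPrivateAccount() // hypothetical fallback constructor
		}
		acc = b.private
	}
	// Reserve only the delta over what is already accounted for, so many
	// aggregates can share one account without each grabbing a large chunk.
	if err := acc.Grow(ctx, newSize-b.used); err != nil {
		return err
	}
	b.used = newSize
	return nil
}
```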

Fixes: #46664.

Release note: None (a follow-up to a recent PR).

46780: ui: Default sort by Execution Count column for Statements r=dhartunian a=koorosh

Resolves: #46427

Before, the default sort on the Statements page was the Latency column,
which was unintuitive.
Now the table is sorted by the Execution Count column.

Release note (admin ui change): Change default sorting column on Statements
page to Execution Count

Release justification: bug fixes and low-risk updates to new functionality

<img width="1421" alt="Screenshot 2020-03-31 at 14 56 52" src="https://user-images.githubusercontent.com/3106437/78023771-edfba200-735f-11ea-93e6-db77c2582a00.png">


46784: roachprod: Update Slack DM user lookup r=bobvawter a=bobvawter

This change switches to finding users by their email addresses.  It also logs
any DM-lookup failures instead of silently ignoring them.

X-Ref: https://github.com/cockroachdb/dev-inf/issues/65

Release note: None

Co-authored-by: Nathan Stilwell <[email protected]>
Co-authored-by: Vlad Los <[email protected]>
Co-authored-by: Andrei Matei <[email protected]>
Co-authored-by: Andrew Werner <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
Co-authored-by: Andrii Vorobiov <[email protected]>
Co-authored-by: Bob Vawter <[email protected]>
craig (bot) closed this as completed in 0b945bf on Mar 31, 2020
ajwerner added a commit to ajwerner/cockroach that referenced this issue Mar 31, 2020
ajwerner added a commit to ajwerner/cockroach that referenced this issue Apr 2, 2020