
sql: make transaction_rows_{read|written}_err and large_full_scan_rows guardrails halt query execution #70473

Open
michae2 opened this issue Sep 21, 2021 · 4 comments
Labels
A-sql-execution Relating to SQL execution. C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) docs-done docs-known-limitation O-postmortem Originated from a Postmortem action item. P-3 Issues/test failures with no fix SLA T-sql-queries SQL Queries Team

Comments

@michae2
Collaborator

michae2 commented Sep 21, 2021

We recently added guardrails to limit the number of rows read or written by a single transaction. A transaction reading more than transaction_rows_read_err rows (or writing more than transaction_rows_written_err rows) now fails with an error.

To simplify the implementation, we chose to emit this error after the violating query has already finished executing and delivered its results to the application. This is still quite useful: well-behaved applications will see the error, and the transaction will be aborted. But there is a risk that poorly written applications might not see the error after receiving the results of a read-only autocommit query. There is also a risk that the violating query will have already hurt cluster performance before hitting the guardrail.

It would be more defensive for the current query execution to halt immediately upon violating one of these limits.

Here's an example:

CREATE TABLE t (i INT PRIMARY KEY);
INSERT INTO t SELECT generate_series(0, 19);
SET transaction_rows_read_err = 10;
SET transaction_rows_written_err = 10;

root@localhost:26257/defaultdb> SELECT * FROM t;
  i
------
   0
   1
   2
   3
   4
   5
   6
   7
   8
   9
  10
  11
  12
  13
  14
  15
  16
  17
  18
  19
(20 rows)
(error encountered after some results were delivered)
ERROR: txn has read 20 rows, which is above the limit: TxnID cea43f29-a4a0-4996-a92d-f283c2626462 SessionID 16a6bd70640294880000000000000001
SQLSTATE: 54000

root@localhost:26257/defaultdb> INSERT INTO t SELECT generate_series(20, 39) RETURNING i;
  i
------
  20
  21
  22
  23
  24
  25
  26
  27
  28
  29
  30
  31
  32
  33
  34
  35
  36
  37
  38
  39
(20 rows)
(error encountered after some results were delivered)
ERROR: txn has written 20 rows, which is above the limit: TxnID 32334f62-5054-4c26-afca-5b88ff5a317f SessionID 16a6bd70640294880000000000000001
SQLSTATE: 54000

Jira issue: CRDB-10088

@michae2 michae2 added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) A-sql-execution Relating to SQL execution. T-sql-queries SQL Queries Team labels Sep 21, 2021
@vy-ton
Contributor

vy-ton commented Oct 14, 2021

> There is also a risk that the violating query will have already hurt cluster performance before hitting the guardrail.

@rytaft @jordanlewis We should revisit this issue. The motivation for these guardrails is to protect the cluster before it destabilizes.

@rytaft
Collaborator

rytaft commented Oct 26, 2021

This is likely a non-trivial amount of work. We need to determine whether it's actually needed. @vy-ton

@michae2
Collaborator Author

michae2 commented Sep 28, 2022

Note for posterity: when fixing this, we might be able to take advantage of whatever mechanism is used for transaction timeouts (though there will be more to this fix than just invoking that mechanism).
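
For reference, a minimal sketch of the timeout behavior this note refers to, using the statement_timeout and transaction_timeout session variables (a hedged illustration; exact error output elided). Unlike the row guardrails today, these timeouts interrupt a query mid-execution rather than erroring after results have been delivered:

SET statement_timeout = '1s';    -- cap any single statement at 1 second
SET transaction_timeout = '5s';  -- cap the whole explicit transaction at 5 seconds
BEGIN;
SELECT * FROM t;  -- if either timeout fires, execution halts immediately
COMMIT;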

@michae2
Collaborator Author

michae2 commented May 31, 2023

When we fix this, it would be nice to use the same technique to make a full scan halt immediately if it reads more than large_full_scan_rows when disallow_full_table_scans is in use.
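
For context, a rough sketch of how these two settings interact today (a hedged illustration; the check is made against the optimizer's row estimate before execution, so a scan whose actual row count exceeds the estimate is not halted mid-execution, which is the gap described above):

SET disallow_full_table_scans = true;
SET large_full_scan_rows = 100;
-- Full scans estimated to exceed 100 rows are rejected at planning time;
-- this issue proposes also halting a running scan once it has actually
-- read more than large_full_scan_rows rows.
SELECT * FROM t;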

rytaft added a commit to rytaft/cockroach that referenced this issue Jun 2, 2023
Prior to this commit, setting transaction_rows_read_err to a non-zero
value would cause a transaction to fail as soon as a statement caused
the total number of rows read to exceed transaction_rows_read_err. However,
it was possible that each statement could read many more than
transaction_rows_read_err rows. This commit adds logic so that a single
scan never reads more than transaction_rows_read_err+1 rows if
transaction_rows_read_err is set.

Informs cockroachdb#70473

Release note (performance improvement): If transaction_rows_read_err is
set to a non-zero value, we now ensure that any single scan never reads
more than transaction_rows_read_err+1 rows. This prevents transactions that
would error due to the transaction_rows_read_err setting from causing
a large performance overhead due to large scans.
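
Using the reproduction from the issue description, the intended effect of this change is roughly the following (a hedged sketch; exact error output elided):

SET transaction_rows_read_err = 10;
SELECT * FROM t;  -- t contains 20 rows
-- Before this commit: the scan read and returned all 20 rows, and the
-- error was raised only after the results had been delivered.
-- After this commit: the scan itself is capped at
-- transaction_rows_read_err + 1 = 11 rows, so the statement fails
-- without streaming the full result set.
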
craig bot pushed a commit that referenced this issue Jun 5, 2023
104290: sql: make transaction_rows_read_err prevent large scans r=rytaft a=rytaft

Prior to this commit, setting `transaction_rows_read_err` to a non-zero value would cause a transaction to fail as soon as a statement caused the total number of rows read to exceed `transaction_rows_read_err`. However, it was possible that each statement could read many more than `transaction_rows_read_err` rows. This commit adds logic so that a single scan never reads more than `transaction_rows_read_err+1` rows if `transaction_rows_read_err` is set.

Informs #70473

Release note (performance improvement): If `transaction_rows_read_err` is set to a non-zero value, we now ensure that any single scan never reads more than `transaction_rows_read_err+1` rows. This prevents transactions that would error due to the `transaction_rows_read_err` setting from causing a large performance overhead due to large scans.

104337: sql: fix reset_sql_stats to truncate activity tables r=j82w a=j82w

Previously, resetting stats from the UI or via crdb_internal.reset_sql_stats() would only reset the statement_statistics and transaction_statistics tables, leaving stale data in the sql_activity table. Resetting stats now truncates the sql_activity table as well.

Fixes: #104321
Epic: none

Release note (sql change): Fix crdb_internal.reset_sql_stats() to clean up the sql_activity table, which works as a cache for the stats.

Co-authored-by: Rebecca Taft <[email protected]>
Co-authored-by: j82w <[email protected]>
rytaft added a commit to rytaft/cockroach that referenced this issue Jun 5, 2023
Prior to this commit, setting transaction_rows_read_err to a non-zero
value would cause a transaction to fail as soon as a statement caused
the total number of rows read to exceed transaction_rows_read_err. However,
it was possible that each statement could read many more than
transaction_rows_read_err rows. This commit adds logic so that a single
scan never reads more than transaction_rows_read_err+1 rows if
transaction_rows_read_err is set.

Informs cockroachdb#70473

Release note (performance improvement): If transaction_rows_read_err is
set to a non-zero value, we now ensure that any single scan never reads
more than transaction_rows_read_err+1 rows. This prevents transactions that
would error due to the transaction_rows_read_err setting from causing
a large performance overhead due to large scans. In rare cases, this change
may disable cross-range parallelism of the scan operation for some queries,
which can result in increased query latency.
@rytaft rytaft changed the title sql: make transaction_rows_{read|written}_err guardrails halt query execution sql: make transaction_rows_{read|written}_err and large_full_scan_rows guardrails halt query execution Jun 13, 2023
@mgartner mgartner moved this to 23.2 Release in SQL Queries Jul 24, 2023
@michae2 michae2 moved this from 23.2 Release to 24.1 Release in SQL Queries Sep 12, 2023
@mgartner mgartner moved this from 24.1 Release to New Backlog in SQL Queries Nov 28, 2023
@michae2 michae2 added the O-postmortem Originated from a Postmortem action item. label Sep 20, 2024
@michae2 michae2 moved this from Backlog to Triage in SQL Queries Sep 20, 2024
@mgartner mgartner moved this from Triage to Backlog in SQL Queries Sep 24, 2024
@mgartner mgartner added the P-3 Issues/test failures with no fix SLA label Sep 24, 2024
Projects
Status: Backlog
5 participants