storage: increase maxSize of key-value to 512MB, in pebbleMVCCScanner #61818
Conversation
This works around the fact that the only enforcement limiting large key-value writes is the cluster setting kv.raft.command.max_size: if someone raises that setting and allows a large value to be written, the read path will panic.

Release note: None
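For context, here is a minimal sketch of the kind of guard this PR touches: the scanner's result buffer caps the size of a single key-value pair and panics past the cap. The names `pebbleResults`, `put`, and the header size are assumptions for illustration, not the actual CockroachDB source:

```go
package storage

import "fmt"

// maxSize caps a single key-value pair in the scan result buffer. This PR
// raises it from 128 MiB to 512 MiB so that values admitted by a raised
// kv.raft.command.max_size can still be read back. (Illustrative sketch;
// names and layout are assumptions, not the actual CockroachDB source.)
const maxSize = 512 << 20 // 512 MiB

type pebbleResults struct {
	repr []byte // accumulated length-prefixed key-value pairs
}

// put appends a key-value pair, panicking if it exceeds maxSize. Before this
// change, a write allowed through by kv.raft.command.max_size could exceed
// the old 128 MiB cap and panic here on the read path.
func (p *pebbleResults) put(key, value []byte) {
	const headerSize = 8 // assumed per-pair length-prefix overhead
	needed := headerSize + len(key) + len(value)
	if needed > maxSize {
		panic(fmt.Sprintf("key/value pair of size %d exceeds max size %d", needed, maxSize))
	}
	hdr := make([]byte, headerSize)
	// A real implementation would encode len(key) and len(value) into hdr.
	p.repr = append(p.repr, hdr...)
	p.repr = append(p.repr, key...)
	p.repr = append(p.repr, value...)
}
```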
LGTM
but later please come back and explain why 128MB is the right number.
Also consider whether the right TODO is better handling/rejection of the large write in the storage engine, rather than relying on the KV setting to do anything.
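A hedged sketch of what that TODO might look like: rejecting an oversized pair at write time with a structured error instead of letting readers panic later. All names here (`maxKVSize`, `ErrKVTooLarge`, `checkKVSize`) are illustrative, not CockroachDB's actual API:

```go
package storage

import "fmt"

// maxKVSize is an assumed engine-level cap on a single key-value pair,
// enforced on the write path rather than via a KV cluster setting.
const maxKVSize = 512 << 20 // 512 MiB

// ErrKVTooLarge reports a key-value pair that exceeds maxKVSize.
type ErrKVTooLarge struct {
	Size int
}

func (e *ErrKVTooLarge) Error() string {
	return fmt.Sprintf("key/value pair of size %d exceeds engine max %d", e.Size, maxKVSize)
}

// checkKVSize would run on the write path, before the pair is proposed,
// so oversized writes fail cleanly instead of panicking readers later.
func checkKVSize(key, value []byte) error {
	if size := len(key) + len(value); size > maxKVSize {
		return &ErrKVTooLarge{Size: size}
	}
	return nil
}
```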
bors r+
I've created #61822
Build succeeded.
67953: sql, kv: add sql.mutations.max_row_size guardrails r=rytaft,andreimatei a=michae2

**kv: set Batch.pErr during Batch.prepare**

If we fail to construct a Batch (e.g. fail to marshal a key or value) then an error will be placed in the resultsBuf and the batch will not actually be sent to the layers below. In this case we still need to set Batch.pErr, so that Batch.MustPErr is able to return a roachpb.Error to higher layers without panicking. I imagine in practice we never fail to marshal the key or value, so we have never seen this panic in the wild.

Release note: None

Release justification: Bug fix.

**sql: add sql.mutations.max_row_size.log guardrail (large row logging)**

Addresses: #67400

Add sql.mutations.max_row_size.log, a new cluster setting which controls large row logging. Rows larger than this size will have their primary keys logged to the SQL_PERF or SQL_INTERNAL_PERF channels whenever the SQL layer puts them into the KV layer. This logging takes place in rowHelper, which is used by both row.Inserter and row.Updater. Most of the work is plumbing settings.Values and SessionData into rowHelper, and adding a new structured event type.

Release note (ops change): A new cluster setting, sql.mutations.max_row_size.log, was added, which controls large row logging. Whenever a row larger than this size is written (or a single column family, if multiple column families are in use) a LargeRow event is logged to the SQL_PERF channel (or a LargeRowInternal event is logged to SQL_INTERNAL_PERF if the row was added by an internal query). This could occur for INSERT, UPSERT, UPDATE, CREATE TABLE AS, CREATE INDEX, ALTER TABLE, ALTER INDEX, IMPORT, or RESTORE statements. SELECT, DELETE, TRUNCATE, and DROP are not affected by this setting.

Release justification: Low risk, high benefit change to existing functionality. This adds logging whenever a large row is written to the database. Default is 64 MiB, which is also the default for kv.raft.command.max_size, meaning on a cluster with default settings statements writing these rows will fail with an error anyway.

**sql: add sql.mutations.max_row_size.err guardrail (large row errors)**

Addresses: #67400

Add sql.mutations.max_row_size.err, a new cluster setting similar to sql.mutations.max_row_size.log, which limits the size of rows written to the database. Statements trying to write a row larger than this will fail with an error. (Internal queries will not fail with an error, but will log a LargeRowInternal event to the SQL_INTERNAL_PERF channel.) We're reusing eventpb.CommonLargeRowDetails as the error type, out of convenience.

Release note (ops change): A new cluster setting, sql.mutations.max_row_size.err, was added, which limits the size of rows written to the database (or individual column families, if multiple column families are in use). Statements trying to write a row larger than this will fail with a code 54000 (program_limit_exceeded) error. (Internal queries writing a row larger than this will not fail, but will log a LargeRowInternal event to the SQL_INTERNAL_PERF channel.) This limit is enforced for INSERT, UPSERT, and UPDATE statements. CREATE TABLE AS, CREATE INDEX, ALTER TABLE, ALTER INDEX, IMPORT, and RESTORE will not fail with an error, but will log LargeRowInternal events to the SQL_INTERNAL_PERF channel. SELECT, DELETE, TRUNCATE, and DROP are not affected by this limit. **Note that existing rows violating the limit *cannot* be updated, unless the update shrinks the size of the row below the limit, but *can* be selected, deleted, altered, backed up, and restored.** For this reason we recommend using the accompanying setting sql.mutations.max_row_size.log in conjunction with SELECT pg_column_size() queries to detect and fix any existing large rows before lowering sql.mutations.max_row_size.err.

Release justification: Low risk, high benefit change to existing functionality. This causes statements adding large rows to fail with an error. Default is 512 MiB, which was the maximum KV size in 20.2 as of #61818 and also the default range_max_bytes in 21.1, meaning rows larger than this were not possible in 20.2 and are not going to perform well in 21.1.

Co-authored-by: Michael Erickson <[email protected]>
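To make that recommended workflow concrete, here is a hedged sketch in Go using database/sql and the lib/pq driver (CockroachDB speaks the PostgreSQL wire protocol). The table t, key column k, wide column v, the 1 MiB threshold, and the connection string are all assumptions for illustration: raise the .log threshold first, find offending rows with pg_column_size(), then lower .err.

```go
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/lib/pq" // PostgreSQL-wire driver; works against CockroachDB
)

func main() {
	// Connection string is an assumption; point it at your own cluster.
	db, err := sql.Open("postgres",
		"postgresql://root@localhost:26257/defaultdb?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Step 1: log, but do not yet reject, rows larger than the intended limit.
	if _, err := db.Exec(
		`SET CLUSTER SETTING sql.mutations.max_row_size.log = '1MiB'`); err != nil {
		log.Fatal(err)
	}

	// Step 2: find existing rows that would violate a 1 MiB limit, using a
	// hypothetical table t with primary key k and a wide column v.
	rows, err := db.Query(
		`SELECT k, pg_column_size(v) FROM t WHERE pg_column_size(v) > 1048576`)
	if err != nil {
		log.Fatal(err)
	}
	defer rows.Close()
	for rows.Next() {
		var k string
		var size int
		if err := rows.Scan(&k, &size); err != nil {
			log.Fatal(err)
		}
		fmt.Printf("row %s is %d bytes; shrink or delete it before enforcing\n", k, size)
	}
	if err := rows.Err(); err != nil {
		log.Fatal(err)
	}

	// Step 3: once no large rows remain, enforce the limit with an error.
	if _, err := db.Exec(
		`SET CLUSTER SETTING sql.mutations.max_row_size.err = '1MiB'`); err != nil {
		log.Fatal(err)
	}
}
```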