[DocDB] Maximums on row sizes should be added/enforced #23636

Open
es1024 opened this issue Aug 26, 2024 · 2 comments
Labels
area/docdb (YugabyteDB core features), kind/enhancement (This is an enhancement of an existing feature), priority/low (Low priority)

Comments

@es1024
Contributor

es1024 commented Aug 26, 2024

Jira Link: DB-12549

Description

It is currently possible to create rows that are multiple GB in size:

CREATE TABLE test(pk INT PRIMARY KEY, col0 TEXT, col1 TEXT, ...);
INSERT INTO test(pk) VALUES (0);
UPDATE test SET col0 = REPEAT('0', 250000000) WHERE pk = 0;
UPDATE test SET col1 = REPEAT('0', 250000000) WHERE pk = 0;
...

The size of each column is bounded by rpc_max_message_size (each query is individually bounded by this). Specifying the primary key allows us to avoid a read of the row, which would otherwise cap the row size at ~2 * rpc_max_message_size (which is still problematic).

This leads to issues later on when the row is read in full (from a SELECT *, index backfill, etc.) -- the read response will blow past rpc_max_message_size and cause the read to fail. This was also the cause of #22301.

It is also difficult to find the large row in question, since reading the entire row to find its size also fails. rpc_max_message_size is limited by the protobuf library to values under 512MB, so simply increasing rpc_max_message_size beyond the largest row size is not an option either.

As we cannot properly handle these very large rows, we should have checks to prevent these rows from being written in the first place. A possible value for max row size would be a bit less than half of rpc_max_message_size, which would let us ensure that we never hit rpc_max_message_size responding to a read request, by capping yb_fetch_size_limit to the max row size.
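As a rough illustration of that proposal, the sketch below (not actual YugabyteDB code; the constant values, helper names, and error text are all invented here, and the rpc_max_message_size value is only a placeholder) derives a max row size that is a bit less than half of rpc_max_message_size and rejects a write whose projected row size would exceed it:

// A minimal sketch of the proposed check. All names and values are
// illustrative, not the real flags or defaults.
#include <cstdint>
#include <iostream>
#include <string>

// Hypothetical stand-in for the rpc_max_message_size flag (placeholder value).
constexpr uint64_t kRpcMaxMessageSize = 255ULL * 1024 * 1024;

// "A bit less than half" of rpc_max_message_size, leaving headroom for
// protocol overhead so a read response with one full row still fits.
constexpr uint64_t kMaxRowSizeBytes = kRpcMaxMessageSize / 2 - 1024 * 1024;

// Hypothetical check an update handler could run once it knows the
// projected size of the full row after applying the write.
bool CheckProjectedRowSize(uint64_t projected_row_size_bytes,
                           std::string* error_out) {
  if (projected_row_size_bytes > kMaxRowSizeBytes) {
    *error_out = "row size " + std::to_string(projected_row_size_bytes) +
                 " bytes exceeds maximum row size of " +
                 std::to_string(kMaxRowSizeBytes) + " bytes";
    return false;
  }
  return true;
}

int main() {
  std::string error;
  // The second UPDATE from the example would push the row to ~500 MB,
  // which this check rejects before anything is written.
  if (!CheckProjectedRowSize(2ULL * 250000000, &error)) {
    std::cout << "update rejected: " << error << "\n";
  }
  return 0;
}

Capping yb_fetch_size_limit to the same maximum would then keep read responses under rpc_max_message_size as well, as described above.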

Issue Type

kind/bug

Warning: Please confirm that this issue does not contain any sensitive information

  • I confirm this issue does not contain any sensitive information.
@es1024 es1024 added area/ysql Yugabyte SQL (YSQL) status/awaiting-triage Issue awaiting triage labels Aug 26, 2024
@yugabyte-ci yugabyte-ci added kind/bug This issue is a bug priority/medium Medium priority issue labels Aug 26, 2024
@rthallamko3 rthallamko3 added area/docdb YugabyteDB core features and removed area/ysql Yugabyte SQL (YSQL) labels Sep 11, 2024
@yugabyte-ci yugabyte-ci removed the status/awaiting-triage Issue awaiting triage label Sep 11, 2024
@rthallamko3 rthallamko3 changed the title [YSQL] Maximums on row sizes should be added/enforced [DocDB] Maximums on row sizes should be added/enforced Sep 11, 2024
@rthallamko3
Contributor

@arybochkin , the question is whether the update that causes the size of the entire row to go over rpc_max_message_size should fail.

For example, in the following scenario:

CREATE TABLE test(pk INT PRIMARY KEY, col0 TEXT, col1 TEXT, ...);
INSERT INTO test(pk) VALUES (0);

UPDATE test SET col0 = REPEAT('0', 250000000) WHERE pk = 0;

UPDATE test SET col1 = REPEAT('0', 250000000) WHERE pk = 0; 

The last update can cause the size of the row to go over the limit. Can we fail the last update, so that subsequent selects on this PK do not fail?

@rthallamko3 rthallamko3 added priority/low Low priority kind/enhancement This is an enhancement of an existing feature and removed priority/medium Medium priority issue kind/bug This issue is a bug labels Oct 24, 2024
@arybochkin
Contributor

@rthallamko3
Adding such validation will require a number of changes.
Background:

  • A table row may be stored as several records; a record is a key-value pair. E.g. the table row from the example is stored as 3 records: a packed row (the insert) + 2 updates (a simplified sketch of this layout follows after this list).
  • A table row is read during an update operation to check its existence and apply the update.
  • When a table row is read, the corresponding packed row and all the latest updates are iterated and loaded into memory one by one.
  • There is a serialization method that turns a table row into the buffer sent back to PG for read operations. This method is projection-based: it builds a chain of encoders (once per operation), and that chain of encoders is then applied to every constructed table row object to serialize the given row into a buffer (sidecars).
  • The structure used to represent a table row does not contain a contiguous set of bytes; instead it holds a collection of decoded columns (only the required columns).
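To make the first bullet concrete, here is a toy model (not DocDB's actual storage format; every type and name below is invented purely for illustration) of how the single row from the example ends up as several key-value records:

// A toy model of one table row stored as a packed row plus per-column
// update records; later writes shadow earlier ones.
#include <cstdint>
#include <map>
#include <string>
#include <tuple>

struct RecordKey {
  int32_t pk;           // primary key of the table row
  std::string column;   // empty for the packed row, column name for an update
  uint64_t write_time;  // newer records shadow older ones

  bool operator<(const RecordKey& other) const {
    return std::tie(pk, column, write_time) <
           std::tie(other.pk, other.column, other.write_time);
  }
};

int main() {
  // Three records back the single table row with pk = 0:
  std::map<RecordKey, std::string> records = {
      {{0, "", 1}, "packed row: col0=NULL, col1=NULL"},  // INSERT
      {{0, "col0", 2}, std::string(16, '0')},            // first UPDATE
      {{0, "col1", 3}, std::string(16, '0')},            // second UPDATE
  };
  // Reading the full row means iterating the packed row plus the latest
  // update for each column, loading each value into memory one by one.
  return records.size() == 3 ? 0 : 1;
}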

So, to get the row size that would be passed back to PG, the update handler would need to:

  1. Create a chain of encoders for the whole row (an iteration over all columns, and still a non-trivial implementation, because the base code uses the repeated PgsqlExpressionPB targets from the read request for that).
  2. Encode the constructed table row with that chain of encoders (again, an iteration over all the columns/encoders).
  3. Check the space taken in the buffer.
  4. To speed the process up slightly, the current implementation of encoders could be updated to not encode the row, but just calculate the size it would take (it is not clear; maybe a parallel implementation would be required for the size calculation). A rough sketch of this idea follows below.
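As a rough sketch of the "size only" variant from step 4 (all types and names below are invented for illustration; this is not the actual encoder code), the chain could be built once per operation and then applied per row to sum the bytes each column would occupy, without writing anything into the sidecar buffer:

// Hypothetical size-only encoder chain: built once per operation,
// applied per row to estimate the serialized row size.
#include <cstdint>
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Stand-in for one decoded column of a table row (nullopt means NULL).
using ColumnValue = std::optional<std::string>;

// A per-column "encoder" that only reports the encoded size.
using SizeEncoder = std::function<uint64_t(const ColumnValue&)>;

// Built once per operation, mirroring how the real chain of encoders is
// built once and then reused for every row.
std::vector<SizeEncoder> BuildSizeEncoders(size_t num_columns) {
  std::vector<SizeEncoder> chain;
  chain.reserve(num_columns);
  for (size_t i = 0; i < num_columns; ++i) {
    chain.push_back([](const ColumnValue& value) -> uint64_t {
      // One byte of header per column plus the payload, as a placeholder
      // for whatever the real wire format costs.
      return 1 + (value ? value->size() : 0);
    });
  }
  return chain;
}

// Applied per row: sums what each column would occupy when serialized.
uint64_t EstimateSerializedRowSize(const std::vector<SizeEncoder>& chain,
                                   const std::vector<ColumnValue>& row) {
  uint64_t total = 0;
  for (size_t i = 0; i < row.size() && i < chain.size(); ++i) {
    total += chain[i](row[i]);
  }
  return total;
}

int main() {
  const auto chain = BuildSizeEncoders(3);
  const std::vector<ColumnValue> row = {"0", std::string(1000, '0'),
                                        std::nullopt};
  // The estimate could be compared against a max row size before the
  // update is applied, without serializing the row.
  return EstimateSerializedRowSize(chain, row) > 0 ? 0 : 1;
}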
