[#23490] YSQL: Tighten notion of equality for update optimizations
Summary:

### Background

Data undergoes multiple transformations during the lifetime of a query. The data that is input by a query may be type modified (`typmod`), padded and aligned (`typalign`), type cast, compressed (`typstorage`) and normalized before being stored in a Datum as part of a Postgres Tuple. This Datum is then massaged into a network format (`protobuf`) as it is sent over the network to a tserver. Finally, the Datum is unpacked and stored in persistent storage in a format that is supported by DocDB/RocksDB. The reverse process happens when the same piece of data needs to be output back to the user.

Each of these formats has its own notion of equality: Postgres has the notion of semantic/logical equality (`{"a": 1, "b": 2}` and `{"b": 2, "a": 1}` are equivalent) and storage/binary equality (the binary representations of `{"a": 1, "b": 2}` and `{"b": 2, "a": 1}` may be different if alternative representations of a value are not normalized). Similarly, DocDB has its own notion of semantic and storage equality.

In most cases, these notions of equality can be used interchangeably and with good reason:
- There is a 1:1 correspondence between the data stored in different formats. That is, a datum's representation in postgres has exactly one corresponding representation in DocDB that allows the datum to be transformed seamlessly between the formats. This also implies that two datums whose binary representations are identical in postgres will also have identical representations in DocDB. This property is ubiquitously exploited to push down postgres operations to DocDB, and indeed to use DocDB as a storage engine for postgres.
- Postgres also normalizes the representation of a datum's value when it is packed into a Datum prior to storage. That is, if a given value of a given data type has multiple representations (the json from the example above), postgres converts the value into a normalized representation, which allows semantic equality to be used interchangeably with storage equality (if multiple representations of a value are represented in-memory/on-disk identically, they will also be stored identically). For data types that are not normalized, postgres does not define an equality operator (the `json` data type is not normalized and does not have an equality operator, while the `jsonb` data type is normalized and has an equality operator).

This leads to a couple of problems:
- There are occasions where we may want to know if two datums are stored identically when the data type of the datums does not have an equality operator. On such occasions, there is a distinction between semantic equality (not defined) and storage equality (defined).
- Postgres is a highly extensible database that allows users to define custom data types and equality operators. In user-defined scenarios, it is also possible to end up with a difference between semantic and storage equality.

### This revision

We perform the following optimizations on UPDATE queries that rely on *some* notion of equality:
1. If a BEFORE UPDATE FOR EACH ROW trigger is defined, we skip redundant index updates by comparing the old (pre-update) and new (post-update) values of a column.
2. With D34040, we also have a framework to skip index updates and constraint checking in cases where the value of a column remains unchanged by the update process.

Both of these optimizations rely on semantic equality today. However, they should rely on storage/binary equality to correctly handle the problems mentioned above:
- A given data type may not define an equality operator. In the absence of storage equality, for correctness in such cases, we must assume that the columns of such data types always change in value.
- A user-defined data type may have funky notions of semantic equality (and set membership). This can lead to correctness issues in cases such as partial indexes, when two representations of a given value are considered semantically equal, but are not stored identically (not normalized) and membership in the partial index relies on a membership function that is sensitive to the storage representation (e.g., `{"a": 1, "b": 2}` and `{"b": 2, "a": 1}` are not stored identically and a partial index is defined on `begins with '{"a": 1'`).

This revision switches to the use of storage equality for the above optimizations, with the caveat that the function used for the comparison (`datumIsEqual`) does not support TOASTed storage.

Jira: DB-12404

Test Plan:
```
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressUpdateOptimized#schedule'
```

Reviewers: amartsinchyk, mihnea, smishra

Reviewed By: amartsinchyk

Subscribers: yql

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D37384
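
The difference between semantic and storage equality described in the summary can be illustrated with a toy model. The sketch below is not Postgres or YugabyteDB code: every function name in it is hypothetical, a UTF-8 byte string stands in for a stored Datum, and a sorted-key serialization stands in for normalization. It only shows why an update optimization that compares values semantically could wrongly skip a partial index update, while a byte-for-byte comparison (the role the summary assigns to `datumIsEqual`) would not.

```python
import json


def semantically_equal(a: str, b: str) -> bool:
    """Logical equality: compare parsed values, so key order is ignored."""
    return json.loads(a) == json.loads(b)


def store_unnormalized(text: str) -> bytes:
    """Toy stand-in for a non-normalized type: store the input bytes as-is."""
    return text.encode("utf-8")


def store_normalized(text: str) -> bytes:
    """Toy stand-in for a normalized type: one canonical byte form per value."""
    canonical = json.dumps(json.loads(text), sort_keys=True, separators=(",", ":"))
    return canonical.encode("utf-8")


def storage_equal(a: bytes, b: bytes) -> bool:
    """Binary equality: byte-for-byte comparison of the stored form."""
    return a == b


def in_partial_index(stored: bytes) -> bool:
    """Hypothetical partial index predicate that is sensitive to the stored form."""
    return stored.startswith(b'{"a": 1')


def can_skip_index_update(old: bytes, new: bytes) -> bool:
    """Skip the index update only when the stored bytes are unchanged."""
    return storage_equal(old, new)


old_text = '{"a": 1, "b": 2}'
new_text = '{"b": 2, "a": 1}'

old_raw, new_raw = store_unnormalized(old_text), store_unnormalized(new_text)

# The values are semantically equal, but their stored bytes differ and the
# row moves out of the partial index, so the index update must not be skipped.
print(semantically_equal(old_text, new_text))
print(in_partial_index(old_raw), in_partial_index(new_raw))
print(can_skip_index_update(old_raw, new_raw))

# With normalization, both inputs share one stored form and the two notions agree.
print(storage_equal(store_normalized(old_text), store_normalized(new_text)))
```

In this model, a skip decision based on `semantically_equal` would leave a stale entry in the partial index for the unnormalized type, whereas `can_skip_index_update` stays conservative. For the normalized type the two checks coincide, which mirrors the summary's point that the notions are interchangeable only when representations are normalized.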