Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[YSQL] INSERT ... ON CONFLICT batching does not detect multiple operations on primary key arbiter #24784

Open
1 task done
karthik-ramanathan-3006 opened this issue Nov 5, 2024 · 0 comments
Assignees
Labels
area/ysql Yugabyte SQL (YSQL) kind/bug This issue is a bug priority/medium Medium priority issue

Comments

@karthik-ramanathan-3006
Copy link
Contributor

karthik-ramanathan-3006 commented Nov 5, 2024

Jira Link: DB-13883

Description

Consider the following scenario:

CREATE TABLE test (i INT PRIMARY KEY, j INT);
INSERT INTO test VALUES (1, 1);

SET yb_insert_on_conflict_read_batch_size TO 1024;
INSERT INTO test VALUES (1, 1), (1, 4) ON CONFLICT (i) DO UPDATE SET j = test.j + 1;

The above case should throw the following error:

ON CONFLICT DO UPDATE command cannot affect row a second time

Instead the command succeeds.

Issue Type

kind/bug

Warning: Please confirm that this issue does not contain any sensitive information

  • I confirm this issue does not contain any sensitive information.
@karthik-ramanathan-3006 karthik-ramanathan-3006 added area/ysql Yugabyte SQL (YSQL) status/awaiting-triage Issue awaiting triage labels Nov 5, 2024
@yugabyte-ci yugabyte-ci added kind/bug This issue is a bug priority/medium Medium priority issue and removed status/awaiting-triage Issue awaiting triage labels Nov 5, 2024
karthik-ramanathan-3006 added a commit that referenced this issue Nov 6, 2024
…t 1 + Bug Fixes

Summary:
D38354 introduced batching support for INSERT ... ON CONFLICT queries.
This revision seeks to address a subset of the design comments on that revision. In particular this refactor aims to:
 - Better integrate with/reuse existing YBCTID libraries for identifying unique index tuples.
 - Fix inconsistent results produced by different batch sizes.
 - Fix visibility issues observed in queries with a pipeline pattern (multiple CTEs, each doing a ModifyTable, each with a returning clause, feeding into the next)

**Summary of changes**
- Reuse existing YBCTID container infrastructure in pggate for storing tuple ID of “on-conflict” tuples.
- Use separate containers for storing batch-read tuples and “just-inserted” tuples.
- Move logic to update map of “just-inserted” tuples from execIndexing (at the time of enqueueing the index request) to nodeModifyTable

**Reusing existing YBCTID container infrastructure**

This revision reuses the YBCTID infrastructure that is used by the foreign key and explicit row locking buffers that already exists today. To do this, the key columns of unique (and non-NULL) indexes are treated as YBCTIDs. A session-level container is created to store the mapping between the IDs of tuples read by the index scan and their corresponding table slots. This design allows us to:

- Reuse existing YBCTID helper functions for equality comparisons.
- Mitigate tuple visibility issues through the creation of the global map when multiple INSERT … ON CONFLICT CTEs are chained together.
- Inherit benefits from improvements to YBCTID processing logic in the future.

**Use of separate containers to store batch-read tuples and “just-inserted” tuples**

The index batch-read workflow requires mapping key columns of a unique index to the tuple returned by the index scan. To check if the same row is modified twice, identifiers (ie. key columns) to index tuples are required to be stored. While the data stored is similar (hash of the key columns of the index), the data can be stored in two separate structures: a **hash map** for the index batch-read, and a **hash set** for the list of modified tuples. This design allows us to:

- Separate the scaling of the two data structures. If no restriction is placed on the size of the hash set, we can detect "command cannot affect row a second time" errors across batches of writes.
- Remain consistent with the design of other buffers such as the foreign key buffer. The foreign key buffer currently has no restriction on how many YBCTID intents are placed within its hash set.
- Cleanly remove the **hash set** when DocDB natively supports detection of tuples that are modified multiple times by the same query.

**Rework logic of when entries are added/deleted to on-conflict map**

The previous design adds and deletes entries to/from the on-conflict map when the payload for the index insert/delete operation is prepared (in ExecIndexing.c). This suffers from the following drawbacks:

- Index updates (DELETEs + INSERTs) may be skipped. If the update of an arbiter index is skipped, it is incorrectly not added/deleted from the on-conflict map.
- Primary key and secondary index modifications (insert/delete/update) have two distinct paths. This means that the previous design would have had to replicate this logic in both code paths, which it did not. This duplication is not required in the current design.

This revision reworks the above design to add/delete entries to the map in two different places (in nodeModifyTable.c):

- During on-conflict checking IF it is decided if the tuple will be inserted. Note that at this stage, the DO UPDATE projections have not been carried out (so we don’t know whether the updated values change the index keys).
- After the updated values have been projected if it is decided that the tuple will be updated.

This design currently has the drawback that the index tuple will have to be constructed one extra time (`FormIndexDatum()`) per tuple for tuples that are to be updated. This is particularly significant for expression indexes as it would involve evaluating the expression an extra time. This can be fixed for non-expression indexes by directly copying over the index key columns from the tuple slot. However, the fix for expression indexes requires storing and propagating the index tuple which is a large change relative to the benefit it offers.

**Future Work**
A second revision is planned that will:
- Add support for RETURNING clause
- Add limited support for foreign key relationships
- Fix bugs related to partitioned tables
Jira: DB-13766, DB-13883

Test Plan:
On Almalinux 8:

```
#!/usr/bin/env bash
set -euo pipefail
./yb_build.sh fastdebug --gcc11
find java/yb-pgsql/src/test/java/org/yb/pgsql -name 'TestPgRegressInsertOnConflict*' \
| grep -oE 'TestPgRegress\w+' \
| while read -r testname; do
  ./yb_build.sh fastdebug --gcc11 --java-test "$testname" --sj
done
```

Reviewers: amartsinchyk, smishra, jason, telgersma

Reviewed By: jason, telgersma

Subscribers: jason, smishra, yql

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D39023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/ysql Yugabyte SQL (YSQL) kind/bug This issue is a bug priority/medium Medium priority issue
Projects
None yet
Development

No branches or pull requests

2 participants