Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: support IMPORT PGDUMP into table with partial indexes #50225

Closed
mgartner opened this issue Jun 15, 2020 · 6 comments
Closed

sql: support IMPORT PGDUMP into table with partial indexes #50225

mgartner opened this issue Jun 15, 2020 · 6 comments
Assignees
Labels
A-partial-indexes Relating to partial indexes. C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) docs-done docs-known-limitation T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) X-anchored-telemetry The issue number is anchored by telemetry references.

Comments

@mgartner
Copy link
Collaborator

mgartner commented Jun 15, 2020

Description

CockroachDB does not currently support IMPORTing into a table with partial indexes.

Workaround for IMPORTing into a table with partial indexes

  1. Drop any partial indexes defined on the table.
  2. Perform the IMPORT.
  3. Recreate the partial indexes.

Workaround for IMPORTing a PGDUMP with partial indexes

  1. Drop the partial indexes on the Postgres server.
  2. Recreate the PGDUMP.
  3. IMPORT the PGDUMP.
  4. Add partial indexes on the CockroachDB server.

Notes

In order to support this, the DatumRowConverter will need to evaluate partial index predicate expressions with the variables replaced with each row's values. I've left a TODO here.

Jira issue: CRDB-4140
Epic CRDB-8353

@mgartner mgartner added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) A-partial-indexes Relating to partial indexes. labels Jun 15, 2020
@mgartner
Copy link
Collaborator Author

Currently, there is no paradigm for evaluating expressions, like partial index predicates, during an IMPORT. IMPORT doesn't currently support computed columns. It also doesn't validate the rows satisfy CHECK constraints:

root@localhost:26257/defaultdb> create table t (a int, b int, check (a > 2));
CREATE TABLE

root@localhost:26257/defaultdb> show create table t;
  table_name |              create_statement
+------------+---------------------------------------------+
  t          | CREATE TABLE t (
             |     a INT8 NULL,
             |     b INT8 NULL,
             |     FAMILY "primary" (a, b, rowid),
             |     CONSTRAINT check_a CHECK (a > 2:::INT8)
             | )
(1 row)

root@localhost:26257/defaultdb> import into t (a, b) csv data ('nodelocal://1/test.c
sv');
        job_id       |  status   | fraction_completed | rows | index_entries | bytes
+--------------------+-----------+--------------------+------+---------------+-------+
  574429359494791169 | succeeded |                  1 |    5 |             0 |   105
(1 row)

root@localhost:26257/defaultdb> select * from t;
  a  | b
+----+----+
   1 |  1
   2 |  2
   3 |  3
  11 | 11
  40 | 40
(5 rows)

Therefore, I think a suitable first step for partial indexes is to err when trying to IMPORT into a table with partial indexes. The work around would be to drop the partial index, issue the IMPORT, then re-add the partial index.

@mgartner mgartner changed the title row: support partial indexes in row converter row: support IMPORT into table with partial indexes Aug 24, 2020
@mgartner mgartner changed the title row: support IMPORT into table with partial indexes sql: support IMPORT into table with partial indexes Aug 24, 2020
@knz knz added the X-anchored-telemetry The issue number is anchored by telemetry references. label Aug 25, 2020
@avivimgn
Copy link

avivimgn commented Nov 10, 2021

Another workaround that has worked for us is to use pg_dump --section=pre-data --section=data, this will dump the db without all indices.

adityamaru added a commit to adityamaru/cockroach that referenced this issue Nov 22, 2021
This change adds logic to IMPORT INTO to support partial
indexes when ingesting data into an existing table. The
change relies on the `PartialIndexUpdateHelper` to
indicate whether a row should be added to a partial index
or not.

It is important to call out that this only adds support to
formats supporting IMPORT INTO i.e. CSV, Delimited, AVRO.
Bundle formats do not support partial indexes.

Informs: cockroachdb#50225

Release note (sql change): Users can now IMPORT INTO a table
with partial indexes from CSV, AVRO, and Delimited formats.
This was previously disallowed.
adityamaru added a commit to adityamaru/cockroach that referenced this issue Nov 22, 2021
This change adds logic to IMPORT INTO to support partial
indexes when ingesting data into an existing table. The
change relies on the `PartialIndexUpdateHelper` to
indicate whether a row should be added to a partial index
or not.

It is important to call out that this only adds support to
formats supporting IMPORT INTO i.e. CSV, Delimited, AVRO.
Bundle formats do not support partial indexes.

Informs: cockroachdb#50225

Release note (sql change): Users can now IMPORT INTO a table
with partial indexes from CSV, AVRO, and Delimited formats.
This was previously disallowed.
@blathers-crl blathers-crl bot added the T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) label Apr 14, 2022
@vy-ton
Copy link
Contributor

vy-ton commented Apr 14, 2022

fyi @rafiss another SQL incompatibility mentioned for IMPORT

craig bot pushed a commit that referenced this issue Aug 2, 2022
85036: kvserver,storage: Support range keys in CheckSSTConflicts r=erikgrinaker a=itsbilal

Add support for MVCC range tombstones when checking for
SST conflicts and calculating stats incrementally
in CheckSSTConflicts. This unfortunately blew up
logic in CheckSSTConflicts a lot, but most of that should
be gated behind conditionals that cause a low impact
on performance for non-rangekey sstables.

One part of #83405.

Release note: None.

85094: sqlccl: fix TestAuthenticateWithSessionRevivalToken flake r=knz a=rafiss

fixes #84899

The test must wait for the cluster setting to propagate to avoid flakes.

Release note: None

85244: importccl: support partial indexes in IMPORT INTO r=mgartner a=ecwall

This change adds logic to IMPORT INTO to support partial
indexes when ingesting data into an existing table. The
change relies on the `PartialIndexUpdateHelper` to
indicate whether a row should be added to a partial index
or not.

It is important to call out that this only adds support to
formats supporting IMPORT INTO i.e. CSV, Delimited, AVRO.
Bundle formats do not support partial indexes.

Informs: #50225

Release note (sql change): Users can now IMPORT INTO a table
with partial indexes from CSV, AVRO, and Delimited formats.
This was previously disallowed.

Co-authored-by: Bilal Akhtar <[email protected]>
Co-authored-by: Rafi Shamim <[email protected]>
Co-authored-by: Evan Wall <[email protected]>
@rafiss rafiss closed this as completed Aug 2, 2022
@rafiss rafiss reopened this Aug 2, 2022
@ecwall
Copy link
Contributor

ecwall commented Aug 2, 2022

Workaround for IMPORTing into a table with partial indexes

This has been tested and merged in #85244, but PGDUMP has not yet been tested.

@ajwerner
Copy link
Contributor

@ecwall or @rafiss is it worth making a new issue to test pgdump and close this one?

@exalate-issue-sync exalate-issue-sync bot assigned ZhouXing19 and unassigned ecwall Oct 10, 2022
@rafiss rafiss changed the title sql: support IMPORT into table with partial indexes sql: support IMPORT PGDUMP into table with partial indexes Nov 16, 2022
@rafiss
Copy link
Collaborator

rafiss commented Feb 6, 2023

we won't schedule this work, as per #93660

@rafiss rafiss closed this as completed Feb 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-partial-indexes Relating to partial indexes. C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) docs-done docs-known-limitation T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) X-anchored-telemetry The issue number is anchored by telemetry references.
Projects
None yet
Development

No branches or pull requests

10 participants