import: support import with default next_val columns #54797

Closed
mwang1026 opened this issue Sep 25, 2020 · 2 comments
Labels
A-disaster-recovery, C-enhancement, T-disaster-recovery

Comments

@mwang1026

There is a lot of good discussion in #48253.

We started adding IMPORT support for default expressions but didn't get to `nextval` for 20.2. This is a tracking issue to continue that work.

@mwang1026 added the C-enhancement and A-disaster-recovery labels on Sep 25, 2020
@Anzoteh96

Would be happy to hear how things turn out in the end! There's some partial work / outline from #52910 (hmm...close that maybe?), which hopefully can be helpful for anyone who continues this later. :)

craig bot pushed a commit that referenced this issue Dec 21, 2020
56473: importccl: add `nextval` support for IMPORT INTO CSV r=miretskiy,pbardea a=adityamaru

Previously, `nextval` was not supported as a default expression for a
non-targeted column in IMPORT INTO. This change adds that functionality
for CSV imports.

There is a lot of great discussion about the approach to this problem at
#48253 (comment).

At a high level, on encountering a nextval(seqname) for the first time,
IMPORT reserves a chunk of values for this sequence and ties those
values to the (fileIdx, rowNum) pair, which uniquely identifies a
particular row in a distributed import. The size of this chunk grows
exponentially with the number of times a single processor encounters a
nextval call for that particular sequence. The chunk reservation
piggybacks on existing methods that provide atomic, non-transactional
guarantees when increasing the value of a sequence.
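
The reservation scheme described above can be illustrated with a rough Python sketch. This is not the importccl code: the `Sequence`, `Chunk`, and `Processor` names, the initial chunk size of 10, the growth factor of 10, and the choice to grow the chunk on each new reservation are all assumptions made for illustration.

```python
import threading
from dataclasses import dataclass


class Sequence:
    """Hypothetical stand-in for a sequence with an atomic, non-transactional bump."""

    def __init__(self, start=1, increment=1):
        self._next = start
        self.increment = increment
        self._lock = threading.Lock()

    def reserve(self, count):
        """Atomically claim `count` values and return the first one."""
        with self._lock:
            first = self._next
            self._next += count * self.increment
            return first


@dataclass
class Chunk:
    start_row: int  # first row number covered by this chunk
    size: int       # number of sequence values reserved
    first_val: int  # sequence value assigned to start_row


class Processor:
    INITIAL_CHUNK = 10  # assumed starting chunk size
    GROWTH = 10         # assumed growth factor per reservation

    def __init__(self, seq):
        self.seq = seq
        self.chunks = {}       # file_idx -> list of Chunk
        self.reservations = 0  # chunks this processor has reserved so far

    def nextval(self, file_idx, row_num):
        # Reuse a reserved chunk if one already covers this row.
        for c in self.chunks.get(file_idx, []):
            if c.start_row <= row_num < c.start_row + c.size:
                return c.first_val + (row_num - c.start_row) * self.seq.increment
        # Otherwise reserve a new, exponentially larger chunk starting at this row.
        size = self.INITIAL_CHUNK * self.GROWTH ** self.reservations
        self.reservations += 1
        chunk = Chunk(row_num, size, self.seq.reserve(size))
        self.chunks.setdefault(file_idx, []).append(chunk)
        return chunk.first_val


seq = Sequence()
proc = Processor(seq)
values = [proc.nextval(0, row) for row in range(25)]
```

In this sketch, 25 rows of one file trigger only two reservations (10 values, then 100), so the number of round trips to the sequence grows far more slowly than the number of rows.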

Information about the reserved chunks is stored in the import job
progress details so as to ensure the following property:

If the import job is paused and then resumed, and some of the imported
rows were not checkpointed, we need to ensure that the nextval value
for a previously processed (fileIdx, rowNum) is identical to the value
computed in the first run of the import job. This property is necessary
to prevent duplicate entries with the same key but different values. We
use the job's progress details to check whether a previously reserved
chunk of sequence values can be used for the current (fileIdx, rowNum).
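
The resume property can be sketched as follows. This is only an illustration under stated assumptions: the `progress` dict stands in for the job progress details, `seq_state` for the sequence, the chunk size is fixed at 10, the sequence increment is 1, and persistence is simulated with a JSON round trip. None of these names or formats come from the actual change.

```python
import json


def lookup(progress, file_idx, row_num):
    """Return the value previously tied to this row, or None if no chunk covers it."""
    for c in progress.get(str(file_idx), []):
        if c["start_row"] <= row_num < c["start_row"] + c["size"]:
            return c["first_val"] + (row_num - c["start_row"])
    return None


def nextval(progress, seq_state, file_idx, row_num, chunk_size=10):
    """Serve from a recorded chunk if possible; otherwise reserve and record a new one."""
    val = lookup(progress, file_idx, row_num)
    if val is not None:
        return val
    first = seq_state["next"]
    seq_state["next"] += chunk_size  # the bump is never rolled back
    progress.setdefault(str(file_idx), []).append(
        {"start_row": row_num, "size": chunk_size, "first_val": first}
    )
    return first


# First run: process 15 rows of file 0, then "pause" and persist the progress.
seq_state = {"next": 1}
progress = {}
first_run = {row: nextval(progress, seq_state, 0, row) for row in range(15)}
saved = json.dumps(progress)

# Resume: nothing was checkpointed, so every row is processed again.
resumed = json.loads(saved)
second_run = {row: nextval(resumed, seq_state, 0, row) for row in range(15)}
```

Because the resumed run finds every row inside a recorded chunk, it reproduces the first run's values and consumes no further sequence values.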

Informs: #54797

Release note (sql change): IMPORT INTO for CSV now supports nextval as a
default expression of a non-targeted column.

Co-authored-by: Aditya Maru <[email protected]>
@mwang1026

@adityamaru is this done with #56473? If so can you close?
