IMPORT: support null as non-quoted string in import csv #19743
Comments
I took a look at doing this. There's a slight hiccup using golang's CSV reader. According to RFC 4180, the empty-quoted string
@mjibson, @dianasaur323 Thoughts on whether or not we want to do this?
Saving additional thoughts for later:
cockroach/pkg/ccl/importccl/csv.go Line 410 in c49273d
// Map the column value to NULL if it matches the user-defined
// nullIf value or if the column is nullable and the raw string
// has a zero length. This distinguishes the empty string
// from the null string, e.g. (1,"",,4).
if nullif != nil && v == *nullif || col.IsNullable() && len(v) == 0 {
	datums[i] = tree.DNull
} else {
	datums[i], err = parser.ParseStringAs(col.Type.ToDatumType(), v, &evalCtx, cenv)
	if err != nil {
		return errors.Wrapf(err, "%s: row %d: parse %q as %s", batch.file, rowNum, col.Name, col.Type.SQLString())
	}
}
cockroach/pkg/ccl/importccl/csv_test.go Line 135 in c49273d
{
	name: "empty string vs. zero string",
	create: `
		i int primary key,
		s string,
		s2 string not null
	`,
	csv: `1,,
2,"",""`,
	query: map[string][][]string{
		`SELECT i, s, s2 from d.t`: {
			{"1", "NULL", ""},
			{"2", "", ""},
		},
	},
},
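To make the hiccup with the standard reader concrete, here is a minimal sketch (not CockroachDB code; `readFields` is a hypothetical helper) showing that Go's `encoding/csv` hands back plain strings, so a quoted empty field and an unquoted empty field cannot be told apart by the caller:

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// readFields is a hypothetical helper that parses a single CSV record
// with the standard library reader.
func readFields(line string) ([]string, error) {
	r := csv.NewReader(strings.NewReader(line))
	return r.Read()
}

func main() {
	// Field 2 is a quoted empty string, field 3 is an unquoted empty field.
	rec, err := readFields(`1,"",,4`)
	if err != nil {
		panic(err)
	}
	// Both come back as "", because the reader returns []string and
	// carries no information about quoting.
	fmt.Printf("%q\n", rec)
	fmt.Println(rec[1] == rec[2]) // true
}
```

Any NULL-versus-empty decision made after this point can only look at the string value, which is why the check above falls back to `len(v) == 0`.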
It is possible we decide we just don't want to do this because the guarantees of COPY are different than CSV, even though they are similar. |
Closing since we haven't heard many requests for this recently. If anyone wants to see this feel free to reopen. |
A customer would like this feature supported, see this internal issue: Could not parse "NULL" as type bool or JSONB #1683. |
@rafiss Can we investigate what it would take to address this and not require the workaround mentioned in https://github.com/cockroachlabs/support/issues/1683#issuecomment-1172624382 |
It should be possible with around a week of work, since we forked the CSV reader logic and can control it.
Implementation notes: The way to do it would be to modify how quoted strings versus unquoted strings are parsed. Then we'd probably need to change the return value of
While investigating this, I also realized there is a bug in our implementation of
In Cockroach, this results in:
I think the fact that there's a bug in COPY too means we should prioritize a fix. |
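As an illustration of the approach described in the implementation notes, the following is a rough sketch under stated assumptions, not the forked reader itself. The `field` type, `splitLine`, and `toDatum` are all hypothetical names, and the splitter handles only single-line records. The idea is that the parser reports whether each field was quoted, and only an unquoted field matching the null marker is mapped to NULL:

```go
package main

import (
	"fmt"
	"strings"
)

// field is a hypothetical return type: the value plus whether it was quoted.
type field struct {
	val    string
	quoted bool
}

// splitLine is a simplified splitter for one CSV record (no embedded
// newlines). It tracks the quoted flag that encoding/csv does not expose.
func splitLine(line string) ([]field, error) {
	var out []field
	var sb strings.Builder
	inQuotes, quoted := false, false
	rs := []rune(line)
	for i := 0; i < len(rs); i++ {
		c := rs[i]
		switch {
		case inQuotes && c == '"' && i+1 < len(rs) && rs[i+1] == '"':
			sb.WriteRune('"') // doubled quote inside a quoted field
			i++
		case inQuotes && c == '"':
			inQuotes = false // closing quote
		case inQuotes:
			sb.WriteRune(c)
		case c == '"' && sb.Len() == 0 && !quoted:
			inQuotes, quoted = true, true // opening quote at field start
		case c == ',':
			out = append(out, field{sb.String(), quoted})
			sb.Reset()
			quoted = false
		default:
			sb.WriteRune(c)
		}
	}
	if inQuotes {
		return nil, fmt.Errorf("unterminated quoted field")
	}
	return append(out, field{sb.String(), quoted}), nil
}

// toDatum returns nil (standing in for SQL NULL) only when the field is
// unquoted and equals the null marker; quoted fields are always values.
func toDatum(f field, nullIf string) *string {
	if !f.quoted && f.val == nullIf {
		return nil
	}
	v := f.val
	return &v
}

func main() {
	fields, err := splitLine(`1,"NULL",NULL`)
	if err != nil {
		panic(err)
	}
	for _, f := range fields {
		fmt.Printf("val=%q quoted=%t isNull=%t\n", f.val, f.quoted, toDatum(f, "NULL") == nil)
	}
}
```

With the null marker set to the empty string, the same rule distinguishes `""` (empty string) from an empty unquoted field (NULL).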
@vy-ton I have a fix available, but how would you like to handle the backwards compatibility for this? Some specific questions:
|
Proposed guiding principles:
So applying this for COPY null handling:
@rafiss to answer your questions, let me know what you think
IMPORT should match CRDB/PG COPY
What does
I was thinking the default should match COPY and configuration needed for the existing behavior. Do you think that's too disruptive? |
If you do
That sounds fine to me. I'll mark it as a breaking change in the release notes |
Postgres copy with csv format distinguishes a nullable text field that is null from one that is an empty string by using "" as the empty string and an unquoted empty string (i.e., nothing) as null. This is round-trippable by default with the Postgres copy. We should support this syntax.
See https://forum.cockroachlabs.com/t/import-from-csv-fails-on-null-data-for-int-types/1067/1
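The writing side of that convention can be sketched as follows. This is an illustrative encoder, not Postgres or CockroachDB code; `encodeField` and `encodeRow` are hypothetical helpers, and a nil pointer stands in for SQL NULL. NULL is written as nothing, while an empty string is written as `""`, which is what makes the format round-trippable:

```go
package main

import (
	"fmt"
	"strings"
)

// encodeField is a hypothetical helper. A nil pointer stands in for NULL.
func encodeField(v *string) string {
	if v == nil {
		return "" // NULL: emit nothing between the delimiters
	}
	// Quote empty strings so they differ from NULL, and quote anything
	// containing a delimiter, quote, or line break.
	if *v == "" || strings.ContainsAny(*v, ",\"\r\n") {
		return `"` + strings.ReplaceAll(*v, `"`, `""`) + `"`
	}
	return *v
}

// encodeRow joins the encoded fields with commas.
func encodeRow(vals ...*string) string {
	parts := make([]string, len(vals))
	for i, v := range vals {
		parts[i] = encodeField(v)
	}
	return strings.Join(parts, ",")
}

func main() {
	one, empty := "1", ""
	// An int, an empty string, and a NULL.
	fmt.Println(encodeRow(&one, &empty, nil)) // 1,"",
}
```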
Epic: CRDB-14049
Jira issue: CRDB-17420