importer: IMPORT INTO ... DELIMITED DATA with collated strings does not work #107917

Closed
otan opened this issue Jul 31, 2023 · 2 comments · Fixed by #107918
Labels
A-import: Issues related to IMPORT syntax
C-bug: Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
T-sql-queries: SQL Queries Team

Comments


otan commented Jul 31, 2023

Running

CREATE TYPE countrylanguage_isofficial_enum AS ENUM ('T', 'F');
CREATE TABLE public.countrylanguage (
	countrycode VARCHAR(3) NOT NULL DEFAULT '':::STRING,
	language VARCHAR(30) COLLATE "en_US" NOT NULL DEFAULT '':::STRING,
	isofficial public.countrylanguage_isofficial_enum NOT NULL DEFAULT 'F':::public.countrylanguage_isofficial_enum,
	percentage DECIMAL(4,1) NOT NULL DEFAULT 0.0:::DECIMAL,
	CONSTRAINT countrylanguage_pkey PRIMARY KEY (countrycode ASC, language ASC),
	INDEX countrycode (countrycode ASC, language ASC)
);
IMPORT INTO countrylanguage
DELIMITED DATA (
	'https://cockroachdb-migration-examples.s3.us-east-1.amazonaws.com/mysql/world-data/countrylanguage.txt'
) WITH
	fields_enclosed_by='"',
	fields_escaped_by='\';

Yields

ERROR: https://cockroachdb-migration-examples.s3.us-east-1.amazonaws.com/mysql/world-data/countrylanguage.txt: error parsing row 1: error while parse "language" as VARCHAR(30) COLLATE en_US: column "dutch" does not exist (row: "ABW"	"Dutch"	"T"	"5.3")

However, insert into countrylanguage values ('ABW', 'Dutch', 'T', 5.3); works, and the import also works once COLLATE "en_US" is removed from the column definition.

Jira issue: CRDB-30257

otan added the C-bug label (Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.) Jul 31, 2023
blathers-crl bot added the T-sql-foundations label (SQL Foundations Team, formerly SQL Schema + SQL Sessions) Jul 31, 2023
otan changed the title from "importer: IMPORT ... COLLATED STRING with DELIMITED DATA doesn't work" to "importer: IMPORT INTO ... DELIMITED DATA with collated strings does not work" Jul 31, 2023

otan commented Jul 31, 2023

Sample of countrylanguage.txt, in case the file gets lost:

"ABW"	"Dutch"	"T"	"5.3"
"ABW"	"English"	"F"	"9.5"
"ABW"	"Papiamento"	"F"	"76.7"
"ABW"	"Spanish"	"F"	"7.4"
"AFG"	"Balochi"	"F"	"0.9"
"AFG"	"Dari"	"T"	"32.1"

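For readers unfamiliar with the format, here is a minimal Go sketch of how a row like the ones above could be split when fields_enclosed_by='"' and fields_escaped_by='\' are set. The helper splitDelimitedRow is hypothetical and is not CockroachDB's importer; it also simplifies escaping by taking the escaped character literally. It only illustrates that, once the enclosing quotes are stripped, the second field is the bare text Dutch.

```go
package main

import (
	"fmt"
	"strings"
)

// splitDelimitedRow is a hypothetical helper (not CockroachDB's importer).
// It splits one tab-separated line into fields, honoring an enclosing
// character and an escape character.
func splitDelimitedRow(line string, enclosedBy, escapedBy rune) []string {
	var fields []string
	var cur strings.Builder
	inEnclosed := false
	escaped := false
	for _, r := range line {
		switch {
		case escaped:
			// Simplification: the character after an escape is kept as-is.
			cur.WriteRune(r)
			escaped = false
		case r == escapedBy:
			escaped = true
		case r == enclosedBy:
			// Toggle the enclosed state and drop the enclosing character.
			inEnclosed = !inEnclosed
		case r == '\t' && !inEnclosed:
			// An unenclosed tab ends the current field.
			fields = append(fields, cur.String())
			cur.Reset()
		default:
			cur.WriteRune(r)
		}
	}
	return append(fields, cur.String())
}

func main() {
	row := "\"ABW\"\t\"Dutch\"\t\"T\"\t\"5.3\""
	for i, f := range splitDelimitedRow(row, '"', '\\') {
		fmt.Printf("field %d: %s\n", i, f)
	}
}
```

Each field arrives as plain text with no SQL quoting, so whatever parses it into a column type has to treat it as a raw string value.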

otan commented Jul 31, 2023

The problem is the use of parser.ParseExpr in parseAsTyp for CollatedString:

expr, err := parser.ParseExpr(s)
if err != nil {
	return nil, err
}
semaCtx := tree.MakeSemaContext()
typedExpr, err := tree.TypeCheck(ctx, expr, &semaCtx, typ)

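ParseExpr seems to treat the value as a tree.UnresolvedName, which is correct for an expression parser and would explain the column "dutch" does not exist error. Not sure why ParseDatumAsString handles collated strings through parseAsTyp, though: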

	// the internal postgres string representation of arrays.
	case types.ArrayFamily, types.CollatedStringFamily:
		return parseAsTyp(ctx, evalCtx, t, s)
	default:
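
The failure mode described above (raw text handed to an expression parser) can be illustrated by analogy. The sketch below uses Go's own go/parser from the standard library, not CockroachDB's SQL parser, and the classify helper is hypothetical. It shows that an expression parser sees a bare word as an identifier and only sees a string when the text carries quotes.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
)

// classify reports how Go's expression parser sees the input text.
// Analogy only: this is go/parser, not CockroachDB's parser.ParseExpr.
func classify(s string) string {
	expr, err := parser.ParseExpr(s)
	if err != nil {
		return "parse error"
	}
	switch expr.(type) {
	case *ast.Ident:
		// A bare word is a name to be resolved, not a value.
		return "identifier"
	case *ast.BasicLit:
		// Quoted text or a number is a literal value.
		return "literal"
	default:
		return "other expression"
	}
}

func main() {
	fmt.Println(classify("Dutch"))   // bare word
	fmt.Println(classify(`"Dutch"`)) // quoted string
}
```

By the same logic, a delimited-data field that has already lost its quotes cannot safely go through an expression parser; it would need to be turned into a string value directly.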
