Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stop changing the case for COPY TO option values #10853

Closed
tinfoil-knight opened this issue Jun 10, 2024 · 6 comments · Fixed by #11330
Closed

Stop changing the case for COPY TO option values #10853

tinfoil-knight opened this issue Jun 10, 2024 · 6 comments · Fixed by #11330
Labels
bug Something isn't working

Comments

@tinfoil-knight
Copy link
Contributor

Describe the bug

Currently, we're changing the case of every option value passed with the COPY statement:

if !(&key.contains('.')) {
// If config does not belong to any namespace, assume it is
// a format option and apply the format prefix for backwards
// compatibility.
let renamed_key = format!("format.{}", key);
options.insert(renamed_key.to_lowercase(), value_string.to_lowercase());
} else {
options.insert(key.to_lowercase(), value_string.to_lowercase());
}
}

This causes an issue in cases where the option values for some external integrations might be case sensitive (eg. like access keys).

@xinlifoobar found this while working on adding an integration to huggingface in #10792.

See previous discussion here: #9723 (comment)

To Reproduce

No response

Expected behavior

There should be some way to pass option values as is for certain integrations.

Additional context

No response

@alamb
Copy link
Contributor

alamb commented Jun 10, 2024

fyi @metesynnada

@xinlifoobar
Copy link
Contributor

Sorry if I mislead you. I encountered this issue in the CREATE EXTERNAL TABLE statement as

if !(&key.contains('.')) {
// If a config does not belong to any namespace, we assume it is
// a format option and apply the format prefix for backwards
// compatibility.
let renamed_key = format!("format.{}", key.to_lowercase());
options_map.insert(renamed_key, value_string.to_lowercase());
} else {
options_map.insert(key.to_lowercase(), value_string.to_lowercase());
}

How has such an issue typically been resolved previously? If adding a ParseOptions, e.g., enable_options_value_normalization, is a reasonable way to fix it, I could help there.

pub struct ParserOptions {
pub parse_float_as_decimal: bool,
pub enable_ident_normalization: bool,
}

@berkaysynnada
Copy link
Contributor

I believe we can remove the lowercase conversion here and instead add it when we set ConfigField's (via alter_with_string_hash_map) for the cases where we can ignore case sensitivity.

@berkaysynnada
Copy link
Contributor

If no one else is working on this issue, I'll take it. My plan is to remove the automatic conversion of values to lowercase and to create a lookup table for keys. This table will indicate the specific standardization methods.

@alamb
Copy link
Contributor

alamb commented Jul 5, 2024

Another potential option is to follow the model of the datafusion.sql_parser.enable_ident_normalization configuration setting to control if the options should be case sensitive or not. That might be the "least surprising" to people who are used (or not used) to SQL identifier normalization rules

I don't remember wll the usecases for configuration option case sensitivity, so please ignore this suggestion if it doesn't make sense

@xinlifoobar
Copy link
Contributor

enable_options_value_normalization

I draft #11330 to discuss this. Let me know your thoughts!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants