[DO NOT REVIEW] Table and schema level collations #48090
Closed
We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
cloud-fan pushed a commit that referenced this pull request on Dec 26, 2024
### What changes were proposed in this pull request?

This change introduces table- and view-level collation support in Spark SQL, allowing the CREATE TABLE, ALTER TABLE and CREATE VIEW commands to specify a DEFAULT COLLATION. For CREATE commands, the default applies to all columns added as part of the table/view creation. For the ALTER TABLE command, it applies only to columns created afterwards; existing columns are not affected, i.e. their collation remains the same.

The PR has been modelled after the original changes made by stefankandic in #48090. This PR covers table- and view-level collations, whereas a follow-up PR will cover schema-level collations.

This PR adds/extends the corresponding DDL commands for specifying table/view-level collation. A separate follow-up PR will leverage the table/view collation to determine default collations for input queries of DML commands.

### Why are the changes needed?

Based on feedback from our internal users, many people would like to be able to specify a collation for their objects as a whole, instead of for each individual column. This change adds support for table- and view-level collations, whereas subsequent changes will add support for other objects, such as schema-level collations.

### Does this PR introduce _any_ user-facing change?

The change follows the agreed additions in syntax for collation support. The following syntax is now supported (**bold** parts denote additions):

    { { [CREATE OR] REPLACE TABLE | CREATE [EXTERNAL] TABLE [ IF NOT EXISTS ] }
      table_name
      [ table_specification ]
      [ USING data_source ]
      [ table_clauses ]
      [ AS query ] }

    table_specification
      ( { column_identifier column_type [ column_properties ] } [, ...]
        [ , table_constraint ] [...] )

    table_clauses
      { OPTIONS clause |
        PARTITIONED BY clause |
        CLUSTER BY clause |
        clustered_by_clause |
        LOCATION path [ WITH ( CREDENTIAL credential_name ) ] |
        COMMENT table_comment |
        TBLPROPERTIES clause |
        **DEFAULT COLLATION table_collation_name |**
        WITH { ROW FILTER clause } } [...]

    CREATE [ OR REPLACE ] [ TEMPORARY ] VIEW [ IF NOT EXISTS ] view_name
      [ column_list ]
      [ schema_binding |
        COMMENT view_comment |
        TBLPROPERTIES clause |
        **DEFAULT COLLATION collation_name** ] [...]
      AS query

    ALTER TABLE table_name
      { ADD COLUMN clause |
        ALTER COLUMN clause |
        DROP COLUMN clause |
        RENAME COLUMN clause |
        **DEFAULT COLLATION clause | …** }

### How was this patch tested?

Tests for the new syntax/functionality were added as part of the change. Also, some of the existing tests were extended/amended to cover the new DEFAULT COLLATION for table/view objects.

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #49084 from dejankrak-db/object-level-collations.

Authored-by: Dejan Krakovic <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
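To make the CREATE vs. ALTER semantics described in the commit message concrete, here is a minimal toy model in Python. It is not Spark code: `ToyTable`, `add_column` and `alter_default_collation` are hypothetical names invented for this sketch, and two behaviours are assumptions rather than statements from the commit message: that an explicitly given column collation takes precedence over the table default, and that `UTF8_BINARY` is the fallback when no default is given.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ToyTable:
    """Hypothetical stand-in for a table; not a Spark class."""
    # Assumption: UTF8_BINARY is used when no DEFAULT COLLATION is specified.
    default_collation: str = "UTF8_BINARY"
    columns: Dict[str, str] = field(default_factory=dict)  # column name -> collation

    def add_column(self, name: str, collation: Optional[str] = None) -> None:
        # A column without an explicit collation picks up the table default
        # that is in effect at the moment the column is added.
        # Assumption: an explicit collation overrides the default.
        self.columns[name] = collation or self.default_collation

    def alter_default_collation(self, collation: str) -> None:
        # Only the default changes; collations already recorded for
        # existing columns are left untouched.
        self.default_collation = collation


t = ToyTable(default_collation="UTF8_LCASE")  # like CREATE TABLE ... DEFAULT COLLATION UTF8_LCASE
t.add_column("c1")                            # inherits UTF8_LCASE
t.alter_default_collation("UNICODE")          # like ALTER TABLE ... DEFAULT COLLATION UNICODE
t.add_column("c2")                            # inherits UNICODE
print(t.columns)  # {'c1': 'UTF8_LCASE', 'c2': 'UNICODE'}
```

The point of the sketch is that the collation is resolved once, when a column is added, so changing the table default later does not rewrite existing columns.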