Support quoted identifiers in Iceberg partitioning #12227

mdesmet · 2022-05-03T18:55:03Z

Description

Adds support for quoted identifiers in Iceberg partitioning.

Trino Iceberg allows tables to be created using quoted identifiers.

CREATE TABLE test AS SELECT 1 as "a quoted identifier";

However, when a partitioning property is added, these columns can't be referenced in it.

CREATE TABLE test WITH(partitioning=ARRAY['a quoted identifier']) ... fails with error Invalid partition field declaration: a quoted identifier

Is this change a fix, improvement, new feature, refactoring, or other?

Fix

Is this a change to the core query engine, a connector, client library, or the SPI interfaces? (be specific)

Change to Iceberg connector

How would you describe this change to a non-technical end user or system administrator?

Related issues, pull requests, and links

Fixes Allow defining Iceberg partitioning over a column with whitespace in its name #12226

Documentation

( ) No documentation is needed.
( ) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.

Release notes

( ) No release notes entries required.
( ) Release notes entries required with the following suggested text:

# Section
* Fix some things. ({issue}`issuenumber`)

findepi · 2022-05-04T11:56:05Z

plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/PartitionFields.java

@@ -29,10 +29,13 @@

 public final class PartitionFields
 {
-    private static final String NAME = "[a-z_][a-z0-9_]*";
+    private static final String IDENTIFIER = "[[a-z]_][[a-z0-9]_]*";
+    private static final String QUOTED_IDENTIFIER = "(?:\"[^\"]*\")+";


A column name can itself contain a quotation mark (").
In SQL, this is denoted by doubling the character: "a column with "" quotation mark"

It was indeed only partially implemented. I have adjusted the regex and added additional test cases. The build failure doesn't seem related (java.util.concurrent.TimeoutException: Idle timeout 5000 ms); the build was green locally.

findinpath · 2022-05-07T04:25:20Z

Another place where the parsing of the partition field fails is the following:

CREATE TABLE iceberg.default.testp3  WITH (partitioning = ARRAY['truncate(name   , 1)']) AS SELECT * FROM tpch.sf1.nation WHERE nationkey < 10;
Query 20220506_135829_00008_wqaig failed: Invalid partition field declaration: truncate(name   , 1)
java.lang.IllegalArgumentException: Invalid partition field declaration: truncate(name   , 1)
	at io.trino.plugin.iceberg.PartitionFields.parsePartitionField(PartitionFields.java:73)
	at io.trino.plugin.iceberg.PartitionFields.parsePartitionFields(PartitionFields.java:54)
	at io.trino.plugin.iceberg.IcebergMetadata.getNewTableLayout(IcebergMetadata.java:542)
	at io.trino.plugin.base.classloader.ClassLoaderSafeConnectorMetadata.getNewTableLayout(ClassLoaderSafeConnectorMetadata.java:118)
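For illustration, a minimal sketch of how a pattern could tolerate whitespace around the arguments of truncate. The class name, the pattern, and the "column:width" return format are hypothetical and are not taken from the connector; only the error message text comes from the stack trace above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class WhitespaceTolerantTruncate
{
    // Hypothetical pattern for illustration; not the connector's actual code.
    private static final String UNQUOTED_IDENTIFIER = "[a-z_][a-z0-9_]*";
    private static final Pattern TRUNCATE_PATTERN = Pattern.compile(
            "truncate\\(\\s*(" + UNQUOTED_IDENTIFIER + ")\\s*,\\s*(\\d+)\\s*\\)");

    private WhitespaceTolerantTruncate() {}

    // Returns "column:width" for a matching declaration, otherwise throws
    // with the same message shape as in the stack trace above.
    static String parse(String field)
    {
        Matcher matcher = TRUNCATE_PATTERN.matcher(field);
        if (!matcher.matches()) {
            throw new IllegalArgumentException("Invalid partition field declaration: " + field);
        }
        return matcher.group(1) + ":" + matcher.group(2);
    }

    public static void main(String[] args)
    {
        System.out.println(parse("truncate(name   , 1)")); // prints name:1
        System.out.println(parse("truncate( name,1 )"));   // prints name:1
    }
}
```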

findepi · 2022-05-09T07:27:19Z

cc @electrum @phd3 @alexjo2144

findinpath · 2022-05-09T10:49:53Z

trino> CREATE TABLE iceberg.default.test AS SELECT 1 as "a quoted identifier";
CREATE TABLE: 1 row

Query 20220509_104608_00015_idcj4, FINISHED, 1 node
Splits: 19 total, 19 done (100.00%)
1.45 [0 rows, 0B] [0 rows/s, 0B/s]

trino> CREATE TABLE iceberg.default.test2 WITH(partitioning=ARRAY['a quoted identifier']) as select * from iceberg.default.test;
Query 20220509_104756_00019_idcj4 failed: Invalid partition field declaration: a quoted identifier
java.lang.IllegalArgumentException: Invalid partition field declaration: a quoted identifier
	at io.trino.plugin.iceberg.PartitionFields.parsePartitionField(PartitionFields.java:92)
	at io.trino.plugin.iceberg.PartitionFields.parsePartitionFields(PartitionFields.java:57)
	at io.trino.plugin.iceberg.IcebergMetadata.getNewTableLayout(IcebergMetadata.java:538)
	at io.trino.plugin.base.classloader.ClassLoaderSafeConnectorMetadata.getNewTableLayout(ClassLoaderSafeConnectorMetadata.java:118)
	at io.trino.metadata.MetadataManager.getNewTableLayout(MetadataManager.java:830)
	at io.trino.sql.analyzer.StatementAnalyzer$Visitor.visitCreateTableAsSelect(StatementAnalyzer.java:853)
	at io.trino.sql.analyzer.StatementAnalyzer$Visitor.visitCreateTableAsSelect(StatementAnalyzer.java:404)
	at io.trino.sql.tree.CreateTableAsSelect.accept(CreateTableAsSelect.java:96)
	at io.trino.sql.tree.AstVisitor.process(AstVisitor.java:27)
	at io.trino.sql.analyzer.StatementAnalyzer$Visitor.process(StatementAnalyzer.java:421)
	at io.trino.sql.analyzer.StatementAnalyzer.analyze(StatementAnalyzer.java:384)
	at io.trino.sql.analyzer.Analyzer.analyze(Analyzer.java:79)
	at io.trino.sql.analyzer.Analyzer.analyze(Analyzer.java:71)
	at io.trino.execution.SqlQueryExecution.analyze(SqlQueryExecution.java:269)
	at io.trino.execution.SqlQueryExecution.<init>(SqlQueryExecution.java:193)
	at io.trino.execution.SqlQueryExecution$SqlQueryExecutionFactory.createQueryExecution(SqlQueryExecution.java:808)
	at io.trino.dispatcher.LocalDispatchQueryFactory.lambda$createDispatchQuery$0(LocalDispatchQueryFactory.java:135)
	at io.trino.$gen.Trino_380_3_g0537951____20220509_104303_2.call(Unknown Source)
	at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:131)

mdesmet · 2022-05-09T11:12:45Z

trino> CREATE TABLE iceberg.default.test AS SELECT 1 as "a quoted identifier";
CREATE TABLE: 1 row

Query 20220509_104608_00015_idcj4, FINISHED, 1 node
Splits: 19 total, 19 done (100.00%)
1.45 [0 rows, 0B] [0 rows/s, 0B/s]

trino> CREATE TABLE iceberg.default.test2 WITH(partitioning=ARRAY['a quoted identifier']) as select * from iceberg.default.test;
Query 20220509_104756_00019_idcj4 failed: Invalid partition field declaration: a quoted identifier

The currently supported syntax would be:

CREATE TABLE iceberg.default.test2 WITH(partitioning=ARRAY['"a quoted identifier"']) as select * from iceberg.default.test;

Without the quotes it currently fails, because it has to comply with the standard identifier regex: [a-z_][a-z0-9_]*. According to the SQL spec, it should be matched case-insensitively. I think that's not yet implemented.

The reasoning is that the array contains valid SQL strings that obey the standard SQL spec.

alexjo2144 · 2022-05-09T20:04:11Z

plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/PartitionFields.java

-    private static final String FUNCTION_ARGUMENT_NAME = "\\((" + NAME + ")\\)";
-    private static final String FUNCTION_ARGUMENT_NAME_AND_INT = "\\((" + NAME + "), *(\\d+)\\)";
+    private static final String IDENTIFIER = "[a-z_][a-z0-9_]*";
+    private static final String QUOTED_IDENTIFIER = "\"[^\"]*(?:(?:\"\")+[^\"]*)*\"";


Can you add a comment for each of the non-trivial regex patterns?

I think with @findepi's great suggestion, that regex is now a lot simpler. It was indeed a bit convoluted.

findepi · 2022-05-10T09:31:02Z

I am not yet convinced we need quotes at all, for partitioning values.
For example, month(some date) can be interpreted unambiguously (and so can be month(some column with unusual characters: )(,(()).

Do we envision the partition transforms to represent a language, e.g. have nested expression-like structure?

mdesmet · 2022-05-10T10:23:59Z

I am not yet convinced we need quotes at all, for partitioning values. For example, month(some date) can be interpreted unambiguously (and so can be month(some column with unusual characters: )(,(()).

Do we envision the partition transforms to represent a language, e.g. have nested expression-like structure?

truncate(test, 12, 12) vs truncate("test, 12", 12)

From a user perspective I think the second one is a lot more obvious, clearly separating arguments in standard SQL syntax.

Omitting the quotes will also not let us distinguish between quoted identifiers vs normal identifiers, which might get us into trouble once we truly support quoted identifiers. In the future, SELECT 1 as "TEST" may not match SELECT 1 as "test" if we comply with the SQL spec (currently Trino converts all column names to lowercase).

I would say this is a new feature, so it should conform to the SQL identifier specs as mentioned by @kasiafi in #11163 (comment)
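To make the ambiguity concrete, here is a small sketch. The identifier patterns are the ones shown in the review diffs of this PR; the combined TRUNCATE pattern, the class name, and the helper methods are hypothetical. With quotes, the column named `test, 12` is separated cleanly from the width argument, while the unquoted form is rejected.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class QuotedTruncateExample
{
    // Identifier patterns as they appear in the review diffs; the combined
    // TRUNCATE pattern below is a hypothetical illustration.
    private static final String UNQUOTED_IDENTIFIER = "[a-z_][a-z0-9_]*";
    private static final String QUOTED_IDENTIFIER = "\"(?:\"\"|[^\"])*\"";
    private static final Pattern TRUNCATE = Pattern.compile(
            "truncate\\(\\s*(" + UNQUOTED_IDENTIFIER + "|" + QUOTED_IDENTIFIER + ")\\s*,\\s*(\\d+)\\s*\\)");

    private QuotedTruncateExample() {}

    // True when the declaration is a well-formed two-argument truncate.
    static boolean isValid(String field)
    {
        return TRUNCATE.matcher(field).matches();
    }

    // Returns the raw identifier argument (still quoted, if it was quoted).
    static String identifierOf(String field)
    {
        Matcher matcher = TRUNCATE.matcher(field);
        if (!matcher.matches()) {
            throw new IllegalArgumentException("Invalid partition field declaration: " + field);
        }
        return matcher.group(1);
    }

    public static void main(String[] args)
    {
        System.out.println(isValid("truncate(test, 12, 12)"));          // prints false
        System.out.println(identifierOf("truncate(\"test, 12\", 12)")); // prints "test, 12"
    }
}
```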

findepi · 2022-05-11T13:39:57Z

Omitting the quotes will also not let us distinguish between quoted identifiers vs normal identifiers, which might get us into trouble once we truly support quoted identifiers.

We don't need to. Partitioning specification is a mini-language and doesn't need to follow SQL identifier semantics (which are not as simple as they could be)

truncate(test, 12, 12) vs truncate("test, 12", 12)

From a user perspective I think the second one is a lot more obvious, clearly separating arguments in standard SQL syntax.

I agree. OTOH it's a fair price for putting commas in a column name. It's a bad idea and nothing will change that.

Anyway, sans apostrophes it hurts my eyes, so yeah, let's go with quotes

findepi · 2022-05-11T13:46:28Z

plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/PartitionFields.java

-    private static final String FUNCTION_ARGUMENT_NAME_AND_INT = "\\((" + NAME + "), *(\\d+)\\)";
+    private static final String IDENTIFIER = "[a-z_][a-z0-9_]*";
+    private static final String QUOTED_IDENTIFIER = "\"[^\"]*(?:(?:\"\")+[^\"]*)*\"";
+    private static final String NAME = "\\s*(" + IDENTIFIER + "|" + QUOTED_IDENTIFIER + ")\\s*";


IDENTIFIER -> UNQUOTED_IDENTIFIER
NAME -> IDENTIFIER

findepi · 2022-05-11T13:47:52Z

plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/PartitionFields.java

-    private static final String FUNCTION_ARGUMENT_NAME = "\\((" + NAME + ")\\)";
-    private static final String FUNCTION_ARGUMENT_NAME_AND_INT = "\\((" + NAME + "), *(\\d+)\\)";
+    private static final String IDENTIFIER = "[a-z_][a-z0-9_]*";
+    private static final String QUOTED_IDENTIFIER = "\"[^\"]*(?:(?:\"\")+[^\"]*)*\"";


Suggested change

private static final String QUOTED_IDENTIFIER = "\"[^\"]*(?:(?:\"\")+[^\"]*)*\"";

private static final String QUOTED_IDENTIFIER = "\"(?:\"\"|[^\"])*\"";
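As a quick sanity check, the sketch below compiles both patterns from this suggestion and compares them on a few sample strings. The class and method names are hypothetical, and the samples are illustrative rather than an exhaustive proof of equivalence.

```java
import java.util.List;
import java.util.regex.Pattern;

final class QuotedIdentifierPatterns
{
    // The pattern before the suggestion and the simplified one, copied from the diff above.
    static final Pattern ORIGINAL = Pattern.compile("\"[^\"]*(?:(?:\"\")+[^\"]*)*\"");
    static final Pattern SIMPLIFIED = Pattern.compile("\"(?:\"\"|[^\"])*\"");

    private QuotedIdentifierPatterns() {}

    static boolean matchesOriginal(String value)
    {
        return ORIGINAL.matcher(value).matches();
    }

    static boolean matchesSimplified(String value)
    {
        return SIMPLIFIED.matcher(value).matches();
    }

    public static void main(String[] args)
    {
        List<String> samples = List.of(
                "\"a quoted identifier\"",
                "\"a column with \"\" quotation mark\"",
                "\"\"\"\"",                  // a name consisting of a single quotation mark
                "\"unbalanced \" quote\"",   // a lone inner quote is not valid
                "no quotes");
        for (String sample : samples) {
            // Print the sample followed by the verdict of each pattern.
            System.out.println(sample + " -> " + matchesOriginal(sample) + " / " + matchesSimplified(sample));
        }
    }
}
```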

kasiafi · 2022-05-15T11:46:55Z

@mdesmet
I agree that it is reasonable to require quotes around any identifier which does not follow the pattern "[a-z_][a-z0-9_]*". I think that this will seem familiar to a Trino user, as this is how we handle identifiers in queries.

So:

CREATE TABLE iceberg.default.test2 WITH(partitioning=ARRAY['a quoted identifier']) as select * from iceberg.default.test;

should fail, but:

CREATE TABLE iceberg.default.test2 WITH(partitioning=ARRAY['"a quoted identifier"']) as select * from iceberg.default.test;

should pass.

I am only a bit concerned about the case. Before this change, only identifiers which were fully in lowercase would pass parsePartitionField(). After this change, also uppercase and mixed-case strings will pass, and then possibly fail later(?). Could you please add a test with partitioning on "X" while there is column x?
If we want to enforce that users pass lowercased names, we could check the case in parsePartitionField(). If we want to canonicalize for them, we could add lowercasing in toIdentifier().

findepi · 2022-05-16T15:06:03Z

I am only a bit concerned about the case. Before this change, only identifiers which were fully in lowercase would pass parsePartitionField(). After this change, also uppercase and mixed-case strings will pass,

Good point.
While we want to support non-lowercase identifiers in the future (#17), we don't want to be breaking backwards compatibility when doing so. We should allow only lowercase identifiers for now.

If we want to enforce that users pass lowercased names, we could check the case in parsePartitionField(). If we want to canonicalize for them, we could add lowercasing in toIdentifier().

I don't want SQL semantics here, let's be simpler. Let's require user to provide the exact same case as the column actually has (regardless of how it was created).

-- should work
CREATE TABLE t(x bigint) WITH (partitioning = ARRAY['x']);
CREATE TABLE t(x bigint) WITH (partitioning = ARRAY['"x"']);

-- should work, the column is actually `x`, not `X`
CREATE TABLE t(X bigint) WITH (partitioning = ARRAY['x']);

-- should work, the column is actually `x`, not `X` (until #17)
CREATE TABLE t("X" bigint) WITH (partitioning = ARRAY['x']); 

-- should fail, there is no column `X`. Until #17, the column is actually `x`.
CREATE TABLE t("X" bigint) WITH (partitioning = ARRAY['"X"']);

kasiafi · 2022-05-16T16:15:07Z

I don't want SQL semantics here, let's be simpler. Let's require user to provide the exact same case as the column actually has (regardless of how it was created).

I suggest that instead we require that users pass only lowercase names (through a check in parsePartitionField()). This solution will be the closest to the current semantics.

Comparing case-sensitively seems like something we might not want to do right now. Currently, we resolve column names case-insensitively, so let's be consistent.

findepi · 2022-05-17T12:56:55Z

I suggest that instead we require that users pass only lowercase names (through a check in parsePartitionField()). This solution will be the closest to the current semantics.

This is effectively what I want.

findepi · 2022-05-17T13:00:51Z

Noted conclusion under #12226 (comment)

findepi · 2022-05-17T13:01:58Z

@mdesmet please add test cases as indicated in #12227 (comment)

findepi · 2022-06-14T07:16:32Z

plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/PartitionFields.java

-    private static final Pattern VOID_PATTERN = Pattern.compile("void" + FUNCTION_ARGUMENT_NAME);
+    private static final String UNQUOTED_IDENTIFIER = "[a-zA-Z_][a-zA-Z0-9_]*";
+    // We only support lowercase quoted identifiers for now.
+    // See https://github.com/trinodb/trino/issues/12226#issuecomment-1128839259


Link to #17

please add -- or better: remove the comment here, leaving only the one at fromIdentifier

findepi · 2022-06-14T07:20:48Z

plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/PartitionFields.java

-    private static String fromIdentifier(String identifier)
+    // Currently, all Iceberg columns are stored in lowercase in the Iceberg metadata files.
+    // Unquoted identifiers are canonicalized to lowercase here which is not according ANSI SQL spec.
+    // Quoted identifiers are restricted to lowercase only through the regex pattern.


the [^A-Z] isn't sufficient for that.
What about Ą?

simplify regex, and use .toLowerCase(ENGLISH) in Java to verify parsed value is all-lower for now

The regex has been simplified and verification is done using .toLowerCase(ENGLISH).
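A minimal sketch of the approach discussed in this comment, under the assumption that quoted identifiers must already be lowercase while unquoted identifiers are canonicalized to lowercase. The class name, the isAccepted helper, and the error messages are hypothetical; this is not the merged implementation.

```java
import static java.util.Locale.ENGLISH;

import java.util.regex.Pattern;

final class IdentifierToColumnSketch
{
    private static final Pattern UNQUOTED = Pattern.compile("[a-zA-Z_][a-zA-Z0-9_]*");
    private static final Pattern QUOTED = Pattern.compile("\"(?:\"\"|[^\"])*\"");

    private IdentifierToColumnSketch() {}

    static String fromIdentifierToColumn(String identifier)
    {
        if (QUOTED.matcher(identifier).matches()) {
            // Strip the surrounding quotes and collapse doubled quotes into one.
            String column = identifier.substring(1, identifier.length() - 1).replace("\"\"", "\"");
            // Comparing with the lowercased value also rejects non-ASCII uppercase letters,
            // which a character class such as [^A-Z] would let through.
            if (!column.equals(column.toLowerCase(ENGLISH))) {
                throw new IllegalArgumentException("Uppercase characters in identifier " + identifier + " are not supported");
            }
            return column;
        }
        if (UNQUOTED.matcher(identifier).matches()) {
            // Assumption: unquoted identifiers are canonicalized to lowercase.
            return identifier.toLowerCase(ENGLISH);
        }
        throw new IllegalArgumentException("Invalid identifier: " + identifier);
    }

    // Convenience wrapper: true when the identifier is accepted.
    static boolean isAccepted(String identifier)
    {
        try {
            fromIdentifierToColumn(identifier);
            return true;
        }
        catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args)
    {
        System.out.println(fromIdentifierToColumn("\"a quoted identifier\"")); // prints a quoted identifier
        System.out.println(fromIdentifierToColumn("X"));                       // prints x
        System.out.println(isAccepted("\"X\""));                               // prints false
        System.out.println(isAccepted("\"\u0104\""));                          // prints false
    }
}
```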

findepi · 2022-06-14T07:23:17Z

plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java

@@ -625,7 +626,7 @@ public void setTableComment(ConnectorSession session, ConnectorTableHandle table
    public Optional<ConnectorTableLayout> getNewTableLayout(ConnectorSession session, ConnectorTableMetadata tableMetadata)
    {
        Schema schema = toIcebergSchema(tableMetadata.getColumns());
-        PartitionSpec partitionSpec = parsePartitionFields(schema, getPartitioning(tableMetadata.getProperties()));
+        PartitionSpec partitionSpec = createPartitionSpec(schema, getPartitioning(tableMetadata.getProperties()));


When should I call createPartitionSpec and when parsePartitionFields?
They have similar names, and even more similar semantics.

Also, how does the introduction of the wrapper (that throws TrinoException) relate to adding quoted identifiers?
The parsing could fail even before the change, right? (e.g. unsupported transform, missing closing brace, etc.)

I removed the wrapper; parsePartitionFields now catches the exception and rethrows it as TrinoException. This makes testing easier in BaseConnectorTest, as we expect a TrinoException there.
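The catch-and-rethrow shape described here could look roughly like the sketch below. UserFacingException is a hypothetical stand-in for TrinoException so that the example compiles without Trino on the classpath; parseField, the error message prefix, and the returned count are likewise assumptions made for illustration.

```java
import java.util.List;

final class PartitionParsingErrors
{
    // Hypothetical stand-in for an engine-level exception such as TrinoException.
    static final class UserFacingException extends RuntimeException
    {
        UserFacingException(String message, Throwable cause)
        {
            super(message, cause);
        }
    }

    private PartitionParsingErrors() {}

    // Simplified field check; the real parser handles transforms and quoted identifiers.
    private static void parseField(String field)
    {
        if (!field.matches("[a-z_][a-z0-9_]*")) {
            throw new IllegalArgumentException("Invalid partition field declaration: " + field);
        }
    }

    // Parses all fields and converts low-level parse failures into the user-facing exception.
    static int parsePartitionFields(List<String> fields)
    {
        try {
            for (String field : fields) {
                parseField(field);
            }
            return fields.size();
        }
        catch (IllegalArgumentException e) {
            throw new UserFacingException("Unable to parse partitioning value: " + e.getMessage(), e);
        }
    }

    // Convenience wrapper: true when parsing fails with the user-facing exception.
    static boolean failsWithUserFacingException(List<String> fields)
    {
        try {
            parsePartitionFields(fields);
            return false;
        }
        catch (UserFacingException e) {
            return true;
        }
    }

    public static void main(String[] args)
    {
        System.out.println(parsePartitionFields(List.of("a", "b")));                  // prints 2
        System.out.println(failsWithUserFacingException(List.of("not a valid one"))); // prints true
    }
}
```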

mdesmet · 2022-06-20T13:03:53Z

The following test failed in the last build. It's actually related to #12626. I have set up the product tests to run with iceberg.unique-table-location=true to avoid having retries write to the same location.

TestIcebergHiveViewsCompatibility > testIcebergHiveViewsCompatibility [groups: storage_formats, hms_only, iceberg]
java.sql.SQLException: Query failed (#20220619_213321_01205_fz3sq): Cannot create a table on a non-empty location: hdfs://hadoop-master:9000/user/hive/warehouse/iceberg_table, set 'iceberg.unique-table-location=true' in your Iceberg catalog properties to use unique table locations for every table.

mdesmet · 2022-06-21T06:37:25Z

plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergUtil.java

@@ -151,7 +152,11 @@

 public final class IcebergUtil
 {
-    private static final Pattern SIMPLE_NAME = Pattern.compile("[a-z][a-z0-9]*");


It seems like SIMPLE_NAME tries to achieve the same as UNQUOTED_IDENTIFIER but not exactly following SQL semantics. Changing it had some impacts in exception checking in the tests.

findinpath · 2022-06-22T02:13:34Z

testing/trino-server-dev/etc/catalog/iceberg.properties

@@ -12,5 +12,8 @@ connector.name=iceberg
 hive.metastore.uri=thrift://localhost:9083
 hive.hdfs.socks-proxy=localhost:1180

+# Ensure test retries don't write to non-empty locations


What is the reasoning behind adding this setting for the DevelopmentServer Iceberg connector?

See my comment #12227 (comment)

We can never really ensure that all cleanup (finally blocks) runs. For example, in case of Hive timeouts some table location might not be emptied, so I think this is the best way to handle it, as it ensures every table has its own unique location. This can be moved to a separate PR if necessary.

findinpath · 2022-06-22T04:21:08Z

...roduct-tests/src/main/java/io/trino/tests/product/iceberg/TestIcebergSparkCompatibility.java

+        onSpark().executeQuery(format(
+                "CREATE TABLE %s (id INTEGER, `mIxEd_COL` STRING) USING ICEBERG",
+                sparkTableName));
+        assertQueryFailure(() -> onTrino().executeQuery("ALTER TABLE " + trinoTableName + " SET PROPERTIES partitioning = ARRAY['mIxEd_COL']"))


This is a bummer.
We should provide in Trino a way for the users to cope with such situations because otherwise the users would face a Spark lock-in for such situations.

I've created a PR in iceberg to address this problem. apache/iceberg#5110

To give you some context: the original version of this PR handled this correctly. According to SQL quoted identifier semantics, a quoted identifier should be matched case-sensitively. The backticks in Spark behave the same as quoted identifiers in SQL. IMHO this is not an issue with Iceberg.

There was some discussion about this feature as indeed in Trino the column names are converted into lowercase. Take for example this query

CREATE TABLE t("X" bigint) WITH (partitioning = ARRAY['"X"']);

Because the column name is converted to lowercase in Trino, this query would fail: by that time "X" has become x, and the partitioning parsing logic fails to find the column. This is definitely confusing for the user. In an ALTER TABLE scenario, however, this is not true: the column will be known as "X". So we agreed on blocking this scenario for now, as mentioned in #12226 (comment).

Here is the code that explicitly blocks this; removing the lowercase verification would fix the Trino query above.

https://github.com/trinodb/trino/blob/2dd644fc00c636ecae168c81644761c44101327d/plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergUtil.java#L336-L343

IMHO this is not an issue with Iceberg.

I think that the PR apache/iceberg#5110 is more of a usability "improvement" and not a bug.

Please correct me if I'm wrong, I don't think it should be mandatory to specify the source column name in the same case in the table definition.

I think we better stick to SQL semantics.

Imagine the following query, which is perfectly valid syntax (not currently working in Trino, but working on Snowflake):

CREATE TABLE t("X" bigint, "x" bigint) WITH (partitioning = ARRAY['"X"']);

Because of the quoted identifiers we would know which x to match. I would think the Iceberg partitioning spec should be matched exactly against the Iceberg schema and not impose a certain way of working.

findepi · 2022-06-22T10:38:00Z

@mdesmet mdesmet force-pushed the feature/iceberg-partitioning-quoted-identifiers branch from b95953f to 2dd644f 2 days ago

This apparently also rebased on current master.
What has changed within this PR?

mdesmet · 2022-06-22T17:48:53Z

This apparently also rebased on current master.
What has changed within this PR?

The build failed because of the refactoring of SIMPLE_NAME in IcebergUtil and its impact on the expected exceptions in a few tests.

To summarise the changes:

Split the commits further.
Moved the logic for parsing quoted and unquoted identifiers to IcebergUtil so it can easily be reused in other code.
Catch exceptions in parsePartitionFields instead of in the extra method.
Fixed test flakiness that appeared since "Prevent table creation on non-empty location for Iceberg tables" (#12626), which also failed the build once, by setting iceberg.unique-table-location in product tests.

Let me know what you think.

findepi · 2022-06-24T15:49:04Z

Fixed test flakiness that appeared since "Prevent table creation on non-empty location for Iceberg tables" (#12626), which also failed the build once, by setting iceberg.unique-table-location in product tests.

What kind of problem is this fixing?

(btw this will become default in #12941, so we don't want to set this explicitly in product tests)

mdesmet · 2022-06-25T06:51:10Z

What kind of problem is this fixing?

With #12626, we throw an exception when trying to create a table and files exist on that location. Sometimes tests fail randomly and are retried without paths being cleaned. In this case testIcebergHiveViewsCompatibility failed and was retried. A random table suffix would have also fixed that issue.

TestIcebergHiveViewsCompatibility > testIcebergHiveViewsCompatibility [groups: storage_formats, hms_only, iceberg]
java.sql.SQLException: Query failed (#20220619_213321_01205_fz3sq): Cannot create a table on a non-empty location: hdfs://hadoop-master:9000/user/hive/warehouse/iceberg_table, set 'iceberg.unique-table-location=true' in your Iceberg catalog properties to use unique table locations for every table.

(btw this will become default in #12941, so we don't want to set this explicitly in product tests)

Anyway if this setting becomes default (which I definitely support), this is not an issue anymore. Will remove that commit.

findepi · 2022-06-27T14:29:24Z

plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergUtil.java

@@ -312,12 +317,38 @@ public static String quotedTableName(SchemaTableName name)

    private static String quotedName(String name)
    {
-        if (SIMPLE_NAME.matcher(name).matches()) {
+        if (UNQUOTED_IDENTIFIER_PATTERN.matcher(name).matches()) {


That changes semantics of the method.

previously, for My_Table we would output "My_Table".
now we output My_Table without quotes.

if the table name is actually My_Table, it needs to be referenced as "My_Table" in SQL,
so the output of this command no longer can be pasted into SQL.

Please revert the change here

You indeed point out a bug, that also applies to the fromColumnToIdentifier method. It is however the same semantics: we are taking something from metadata (a column or table name), and need to ensure that it can be pasted in an SQL editor, respecting SQL identifier semantics.
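The concern can be reproduced with a small sketch. SIMPLE_NAME and the quoting expression are copied from the diffs in this thread; the class name and the main method are hypothetical. Keeping the lowercase-only pattern means a mixed-case name is always emitted in quotes.

```java
import java.util.regex.Pattern;

final class QuotedNameSketch
{
    // Lowercase-only pattern, as in the line removed by the diff under review.
    private static final Pattern SIMPLE_NAME = Pattern.compile("[a-z][a-z0-9]*");

    private QuotedNameSketch() {}

    // Returns the name unchanged when it is a simple lowercase name,
    // otherwise wraps it in double quotes and doubles any embedded quotes.
    static String quotedName(String name)
    {
        if (SIMPLE_NAME.matcher(name).matches()) {
            return name;
        }
        return '"' + name.replace("\"", "\"\"") + '"';
    }

    public static void main(String[] args)
    {
        System.out.println(quotedName("nation"));   // prints nation
        System.out.println(quotedName("My_Table")); // prints "My_Table"
        System.out.println(quotedName("a\"b"));     // prints "a""b"
    }
}
```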

findepi · 2022-06-27T14:30:11Z

plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergUtil.java

            return name;
        }
        return '"' + name.replace("\"", "\"\"") + '"';
    }

+    public static String fromColumnToIdentifier(String column)


The method introduced here is unused (in the commit which introduced it), and it's unclear in which context it should be used.
Squash the changes with the next commit.

findepi · 2022-06-27T14:30:49Z

plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergUtil.java

+        return quotedName(column);
+    }
+
+    public static String fromIdentifierToColumn(String identifier)


The method introduced here is unused (in the commit which introduced it), and it's unclear in which context it should be used.
Squash the changes with the next commit.

Later it becomes used in PartitionFields, which is the context in which it's comprehensible.
(Otherwise the "identifier" will be misleading, as normally a String identifier should not be expected to have any quotes inside, or the quotes would be treated literally.)

Move the method to PartitionFields

The placement in IcebergUtil had been discussed in #12872; the need to parse quoted identifiers also exists in other table properties (sort_order, orc_bloom_filter, ...).

findepi · 2022-07-25T15:13:27Z

(cannot comment at #12227 (comment))

The placement in IcebergUtil had been discussed in #12872,

I am aware of that PR.

Later it becomes used in PartitionFields which is the context in which it's comprehensible.

It's important for a shared method to have easy-to-understand semantics (name, input and output types need to intuitively hint at what it does).
That's why I suggested scoping it down for now to a private method in PartitionFields.

findepi · 2022-08-03T08:34:14Z

@mdesmet failures seem related.

mdesmet · 2022-08-04T06:22:28Z

@mdesmet failures seem related.

I have rebased with latest master and resolved the issues.

cla-bot bot added the cla-signed label May 3, 2022

findepi reviewed May 4, 2022

mdesmet force-pushed the feature/iceberg-partitioning-quoted-identifiers branch from b0e3b42 to c9f8fc8 Compare May 4, 2022 15:43

alexjo2144 mentioned this pull request May 6, 2022

Support updating Iceberg table partitioning #12259

Merged

findinpath added the bug Something isn't working label May 7, 2022

findinpath self-requested a review May 8, 2022 05:35

mdesmet force-pushed the feature/iceberg-partitioning-quoted-identifiers branch from c9f8fc8 to 0537951 Compare May 8, 2022 11:57

alexjo2144 reviewed May 9, 2022

findepi added the syntax-needs-review label May 10, 2022

findepi added enhancement New feature or request and removed syntax-needs-review bug Something isn't working labels May 11, 2022

findepi reviewed May 11, 2022

mdesmet force-pushed the feature/iceberg-partitioning-quoted-identifiers branch from 0537951 to 5c085f2 Compare May 11, 2022 16:38

findepi mentioned this pull request May 17, 2022

Allow defining Iceberg partitioning over a column with whitespace in its name #12226

Closed

mdesmet force-pushed the feature/iceberg-partitioning-quoted-identifiers branch 2 times, most recently from ba315b8 to 384baa2 Compare May 18, 2022 20:20

findepi reviewed Jun 14, 2022

osscm mentioned this pull request Jun 15, 2022

Added support for sorted_by while creating iceberg table #12872

Closed

mdesmet force-pushed the feature/iceberg-partitioning-quoted-identifiers branch 2 times, most recently from 5577dab to 80b5e05 Compare June 19, 2022 20:29

mdesmet force-pushed the feature/iceberg-partitioning-quoted-identifiers branch from b95953f to 2dd644f Compare June 20, 2022 17:37

mdesmet commented Jun 21, 2022

mdesmet requested a review from findepi June 21, 2022 06:38

findinpath reviewed Jun 22, 2022

mdesmet force-pushed the feature/iceberg-partitioning-quoted-identifiers branch from 2dd644f to 89b4d46 Compare June 26, 2022 19:09

findepi reviewed Jun 27, 2022

mdesmet force-pushed the feature/iceberg-partitioning-quoted-identifiers branch from 89b4d46 to 5675f13 Compare June 28, 2022 19:07

mdesmet force-pushed the feature/iceberg-partitioning-quoted-identifiers branch from 5675f13 to 1ea29fc Compare July 27, 2022 15:28

findepi approved these changes Aug 2, 2022

mdesmet force-pushed the feature/iceberg-partitioning-quoted-identifiers branch from e69642d to 95d16a9 Compare August 3, 2022 19:30

mdesmet and others added 2 commits August 4, 2022 00:46

Move lowercasing of Iceberg partitioning to PartitionFields

c8e13bd

Support for Iceberg partitioning with quoted and unquoted identifiers

18cd9b3

mdesmet force-pushed the feature/iceberg-partitioning-quoted-identifiers branch from 95d16a9 to 18cd9b3 Compare August 3, 2022 23:22

findepi merged commit 90a714b into trinodb:master Aug 8, 2022

findepi mentioned this pull request Aug 8, 2022

Release notes for 393 #13474

Closed

github-actions bot added this to the 393 milestone Aug 8, 2022

colebow mentioned this pull request Aug 8, 2022

Add Trino 393 release notes #13519

Merged
