Handle sort order with nested columns on iceberg table #22099
base: master
Changes from all commits
c1e8819
7b188e7
49a0a78
0389108
8ad3745
@@ -1602,9 +1602,8 @@ public void testSortingOnNestedField()
 assertThat(query("CREATE TABLE " + tableName + " (nationkey BIGINT, row_t ROW(name VARCHAR, regionkey BIGINT, comment VARCHAR)) " +
         "WITH (sorted_by = ARRAY['\"row_t\".\"comment\"'])"))
         .failure().hasMessageContaining("Unable to parse sort field: [\"row_t\".\"comment\"]");
-assertThat(query("CREATE TABLE " + tableName + " (nationkey BIGINT, row_t ROW(name VARCHAR, regionkey BIGINT, comment VARCHAR)) " +
-        "WITH (sorted_by = ARRAY['\"row_t.comment\"'])"))
-        .failure().hasMessageContaining("Column not found: row_t.comment");
+assertUpdate("CREATE TABLE " + tableName + " (nationkey BIGINT, row_t ROW(name VARCHAR, regionkey BIGINT, comment VARCHAR)) " +
+        "WITH (sorted_by = ARRAY['\"row_t.comment\"'])");
Comment on lines +1605 to +1606
Please add a test case that verifies the table files are sorted when a nested field is used. Something like
@krvikash I'm trying to add a test case for this and I'm having some problems/questions. The test case that I've created is the following:
try (TestTable table = new TestTable(
getQueryRunner()::execute,
"test_sorted_table_using_nested_fields",
" (id INT, row_t ROW(name VARCHAR)) WITH (format = '" + format.name() + "', sorted_by = ARRAY[ '\"row_t.name\"' ])")) {
assertUpdate(
withSmallRowGroups,
"INSERT INTO " + table.getName() + "(id, row_t)" +
"SELECT id, ROW(CONCAT('v', CAST(id as VARCHAR))) as row_t FROM UNNEST(sequence(1, 500)) AS t(id)",
500);
for (Object filePath : computeActual("SELECT file_path from \"" + table.getName() + "$files\"").getOnlyColumnAsSet()) {
assertThat(isFileSorted(Location.of((String) filePath), "name")).isTrue();
}
assertQuery("SELECT * FROM " + table.getName(), "SELECT * FROM " + table.getName() + " ORDER BY id");
}
The method fails with:
java.lang.IllegalArgumentException: expected one element but was: <row_t, name>
at com.google.common.collect.Iterators.getOnlyElement(Iterators.java:322)
at io.trino.plugin.iceberg.IcebergTestUtils.lambda$3(IcebergTestUtils.java:144)
at java.base/java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:193)
at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1709)
at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:556)
at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:546)
at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:921)
at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:265)
at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:702)
at io.trino.plugin.iceberg.IcebergTestUtils.checkParquetFileSorting(IcebergTestUtils.java:145)
at io.trino.plugin.iceberg.catalog.jdbc.TestIcebergJdbcCatalogConnectorSmokeTest.isFileSorted(TestIcebergJdbcCatalogConnectorSmokeTest.java:188)
at io.trino.plugin.iceberg.BaseIcebergConnectorSmokeTest.testSortedTableUsingNestedField(BaseIcebergConnectorSmokeTest.java:551)
at java.base/java.lang.reflect.Method.invoke(Method.java:580)
at java.base/java.util.concurrent.RecursiveAction.exec(RecursiveAction.java:194)
at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:507)
at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1491)
at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:2073)
at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:2035)
at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:187)
The assert fails with:
java.lang.AssertionError: Execution of 'expected' query failed: SELECT * FROM test_sorted_table_using_nested_fields15ugt0uecl ORDER BY id
at io.trino.testing.QueryAssertions.assertDistributedQuery(QueryAssertions.java:322)
at io.trino.testing.QueryAssertions.assertQuery(QueryAssertions.java:187)
at io.trino.testing.QueryAssertions.assertQuery(QueryAssertions.java:160)
at io.trino.testing.AbstractTestQueryFramework.assertQuery(AbstractTestQueryFramework.java:350)
at io.trino.plugin.iceberg.BaseIcebergConnectorSmokeTest.testSortedTableUsingNestedField(BaseIcebergConnectorSmokeTest.java:553)
at java.base/java.lang.reflect.Method.invoke(Method.java:580)
at java.base/java.util.concurrent.RecursiveAction.exec(RecursiveAction.java:194)
at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:507)
at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1491)
at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:2073)
at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:2035)
at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:187)
Caused by: org.jdbi.v3.core.statement.UnableToCreateStatementException: org.h2.jdbc.JdbcSQLSyntaxErrorException: Table "TEST_SORTED_TABLE_USING_NESTED_FIELDS15UGT0UECL" not found; SQL statement:
It seems that the expected and the actual queries are executed in different query runners. The expected is executed on
My question is: how can I test this properly? Should I update the
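The first exception above (expected one element but was: <row_t, name>) suggests that the file-sorting check assumes every column path has exactly one element, which would not hold for a field nested inside a ROW. A minimal sketch of one way to look up a column by its full dotted path and check ordering follows. NestedColumnPaths, findColumn and isSorted are hypothetical names used only for illustration; they are not part of Trino, Iceberg or Parquet, and column paths are modelled as plain lists of strings.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical helper for illustration only; not Trino, Iceberg or Parquet code.
final class NestedColumnPaths
{
    private NestedColumnPaths() {}

    // Returns the index of the column whose path, joined with '.',
    // equals the dotted sort column name (for example "row_t.name").
    // Assumption: field names themselves do not contain dots.
    static Optional<Integer> findColumn(List<List<String>> columnPaths, String dottedName)
    {
        for (int i = 0; i < columnPaths.size(); i++) {
            if (String.join(".", columnPaths.get(i)).equals(dottedName)) {
                return Optional.of(i);
            }
        }
        return Optional.empty();
    }

    // True when the values are in non-descending order.
    // Assumption: the list contains no nulls.
    static <T extends Comparable<T>> boolean isSorted(List<T> values)
    {
        for (int i = 1; i < values.size(); i++) {
            if (values.get(i - 1).compareTo(values.get(i)) > 0) {
                return false;
            }
        }
        return true;
    }
}
```

With column paths [id] and [row_t, name], findColumn(paths, "row_t.name") selects the second column, while a lookup by the leaf name "name" alone finds nothing. Note that VARCHAR values such as 'v1', 'v10', 'v2' compare lexicographically, so a file sorted on row_t.name is not necessarily sorted on id.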
 }

 @Test
@mattheusv can we have test coverage for this?
@krvikash we already have a test case, BaseIcebergConnectorTest.testSortingOnNestedField:1413, that expects an exception (it's failing right now). Would it make sense to just change it to expect success instead of an error?
Well, yes.
This should be the purpose of this PR, right? Allowing a sort order to be defined on nested fields.
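To make the intent concrete: in the updated test, sorted_by = ARRAY['"row_t.comment"'] names a field inside a ROW column through a dotted path. The sketch below shows one way such a path could be resolved against a nested schema. It is only an illustration under stated assumptions: Field and NestedFieldResolver are hypothetical types, not Trino or Iceberg classes, and the PR's actual resolution logic may differ.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical schema model for illustration only; not Iceberg or Trino types.
record Field(String name, String type, List<Field> children)
{
    static Field leaf(String name, String type)
    {
        return new Field(name, type, List.of());
    }

    static Field row(String name, List<Field> children)
    {
        return new Field(name, "ROW", children);
    }
}

final class NestedFieldResolver
{
    private NestedFieldResolver() {}

    // Walks the schema one path segment at a time, descending into the
    // children of each matched field. Returns empty if any segment is missing.
    // Assumption: field names do not contain dots themselves.
    static Optional<Field> resolve(List<Field> topLevel, String dottedName)
    {
        List<Field> candidates = topLevel;
        Field current = null;
        // limit -1 keeps trailing empty segments so that "row_t." does not resolve
        for (String part : dottedName.split("\\.", -1)) {
            current = null;
            for (Field field : candidates) {
                if (field.name().equals(part)) {
                    current = field;
                    break;
                }
            }
            if (current == null) {
                return Optional.empty();
            }
            candidates = current.children();
        }
        return Optional.ofNullable(current);
    }
}
```

For a schema mirroring the test table (nationkey BIGINT, row_t ROW(name VARCHAR, regionkey BIGINT, comment VARCHAR)), resolve(schema, "row_t.comment") yields the VARCHAR leaf, and resolve(schema, "row_t.missing") yields an empty result.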
I've updated the test. Can you folks please take a look? @findinpath @krvikash