-
Notifications
You must be signed in to change notification settings - Fork 187
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Coral-Trino] Cast char fields, if necessary, to varchar type in the set operation - SqlCallTransformer
#442
Conversation
coral-trino/src/test/java/com/linkedin/coral/trino/rel2trino/TestUtils.java
Show resolved
Hide resolved
SqlCallTransformer
3a4dda3
to
37f39d3
Compare
|
||
Optional<RelDataType> selectNodeDataType; | ||
try { | ||
selectNodeDataType = Optional.of(deriveRelDatatype(sqlNode)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aastha25 in some cases while running the HiveToTrinoConverterTest#testViews
I stumbled on RuntimeException
issued from deriveRelDataType(SqlNode)
.
java.lang.RuntimeException: Failed to derive the RelDataType for SqlNode: CAST(row("b"."b1") as row("b1" varchar)) AS `b` with topSqlNode: SELECT `tablef`.`a`, `generic_project`(`tablef`.`b`, 'b', 'struct<b1:string>') AS `b`
FROM `test`.`tablef` AS `tablef`
at com.linkedin.coral.common.utils.TypeDerivationUtil.getRelDataType(TypeDerivationUtil.java:87)
at com.linkedin.coral.common.transformers.SqlCallTransformer.deriveRelDatatype(SqlCallTransformer.java:68)
Is this expected?
I currently workaround this limitation by just working with column types for which the type can actually be derived.
Also side note - Why throwing the very generic RuntimeException
in TypeDerivationUtil
instead of specialized exceptions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@findinpath were you able to isolate the test case which is leading to the failure?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When not catching the exception
try {
selectNodeDataType = Optional.of(deriveRelDatatype(sqlNode));
} catch (RuntimeException e) {
// The type derivation may fail for complex expressions
selectNodeDataType = Optional.empty();
}
, the following tests will be failing:
fuzzy_union_view_double_branch_evolved_different
fuzzy_union_view_more_than_two_branches_evolved
fuzzy_union_view_map_with_struct_value_evolved
fuzzy_union_view_array_with_struct_value_evolved
fuzzy_union_view_deeply_nested_struct_evolved
fuzzy_union_view_deeply_nested_complex_struct_evolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ljfgem Can we fix the generic type derivation method and also throw the underlying error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, will take a look.
...o/src/main/java/com/linkedin/coral/trino/rel2trino/transformers/UnionSqlCallTransformer.java
Outdated
Show resolved
Hide resolved
...o/src/main/java/com/linkedin/coral/trino/rel2trino/transformers/UnionSqlCallTransformer.java
Outdated
Show resolved
Hide resolved
} | ||
|
||
List<SqlNode> updatedOperands = new ArrayList<>(sqlCall.getOperandList().size()); | ||
for (SqlNode operand : operandsList) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason we iterate on the operandsList twice? Can we merge this loop and the above? I see below we cast to SqlSelect in the first loop we check if it is SqlSelect. In both we iterate on the selectNodes, etc. I think there is room for consolidation? PS. In any case let us reuse the naming conventions suggested above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to find the least restrictive type for each select item index from the operands - this is one looping.
Once we have the least restrictive type, we need another looping to verify whether the operands need to be updated.
I didn't find a better alternative to do this.
...o/src/main/java/com/linkedin/coral/trino/rel2trino/transformers/UnionSqlCallTransformer.java
Outdated
Show resolved
Hide resolved
...o/src/main/java/com/linkedin/coral/trino/rel2trino/transformers/UnionSqlCallTransformer.java
Outdated
Show resolved
Hide resolved
...o/src/main/java/com/linkedin/coral/trino/rel2trino/transformers/UnionSqlCallTransformer.java
Outdated
Show resolved
Hide resolved
...o/src/main/java/com/linkedin/coral/trino/rel2trino/transformers/UnionSqlCallTransformer.java
Outdated
Show resolved
Hide resolved
@KevinGe00 Can we run integration tests on this PR? Trino tests are enough. |
be01343
to
a2bf267
Compare
…set operation In case of dealing with Hive views which make use of set operation (e.g. UNION) ensure that the `char` fields from the inner SELECT statements have the same type as the output field types of the set operation. Due to wrong coercion between `varchar` and `char` in Trino, as described in trinodb/trino#9031 , a work-around needs to be applied in case of translating Hive views which contain a UNION dealing with char and varchar types. The work-around consists in the explicit cast of the field having char type towards varchar type corresponding of the set operation output type.
a2bf267
to
61aa89d
Compare
@wmoustafa addressed comments and rebased on |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Confirmed with @KevinGe00 that integration tests passed. Thanks @findinpath for working on this. Appreciate the spin-off from #438.
… in the set operation (linkedin#442)" This reverts commit 4bb38a1.
… in the set operation (linkedin#442)" This reverts commit 4bb38a1. # Conflicts: # coral-trino/src/main/java/com/linkedin/coral/trino/rel2trino/DataTypeDerivedSqlCallConverter.java
What changes are proposed in this pull request, and why are they necessary?
In case of dealing with Hive views which make use of set operation (e.g. UNION) ensure that the
char
fields from the inner SELECT statements have the same type as the output field types of the set operation.Due to wrong coercion between
varchar
andchar
in Trino, as described in trinodb/trino#9031, a work-around needs to be applied in case of translating Hive views which contain a UNION dealing with char and varchar types. The work-around consists in the explicit cast of the field having char type towards varchar type corresponding of the set operation output type.
This PR is a spin-off from #438 using a slighly different functionality to achieve a similar functionality.
From a technical design point of view, the present PR extends
SqlCallTransformer
to modify on the fly the select branch of the set operation which needs explicit casting for its char columns (which need to be changed to varchar type due to trinodb/trino#9031 Trino limitation.How was this patch tested?
The patch has been tested with automatic unit tests.