Properly delimit identifiers in DataType expression created from Type #8845
@@ -300,6 +301,19 @@ static DataType toDataType(TypeSignature typeSignature)
 }
 }

+private static boolean requiresDelimiting(String identifier)
This is ok for now, but technically, every name should be delimited. Once #17 is fixed, any names coming from a connector (including column types and field names in row types) will be considered as "already canonicalized", so they'll need to be surrounded in quotes when rendering as SQL to ensure proper matching semantics.
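To make the two options concrete, here is a minimal sketch of "delimit only when required" next to "always delimit". It is not the code in this PR: the class name, the `PLAIN` pattern, and the helper methods are hypothetical, and the check ignores reserved words. It only assumes the standard SQL rule that a delimited identifier is wrapped in double quotes, with embedded double quotes doubled.

```java
import java.util.regex.Pattern;

class IdentifierQuoting
{
    // Assumption: a "plain" name is a letter or underscore followed by letters, digits, or underscores.
    private static final Pattern PLAIN = Pattern.compile("[a-zA-Z_][a-zA-Z0-9_]*");

    // Hypothetical stand-in for the check added in this diff; reserved words are not handled.
    public static boolean requiresDelimiting(String identifier)
    {
        return !PLAIN.matcher(identifier).matches();
    }

    // Always delimit: wrap in double quotes and double any embedded double quote.
    public static String delimit(String identifier)
    {
        return "\"" + identifier.replace("\"", "\"\"") + "\"";
    }

    // Delimit only when the name cannot be written as a plain identifier.
    public static String renderIfRequired(String identifier)
    {
        return requiresDelimiting(identifier) ? delimit(identifier) : identifier;
    }

    public static void main(String[] args)
    {
        System.out.println(renderIfRequired("a"));    // plain name, left as-is
        System.out.println(renderIfRequired("a b"));  // contains a space, gets quoted
        System.out.println(delimit("a"));             // the "always delimit" rendering
    }
}
```

With names treated as already canonicalized, only the `delimit` path preserves the exact spelling when the rendered SQL is parsed again, which is the case the comment above describes.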
@@ -33,7 +32,7 @@ private void assertRoundTrip(String expression)
 {
 assertThat(type(expression))
 .ignoringLocation()
-.withComparatorForType(Comparator.comparing(identifier -> identifier.getValue().toLowerCase(Locale.ENGLISH)), Identifier.class)
+.withComparatorForType(Comparator.comparing(Identifier::getCanonicalValue), Identifier.class)
This changes semantics. Is it intentional? The canonical value per SQL standard rules uses upper case. For historical reasons, and until #17 is fixed, we continue to use lower case in many places.
Does it matter here? I assumed that for the sake of the test assertion, either normalization (upper or lower) is fine. I changed to getCanonicalValue mostly to get different matching logic for delimited (no uppercasing) and non-delimited (uppercasing) identifiers. Without the change, the test passes for field names that require delimiting even without the changes to TypeSignatureTranslator. I agree this is kind of a hack though. I can take that back and assume that regression testing is done solely via testImplicitCastToRowWithFieldsRequiringDelimitation().
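A small sketch of why the two comparators behave differently. The `Ident` class below is a hypothetical, simplified stand-in for the parser's `Identifier`, and its `canonicalValue()` assumes the SQL-standard rule discussed above (delimited names kept verbatim, non-delimited names upper-cased). Under that assumption, a lower-casing comparator cannot distinguish a delimited name from a non-delimited one with the same text, while a canonical-value comparator can.

```java
import java.util.Comparator;
import java.util.Locale;

class CanonicalComparison
{
    // Hypothetical, simplified stand-in for the parser's Identifier class.
    static final class Ident
    {
        final String value;
        final boolean delimited;

        Ident(String value, boolean delimited)
        {
            this.value = value;
            this.delimited = delimited;
        }

        // Assumption: delimited names are kept verbatim, others are upper-cased.
        String canonicalValue()
        {
            return delimited ? value : value.toUpperCase(Locale.ENGLISH);
        }
    }

    // Mirrors the original test comparator: ignores whether the name was delimited.
    static final Comparator<Ident> BY_LOWER_CASE =
            Comparator.comparing(identifier -> identifier.value.toLowerCase(Locale.ENGLISH));

    // Mirrors the proposed comparator: sensitive to delimiting through canonicalization.
    static final Comparator<Ident> BY_CANONICAL =
            Comparator.comparing(Ident::canonicalValue);

    public static void main(String[] args)
    {
        Ident quoted = new Ident("a b", true);     // as parsed from "a b"
        Ident unquoted = new Ident("a b", false);  // same text, delimiting lost in translation

        System.out.println(BY_LOWER_CASE.compare(quoted, unquoted) == 0);  // true: difference is masked
        System.out.println(BY_CANONICAL.compare(quoted, unquoted) == 0);   // false: difference is detected
    }
}
```

The trade-off raised in the review still applies: the canonical comparison upper-cases non-delimited names, whereas the existing code base lower-cases them in many places.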
Reverted
AC @martint. Let me know if this is good to go.
CI: #8691