More SQL-like Substring #1635
This breaks compatibility with the 4.1.x substring function.
Well, good that it's not a release issue. Bad that our SUBSTRING is different to everyone else's :-/
LGTM.
public String substring(final String value, final int startIndex) {
  return value.substring(startIndex);
@Udf(description = "Returns a substring of str that starts at pos "
    + "and continues to the end of the string")
Let's also document that pos starts from 1, not 0.
    + " extends to the character at endIndex -1.")
public String substring(final String value, final int startIndex, final int endIndex) {
  return value.substring(startIndex, endIndex);
@Udf(description = "Returns a substring of str that starts at pos and is of length len")
I was expecting to see some shiny new @UdfParameter decoration here to help the user know which arg is the length and which one is the startPos? :-)
Different branch. That's only available in master. This is targeted at 5.0. Though now that it's not release related, we can change this to master.
Now on master branch - so annotations added!
# Conflicts: # ksql-engine/src/main/java/io/confluent/ksql/function/udf/string/Substring.java
…s can control via the configuration `ksql.functions.substring.legacy.args`.
@@ -187,6 +200,12 @@ private static ConfigDef configDef(final boolean current) {
    "",
    ConfigDef.Importance.LOW,
    KSQL_OUTPUT_TOPIC_NAME_PREFIX_DOCS
).define(
You can add this to COMPATIBILITY_BREAKING_CONFIG_DEFS on line 119 instead and set the write default to True and the read default to False. That way new queries can use the new SUBSTR behavior, and any existing queries will use the old behaviour.
Oh, cool. Not noticed that before.
I've added it to COMPATIBILITY_BREAKING_CONFIG_DEFS and removed it from the main def list.
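For readers unfamiliar with the idea, here is a minimal sketch of a setting that carries one default for existing (already persisted) queries and another for new ones. `CompatConfigSketch` and its methods are hypothetical stand-ins written for illustration, not the KSQL classes touched by this PR. The sketch assumes the flag name `ksql.functions.substring.legacy.args` from this PR, so the default for existing queries is `true` (legacy arguments) and the default for new queries is `false`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a compatibility-breaking config definition.
// Class, field, and method names are illustrative and are not KSQL's API.
final class CompatConfigSketch {
  static final String SUBSTRING_LEGACY_ARGS = "ksql.functions.substring.legacy.args";

  private final String name;
  private final boolean defaultForExistingQueries;
  private final boolean defaultForNewQueries;

  CompatConfigSketch(
      final String name,
      final boolean defaultForExistingQueries,
      final boolean defaultForNewQueries) {
    this.name = name;
    this.defaultForExistingQueries = defaultForExistingQueries;
    this.defaultForNewQueries = defaultForNewQueries;
  }

  // An explicitly supplied value always wins; otherwise the default
  // depends on whether the query already existed before the change.
  boolean resolve(final Map<String, ?> props, final boolean existingQuery) {
    final Object explicit = props.get(name);
    if (explicit != null) {
      return Boolean.parseBoolean(explicit.toString());
    }
    return existingQuery ? defaultForExistingQueries : defaultForNewQueries;
  }

  public static void main(final String[] args) {
    final CompatConfigSketch def =
        new CompatConfigSketch(SUBSTRING_LEGACY_ARGS, true, false);
    final Map<String, Object> props = new HashMap<>();
    System.out.println(def.resolve(props, true));   // existing query, no override
    System.out.println(def.resolve(props, false));  // new query, no override
    props.put(SUBSTRING_LEGACY_ARGS, "true");
    System.out.println(def.resolve(props, false));  // new query, explicit override
  }
}
```

The point of the two defaults is that upgrading does not silently change the results of queries that were written against the old argument order.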
@Override
public void configure(final Map<String, ?> props) {
  final boolean legacyArgs =
      getProps(props, KSQ_FUNCTIONS_PROPERTY_PREFIX + "substring.legacy.args", false);
Why not use KSQL_FUNCTIONS_SUBSRTRING_LEGACY_ARGS_CONFIG here?
Fixed.
}

private static int getStartIndex(final String value, final Integer pos) {
  return pos < 0
If pos is 0 the substr call will throw an IndexOutOfBoundsException. Is that what we want?
Good spot! Test case added and code fixed.
@rodesai thanks for the review! I've pushed some changes to address your comments.
LGTM!
…y, so as not to be a breaking change. Original code change is in confluentinc#1635.
Description
Switch SUBSTRING from SUBSTRING(str, startIndex, endIndex) with zero-based indexing to SUBSTRING(str, pos, len) with one-based indexing, as used by other SQL providers.
Users can set ksql.functions.substring.legacy.args to true, either through server properties or session properties, to switch the implementation back to legacy mode, i.e. SUBSTRING(str, startIndex, endIndex) and error handling that throws a lot of exceptions.
Fixes #1634
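To make the difference between the two argument conventions concrete, here is a rough sketch; it is not the code from this PR. `SubstringSketch` and its method names are hypothetical. The null handling, the clamping of out-of-range values, the treatment of `pos == 0` like `pos == 1`, and the reading of a negative `pos` as counting back from the end of the string are all assumptions made for illustration.

```java
// Illustrative sketch only; not the implementation in this PR.
final class SubstringSketch {

  // Legacy form: zero-based startIndex (inclusive) and endIndex (exclusive).
  // Delegates straight to String.substring, which throws on bad indexes.
  static String legacy(final String value, final int startIndex, final int endIndex) {
    return value.substring(startIndex, endIndex);
  }

  // SQL-like form: one-based pos and a length.
  // Assumptions: null in gives null out, and out-of-range values are
  // clamped to the string bounds rather than throwing.
  static String sqlLike(final String value, final Integer pos, final Integer len) {
    if (value == null || pos == null || len == null) {
      return null;
    }
    final int start = startIndex(value, pos);
    final int end = (int) Math.min((long) start + Math.max(len, 0), value.length());
    return value.substring(start, end);
  }

  // Converts a one-based (or negative, from-the-end) pos to a zero-based index.
  static int startIndex(final String value, final int pos) {
    if (pos < 0) {
      return Math.max(value.length() + pos, 0);
    }
    // Assumption: pos == 0 is treated the same as pos == 1.
    return Math.min(Math.max(pos - 1, 0), value.length());
  }

  public static void main(final String[] args) {
    System.out.println(legacy("stream", 1, 4));    // zero-based, end-exclusive
    System.out.println(sqlLike("stream", 2, 3));   // one-based pos, length 3
    System.out.println(sqlLike("stream", -3, 2));  // negative pos counts from the end
  }
}
```

In this sketch `legacy("stream", 1, 4)` and `sqlLike("stream", 2, 3)` select the same characters, which shows how an existing query would need its arguments rewritten when moving off legacy mode.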
Because I've extended the description of SubString, I've also improved the formatting of the output of DESCRIBE FUNCTION x; in the CLI, so that long descriptions are correctly split across lines and properly indented.
Related release notes added in this PR: https://github.com/confluentinc/docs/pull/1095
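As a rough illustration of the kind of wrapping described for long function descriptions, the sketch below splits a description on whitespace and indents continuation lines under the text that follows a label. `DescriptionWrapSketch`, the label text, and the width are hypothetical; this is not the CLI code from this PR.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative word-wrap helper; not the CLI formatting code in this PR.
final class DescriptionWrapSketch {

  // Wraps description so that no line exceeds width characters where possible.
  // The first line starts with label; later lines are indented to line up
  // with the text after the label. A single over-long word is not split.
  static String wrap(final String label, final String description, final int width) {
    final StringBuilder indent = new StringBuilder();
    for (int i = 0; i < label.length(); i++) {
      indent.append(' ');
    }
    final int room = Math.max(width - label.length(), 1);

    final List<String> lines = new ArrayList<>();
    StringBuilder current = new StringBuilder();
    for (final String word : description.trim().split("\\s+")) {
      // Start a new line if adding this word would overflow the room.
      if (current.length() > 0 && current.length() + 1 + word.length() > room) {
        lines.add(current.toString());
        current = new StringBuilder();
      }
      if (current.length() > 0) {
        current.append(' ');
      }
      current.append(word);
    }
    lines.add(current.toString());

    return label + String.join("\n" + indent, lines);
  }

  public static void main(final String[] args) {
    System.out.println(wrap(
        "Description : ",
        "Returns a substring of str that starts at pos and is of length len",
        40));
  }
}
```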
Testing done
Unit tests and a JSON-based functional test added.