feat: add options to substring for start parameter being negative #508
ACTION NEEDED: Substrait follows the Conventional Commits specification. The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.
Do we want to add an option "ERROR"? Are there any implementations that simply reject negative indices?
extensions/functions_string.yaml
Outdated
The `negative_start` option applies to the `start` parameter. `WRAP_FROM_END` means
the returned substring will start from the end of the `input`. The last character
has an index of -1. `LEFT_OF_BEGINNING` means the returned substring will start from
This description of `WRAP_FROM_END` seems a little confusing.
"the returned substring will start from the end of `input`."
The returned substring won't start at the end, right? For example, substr("hello", -3, 1) returns "l", which doesn't start at the end of "hello".
"The last character has an index of -1"
This is correct, but it is not clear to me that we count backwards; in other words, it is not clear that -2 is the second-from-last character.
Updated the description to make things more clear
That makes sense, just added that option.
extensions/functions_string.yaml
Outdated
@@ -51,6 +51,13 @@ scalar_functions:
description: >-
Extract a substring of a specified `length` starting from position `start`.
A `start` value of 1 refers to the first characters of the string.

The `negative_start` option applies to the `start` parameter. `WRAP_FROM_END` means
the returned substring will start from the end of the `input` and move backwards.
Maybe "means the index will start from the end..."? It still kind of sounds like the string itself is moving backwards.
Thanks for the catch. Updated!
LGTM. However, I do not believe this is a breaking change. There is no such thing as required options in Substrait (there are required enum arguments but this is not specified that way, and I don't believe it should be).
I believe that adding new optional arguments is a non-breaking change. If the argument is not specified then consumers are free to continue doing whatever they are doing today (e.g. any behavior is valid).
@richtia can you rebase this? Do you agree it is a non-breaking change?
26068b8 to cbaad26
Done! Yeah, that makes sense that it shouldn't be a breaking change.
cbaad26 to 7663708
The substring function's `start` parameter behaves differently across backends when its value is negative. SQLite wraps around from the end of the input string, whereas Postgres starts from a position to the left of the beginning of the input string.
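
To make the difference concrete, here is a minimal Python sketch of how a consumer might interpret the `negative_start` option. The `substring` helper and its window-clipping logic are assumptions written for illustration, not code from Substrait or from either backend; the option names `WRAP_FROM_END`, `LEFT_OF_BEGINNING`, and `ERROR` are taken from the discussion above.

```python
def substring(text: str, start: int, length: int,
              negative_start: str = "WRAP_FROM_END") -> str:
    """Hypothetical helper: 1-based `start`, as in the function description."""
    if length < 0:
        raise ValueError("length must be non-negative")

    if start < 0:
        if negative_start == "ERROR":
            # Reject negative indices outright.
            raise ValueError("negative start is not allowed")
        if negative_start == "WRAP_FROM_END":
            # Count backwards from the end: -1 is the last character.
            begin = len(text) + start
        elif negative_start == "LEFT_OF_BEGINNING":
            # Treat start as a position left of the first character,
            # so part (or all) of the window falls outside the string.
            begin = start - 1
        else:
            raise ValueError(f"unknown option: {negative_start}")
    else:
        # Convert the 1-based position to a 0-based offset.
        begin = start - 1

    end = begin + length
    # Clip the window [begin, end) to the bounds of the string.
    return text[max(begin, 0):max(end, 0)]


print(repr(substring("hello", -3, 1, "WRAP_FROM_END")))      # 'l'
print(repr(substring("hello", -3, 1, "LEFT_OF_BEGINNING")))  # ''
print(repr(substring("hello", -1, 4, "LEFT_OF_BEGINNING")))  # 'he'
```

With `WRAP_FROM_END`, the reviewer's example substr("hello", -3, 1) yields "l". With `LEFT_OF_BEGINNING`, the same call yields an empty string because the one-character window lies entirely before the first character.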