Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add options to substring for start parameter being negative #508

Merged
merged 3 commits into from
Jul 21, 2023

Conversation

richtia
Copy link
Member

@richtia richtia commented Jun 14, 2023

The substring function start parameter has different behavior between different
backends when it's value negative. Sqlite will wrap around the end of the input string,
whereas postgres will start from the left of the input string.

@github-actions
Copy link

ACTION NEEDED

Substrait follows the Conventional Commits
specification
for
release automation.

The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.

Copy link
Member

@westonpace westonpace left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to add an option "ERROR"? Are there any implementations that simply reject negative indices?

Comment on lines 55 to 57
The `negative_start` option applies to the `start` parameter. `WRAP_FROM_END` means
the returned substring will start from the end of the `input`. The last character
has an index of -1. `LEFT_OF_BEGINNING` means the returned substring will start from
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This description of WRAP_FROM_END seems a little confusing.

the returned substring will start from the end of input.

The returned substring won't start at the end right? For example, substr("hello", -3, 1) returns "l" which doesn't start at the end of the "hello"

The last character has an index of -1

This is correct, but it is not clear to me that we count backwards, in other words, it is not clear that -2 is the second from last character.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the description to make things more clear

@richtia
Copy link
Member Author

richtia commented Jun 22, 2023

Do we want to add an option "ERROR"? Are there any implementations that simply reject negative indices?

That makes sense, just added that option.

@richtia richtia requested a review from westonpace June 22, 2023 15:50
@@ -51,6 +51,13 @@ scalar_functions:
description: >-
Extract a substring of a specified `length` starting from position `start`.
A `start` value of 1 refers to the first characters of the string.

The `negative_start` option applies to the `start` parameter. `WRAP_FROM_END` means
the returned substring will start from the end of the `input` and move backwards.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe "means the index will start from the end..."? It still kind of sounds like the string itself is moving backwards.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the catch. updated!

westonpace
westonpace previously approved these changes Jun 29, 2023
Copy link
Member

@westonpace westonpace left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. However, I do not believe this is a breaking change. There is no such thing as required options in Substrait (there are required enum arguments but this is not specified that way, and I don't believe it should be).

I believe that adding new optional arguments is a non-breaking change. If the argument is not specified then consumers are free to continue doing whatever they are doing today (e.g. any behavior is valid).

@westonpace
Copy link
Member

@richtia can you rebase this? Do you agree it is a non-breaking change?

@richtia
Copy link
Member Author

richtia commented Jul 17, 2023

@richtia can you rebase this? Do you agree it is a non-breaking change?

Done! Yeah, that makes sense that it shouldn't be a breaking change.

@westonpace westonpace merged commit 281dc0f into substrait-io:main Jul 21, 2023
10 of 12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants