Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KSQL Functions and Arrays should use 1-based indexing #1659

Closed
blueedgenick opened this issue Jul 27, 2018 · 3 comments · Fixed by #4057
Closed

KSQL Functions and Arrays should use 1-based indexing #1659

blueedgenick opened this issue Jul 27, 2018 · 3 comments · Fixed by #4057

Comments

@blueedgenick
Copy link
Contributor

Almost all common SQL systems count character-positions within strings starting from one, not zero. (Hive is the only common exception, which kinda betrays it's origins as a tool of java programmers). KSQL should also follow this common pattern in order to ease adoption.
Additionally, those "newer" SQL-based systems which support more complex types like Arrays - e.g. Presto - will count element positions within those arrays also starting at position one, not zero.

Looking at what's currently implemented / requested / in-flight in KSQL, this seems to impact the following:

@big-andy-coates
Copy link
Contributor

big-andy-coates commented Jul 28, 2018

How about adding a single property ksql.functions.arrayget.legacy.base or something, (which defaults to false, but people wanting old functionality can change to true), and this setting should only effect array access-by-index. SUBSTRING will have a legacy mode that will switch the meaning of its args and its indexing base back to pre-5.1 style. All the pending ones just work in an SQLy way from the get go.

Personally, I prefer this to having some property to control if the index-base is one or zero. Which would require all new functions added to adhere to this too.

@big-andy-coates
Copy link
Contributor

big-andy-coates commented Jul 28, 2018

This is related to #1634, whose PR looks to fix SUBSTRING. The PR #1635 fixes SUBSTRING: arrays are now base-one indexed by default.

For consistency, the fix for this issue should also use the ksql.functions.<something> configuration pattern. Maybe something like a boolean ksql.functions.arrayget.legacy.base configuration or something like that. Then, in the future, if we move array access into a UDF called ArrayGet, it will all still work.

@almazik
Copy link

almazik commented Sep 25, 2018

I would prefer the current implementation with 0-based indexes and notes in the documentation reminding that

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants