Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add WindowStart and WindowEnd UDFs (#1993) #2090

Conversation

big-andy-coates
Copy link
Contributor

@big-andy-coates big-andy-coates commented Oct 24, 2018

Description

  • add WindowStartTime UDAF

(cherry picked from commit 82b603d)

Original PR: #1993

* add WindowStartTime UDAF

(cherry picked from commit 82b603d)
@big-andy-coates big-andy-coates requested review from JimGalasyn and a team as code owners October 24, 2018 19:53
@@ -272,6 +272,21 @@ counting/aggregation step per region.
WINDOW SESSION (60 SECONDS) \
GROUP BY regionid;

Sometimes you may want to include the bounds of the current window in the result so that it is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Sometimes you may want to include the bounds of the current window in the result so that it is
Sometimes, you may want to include the bounds of the current window in the result, so that it is

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

@@ -272,6 +272,21 @@ counting/aggregation step per region.
WINDOW SESSION (60 SECONDS) \
GROUP BY regionid;

Sometimes you may want to include the bounds of the current window in the result so that it is
more easily accessible to consumers of the data. The statement below extracts the start and
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
more easily accessible to consumers of the data. The statement below extracts the start and
more easily accessible to consumers of the data. The following statement extracts the start and

Copy link
Member

@JimGalasyn JimGalasyn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, with a couple of suggestions.

+------------------------+---------------------------+------------+---------------------------------------------------------------------+
| WindowEnd | ``WindowEnd()`` | Stream | Extract the end time of the current window, in milliseconds. |
| | | Table | If the query is not windowed the function will return null. |
+------------------------+---------------------------+------------+---------------------------------------------------------------------+
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some concern about presenting this as an Aggregate Fn / UDAF, at least to the end-user. It's a little bit of a grey-zone and i understand why, mechanically, it's easier to implement this way but semantically it's not a "reducing function" that operates to combine the values from multiple grouped values. Rather, it's a function which operates on the group as a whole, once it's been reduced to a single row. In my mind that makes this smell different than a UDAF.

I guess it's also not really a scalar fn either - although it's closer to being that than an aggregator - so possibly a third category of "Other Functions" is the best way to handle it. WDYT ?
Also, i realise i should probably have made this comment on the original PR but only just noticed it....

Copy link
Contributor Author

@big-andy-coates big-andy-coates Oct 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know what you mean, though don't have a strong feeling on this in terms of the documentation. So happy to hear from others.

I'm also looking to extend this functionality to more than just aggregates, so will look to update the docs more then if we've not come to any decision here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see two places where a user would be looking for this information:

  1. In a section on Windowing (aka windowed aggregations), because "I am working with windows, so show me in one place all info I need on how to deal with windows".
  2. In a section on built-in "Functions" (KDF, KDAF, etc.), because "I remember it's a function call to get this info, so I'll check the place in the docs where all functions are listed".

We don't have a dedicated docs section yet that covers Windowing (cf. (1) above), but @JimGalasyn is working on that.

@big-andy-coates big-andy-coates merged commit 839b049 into confluentinc:5.1.x Oct 24, 2018
@big-andy-coates big-andy-coates deleted the cherry_pick_window_ufds_5.1 branch October 24, 2018 21:51
big-andy-coates added a commit to big-andy-coates/ksql that referenced this pull request Oct 24, 2018
…2090)

* add WindowStartTime UDAF

(cherry picked from commit 839b049)
big-andy-coates added a commit that referenced this pull request Oct 24, 2018
* add WindowStartTime UDAF

(cherry picked from commit 839b049)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants