-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added stringtodate and datetostring UDFs #1890
Added stringtodate and datetostring UDFs #1890
Conversation
@confluentinc It looks like @andybryant just signed our Contributor License Agreement. 👍 Always at your service, clabot |
6fa7979
to
c07fdb5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
c07fdb5
to
35c3507
Compare
adb5cc0
to
780fae4
Compare
@@ -375,7 +375,7 @@ public void shouldListFunctions() { | |||
|
|||
// not going to check every function | |||
assertThat(functionList.getFunctions(), hasItems( | |||
new SimpleFunctionInfo("TIMESTAMPTOSTRING", FunctionType.scalar), | |||
new SimpleFunctionInfo("EXTRACTJSONFIELD", FunctionType.scalar), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Functions that use the new Udf annotation don't show up in this list. Changing the test to use another Kudf based function.
780fae4
to
1c4291a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @andybryant, this is really cool that you've taken the time to raise this.
Sorry for the delay in getting it reviewed.
I've left a few comments and suggestions. Take a look and see what you think.
Thanks,
Andy
@@ -1071,6 +1071,14 @@ Scalar functions | |||
+------------------------+------------------------------------------------------------+---------------------------------------------------+ | |||
| CONCAT | ``CONCAT(col1, '_hello')`` | Concatenate two strings. | | |||
+------------------------+------------------------------------------------------------+---------------------------------------------------+ | |||
| DATETOSTRING | ``DATETOSTRING(START_DATE, 'yyyy-MM-dd')`` | Converts an INT date value into | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Converts an integer representation of a date into..."
@@ -1140,6 +1148,12 @@ Scalar functions | |||
+------------------------+------------------------------------------------------------+---------------------------------------------------+ | |||
| ROUND | ``ROUND(col1)`` | Round a value to the nearest BIGINT value. | | |||
+------------------------+------------------------------------------------------------+---------------------------------------------------+ | |||
| STRINGTODATE | ``STRINGTODATE(col1, 'yyyy-MM-dd')`` | Converts a string value in the given | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Converts a string representation of a date in..."
" into an integer representing the days since epoch."
import io.confluent.ksql.util.timestamp.StringToTimestampParser; | ||
|
||
public class StringToTimestamp implements Kudf { | ||
@UdfDescription(name = "stringtotimestamp", author = "Confluent", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Converts a string representation of a date ..."
(You're wording on the new UDFs is much better than the wording on the old ones!)
if (timestampParser == null) { | ||
timestampParser = new StringToTimestampParser(args[1].toString()); | ||
} | ||
@Udf(description = "Converts a string value in the given format into the BIGINT value" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Converts a string representation of a date..."
+ " days since epoch using the given format pattern." | ||
+ " The format pattern should be in the format expected by" | ||
+ " java.time.format.DateTimeFormatter") | ||
public int stringToDate(final String formattedDate, final String formatPattern) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you look to make use of @UdfParameter
on all of your UDF params please. The actual names of the fields are not currently used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Switched to using @UdfParameter
for all 4 UDFs
@Test | ||
public void shouldThrowIfFormatInvalid() { | ||
expectedException.expect(KsqlFunctionException.class); | ||
expectedException.expect(UncheckedExecutionException.class); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
exception type again.
} | ||
|
||
@Test | ||
public void shouldBeThreadSafe() { | ||
public void shouldWorkWithManyDifferentFormatters() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
combine with above again.
"CREATE STREAM TS AS select id, stringtodate(date, format) as ts from test;" | ||
], | ||
"inputs": [ | ||
{"topic": "test_topic", "key": 0, "value": "0,zero,2018-05-11,yyyy-MM-dd", "timestamp": 0}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we add a literal in one of these formats please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@@ -0,0 +1,31 @@ | |||
{ | |||
"comments": [ | |||
"You can specify multiple statements per test case, i.e., to set up the various streams needed", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
drop this cut & paste comments
section if you don't mind.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done (in both json files)
"CREATE STREAM DATE_STREAM AS select ID, datetostring(START_DATE, DATE_FORMAT) as CUSTOM_FORMATTED_START_DATE from test;" | ||
], | ||
"inputs": [ | ||
{"topic": "test_topic", "key": 1, "value": {"ID": 1, "START_DATE": 17662, "DATE_FORMAT": "yyyy-MM-dd"}, "timestamp": 0}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we add a literal in one of these formats please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the last one is literal
1c4291a
to
bea71d0
Compare
Thanks for the feedback - lots of good ideas there. I've pushed another commit that I think addresses all your comments. I kept the 3 separate commits for now to make reviewing easier, but can squash them down to one commit once your happy with it. Cheers |
da8d37e
to
d999593
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @andybryant LGTM.
Sorry for the delay in reviewing!
Just need another engineer to double check. I'll look to get this cherry picked into 5.1. release.
public String dateToString( | ||
@UdfParameter(value = "epochDays", | ||
description = "The Epoch Day to convert," | ||
+ " based on the epoch 1970-01-01") final Integer epochDays, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe use int
rather than Integer
as it'll throw an NPE on null anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @andybryant for submitting this PR! I've left one question, but otherwise LGTM
final Timestamp timestamp = new Timestamp(epochMilli); | ||
final DateTimeFormatter formatter = formatters.get(formatPattern); | ||
return timestamp.toInstant() | ||
.atZone(ZoneId.systemDefault()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want this to use the system timezone? i.e., dates are all UTC in KSQL. So this is deviating from the norm. if this is indeed the case that we want this, then can you update the description etc to make it clear what is happening?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The old StringToTimestamp
implementation used system default time zone when no time zone was provided. I can change to UTC and update the docs, but wouldn't this be considered a breaking change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've assumed we want to keep the old behaviour for now and have updated the docs accordingly.
e49e2dc
to
c5df215
Compare
Fixed up docs based on @dguy feedback and switched |
Thanks @andybryant can you please merge with the latest master. Assuming it passes we can then merge it. Thanks for your patience |
confluentinc#1851) Added stringtodate and datetostring UDFs (confluentinc#1851)
c5df215
to
0b98da1
Compare
@dguy - updated to latest master. Cheers Andy |
Thanks for the contribution @andybryant! |
Awesome - thanks @dguy |
Description
Added
stringtodate
anddatetostring
UDFs to convert to and from a formatted date string and an integer representing days since epoch. This format is used by Kafka Connect to represent Date objects with no time or time zone component.In a separate commit on the same PR I also updated
stringtotimestamp
andtimestamptostring
to use the new UDF api and to support dynamically supplied format patterns.Testing done
Added tests for all new functionality and ran locally. Generated and views updated docs locally.
Reviewer checklist