Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: expression support for insert values #3612

Merged
merged 4 commits into from
Oct 19, 2019

Conversation

agavra
Copy link
Contributor

@agavra agavra commented Oct 17, 2019

Fixes #3610
Partially Fixes #3591

Description

Add support for arbitrary expressions in INSERT VALUES, which will allow:

  • negative numbers
  • arrays/maps
  • UDFs
  • All the other fun things you can imagine!

Testing done

  • RQTT tests
  • Unit testing
  • local testing
ksql> CREATE STREAM complex (a ARRAY<INTEGER>, m MAP<STRING, INTEGER>) WITH (value_format='JSON', partitions=1, KAFKA_TOPIC='complex');

 Message
----------------
 Stream created
----------------
ksql> INSERT INTO complex (a, m) VALUES (AS_ARRAY(1, -1), AS_MAP(AS_ARRAY('key'), AS_ARRAY(1)));
ksql> SELECT * FROM complex;
Table 'COMPLEX' is not materialized. KSQL currently only supports static queries on materialized aggregate tables. i.e. those created by a 'CREATE TABLE AS SELECT <fields>, <aggregate_functions> FROM <sources> GROUP BY <key>' style statement.
ksql> SELECT * FROM complex EMIT CHANGES;
+-----------------------------------------------------+-----------------------------------------------------+-----------------------------------------------------+-----------------------------------------------------+
|ROWTIME                                              |ROWKEY                                               |A                                                    |M                                                    |
+-----------------------------------------------------+-----------------------------------------------------+-----------------------------------------------------+-----------------------------------------------------+
|1571330771859                                        |null                                                 |[1, -1]                                              |{key=1}                                              |

Reviewer checklist

  • Ensure docs are updated if necessary. (eg. if a user visible feature is being added or changed).
  • Ensure relevant issues are linked (description should include text like "Fixes #")

@agavra agavra requested a review from a team as a code owner October 17, 2019 16:50
Copy link
Contributor

@purplefox purplefox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! A couple of comments.

@@ -15,13 +15,19 @@

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This class only seems only to be used by the InsertValuesExecutor, i.e. its not some kind of general type coercer (?) so perhaps it should be renamed and live near the InsertValuesExecutor to make it's purpose clearer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I'm looking at the history, I remember that it was originally like that and then @big-andy-coates refactored it to extract out a coerce that could be used general purpose in #2953. I'm usually a fan of making things general purpose only if they are needed to be, but I'll let Andy chime in before making that change.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the moment we have code that coerces values all over the shot, (e.g. the UDF coercion code that Tim recently removed, but also others in the serde code and some more somewhere that coerces GenericRow), this class is only currently used in one place, but the intent is that we slowly move the other areas to use one common class to handle coercion of our SQL types, so that we get consistent behaviour.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That said, this class currently does two things:

  1. It handles our implicit conversion from STRING to DECIMAL - which I really don't like. I hate such magic.
  2. It handles our implicit casting rules.

Personally, I'd like to remove the first thing from our system entirely. We have a major release coming up imminently, which gives us the opportunity to make a breaking change and handle decimals as first class citizens. Personally, I think the literal 12.453835 in a statement should be a decimal, (as per our SqlBase rules), but its currently a double. We can choose to change this in the release after 5.4.

Food for thought.

@@ -62,6 +77,43 @@
return optional(result);
}

private static <T> Optional<T> coerceArray(final Object value, final SqlArray targetType) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method seems to expect a List as value, so shouldn't we type value as List instead of Object for better type safety? Also return type seems to be List too.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also type param doesn't seem to be doing anything useful unless I am missing something.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll type it as a list and remove the type parameter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am actually keeping the signature without specifying List anywhere. If I add List to the parameter, then I need to cast it and check mismatch in doCoerce, which will cause the number of ifs to grow beyond what checkstyle likes. If I have it in the return, then I'll just need to cast it again when returning it and since it's a private helper method we don't gain anything by having the extra type safety.

return doCoerce(value, targetType);
}

private static <T> Optional<T> doCoerce(final Object value, final SqlType targetType) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the type parameter really add anything here? It's not used in the params or in the body of the method.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd usually consider type params that are only used in the return value (not the params) as bad practice. They basically just hide the cast that you would otherwise have to do, but the cast is not safe and you will get a ClassCastException if you use it in a context where the actual type is different from the expected type. Much better IMHO to make the cast explicit by the user, so the compiler can flag an unchecked cast warning so you are aware. There is a discussion on this here https://stackoverflow.com/questions/36477444/java-object-return-type-vs-generic-methods

Copy link
Contributor Author

@agavra agavra Oct 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you have me convinced :) I'll change that to use the explicit cast. (Interestingly enough, all of the usage just returns Object anyway, so there's no explicit casting necessary...)

return optional(coerced.build());
}

private static <T> Optional<T> coerceMap(final Object value, final SqlMap targetType) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above but for Map

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above - I think it makes more sense not to type it as a Map, but I removed the type parameter.

Copy link
Contributor

@big-andy-coates big-andy-coates left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @agavra

This is looking good. Few points made below. Plus a general comment on one of the underlying issues: #3610 (comment)

@agavra agavra merged commit 37f9763 into confluentinc:master Oct 19, 2019
@agavra agavra deleted the insert_exp branch October 19, 2019 17:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

INSERT VALUES fails on negative numbers Support Array constructor
3 participants