Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KSQL doesn't prevent/warn column aliases and object names that are function names #1765

Closed
rmoff opened this issue Aug 21, 2018 · 2 comments
Closed

Comments

@rmoff
Copy link
Contributor

rmoff commented Aug 21, 2018

If you use a function name (built in or UDF) as an CSAS/CTAS object name, and/or column alias, KSQL doesn't stop or warn you.

Whilst this is possible, it's eminently silly and we should save users from doing stupid things by mistake:

ksql> CREATE TABLE COUNT AS SELECT EVENTID, COUNT(*) AS COUNT FROM CARSENSOR GROUP BY EVENTID;

 Message
---------------------------
 Table created and running
---------------------------
ksql> DESCRIBE COUNT;

Name                 : COUNT
 Field   | Type
-------------------------------------
 ROWTIME | BIGINT           (system)
 ROWKEY  | VARCHAR(STRING)  (system)
 EVENTID | INTEGER
 COUNT   | BIGINT
-------------------------------------
For runtime statistics and query details run: DESCRIBE EXTENDED <Stream,Table>;
ksql>
ksql> SELECT COUNT FROM COUNT;
173
^CQuery terminated
ksql>

It's also inconsistent with how other reserved words are handled, where an error (albeit an unhelpful one) is thrown:

ksql> CREATE TABLE COUNT AS SELECT EVENTID, COUNT(*) AS CREATE FROM CARSENSOR GROUP BY EVENTID;
line 1:51: mismatched input 'CREATE' expecting {'ADD', 'APPROXIMATE', 'AT', 'CONFIDENCE', 'NO', 'SUBSTRING', 'POSITION', 'TINYINT', 'SMALLINT', 'INTEGER', 'DATE', 'TIME', 'TIMESTAMP', 'INTERVAL', 'YEAR', 'MONTH', 'DAY', 'HOUR', 'MINUTE', 'SECOND', 'ZONE', 'OVER', 'PARTITION', 'RANGE', 'ROWS', 'PRECEDING', 'FOLLOWING', 'CURRENT', 'ROW', 'STRUCT', 'VIEW', 'REPLACE', 'GRANT', 'REVOKE', 'PRIVILEGES', 'PUBLIC', 'OPTION', 'EXPLAIN', 'ANALYZE', 'FORMAT', 'TYPE', 'TEXT', 'GRAPHVIZ', 'LOGICAL', 'DISTRIBUTED', 'TRY', 'SHOW', 'TABLES', 'SCHEMAS', 'CATALOGS', 'COLUMNS', 'COLUMN', 'USE', 'PARTITIONS', 'FUNCTIONS', 'FUNCTION', 'TO', 'SYSTEM', 'BERNOULLI', 'POISSONIZED', 'TABLESAMPLE', 'RESCALED', 'ARRAY', 'MAP', 'SET', 'RESET', 'SESSION', 'DATA', 'START', 'TRANSACTION', 'COMMIT', 'ROLLBACK', 'WORK', 'ISOLATION', 'LEVEL', 'SERIALIZABLE', 'REPEATABLE', 'COMMITTED', 'UNCOMMITTED', 'READ', 'WRITE', 'ONLY', 'CALL', 'NFD', 'NFC', 'NFKD', 'NFKC', 'IF', 'NULLIF', 'COALESCE', IDENTIFIER, DIGIT_IDENTIFIER, QUOTED_IDENTIFIER, BACKQUOTED_IDENTIFIER}
Caused by: org.antlr.v4.runtime.InputMismatchException
ksql> CREATE TABLE CREATE AS SELECT EVENTID, COUNT(*) AS COUNT FROM CARSENSOR GROUP BY EVENTID;
line 1:14: no viable alternative at input 'CREATE TABLE CREATE'
Caused by: org.antlr.v4.runtime.NoViableAltException
ksql>
@agavra
Copy link
Contributor

agavra commented Oct 28, 2019

I'm not sure I understand why this is a bug @rmoff - I feel like I should be allowed to specify schemas that happen to share the name of a function (e.g. if my data has count as a field, why can't my schema without backquoting it? that seems artificially restricting our parser).

We can differentiate between function calls and identifiers because function calls have parens. This is similar to the concept of a "non-reserved" word in SQL standard (https://www.postgresql.org/docs/current/sql-keywords-appendix.html)

@agavra agavra self-assigned this Oct 28, 2019
@rmoff
Copy link
Contributor Author

rmoff commented Nov 4, 2019

OK, I'm convinced :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants