fix: report window type and query status better from API #4313
This commit:

1. exposes the window type of the key of a query/source, i.e. `HOPPING`, `TUMBLING`, `SESSION` or none.
2. makes the status of a query easier to find.
3. fixes a bug that meant the statement text of a query was not displayed in the CLI.

BREAKING CHANGE: The response from the RESTful API has changed for some commands with this commit: the `SourceDescription` type no longer has a `format` field. Instead it has `keyFormat` and `valueFormat` fields.

## `SHOW QUERY` changes:

Response now includes a `state` property for each query that indicates the state of the query.

e.g.

```json
{
  "queryString" : "create table OUTPUT as select * from INPUT;",
  "sinks" : [ "OUTPUT" ],
  "id" : "CSAS_OUTPUT_0",
  "state" : "Running"
}
```

The CLI output was:

```
ksql> show queries;

 Query ID | Kafka Topic | Query String
 CSAS_OUTPUT_0 | OUTPUT | CREATE STREAM OUTPUT WITH (KAFKA_TOPIC='OUTPUT', PARTITIONS=1, REPLICAS=1) AS SELECT *
FROM INPUT INPUT
EMIT CHANGES;
 CTAS_CLICK_USER_SESSIONS_5 | CLICK_USER_SESSIONS | CREATE TABLE CLICK_USER_SESSIONS WITH (KAFKA_TOPIC='CLICK_USER_SESSIONS', PARTITIONS=1, REPLICAS=1) AS SELECT CLICKSTREAM.USERID USERID, COUNT(*) COUNT
FROM CLICKSTREAM CLICKSTREAM
WINDOW SESSION ( 300 SECONDS ) GROUP BY CLICKSTREAM.USERID
EMIT CHANGES;

For detailed information on a Query run: EXPLAIN <Query ID>;
```

and is now:

```
 Query ID | Status | Kafka Topic | Query String
 CSAS_OUTPUT_0 | RUNNING | OUTPUT | CREATE STREAM OUTPUT WITH (KAFKA_TOPIC='OUTPUT', PARTITIONS=1, REPLICAS=1) AS SELECT *FROM INPUT INPUTEMIT CHANGES;

For detailed information on a Query run: EXPLAIN <Query ID>;
```

Note the addition of the `Status` column and the fact that `Query String` is no longer being written across multiple lines.
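For API clients, the new `state` property makes it possible to filter queries without parsing CLI text. The following is a minimal sketch in Python, assuming the query entries have already been extracted from the REST response into a list of objects shaped like the JSON example above. The `queries_by_state` helper, the second sample entry and its `Error` state are hypothetical and not part of this PR. The state is upper-cased before grouping because the JSON example shows `Running` while the CLI shows `RUNNING`.

```python
import json
from collections import defaultdict

# Sample entries shaped like the JSON example in this PR. The second entry
# and its "Error" state are made up purely for illustration.
SAMPLE = """
[
  {"queryString": "create table OUTPUT as select * from INPUT;",
   "sinks": ["OUTPUT"], "id": "CSAS_OUTPUT_0", "state": "Running"},
  {"queryString": "create table OTHER as select * from INPUT;",
   "sinks": ["OTHER"], "id": "CTAS_OTHER_1", "state": "Error"}
]
"""


def queries_by_state(queries):
    """Group query ids by upper-cased state.

    Entries without a `state` field (e.g. from a server that predates
    this change) are grouped under "UNKNOWN".
    """
    grouped = defaultdict(list)
    for query in queries:
        state = str(query.get("state", "UNKNOWN")).upper()
        grouped[state].append(query["id"])
    return dict(grouped)


grouped = queries_by_state(json.loads(SAMPLE))
print(grouped)
```

Treating a missing `state` as `UNKNOWN` rather than raising is a design choice for clients that may still talk to older servers.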
## `DESCRIBE <source>;` changes:

Old CLI output:

```
ksql> describe CLICK_USER_SESSIONS;

Name : CLICK_USER_SESSIONS
 Field | Type
 ROWTIME | BIGINT (system)
 ROWKEY | INTEGER (system)
 USERID | INTEGER
 COUNT | BIGINT
For runtime statistics and query details run: DESCRIBE EXTENDED <Stream,Table>;
```

New CLI output:

```
ksql> describe CLICK_USER_SESSIONS;

Name : CLICK_USER_SESSIONS
 Field | Type
 ROWTIME | BIGINT (system)
 ROWKEY | INTEGER (system) (Window type: SESSION)
 USERID | INTEGER
 COUNT | BIGINT
For runtime statistics and query details run: DESCRIBE EXTENDED <Stream,Table>;
```

Note the addition of the `Window Type` information.

The extended version of the command has also changed. Old output:

```
ksql> describe extended CLICK_USER_SESSIONS;

Name : CLICK_USER_SESSIONS
Type : TABLE
Key field : USERID
Key format : STRING
Timestamp field : Not set - using <ROWTIME>
Value Format : JSON
Kafka topic : CLICK_USER_SESSIONS (partitions: 1, replication: 1)
Statement : CREATE TABLE CLICK_USER_SESSIONS WITH (KAFKA_TOPIC='CLICK_USER_SESSIONS', PARTITIONS=1, REPLICAS=1) AS SELECT CLICKSTREAM.USERID USERID, COUNT(*) COUNT
FROM CLICKSTREAM CLICKSTREAM
WINDOW SESSION ( 300 SECONDS ) GROUP BY CLICKSTREAM.USERID
EMIT CHANGES;

 Field | Type
 ROWTIME | BIGINT (system)
 ROWKEY | INTEGER (system)
 USERID | INTEGER
 COUNT | BIGINT

Queries that write from this TABLE
-----------------------------------
CTAS_CLICK_USER_SESSIONS_5 (RUNNING) : CREATE TABLE CLICK_USER_SESSIONS WITH (KAFKA_TOPIC='CLICK_USER_SESSIONS', PARTITIONS=1, REPLICAS=1) AS SELECT CLICKSTREAM.USERID USERID, COUNT(*) COUNT
FROM CLICKSTREAM CLICKSTREAM
WINDOW SESSION ( 300 SECONDS ) GROUP BY CLICKSTREAM.USERID
EMIT CHANGES;

For query topology and execution plan please run: EXPLAIN <QueryId>

Local runtime statistics
------------------------

(Statistics of the local KSQL server interaction with the Kafka topic CLICK_USER_SESSIONS)
```

New output:

```
ksql> describe extended CLICK_USER_SESSIONS;

Name : CLICK_USER_SESSIONS
Type : TABLE
Key field : USERID
Timestamp field : Not set - using <ROWTIME>
Key format : KAFKA
Value format : JSON
Kafka topic : CLICK_USER_SESSIONS (partitions: 1, replication: 1)
Statement : CREATE TABLE CLICK_USER_SESSIONS WITH (KAFKA_TOPIC='CLICK_USER_SESSIONS', PARTITIONS=1, REPLICAS=1) AS SELECT CLICKSTREAM.USERID USERID, COUNT(*) COUNT
FROM CLICKSTREAM CLICKSTREAM
WINDOW SESSION ( 300 SECONDS ) GROUP BY CLICKSTREAM.USERID
EMIT CHANGES;

 Field | Type
 ROWTIME | BIGINT (system)
 ROWKEY | INTEGER (system) (Window type: SESSION)
 USERID | INTEGER
 COUNT | BIGINT

Queries that write from this TABLE
-----------------------------------
CTAS_CLICK_USER_SESSIONS_5 (RUNNING) : CREATE TABLE CLICK_USER_SESSIONS WITH (KAFKA_TOPIC='CLICK_USER_SESSIONS', PARTITIONS=1, REPLICAS=1) AS SELECT CLICKSTREAM.USERID USERID, COUNT(*) COUNTFROM CLICKSTREAM CLICKSTREAMWINDOW SESSION ( 300 SECONDS ) GROUP BY CLICKSTREAM.USERIDEMIT CHANGES;

For query topology and execution plan please run: EXPLAIN <QueryId>

Local runtime statistics
------------------------

(Statistics of the local KSQL server interaction with the Kafka topic CLICK_USER_SESSIONS)
```

Note the change of `Key format` from `STRING` to `KAFKA`, the output of `Window Type` information for windowed schemas, and the outputting of SQL statements on a single line in the query list.
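Clients that read `SourceDescription` from the REST API need to cope with the removal of `format`. The following is a minimal sketch in Python, assuming a client that may talk to servers from both before and after this change. Only the field names `format`, `keyFormat` and `valueFormat` come from this PR; the `source_formats` helper and both sample payloads are hypothetical, and the sketch assumes the old single `format` field described the value format.

```python
def source_formats(source_description):
    """Return (key_format, value_format) from a SourceDescription dict.

    Prefers the new `keyFormat` / `valueFormat` fields and falls back to
    the removed `format` field for responses from older servers.
    """
    if "valueFormat" in source_description:
        return (
            source_description.get("keyFormat"),
            source_description["valueFormat"],
        )
    # Assumption: the old `format` field carried the value format only,
    # so the key format is unknown for old-style responses.
    return (None, source_description.get("format"))


# Hypothetical, heavily trimmed payloads for illustration only.
new_style = {"name": "CLICK_USER_SESSIONS", "keyFormat": "KAFKA", "valueFormat": "JSON"}
old_style = {"name": "CLICK_USER_SESSIONS", "format": "JSON"}

print(source_formats(new_style))
print(source_formats(old_style))
```

Returning `None` for an unknown key format keeps the fallback explicit instead of guessing a default.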
Thanks @big-andy-coates! Couple of comments:
Personally, I think it's fine/a-good-thing that the SQL is multi-line when it's not in a table with potentially multiple rows of data. Putting that another way, the However, for the read and write queries, where there can be multiple rows, I think multi-line SQL makes it hard to read. Having the SQL on a single line means one line per query, which IMHO is much easier to understand.
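As an illustration of the single-line form discussed here, the following is a minimal Python sketch that collapses a multi-line statement by replacing each run of whitespace with one space, which keeps adjacent tokens separated. The `to_single_line` helper is hypothetical and is not the code in this PR; it also collapses whitespace inside string literals, which a real implementation would need to avoid.

```python
import re


def to_single_line(sql):
    """Collapse every run of whitespace (including newlines) to one space."""
    return re.sub(r"\s+", " ", sql).strip()


# Sample statement with line breaks, for illustration only.
statement = "CREATE STREAM OUTPUT AS SELECT *\nFROM INPUT INPUT\nEMIT CHANGES;"
single = to_single_line(statement)
print(single)
```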
Sure, though TBH it's just copying whatever text is provided. There are other existing tests that ensure
Good point, I'll test manually. I'm not sure why you didn't approve this PR; none of your concerns seem to warrant not approving. Was this intentional?
Ack, merged without that manual testing... sorry. Will do it now...
Description
While testing the primitive keys stuff manually I came across a couple of issues, inconsistencies, poor UX and bugs, which this commit looks to fix.