feat: enhance `PRINT TOPIC`'s format detection (#4551)
* feat: enhance `PRINT TOPIC`'s format detection

fixes: #4258

With this change `PRINT TOPIC` has enhanced detection for the key and value formats of a topic. The command starts with a list of known formats for both the key and the value and refines this list as it sees more data.

As the list of possible formats is refined over time, the command outputs the reduced list. For example, you may see output such as:

```
ksql> PRINT some_topic FROM BEGINNING;
Key format: JSON or SESSION(KAFKA_STRING) or HOPPING(KAFKA_STRING) or TUMBLING(KAFKA_STRING) or KAFKA_STRING
Value format: JSON or KAFKA_STRING
rowtime: 12/21/18 23:58:42 PM PSD, key: stream/CLICKSTREAM/create, value: {"statement":"CREATE STREAM clickstream (_time bigint,time varchar, ip varchar, request varchar, status int, userid int, bytes bigint, agent varchar) with (kafka_topic = 'clickstream', value_format = 'json');","streamsProperties":{}}
rowtime: 12/21/18 23:58:42 PM PSD, key: table/EVENTS_PER_MIN/create, value: {"statement":"create table events_per_min as select userid, count(*) as events from clickstream window TUMBLING (size 10 second) group by userid EMIT CHANGES;","streamsProperties":{}}
Key format: KAFKA_STRING
...
```

In the last line of the above output the command has narrowed the key format down as it has processed more data.

The command has also been updated to only detect valid UTF-8 encoded text as type `JSON` or `KAFKA_STRING`. This is in line with how KSQL would later deserialize the data.

If no known format can successfully deserialize the data, it is printed as a combination of ASCII characters and hex-encoded bytes.
1 parent 77df772 commit a3fae28
Showing 21 changed files with 1,673 additions and 1,049 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,17 +1,189 @@ | ||
--- | ||
layout: page | ||
title: Schemas in ksqlDB | ||
tagline: Use schemas in your queries | ||
tagline: Defining the structure of your data | ||
description: Learn how schemas work with ksqlDB | ||
keywords: ksqldb, schema, evolution, avro | ||
--- | ||
|
||
Data sources like streams and tables have an associated schema. This schema defines the columns | ||
available in the data, just like the columns in a traditional SQL database table. | ||
|
||
## Key vs Value columns | ||
|
||
KsqlDB supports both key and value columns. These map to the data held in the keys and values of the | ||
underlying {{ site.ak }} topic. | ||
|
||
A column is defined by a combination of its [name](#valid-identifiers), its [SQL data type](#sql-data-types), | ||
and possibly a namespace. | ||
|
||
Key columns have a `KEY` namespace suffix. Key columns have the following restrictions: | ||
* There can only be a single key column, currently. | ||
* The key column must be named `ROWKEY` in the KSQL schema. | ||
|
||
Value columns have no namespace suffix. There can be one or more value columns and the value columns | ||
can have any name. | ||
|
||
For example, the following declares a schema with a single `INT` key column and several value | ||
columns: | ||
|
||
```sql | ||
ROWKEY INT KEY, ID BIGINT, NAME STRING, ADDRESS ADDRESS_TYPE | ||
``` | ||
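
As a rough sketch of how such a schema might appear in a complete statement, the following declares a stream over an existing topic. The stream name, the topic name, and the `JSON` value format are assumptions made for illustration and are not taken from this page. The sketch also assumes that a custom type named `ADDRESS_TYPE` has already been registered.

```sql
-- Illustrative only: assumes a topic called 'customers' already exists,
-- that its values are JSON encoded, and that ADDRESS_TYPE is a registered custom type.
CREATE STREAM CUSTOMERS (
    ROWKEY INT KEY,       -- the single key column, named ROWKEY and marked with KEY
    ID BIGINT,            -- value columns follow, each declared as: name type
    NAME STRING,
    ADDRESS ADDRESS_TYPE
  ) WITH (
    KAFKA_TOPIC = 'customers',
    VALUE_FORMAT = 'JSON'
  );
```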
|
||
## Valid Identifiers | ||
|
||
Column and field names must be valid identifiers. | ||
|
||
Unquoted identifiers are treated as upper-case (for example, `col0` is equivalent to `COL0`) and | ||
must contain only alphanumeric and underscore characters. | ||
|
||
Identifiers that contain other characters, or whose case must be preserved, can be quoted using | ||
back-tick quotes, for example `` `col0` ``. | ||
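
The difference between the two forms can be sketched with a pair of column declarations. The column names here are hypothetical and chosen only to show the effect of quoting.

```sql
-- Unquoted: treated as upper-case, so this column is named USER_ID
user_id BIGINT,
-- Quoted: case is preserved and the hyphen, otherwise invalid, is allowed
`page-url` STRING
```

A quoted identifier generally needs the same back-tick quoting wherever it is referenced later.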
|
||
## SQL data types | ||
|
||
The following SQL types are supported by ksqlDB: | ||
|
||
* [Primitive types](#primitive-types) | ||
* [Decimal type](#decimal-type) | ||
* [Array type](#array-type) | ||
* [Map type](#map-type) | ||
* [Struct type](#struct-type) | ||
* [Custom types](#custom-types) | ||
|
||
### Primitive types | ||
|
||
Supported primitive types are: | ||
|
||
* `BOOLEAN`: a binary value | ||
* `INT`: 32-bit signed integer | ||
* `BIGINT`: 64-bit signed integer | ||
* `DOUBLE`: double precision (64-bit) IEEE 754 floating-point number | ||
* `STRING`: a Unicode character sequence (UTF-8) | ||
|
||
### Decimal type | ||
|
||
The `DECIMAL` type can store numbers with a very large number of digits and perform calculations exactly. | ||
It is recommended for storing monetary amounts and other quantities where exactness is required. | ||
However, arithmetic on decimals is slow compared to integer and floating point types. | ||
|
||
`DECIMAL` types have a _precision_ and _scale_. | ||
The scale is the number of digits in the fractional part, to the right of the decimal point. | ||
The precision is the total number of significant digits in the whole number, that is, | ||
the number of digits on both sides of the decimal point. | ||
For example, the number `765.937500` has a precision of 9 and a scale of 6. | ||
|
||
To declare a column of type `DECIMAL` use the syntax: | ||
|
||
```sql | ||
DECIMAL(precision, scale) | ||
``` | ||
|
||
The precision must be positive; the scale must be zero or positive. | ||
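
As a small worked illustration of precision and scale, the declarations below use hypothetical column names. The digit counts in the comments follow directly from the definitions above: the number of integer digits available is the precision minus the scale.

```sql
-- 9 significant digits, 6 of them fractional: up to 3 integer digits,
-- so a value such as 765.937500 fits
PRICE DECIMAL(9, 6),
-- 12 significant digits, 2 of them fractional: up to 10 integer digits,
-- a shape often chosen for monetary amounts
AMOUNT DECIMAL(12, 2)
```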
|
||
### Array type | ||
|
||
The `ARRAY` type defines a variable-length array of elements. All elements in the array must be of | ||
the same type. | ||
|
||
To declare an `ARRAY` use the syntax: | ||
|
||
``` | ||
ARRAY<element-type> | ||
``` | ||
|
||
The _element-type_ can be any other [SQL data type](#sql-data-types). | ||
|
||
For example, the following creates an array of `STRING`s: | ||
|
||
```sql | ||
ARRAY<STRING> | ||
``` | ||
|
||
Instances of an array can be created using the syntax: | ||
|
||
``` | ||
ARRAY[value [, value]*] | ||
``` | ||
|
||
For example, the following creates an array with three `INT` elements: | ||
|
||
```sql | ||
ARRAY[2, 4, 6] | ||
``` | ||
|
||
### Map type | ||
|
||
The `MAP` type defines a variable-length collection of key-value pairs. All keys in the map must be | ||
of the same type. All values in the map must be of the same type. | ||
|
||
To declare a `MAP` use the syntax: | ||
|
||
``` | ||
MAP<key-type, value-type> | ||
``` | ||
|
||
The _key-type_ must currently be `STRING`, while the _value-type_ can be any other [SQL data type](#sql-data-types). | ||
|
||
For example, the following creates a map with `STRING` keys and values: | ||
|
||
```sql | ||
MAP<STRING, STRING> | ||
``` | ||
|
||
Instances of a map can be created using the syntax: | ||
|
||
``` | ||
MAP(key := value [, key := value]*) | ||
``` | ||
|
||
For example, the following creates a map with three key-value pairs: | ||
|
||
```sql | ||
MAP('a' := 1, 'b' := 2, 'c' := 3) | ||
``` | ||
|
||
### Struct type | ||
|
||
The `STRUCT` type defines a list of named fields, where each field can have any [SQL data type](#sql-data-types). | ||
|
||
To declare a `STRUCT` use the syntax: | ||
|
||
``` | ||
STRUCT<field-name field-type [, field-name field-type]*> | ||
``` | ||
|
||
The _field-name_ can be any [valid identifier](#valid-identifiers). The _field-type_ can be any | ||
valid [SQL data type](#sql-data-types). | ||
|
||
For example, the following creates a struct with an `INT` field called `FOO` and a `BOOLEAN` field | ||
called `BAR`: | ||
|
||
```sql | ||
STRUCT<FOO INT, BAR BOOLEAN> | ||
``` | ||
|
||
Instances of a struct can be created using the syntax: | ||
|
||
``` | ||
STRUCT(field-name := field-value [, field-name := field-value]*) | ||
``` | ||
|
||
For example, the following creates a struct with fields called `FOO` and `BAR` and sets their values | ||
to `10` and `true`, respectively: | ||
|
||
```sql | ||
STRUCT('FOO' := 10, 'BAR' := true) | ||
``` | ||
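
To show how the array, map, and struct types might sit side by side in one schema, here is a sketch of three column declarations. The column names are hypothetical; the types are the ones used in the examples above.

```sql
-- Hypothetical columns, one per collection type
TAGS ARRAY<STRING>,
ATTRIBUTES MAP<STRING, STRING>,
DETAILS STRUCT<FOO INT, BAR BOOLEAN>
```

With columns like these, a map value is typically read with an expression such as `ATTRIBUTES['a']` and a struct field with `DETAILS->FOO`.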
|
||
### Custom types | ||
|
||
KsqlDB supports custom types using the `CREATE TYPE` statement. | ||
See the [`CREATE TYPE` docs](../developer-guide/ksqldb-reference/create-type) for more information. | ||
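
As a minimal sketch, a custom type can act as a reusable alias for a longer type definition. The type name and its fields below are assumptions made for illustration; the linked reference remains the authority on the exact syntax.

```sql
-- Register an alias for a struct type (name and fields are hypothetical)
CREATE TYPE ADDRESS_TYPE AS STRUCT<STREET STRING, CITY STRING>;

-- List the registered types, then remove the alias once it is no longer needed
SHOW TYPES;
DROP TYPE ADDRESS_TYPE;
```

Once registered, the alias can be used anywhere a SQL data type is expected, for example as the type of a column.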
|
||
TODO: | ||
|
||
- overview of how schemas work in ksqlDB | ||
- overview of data types | ||
- overview of serialization | ||
- schema evolution with ksqlDB and Avro | ||
|
||
|
||
Page last revised on: {{ git_revision_date }} |