Improve and expand client protocol docs
Include the new spooling protocol and its configuration for CLI and JDBC
driver.
mosabua committed Oct 18, 2024
1 parent d545e45 commit 0ff72f7
Showing 7 changed files with 213 additions and 35 deletions.
164 changes: 164 additions & 0 deletions docs/src/main/sphinx/admin/properties-client-protocol.md
@@ -0,0 +1,164 @@
# Client protocol properties

bla bla intro with links to clients

(prop-protocol-v1)=
## v1 protocol

blabla intro

### `protocol.v1.alternate-header-name`

- **Type:** [](prop-type-string)

The 351 release of Trino changes the HTTP client protocol headers to start with
`X-Trino-`. Clients for versions 350 and lower expect the HTTP headers to
start with `X-Presto-`, while newer clients expect `X-Trino-`. You can support these
older clients by setting this property to `Presto`.

The preferred approach to migrating from versions earlier than 351 is to update
all clients together with the release, or immediately afterwards, and then
remove usage of this property.

Use this property only as a temporary measure to assist in your migration
efforts.
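
As a minimal sketch of the temporary migration setting described above, and
assuming the property is set in the `config.properties` file of the cluster,
the configuration could look like this:

```properties
# Temporary migration aid: accept clients from version 350 and lower that
# send X-Presto- headers. Remove this line once all clients are updated.
protocol.v1.alternate-header-name=Presto
```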

### `protocol.v1.prepared-statement-compression.length-threshold`

- **Type:** [](prop-type-integer)
- **Default value:** `2048`

Prepared statements that are submitted to Trino for processing, and are longer
than the value of this property, are compressed for transport via the HTTP
header to improve handling, and to avoid failures due to hitting HTTP header
size limits.

### `protocol.v1.prepared-statement-compression.min-gain`

- **Type:** [](prop-type-integer)
- **Default value:** `512`

Prepared statement compression is not applied if the size gain is less than the
configured value. Smaller statements do not benefit from compression, and are
left uncompressed.
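
The following fragment is a sketch that spells out the two documented default
values, again assuming they are set in `config.properties`. Reading "size gain"
as the original length minus the compressed length, which is an interpretation
and not confirmed here, a statement of length 3000 that compresses to 2700
exceeds the threshold of 2048 but gains only 300, which is less than 512, so it
is left uncompressed.

```properties
# Compress prepared statements longer than this value (documented default).
protocol.v1.prepared-statement-compression.length-threshold=2048
# Skip compression if the size gain is less than this value (documented default).
protocol.v1.prepared-statement-compression.min-gain=512
```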

(prop-protocol-spooling)=
## Spooling protocol

intro and more - do we want a separate page in the admin section like we do for
FTE instead or in addition? maybe a separate page since the file system details
need to be configured

### `protocol.spooling.worker-access`

- **Type:** [](prop-type-boolean)
- **Default value:** `false`

Use worker nodes to retrieve data from the spooling location.


### `protocol.spooling.direct-storage-access`

- **Type:** [](prop-type-boolean)
- **Default value:** `true`

Retrieve segments directly from the spooling location.

### `protocol.spooling.direct-storage-fallback`

- **Type:** [](prop-type-boolean)
- **Default value:** `false`

Fall back to segment retrieval through the coordinator when direct storage
access is not possible.

### `protocol.spooling.initial-segment-size`

- **Type:** [](prop-type-data-size)
- **Default value:** `8MB`

Initial size of the spooled segments.

### `protocol.spooling.maximum-segment-size`

- **Type:** [](prop-type-data-size)
- **Default value:** `16MB`

tbd

### `protocol.spooling.inline-segments`

- **Type:** [](prop-type-boolean)
- **Default value:** `false`

Allow the protocol to inline data.

### `protocol.spooling.shared-secret-key`

- **Type:** [](prop-type-string)

256-bit, base64-encoded secret key used to secure segment identifiers.
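
The spooling protocol properties above can be combined as in the following
sketch. It repeats the documented default values and assumes the properties
belong in `config.properties`, which this page does not state. The secret key
value is a placeholder. One common way to produce a 256-bit, base64-encoded
value is `openssl rand -base64 32`, which is a suggestion about tooling and not
something this page prescribes.

```properties
# Sketch only: values mirror the documented defaults.
protocol.spooling.worker-access=false
protocol.spooling.direct-storage-access=true
protocol.spooling.direct-storage-fallback=false
protocol.spooling.initial-segment-size=8MB
protocol.spooling.maximum-segment-size=16MB
protocol.spooling.inline-segments=false
# Placeholder: replace with your own 256-bit, base64-encoded secret key.
protocol.spooling.shared-secret-key=<base64-encoded-256-bit-key>
```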


(prop-spooling-filesystem)=
## Spooling filesystem

maybe this should go onto another page .. but I actually think not .. it fits
here unless there is any other usage for the spooling file system besides the
spooling protocol


### `fs.azure.enabled`

- **Type:** [](prop-type-boolean)

tbd, link to azure object storage for more details, exclusive to other


### `fs.s3.enabled`

- **Type:** [](prop-type-boolean)


### `fs.gcs.enabled`

- **Type:** [](prop-type-boolean)


### `fs.location`



### `fs.layout`


File system layout for spooled segments. The layout class is some sort of
`SIMPLE` or `PARTITIONED`.


### `fs.segment.ttl`

Maximum duration for the client to retrieve a spooled segment before it expires.


### `fs.segment.encryption`


Encrypt segments with ephemeral keys.


### `fs.segment.pruning.enabled`

Prune expired segments periodically.


### `fs.segment.pruning.interval`

Interval at which expired segments are pruned.


### `fs.segment.pruning.batch-size`


Prune expired segments in batches of the provided size.
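
The following fragment is a rough sketch of how the spooling file system
properties might fit together. Every value is a hypothetical illustration,
because this page does not yet state types, defaults, or the file that holds
these properties. The bucket name, the durations, and the batch size are
invented for the example, and `SIMPLE` is taken from the tentative note on
`fs.layout`.

```properties
# Hypothetical example values, not documented defaults.
# Enable one object storage system; the notes above suggest they are exclusive.
fs.s3.enabled=true
# Hypothetical location that must be reachable from all nodes.
fs.location=s3://example-bucket/spooling/
# Tentatively SIMPLE or PARTITIONED.
fs.layout=SIMPLE
# Hypothetical time to live for a spooled segment.
fs.segment.ttl=12h
fs.segment.encryption=true
fs.segment.pruning.enabled=true
# Hypothetical pruning interval and batch size.
fs.segment.pruning.interval=5m
fs.segment.pruning.batch-size=250
```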
35 changes: 0 additions & 35 deletions docs/src/main/sphinx/admin/properties-general.md
@@ -33,41 +33,6 @@ across nodes in the cluster. It can be disabled, when it is known that the
output data set is not skewed, in order to avoid the overhead of hashing and
redistributing all the data across the network.

## `protocol.v1.alternate-header-name`

**Type:** `string`

The 351 release of Trino changes the HTTP client protocol headers to start with
`X-Trino-`. Clients for versions 350 and lower expect the HTTP headers to
start with `X-Presto-`, while newer clients expect `X-Trino-`. You can support these
older clients by setting this property to `Presto`.

The preferred approach to migrating from versions earlier than 351 is to update
all clients together with the release, or immediately afterwards, and then
remove usage of this property.

Ensure to use this only as a temporary measure to assist in your migration
efforts.

## `protocol.v1.prepared-statement-compression.length-threshold`

- **Type:** {ref}`prop-type-integer`
- **Default value:** `2048`

Prepared statements that are submitted to Trino for processing, and are longer
than the value of this property, are compressed for transport via the HTTP
header to improve handling, and to avoid failures due to hitting HTTP header
size limits.

## `protocol.v1.prepared-statement-compression.min-gain`

- **Type:** {ref}`prop-type-integer`
- **Default value:** `512`

Prepared statement compression is not applied if the size gain is less than the
configured value. Smaller statements do not benefit from compression, and are
left uncompressed.

(file-compression)=
## File compression and decompression

1 change: 1 addition & 0 deletions docs/src/main/sphinx/admin/properties.md
@@ -15,6 +15,7 @@ properties, refer to the {doc}`connector documentation </connector/>`.
:titlesonly: true
General <properties-general>
Client protocol <properties-client-protocol>
HTTP server <properties-http-server>
Resource management <properties-resource-management>
Query management <properties-query-management>
4 changes: 4 additions & 0 deletions docs/src/main/sphinx/client.md
@@ -24,3 +24,7 @@ The Trino project maintains the following other client libraries:

In addition, other communities and vendors provide [numerous other client
libraries, drivers, and applications](https://trino.io/ecosystem/client)

Configure support for the [spooling protocol on the cluster](prop-protocol-spooling)
and your client to improve performance for client interactions with higher data
transfer demands.
30 changes: 30 additions & 0 deletions docs/src/main/sphinx/client/cli.md
@@ -602,6 +602,36 @@ Query 20200707_170726_00030_2iup9 failed: line 1:25: Column 'region' cannot be r
SELECT nationkey, name, region FROM tpch.sf1.nation LIMIT 3
```

(cli-spooled-protocol)=
## Spooled protocol


--encoding=<encoding> Experimental spooled protocol encoding [available: json, json+zstd, json+lz4]

validate encoding?

CLI options


--encoding-id - renamed to --encoding yet?

validate encoding in CLI?


segment expiry configuration

spooling location configuration
accessible from all nodes


sizing of spooling storage location
management of old spooled data
survives cluster restart? Or is picked up cleanly after restart?


https://www.linkedin.com/posts/mateuszgajewski_trino-trino-performance-activity-7244658731991392258-0udu?utm_source=share&utm_medium=member_desktop


(cli-output-format)=
## Output formats

7 changes: 7 additions & 0 deletions docs/src/main/sphinx/client/jdbc.md
@@ -262,3 +262,10 @@ may not be specified using both methods.
network overhead and uses smaller HTTP headers and requires Trino 431 or
greater.
:::


(jdbc-spooled-protocol)=
## Spooled protocol


--encoding=<encoding> Experimental spooled protocol encoding [available: json, json+zstd, json+lz4]
7 changes: 7 additions & 0 deletions docs/src/main/sphinx/develop/client-protocol.md
@@ -260,3 +260,10 @@ subsequent requests to be consistent with the response headers received.
Class `io.trino.client.ProtocolHeaders` in module `trino-client` in the
`client` directory of Trino source enumerates all the HTTP request and
response headers allowed by the Trino client REST API.


(spooled-protocol)=
## Spooled protocol

https://www.linkedin.com/posts/mateuszgajewski_trino-trino-performance-activity-7244658731991392258-0udu?utm_source=share&utm_medium=member_desktop
