Improve docs for Parquet file format to indicate we support all compression/encodings (#553)
phillipleblanc authored Oct 21, 2024
1 parent 90c0b84 commit 15d44c7
Showing 1 changed file with 28 additions and 2 deletions.
30 changes: 28 additions & 2 deletions spiceaidocs/docs/reference/file_format.md
@@ -6,9 +6,35 @@ pagination_prev: 'reference/index'
pagination_next: null
---

Spice currently supports CSV and Parquet data file-formats. Support for Iceberg and other file-formats are on the roadmap.
Spice currently supports the CSV and Parquet data file formats for data connectors that can read files from a file system or cloud object storage (e.g. [`s3://`](../components/data-connectors/s3.md), [`abfs://`](../components/data-connectors/abfs.md), [`file://`](../components/data-connectors/file.md)). Support for Iceberg and other file formats is on the roadmap.

The parameters supported for specific file-format are detailed on this page.
The parameters supported for specific file-formats are detailed on this page.

## Parquet

Spice automatically supports reading any Parquet file, regardless of the compression codec or data encoding used.
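
Because no codec or encoding needs to be declared, the only precondition is that the file really is Parquet. As a rough illustration (not part of Spice; `looks_like_parquet` is a hypothetical helper name), the sketch below checks for the 4-byte `PAR1` marker that the Parquet format places at the start and end of an unencrypted file. Files with an encrypted footer use a different marker and are not covered here.

```python
import os

# 4-byte marker found at both ends of an unencrypted Parquet file.
PARQUET_MAGIC = b"PAR1"


def looks_like_parquet(path: str) -> bool:
    """Return True if the file starts and ends with the Parquet magic bytes.

    Hypothetical helper for illustration only; it does not validate the
    footer metadata or any page contents.
    """
    # Lower bound on size: header magic + 4-byte footer length + footer magic.
    if os.path.getsize(path) < 12:
        return False
    with open(path, "rb") as f:
        head = f.read(4)
        f.seek(-4, os.SEEK_END)
        tail = f.read(4)
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC
```

A check like this only rules out obviously mislabeled files (for example, a CSV saved with a `.parquet` extension); it says nothing about which codec or encoding the file uses.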

Compression codecs:

- [`UNCOMPRESSED`](https://parquet.apache.org/docs/file-format/data-pages/compression/#uncompressed)
- [`SNAPPY`](https://parquet.apache.org/docs/file-format/data-pages/compression/#snappy)
- [`GZIP`](https://parquet.apache.org/docs/file-format/data-pages/compression/#gzip)
- [`LZO`](https://parquet.apache.org/docs/file-format/data-pages/compression/#lzo)
- [`BROTLI`](https://parquet.apache.org/docs/file-format/data-pages/compression/#brotli)
- [`LZ4`](https://parquet.apache.org/docs/file-format/data-pages/compression/#lz4) (deprecated in favor of `LZ4_RAW`)
- [`LZ4_RAW`](https://parquet.apache.org/docs/file-format/data-pages/compression/#lz4_raw)
- [`ZSTD`](https://parquet.apache.org/docs/file-format/data-pages/compression/#zstd)

Data encodings:

- [`PLAIN`](https://parquet.apache.org/docs/file-format/data-pages/encodings/#plain-plain--0)
- [`PLAIN_DICTIONARY` / `RLE_DICTIONARY`](https://parquet.apache.org/docs/file-format/data-pages/encodings/#dictionary-encoding-plain_dictionary--2-and-rle_dictionary--8)
- [`RLE`](https://parquet.apache.org/docs/file-format/data-pages/encodings/#run-length-encoding--bit-packing-hybrid-rle--3)
- [`BIT_PACKED`](https://parquet.apache.org/docs/file-format/data-pages/encodings/#bit-packed-deprecated-bit_packed--4) (deprecated in favor of `RLE`)
- [`DELTA_BINARY_PACKED`](https://parquet.apache.org/docs/file-format/data-pages/encodings/#delta-binary-packing-delta_binary_packed--5)
- [`DELTA_LENGTH_BYTE_ARRAY`](https://parquet.apache.org/docs/file-format/data-pages/encodings/#delta-length-byte-array-delta_length_byte_array--6)
- [`DELTA_BYTE_ARRAY`](https://parquet.apache.org/docs/file-format/data-pages/encodings/#delta-strings-delta_byte_array--7)
- [`BYTE_STREAM_SPLIT`](https://parquet.apache.org/docs/file-format/data-pages/encodings/#byte-stream-split-byte_stream_split--9)
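
When inspecting raw Parquet metadata, encodings sometimes appear as numeric IDs rather than names. The sketch below maps the IDs embedded in the link anchors above (for example `plain--0` and `rle_dictionary--8`) back to their names. The `ParquetEncoding` enum and `describe_encoding` function are hypothetical names used for illustration, not part of Spice, and only `BIT_PACKED` is flagged as deprecated because that is the only encoding marked so in the list above.

```python
from enum import IntEnum


class ParquetEncoding(IntEnum):
    """Encoding IDs as they appear in the Parquet documentation anchors above."""

    PLAIN = 0
    PLAIN_DICTIONARY = 2
    RLE = 3
    BIT_PACKED = 4
    DELTA_BINARY_PACKED = 5
    DELTA_LENGTH_BYTE_ARRAY = 6
    DELTA_BYTE_ARRAY = 7
    RLE_DICTIONARY = 8
    BYTE_STREAM_SPLIT = 9


# Assumption: mirror only the deprecation note given in the list above.
DEPRECATED = {ParquetEncoding.BIT_PACKED}


def describe_encoding(encoding_id: int) -> str:
    """Return a readable label for a numeric Parquet encoding ID."""
    try:
        encoding = ParquetEncoding(encoding_id)
    except ValueError:
        return f"unknown encoding id {encoding_id}"
    suffix = " (deprecated)" if encoding in DEPRECATED else ""
    return f"{encoding.name}{suffix}"
```

For instance, `describe_encoding(8)` yields the `RLE_DICTIONARY` label, while an ID that is not in the list falls through to the "unknown" message instead of raising.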

## CSV
