Commit
docs Update for WriterProperties
sherlockbeard authored and rtyler committed Aug 20, 2024
1 parent 649c63b commit 9825361
Showing 2 changed files with 38 additions and 1 deletion.
4 changes: 4 additions & 0 deletions docs/api/delta_writer.md
@@ -8,6 +8,10 @@ search:

::: deltalake.write_deltalake

::: deltalake.BloomFilterProperties

::: deltalake.ColumnProperties

::: deltalake.WriterProperties

## Convert to Delta Tables
35 changes: 34 additions & 1 deletion docs/usage/writing/index.md
@@ -69,4 +69,37 @@ In this case, you can use a `predicate` to overwrite only the relevant records o
Data written must conform to the same predicate, i.e. it must not contain any records that don't match the `predicate` condition;
otherwise the operation will fail.

{{ code_example('operations', 'replace_where', ['replaceWhere'])}}

## Using Writer Properties

You can customize the Rust Parquet writer with [WriterProperties](../../api/delta_writer.md#deltalake.WriterProperties). For finer-grained, per-column settings, use the [BloomFilterProperties](../../api/delta_writer.md#deltalake.BloomFilterProperties) and [ColumnProperties](../../api/delta_writer.md#deltalake.ColumnProperties) data classes.


Here's how you can do it:
``` python
from deltalake import BloomFilterProperties, ColumnProperties, WriterProperties, write_deltalake
import pyarrow as pa

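wp = WriterProperties(
    statistics_truncate_length=200,
    # Applied to every column unless overridden in column_properties.
    default_column_properties=ColumnProperties(
        # Positional arguments: set_bloom_filter_enabled, fpp, ndv
        bloom_filter_properties=BloomFilterProperties(True, 0.2, 30)
    ),
    # Per-column overrides: no Bloom filter for "value_non_bloom".
    column_properties={
        "value_non_bloom": ColumnProperties(bloom_filter_properties=None),
    },
)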

table_path = "/tmp/my_table"

data = pa.table(
    {
        "id": pa.array(["1", "1"], pa.string()),
        "value": pa.array([11, 12], pa.int64()),
        "value_non_bloom": pa.array([11, 12], pa.int64()),
    }
)

write_deltalake(table_path, data, writer_properties=wp)
```
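
In `BloomFilterProperties(True, 0.2, 30)`, the second and third arguments are the target false-positive probability (`fpp`) and the expected number of distinct values (`ndv`). For rough intuition about how the two trade space against accuracy, the sketch below uses the textbook sizing formula for a classic Bloom filter. This is only an illustration: `classic_bloom_bits` is a hypothetical helper, not part of `deltalake`, and the Parquet writer uses split-block Bloom filters with its own sizing, so the actual on-disk size will differ.

``` python
import math


def classic_bloom_bits(ndv: int, fpp: float) -> int:
    """Textbook Bloom filter size, in bits, for `ndv` distinct values
    at a target false-positive probability `fpp`.

    Illustrative only: this is NOT the sizing used by the Parquet writer.
    """
    if ndv <= 0:
        raise ValueError("ndv must be positive")
    if not 0.0 < fpp < 1.0:
        raise ValueError("fpp must be strictly between 0 and 1")
    # m = -n * ln(p) / (ln 2)^2, rounded up to a whole number of bits
    return math.ceil(-ndv * math.log(fpp) / (math.log(2) ** 2))


# Values from the example above: fpp=0.2, ndv=30
loose = classic_bloom_bits(ndv=30, fpp=0.2)
# A stricter false-positive target needs a larger filter
strict = classic_bloom_bits(ndv=30, fpp=0.01)
print(loose, strict)
```

A lower `fpp` or a higher `ndv` both increase the filter size, so a permissive setting such as `fpp=0.2` keeps the filter small at the cost of more false positives when filtering reads.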
