Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add partial serde support for ParquetWriterOptions #8627

Merged
merged 11 commits into from
Dec 23, 2023
15 changes: 15 additions & 0 deletions datafusion/proto/proto/datafusion.proto
Original file line number Diff line number Diff line change
Expand Up @@ -1206,13 +1206,28 @@ message PartitionColumn {
message FileTypeWriterOptions {
oneof FileType {
JsonWriterOptions json_options = 1;
ParquetWriterOptions parquet_options = 2;
}
}

message JsonWriterOptions {
CompressionTypeVariant compression = 1;
}

message ParquetWriterOptions {
WriterProperties writer_properties = 1;
}

message WriterProperties {
int32 data_page_size_limit = 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

given the rust fields are usize, is there any reason to use in32 in the protobuf encoding?

Maybe it would make more sense to use i64 or u64 here instead 🤔

https://protobuf.dev/programming-guides/proto2/#scalar

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I have updated to u32.

Copy link
Member Author

@andygrove andygrove Dec 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I misread / misunderstood your comment. I changed to u32 on the basis that these should not need to support negative numbers, but I may need to research this more to make sure that is the case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Never mind, in Rust they are usize, so can't be negative.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now changed to u64 to better match usize.

int32 dictionary_page_size_limit = 2;
int32 data_page_row_count_limit = 3;
int32 write_batch_size = 4;
int32 max_row_group_size = 5;
string writer_version = 6;
string created_by = 7;
}

message FileSinkConfig {
reserved 6; // writer_mode

Expand Down
316 changes: 316 additions & 0 deletions datafusion/proto/src/generated/pbjson.rs

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Loading