Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Save operational params in the same way with delta io #1054

Merged
merged 3 commits into from
Jan 17, 2023

Conversation

ismoshkov
Copy link
Contributor

Description

Currently writing "operationParameters" in commit info is misaligned with delta io connector.

Here the sample of structure which is used in delta io.

So the goal of this PR is to align with delta io approach and the PR do two thins: convert all values to string and delete keys with null values.

Related Issue(s)

Closes issue #1017

rtyler
rtyler previously approved these changes Jan 8, 2023
@rtyler rtyler added enhancement New feature or request binding/rust Issues for the Rust crate labels Jan 8, 2023
@@ -493,7 +493,7 @@ async fn test_commit_info() -> Result<(), Box<dyn Error>> {
assert_eq!(last_commit["readVersion"], json!(version));
assert_eq!(
last_commit["operationParameters"]["targetSize"],
json!(2_000_000)
json!("2000000")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd also like to see a test for partitionValues. Could you add that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, could you please clarify your request? I ask because "partitionValues" is not part of the commit info and operating params that have been changed in this PR.
Perhaps you want to check that "partitionBy" as part of the operating params is built correctly?

If you want exactly "partitionValues", I can figure out how to check that. But it looks like it should be in a separate PR.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, sorry. I meant "partitionBy".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added separated unit test, because the most of commit info sections doesn't pass operation info. So they doesn't contains operational parameters which I can use to check.

* fix commit info building
Copy link
Collaborator

@wjones127 wjones127 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ismoshkov!

@wjones127 wjones127 merged commit 4741a89 into delta-io:main Jan 17, 2023
chitralverma pushed a commit to chitralverma/delta-rs that referenced this pull request Mar 17, 2023
# Description
Currently writing "operationParameters" in commit info is misaligned
with delta io connector.


[Here](https://github.com/delta-io/delta/blob/36a7edb8cf507e713700ba827c5fb5ad32b9163e/core/src/main/scala/org/apache/spark/sql/delta/actions/actions.scala#L695)
the sample of structure which is used in delta io.

So the goal of this PR is to align with delta io approach and the PR do
two thins: convert all values to string and delete keys with null
values.

# Related Issue(s)
Closes [issue delta-io#1017](delta-io#1017)

Co-authored-by: Ilya Moshkov <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
binding/rust Issues for the Rust crate delta-rs-crate enhancement New feature or request rust
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Transaction log JSON formatting issue when writing data via Python bindings
4 participants