[BUG] Implement deserialize for Python objects serialized as sequences #3339

Merged
kevinzwang merged 1 commit into main from kevin/pyobject-deser-seq on Nov 20, 2024

Conversation

kevinzwang
Member

`visit_seq` is called when Rust objects are serialized and deserialized with serde_json, since the byte buffer is stored as a plain list of numbers in JSON.
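For anyone skimming, here is a rough Python sketch of the round trip described above: pickled bytes end up as a JSON array of integers, so the reader has to rebuild the byte buffer from a sequence rather than from a bytes value. This is only an analogue of what the Rust visitor has to handle, not the code in this PR; the helper names and the `config` dict are made up for illustration.

```python
import json
import pickle


def to_json_payload(obj):
    """Pickle obj and encode the resulting bytes as a JSON array of ints (0-255)."""
    return json.dumps(list(pickle.dumps(obj)))


def from_json_payload(payload):
    """Rebuild the pickle bytes from a JSON array of ints, then unpickle."""
    numbers = json.loads(payload)  # JSON has no bytes type, so this is a list
    return pickle.loads(bytes(numbers))


# Hypothetical stand-in for a pickled Python object carried inside the payload.
config = {"region": "us-west-2", "retries": 3}

payload = to_json_payload(config)
restored = from_json_payload(payload)
```

The sequence branch in `from_json_payload` plays the role that `visit_seq` plays on the Rust side: without it, a payload written through JSON could not be read back.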

@kevinzwang kevinzwang self-assigned this Nov 20, 2024
@github-actions github-actions bot added the bug Something isn't working label Nov 20, 2024

codspeed-hq bot commented Nov 20, 2024

CodSpeed Performance Report

Merging #3339 will degrade performances by 52.79%

Comparing kevin/pyobject-deser-seq (31ed37c) with main (7922d2d)

Summary

❌ 2 regressions
✅ 15 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

| Benchmark | `main` | `kevin/pyobject-deser-seq` | Change |
|---|---|---|---|
| `test_iter_rows_first_row[100 Small Files]` | 188.3 ms | 229.2 ms | -17.87% |
| `test_show[100 Small Files]` | 14.8 ms | 31.4 ms | -52.79% |

codecov bot commented Nov 20, 2024

Codecov Report

Attention: Patch coverage is 0% with 11 lines in your changes missing coverage. Please review.

Project coverage is 77.17%. Comparing base (a9bf7c0) to head (31ed37c).
Report is 5 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/common/py-serde/src/python.rs | 0.00% | 11 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3339      +/-   ##
==========================================
- Coverage   77.37%   77.17%   -0.20%     
==========================================
  Files         677      678       +1     
  Lines       82864    83228     +364     
==========================================
+ Hits        64113    64235     +122     
- Misses      18751    18993     +242     

| Files with missing lines | Coverage Δ |
|---|---|
| src/common/py-serde/src/python.rs | 72.05% <0.00%> (-5.72%) ⬇️ |

... and 41 files with indirect coverage changes

Contributor

@jaychia jaychia left a comment

Seems fine, but should we really be using JSON as the serde mechanism instead of bytes?

@kevinzwang
Member Author

> Seems fine, but should we really be using JSON as the serde mechanism instead of bytes?

I had this question too, but we can address that later. @samster25 wrote the IO Config pickling so maybe he should know the rationale.

@kevinzwang kevinzwang merged commit ec24c80 into main Nov 20, 2024
42 of 44 checks passed
@kevinzwang kevinzwang deleted the kevin/pyobject-deser-seq branch November 20, 2024 18:47
@samster25
Member

> I had this question too, but we can address that later. @samster25 wrote the IO Config pickling so maybe he should know the rationale.

I used serde_json at the start since all we had were ints and strings, and it was easy for a human to read and debug. However, given that we now have bytes, pickles, and whatnot inside the payload, I think it makes sense to just encode it as bytes instead.
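To make the trade-off concrete, here is a rough Python sketch comparing a pickled payload stored as raw bytes with the same payload written as a JSON list of numbers. The payload contents are invented for illustration, and the actual ratio in this project will depend on what is being pickled.

```python
import json
import pickle

# Hypothetical payload: a small dict carrying an embedded binary blob.
raw = pickle.dumps({"key": "value", "blob": bytes(range(256))})

# JSON form: every byte becomes a decimal number plus a separator.
as_json = json.dumps(list(raw)).encode("utf-8")

# Ratio of JSON size to raw size; above 1 means JSON is larger.
overhead = len(as_json) / len(raw)
print(f"raw: {len(raw)} bytes, json: {len(as_json)} bytes, ratio: {overhead:.2f}x")
```

Each byte costs one to three digits plus a separator in the JSON form, so the list-of-numbers encoding is always larger than the raw buffer, and it also has to be parsed element by element on the way back in.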

@jaychia
Contributor

jaychia commented Nov 22, 2024

@kevinzwang maybe good as a quick follow-up to switch us to using bytes

@kevinzwang
Member Author

> @kevinzwang maybe good as a quick follow-up to switch us to using bytes

#3400 PTAL!
