[Go] DeltaBitPack deltaBitWidths overflow #38399
Comments
Would you mind providing the test data that triggers the bug?
storage_000000000.parquet.txt
arrow/go/parquet/internal/encoding/delta_bit_packing.go Lines 175 to 176 in ac581fd
It looks like an `if err != nil` return is missing here.
I've found that the C++ reader is able to read a file like this; let me investigate.
I've found the cause. Let me submit a fix.
@Hattonuri I've submitted a fix here: #38413. You can try the new code.
Now it works! Thank you |
…tData (#38413)
### Rationale for this change
As #38399 says, DeltaBitPack decoding is corrupted when a column chunk has more than one page. The first page decodes correctly, but when the second page arrives, `d.usedFirst` has not been reset, which causes the bug.
### What changes are included in this PR?
1. Some style enhancements
2. Bug fix
### Are these changes tested?
Currently not
### Are there any user-facing changes?
Bug fix
* Closes: #38399
Authored-by: mwish <[email protected]>
Signed-off-by: Matt Topol <[email protected]>
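To illustrate the failure mode described in the commit message, here is a minimal sketch in Go. It is not the Arrow decoder: `page`, `toyDeltaDecoder`, `resetOnSetData`, and `decodePages` are hypothetical names invented for this example, and the page layout (one first value plus a list of plain deltas) is a heavy simplification of DELTA_BINARY_PACKED. Only the `usedFirst` flag and the idea of resetting it in `SetData` come from the thread.

```go
package main

// page is a simplified stand-in for a delta-encoded data page:
// a first value from the page header, followed by plain deltas.
type page struct {
	first  int64
	deltas []int64
}

// toyDeltaDecoder is a hypothetical decoder used only to show the bug.
// resetOnSetData toggles between the buggy and the fixed behaviour.
type toyDeltaDecoder struct {
	pg             page
	pos            int
	last           int64
	usedFirst      bool
	resetOnSetData bool
}

// SetData points the decoder at a new page. The per-page state must be
// reset here; forgetting usedFirst is the bug being illustrated.
func (d *toyDeltaDecoder) SetData(pg page) {
	d.pg = pg
	d.pos = 0
	d.last = pg.first
	if d.resetOnSetData {
		d.usedFirst = false
	}
}

// Decode returns all values of the current page.
func (d *toyDeltaDecoder) Decode() []int64 {
	out := make([]int64, 0, len(d.pg.deltas)+1)
	if !d.usedFirst {
		// Emit the header's first value once per page.
		out = append(out, d.pg.first)
		d.usedFirst = true
	}
	for ; d.pos < len(d.pg.deltas); d.pos++ {
		d.last += d.pg.deltas[d.pos]
		out = append(out, d.last)
	}
	return out
}

// decodePages decodes every page of a column chunk with one decoder.
func decodePages(resetOnSetData bool, pages []page) []int64 {
	d := &toyDeltaDecoder{resetOnSetData: resetOnSetData}
	var out []int64
	for _, pg := range pages {
		d.SetData(pg)
		out = append(out, d.Decode()...)
	}
	return out
}
```

With two pages, `{first: 10, deltas: [1, 1]}` and `{first: 100, deltas: [5]}`, the resetting variant yields five values, while the non-resetting variant silently drops the second page's first value. In the real decoder the stale flag presumably desynchronizes reading in a different way, consistent with the out-of-range bit width reported in this issue, but the root cause is the same kind of per-page state leaking across `SetData` calls.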
Describe the bug, including details regarding any error messages, version, and platform.
I wrote some int64 values using C++ `::parquet::Encoding::DELTA_BINARY_PACKED`
and then tried to read them with Go, but got:
I assume that this line lacks a bounds check:
arrow/go/parquet/internal/encoding/delta_bit_packing.go
Line 228 in ac581fd
Component(s)
Go