Fix dictionary size computation in ORC writer #7737

Merged

@vuule vuule commented Mar 26, 2021

Fixes #7661

Corrects the field order in the std::accumulate call that computes the string column size with respect to encoding.
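
To illustrate the kind of bug described above, here is a minimal sketch, not the actual libcudf code. The types `chunk_stats` and `column_cost`, the functions `total_cost` and `use_dictionary`, and the 4-byte index width are all hypothetical names and assumptions chosen for the example. It shows how a `std::accumulate` reduction that returns a brace-initialized struct depends on the positional order of the fields: swapping the two expressions in the initializer would compile cleanly but silently swap the direct and dictionary size estimates.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical per-chunk statistics (not the real libcudf structures).
struct chunk_stats {
  std::size_t num_strings;   // number of rows in the chunk
  std::size_t string_bytes;  // total characters across all rows
  std::size_t dict_bytes;    // total characters across unique strings only
};

// Hypothetical estimated encoded size of the column for each encoding.
struct column_cost {
  std::size_t direct;      // every string stored as-is
  std::size_t dictionary;  // unique strings plus one index per row
};

// Assumed width of a dictionary index, for illustration only.
constexpr std::size_t index_bytes = 4;

column_cost total_cost(std::vector<chunk_stats> const& chunks)
{
  return std::accumulate(
    chunks.begin(), chunks.end(), column_cost{0, 0},
    [](column_cost acc, chunk_stats const& c) {
      // Aggregate initialization is positional: the first expression fills
      // `direct`, the second fills `dictionary`. Reversing them still compiles.
      return column_cost{acc.direct + c.string_bytes,
                         acc.dictionary + c.dict_bytes + c.num_strings * index_bytes};
    });
}

// Dictionary encoding is chosen only when its estimate is smaller.
bool use_dictionary(column_cost const& cost) { return cost.dictionary < cost.direct; }
```

With the fields in the wrong order, a decision like `use_dictionary` would be made on swapped estimates, which is one plausible way an encoder could end up picking the larger representation.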

@github-actions github-actions bot added the libcudf Affects libcudf (C++/CUDA) code. label Mar 26, 2021
@devavret
Contributor

Issue not filed

Fixes #7661

@vuule vuule added the bug (Something isn't working), non-breaking (Non-breaking change), and cuIO (cuIO issue) labels Mar 26, 2021
@vuule vuule changed the title Fix dictionary size computation error in ORC writer Fix dictionary size computation in ORC writer Mar 26, 2021
@vuule vuule marked this pull request as ready for review March 27, 2021 00:34
@vuule vuule requested a review from a team as a code owner March 27, 2021 00:34
@vuule vuule self-assigned this Mar 27, 2021
@vuule
Contributor Author

vuule commented Mar 27, 2021

rerun tests

@vuule vuule added the 5 - Ready to Merge Testing and reviews complete, ready to merge label Mar 27, 2021
@codecov

codecov bot commented Mar 27, 2021

Codecov Report

Merging #7737 (9a08dcf) into branch-0.19 (7871e7a) will increase coverage by 0.65%.
The diff coverage is n/a.

@@               Coverage Diff               @@
##           branch-0.19    #7737      +/-   ##
===============================================
+ Coverage        81.86%   82.52%   +0.65%     
===============================================
  Files              101      101              
  Lines            16884    17458     +574     
===============================================
+ Hits             13822    14407     +585     
+ Misses            3062     3051      -11     

Impacted Files                               Coverage Δ
python/cudf/cudf/utils/gpu_utils.py          53.65% <0.00%> (-4.88%) ⬇️
python/cudf/cudf/core/column/lists.py        87.68% <0.00%> (-3.72%) ⬇️
python/cudf/cudf/core/column/decimal.py      92.95% <0.00%> (-1.92%) ⬇️
python/cudf/cudf/core/abc.py                 87.23% <0.00%> (-1.14%) ⬇️
python/cudf/cudf/core/column/numerical.py    94.83% <0.00%> (-0.20%) ⬇️
python/cudf/cudf/core/column/column.py       87.61% <0.00%> (-0.15%) ⬇️
python/cudf/cudf/utils/utils.py              85.36% <0.00%> (-0.07%) ⬇️
python/cudf/cudf/io/feather.py               100.00% <0.00%> (ø)
python/cudf/cudf/utils/ioutils.py            78.71% <0.00%> (ø)
python/cudf/cudf/comm/serialize.py           0.00% <0.00%> (ø)
... and 45 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update e2693e0...9a08dcf.

@vuule
Contributor Author

vuule commented Mar 27, 2021

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 44adf97 into rapidsai:branch-0.19 Mar 27, 2021
@vuule vuule deleted the bug-orc-writer-dictionary-cost branch March 27, 2021 05:49
Labels
5 - Ready to Merge (Testing and reviews complete, ready to merge)
bug (Something isn't working)
cuIO (cuIO issue)
libcudf (Affects libcudf (C++/CUDA) code.)
non-breaking (Non-breaking change)
Development

Successfully merging this pull request may close these issues.

[BUG] Higher memory footprint when writing strings to orc
4 participants