BUG FIX: Raise appropriate strings error when concatenating strings column #8290

Merged
merged 6 commits into from
Jun 1, 2021

Conversation

skirui-source

Fixes: #8228

@skirui-source skirui-source added the bug (Something isn't working) and Python (Affects Python cuDF API.) labels May 20, 2021
@skirui-source skirui-source self-assigned this May 20, 2021
@skirui-source skirui-source marked this pull request as ready for review May 25, 2021 04:16
@skirui-source skirui-source requested a review from a team as a code owner May 25, 2021 04:16
try:
    col = libcudf.concat.concat_columns(objs)
except RuntimeError as e:
    if "concatenated rows exceeds size_type range" in str(e):
Contributor

This error from libcudf is not specific to strings columns. The number of rows in any column cannot exceed the maximum value of size_type, so concatenating two integer columns, each with more than half that many rows, would also throw this error from libcudf.

The specific message for strings below only occurs for strings columns.
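
To make the row-count limit concrete, here is a small sketch of the arithmetic in plain Python. It assumes size_type is a 32-bit signed integer (so the maximum is 2**31 - 1); the names `SIZE_TYPE_MAX` and `total_rows_fit` are made up for illustration and are not part of cuDF.

```python
# Assumption: libcudf's size_type is a 32-bit signed integer,
# so a single column can hold at most 2**31 - 1 rows.
SIZE_TYPE_MAX = 2**31 - 1


def total_rows_fit(row_counts):
    """Return True if the concatenated row count still fits in size_type.

    Hypothetical helper for illustration only; the dtype of the columns
    does not matter, only the total number of rows.
    """
    return sum(row_counts) <= SIZE_TYPE_MAX


half = SIZE_TYPE_MAX // 2

# Two columns of exactly half the limit each still fit.
print(total_rows_fit([half, half]))

# One more row in each column pushes the total past the limit.
print(total_rows_fit([half + 1, half + 1]))
```

The check depends only on row counts, which is why the same libcudf error can appear for integer columns as well as strings columns.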

Contributor Author

I generalized the error to "total size of output is too large for a cudf column". Is that sufficient?
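
A rough sketch of what that generalized handling could look like is below. It does not call libcudf: `concat_columns_stub` is a hypothetical stand-in that raises a RuntimeError with an illustrative message, and `OverflowError` is an assumed choice of exception type, since this thread does not state which one is raised.

```python
def concat_columns_stub(objs):
    # Hypothetical stand-in for libcudf.concat.concat_columns; it always
    # fails with an illustrative message containing the matched substring.
    raise RuntimeError(
        "Total number of concatenated rows exceeds size_type range"
    )


def concat_with_clear_error(objs, concat=concat_columns_stub):
    """Translate the low-level size error into a dtype-agnostic message."""
    try:
        return concat(objs)
    except RuntimeError as e:
        if "concatenated rows exceeds size_type range" in str(e):
            # Assumed exception type; the message is the one proposed above.
            raise OverflowError(
                "total size of output is too large for a cudf column"
            ) from e
        # Any other RuntimeError is passed through unchanged.
        raise


try:
    concat_with_clear_error([])
except OverflowError as err:
    print(err)
```

Chaining with `from e` keeps the original libcudf message available in the traceback, and the bare `raise` leaves unrelated runtime errors untouched.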

@JohnZed

JohnZed commented May 27, 2021

@skirui-source / @davidwendt / @quasiben - is this needed for 21.06 or should it be pushed back?

@skirui-source skirui-source added the non-breaking Non-breaking change label May 27, 2021
@shwina shwina changed the base branch from branch-21.06 to branch-21.08 May 27, 2021 23:24
@codecov

codecov bot commented May 28, 2021

Codecov Report

❗ No coverage uploaded for pull request base (branch-21.08@7231e3b).
The diff coverage is n/a.


@@               Coverage Diff               @@
##             branch-21.08    #8290   +/-   ##
===============================================
  Coverage                ?   82.83%           
===============================================
  Files                   ?      109           
  Lines                   ?    17901           
  Branches                ?        0           
===============================================
  Hits                    ?    14828           
  Misses                  ?     3073           
  Partials                ?        0           

Last update 7231e3b...b40e952.

@shwina shwina left a comment

LGTM!

@isVoid isVoid left a comment

👍

@shwina

shwina commented Jun 1, 2021

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 24f8bd9 into rapidsai:branch-21.08 Jun 1, 2021
@skirui-source skirui-source deleted the strconcat branch October 19, 2021 20:07
Labels
bug (Something isn't working), non-breaking (Non-breaking change), Python (Affects Python cuDF API.)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] Raise appropriate strings error when concatenating strings column
5 participants