[REVIEW]Optimize pg.add_data for vector properties #3022
rerun tests
This is great, thanks for adding the benchmark too. I had one question below:
cp.arange(start=0, stop=len(data) + 1, step=n_cols), dtype="int32"
)
mask_col = cp.full(shape=n_rows, fill_value=True)
mask = cudf._lib.transform.bools_to_mask(as_column(mask_col))
Is this intended to be a public API, or is there a risk cuDF could change it and break us?
There is a risk that cuDF could change it later on, but I could not come up with an approach that avoids it.
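For anyone following the thread, the offsets and the all-valid mask in the snippet under review can be illustrated in plain Python. This is only a sketch under stated assumptions, not the PR's code: `list_offsets` and `pack_validity` are hypothetical helper names, the data is made up, and the least-significant-bit-first packing follows the Arrow validity bitmap layout rather than describing what `bools_to_mask` does internally (any padding it applies is not modeled here).

```python
# Pure-Python illustration; the PR itself builds these with CuPy arrays on the GPU.


def list_offsets(n_rows, n_cols):
    """Offsets that split a flat buffer of n_rows * n_cols values into rows of n_cols."""
    return list(range(0, n_rows * n_cols + 1, n_cols))


def pack_validity(valid):
    """Pack booleans into bytes, one bit per row, least-significant bit first (assumed layout)."""
    packed = bytearray((len(valid) + 7) // 8)
    for i, flag in enumerate(valid):
        if flag:
            packed[i // 8] |= 1 << (i % 8)
    return bytes(packed)


# Hypothetical example: 2 rows x 3 vector components, flattened row-major.
flat = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
n_rows, n_cols = 2, 3

offsets = list_offsets(n_rows, n_cols)
# Row i of the list column is the slice flat[offsets[i]:offsets[i + 1]].
rows = [flat[offsets[i]:offsets[i + 1]] for i in range(n_rows)]
# Every row is valid, mirroring the fill_value=True mask in the snippet.
mask = pack_validity([True] * n_rows)
```

Because every row has the same width, the offsets are a simple arithmetic sequence, which is why a single `arange` call is enough in the reviewed code.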
@gpucibot merge
This PR fixes #2991
The speedup is around 4000x. See benchmarks.
On this PR:
On Mainline: