Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[REVIEW] Strings scatter detail function #3440

Merged
merged 4 commits into from
Nov 22, 2019

Conversation

davidwendt
Copy link
Contributor

@davidwendt davidwendt commented Nov 21, 2019

Special scatter function for handling strings columns.
Closes #3437

The caller must set the null mask in the output column.

This is needed for #3354

@davidwendt davidwendt requested a review from a team as a code owner November 21, 2019 18:05
@davidwendt davidwendt self-assigned this Nov 21, 2019
@davidwendt davidwendt added 2 - In Progress Currently a work in progress libcudf++ strings strings issues (C++ and Python) labels Nov 21, 2019
@codecov
Copy link

codecov bot commented Nov 21, 2019

Codecov Report

Merging #3440 into branch-0.11 will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@             Coverage Diff              @@
##           branch-0.11    #3440   +/-   ##
============================================
  Coverage        87.36%   87.36%           
============================================
  Files               49       49           
  Lines             9295     9295           
============================================
  Hits              8121     8121           
  Misses            1174     1174

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update bd61041...df37104. Read the comment docs.

@davidwendt davidwendt changed the title [WIP] Strings scatter detail function [REVIEW] Strings scatter detail function Nov 21, 2019
@davidwendt davidwendt added 3 - Ready for Review Ready for review by team and removed 2 - In Progress Currently a work in progress labels Nov 21, 2019
@davidwendt
Copy link
Contributor Author

rerun tests

@trevorsm7
Copy link
Contributor

trevorsm7 commented Nov 21, 2019

Can the scatter interface be changed from scatter( SourceIterator begin, SourceIterator MapIterator scatter_map, ... to something like scatter( SourceIterator begin, MapIterator scatter_map, size_type scatter_size, ...? The issue is that the scatter map isn't required to be the same size as the source (can't be larger but it can be shorter).
Actually, nevermind. I can pass begin + scatter_size for the SourceIterator end, so it doesn't matter.

*---------------------------------------------------------------------------**/
std::unique_ptr<cudf::column> slice( strings_column_view strings,
*/
std::unique_ptr<cudf::column> slice( strings_column_view const& strings,
size_type start, size_type end=-1,
size_type step=1,
cudaStream_t stream=0,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe swap the order of stream and mr for consistency with other APIs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would require changing code that currently calls this which is not related to this PR.
I would rather do a separate PR to fix this.

Copy link
Contributor

@trevorsm7 trevorsm7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look good and working as expected with PR 3354

@harrism harrism merged commit b429813 into rapidsai:branch-0.11 Nov 22, 2019
@davidwendt davidwendt deleted the fea-strings-scatter branch November 22, 2019 14:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 - Ready for Review Ready for review by team strings strings issues (C++ and Python)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEA] Strings scatter function needed for cudf
3 participants