Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] group_by.replace_nulls fails if the column is not nullable #8504

Closed
revans2 opened this issue Jun 11, 2021 · 3 comments
Closed

[BUG] group_by.replace_nulls fails if the column is not nullable #8504

revans2 opened this issue Jun 11, 2021 · 3 comments
Assignees
Labels
bug Something isn't working libcudf Affects libcudf (C++/CUDA) code.

Comments

@revans2
Copy link
Contributor

revans2 commented Jun 11, 2021

I put this as a bug because it is unexpected.

In my use case I can work around the issue because my data is already sorted by the groups. If it wasn't and I had one column that wasn't nullable in with others that were I would then have to write a conditional path that would sort the data myself ahead of time and pull out the columns that don't need null replacement and hope that replace_nulls would not change the order of the data. Would be nice if replace_nulls could just copy the column if needed.

@revans2 revans2 added bug Something isn't working Needs Triage Need team to review and classify labels Jun 11, 2021
@ttnghia ttnghia self-assigned this Jun 13, 2021
@ttnghia ttnghia removed the Needs Triage Need team to review and classify label Jun 13, 2021
@ttnghia
Copy link
Contributor

ttnghia commented Jun 14, 2021

Can you clarify the issue, please? I dig into the replace_null code and see that nullable columns are already handled thus I'm not sure what's wrong here? As we can see from the code below, all columns are always sorted regardless of their nullability. If a column is not nullable, it will just be copied (after sorted).

      bool nullable       = values.column(i).nullable();
      auto final_mr       = nullable ? rmm::mr::get_current_device_resource() : mr;
      auto grouped_values = helper().grouped_values(values.column(i), stream, final_mr);
      return nullable ? detail::group_replace_nulls(
                          *grouped_values, group_labels, replace_policies[i], stream, mr)
                      : std::move(grouped_values);

@harrism harrism changed the title [BUG] group_by.replace_nulls fails if the collumn is not nullable [BUG] group_by.replace_nulls fails if the column is not nullable Jun 15, 2021
@harrism harrism added the libcudf Affects libcudf (C++/CUDA) code. label Jun 15, 2021
@ttnghia
Copy link
Contributor

ttnghia commented Jul 13, 2021

@revans2 is there any chance that this PR fixes your issue: #8699?

@revans2
Copy link
Contributor Author

revans2 commented Jul 14, 2021

Not totally sure what happened I have tried to reproduce this and cannot. I am just going to close it and will reopen it if I see the issue again.

@revans2 revans2 closed this as completed Jul 14, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working libcudf Affects libcudf (C++/CUDA) code.
Projects
None yet
Development

No branches or pull requests

3 participants