Replace toTitle with capitalize for GpuInitCap #2838

Merged 15 commits into NVIDIA:branch-21.08 on Jul 7, 2021

Conversation

firestarman
Collaborator

Replace toTitle with capitalize for GpuInitCap to align with the behavior of Spark InitCap.
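
To illustrate the behavioral gap this change targets, here is a minimal Python sketch (not the plugin's Scala code and not the cudf API). It assumes that Spark's InitCap lower-cases the input and then upper-cases only the first character of each space-delimited word, and it uses Python's built-in `str.title()` as a stand-in for a title-style conversion that treats every non-letter as a word boundary. The name `spark_like_initcap` is hypothetical.

```python
def spark_like_initcap(s: str) -> str:
    """Lower-case s, then upper-case the first character after each space.

    Assumption: only the space character delimits words, which is how this
    sketch models Spark's InitCap. It is not the real Spark implementation.
    """
    chars = list(s.lower())
    for i, ch in enumerate(chars):
        # A word starts at position 0 or right after a space.
        if i == 0 or chars[i - 1] == " ":
            chars[i] = ch.upper()
    return "".join(chars)


sample = "hello-world o'neil"
print(spark_like_initcap(sample))  # Hello-world O'neil
print(sample.title())              # Hello-World O'Neil
```

The two outputs differ on the characters that follow `-` and `'`, which is the kind of mismatch a space-delimited capitalize avoids under the assumptions above.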

closes #2786
closes #120

Signed-off-by: Firestarman [email protected]

@firestarman firestarman changed the title Replace toTitle with capitalize for GpuInitCap [WIP]Replace toTitle with capitalize for GpuInitCap Jun 29, 2021
@firestarman firestarman changed the title [WIP]Replace toTitle with capitalize for GpuInitCap [WIP] Replace toTitle with capitalize for GpuInitCap Jun 29, 2021
@firestarman
Collaborator Author

firestarman commented Jun 29, 2021

This PR depends on rapidsai/cudf#8624 and is still WIP because the docs cannot be updated due to build failures.

@firestarman firestarman self-assigned this Jun 29, 2021
@firestarman firestarman added the cudf_dependency and bug labels Jun 29, 2021
Signed-off-by: Firestarman <[email protected]>
Member

@jlowe jlowe left a comment

> I thought we should cover at least some of the characters mentioned in the bug

Yes, many characters are now covered. IIUC the only remaining issue is corner-case characters that were added to the Unicode standard between versions. If cudf and the JVM aren't targeting the same version of the Unicode standard, those characters occurring in a string can cause a mismatch between GPU and CPU output.
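
One way to hunt for such corner cases is to sweep a range of code points through both implementations and collect the characters on which they disagree. The sketch below is Python rather than the plugin's Scala. `find_mismatches` and `ascii_only_upper` are hypothetical names, and the two functions being compared merely stand in for a CPU and a GPU implementation built on different case-mapping tables.

```python
from typing import Callable, Iterable, List


def ascii_only_upper(s: str) -> str:
    """Stand-in for an implementation with a smaller case table (ASCII only)."""
    return "".join(chr(ord(c) - 32) if "a" <= c <= "z" else c for c in s)


def find_mismatches(
    reference: Callable[[str], str],
    candidate: Callable[[str], str],
    code_points: Iterable[int],
) -> List[str]:
    """Return every character on which the two case mappings disagree."""
    mismatches = []
    for cp in code_points:
        ch = chr(cp)
        if reference(ch) != candidate(ch):
            mismatches.append(ch)
    return mismatches


# ASCII letters agree; Latin-1 letters such as "é" do not, because the
# stand-in table does not cover them.
print(find_mismatches(str.upper, ascii_only_upper, range(0x61, 0x7B)))
print(find_mismatches(str.upper, ascii_only_upper, range(0x80, 0x100)))
```

The resulting list could feed a test that excludes the known-incompatible characters or a doc note that names them.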

firestarman and others added 4 commits July 2, 2021 09:53
….scala


Improve the incompat doc

Co-authored-by: Jason Lowe <[email protected]>
Signed-off-by: Firestarman <[email protected]>
Signed-off-by: Firestarman <[email protected]>
Since tests failed due to some special characters

Signed-off-by: Firestarman <[email protected]>
@firestarman firestarman changed the title [WIP] Replace toTitle with capitalize for GpuInitCap Replace toTitle with capitalize for GpuInitCap Jul 2, 2021
@firestarman firestarman marked this pull request as ready for review July 2, 2021 06:00
@firestarman
Collaborator Author

build

@firestarman
Collaborator Author

firestarman commented Jul 2, 2021

Filed the bug rapidsai/cudf#8644 to track the failures for capitalize with some special characters.
It seems the fix rapidsai/cudf#4520 for rapidsai/cudf#3132 only works for upper and lower.

Signed-off-by: Firestarman <[email protected]>
@firestarman
Collaborator Author

build

Signed-off-by: Firestarman <[email protected]>
@firestarman
Collaborator Author

build

@jlowe jlowe merged commit 7f524d9 into NVIDIA:branch-21.08 Jul 7, 2021
@firestarman firestarman deleted the initcap branch July 8, 2021 05:49
Labels
bug, cudf_dependency
Projects
None yet
2 participants