Add cudf strings is_title API #9380

Merged 14 commits on Oct 12, 2021.

davidwendt
Contributor

Closes #5265

This PR adds the libcudf cudf::strings::is_title() function and the cuDF Python istitle() function for strings columns and series. It also includes the corresponding gtests and pytests for this feature.

As mentioned in #5265, this function behaves the same in pandas and PySpark, both of which follow the logic documented here: https://pandas.pydata.org/docs/reference/api/pandas.Series.str.istitle.html
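
For readers unfamiliar with the title-case rule referenced above, here is a minimal pure-Python sketch of that logic. It is not the libcudf implementation from this PR; the helper name is_title_reference is hypothetical and only illustrates the rule that Python's built-in str.istitle() documents: an uppercase (or titlecase) character may only follow an uncased character, a lowercase character may only follow a cased one, and at least one cased character must be present.

```python
def is_title_reference(text: str) -> bool:
    """Hypothetical reference check for the documented istitle rule.

    This is an illustrative sketch, not the code added by the PR.
    """
    seen_cased = False     # at least one cased character is required
    prev_is_cased = False  # was the previous character upper/lower/titlecase?
    for ch in text:
        if ch.isupper() or ch.istitle():
            # An uppercase/titlecase character may only follow an uncased one.
            if prev_is_cased:
                return False
            prev_is_cased = True
            seen_cased = True
        elif ch.islower():
            # A lowercase character may only follow a cased one.
            if not prev_is_cased:
                return False
            prev_is_cased = True
            seen_cased = True
        else:
            # Digits, spaces, punctuation, and other uncased characters
            # end the current word.
            prev_is_cased = False
    return seen_cased


samples = ["Hello World", "hello world", "HELLO", "Hello world", "A1B", "A1b", "123", ""]
for s in samples:
    # Cross-check the sketch against Python's built-in str.istitle().
    assert is_title_reference(s) == s.istitle()
    print(repr(s), is_title_reference(s))
```

Based on the PR description, the cuDF equivalent would presumably be called on a strings series as .str.istitle(), mirroring the pandas API linked above.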

@davidwendt added the labels feature request, 3 - Ready for Review, libcudf, strings, and non-breaking on Oct 5, 2021
@davidwendt self-assigned this on Oct 5, 2021
@davidwendt requested review from a team as code owners on October 5, 2021 at 17:35
@github-actions (bot) added the Python label on Oct 5, 2021
cpp/tests/strings/case_tests.cpp (review thread: outdated, resolved)
codecov bot commented Oct 5, 2021

Codecov Report

Merging #9380 (8cc2a19) into branch-21.12 (ab4bfaa) will decrease coverage by 0.03%.
The diff coverage is 0.00%.

❗ Current head 8cc2a19 differs from pull request most recent head 430b759. Consider uploading reports for the commit 430b759 to get more accurate results

@@               Coverage Diff                @@
##           branch-21.12    #9380      +/-   ##
================================================
- Coverage         10.79%   10.75%   -0.04%     
================================================
  Files               116      116              
  Lines             18869    19482     +613     
================================================
+ Hits               2036     2096      +60     
- Misses            16833    17386     +553     
Impacted Files                               Coverage Δ
python/cudf/cudf/__init__.py                 0.00% <0.00%> (ø)
python/cudf/cudf/_lib/__init__.py            0.00% <ø> (ø)
python/cudf/cudf/_lib/strings/__init__.py    0.00% <0.00%> (ø)
python/cudf/cudf/core/_base_index.py         0.00% <0.00%> (ø)
python/cudf/cudf/core/column/categorical.py  0.00% <0.00%> (ø)
python/cudf/cudf/core/column/column.py       0.00% <0.00%> (ø)
python/cudf/cudf/core/column/datetime.py     0.00% <0.00%> (ø)
python/cudf/cudf/core/column/lists.py        0.00% <0.00%> (ø)
python/cudf/cudf/core/column/numerical.py    0.00% <0.00%> (ø)
python/cudf/cudf/core/column/string.py       0.00% <0.00%> (ø)
... and 70 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Contributor

@karthikeyann left a comment

LGTM

Contributor

@bdice left a comment

Very nice overall! Some suggestions attached.

cpp/src/strings/capitalize.cu (review thread: outdated, resolved)
cpp/src/strings/capitalize.cu (review thread: resolved)
cpp/src/strings/capitalize.cu (review thread: outdated, resolved)
cpp/src/strings/capitalize.cu (review thread: outdated, resolved)
python/cudf/cudf/core/column/string.py (review thread: outdated, resolved)
python/cudf/cudf/core/column/string.py (review thread: outdated, resolved)
python/cudf/cudf/core/column/string.py (review thread: outdated, resolved)
@davidwendt requested a review from bdice on October 11, 2021 at 13:52
Contributor

@bdice left a comment

@davidwendt Thanks! I left one small suggestion to rename "alpha" to "cased"; otherwise LGTM. I learned a lot about the many strange behaviors of Unicode while reviewing.

@beckernick
Member

@gpucibot merge

@rapids-bot (bot) merged commit 5e46c7e into rapidsai:branch-21.12 on Oct 12, 2021
@davidwendt deleted the fea-is-title branch on October 12, 2021 at 15:52
Successfully merging this pull request may close these issues.

[FEA] Support for .str.istitle api