Add gbenchmark for nvtext ngrams functions #7693

Merged: 2 commits, Mar 29, 2021

Conversation

davidwendt
Contributor

Reference #5696
Creates a gbenchmark for the nvtext::generate_ngrams() and nvtext::generate_character_ngrams() functions.
The benchmarks measure various string lengths and numbers of rows.
nvtext::generate_ngrams() was refactored to use the more efficient make_strings_children utility, which improved its performance by about 50%.
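
For readers unfamiliar with the two functions being benchmarked, here is a rough CPU-only sketch of the outputs they are understood to produce. It is not the libcudf implementation: the names `row_ngrams` and `character_ngrams` are hypothetical, the inputs are plain `std::vector<std::string>` rather than strings columns, null handling is omitted, and characters are assumed to be single-byte.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// CPU-only illustration (hypothetical helper, not the libcudf code).
// Joins each run of `n` adjacent rows with `separator`.
std::vector<std::string> row_ngrams(std::vector<std::string> const& rows,
                                    std::size_t n,
                                    std::string const& separator = "_")
{
  std::vector<std::string> out;
  if (n == 0 || rows.size() < n) { return out; }
  for (std::size_t i = 0; i + n <= rows.size(); ++i) {
    std::string gram = rows[i];
    for (std::size_t j = 1; j < n; ++j) { gram += separator + rows[i + j]; }
    out.push_back(gram);
  }
  return out;
}

// CPU-only illustration (hypothetical helper, not the libcudf code).
// Emits every substring of `n` consecutive characters within each row.
// Assumes single-byte characters; real UTF-8 handling is more involved.
std::vector<std::string> character_ngrams(std::vector<std::string> const& rows, std::size_t n)
{
  std::vector<std::string> out;
  if (n == 0) { return out; }
  for (auto const& s : rows) {
    for (std::size_t i = 0; i + n <= s.size(); ++i) { out.push_back(s.substr(i, n)); }
  }
  return out;
}
```

With `n = 2`, `row_ngrams({"a", "bb", "ccc"}, 2)` gives `a_bb` and `bb_ccc`, while `character_ngrams({"ab", "cde", "fgh"}, 2)` gives `ab`, `cd`, `de`, `fg`, `gh`. The output size depends on both the row count and the string lengths, which is presumably why the benchmarks vary both.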

@davidwendt added the labels 3 - Ready for Review, libcudf, Performance, improvement, and non-breaking on Mar 23, 2021
@davidwendt self-assigned this on Mar 23, 2021
@davidwendt requested review from a team as code owners on March 23, 2021 19:47
@github-actions bot added the CMake label on Mar 23, 2021

codecov bot commented Mar 23, 2021

Codecov Report

Merging #7693 (3d384ee) into branch-0.19 (7871e7a) will increase coverage by 0.62%.
The diff coverage is n/a.

❗ Current head 3d384ee differs from pull request most recent head d3155bc. Consider uploading reports for the commit d3155bc to get more accurate results

@@               Coverage Diff               @@
##           branch-0.19    #7693      +/-   ##
===============================================
+ Coverage        81.86%   82.49%   +0.62%     
===============================================
  Files              101      101              
  Lines            16884    17416     +532     
===============================================
+ Hits             13822    14367     +545     
+ Misses            3062     3049      -13     
Impacted Files Coverage Δ
python/cudf/cudf/core/column/categorical.py 91.97% <ø> (+0.58%) ⬆️
python/cudf/cudf/core/column/column.py 87.86% <ø> (+0.10%) ⬆️
python/cudf/cudf/core/column/datetime.py 89.63% <ø> (+0.54%) ⬆️
python/cudf/cudf/core/column/decimal.py 92.75% <ø> (-2.12%) ⬇️
python/cudf/cudf/core/column/lists.py 92.50% <ø> (+1.10%) ⬆️
python/cudf/cudf/core/column/numerical.py 94.83% <ø> (-0.20%) ⬇️
python/cudf/cudf/core/column/string.py 86.79% <ø> (+0.30%) ⬆️
python/cudf/cudf/core/column/timedelta.py 88.57% <ø> (+0.33%) ⬆️
python/cudf/cudf/core/column_accessor.py 96.01% <ø> (+0.70%) ⬆️
python/cudf/cudf/core/dataframe.py 90.90% <ø> (+0.43%) ⬆️
... and 61 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Member

@harrism left a comment

Looks good, just one question about that return std::move

cpp/src/text/generate_ngrams.cu (resolved)
@davidwendt
Contributor Author

@gpucibot merge

@rapids-bot bot merged commit d9103c4 into rapidsai:branch-0.19 on Mar 29, 2021
@davidwendt deleted the benchmark-nvtext-ngrams branch on March 29, 2021 15:50
Labels
3 - Ready for Review (Ready for review by team), CMake (CMake build issue), improvement (Improvement / enhancement to an existing function), libcudf (Affects libcudf (C++/CUDA) code.), non-breaking (Non-breaking change), Performance (Performance related issue)

3 participants