Refactor libcudf strings::replace to use make_strings_children utility (#7384)

Reference #7370 

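This PR simplifies the current non-regex `cudf::strings::replace` functions by refactoring them to use the more efficient `make_strings_children` utility. As measured by the gbenchmark from PR #7369, the refactoring improves performance on these APIs by about 2x to 2.8x for the cases whose first benchmark argument is 4096 or 32768, and by roughly 1.2x to 1.6x for the larger cases.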
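For readers unfamiliar with the pattern, here is a minimal host-only sketch of a size-then-write, two-pass approach of the kind a utility like `make_strings_children` is built around. It is an illustration under stated assumptions, not the libcudf code: `replace_fn` and `replace_all` are hypothetical names, everything runs on the CPU with `std::vector`, and the functor layout (a null `d_chars` on the sizing pass) is assumed rather than taken from this commit.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical functor: with d_chars == nullptr it only records output sizes;
// with d_chars set it writes the replaced bytes at the precomputed offsets.
struct replace_fn {
  const std::vector<std::string>& input;
  std::string target;             // substring to search for
  std::string repl;               // replacement text
  std::vector<int>* d_offsets{};  // sizes after pass 1, offsets for pass 2
  char* d_chars{};                // null during the sizing pass

  void operator()(std::size_t idx) const {
    const std::string& s = input[idx];
    char* out = d_chars ? d_chars + (*d_offsets)[idx] : nullptr;
    int bytes = 0;
    std::size_t pos = 0;
    while (pos < s.size()) {
      // An empty target never matches, which also avoids an infinite loop.
      if (!target.empty() && s.compare(pos, target.size(), target) == 0) {
        if (out) out = std::copy(repl.begin(), repl.end(), out);
        bytes += static_cast<int>(repl.size());
        pos += target.size();
      } else {
        if (out) *out++ = s[pos];
        ++bytes;
        ++pos;
      }
    }
    if (!d_chars) (*d_offsets)[idx] = bytes;
  }
};

// Hypothetical driver: sizing pass, exclusive scan, then writing pass.
std::vector<std::string> replace_all(const std::vector<std::string>& input,
                                     const std::string& target,
                                     const std::string& repl) {
  std::vector<int> offsets(input.size() + 1, 0);
  replace_fn fn{input, target, repl, &offsets, nullptr};

  // Pass 1: compute the output size of every row.
  for (std::size_t i = 0; i < input.size(); ++i) fn(i);

  // Exclusive scan turns sizes into offsets; the last entry becomes the total.
  int total = 0;
  for (auto& o : offsets) {
    int size = o;
    o = total;
    total += size;
  }

  // Pass 2: a single allocation, then each row writes into its own slot.
  std::vector<char> chars(static_cast<std::size_t>(total));
  fn.d_chars = chars.data();
  for (std::size_t i = 0; i < input.size(); ++i) fn(i);

  std::vector<std::string> out;
  for (std::size_t i = 0; i < input.size(); ++i)
    out.emplace_back(chars.begin() + offsets[i], chars.begin() + offsets[i + 1]);
  return out;
}
```

Calling `replace_all({"abcabc", "xyz"}, "bc", "_")` in this sketch runs the same functor twice over the rows. The point of the design is that the output buffer is allocated once at its exact final size, and each row can then write independently of the others.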

<details>
  <summary>Baseline gbenchmark for replace-scalar</summary>

```
---------------------------------------------------------------------------------------------------------------------
Benchmark                                                           Time             CPU   Iterations UserCounters...
---------------------------------------------------------------------------------------------------------------------
StringReplaceScalar/replace_scalar/4096/32/manual_time          0.308 ms        0.316 ms         2345 bytes_per_second=224.631M/s
StringReplaceScalar/replace_scalar/4096/128/manual_time          1.01 ms         1.03 ms          684 bytes_per_second=269.171M/s
StringReplaceScalar/replace_scalar/4096/512/manual_time          7.35 ms         7.38 ms           95 bytes_per_second=149.028M/s
StringReplaceScalar/replace_scalar/4096/2048/manual_time         74.1 ms         74.2 ms            9 bytes_per_second=58.9153M/s
StringReplaceScalar/replace_scalar/4096/8192/manual_time         1170 ms         1170 ms            1 bytes_per_second=14.8457M/s
StringReplaceScalar/replace_scalar/32768/32/manual_time         0.314 ms        0.333 ms         2225 bytes_per_second=1.7147G/s
StringReplaceScalar/replace_scalar/32768/128/manual_time         1.16 ms         1.18 ms          604 bytes_per_second=1.83688G/s
StringReplaceScalar/replace_scalar/32768/512/manual_time         7.56 ms         7.58 ms           92 bytes_per_second=1.12604G/s
StringReplaceScalar/replace_scalar/32768/2048/manual_time        80.8 ms         80.9 ms            9 bytes_per_second=432.314M/s
StringReplaceScalar/replace_scalar/32768/8192/manual_time        1526 ms         1521 ms            1 bytes_per_second=91.3563M/s
StringReplaceScalar/replace_scalar/262144/32/manual_time        0.430 ms        0.449 ms         1622 bytes_per_second=10.0357G/s
StringReplaceScalar/replace_scalar/262144/128/manual_time        1.94 ms         1.96 ms          361 bytes_per_second=8.80298G/s
StringReplaceScalar/replace_scalar/262144/512/manual_time        18.1 ms         18.0 ms           39 bytes_per_second=3.77253G/s
StringReplaceScalar/replace_scalar/262144/2048/manual_time        227 ms          227 ms            3 bytes_per_second=1.20334G/s
StringReplaceScalar/replace_scalar/2097152/32/manual_time        2.48 ms         2.50 ms          282 bytes_per_second=13.9373G/s
StringReplaceScalar/replace_scalar/2097152/128/manual_time       11.8 ms         11.9 ms           59 bytes_per_second=11.5245G/s
StringReplaceScalar/replace_scalar/2097152/512/manual_time        101 ms          101 ms            7 bytes_per_second=5.42976G/s
StringReplaceScalar/replace_scalar/16777216/32/manual_time       22.2 ms         22.2 ms           31 bytes_per_second=12.4258G/s
```

</details>

<details>
  <summary>gbenchmark results for refactored replace-scalar</summary>

```
---------------------------------------------------------------------------------------------------------------------
Benchmark                                                           Time             CPU   Iterations UserCounters...
---------------------------------------------------------------------------------------------------------------------
StringReplaceScalar/replace_scalar/4096/32/manual_time          0.144 ms        0.162 ms         4871 bytes_per_second=481.559M/s
StringReplaceScalar/replace_scalar/4096/128/manual_time         0.428 ms        0.446 ms         1633 bytes_per_second=634.055M/s
StringReplaceScalar/replace_scalar/4096/512/manual_time          2.65 ms         2.67 ms          263 bytes_per_second=413.561M/s
StringReplaceScalar/replace_scalar/4096/2048/manual_time         28.8 ms         28.8 ms           24 bytes_per_second=151.733M/s
StringReplaceScalar/replace_scalar/4096/8192/manual_time          479 ms          479 ms            2 bytes_per_second=36.2387M/s
StringReplaceScalar/replace_scalar/32768/32/manual_time         0.161 ms        0.178 ms         4347 bytes_per_second=3.35237G/s
StringReplaceScalar/replace_scalar/32768/128/manual_time        0.466 ms        0.484 ms         1502 bytes_per_second=4.57268G/s
StringReplaceScalar/replace_scalar/32768/512/manual_time         2.94 ms         2.96 ms          238 bytes_per_second=2.89405G/s
StringReplaceScalar/replace_scalar/32768/2048/manual_time        37.4 ms         37.4 ms           19 bytes_per_second=933.899M/s
StringReplaceScalar/replace_scalar/32768/8192/manual_time         567 ms          565 ms            1 bytes_per_second=245.929M/s
StringReplaceScalar/replace_scalar/262144/32/manual_time        0.316 ms        0.334 ms         2198 bytes_per_second=13.6601G/s
StringReplaceScalar/replace_scalar/262144/128/manual_time        1.39 ms         1.41 ms          498 bytes_per_second=12.237G/s
StringReplaceScalar/replace_scalar/262144/512/manual_time        12.8 ms         12.9 ms           54 bytes_per_second=5.30963G/s
StringReplaceScalar/replace_scalar/262144/2048/manual_time        157 ms          157 ms            4 bytes_per_second=1.73861G/s
StringReplaceScalar/replace_scalar/2097152/32/manual_time        1.84 ms         1.86 ms          379 bytes_per_second=18.7409G/s
StringReplaceScalar/replace_scalar/2097152/128/manual_time       9.50 ms         9.52 ms           74 bytes_per_second=14.3717G/s
StringReplaceScalar/replace_scalar/2097152/512/manual_time       84.7 ms         84.7 ms            8 bytes_per_second=6.44185G/s
StringReplaceScalar/replace_scalar/16777216/32/manual_time       14.0 ms         14.0 ms           50 bytes_per_second=19.6828G/s
```

</details>

Further improvements for #7370 should be based on these changes.

Authors:
  - David (@davidwendt)

Approvers:
  - Jason Lowe (@jlowe)
  - @nvdbaranec
  - Mark Harris (@harrism)

URL: #7384
davidwendt authored Feb 17, 2021
1 parent b090d96 commit ad6901c
Showing 1 changed file with 117 additions and 169 deletions.