[BUG] `RangeIndex` shouldn't be materialized in `cudf.concat` #9200
Comments
Did I break this in my recent refactoring somewhere? I distinctly recall you bringing up and fixing this issue at some point.
Ah no I see you fixed it for concatenation of standalone |
Yes, that's correct. Not a regression due to refactoring.
This PR optimizes `cudf.concat` for `axis=0` by not materializing `RangeIndex` objects used as the index of the input `DataFrame` objects. It partially addresses #9200 and is the first of two optimization PRs. A follow-up PR will optimize `axis=1`, since that requires several large changes.

Here is a benchmark:

On `branch-21.10`:
```ipython
IPython 7.27.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import cudf

In [2]: df = cudf.DataFrame({'a':[1, 2, 3]*100})

In [3]: df2 = cudf.DataFrame({'a':[1, 2, 3]*100}, index=cudf.RangeIndex(300, 600))

In [4]: %timeit cudf.concat([df, df2])
806 µs ± 8.02 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
```

This PR:
```ipython
IPython 7.27.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import cudf

In [2]: df = cudf.DataFrame({'a':[1, 2, 3]*100})

In [3]: df2 = cudf.DataFrame({'a':[1, 2, 3]*100}, index=cudf.RangeIndex(300, 600))

In [4]: %timeit cudf.concat([df, df2])
434 µs ± 4.35 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
```

Authors:
- GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
- Ashwin Srinath (https://github.com/shwina)

URL: #9222
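To illustrate the idea of concatenating range-based indexes without materializing them, here is a minimal pure-Python sketch. It is not the cudf implementation: `concat_ranges` is a hypothetical helper that works on built-in `range` objects, and the rule it applies (same step, each range starting where the previous one ends) is an assumption about when the result can stay a range.

```python
from typing import Optional, Sequence


def concat_ranges(ranges: Sequence[range]) -> Optional[range]:
    """Hypothetical helper: merge ranges into one range when they chain exactly.

    Returns None when the inputs do not form a single arithmetic sequence,
    which is the case where a caller would have to materialize the values.
    """
    non_empty = [r for r in ranges if len(r) > 0]
    if not non_empty:
        return range(0)

    step = non_empty[0].step
    next_start = non_empty[0].start
    for r in non_empty:
        # Each range must continue the sequence with the same step.
        if r.step != step or r.start != next_start:
            return None
        next_start = r[-1] + step

    return range(non_empty[0].start, next_start, step)


# Mirrors the benchmark above: indexes 0..299 followed by 300..599.
merged = concat_ranges([range(0, 300), range(300, 600)])
print(merged)

# Two default 0..299 indexes do not chain, so no single range describes them.
print(concat_ranges([range(0, 300), range(0, 300)]))
```

In the contiguous case only the start, stop, and step are stored, so memory use does not grow with the number of rows. The non-contiguous case returns `None` here to mark where a real implementation would fall back to a materialized index.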
Part of this issue has been resolved in #9222 and addressing other part of the issue for |
Fixes: #9223, #9200, #9411

This PR:
- [x] Reduces memory pressure by avoiding index materialization in case of `RangeIndex` when `axis=1`.
- [x] Fixes the correctness of all `axis=1` cases in `cudf.concat`, thus enabling stricter index type checks in the associated pytests.
- [x] Caches the `distinct_count` value of `Column` in `_distinct_count` to improve performance.
- [x] Introduces `Column._clear_cache`, a single method that clears all cached values related to a `Column`.
- [x] Implements `Index.union`, `Index.intersection` & `Index.has_duplicates`.
- [x] Implements the `is_numeric`, `is_boolean`, `is_integer`, `is_floating`, `is_object`, `is_categorical` & `is_interval` APIs in `Index`.
- [x] Optimizes `cudf.concat` for `axis=1` by utilizing the above-mentioned changes. Here are benchmarks:

```python
------------------------------ benchmark 'bench_concat.py::test_concat_axis_1[False-inner-1-objs0]': 2 tests ------------------------------
Name (time in us)  Min  Max  Mean  StdDev  Median  IQR  Outliers  OPS  Rounds  Iterations
----------------------------------------------------------------------------------------------------
test_concat_axis_1[False-inner-1-objs0] (THIS-PR)       209.9802 (1.0)  2,429.9941 (1.0)  222.9479 (1.0)  41.3467 (1.0)  224.5191 (1.0)  12.1914 (1.81)  12;32  4,485.3529 (1.0)  2985  1
test_concat_axis_1[False-inner-1-objs0] (branch-21.12)  1,807.7570 (8.61)  5,023.1239 (2.07)  1,868.9510 (8.38)  246.0487 (5.95)  1,830.1200 (8.15)  6.7296 (1.0)  20;74  535.0595 (0.12)  520  1
----------------------------------------------------------------------------------------------------

------------------------------ benchmark 'bench_concat.py::test_concat_axis_1[False-inner-1-objs1]': 2 tests ------------------------------
Name (time in ms)  Min  Max  Mean  StdDev  Median  IQR  Outliers  OPS  Rounds  Iterations
----------------------------------------------------------------------------------------------------
test_concat_axis_1[False-inner-1-objs1] (THIS-PR)       19.3856 (1.0)  25.1846 (1.0)  19.7466 (1.0)  0.9687 (13.33)  19.5381 (1.0)  0.2784 (6.09)  2;2  50.6416 (1.0)  50  1
test_concat_axis_1[False-inner-1-objs1] (branch-21.12)  30.7169 (1.58)  31.1239 (1.24)  30.7672 (1.56)  0.0727 (1.0)  30.7480 (1.57)  0.0457 (1.0)  2;1  32.5021 (0.64)  33  1
----------------------------------------------------------------------------------------------------

------------------------------ benchmark 'bench_concat.py::test_concat_axis_1[False-inner-1-objs2]': 2 tests ------------------------------
Name (time in ms)  Min  Max  Mean  StdDev  Median  IQR  Outliers  OPS  Rounds  Iterations
----------------------------------------------------------------------------------------------------
test_concat_axis_1[False-inner-1-objs2] (THIS-PR)       19.4794 (1.0)  20.0249 (1.0)  19.5933 (1.0)  0.1462 (1.0)  19.5117 (1.0)  0.1412 (1.07)  10;4  51.0378 (1.0)  51  1
test_concat_axis_1[False-inner-1-objs2] (branch-21.12)  30.8203 (1.58)  31.9644 (1.60)  30.9485 (1.58)  0.1959 (1.34)  30.9026 (1.58)  0.1319 (1.0)  1;1  32.3118 (0.63)  33  1
----------------------------------------------------------------------------------------------------

------------------------------ benchmark 'bench_concat.py::test_concat_axis_1[False-inner-1-objs3]': 2 tests ------------------------------
Name (time in ms)  Min  Max  Mean  StdDev  Median  IQR  Outliers  OPS  Rounds  Iterations
----------------------------------------------------------------------------------------------------
test_concat_axis_1[False-inner-1-objs3] (THIS-PR)       1.2168 (1.0)  3.3944 (1.0)  1.2505 (1.0)  0.0893 (1.0)  1.2349 (1.0)  0.0388 (1.0)  15;23  799.6555 (1.0)  707  1
test_concat_axis_1[False-inner-1-objs3] (branch-21.12)  44.4625 (36.54)  45.9180 (13.53)  45.1017 (36.07)  0.3472 (3.89)  45.1007 (36.52)  0.4618 (11.90)  7;0  22.1721 (0.03)  23  1
----------------------------------------------------------------------------------------------------

------------------------------ benchmark 'bench_concat.py::test_concat_axis_1[False-inner-1-objs4]': 2 tests ------------------------------
Name (time in ms)  Min  Max  Mean  StdDev  Median  IQR  Outliers  OPS  Rounds  Iterations
----------------------------------------------------------------------------------------------------
test_concat_axis_1[False-inner-1-objs4] (branch-21.12)  95.7450 (1.0)  97.5205 (1.0)  96.5405 (1.0)  0.5931 (1.13)  96.5431 (1.0)  1.0256 (1.17)  4;0  10.3583 (1.0)  11  1
test_concat_axis_1[False-inner-1-objs4] (THIS-PR)       106.3069 (1.11)  107.8606 (1.11)  107.0745 (1.11)  0.5239 (1.0)  107.0633 (1.11)  0.8757 (1.0)  3;0  9.3393 (0.90)  10  1
----------------------------------------------------------------------------------------------------

------------------------------ benchmark 'bench_concat.py::test_concat_axis_1[False-inner-1-objs5]': 2 tests ------------------------------
Name (time in ms)  Min  Max  Mean  StdDev  Median  IQR  Outliers  OPS  Rounds  Iterations
----------------------------------------------------------------------------------------------------
test_concat_axis_1[False-inner-1-objs5] (branch-21.12)  276.2022 (1.0)  278.3065 (1.0)  277.3080 (1.0)  0.9845 (1.0)  277.5305 (1.0)  1.8682 (1.03)  2;0  3.6061 (1.0)  5  1
test_concat_axis_1[False-inner-1-objs5] (THIS-PR)       304.1699 (1.10)  307.0704 (1.10)  305.4101 (1.10)  1.1629 (1.18)  305.2463 (1.10)  1.8148 (1.0)  2;0  3.2743 (0.91)  5  1
----------------------------------------------------------------------------------------------------

------------------------------ benchmark 'bench_concat.py::test_concat_axis_1[False-inner-1-objs6]': 2 tests ------------------------------
Name (time in us)  Min  Max  Mean  StdDev  Median  IQR  Outliers  OPS  Rounds  Iterations
----------------------------------------------------------------------------------------------------
test_concat_axis_1[False-inner-1-objs6] (THIS-PR)       554.7500 (1.0)  669.7820 (1.0)  566.2571 (1.0)  13.3221 (1.0)  561.7749 (1.0)  5.2570 (1.0)  85;94  1,765.9823 (1.0)  748  1
test_concat_axis_1[False-inner-1-objs6] (branch-21.12)  3,956.2921 (7.13)  4,395.6251 (6.56)  4,015.7610 (7.09)  66.9272 (5.02)  3,993.7040 (7.11)  76.8616 (14.62)  28;8  249.0188 (0.14)  241  1
----------------------------------------------------------------------------------------------------

------------------------------ benchmark 'bench_concat.py::test_concat_axis_1[False-inner-1-objs7]': 2 tests ------------------------------
Name (time in ms)  Min  Max  Mean  StdDev  Median  IQR  Outliers  OPS  Rounds  Iterations
----------------------------------------------------------------------------------------------------
test_concat_axis_1[False-inner-1-objs7] (THIS-PR)       72.6492 (1.0)  74.1472 (1.0)  73.3672 (1.0)  0.4783 (1.0)  73.4728 (1.0)  0.7316 (1.0)  5;0  13.6301 (1.0)  14  1
test_concat_axis_1[False-inner-1-objs7] (branch-21.12)  98.6850 (1.36)  100.1399 (1.35)  99.5267 (1.36)  0.6551 (1.37)  99.9600 (1.36)  1.1940 (1.63)  4;0  10.0476 (0.74)  10  1
----------------------------------------------------------------------------------------------------

------------------------------ benchmark 'bench_concat.py::test_concat_axis_1[False-outer-1-objs0]': 2 tests ------------------------------
Name (time in us)  Min  Max  Mean  StdDev  Median  IQR  Outliers  OPS (Kops/s)  Rounds  Iterations
----------------------------------------------------------------------------------------------------
test_concat_axis_1[False-outer-1-objs0] (branch-21.12)  213.2710 (1.0)  275.2030 (1.0)  223.5803 (1.01)  7.3814 (1.15)  222.9400 (1.02)  12.9229 (5.86)  719;17  4.4727 (0.99)  2875  1
test_concat_axis_1[False-outer-1-objs0] (THIS-PR)       214.6652 (1.01)  290.9640 (1.06)  220.4459 (1.0)  6.4177 (1.0)  218.0159 (1.0)  2.2046 (1.0)  419;512  4.5363 (1.0)  2731  1
----------------------------------------------------------------------------------------------------

------------------------------ benchmark 'bench_concat.py::test_concat_axis_1[False-outer-1-objs1]': 2 tests ------------------------------
Name (time in ms)  Min  Max  Mean  StdDev  Median  IQR  Outliers  OPS  Rounds  Iterations
----------------------------------------------------------------------------------------------------
test_concat_axis_1[False-outer-1-objs1] (THIS-PR)       140.9027 (1.0)  141.7782 (1.0)  141.4213 (1.0)  0.3324 (1.0)  141.4934 (1.0)  0.5372 (1.0)  4;0  7.0711 (1.0)  8  1
test_concat_axis_1[False-outer-1-objs1] (branch-21.12)  174.4978 (1.24)  175.9156 (1.24)  174.9014 (1.24)  0.5408 (1.63)  174.6700 (1.23)  0.5511 (1.03)  1;0  5.7175 (0.81)  6  1
----------------------------------------------------------------------------------------------------

------------------------------ benchmark 'bench_concat.py::test_concat_axis_1[False-outer-1-objs2]': 2 tests ------------------------------
Name (time in ms)  Min  Max  Mean  StdDev  Median  IQR  Outliers  OPS  Rounds  Iterations
----------------------------------------------------------------------------------------------------
test_concat_axis_1[False-outer-1-objs2] (THIS-PR)       149.0907 (1.0)  151.3939 (1.0)  149.6573 (1.0)  0.8207 (5.56)  149.2920 (1.0)  0.6782 (3.85)  1;1  6.6819 (1.0)  7  1
test_concat_axis_1[False-outer-1-objs2] (branch-21.12)  183.9202 (1.23)  184.3218 (1.22)  184.0712 (1.23)  0.1477 (1.0)  184.0646 (1.23)  0.1760 (1.0)  2;0  5.4327 (0.81)  6  1
----------------------------------------------------------------------------------------------------

------------------------------ benchmark 'bench_concat.py::test_concat_axis_1[False-outer-1-objs3]': 2 tests ------------------------------
Name (time in ms)  Min  Max  Mean  StdDev  Median  IQR  Outliers  OPS  Rounds  Iterations
----------------------------------------------------------------------------------------------------
test_concat_axis_1[False-outer-1-objs3] (THIS-PR)       1.1996 (1.0)  1.6017 (1.0)  1.2270 (1.00)  0.0297 (1.45)  1.2170 (1.0)  0.0374 (3.69)  29;13  815.0022 (1.00)  719  1
test_concat_axis_1[False-outer-1-objs3] (branch-21.12)  1.2096 (1.01)  1.6363 (1.02)  1.2259 (1.0)  0.0205 (1.0)  1.2199 (1.00)  0.0102 (1.0)  88;106  815.7473 (1.0)  762  1
----------------------------------------------------------------------------------------------------

------------------------------ benchmark 'bench_concat.py::test_concat_axis_1[False-outer-1-objs4]': 2 tests ------------------------------
Name (time in ms)  Min  Max  Mean  StdDev  Median  IQR  Outliers  OPS  Rounds  Iterations
----------------------------------------------------------------------------------------------------
test_concat_axis_1[False-outer-1-objs4] (THIS-PR)       582.8973 (1.0)  586.0131 (1.0)  583.9782 (1.0)  1.2053 (1.0)  583.5081 (1.0)  1.2076 (1.0)  1;0  1.7124 (1.0)  5  1
test_concat_axis_1[False-outer-1-objs4] (branch-21.12)  785.9871 (1.35)  790.6360 (1.35)  787.4976 (1.35)  1.8293 (1.52)  786.8087 (1.35)  1.7791 (1.47)  1;0  1.2698 (0.74)  5  1
----------------------------------------------------------------------------------------------------

------------------------------ benchmark 'bench_concat.py::test_concat_axis_1[False-outer-1-objs5]': 2 tests ------------------------------
Name (time in s)  Min  Max  Mean  StdDev  Median  IQR  Outliers  OPS  Rounds  Iterations
----------------------------------------------------------------------------------------------------
test_concat_axis_1[False-outer-1-objs5] (THIS-PR)       1.9260 (1.0)  1.9343 (1.0)  1.9299 (1.0)  0.0031 (1.0)  1.9299 (1.0)  0.0038 (1.0)  2;0  0.5182 (1.0)  5  1
test_concat_axis_1[False-outer-1-objs5] (branch-21.12)  2.1733 (1.13)  2.1830 (1.13)  2.1777 (1.13)  0.0039 (1.26)  2.1784 (1.13)  0.0058 (1.53)  2;0  0.4592 (0.89)  5  1
----------------------------------------------------------------------------------------------------

------------------------------ benchmark 'bench_concat.py::test_concat_axis_1[False-outer-1-objs6]': 2 tests ------------------------------
Name (time in us)  Min  Max  Mean  StdDev  Median  IQR  Outliers  OPS (Kops/s)  Rounds  Iterations
----------------------------------------------------------------------------------------------------
test_concat_axis_1[False-outer-1-objs6] (THIS-PR)       554.3760 (1.0)  632.9010 (1.02)  575.7529 (1.02)  16.1334 (2.40)  566.0525 (1.00)  31.2359 (7.54)  545;0  1.7369 (0.98)  1442  1
test_concat_axis_1[False-outer-1-objs6] (branch-21.12)  556.5900 (1.00)  622.5759 (1.0)  566.3433 (1.0)  6.7226 (1.0)  564.7328 (1.0)  4.1408 (1.0)  114;89  1.7657 (1.0)  1497  1
----------------------------------------------------------------------------------------------------

------------------------------ benchmark 'bench_concat.py::test_concat_axis_1[False-outer-1-objs7]': 2 tests ------------------------------
Name (time in ms)  Min  Max  Mean  StdDev  Median  IQR  Outliers  OPS  Rounds  Iterations
----------------------------------------------------------------------------------------------------
test_concat_axis_1[False-outer-1-objs7] (THIS-PR)       596.3256 (1.0)  600.5619 (1.0)  597.9632 (1.0)  1.6437 (1.0)  597.7408 (1.0)  2.1454 (1.0)  1;0  1.6723 (1.0)  5  1
test_concat_axis_1[False-outer-1-objs7] (branch-21.12)  654.1722 (1.10)  666.8746 (1.11)  657.2377 (1.10)  5.4777 (3.33)  654.3422 (1.09)  4.8897 (2.28)  1;1  1.5215 (0.91)  5  1
----------------------------------------------------------------------------------------------------

------------------------------ benchmark 'bench_concat.py::test_concat_axis_1[True-inner-1-objs0]': 2 tests ------------------------------
Name (time in us)  Min  Max  Mean  StdDev  Median  IQR  Outliers  OPS  Rounds  Iterations
----------------------------------------------------------------------------------------------------
test_concat_axis_1[True-inner-1-objs0] (THIS-PR)       222.4192 (1.0)  312.4340 (1.0)  233.9587 (1.0)  12.3266 (1.0)  226.9410 (1.0)  17.2716 (1.0)  150;17  4,274.2589 (1.0)  896  1
test_concat_axis_1[True-inner-1-objs0] (branch-21.12)  1,831.1338 (8.23)  5,528.5210 (17.70)  2,174.9929 (9.30)  411.6862 (33.40)  2,110.6380 (9.30)  890.1195 (51.54)  77;1  459.7716 (0.11)  293  1
----------------------------------------------------------------------------------------------------

------------------------------ benchmark 'bench_concat.py::test_concat_axis_1[True-inner-1-objs1]': 2 tests ------------------------------
Name (time in ms)  Min  Max  Mean  StdDev  Median  IQR  Outliers  OPS  Rounds  Iterations
----------------------------------------------------------------------------------------------------
test_concat_axis_1[True-inner-1-objs1] (THIS-PR)       19.3491 (1.0)  23.9291 (1.0)  20.4857 (1.0)  1.4031 (13.72)  19.5300 (1.0)  2.5649 (19.47)  14;0  48.8145 (1.0)  40  1
test_concat_axis_1[True-inner-1-objs1] (branch-21.12)  30.9140 (1.60)  31.3545 (1.31)  31.0313 (1.51)  0.1023 (1.0)  31.0049 (1.59)  0.1318 (1.0)  6;1  32.2255 (0.66)  30  1
----------------------------------------------------------------------------------------------------

------------------------------ benchmark 'bench_concat.py::test_concat_axis_1[True-inner-1-objs2]': 2 tests ------------------------------
Name (time in ms)  Min  Max  Mean  StdDev  Median  IQR  Outliers  OPS  Rounds  Iterations
----------------------------------------------------------------------------------------------------
test_concat_axis_1[True-inner-1-objs2] (THIS-PR)       19.3977 (1.0)  22.6105 (1.0)  19.6793 (1.0)  0.6127 (1.0)  19.5005 (1.0)  0.2517 (1.0)  3;3  50.8148 (1.0)  49  1
test_concat_axis_1[True-inner-1-objs2] (branch-21.12)  31.0002 (1.60)  37.2946 (1.65)  31.4314 (1.60)  1.1519 (1.88)  31.1185 (1.60)  0.2629 (1.04)  2;3  31.8153 (0.63)  32  1
----------------------------------------------------------------------------------------------------

------------------------------ benchmark 'bench_concat.py::test_concat_axis_1[True-inner-1-objs3]': 2 tests ------------------------------
Name (time in ms)  Min  Max  Mean  StdDev  Median  IQR  Outliers  OPS  Rounds  Iterations
----------------------------------------------------------------------------------------------------
test_concat_axis_1[True-inner-1-objs3] (THIS-PR)       1.2086 (1.0)  3.2895 (1.0)  1.2670 (1.0)  0.0809 (1.0)  1.2712 (1.0)  0.0247 (1.0)  6;42  789.2781 (1.0)  685  1
test_concat_axis_1[True-inner-1-objs3] (branch-21.12)  44.0268 (36.43)  45.0905 (13.71)  44.4070 (35.05)  0.2370 (2.93)  44.3967 (34.92)  0.2955 (11.95)  6;1  22.5190 (0.03)  24  1
----------------------------------------------------------------------------------------------------

------------------------------ benchmark 'bench_concat.py::test_concat_axis_1[True-inner-1-objs4]': 2 tests ------------------------------
Name (time in ms)  Min  Max  Mean  StdDev  Median  IQR  Outliers  OPS  Rounds  Iterations
----------------------------------------------------------------------------------------------------
test_concat_axis_1[True-inner-1-objs4] (branch-21.12)  94.6051 (1.0)  96.7158 (1.0)  95.3382 (1.0)  0.5723 (1.59)  95.1666 (1.0)  0.5416 (1.0)  3;1  10.4890 (1.0)  11  1
test_concat_axis_1[True-inner-1-objs4] (THIS-PR)       104.9262 (1.11)  105.8423 (1.09)  105.3436 (1.10)  0.3590 (1.0)  105.2455 (1.11)  0.5744 (1.06)  2;0  9.4927 (0.91)  6  1
----------------------------------------------------------------------------------------------------

------------------------------ benchmark 'bench_concat.py::test_concat_axis_1[True-inner-1-objs5]': 2 tests ------------------------------
Name (time in ms)  Min  Max  Mean  StdDev  Median  IQR  Outliers  OPS  Rounds  Iterations
----------------------------------------------------------------------------------------------------
test_concat_axis_1[True-inner-1-objs5] (branch-21.12)  273.0914 (1.0)  273.9324 (1.0)  273.4240 (1.0)  0.3789 (1.0)  273.1949 (1.0)  0.6226 (1.0)  1;0  3.6573 (1.0)  5  1
test_concat_axis_1[True-inner-1-objs5] (THIS-PR)       298.2814 (1.09)  300.4248 (1.10)  299.5427 (1.10)  0.8678 (2.29)  299.7728 (1.10)  1.3431 (2.16)  2;0  3.3384 (0.91)  5  1
----------------------------------------------------------------------------------------------------

------------------------------ benchmark 'bench_concat.py::test_concat_axis_1[True-inner-1-objs6]': 2 tests ------------------------------
Name (time in us)  Min  Max  Mean  StdDev  Median  IQR  Outliers  OPS  Rounds  Iterations
----------------------------------------------------------------------------------------------------
test_concat_axis_1[True-inner-1-objs6] (THIS-PR)       560.6860 (1.0)  664.2400 (1.0)  586.7618 (1.0)  17.0820 (1.0)  596.3098 (1.0)  31.9778 (1.0)  605;3  1,704.2692 (1.0)  1399  1
test_concat_axis_1[True-inner-1-objs6] (branch-21.12)  3,963.3820 (7.07)  7,186.5108 (10.82)  4,081.5076 (6.96)  322.1541 (18.86)  4,015.4392 (6.73)  120.8268 (3.78)  5;9  245.0075 (0.14)  229  1
----------------------------------------------------------------------------------------------------

------------------------------ benchmark 'bench_concat.py::test_concat_axis_1[True-inner-1-objs7]': 2 tests ------------------------------
Name (time in ms)  Min  Max  Mean  StdDev  Median  IQR  Outliers  OPS  Rounds  Iterations
----------------------------------------------------------------------------------------------------
test_concat_axis_1[True-inner-1-objs7] (THIS-PR)       72.7404 (1.0)  74.9232 (1.0)  73.5822 (1.0)  0.7077 (4.98)  73.7620 (1.0)  1.0375 (9.03)  5;0  13.5903 (1.0)  13  1
test_concat_axis_1[True-inner-1-objs7] (branch-21.12)  100.0205 (1.38)  100.4437 (1.34)  100.1622 (1.36)  0.1422 (1.0)  100.1149 (1.36)  0.1149 (1.0)  2;2  9.9838 (0.73)  10  1
----------------------------------------------------------------------------------------------------

------------------------------ benchmark 'bench_concat.py::test_concat_axis_1[True-outer-1-objs0]': 2 tests ------------------------------
Name (time in us)  Min  Max  Mean  StdDev  Median  IQR  Outliers  OPS (Kops/s)  Rounds  Iterations
----------------------------------------------------------------------------------------------------
test_concat_axis_1[True-outer-1-objs0] (THIS-PR)       227.8399 (1.0)  3,832.8250 (13.42)  250.7014 (1.05)  70.9327 (17.37)  255.2045 (1.07)  23.8101 (10.25)  5;8  3.9888 (0.96)  2684  1
test_concat_axis_1[True-outer-1-objs0] (branch-21.12)  235.2530 (1.03)  285.5239 (1.0)  239.8447 (1.0)  4.0831 (1.0)  238.6939 (1.0)  2.3230 (1.0)  243;256  4.1694 (1.0)  2670  1
----------------------------------------------------------------------------------------------------

------------------------------ benchmark 'bench_concat.py::test_concat_axis_1[True-outer-1-objs1]': 2 tests ------------------------------
Name (time in ms)  Min  Max  Mean  StdDev  Median  IQR  Outliers  OPS  Rounds  Iterations
----------------------------------------------------------------------------------------------------
test_concat_axis_1[True-outer-1-objs1] (THIS-PR)       141.7261 (1.0)  145.9042 (1.0)  142.7198 (1.0)  1.7906 (18.95)  141.9498 (1.0)  1.3718 (10.05)  1;1  7.0067 (1.0)  5  1
test_concat_axis_1[True-outer-1-objs1] (branch-21.12)  175.0254 (1.23)  175.2591 (1.20)  175.1198 (1.23)  0.0945 (1.0)  175.0752 (1.23)  0.1364 (1.0)  1;0  5.7104 (0.81)  5  1
----------------------------------------------------------------------------------------------------

------------------------------ benchmark 'bench_concat.py::test_concat_axis_1[True-outer-1-objs2]': 2 tests ------------------------------
Name (time in ms)  Min  Max  Mean  StdDev  Median  IQR  Outliers  OPS  Rounds  Iterations
----------------------------------------------------------------------------------------------------
test_concat_axis_1[True-outer-1-objs2] (THIS-PR)       149.5332 (1.0)  150.3494 (1.0)  149.9293 (1.0)  0.2652 (1.0)  149.8476 (1.0)  0.3105 (1.0)  2;0  6.6698 (1.0)  7  1
test_concat_axis_1[True-outer-1-objs2] (branch-21.12)  183.8074 (1.23)  184.6288 (1.23)  184.2467 (1.23)  0.3398 (1.28)  184.2170 (1.23)  0.5747 (1.85)  3;0  5.4275 (0.81)  6  1
----------------------------------------------------------------------------------------------------

------------------------------ benchmark 'bench_concat.py::test_concat_axis_1[True-outer-1-objs3]': 2 tests ------------------------------
Name (time in ms)  Min  Max  Mean  StdDev  Median  IQR  Outliers  OPS  Rounds  Iterations
----------------------------------------------------------------------------------------------------
test_concat_axis_1[True-outer-1-objs3] (THIS-PR)       1.2082 (1.0)  1.9830 (1.44)  1.2325 (1.0)  0.0367 (2.09)  1.2202 (1.0)  0.0377 (1.70)  21;5  811.3756 (1.0)  696  1
test_concat_axis_1[True-outer-1-objs3] (branch-21.12)  1.2231 (1.01)  1.3767 (1.0)  1.2394 (1.01)  0.0176 (1.0)  1.2321 (1.01)  0.0221 (1.0)  160;12  806.8524 (0.99)  727  1
----------------------------------------------------------------------------------------------------

------------------------------ benchmark 'bench_concat.py::test_concat_axis_1[True-outer-1-objs4]': 2 tests ------------------------------
Name (time in ms)  Min  Max  Mean  StdDev  Median  IQR  Outliers  OPS  Rounds  Iterations
----------------------------------------------------------------------------------------------------
test_concat_axis_1[True-outer-1-objs4] (THIS-PR)       574.2238 (1.0)  576.4085 (1.0)  575.5754 (1.0)  0.8308 (1.0)  575.7421 (1.0)  0.9577 (1.0)  2;0  1.7374 (1.0)  5  1
test_concat_axis_1[True-outer-1-objs4] (branch-21.12)  770.7027 (1.34)  772.6688 (1.34)  771.6322 (1.34)  0.9549 (1.15)  771.0687 (1.34)  1.6949 (1.77)  2;0  1.2960 (0.75)  5  1
----------------------------------------------------------------------------------------------------

------------------------------ benchmark 'bench_concat.py::test_concat_axis_1[True-outer-1-objs5]': 2 tests ------------------------------
Name (time in s)  Min  Max  Mean  StdDev  Median  IQR  Outliers  OPS  Rounds  Iterations
----------------------------------------------------------------------------------------------------
test_concat_axis_1[True-outer-1-objs5] (THIS-PR)       1.9025 (1.0)  1.9095 (1.0)  1.9074 (1.0)
0.0028 (1.0) 1.9082 (1.0) 0.0023 (1.0) 1;1 0.5243 (1.0) 5 1 test_concat_axis_1[True-outer-1-objs5] (branch-21.12) 2.1330 (1.12) 2.1428 (1.12) 2.1374 (1.12) 0.0039 (1.42) 2.1375 (1.12) 0.0062 (2.75) 2;0 0.4679 (0.89) 5 1 -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- --------------------------------------------------------------------------- benchmark 'bench_concat.py::test_concat_axis_1[True-outer-1-objs6]': 2 tests -------------------------------------------------------------------------- Name (time in us) Min Max Mean StdDev Median IQR Outliers OPS (Kops/s) Rounds Iterations ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- test_concat_axis_1[True-outer-1-objs6] (branch-21.12) 558.6550 (1.0) 641.9669 (1.0) 570.2701 (1.0) 11.1347 (1.0) 566.8140 (1.0) 5.0180 (1.0) 141;153 1.7536 (1.0) 1498 1 test_concat_axis_1[True-outer-1-objs6] (THIS-PR) 563.2618 (1.01) 663.0530 (1.03) 594.9855 (1.04) 15.4747 (1.39) 600.2941 (1.06) 8.7381 (1.74) 399;373 1.6807 (0.96) 1394 1 ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- ----------------------------------------------------------------------- benchmark 'bench_concat.py::test_concat_axis_1[True-outer-1-objs7]': 2 tests ----------------------------------------------------------------------- Name (time in ms) Min Max Mean StdDev Median IQR Outliers OPS Rounds Iterations 
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- test_concat_axis_1[True-outer-1-objs7] (THIS-PR) 597.2443 (1.0) 600.4502 (1.0) 598.6500 (1.0) 1.4581 (3.99) 598.3555 (1.0) 2.6978 (4.06) 1;0 1.6704 (1.0) 5 1 test_concat_axis_1[True-outer-1-objs7] (branch-21.12) 653.1495 (1.09) 653.9721 (1.09) 653.5529 (1.09) 0.3653 (1.0) 653.4739 (1.09) 0.6643 (1.0) 2;0 1.5301 (0.92) 5 1 ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- ``` Associated benchmarks are being added here: vyasr/cudf_benchmarks#1 Authors: - GALI PREM SAGAR (https://github.com/galipremsagar) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) URL: #9333
Completely fixed by #9333.
Describe the bug
There is a performance overhead in `cudf.concat` when the index of the dataframes is a `RangeIndex`, because the `RangeIndex` objects are materialized. This is also a behavioural deviation from `pandas`, which does not materialize `RangeIndex` objects if they can be homogeneously concatenated.

Steps/Code to reproduce bug

Expected behavior
The returned dataframe should have a `RangeIndex`.
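
To make the "homogeneously concatenated" condition concrete, here is a minimal pure-Python sketch that uses the built-in `range` as a stand-in for `RangeIndex`. The helper name `concat_ranges` is hypothetical, and this is not the logic used by cudf or pandas; it only illustrates, under the simplifying assumption that all steps must match, when a concatenated index can stay a lazy range instead of being materialized.

```python
from typing import Optional, Sequence


def concat_ranges(ranges: Sequence[range]) -> Optional[range]:
    """Return a single range equal to `ranges` laid end to end.

    Returns None when the inputs do not line up, which is the case where
    a real index would have to be materialized. Hypothetical helper for
    illustration only; not part of cudf or pandas.
    """
    # Empty ranges contribute no elements, so they can be ignored.
    non_empty = [r for r in ranges if len(r) > 0]
    if not non_empty:
        return range(0)

    first = non_empty[0]
    # The value the next range must start at to continue the sequence.
    next_start = first[-1] + first.step
    for r in non_empty[1:]:
        # Simplifying assumption: every range must share the same step.
        if r.step != first.step or r.start != next_start:
            return None
        next_start = r[-1] + r.step
    return range(first.start, next_start, first.step)


# Mirrors the index layout of the benchmark: 0..299 followed by 300..599.
combined = concat_ranges([range(0, 300), range(300, 600)])
# Two frames that both start at 0 cannot be merged into one range.
overlapping = concat_ranges([range(0, 300), range(0, 300)])
```

Here `combined` stays a compact range, while `overlapping` is `None`, signalling that the values would need to be stored explicitly.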

Environment overview (please complete the following information)