[SPARK-25476][SPARK-25510][TEST] Refactor AggregateBenchmark and add a new trait to better support Dataset and DataFrame API #22484

Closed
wants to merge 12 commits into from
154 changes: 154 additions & 0 deletions sql/core/benchmarks/AggregateBenchmark-results.txt
@@ -0,0 +1,154 @@
================================================================================================
aggregate without grouping
================================================================================================

Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz

agg w/o group:                           Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------
agg w/o group wholestage off                 40468 / 44934         51.8          19.3       1.0X
agg w/o group wholestage on                      932 / 949       2249.6           0.4      43.4X


================================================================================================
stat functions
================================================================================================

Member Author

@davies Do you know how to generate these benchmarks:

Using ImperativeAggregate (as implemented in Spark 1.6):
Intel(R) Core(TM) i7-4558U CPU @ 2.80GHz
stddev:                            Avg Time(ms)    Avg Rate(M/s)  Relative Rate
-------------------------------------------------------------------------------
stddev w/o codegen                      2019.04            10.39         1.00 X
stddev w codegen                        2097.29            10.00         0.96 X
kurtosis w/o codegen                    2108.99             9.94         0.96 X
kurtosis w codegen                      2090.69            10.03         0.97 X

Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz

stddev:                                  Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------
stddev wholestage off                          6664 / 6775         15.7          63.6       1.0X
stddev wholestage on                             907 / 923        115.6           8.6       7.3X

Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz

kurtosis:                                Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------
kurtosis wholestage off                      31339 / 31357          3.3         298.9       1.0X
kurtosis wholestage on                           982 / 997        106.8           9.4      31.9X


================================================================================================
aggregate with linear keys
================================================================================================

Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz

Aggregate w keys:                        Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------
codegen = F                                    9667 / 9936          8.7         115.2       1.0X
codegen = T hashmap = F                        4537 / 5353         18.5          54.1       2.1X
codegen = T hashmap = T                          924 / 946         90.8          11.0      10.5X


================================================================================================
aggregate with randomized keys
================================================================================================

Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz

Aggregate w keys:                        Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------
codegen = F                                    9298 / 9300          9.0         110.8       1.0X
codegen = T hashmap = F                        5357 / 5405         15.7          63.9       1.7X
codegen = T hashmap = T                        1705 / 1860         49.2          20.3       5.5X


================================================================================================
aggregate with string key
================================================================================================

Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz

Aggregate w string key:                  Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------
codegen = F                                    4019 / 4083          5.2         191.7       1.0X
codegen = T hashmap = F                        2743 / 2797          7.6         130.8       1.5X
codegen = T hashmap = T                        1907 / 1955         11.0          91.0       2.1X


================================================================================================
aggregate with decimal key
================================================================================================

Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz

Aggregate w decimal key:                 Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------
codegen = F                                    3229 / 3313          6.5         154.0       1.0X
codegen = T hashmap = F                        2732 / 2826          7.7         130.3       1.2X
codegen = T hashmap = T                          435 / 447         48.2          20.7       7.4X


================================================================================================
aggregate with multiple key types
================================================================================================

Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz

Aggregate w multiple keys:               Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------
codegen = F                                    6326 / 6800          3.3         301.6       1.0X
codegen = T hashmap = F                        3915 / 4050          5.4         186.7       1.6X
codegen = T hashmap = T                        2972 / 3054          7.1         141.7       2.1X


================================================================================================
max function bytecode size of wholestagecodegen
================================================================================================

Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz

max function bytecode size:              Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------
codegen = F                                      436 / 518          1.5         664.6       1.0X
codegen = T hugeMethodLimit = 10000              259 / 406          2.5         394.8       1.7X
codegen = T hugeMethodLimit = 1500               458 / 733          1.4         698.4       1.0X


================================================================================================
cube
================================================================================================

Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz

cube:                                    Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------
cube wholestage off                            2957 / 2985          1.8         564.1       1.0X
cube wholestage on                             1146 / 1675          4.6         218.5       2.6X


================================================================================================
hash and BytesToBytesMap
================================================================================================

Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz

BytesToBytesMap:                         Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------
UnsafeRowhash                                    224 / 244         93.8          10.7       1.0X
murmur3 hash                                     114 / 134        184.2           5.4       2.0X
fast hash                                          61 / 66        342.8           2.9       3.7X
arrayEqual                                       150 / 229        139.5           7.2       1.5X
Java HashMap (Long)                              127 / 305        164.7           6.1       1.8X
Java HashMap (two ints)                           99 / 156        212.6           4.7       2.3X
Java HashMap (UnsafeRow)                       1048 / 1472         20.0          50.0       0.2X
LongToUnsafeRowMap (opt=false)                   517 / 771         40.6          24.7       0.4X
LongToUnsafeRowMap (opt=true)                     90 / 107        234.1           4.3       2.5X
BytesToBytesMap (off Heap)                     1417 / 1451         14.8          67.6       0.2X
BytesToBytesMap (on Heap)                       934 / 1014         22.4          44.6       0.2X
Aggregate HashMap                                  38 / 42        558.6           1.8       6.0X

