[SPARK-25486][TEST] Refactor SortBenchmark to use main method #22495

Closed
wants to merge 3 commits into from
17 changes: 17 additions & 0 deletions sql/core/benchmarks/SortBenchmark-results.txt
@@ -0,0 +1,17 @@
+================================================================================================
+radix sort
+================================================================================================
+
+Java HotSpot(TM) 64-Bit Server VM 1.8.0_162-b12 on Mac OS X 10.13.6
+Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+
+radix sort 25000000:                     Best/Avg Time(ms)   Rate(M/s)  Per Row(ns)  Relative
+------------------------------------------------------------------------------------------------
+reference TimSort key prefix array           11770 / 11960         2.1        470.8      1.0X
+reference Arrays.sort                          2106 / 2128        11.9         84.3      5.6X
+radix sort one byte                               93 / 100       269.7          3.7    126.9X
+radix sort two bytes                             171 / 179       146.0          6.9     68.7X
+radix sort eight bytes                           659 / 664        37.9         26.4     17.9X
+radix sort key prefix array                    1024 / 1053        24.4         41.0     11.5X
+
+
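The derived columns in the results file above appear to follow from the best time and the row count (25,000,000). The sketch below recomputes them. The formulas are inferred from the numbers in this file rather than taken from the Benchmark source, and the object and method names are hypothetical. Small differences in the last digit are expected, because the file prints times rounded to whole milliseconds.

```scala
// Hypothetical helpers: the formulas are inferred from the results table,
// not copied from Spark's Benchmark class.
object DerivedColumns {
  // Rows per microsecond, which is the same as millions of rows per second.
  def rateMPerSec(rows: Long, bestMs: Double): Double = rows / (bestMs * 1000.0)

  // Nanoseconds spent per row.
  def perRowNs(rows: Long, bestMs: Double): Double = bestMs * 1e6 / rows

  // Speedup relative to the first (baseline) case.
  def relative(baselineMs: Double, bestMs: Double): Double = baselineMs / bestMs
}

object DerivedColumnsDemo {
  def main(args: Array[String]): Unit = {
    val rows = 25000000L
    val timSortMs = 11770.0   // best time of "reference TimSort key prefix array"
    val arraysSortMs = 2106.0 // best time of "reference Arrays.sort"

    println(f"TimSort per row: ${DerivedColumns.perRowNs(rows, timSortMs)}%.1f ns")
    println(f"Arrays.sort rate: ${DerivedColumns.rateMPerSec(rows, arraysSortMs)}%.1f M/s")
    println(f"Arrays.sort relative: ${DerivedColumns.relative(timSortMs, arraysSortMs)}%.1fX")
  }
}
```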
@@ -19,7 +19,7 @@ package org.apache.spark.sql.execution.benchmark

 import java.util.{Arrays, Comparator}

-import org.apache.spark.benchmark.Benchmark
+import org.apache.spark.benchmark.{Benchmark, BenchmarkBase}
 import org.apache.spark.unsafe.array.LongArray
 import org.apache.spark.unsafe.memory.MemoryBlock
 import org.apache.spark.util.collection.Sorter
@@ -28,12 +28,15 @@ import org.apache.spark.util.random.XORShiftRandom

 /**
  * Benchmark to measure performance for aggregate primitives.
- * To run this:
- *   build/sbt "sql/test-only *benchmark.SortBenchmark"
- *
- * Benchmarks in this file are skipped in normal builds.
+ * {{{
+ *   To run this benchmark:
+ *   1. without sbt: bin/spark-submit --class <this class> <spark sql test jar>
+ *   2. build/sbt "sql/test:runMain <this class>"
+ *   3. generate result: SPARK_GENERATE_BENCHMARK_FILES=1 build/sbt "sql/test:runMain <this class>"
+ *      Results will be written to "benchmarks/<this class>-results.txt".
+ * }}}
  */
-class SortBenchmark extends BenchmarkWithCodegen {
+object SortBenchmark extends BenchmarkBase {
Member
@yucai, BenchmarkWithCodegen is different from BenchmarkBase. Can we keep BenchmarkWithCodegen?

Contributor Author
@dongjoon-hyun SortBenchmark does not use any of the functions that BenchmarkWithCodegen provides, so I removed it.
Another option is to do what #22484 did: make BenchmarkWithCodegen extend BenchmarkBase, so that SortBenchmark can keep extending BenchmarkWithCodegen.
Do you prefer the second way?

BTW, congratulations! :)

Member
I got it. +1 for the current one. Thanks, @yucai.
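For readers following the thread, the two options discussed above can be sketched as follows. This is only an illustration of the inheritance shapes: every class and method name below is a hypothetical stand-in, and none of it is Spark's actual BenchmarkBase or BenchmarkWithCodegen.

```scala
// Hypothetical stand-in for a base class that owns the main method.
abstract class BaseSketch {
  def benchmark(): Unit
  def main(args: Array[String]): Unit = benchmark()
}

// Option 1 (the shape this PR ends up with): extend the base directly,
// because no helper from the intermediate layer is needed.
object SortSketchDirect extends BaseSketch {
  var ran = false
  override def benchmark(): Unit = { ran = true }
}

// Option 2 (the alternative raised in the thread): the intermediate layer
// itself extends the base, and the benchmark keeps extending that layer.
abstract class WithHelpersSketch extends BaseSketch {
  // Placeholder for whatever helpers the intermediate layer provides.
  def helper(name: String): String = s"helper called for $name"
}

object SortSketchViaHelpers extends WithHelpersSketch {
  var ran = false
  override def benchmark(): Unit = { ran = true }
}
```

Both objects can be launched through `main`. The difference is only whether the benchmark inherits helpers it does not call.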


   private def referenceKeyPrefixSort(buf: LongArray, lo: Int, hi: Int, refCmp: PrefixComparator) {
     val sortBuffer = new LongArray(MemoryBlock.fromLongArray(new Array[Long](buf.size().toInt)))
@@ -54,10 +57,10 @@ class SortBenchmark extends BenchmarkWithCodegen {
       new LongArray(MemoryBlock.fromLongArray(extended)))
   }

-  ignore("sort") {
+  def sortBenchmark(): Unit = {
     val size = 25000000
     val rand = new XORShiftRandom(123)
-    val benchmark = new Benchmark("radix sort " + size, size)
+    val benchmark = new Benchmark("radix sort " + size, size, output = output)
     benchmark.addTimerCase("reference TimSort key prefix array") { timer =>
       val array = Array.tabulate[Long](size * 2) { i => rand.nextLong }
       val buf = new LongArray(MemoryBlock.fromLongArray(array))
@@ -114,20 +117,11 @@ class SortBenchmark extends BenchmarkWithCodegen {
       timer.stopTiming()
     }
     benchmark.run()
+  }

-    /*
-      Running benchmark: radix sort 25000000
-      Java HotSpot(TM) 64-Bit Server VM 1.8.0_66-b17 on Linux 3.13.0-44-generic
-      Intel(R) Core(TM) i7-4600U CPU @ 2.10GHz
-
-      radix sort 25000000:                     Best/Avg Time(ms)   Rate(M/s)  Per Row(ns)  Relative
-      -------------------------------------------------------------------------------------------
-      reference TimSort key prefix array           15546 / 15859         1.6        621.9      1.0X
-      reference Arrays.sort                          2416 / 2446        10.3         96.6      6.4X
-      radix sort one byte                              133 / 137       188.4          5.3    117.2X
-      radix sort two bytes                             255 / 258        98.2         10.2     61.1X
-      radix sort eight bytes                           991 / 997        25.2         39.6     15.7X
-      radix sort key prefix array                    1540 / 1563        16.2         61.6     10.1X
-    */
+  override def benchmark(): Unit = {
+    runBenchmark("radix sort") {
+      sortBenchmark()
+    }
   }
 }
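The main-method pattern this diff moves to can be sketched in isolation. The block below is a minimal, simplified stand-in and not Spark's BenchmarkBase: the class name, the separator width, and the body of `main` are assumptions. Only the environment variable name, the `benchmarks/<this class>-results.txt` location, and the `output`, `benchmark()`, and `runBenchmark` names are taken from the diff.

```scala
import java.io.{File, FileOutputStream, OutputStream}

// Hypothetical, simplified stand-in for the pattern; not Spark's BenchmarkBase.
abstract class MainMethodBenchmarkSketch {
  // None: nothing is captured. Some(stream): results are also written to it.
  var output: Option[OutputStream] = None

  // Subclasses list their benchmark cases here.
  def benchmark(): Unit

  // Writes a banner for one group of cases, then runs the body.
  final def runBenchmark(name: String)(body: => Any): Unit = {
    val separator = "=" * 40 // width is arbitrary in this sketch
    output.foreach(_.write(s"$separator\n$name\n$separator\n\n".getBytes("UTF-8")))
    body
    output.foreach(_.write('\n'))
  }

  def main(args: Array[String]): Unit = {
    // The variable name comes from the doc comment in the diff; the rest of
    // this branch is an assumption about how such a switch could work.
    if (sys.env.get("SPARK_GENERATE_BENCHMARK_FILES").contains("1")) {
      val name = getClass.getName.split('.').last.replace("$", "")
      val file = new File(s"benchmarks/$name-results.txt")
      Option(file.getParentFile).foreach(_.mkdirs())
      output = Some(new FileOutputStream(file))
    }
    try benchmark() finally output.foreach(_.close())
  }
}

// A toy benchmark object in the same shape as SortBenchmark after this PR.
object TinySortSketch extends MainMethodBenchmarkSketch {
  var sortedSample: Array[Long] = Array.empty

  override def benchmark(): Unit = {
    runBenchmark("tiny sort") {
      sortedSample = Array(3L, 1L, 2L).sorted
      output.foreach(_.write(s"sorted ${sortedSample.length} values\n".getBytes("UTF-8")))
    }
  }
}
```

Because the entry point is an ordinary `main`, an object like this can be launched by any runner that accepts a main class, which is what the `runMain` and `spark-submit --class` commands in the doc comment rely on.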