SPARK-25299: add CI infrastructure and SortShuffleWriterBenchmark #498
- *link-in-build-sbt-cache
- *restore-ivy-cache
- *restore-build-binaries-cache
- *restore-home-sbt-cache
Not sure if I'm using a superset or a subset of the useful caches, or some other random combination. I mostly copied the ones that seemed related to sbt, because that's what I'm running in this step.
…on this branch" This reverts commit 13703fa.
================================================================================================
SortShuffleWriter writer
================================================================================================
Java HotSpot(TM) 64-Bit Server VM 1.8.0_131-b11 on Linux 4.4.0-96-generic
Intel(R) Xeon(R) CPU @ 2.30GHz
SortShuffleWriter without spills:         Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
small dataset without spills                          9             14           5          0.1        8722.2       1.0X
Java HotSpot(TM) 64-Bit Server VM 1.8.0_131-b11 on Linux 4.4.0-96-generic
Intel(R) Xeon(R) CPU @ 2.30GHz
SortShuffleWriter with spills:            Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
no map side combine                               18639          18960         359          0.4        2777.4       1.0X
with map side aggregation                         18622          19117         928          0.4        2774.9       1.0X
with map side sort                                18792          18979         155          0.4        2800.2       1.0X
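As a reading aid for the table above, the sketch below back-computes the row count and rate from the reported columns. It assumes that `Per Row(ns)` is the best time divided by the number of rows, and that `Rate(M/s)` is millions of rows per second derived from `Best Time(ms)`. The function names are illustrative and not part of the benchmark code.

```python
def implied_rows(best_time_ms: float, per_row_ns: float) -> float:
    """Row count implied by a best time and a per-row cost (assumed relation)."""
    return best_time_ms * 1e6 / per_row_ns


def rate_millions_per_sec(rows: float, best_time_ms: float) -> float:
    """Rows processed per second, expressed in millions."""
    return rows / (best_time_ms / 1000.0) / 1e6


# Values copied from the "no map side combine" row above.
rows = implied_rows(18639, 2777.4)
rate = rate_millions_per_sec(rows, 18639)
print(f"implied rows: {rows:,.0f}")
print(f"rate: {rate:.1f} M/s")
```

Under these assumptions the spill cases work out to roughly 6.7 million rows each, which is consistent with the 0.4 M/s shown in the rate column.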
ping @squito @vanzin @ifilonenko: was hoping to get your eyes on this (as well as #507 and #508) before we merge to the feature branch. Thanks! Edit: Also, updated https://docs.google.com/document/d/15W5qjpGQqxlZjdLHxSBYT8tRSd1DOPK3mNQgAGijDco/edit# with some results from CircleCI vs AWS. There seems to be noticeable variance for both, but the range is not ridiculous, so I think it's still reasonable to use as a benchmark, so long as we run the CircleCI tests often with small, frequent PRs, and possibly for longer on AWS.
I'm not sure which part you're asking for us to review?
For the latter my only concern is that the benchmark seems to be reading from the same disk it's writing to, so you may get variable results depending on how the underlying disk and file system behaves.
I can handle the CI stuff, and that is only for our fork of Spark seeing that we certainly will be using Jenkins against upstream. I don't think we even need the benchmarking to be run in upstream, as on Jenkins the resource isolation is more or less nonexistent so we can't get any consistency. I think everything else is fair game to review.
@yifeih do we have any way to check what machine and disk type we get, and whether or not we can make that consistent?
@vanzin thanks for the comments. I think I was mostly looking for the last point about the benchmarks, and you bring up a good point. I'm looking into whether we can get more info on the machine and disk type, but I think I might need to be an admin on our CircleCI account to lock it down to a specific instance type. I'll check with the internal team that manages this tomorrow.
We discovered that the hardware is subject to change at CircleCI's discretion. In the majority of cases it should remain consistent, and in the cases where we notice significant deviations we can just run the benchmarks locally before and after a patch to verify. I think it's still worthwhile to have this feature branch run the benchmarks, with the caveat that we'll take the results with a grain of salt.
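One way to make "significant deviations" concrete is to compare the gap between two runs against their run-to-run noise. The sketch below is only an illustration: the 3-sigma threshold and the function names are assumptions, not something agreed on in this thread. The sample numbers are the average and stdev of the "no map side combine" case from the two result sets posted in this PR.

```python
import math


def coefficient_of_variation(avg_ms: float, stdev_ms: float) -> float:
    """Relative spread of a benchmark case: stdev divided by mean."""
    return stdev_ms / avg_ms


def is_significant(avg_a, stdev_a, avg_b, stdev_b, sigmas=3.0):
    """True if the gap between two averages exceeds `sigmas` combined stdevs.

    The 3-sigma default is an arbitrary, illustrative threshold.
    """
    combined = math.sqrt(stdev_a ** 2 + stdev_b ** 2)
    return abs(avg_a - avg_b) > sigmas * combined


# "no map side combine": (avg ms, stdev ms) from the two result sets in this PR.
first = (18960, 359)
second = (14203, 122)
print(f"CV first run:  {coefficient_of_variation(*first):.1%}")
print(f"CV second run: {coefficient_of_variation(*second):.1%}")
print("significant gap:", is_significant(*first, *second))
```

Note that the two result sets were recorded on different kernels (4.4.0-96-generic vs 4.15.0-1014-gcp), so a gap flagged this way may reflect the environment rather than a code change. That is exactly the case where a local before/after run would be the tie-breaker.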
I think the main part of my comment was whether you could run the benchmarks so that data is read from a different drive than it's written to. Otherwise you may mostly be measuring the drive seek speed...
I'm not entirely sure I follow. Assuming the drive stays consistent between runs, we should only be benchmarking differences in the code logic we introduce, which might include accessing the disk in more or less optimal ways between runs. But is there potential variance based on the behavior of the FS cache? We're not performing any disk reads in this benchmark as of yet, because the data is generated in memory, on the heap, before we start the timer.
Maybe I'm misreading the code, but isn't the iterator that feeds the shuffle writer reading from disk? (see |
Ah, I misread. I spoke offline to @yifeih and we can generate data on the fly without putting it on disk.
That would mean the benchmark also includes the time it takes to generate the data. Hopefully the RNG doesn't add too much variance. We'd still use a fixed random seed, of course.
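A minimal sketch of generating records on the fly from a fixed seed, written in Python for brevity rather than in the Scala of the actual benchmark. The function name, record shape, and sizes are hypothetical. The point is only that a seeded generator yields the same sequence on every run without touching disk.

```python
import random
from typing import Iterator, Tuple


def generate_records(num_records: int, seed: int = 42,
                     value_len: int = 16) -> Iterator[Tuple[int, bytes]]:
    """Lazily yield (key, value) pairs from a private, seeded RNG."""
    rng = random.Random(seed)  # private instance, unaffected by global random state
    for _ in range(num_records):
        key = rng.randrange(1 << 31)
        value = bytes(rng.getrandbits(8) for _ in range(value_len))
        yield key, value


# Two passes with the same seed produce identical input for the writer.
run_a = list(generate_records(1000))
run_b = list(generate_records(1000))
print(run_a == run_b)
```

Because records are produced lazily, the generation cost lands inside the timed section, which is the trade-off described above.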
…to yh/add-benchmarks-and-ci
================================================================================================
SortShuffleWriter writer
================================================================================================
Java HotSpot(TM) 64-Bit Server VM 1.8.0_131-b11 on Linux 4.15.0-1014-gcp
Intel(R) Xeon(R) CPU @ 2.30GHz
SortShuffleWriter without spills:         Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
small dataset without spills                         10             15           5          0.1        9729.9       1.0X
Java HotSpot(TM) 64-Bit Server VM 1.8.0_131-b11 on Linux 4.15.0-1014-gcp
Intel(R) Xeon(R) CPU @ 2.30GHz
SortShuffleWriter with spills:            Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
no map side combine                               14046          14203         122          0.5        2093.1       1.0X
with map side aggregation                         13971          14066         101          0.5        2081.9       1.0X
with map side sort                                14007          14062          72          0.5        2087.2       1.0X
See https://docs.google.com/document/d/1NQW1XgJ6bwktjq5iPyxnvasV9g-XsauiRRayQcGLiik/edit#