Skip to content
This repository has been archived by the owner on Jan 20, 2022. It is now read-only.

Add performance tests for flatMap operation in OperationContainer #740

Open
wants to merge 4 commits into
base: develop
Choose a base branch
from

Conversation

ttim
Copy link
Collaborator

@ttim ttim commented Jul 6, 2017

This PR adds tests for flatMap / flatMap composition in context of OperationContainer. Tests are implemented using JMH.

Tests are not ideal, but they give us some estimation of costs our abstractions introduce.
Most interesting overheads: IntermediateFlatMap with FlatMapOperation introduces about 1s per 1KK operation overhead. FlatMapOperation composition adds another ~1-2s per 1KK.

@ttim ttim requested review from johnynek and pankajroark July 6, 2017 22:51
@codecov-io
Copy link

codecov-io commented Jul 6, 2017

Codecov Report

Merging #740 into develop will decrease coverage by 2.1%.
The diff coverage is 0%.

Impacted file tree graph

@@            Coverage Diff             @@
##           develop    #740      +/-   ##
==========================================
- Coverage    72.21%   70.1%   -2.11%     
==========================================
  Files          153     158       +5     
  Lines         3736    3824      +88     
  Branches       204     205       +1     
==========================================
- Hits          2698    2681      -17     
- Misses        1038    1143     +105
Impacted Files Coverage Δ
...d/online/executor/ComposedFlatMapPerformance.scala 0% <0%> (ø)
...er/summingbird/online/executor/SimpleFlatMap.scala 0% <0%> (ø)
.../executor/ComposedFlatMapWithTimePerformance.scala 0% <0%> (ø)
...witter/summingbird/online/executor/TestUtils.scala 0% <0%> (ø)
...d/online/executor/IdentityFlatMapPerformance.scala 0% <0%> (ø)
...ain/scala/com/twitter/summingbird/TestGraphs.scala 77.7% <0%> (-10.83%) ⬇️
...ala/com/twitter/summingbird/scalding/Service.scala 73.84% <0%> (-4.62%) ⬇️
... and 4 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fecdfa3...f3e4845. Read the comment docs.

Copy link
Collaborator

@johnynek johnynek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like composition in these operations is costly.

Note, the optimizer rules could have composed these things for us. I don't know why we wouldn't want to use them.

The main issue was that we didn't work out tracking the naming through the compositions correctly before. But if we can use the optimizer rules we can solve these problems once and use it in scalding, memory, concurrent memory, storm, etc...

}
}
while (left != 0) {
Thread.sleep(TickDelay.toMillis)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty skeptical we are comparing apples to apples when we compare this to non OperationContainers.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm skeptical as well, but I wanted to see at least some number for non OperationContainer execution. In the end I don't think it should be anyhow actionable.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thread.sleep adds noise to the result. The blackhole provides a method to simulate CPU usage. I think it's consumeCpu

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I've just noticed that this is actually to check when it's completed. Maybe you could use a CountDownLatch for it?

None

override def execute(state: S,data: I): TraversableOnce[(Chain[S], Try[TraversableOnce[O]])] =
Some((Chain.single(state), Success(f(data))))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you might put Try(f(data)) here just to catch any issues with the f that could exist since we have try anyway.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@johnynek
Copy link
Collaborator

johnynek commented Jul 7, 2017

I'm fine with merging this, but should we add a test that the SimpleFlatMap is identical in performance to the IntermediateFlatMap? Someone might use it as is, and currently it is untested.

Is this a candidate for running in our actual planner, or just a limit to shoot for?

@ttim
Copy link
Collaborator Author

ttim commented Jul 7, 2017

@johnynek According to tests SimpleFlatMap is much more performant than IntermediateFlatMap =)
Anyway I moved it to perf module for now (and made it private). Let's see later do we want to use it instead of IntermediateFlatMap in special cases.

@johnynek
Copy link
Collaborator

johnynek commented Jul 7, 2017

Sorry, I meant identical in function, I realize it is much faster.

@ttim
Copy link
Collaborator Author

ttim commented Jul 7, 2017

@johnynek yeah =) issue is - it's not identical because our FlatMapOperation able to handle leftJoin in a same way as flatMaps, so it covers only simple cases.
I guess it's safe to merge, because it's private from external users for now.


/**
* Command to run (should be executed from sbt with project summingbird-online-perf):
* jmh:run -i 10 -wi 10 -f1 -t1 ComposedFlatMapPerformance
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest adding these options:

-prof gc to enable the GC profiler
-gc to force a GC between runs. Without it, 10 warmup and actual iterations might not be enough to exclude GC noise
-foe true to fail the benchmark if the code fails. I've seen benchmarks that only test that the benchmark is wrong :)
-jvmArgs "-Xms1G -Xmx1G" to make sure that the allocated memory is the same across different dev environments

@BenchmarkMode(Array(Mode.AverageTime))
@OutputTimeUnit(TimeUnit.MILLISECONDS)
def measureDirect(bh: Blackhole): Unit =
input.foreach(composed.apply(_).foreach(bh.consume))
Copy link

@fwbrasil fwbrasil Jul 7, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function allocation here (composed.apply(_)) introduces some noise. Maybe it's worth extracting it to a class val

var left = inputs.length
for (
input <- inputs;
emitted <- container.execute(InputState[Int](1), input)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have a sense of how expensive container.execute is? I ask because testOperationContainer introduces some noise so its overhead could dominate the final result.

@johnynek
Copy link
Collaborator

@ttim we are starting to really push on summingbird performance.

Any more work here? Would it make sense to have a video sync sometime soon as we are currently investing in summingbird at Stripe (but also considering if that makes sense or to shift investment somewhat).

If we can share effort, that is a big win. If we each are just going to work privately on our forks maybe each should make their own call.

Thoughts @pankajroark ?

@pankajroark
Copy link
Contributor

@johnynek Great to hear about Summingbird performance push. Happy to set up a video sync. As you've seen we've made some optimizations and we wanted to make more. Unfortunately we have to focus resources on other projects for the time being, but we plan to come back and make more optimizations. It will be great to learn about what you've planned. Let me sync up with you offline to set up a time for video sync.

@johnynek
Copy link
Collaborator

@pankajroark sounds good. I replied offline about a meeting time.

@CLAassistant
Copy link

CLAassistant commented Nov 16, 2019

CLA assistant check
All committers have signed the CLA.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants