Performance Testing #776

Closed
benjchristensen opened this issue Jan 22, 2014 · 15 comments

Comments

@benjchristensen
Member

I would like to integrate performance testing as a first-class aspect of rxjava-core in https://github.com/Netflix/RxJava/tree/master/rxjava-core/src/perf

One option is Google Caliper: https://code.google.com/p/caliper/
Another is JMH: http://openjdk.java.net/projects/code-tools/jmh/

Of potential interest, Netty uses JMH: http://netty.io/wiki/microbench-module.html

I have placed some very simple, manual performance tests in the /src/perf folders for now but I'd like to establish the tooling and a few solid examples so we have a pattern to follow.

@benjchristensen
Member Author

/cc @abersnaze as you've been involved in these discussions and you're researching Google Caliper.

@gvsmirnov
Contributor

I would very much recommend using JMH, and not Caliper. The latter has lots and lots of issues, which are addressed in the former. Here's a great presentation about it.

@benjchristensen
Member Author

Thank you for weighing in and sharing that presentation, just read through it, very interesting. Can you point to anything about the issues with Caliper?

@headinthebox
Contributor

@gvsmirnov JMH looks technically pretty impressive, but it does not seem to integrate as nicely as Caliper into an IDE workflow. I could only find some very brief comments about IntelliJ integration on the web; do you know more? Also, as @benjchristensen says, the presentation is super interesting but does not answer the question of why Caliper is not a good choice.

A side question about all this benchmarking stuff is how much it relates to performance in production. When running the benchmarks, you measure things in a very specific way, but in production the code runs in a completely different environment. It sometimes feels to me like measuring calories using a calorimeter (http://en.wikipedia.org/wiki/Calorimeter), which does not really correspond to the actual digestion of food. To state it more formally: is benchmarking monotonic? In other words, does Benchmark(A) < Benchmark(B) imply that InProduction(A) < InProduction(B)?

@gvsmirnov
Contributor

@benjchristensen Unfortunately, as far as I know, there is no article or presentation that explicitly points out all the pitfalls of Caliper. But for most of the common problems (outlined in the presentation), Caliper has no built-in workaround (the last time I checked, at least). The most broken thing about Caliper is that it falls victim to loop unrolling. See here.

JMH is all about taking the trouble off our shoulders, especially the trouble we do not even suspect exists. Many things that are hard to implement in Caliper (like this and that and that) are easy to do in JMH.

@headinthebox Now, regarding IDE support, there is indeed next to none. But I personally hardly ever use an IDE for things like running tests or working with VCSs. Command-line utilities work fine for me. And for JMH, they are much better than your average CLI tool.

@gvsmirnov
Contributor

I have just started a mechanical-sympathy thread that discusses this subject. There will probably be a lot of info there in a couple of days.

@benjchristensen
Member Author

Thank you @gvsmirnov for the information. This is something I hope we'll make a first-class aspect of RxJava in the near future and your information will really help.

Are you interested in helping us bootstrap RxJava with JMH? The rxjava-core/src/perf/ code is wide open right now to set up correctly.

@gvsmirnov
Contributor

@benjchristensen I most definitely am. There are some spare time issues at the moment, though, so I don't think I will be able to contribute for a couple of weeks. Afterwards, I would be happy to.

@benjchristensen
Member Author

I understand that problem! Once you have some time I'd appreciate your help to get us started down the right path.

@abersnaze abersnaze mentioned this issue Feb 7, 2014
@abersnaze
Contributor

Some observations on the difference now that I've actually used both of them:

Caliper
PROS

  • It measures object count and memory usage as well as time.
  • It makes it clear that it is monitoring JIT and GC events during the timing.
  • Parameter annotations make it easier to test different configurations without having to write a method for each combination manually.

CONS

  • Warm-up is a bit of a black box. I've seen the warning that it has detected JIT compilation during measurement often enough that it makes me think it isn't doing enough to warm up the code.
  • It uploads the results!

P.S. I'm not an expert in either benchmarking tool.

@gvsmirnov
Contributor

@benjchristensen Sorry it took me so long, but I'm finally back. I've thrown together a sample Gradle project with JMH support here. Hoping to integrate it with RxJava real soon.

@gvsmirnov
Contributor

Oh, finally! I have sent a pull request (#963) with the updated JMH benchmarking. It features changes both to the Gradle setup and to the benchmark itself.

The Gradle setup is explained in this blog post.

The benchmark has been changed in a way that prevents most of the pitfalls, such as dead-code elimination (DCE), while also giving more accurate results. Please consult these samples to gain deeper insight into how benchmarking should be done with JMH.

Here are the results that I got on my Haswell 2.6 GHz 16 GB RAM laptop with Java 8:

Benchmark                                  (size)   Mode   Samples         Mean   Mean error    Units
r.o.ObservableBenchmark.measureBaseline         1   avgt        10        0.003        0.000    us/op
r.o.ObservableBenchmark.measureBaseline      1024   avgt        10        2.764        0.051    us/op
r.o.ObservableBenchmark.measureBaseline   1048576   avgt        10     3104.088       49.586    us/op
r.o.ObservableBenchmark.measureMap              1   avgt        10        0.100        0.003    us/op
r.o.ObservableBenchmark.measureMap           1024   avgt        10        5.036        0.059    us/op
r.o.ObservableBenchmark.measureMap        1048576   avgt        10     6693.271      277.604    us/op

What we see here is that doing nothing with RxJava introduces about a 2x latency overhead at sizes 1024 and 1048576 compared to simply doing nothing. Pretty acceptable if you ask me.
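
As a rough check on the 2x figure, the ratio of measureMap to measureBaseline can be computed directly from the mean times in the table. This is only a minimal sketch: the class and method names are hypothetical and are not part of the benchmark code in the pull request.

```java
import java.util.Locale;

// Hypothetical helper, not part of RxJava or JMH: computes how many times
// slower measureMap is than measureBaseline, using the table's mean times.
class OverheadRatio {

    // Returns measured / baseline; both arguments are mean times in us/op.
    static double ratio(double measuredUsPerOp, double baselineUsPerOp) {
        if (baselineUsPerOp <= 0.0) {
            throw new IllegalArgumentException("baseline must be positive");
        }
        return measuredUsPerOp / baselineUsPerOp;
    }

    public static void main(String[] args) {
        // Values copied from the results table (Mean column, us/op).
        int[] sizes = {1, 1024, 1048576};
        double[] baseline = {0.003, 2.764, 3104.088};
        double[] map = {0.100, 5.036, 6693.271};

        for (int i = 0; i < sizes.length; i++) {
            double r = ratio(map[i], baseline[i]);
            System.out.println(
                String.format(Locale.ROOT, "size=%d ratio=%.2fx", sizes[i], r));
        }
    }
}
```

With these numbers the ratio comes out at roughly 1.8x for size 1024 and roughly 2.2x for size 1048576. At size 1 it is far larger (around 33x), but the baseline there is reported as 0.003 us/op with only three decimals, so that ratio is too imprecise to read much into.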

@benjchristensen
Member Author

This is great @gvsmirnov Thank you!

Is there a way to maintain historical snapshots over time for getting performance diffs?

@gvsmirnov
Contributor

@benjchristensen you're very welcome.

Uh. I'm not exactly sure if there is an established practice for that. You can easily get JMH to output its results in CSV, SCSV, or JSON. Building historical snapshots should not be a long way from there.

What I'm doing is: before merging anything to master, run the benchmarks on master and on the branch. Works fine for me.
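
The master-versus-branch comparison could be automated along these lines. This is a sketch under stated assumptions, not an established JMH feature: it assumes the scores from each run have already been parsed into maps of benchmark name to average time, and the class name, method name, tolerance, and branch numbers are all hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: flags benchmarks that got slower on a branch.
class BenchmarkDiff {

    // Returns the names of benchmarks whose branch score (average time,
    // lower is better) exceeds the master score by more than `tolerance`,
    // expressed as a fraction (0.10 means 10%).
    static List<String> regressions(Map<String, Double> master,
                                    Map<String, Double> branch,
                                    double tolerance) {
        List<String> slower = new ArrayList<>();
        for (Map.Entry<String, Double> entry : master.entrySet()) {
            Double branchScore = branch.get(entry.getKey());
            if (branchScore == null) {
                continue; // benchmark does not exist on the branch
            }
            if (branchScore > entry.getValue() * (1.0 + tolerance)) {
                slower.add(entry.getKey());
            }
        }
        return slower;
    }

    public static void main(String[] args) {
        // Master scores taken from the table above (size 1024, us/op).
        Map<String, Double> master = new LinkedHashMap<>();
        master.put("measureBaseline", 2.764);
        master.put("measureMap", 5.036);

        // Branch scores are made up purely for illustration.
        Map<String, Double> branch = new LinkedHashMap<>();
        branch.put("measureBaseline", 2.770);
        branch.put("measureMap", 5.900);

        System.out.println(regressions(master, branch, 0.10));
    }
}
```

A fixed percentage tolerance is the simplest possible rule. A more careful comparison would also take the reported mean error of each run into account before calling something a regression.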

@benjchristensen
Member Author

We have JMH integrated and in use, so I'm closing this. Thank you @gvsmirnov for your help on this!
