
Automation of performance measurements #212

Open · 2 of 5 tasks
bzz opened this issue Nov 20, 2018 · 12 comments

@bzz (Contributor) commented Nov 20, 2018

This is an umbrella issue for the initial work on automating a performance analysis/regression suite for bblfshd, starting with a baseline benchmark.

Motivation (things reported to be slow):

TODOs:

  • small dataset of some LoC for each recommended driver (the same program from RosettaCode?) - see Dataset for automation of performance measurements #220
  • UAST parsing test suite (to run across gRPC: bblfshd/individual driver, STDIO: native parser)
  • UAST filtering test suite (rudimentary, 1 query)
  • OpenTracing instrumentation of:
    • client-go
    • bblfshd
    • drivers
  • performance regression suite running on Jenkins

Each of the items above is expected to be handled as a separate issue/PR (possibly by different authors).

As this is the initial round of work on performance, there is no expectation of completeness for the test cases - it is more important to have all the pieces in place and the infrastructure up and running.
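
A minimal sketch of what the parsing benchmark suite could look like, assuming a fixtures/<lang>/ layout with one RosettaCode-style sample per driver; the parseFile helper is a hypothetical placeholder for the actual gRPC/STDIO call, not an existing API:

```go
package perf_test

import (
	"context"
	"io/ioutil"
	"path/filepath"
	"testing"
)

// parseFile is a placeholder for the real call (gRPC to bblfshd, gRPC to an
// individual driver, or STDIO to the native parser). It is hypothetical and
// only marks where that call would go.
func parseFile(ctx context.Context, lang string, src []byte) error {
	return nil
}

// BenchmarkParse runs one sample program per language through the parser.
func BenchmarkParse(b *testing.B) {
	// Assumed dataset layout: fixtures/<lang>/sample.* with one
	// sample program per recommended driver.
	langs := []string{"go", "java", "python"}
	for _, lang := range langs {
		files, err := filepath.Glob(filepath.Join("fixtures", lang, "sample.*"))
		if err != nil || len(files) == 0 {
			b.Logf("no fixture for %s, skipping", lang)
			continue
		}
		src, err := ioutil.ReadFile(files[0])
		if err != nil {
			b.Fatal(err)
		}
		b.Run(lang, func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				if err := parseFile(context.Background(), lang, src); err != nil {
					b.Fatal(err)
				}
			}
		})
	}
}
```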

@tsolakoua

I would like to focus on the small dataset of some LoC for each recommended driver for this Monday's OSD. I will create a separate issue for it, which can be assigned to me.

@bzz (Contributor, Author) commented Nov 27, 2018

For context: the UAST perf measurements on the gitbase side (src-d/gitbase#606) hit #209.

It would be nice to generate at least a similar load in our baseline and see how far it can be stretched from there.

@dennwc (Member) commented Dec 3, 2018

The new SDK (v2.12.0+) will generate a benchmark report (bench.txt) during bblfsh-sdk test -b.

I will now update all drivers that include benchmark fixtures (many thanks to @tsolakoua!).

It won't be enabled in CI for obvious reasons (shared instances), so we still need some infrastructure to run it.
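
Assuming bench.txt follows the standard go test -bench output format, here is a rough sketch of how such a report could be consumed later in the pipeline, using the golang.org/x/tools/benchmark/parse package (the file path is just an example):

```go
package main

import (
	"fmt"
	"log"
	"os"

	"golang.org/x/tools/benchmark/parse"
)

func main() {
	// bench.txt is the report produced by `bblfsh-sdk test -b`;
	// the path here is only an example.
	f, err := os.Open("bench.txt")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	set, err := parse.ParseSet(f) // map[string][]*parse.Benchmark
	if err != nil {
		log.Fatal(err)
	}
	for name, runs := range set {
		for _, b := range runs {
			fmt.Printf("%s: %d iterations, %.2f ns/op\n", name, b.N, b.NsPerOp)
		}
	}
}
```

The same parsing step could later feed whichever metrics backend ends up being chosen.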

@tsolakoua

Next Monday is OSD and I could continue on this, since I have finished with the benchmark fixtures. However, I don't fully understand the next steps, so I might need some support to get started.

@bzz (Contributor, Author) commented Dec 4, 2018

> It won't be enabled in CI for obvious reasons (shared instances), so we still need some infrastructure to run it.

\cc @smola, as AFAIK he was working on a Jenkins setup.

@smola (Member) commented Dec 5, 2018

Watch https://github.com/src-d/backlog/issues/1307
We will have a Jenkins instance with a bare metal server dedicated to performance tests. It will be ready soon.

It will be driven by a Jenkinsfile (see the docs). I'll provide an example that works with our setup.

@smola (Member) commented Dec 7, 2018

We already have the Jenkins deployment; soon you'll have the borges pipeline as an example from which to develop your own.

@bzz (Contributor, Author) commented Dec 12, 2018

Linking in some instructions on using Jenkins for perf testing https://src-d.slack.com/archives/C0J8VQU0K/p1544633659068100

@dennwc (Member) commented Jul 1, 2019

@lwsanty will continue to work on this, as discussed on Slack.

Specifically, we have a set of Go benchmarks in each driver which can be run using go test -run=NONE -bench=. ./driver/.... These benchmarks don't need a compiled driver, only the Go source and the data in the ./fixtures directory. They only profile the driver's Native AST -> Semantic UAST transformation pipeline, not the driver itself. We also have a tool to benchmark a fully compiled driver (parsing + protocol overhead + transformation), but it may be harder to set up at first.

I think a good first step might be to set up our Jenkins instance to run these Go benchmarks for each driver, either every few days or on each commit to the driver's master branch. Later we can expand it by pulling/building a Docker image, benchmarking it with and without bblfshd, etc. But for now, having performance stats for UAST transforms is super useful on its own.
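
Not an existing tool, just a rough sketch of what the Jenkins job could execute per driver checkout: run the Go benchmarks and keep the raw output for later processing (the driver names and output paths are assumptions):

```go
package main

import (
	"fmt"
	"log"
	"os"
	"os/exec"
	"path/filepath"
)

func main() {
	// Hypothetical local checkouts of the driver repositories.
	drivers := []string{"go-driver", "java-driver", "python-driver"}

	for _, d := range drivers {
		// Equivalent to: go test -run=NONE -bench=. -benchmem ./driver/...
		cmd := exec.Command("go", "test", "-run=NONE", "-bench=.", "-benchmem", "./driver/...")
		cmd.Dir = d

		out, err := cmd.CombinedOutput()
		if err != nil {
			log.Printf("%s: benchmarks failed: %v", d, err)
		}

		// Keep the raw `go test -bench` output for later parsing/upload.
		if err := os.MkdirAll("results", 0o755); err != nil {
			log.Fatal(err)
		}
		outFile := filepath.Join("results", d+".bench.txt")
		if err := os.WriteFile(outFile, out, 0o644); err != nil {
			log.Fatal(err)
		}
		fmt.Println("wrote", outFile)
	}
}
```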

@lwsanty (Member) commented Jul 2, 2019

Following up on the previous comment, I propose to approach this the same way it was done for borges (regression-borges).
Things that need to be done:

  • create a separate repo in bblfsh; we can name it performance-driver. It will contain a utility and a container for running the benchmarks, parsing the output, and propagating the results to metrics services (Prometheus/InfluxDB + Grafana) running in k8s.
    blockers
    • need access to create this repo, or to request its creation
    • need access to the srcd Docker registry
  • need to ask the Infra team to launch the metrics services in k8s
  • (optional) configure additional notification methods via Slack/email
    blockers
    • need admin access to Jenkins for me; I've already made the request
    • need a Slack token
    • need a Slack channel
    • need a service email

@smola @dennwc @bzz
It would be great to have feedback on this proposal.
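
A minimal sketch of the "parse the output and propagate the results" step, under the assumption that we go with a Prometheus Pushgateway; the metric name, job name, URL, and file path below are placeholders, not part of any existing bblfsh tool:

```go
package main

import (
	"log"
	"os"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/push"
	"golang.org/x/tools/benchmark/parse"
)

func main() {
	f, err := os.Open("results/go-driver.bench.txt") // hypothetical path
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	set, err := parse.ParseSet(f)
	if err != nil {
		log.Fatal(err)
	}

	// One gauge per benchmark, labeled by benchmark name.
	nsPerOp := prometheus.NewGaugeVec(prometheus.GaugeOpts{
		Name: "driver_benchmark_ns_per_op", // metric name is an assumption
		Help: "go test -bench ns/op per benchmark",
	}, []string{"benchmark"})

	for name, runs := range set {
		for _, b := range runs {
			nsPerOp.WithLabelValues(name).Set(b.NsPerOp)
		}
	}

	// Push to a Pushgateway; the URL and job name are placeholders.
	if err := push.New("http://pushgateway:9091", "bblfsh_driver_bench").
		Collector(nsPerOp).
		Push(); err != nil {
		log.Fatal(err)
	}
}
```

Grafana could then read these series from Prometheus, with InfluxDB as an alternative backend if preferred.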

@bzz (Contributor, Author) commented Jul 2, 2019

Overall looks good!

> blockers

JFYI: repository creation, as well as other ACL bits, is handled by Infra, where the appropriate issues have to be filed as soon as there is consensus.


Before doing that, shall we briefly discuss what kind of performance regression dashboard we want to have at the end?

E.g. from the repository naming proposal above, I figure we are talking about individual drivers' "internal" performance benchmarks.

I think it would be really useful to include the following in the same dashboard:

  1. individual driver benchmark results (no need to actually run the full driver, just go test -bench=.)
    from the repository naming proposal above, I presume the initial implementation targets this
  2. each driver's performance under some pre-defined workload (through gRPC, with only a driver container running; see the sketch at the end of this comment)
  3. bblfshd performance under the same pre-defined workload (gRPC, the whole bblfshd)
  4. bblfshd performance under the same workload, run through different clients (breakdown by client)

Maybe this would require turning the current issue into an ☂️ and handling each of those individually through new, smaller issues in order of priority.

I believe this way all of these could live in the same repository, e.g. bblfsh/performance, be re-run by Jenkins on every release of bblfshd (manually triggered by tag name?), and give us (the maintainers) an accurate picture of expected performance and any possible regressions.

Last but not least: for me, notifications are much less of a priority compared to having such a "dashboard".

Given the requirements above, I'm not sure how much of regression-borges can be productively reused - AFAIK it consumes a single binary, but in our case individual drivers do not have binary release artifacts, and we would need to start containers instead (in some cases).

Also, AFAIK regression-borges is mainly focused on outputting a CSV comparing N versions of the same binary, whereas in our case it is more about populating some dashboard (Grafana+ES?) with metrics from different tools.

And for 2-4, I'm not 100% sure, but I think we might be able to re-use some of the prior work, e.g.:

@dennwc @creachadair WDYT? BTW, maybe it would be productive to schedule a quick call about this at some point.
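
For items 2-4, a rough sketch of what a pre-defined workload runner could look like: it issues parse requests at a fixed concurrency and reports simple latency percentiles. The parseOnce function is a hypothetical placeholder for the actual gRPC call (via client-go to bblfshd, or directly to a driver container), not an existing API:

```go
package main

import (
	"context"
	"fmt"
	"sort"
	"sync"
	"time"
)

// parseOnce stands in for a single parse request over gRPC.
// It is a hypothetical placeholder, not an existing API.
func parseOnce(ctx context.Context, lang string, src []byte) error {
	return nil
}

func main() {
	const (
		workers  = 8    // assumed concurrency of the pre-defined workload
		requests = 1000 // assumed total number of parse requests
	)
	src := []byte("package main\n\nfunc main() {}\n") // sample payload

	latencies := make([]time.Duration, requests)
	var wg sync.WaitGroup
	jobs := make(chan int)

	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				start := time.Now()
				_ = parseOnce(context.Background(), "go", src)
				latencies[i] = time.Since(start)
			}
		}()
	}
	for i := 0; i < requests; i++ {
		jobs <- i
	}
	close(jobs)
	wg.Wait()

	sort.Slice(latencies, func(i, j int) bool { return latencies[i] < latencies[j] })
	fmt.Println("p50:", latencies[len(latencies)/2])
	fmt.Println("p95:", latencies[len(latencies)*95/100])
}
```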

@dennwc (Member) commented Jul 2, 2019

👍 for scheduling a call.

bblfsh/performance sounds like a good name.

Agree about the notifications - they are not that important. The MVP for me is a dashboard with go test -bench=. benchmarks for each driver, even without gRPC/bblfshd. We can't really optimize native parsing, and we can't lightly change the protocol to reduce overhead. The only actionable item is the optimization of UAST transforms or the DSL, which will be monitored by the mentioned Go benchmarks. And clients, of course, but that's out of scope for the MVP :)

For the dashboard itself, I'm not sure what is considered "standard" right now, but I definitely don't want Jenkins dashboards - those are static and ugly. I propose to use a pair of Grafana + Influx/ES/whatever if there are no better options. Grafana also provides "alarms", so we can set up notifications later (if needed).

Re "single dashboard from multiple tools": as @lwsanty mentioned, we may need to consult the Infra team to find out if we can reuse our Grafana instance in the pipeline cluster. We may need a separate one because of the isolation between clusters.
