Profile-Guided Optimization (PGO) benchmark report #5
Comments
Thank you! Sounds helpful and like a considerable speedup. Feel free to make a PR adding documentation for it. Or if you don't want to, I can try to do it as well. (I imagine a new Markdown file in the docs folder that can be linked in the README and included in the library's docs module :D). I don't want to take your contribution away :)
Oh, and as an addition: I expect more improvements are still possible by optimizing the source code. Postcard is about 3x faster at serialization right now. It won't be possible to reach that, since we encode more data, but I bet it is possible to halve the time needed (I hope). Just need to learn how to do it :D
I think it would be better if you could create such a document - in this case, it will be written in a consistent way with other pieces of documentation for the project. As a reference, I have several examples of such PGO docs in other projects (applications and libraries): https://github.com/zamazan4ik/awesome-pgo?tab=readme-ov-file#project-specific-documentation-about-pgo . I hope it will be helpful.
Oh, no worries about that at all! I am already happy that one more person is interested in PGO!
Yep, I understand. You can try to get some insights into possible optimizations from PGO too, since you can compare flamegraphs before and after PGO to see the difference (or even compare the resulting assembly/LLVM IR before and after PGO). It could be time-consuming, though. A nice thing about PGO is that all that "boring optimization stuff" is done semi-automatically by the compiler. You focus on high-level optimizations, the compiler does the "dirty" low-level optimization stuff ;)
Alright, it is documented in #6. I would be interested in your feedback if you have the time :) The PR will close this, but it is linked in the docs, so users can find the information :)
Sure! Just did it ;)
Hi!
As I have done many times before, I decided to test the Profile-Guided Optimization (PGO) technique to optimize the library's performance. For reference, results for other projects are available at https://github.com/zamazan4ik/awesome-pgo . Since PGO has helped many other libraries, I decided to apply it to serde-brief to see if a performance win (or loss) can be achieved. Here are my benchmark results. For benchmarks, I used these benchmarks since they were mentioned in the Reddit post.
This information can be interesting for anyone who wants to achieve more performance with the library in their use cases.
Test environment
serde-brief version: the one used in the benchmark above (I guess the latest at the moment)
Benchmark
For PGO optimization I use the cargo-pgo tool. I got the Release bench results with the command
taskset -c 0 cargo bench --no-default-features --features serde-brief
The PGO training phase is done with
taskset -c 0 cargo pgo bench -- --no-default-features --features serde-brief
and the PGO optimization phase with
taskset -c 0 cargo pgo optimize bench -- --no-default-features --features serde-brief
taskset -c 0 is used to reduce the OS scheduler's influence on the results. All measurements are done on the same machine, with the same background "noise" (as much as I can guarantee).
Results
I got the following results:
According to the results, PGO measurably improves the library's performance.
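For anyone who wants to repeat the three runs from the Benchmark section, here is a minimal sketch that wraps them in one script. It assumes cargo-pgo is already installed and that the script is started from the benchmark crate's directory. The DRY_RUN switch and the run helper are hypothetical additions for illustration and are not part of the original report.

```shell
#!/bin/sh
# Sketch of the three benchmark runs from the report.
# Assumptions (not from the report): cargo-pgo is installed, the current
# directory is the benchmark crate, and DRY_RUN is a hypothetical switch
# that only prints the commands unless it is set to 0.
set -eu

DRY_RUN="${DRY_RUN:-1}"

# Pin everything to CPU core 0 to reduce the OS scheduler's influence.
PIN="taskset -c 0"
FEATURES="--no-default-features --features serde-brief"

# Print the command, and execute it only when DRY_RUN is not 1.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" != "1" ]; then
        "$@"
    fi
}

# PIN and FEATURES are left unquoted on purpose so that they split into words.

# 1. Baseline: regular Release benchmarks.
run $PIN cargo bench $FEATURES

# 2. PGO training phase: instrumented benchmarks that collect profiles.
run $PIN cargo pgo bench -- $FEATURES

# 3. PGO optimization phase: benchmarks rebuilt with the collected profiles.
run $PIN cargo pgo optimize bench -- $FEATURES
```

By default the script only prints the three commands, so they can be checked first; running it with DRY_RUN=0 executes them in order. Keeping the pinning and feature flags in one place makes it harder to benchmark the baseline and the PGO build with different settings by accident.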
Further steps
I understand that the steps above can be time-consuming and hard to implement in practice. At the very least, the library's users can find this performance report and decide to enable PGO for their applications if they care about the library's performance in their workloads. Maybe a small note somewhere in the documentation (the README file?) will be enough to raise awareness about this work.
Please don't treat the issue like an actual issue - it's just a benchmark report (since Discussions are disabled for the repo).
Thank you.