Various performance improvements #11
Improves performance when processing pure QUIC traffic by ~20%. Relates to zeek/spicy#1565.
There's still a full packet copy within decrypt_crypto_payload(), but it's one less now.
There should not be a need for the extra copying. hilti::rt::Bytes are mostly std::string and we can pass by const reference as well.
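A minimal sketch of the point above, assuming a plain std::string alias as a stand-in for hilti::rt::Bytes (the alias and all function names are hypothetical, not the analyzer's code). It shows that a by-value parameter gets its own buffer, while a const reference reads the caller's buffer in place:

```cpp
#include <cassert>
#include <string>

// Assumption: stand-in for hilti::rt::Bytes, which the PR describes as
// "mostly std::string". This is not the real runtime type.
using Bytes = std::string;

// By-value parameter: calling this with an lvalue copies the whole buffer,
// so the parameter's storage differs from the caller's.
bool shares_buffer_by_value(Bytes data, const char* caller_buffer) {
    return data.data() == caller_buffer;
}

// By-const-reference parameter: no copy, the callee sees the caller's storage.
bool shares_buffer_by_ref(const Bytes& data, const char* caller_buffer) {
    return data.data() == caller_buffer;
}

// Hypothetical usage: a 1200-byte "packet" passed both ways.
inline void example_usage() {
    Bytes packet(1200, '\x01');
    assert(!shares_buffer_by_value(packet, packet.data())); // copied
    assert(shares_buffer_by_ref(packet, packet.data()));    // not copied
}
```

For packet-sized buffers on a hot path, the by-value variant pays one allocation and one memcpy per call, which is the kind of cost this change avoids.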
No need for the copy.
As before, avoid unnecessary copies of std::vector instances.
...and return hilti::rt::Bytes directly.
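As a rough illustration of returning the byte container directly rather than going through an intermediate std::vector, here is a sketch under stated assumptions: Bytes is a std::string alias standing in for hilti::rt::Bytes, and the XOR transformation is a made-up placeholder for the real packet processing.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Assumption: stand-in for hilti::rt::Bytes; not the real runtime type.
using Bytes = std::string;

// Variant with an intermediate vector: the input is copied into the vector,
// and the vector is copied again into the returned Bytes.
Bytes transform_via_vector(const Bytes& input, std::uint8_t key) {
    std::vector<std::uint8_t> tmp(input.begin(), input.end()); // copy 1
    for (auto& b : tmp)
        b ^= key;
    return Bytes(tmp.begin(), tmp.end()); // copy 2
}

// Direct variant: write into the result once and return it; the return
// itself is a move or is elided, so no further copy happens.
Bytes transform_direct(const Bytes& input, std::uint8_t key) {
    Bytes out(input.size(), '\0');
    for (std::size_t i = 0; i < input.size(); ++i)
        out[i] = static_cast<char>(static_cast<std::uint8_t>(input[i]) ^ key);
    assert(out.size() == input.size());
    return out;
}
```

Both variants produce the same bytes; the direct one simply touches the data once instead of twice.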
d8c6210 to 6408f4d
Now that we do not buffer the packet anymore explicitly, we do not need a should_buffer() method.
I suspect the structure here can be improved, but given we're only interested in the form, replace with an anonymous uint8 field.
There's not much point accumulating it in fields if we're never using it, anyhow.
We only need to copy out the buffer, no need to be overly safe.
Think previously we exported all the symbols :-/
We're not actually using any of the fields, so may as well use skip.
This removes the iterator usage but removes the explicit copy into std::vector<> in favor of using the hilti::rt::Bytes::data() content directly. Hide the reinterpret_cast<> behind a small helper function. And further feedback from Benjamin.
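The helper mentioned above could look roughly like the following sketch. The name data_as_uint8, the Bytes alias (a std::string standing in for hilti::rt::Bytes) and the checksum consumer are all hypothetical; only the idea of wrapping the reinterpret_cast<> around data() comes from the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Assumption: stand-in for hilti::rt::Bytes; not the real runtime type.
using Bytes = std::string;

// Hypothetical helper: the single place where the cast lives. Viewing char
// storage as unsigned bytes is permitted by the aliasing rules.
inline const std::uint8_t* data_as_uint8(const Bytes& b) {
    return reinterpret_cast<const std::uint8_t*>(b.data());
}

// Placeholder for a C-style API taking a pointer and a length, such as a
// crypto routine. Here it just sums the bytes.
std::uint32_t checksum(const std::uint8_t* data, std::size_t len) {
    assert(data != nullptr || len == 0);
    std::uint32_t sum = 0;
    for (std::size_t i = 0; i < len; ++i)
        sum += data[i];
    return sum;
}

// The caller passes the buffer straight through, without first copying it
// into a std::vector<uint8_t>.
std::uint32_t checksum_of(const Bytes& b) {
    return checksum(data_as_uint8(b), b.size());
}
```

Keeping the cast in one small function means call sites stay readable and the pointer is only valid as long as the Bytes object it came from is alive and unmodified.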
@bbannier - I think I adapted to most of what you suggested. Mind adding a ? I'll post some hyperfine numbers in a bit, too. The top two is the current analyzer version in
It takes ~0.8 seconds for many-requests-12000.pcap and ~1.5 seconds for 16-50000000.pcap without the analyzer enabled.
Rough summary:
- skip optimization for padding and bytes fields that are otherwise not used.

On a pcap created with Python's aioquic package containing roughly 12k QUIC connections, this PR reduces runtime from ~18.5 seconds to ~12 seconds. By far the largest impact came from removing the previous &try / backtrack() approach - see also zeek/spicy#1565.