perf: add query optimizer #829
Conversation
nice, the only thing worth considering would be some test cases showing the optimizer works as expected.
I think for samples with many functions and large functions this could make a big difference (see the k32 results).
```python
from capa.features.common import Arch, Bytes, Substring

# ...

def test_optimizer_order():
```
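Such a test might look roughly like this. The `Feature`/`And`/`optimize` helpers used here are the illustrative stand-ins sketched after the PR description below, not the exact names added by this PR:

```python
def test_optimizer_order():
    # build a rule whose cheap check (os) comes last; after optimizing,
    # the cheap check should be moved ahead of the expensive regex check.
    rule = And([Feature("regex", "/https?:/"), Feature("os", "windows")])
    optimize(rule)
    assert [child.kind for child in rule.children] == ["os", "regex"]
```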
great, thanks a lot!
please review and merge #828 and #827 before this.
This PR adds a rule optimizer that re-orders the nodes in the rule logic tree to try simpler/faster cases before complex cases. For example, it prefers OS checks before mnemonic checks before regex checks.
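A minimal sketch of the idea (not the code in this PR; the node classes and cost table below are illustrative stand-ins, not capa's real classes):

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Feature:
    kind: str  # e.g. "os", "mnemonic", "regex"
    value: str


@dataclass
class And:
    children: List[Union["And", "Or", Feature]] = field(default_factory=list)


@dataclass
class Or:
    children: List[Union["And", "Or", Feature]] = field(default_factory=list)


# assumed relative costs: lower means cheaper to evaluate
COSTS = {"os": 0, "arch": 0, "mnemonic": 1, "bytes": 2, "substring": 3, "regex": 4}


def cost(node) -> int:
    if isinstance(node, Feature):
        return COSTS.get(node.kind, 2)
    # a compound node is at least as expensive as its priciest child
    return max((cost(child) for child in node.children), default=0)


def optimize(node):
    """recursively sort children so that cheaper checks are evaluated first,
    letting and/or evaluation short-circuit before reaching expensive checks."""
    if isinstance(node, (And, Or)):
        for child in node.children:
            optimize(child)
        node.children.sort(key=cost)
    return node
```

The key property is that `And`/`Or` evaluation short-circuits, so putting cheap, frequently-failing checks first can avoid evaluating expensive children at all.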
In practice, this seems to make a small but measurable difference in execution time (measured via PMA01-01, 30 iterations).
Note that originally, in 152d0f3, I had the sign of the cost function inverted, so the optimizer was actually a de-optimizer: it picked approximately the worst possible order of evaluation. This led to a 13% increase in feature evaluations, whereas the correct ordering improves evaluation performance by about 2%. The mistake demonstrates that evaluation order can have a substantial impact on performance, though our rules are already fairly well structured (e.g. we typically have OS checks as the first line).
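In terms of the sketch above, the inverted sign would amount to sorting in reverse cost order, something like:

```python
def deoptimize(node):
    # hypothetical: the effect of the inverted cost sign, expressed against
    # the sketch above. the priciest checks run first, so short-circuiting
    # saves the least possible work.
    if isinstance(node, (And, Or)):
        for child in node.children:
            deoptimize(child)
        node.children.sort(key=cost, reverse=True)
    return node
```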
My opinion is that we should probably merge this PR: it provides some performance benefit, the code is very localized, and it doesn't change any of our public APIs/behaviors.
Further perf metrics, using k32 (2 iterations): about 44% faster with 38% fewer feature evaluations.