-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Walk: skip path creation in wildcard assignment #6267
Conversation
✅ Deploy Preview for openpolicyagent ready!
To edit notification comments on pull requests, go to your Netlify site configuration. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really nice change! 👏
10d57a8
to
84f62b4
Compare
@@ -0,0 +1,28 @@ | |||
cases: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the sake of completeness, consider also creating a test case where we don't pass the output as an arg, but use the unification infix: e.g. [_, value] = walk(obj)
. Don't think that makes much difference to the builtin, but it wouldn't hurt.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could also throw in a
path = _
[path, value] = walk(a)
that might actually make it past the optimization. But if so, it's an edge case that I don't think warrants extra explanation in the builtin description.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked using the opa-explorer, and the compiler rewrites:
foo["bar"] {
[path, value] := walk(input)
}
to
foo["bar"] {
__local3__ = input; walk(__local3__, __local2__)
[__local0__, __local1__] = __local2__
}
So it'd be the same thing eventually either way. Other tests seem to cover that since before.
topdown/walk.go
Outdated
@@ -70,6 +80,31 @@ func walk(filter, path *ast.Array, input *ast.Term, iter func(*ast.Term) error) | |||
return nil | |||
} | |||
|
|||
func walkNoPath(path, input *ast.Term, iter func(*ast.Term) error) error { | |||
if err := iter(ast.ArrayTerm(path, input)); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of passing path
as an argument, could we simplify the function signature of walkNoPath
, and just pass a constant empty array term here? Since we know the path member won't be used, maybe we can even get away with passing in ast.Null
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright. I've updated to pass an empty array term from a package var.
We do a lot of `walk`-ing in [Regal](https://docs.styra.com/regal). So much that it's by far the single most expensive operation. That means any optimization of the `walk` built-in function will be a win for us. Seeing as we rarely make use of the `path` component when `walk`-ing through AST inputs, I was curious to see if there was any optimization we could take when the path is a wildcard assignment, and as such clearly marked as unimportant. Turns out there is. Since the return value is provided in the operators list, we can check the value provided for the `path` part of the assigned array, and if it's a wildcard (`_`) skip path construction entirely. Example: ```rego walk(input, [_, value]) ``` This greatly simplifies the walk, and the performance gains are substantial. Traversing a ~7MB AST: **main** ```shell $ opa bench -d p.rego -i objects.json data.p.w +-------------------------------------------+---------------+ | samples | 6 | | ns/op | 168806625 | | B/op | 197364318 | | allocs/op | 3855327 | | histogram_timer_rego_query_eval_ns_75% | 169968114 | | histogram_timer_rego_query_eval_ns_90% | 170513459 | | histogram_timer_rego_query_eval_ns_95% | 170513459 | | histogram_timer_rego_query_eval_ns_99% | 170513459 | | histogram_timer_rego_query_eval_ns_99.9% | 170513459 | | histogram_timer_rego_query_eval_ns_99.99% | 170513459 | | histogram_timer_rego_query_eval_ns_count | 6.00 | | histogram_timer_rego_query_eval_ns_max | 170513459 | | histogram_timer_rego_query_eval_ns_mean | 168789611 | | histogram_timer_rego_query_eval_ns_median | 168924020 | | histogram_timer_rego_query_eval_ns_min | 166685000 | | histogram_timer_rego_query_eval_ns_stddev | 1239390 | +-------------------------------------------+---------------+ ``` **no-path-walk** ```shell $ opa bench -d p.rego -i objects.json data.p.w +-------------------------------------------+--------------+ | samples | 21 | | ns/op | 50629984 | | B/op | 38018790 | | allocs/op | 1025211 | | histogram_timer_rego_query_eval_ns_75% | 51239562 | | histogram_timer_rego_query_eval_ns_90% | 51540933 | | histogram_timer_rego_query_eval_ns_95% | 51674420 | | histogram_timer_rego_query_eval_ns_99% | 51688208 | | histogram_timer_rego_query_eval_ns_99.9% | 51688208 | | histogram_timer_rego_query_eval_ns_99.99% | 51688208 | | histogram_timer_rego_query_eval_ns_count | 21.0 | | histogram_timer_rego_query_eval_ns_max | 51688208 | | histogram_timer_rego_query_eval_ns_mean | 50611103 | | histogram_timer_rego_query_eval_ns_median | 50871459 | | histogram_timer_rego_query_eval_ns_min | 49518833 | | histogram_timer_rego_query_eval_ns_stddev | 748688 | +-------------------------------------------+--------------+ ``` The real-world impact is not as dramatic, since we aren't *just* walking, but normally need to actually **do** something with the values returned, but consistently shaving off about 13% eval time when linting one of the largest policy libraries isn't bad at all: **Regal main** ```shell go run main.go lint ~/tmp/kics/assets 162.16s user 6.04s system 593% cpu 28.362 total ``` **Regal walk-no-path** ```shell go run main.go lint ~/tmp/kics/assets 145.51s user 5.01s system 597% cpu 25.176 total ``` Signed-off-by: Anders Eknert <[email protected]>
84f62b4
to
362bb0b
Compare
This brings: * optimizations to [walk](open-policy-agent/opa#6267) * optimizations for [arithmetic](open-policy-agent/opa#6262) ops Both are nice to have in the next release. Signed-off-by: Anders Eknert <[email protected]>
This brings: * optimizations to [walk](open-policy-agent/opa#6267) * optimizations for [arithmetic](open-policy-agent/opa#6262) ops Both are nice to have in the next release. Signed-off-by: Anders Eknert <[email protected]>
This brings: * optimizations to [walk](open-policy-agent/opa#6267) * optimizations for [arithmetic](open-policy-agent/opa#6262) ops Both are nice to have in the next release. Signed-off-by: Anders Eknert <[email protected]>
This brings: * optimizations to [walk](open-policy-agent/opa#6267) * optimizations for [arithmetic](open-policy-agent/opa#6262) ops Both are nice to have in the next release. Signed-off-by: Anders Eknert <[email protected]>
We do a lot of
walk
-ing in Regal. So much that it's by far the single most expensive operation. That means any optimization of thewalk
built-in function will be a win for us.Seeing as we rarely make use of the
path
component whenwalk
-ing through AST inputs, I was curious to see if there was any optimization we could take when the path is a wildcard assignment, and as such clearly marked as unimportant. Turns out there is. Since the return value is provided in the operators list, we can check the value provided for thepath
part of the assigned array, and if it's a wildcard (_
) skip path construction entirely. Example:This greatly simplifies the walk, and the performance gains are substantial. Traversing a ~7MB AST:
main
no-path-walk
The real-world impact is not as dramatic, since we aren't just walking, but normally need to actually do something with the values returned, but consistently shaving off about 13% eval time when linting one of the largest policy libraries isn't bad at all:
Regal main
go run main.go lint ~/tmp/kics/assets 162.16s user 6.04s system 593% cpu 28.362 total
Regal walk-no-path
go run main.go lint ~/tmp/kics/assets 145.51s user 5.01s system 597% cpu 25.176 total