Walk: skip path creation in wildcard assignment #6267

anderseknert · 2023-10-03T08:10:27Z

We do a lot of walk-ing in Regal. So much that it's by far the single most expensive operation. That means any optimization of the walk built-in function will be a win for us.

Seeing as we rarely make use of the path component when walk-ing through AST inputs, I was curious to see if there was any optimization we could take when the path is a wildcard assignment, and as such clearly marked as unimportant. Turns out there is. Since the return value is provided in the operators list, we can check the value provided for the path part of the assigned array, and if it's a wildcard (_) skip path construction entirely. Example:

walk(input, [_, value]) # path is ignored

This greatly simplifies the walk, and the performance gains are substantial. Traversing a ~7MB AST:

main

$ opa bench -d p.rego -i objects.json data.p.w
+-------------------------------------------+---------------+
| samples                                   |             6 |
| ns/op                                     |     168806625 |
| B/op                                      |     197364318 |
| allocs/op                                 |       3855327 |
| histogram_timer_rego_query_eval_ns_75%    |     169968114 |
| histogram_timer_rego_query_eval_ns_90%    |     170513459 |
| histogram_timer_rego_query_eval_ns_95%    |     170513459 |
| histogram_timer_rego_query_eval_ns_99%    |     170513459 |
| histogram_timer_rego_query_eval_ns_99.9%  |     170513459 |
| histogram_timer_rego_query_eval_ns_99.99% |     170513459 |
| histogram_timer_rego_query_eval_ns_count  |          6.00 |
| histogram_timer_rego_query_eval_ns_max    |     170513459 |
| histogram_timer_rego_query_eval_ns_mean   |     168789611 |
| histogram_timer_rego_query_eval_ns_median |     168924020 |
| histogram_timer_rego_query_eval_ns_min    |     166685000 |
| histogram_timer_rego_query_eval_ns_stddev |       1239390 |
+-------------------------------------------+---------------+

no-path-walk

$ opa bench -d p.rego -i objects.json data.p.w
+-------------------------------------------+--------------+
| samples                                   |           21 |
| ns/op                                     |     50629984 |
| B/op                                      |     38018790 |
| allocs/op                                 |      1025211 |
| histogram_timer_rego_query_eval_ns_75%    |     51239562 |
| histogram_timer_rego_query_eval_ns_90%    |     51540933 |
| histogram_timer_rego_query_eval_ns_95%    |     51674420 |
| histogram_timer_rego_query_eval_ns_99%    |     51688208 |
| histogram_timer_rego_query_eval_ns_99.9%  |     51688208 |
| histogram_timer_rego_query_eval_ns_99.99% |     51688208 |
| histogram_timer_rego_query_eval_ns_count  |         21.0 |
| histogram_timer_rego_query_eval_ns_max    |     51688208 |
| histogram_timer_rego_query_eval_ns_mean   |     50611103 |
| histogram_timer_rego_query_eval_ns_median |     50871459 |
| histogram_timer_rego_query_eval_ns_min    |     49518833 |
| histogram_timer_rego_query_eval_ns_stddev |       748688 |
+-------------------------------------------+--------------+

The real-world impact is not as dramatic, since we aren't just walking, but normally need to actually do something with the values returned, but consistently shaving off about 13% eval time when linting one of the largest policy libraries isn't bad at all:

Regal main

go run main.go lint ~/tmp/kics/assets  162.16s user 6.04s system 593% cpu 28.362 total

Regal walk-no-path

go run main.go lint ~/tmp/kics/assets  145.51s user 5.01s system 597% cpu 25.176 total

netlify · 2023-10-03T08:11:54Z

✅ Deploy Preview for openpolicyagent ready!

Name	Link
🔨 Latest commit	`10d57a8`
🔍 Latest deploy log	https://app.netlify.com/sites/openpolicyagent/deploys/651bcc75c32f9c0008eba904
😎 Deploy Preview	https://deploy-preview-6267--openpolicyagent.netlify.app
📱 Preview on mobile	Toggle QR Code... Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

charlieegan3

Really nice change! 👏

johanfylling · 2023-10-03T09:10:31Z

test/cases/testdata/walkbuiltin/test-walkbuiltin-wildcard-path.yaml

@@ -0,0 +1,28 @@
+cases:


For the sake of completeness, consider also creating a test case where we don't pass the output as an arg, but use the unification infix: e.g. [_, value] = walk(obj). Don't think that makes much difference to the builtin, but it wouldn't hurt.

could also throw in a

path = _ [path, value] = walk(a)

that might actually make it past the optimization. But if so, it's an edge case that I don't think warrants extra explanation in the builtin description.

I checked using the opa-explorer, and the compiler rewrites:

foo["bar"] { [path, value] := walk(input) }

to

foo["bar"] { __local3__ = input; walk(__local3__, __local2__) [__local0__, __local1__] = __local2__ }

So it'd be the same thing eventually either way. Other tests seem to cover that since before.

johanfylling · 2023-10-03T09:25:05Z

topdown/walk.go

@@ -70,6 +80,31 @@ func walk(filter, path *ast.Array, input *ast.Term, iter func(*ast.Term) error)
 	return nil
 }

+func walkNoPath(path, input *ast.Term, iter func(*ast.Term) error) error {
+	if err := iter(ast.ArrayTerm(path, input)); err != nil {


Instead of passing path as an argument, could we simplify the function signature of walkNoPath, and just pass a constant empty array term here? Since we know the path member won't be used, maybe we can even get away with passing in ast.Null here.

Alright. I've updated to pass an empty array term from a package var.

topdown/walk.go

We do a lot of `walk`-ing in [Regal](https://docs.styra.com/regal). So much that it's by far the single most expensive operation. That means any optimization of the `walk` built-in function will be a win for us. Seeing as we rarely make use of the `path` component when `walk`-ing through AST inputs, I was curious to see if there was any optimization we could take when the path is a wildcard assignment, and as such clearly marked as unimportant. Turns out there is. Since the return value is provided in the operators list, we can check the value provided for the `path` part of the assigned array, and if it's a wildcard (`_`) skip path construction entirely. Example: ```rego walk(input, [_, value]) ``` This greatly simplifies the walk, and the performance gains are substantial. Traversing a ~7MB AST: **main** ```shell $ opa bench -d p.rego -i objects.json data.p.w +-------------------------------------------+---------------+ | samples | 6 | | ns/op | 168806625 | | B/op | 197364318 | | allocs/op | 3855327 | | histogram_timer_rego_query_eval_ns_75% | 169968114 | | histogram_timer_rego_query_eval_ns_90% | 170513459 | | histogram_timer_rego_query_eval_ns_95% | 170513459 | | histogram_timer_rego_query_eval_ns_99% | 170513459 | | histogram_timer_rego_query_eval_ns_99.9% | 170513459 | | histogram_timer_rego_query_eval_ns_99.99% | 170513459 | | histogram_timer_rego_query_eval_ns_count | 6.00 | | histogram_timer_rego_query_eval_ns_max | 170513459 | | histogram_timer_rego_query_eval_ns_mean | 168789611 | | histogram_timer_rego_query_eval_ns_median | 168924020 | | histogram_timer_rego_query_eval_ns_min | 166685000 | | histogram_timer_rego_query_eval_ns_stddev | 1239390 | +-------------------------------------------+---------------+ ``` **no-path-walk** ```shell $ opa bench -d p.rego -i objects.json data.p.w +-------------------------------------------+--------------+ | samples | 21 | | ns/op | 50629984 | | B/op | 38018790 | | allocs/op | 1025211 | | histogram_timer_rego_query_eval_ns_75% | 51239562 | | histogram_timer_rego_query_eval_ns_90% | 51540933 | | histogram_timer_rego_query_eval_ns_95% | 51674420 | | histogram_timer_rego_query_eval_ns_99% | 51688208 | | histogram_timer_rego_query_eval_ns_99.9% | 51688208 | | histogram_timer_rego_query_eval_ns_99.99% | 51688208 | | histogram_timer_rego_query_eval_ns_count | 21.0 | | histogram_timer_rego_query_eval_ns_max | 51688208 | | histogram_timer_rego_query_eval_ns_mean | 50611103 | | histogram_timer_rego_query_eval_ns_median | 50871459 | | histogram_timer_rego_query_eval_ns_min | 49518833 | | histogram_timer_rego_query_eval_ns_stddev | 748688 | +-------------------------------------------+--------------+ ``` The real-world impact is not as dramatic, since we aren't *just* walking, but normally need to actually **do** something with the values returned, but consistently shaving off about 13% eval time when linting one of the largest policy libraries isn't bad at all: **Regal main** ```shell go run main.go lint ~/tmp/kics/assets 162.16s user 6.04s system 593% cpu 28.362 total ``` **Regal walk-no-path** ```shell go run main.go lint ~/tmp/kics/assets 145.51s user 5.01s system 597% cpu 25.176 total ``` Signed-off-by: Anders Eknert <[email protected]>

This brings: * optimizations to [walk](open-policy-agent/opa#6267) * optimizations for [arithmetic](open-policy-agent/opa#6262) ops Both are nice to have in the next release. Signed-off-by: Anders Eknert <[email protected]>

charlieegan3 previously approved these changes Oct 3, 2023

View reviewed changes

anderseknert dismissed charlieegan3’s stale review via 84f62b4 October 3, 2023 08:21

anderseknert force-pushed the walk-no-path branch from 10d57a8 to 84f62b4 Compare October 3, 2023 08:21

johanfylling previously approved these changes Oct 3, 2023

View reviewed changes

anderseknert dismissed johanfylling’s stale review via 362bb0b October 3, 2023 10:47

anderseknert force-pushed the walk-no-path branch from 84f62b4 to 362bb0b Compare October 3, 2023 10:47

johanfylling approved these changes Oct 3, 2023

View reviewed changes

anderseknert merged commit 7fa6165 into open-policy-agent:main Oct 3, 2023
24 checks passed

anderseknert deleted the walk-no-path branch October 3, 2023 11:12

anderseknert mentioned this pull request Oct 3, 2023

Change to built-in definitions for next release showing up in docs for the current one #6269

Closed

anderseknert mentioned this pull request Oct 3, 2023

Bump OPA version to current main StyraInc/regal#362

Merged

crunchtime-ali mentioned this pull request Oct 27, 2023

opa 0.58.0 Homebrew/homebrew-core#152526

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Walk: skip path creation in wildcard assignment #6267

Walk: skip path creation in wildcard assignment #6267

anderseknert commented Oct 3, 2023 •

edited

Loading

netlify bot commented Oct 3, 2023

charlieegan3 left a comment

johanfylling Oct 3, 2023

johanfylling Oct 3, 2023

anderseknert Oct 3, 2023

johanfylling Oct 3, 2023

anderseknert Oct 3, 2023

Walk: skip path creation in wildcard assignment #6267

Walk: skip path creation in wildcard assignment #6267

Conversation

anderseknert commented Oct 3, 2023 • edited Loading

netlify bot commented Oct 3, 2023

✅ Deploy Preview for openpolicyagent ready!

charlieegan3 left a comment

Choose a reason for hiding this comment

johanfylling Oct 3, 2023

Choose a reason for hiding this comment

johanfylling Oct 3, 2023

Choose a reason for hiding this comment

anderseknert Oct 3, 2023

Choose a reason for hiding this comment

johanfylling Oct 3, 2023

Choose a reason for hiding this comment

anderseknert Oct 3, 2023

Choose a reason for hiding this comment

anderseknert commented Oct 3, 2023 •

edited

Loading