Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide built-in function (or other mean) to check if collection x includes collection y #4358

Closed
anderseknert opened this issue Feb 17, 2022 · 6 comments · Fixed by #4751
Closed

Comments

@anderseknert
Copy link
Member

Several people have asked me recently how to check if a composite value (i.e. collection) is included in another collection. While this is obviously doable for any composite value type (set, array, object) already, it isn't obvious how to do it!

Think subset but generalized to work with any collection type. Example below using in operator, but this is obviously not working code:

{1, 2} in {1, 3, 2} # true, set so order doesn't matter

["b", "c"] in ["a", "b", "c", "d"] # true, order matters

{"foo": "bar"} in {"a": "b", "foo": "bar"} # true

It would be a useful addition if Rego provided a built-in, or had a current operator extended, for quickly answer these types of "subset" queries:

included := includes({"a": "b", "foo": "bar"}, {"foo": "bar"})

# or 

allow {
    includes(provided_roles, required_roles)
}

When bouncing this idea off of @srenatus , he suggested the proposed elipsis operator matching might be used to accomplish this:

{1, 2, ...} = {1, 3, 2}
[..., "b", "c", ...] = ["a", "b", "c", "d"]
{"foo": "bar", ...} = {"a": "b", "foo": "bar"}

Although it's not as clear how that would work when provided values stored in variables rather than literals like provided above. We also discussed some ideas of extending in, some or every to accomplish this, but I think we're both leaning towards a new built-in function being the easiest way to accomplish this. Definitely open to suggestions though!


Since I realize some people might find this while searching for a solution for how to accomplish this currently, here's some ways you could do it (and I think, pretty good evidence for why this should be made simpler):

Sets

superset := {1, 2, 4, 7}
subset := {2, 7}

included := subset - superset == set()

# or using intersection

included := subset & superset == subset

Arrays

import future.keywords.in

superarr := ["a", "b", "c", "d", "e"]
subarr := ["c", "d"]

# Create array of slices of same length as subarr and check for membership
included := subarr in {a | superarr[i]; a := array.slice(superarr, i, i + count(subarr))}

Objects

superobject := {"a": 1, "c": 3, "b": 2, "nested": {"foo": "bar"}, "arr": [1, 2, {"x": 1}]}
subobject := {"a": 1, "b": 2, "nested": {"foo": "bar"}, "arr": [1, 2, {"x": 1}]}

path_matches[match] {
    [path, value] := walk(subobject)
    not is_object(value)
    
    # Check that for any given path in the subobject, the same value is set in the superobject
    match := object.get(superobject, path, null) == value
}

# Only matches in set (i.e no false)
included := path_matches == {true}
@tsandall
Copy link
Member

Looks like a good built-in function candidate. A couple minor thoughts:

  • We should namespace the built-in function. Maybe something like object.subset(a, b)?
  • We should define how the function behaves for different types of valid inputs, e.g., object.subset(set, array), object.subset(obj, set), etc.
  • We should define how the function handles unexpected types of inputs, e.g., object.subject(1, null). I'm guessing it can just be undefined.

@anderseknert
Copy link
Member Author

As for namespacing, agreed. I wonder though if it can be seen as confusing that an object-prefixed function takes other types of arguments, like two sets? Either way, I like the subset name! For scalar values we could just do an equals comparison I guess 🤔 and return false or undefined if they don't match. Or just fail on anything other than composite values like you suggested.

@srenatus
Copy link
Contributor

srenatus commented May 3, 2022

I'd think subset makes for a good relation, like walk... all the examples above, but also

{ x | subset(x, {1, 2}) } == { {}, {1}, {2}, {1, 2} }

For correctness reasons of our current implementations, it would be cool if there was a second relation. Currently, we only have walk.

@anderseknert
Copy link
Member Author

I can kinda see the coolness factor 🙂 I guess if it works the same way (and performs about the same) when provided a fixed value, why not! How would this work when a subset is not matched though? Would it return undefined? IIRC walk always returns something, so it's never "falsy".. or can it be?

@srenatus
Copy link
Contributor

srenatus commented May 4, 2022

IIRC walk always returns something, so it's never "falsy".. or can it be?

A call to walk can be undefined, like

open-policy-agent/opa-main % opa eval 'walk([], [[a,b], c])' -fpretty
undefined
open-policy-agent/opa-main % opa eval 'walk([[]], [[a,b], c])' -fpretty
undefined
open-policy-agent/opa-main % opa eval 'walk([[[]]], [[a,b], c])' -fpretty
+---+---+----+
| a | b | c  |
+---+---+----+
| 0 | 0 | [] |
+---+---+----+

Same would go for a subset relation. subset({1}, {2}) would be undefined.

@anderseknert
Copy link
Member Author

Thanks! 👍

charlesdaniels pushed a commit to charlesdaniels/opa that referenced this issue Jun 9, 2022
This implements the new object.subset() builtin.

Based on my benchmarking, this offers a 2.77x speedup compared to
implementing the same thing in pure Rego, and is also easier to read.

Fixes open-policy-agent#4358

Signed-off-by: Charles Daniels <[email protected]>
ashutosh-narkar pushed a commit that referenced this issue Jun 9, 2022
This implements the new object.subset() builtin.

Based on my benchmarking, this offers a 2.77x speedup compared to
implementing the same thing in pure Rego, and is also easier to read.

Fixes #4358

Signed-off-by: Charles Daniels <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

3 participants