Fix is_complete_for check #587

manopapad · 2023-02-24T00:37:54Z

This is the underlying issue behind nv-legate/cunumeric#721 (comment). Thanks to @aschaffer for doing most of the work to uncover this.

The error in the original logic can be seen when running the following example:

import cunumeric as num
a = num.ones((8, 10, 8))
v = num.ones((3, 5, 3))
num.convolve(a, v, mode="same")

with LEGATE_TEST=1 CUNUMERIC_TEST=1 legate --cpus 3 --event --dataflow, in which case legion spy verification fails with:

ERROR: Index Partition: 15 was labelled complete but there are missing points. This is definitely an application bug and invalidates all of Legion's dependence analysis. You must fix this before Legion Spy can continue.

The call to is_complete_for for the offending partition is misjudging it as complete. Here are the values in that call:

self._tile_shape = Shape((8, 4, 8))
self._color_shape = Shape((1, 3, 1))
self._offset = Shape((0, 0, -2))
extents = Shape((8, 10, 8))
offsets = Shape((0, 0, 0))
my_lo = Shape((0, 0, -2))
my_hi = Shape((8, 12, 6)))

@magnatelee please also take a look at the following potential related issues:

The task creation code in deferred.py:convolve is setting PartSym.complete = False, but this information is not honored when creating the partitions. Instead is_complete_for is consulted like normal.
Here a (lexicographic) comparison operator on Shape was used where an elementwise comparison was intended. It would be good to check any other uses of these operators, to confirm that this mixup isn't happening there too.

magnatelee · 2023-02-24T02:11:18Z

@manopapad @aschaffer thanks for catching this bug. It's my fault to assume comparison between tuples to be element-wise.

The task creation code in deferred.py:convolve is setting PartSym.complete = False, but this information is not honored when creating the partitions. Instead is_complete_for is consulted like normal.

Note that completeness of a partition symbol is a constraint on the partition and not a command to the solver. It's perfectly fine to pass a complete partition to a place where an incomplete partition is expected (the inverse isn't true though). That said, the solver currently doesn't check if the solution for a given partition symbol abides by the completeness constraint, though I should and will add it at some point.

Here a (lexicographic) comparison operator on Shape was used where an elementwise comparison was intended. It would be good to check any other uses of these operators, to confirm that this mixup isn't happening there too.

Let's change Shape's comparison operator directly. I meant it to be element-wise and thus shouldn't have used tuple comparison. I checked all uses of <= in the code and the one you guys spotted seems like the only one, so there should be no surprise with changing the behavior.

manopapad · 2023-02-24T21:33:09Z

I tried removing the definitions of all inequality comparisons from Shape and running the cuNumeric test suite, and this uncovered another use which I replaced, please see 91576e1.

Let's change Shape's comparison operator directly. I meant it to be element-wise and thus shouldn't have used tuple comparison.

I don't like this, for two reasons. First and less importantly, this would make <= on Shape not a total ordering. Second and more important, this would make <= do something different on Shape than on tuple, and Shape is pretty much some extra functionality on top of tuple, so this is asking for a mixup to happen in the future.

I would rather we define proper elementwise_lt or similar methods on Shape (or as freestanding functions), and possibly remove the inequality operators from Shape, to avoid any possibility of confusion in the future.

CCing @bryevdv for his opinion too.

bryevdv · 2023-02-24T21:57:22Z

I don't have all the context, but

        return all(a <= b for (a, b) in zip(my_lo, offsets)) and all(
            a <= b for (a, b) in zip(offsets + extents, my_hi)
        )

is pretty tedious and verbose, and not very expressive of the fact that some kind of higher level comparison is happening between my_lo, my_hi, and offsets. So I'd agree that an explicit API for these comparisons seems reasonable on that basis.

magnatelee · 2023-02-24T22:08:09Z

I don't like this, for two reasons. First and less importantly, this would make <= on Shape not a total ordering. Second and more important, this would make <= do something different on Shape than on tuple, and Shape is pretty much some extra functionality on top of tuple, so this is asking for a mixup to happen in the future.

Shape behaving like tuple for some operations is accidental and not by design. If anything, Shape meant to behave more like NumPy ndarray; for example, + on tuples concatenate the values, whereas that on Shape and ndarray performs element-wise addition. Also, what does a lexicographical order between Shapes even mean? We shouldn't decide the semantics of an operator based on what an abstraction's closest kin does, but what's best for that abstraction. I agree that there's ambiguity in the operator semantics, but I strongly disagree Shape should do what tuple does.

I would rather we define proper elementwise_lt or similar methods on Shape (or as freestanding functions), and possibly remove the inequality operators from Shape, to avoid any possibility of confusion in the future.

An alternative is to have <= and other comparison operators return a tuple of booleans, which would then be passed to all or any. I don't like spelling out elementwise_lt knowing we'll never want the lexicographical order on shapes.

manopapad · 2023-02-24T22:11:09Z

@magnatelee is suggesting we simply use the built-in __le__ for this:

Shape(a,b) <= Shape(c,d) := a <= c and b <= d

However, I am a little worried about this solution, because this would imply:

holds: (1,2) <= (2,1)  # tuple.__le__ is lexicographic
does not hold: Shape(1,2) <= (2,1)  # Shape.__le__ is elementwise

so IMHO this is breaking the principle of least surprise.

I would instead suggest an explicit "elementwise" comparison function (naming & placement TBD).

manopapad · 2023-02-24T22:25:07Z

We shouldn't decide the semantics of an operator based on what an abstraction's closest kin does, but what's best for that abstraction.

I agree with the principle. However, in this particular case I think there is (justifiably) a high risk of confusion. The class is named "shape", which is bound to be conflated with a NumPy shape, which is an actual tuple. Moreover, a lot of the time we are initializing a Shape based on a NumPy array's shape (i.e. a tuple), so it is very likely that developers will assume that the two are compatible. So to save ourselves some headaches I think we should try to avoid subtle cases of incompatibility.

An alternative is to have <= and other comparison operators return a tuple of booleans, which would then be passed to all or any. I don't like spelling out elementwise_lt knowing we'll never want the lexicographical order on shapes.

That's a better solution, I can proceed with that.

bryevdv · 2023-02-24T22:52:48Z

An alternative is to have <= and other comparison operators return a tuple of booleans,

Just noting that in case someone forgets to do

if all(s1 <= s2):  # if all((True, False, True....)):  [False]

and instead does (the IMO more typical)

if s1 <= s2:  # if (True, False, True....):  [Truthy]

then the comparison will nearly always return True.

manopapad · 2023-02-24T23:23:56Z

then the comparison will nearly always return True.

Will mypy flag it as an error?

bryevdv · 2023-02-24T23:54:28Z

I don't see why it would, it's perfectly valid (if not useful) to do

dev39 ❯ cat foo.py 
a = (True, False)

if all(a): print("all")

if a: print("oops")

~/tmp
dev39 ❯ mypy foo.py
Success: no issues found in 1 source file

~/tmp
dev39 ❯ python foo.py 
oops

magnatelee · 2023-02-25T01:21:06Z

An alternative is to have <= and other comparison operators return a tuple of booleans,

Just noting that in case someone forgets to do
if all(s1 <= s2):  # if all((True, False, True....)):  [False]
and instead does (the IMO more typical)
if s1 <= s2:  # if (True, False, True....):  [Truthy]
then the comparison will nearly always return True.

I think we can return our own container where __bool__ is overriden to raise an exception if there are more than one value. fyi, that's how numpy ndarray avoids ambiguity.

magnatelee · 2023-02-25T01:32:07Z

I agree with the principle. However, in this particular case I think there is (justifiably) a high risk of confusion. The class is named "shape", which is bound to be conflated with a NumPy shape, which is an actual tuple. Moreover, a lot of the time we are initializing a Shape based on a NumPy array's shape (i.e. a tuple), so it is very likely that developers will assume that the two are compatible. So to save ourselves some headaches I think we should try to avoid subtle cases of incompatibility.

That's fair, but I'd blame inherent ambiguity in the comparison operators defined to return a single boolean. Except for the unfortunate naming collision with ndarray.shape, there's nothing in Shape that would get people thinking that it behaves like a tuple. And that actually makes me think naming it Shape was a mistake and we should rename it. Plus, we could potentially give a more meaningful name to each comparison operator returning boolean; for example, a <= b could be renamed to b.subsumes(a), which is what it's really meant to be.

That's a better solution, I can proceed with that.

See my comment about a custom boolean container. We could do it if we want to avoid all footguns.

Will mypy flag it as an error?

I believe when an object obj passed in a place where a boolean is expected, an implicit obj.__bool__() call is inserted by the interpreter.

manopapad · 2023-02-27T21:31:27Z

I went ahead and implemented the approach of returning a tuple-like of bools, guarding that __bool__ is not called directly on it. MyPy helpfully pointed out one more use of comparison operators.

Assuming that this is a core-internal API, I just used asserts to signal misuse, but happy to switch to something else if you guys prefer.

Plus, we could potentially give a more meaningful name to each comparison operator returning boolean; for example, a <= b could be renamed to b.subsumes(a), which is what it's really meant to be.

We need two types of subsumes, all(<=) and all(<), so I guess we'd need subsumes and strict_subsumes, if we wanted to go with this.

bryevdv

I think a proper compare API would be nicer, but if users aren't really expected to be comparing shapes this seems OK at least for the short term to get some guardrails in the next release.

Fix is_complete_for check

ed7d6b6

manopapad added the category:bug-fix PR is a bug fix and will be classified as such in release notes label Feb 24, 2023

manopapad requested a review from magnatelee February 24, 2023 00:37

Fix another use of inequality on Shape

91576e1

manopapad added 2 commits February 27, 2023 11:39

Merge branch 'branch-23.03' into fix-is-complete

1103085

Return a bool tuple from Shape comparisons

97a546e

manopapad requested a review from bryevdv February 27, 2023 21:31

bryevdv reviewed Feb 27, 2023

View reviewed changes

bryevdv approved these changes Feb 27, 2023

View reviewed changes

manopapad merged commit 33ce363 into nv-legate:branch-23.03 Feb 28, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix is_complete_for check #587

Fix is_complete_for check #587

manopapad commented Feb 24, 2023

magnatelee commented Feb 24, 2023 •

edited

Loading

manopapad commented Feb 24, 2023

bryevdv commented Feb 24, 2023

magnatelee commented Feb 24, 2023 •

edited

Loading

manopapad commented Feb 24, 2023

manopapad commented Feb 24, 2023

bryevdv commented Feb 24, 2023 •

edited

Loading

manopapad commented Feb 24, 2023

bryevdv commented Feb 24, 2023 •

edited

Loading

magnatelee commented Feb 25, 2023

magnatelee commented Feb 25, 2023 •

edited

Loading

manopapad commented Feb 27, 2023

bryevdv left a comment

Fix is_complete_for check #587

Fix is_complete_for check #587

Conversation

manopapad commented Feb 24, 2023

magnatelee commented Feb 24, 2023 • edited Loading

manopapad commented Feb 24, 2023

bryevdv commented Feb 24, 2023

magnatelee commented Feb 24, 2023 • edited Loading

manopapad commented Feb 24, 2023

manopapad commented Feb 24, 2023

bryevdv commented Feb 24, 2023 • edited Loading

manopapad commented Feb 24, 2023

bryevdv commented Feb 24, 2023 • edited Loading

magnatelee commented Feb 25, 2023

magnatelee commented Feb 25, 2023 • edited Loading

manopapad commented Feb 27, 2023

bryevdv left a comment

Choose a reason for hiding this comment

magnatelee commented Feb 24, 2023 •

edited

Loading

magnatelee commented Feb 24, 2023 •

edited

Loading

bryevdv commented Feb 24, 2023 •

edited

Loading

bryevdv commented Feb 24, 2023 •

edited

Loading

magnatelee commented Feb 25, 2023 •

edited

Loading