Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

python symbolic::Formula returns True for matrices (but not scalars)? #8315

Open
RussTedrake opened this issue Mar 9, 2018 · 52 comments
Open
Assignees
Labels
component: pydrake Python API and its supporting Starlark macros priority: low type: bug

Comments

@RussTedrake
Copy link
Contributor

from pydrake.all import MathematicalProgram
prog = MathematicalProgram()
x = prog.NewContinuousVariables(2)
print(x)
print(x[0] == 0.)
print(x==[0.,0.])

I get the following output:

[x(0) x(1)]
(x(0) = 0)
[ True  True]
@soonho-tri
Copy link
Member

soonho-tri commented Mar 9, 2018

It's because x == [0, 0] becomes [bool(x[0] == 0), bool(x[1] == 0)]. Python considers anything but the following items as True:

  • The Boolean value, False
  • Any numerical value equal to 0 (0, 0.0 but not 2 or -3.1)
  • The special value None
  • Any empty sequence or collection, including the empty string and the empty list

So, in the end, we have [True, True]. I'll check if there is a workaround for this.

@soonho-tri
Copy link
Member

First thing I can do is to provide __nonzero__ for symbolic::Formula so that the above example throws an exception instead of giving a false value [True, True].

@jwnimmer-tri
Copy link
Collaborator

I'm not sure I'm 100% understanding the bug report or subsequent discussion here. Russ, are you saying that the desired output should look like this?

[x(0) x(1)]
(x(0) = 0)
[(x(0) = 0) (x(1) = 0)]

@soonho-tri
Copy link
Member

@jwnimmer-tri , I think that's what Russ wants.

@jwnimmer-tri
Copy link
Collaborator

jwnimmer-tri commented Mar 9, 2018

So we'll have:

  • python relational operators on Expression always return a Formula;
  • the __nonzero__ on Formula throws with a note telling users to call Evaluate explicitly.

I like this.

@soonho-tri
Copy link
Member

soonho-tri commented Mar 9, 2018

python relational operators on Expression always return a Formula

FYI, this is what we already have as print(x[0] == 0.) returns x(0) == 0.

the __nonzero__ on Formula throws with a note telling users to call Evaluate explicitly.

I'll open a PR shortly. But the thing is that it will not give us what Russ wants for the above print(x==[0.,0.]) example. Instead of returning [(x(0) = 0) (x(1) = 0)], it will throw an exception as it can't evaluate x(0) = 0 without knowing the value of x(0).

I think there are no good solutions as long as we want to use both of == operator over numpy.ndarray. We need to give up one of the two.

  1. We might introduce a function Equal and make Equal([a, b], [0, 0]) return [a = 0, b = 0] as requested. Or;

  2. We can introduce a new class such as FormulaArray and provide __eq__ (and other relational operations) so that FormulaArray1 == FormulaArray2 works as requested.

I'm happy to learn if there is a better option here.

@RussTedrake
Copy link
Contributor Author

@EricCousineau-TRI says he might know a solution

@EricCousineau-TRI
Copy link
Contributor

EricCousineau-TRI commented Mar 9, 2018

One thing is pybind11 only uses order in determining overloads - it starts with the first, and keeps on checking until it has a successful overload, thus the order in which you .def the overloads matters.

However, looking more at the discussion, it may seem to be something else; I'm guessing it's because NumPy's operator== is taking precedence over the return value (returning a logical array) rather than respecting the actual return type.

I'll see if there's a way around that.

EDIT: Can reproduce the issue per Soonho's comment in a simplified example: EricCousineau-TRI/repro@c6398b5

@soonho-tri
Copy link
Member

We might introduce a function Equal and make Equal([a, b], [0, 0]) return [a = 0, b = 0] as requested. Or;

I just found that numpy provides the function that I looked for: https://docs.scipy.org/doc/numpy-1.13.0/reference/routines.logic.html#comparison

@EricCousineau-TRI
Copy link
Contributor

I tested np.equal, and still get an output of logical values. The documentation for np.equal states:

out : ndarray or bool
Output array of bools, or a single bool if x1 and x2 are scalars.

:(

There is a possibility of extending a custom ndarray to not return logicals for logical operations:
https://docs.scipy.org/doc/numpy-1.13.0/user/basics.subclassing.html#basics-subclassing

But that means that (a) we have to cast in pybind11 (not that big of a deal, I'd think) and (b) we'd have to take care that nd.array([<symbolic>...]) == nd.array([<symbolic>...]) does not occur.

@soonho-tri
Copy link
Member

:-( Just found the same problem. It has dtype parameter. Do you think it might save us?

@EricCousineau-TRI
Copy link
Contributor

EricCousineau-TRI commented Mar 9, 2018

Nay :( Seems to be constrained by ufunc:

>>> print(np.equal(av, bv, dtype=Custom))
Traceback (most recent call last):
  File "numpy_eq.py", line 28, in <module>
    print(np.equal(av, bv, dtype=Custom))
TypeError: No loop matching the specified signature and casting
was found for ufunc equal

It seems that your option (2) (perhaps have FormulaArray subclass ndarray), coupled with an error on __nonzero__, is the way to go?

@soonho-tri
Copy link
Member

sigh... I'll dig it up this evening.

@EricCousineau-TRI
Copy link
Contributor

One other option: From the results of #8116, if it is possible to write a custom dtype, and possibly register custom ufuncs, then perhaps there is a chance that we can register a ufunc for __eq__ for symbolic types.

That being said, I'm not sure if that will affect the behavior of np.equal if it's default dtype is bool.

@EricCousineau-TRI
Copy link
Contributor

EricCousineau-TRI commented Mar 9, 2018

Will update this post with some other options:

Tracing up where I think ndarray.__eq__ comes from:

@EricCousineau-TRI
Copy link
Contributor

EricCousineau-TRI commented Mar 9, 2018

Hup, it looks like there's a method numpy.set_numeric_ops (that has __doc__, but I can't find its online documentation), referenced in this post:
https://stackoverflow.com/questions/45601964/source-code-location-and-explanation-of-vectorized-array-operations

We could use this to do something like:

np.set_numeric_ops(equal=drake_ufunc_equal)

and see if there's a way to teach the ufunc about how to handle different dtypes, so that we don't impact performance of other objects.

Example of a simple override (which would cause everything to slow down / suck):

>> eq = lambda a, b: a == b
>> generic_equal = np.frompyfunc(eq, 2, 1)
>> 
>> a = Custom('a')
>> b = Custom('b')
>> print(a == b)
eq('a', 'b')
>> 
>> av = np.array([a, b])
>> bv = np.array([b, a])
>> 
>> print(generic_equal(av, bv))
["eq('a', 'b')" "eq('b', 'a')"]
>> print(np.equal(av, bv))
[ True  True]
>> 
>> np.set_numeric_ops(equal=generic_equal)
>> print(np.equal(av, bv))
[ True  True]
>> print(av == bv)
["eq('a', 'b')" "eq('b', 'a')"]
>> np.equal = generic_equal
>> print(np.equal(av, bv))
["eq('a', 'b')" "eq('b', 'a')"]

@soonho-tri
Copy link
Member

Looks nice! I'll try that in the coming PR.

@EricCousineau-TRI
Copy link
Contributor

Hmm... Is it OK if we hold off on incorporating this just yet?
I'd like to look a bit deeper and ensure that we don't incur any nasty side effects (e.g. killing speed for other libraries that aren't using Drake for image stuff, etc.), and see if there's an elegant solution to guarantee this.

What kind of timeline would you want this resolved on? (I've talked with @m-chaturvedi, and we'll start addressing #8116)

@soonho-tri
Copy link
Member

What kind of timeline would you want this resolved on?

@RussTedrake can answer this (he marked this with high priority).

@RussTedrake
Copy link
Contributor Author

it depends on the cost of resolving it... but I'm about to lecture about trajectory optimization, and this occurs very frequently when formulating trajectory optimizations.

@RussTedrake
Copy link
Contributor Author

i also marked it as "high" priority because in it's current form, it's a bit of a landmine...

@EricCousineau-TRI
Copy link
Contributor

Sounds good. I'll focus on the next few of days, because after having worked on it for a bit, I'm fairly confident that the solution for #8116 will most likely solve this too (if it goes like I planned it).

BTW: Posted StackOverflow as well: https://stackoverflow.com/questions/49205075/possible-to-have-logical-operations-e-g-ndarray-eq-not-return-bool-in

@EricCousineau-TRI
Copy link
Contributor

Finally finished the spike test (with some generalizing / polishing), and it seems like user-defined dtypes in NumPy will solve both this issue and #8116:
https://github.com/EricCousineau-TRI/repro/blob/1b1edfa/python/pybind11/dtype_stuff/test_basic.cc#L117-L128
https://github.com/EricCousineau-TRI/repro/blob/1b1edfa/python/pybind11/dtype_stuff/test_basic.py#L89-L93

Output from those lines in the Python script (out of sheer laziness, I had operator== just double the number):

>> av = np.array([Custom(1), Custom(2)])
>> mutate(av)
>> print(av)
[<Custom(101.0)> <Custom(102.0)>]
>> print(av == av)
[<Custom(202.0)> <Custom(204.0)>]

Required some shifty appeasements of both pybind11 and numpy. There is also the remaining issue about memory leaking here (which now that I have a working prototype, will respond there with some more info):
numpy/numpy#10721

While it sucks to leak memory, I think is the cleanest solution. In reviewing PyTorch, Tensorflow, etc., I realized that each of those might have some niceties, but would be a longer-term goal (using this article as a quick overview / guide:
http://davidstutz.de/a-note-on-extending-tensorflow-pytorch-and-theano-for-deep-learning-on-custom-datastructures/ )

Will start polishing this to push into master.

@eric-wieser
Copy link
Contributor

Right now I think the problem is that np.equal.types contains only XX->? loops, and you want an OO->O loop.

I think it would be pretty reasonable for numpy to add a OO->O loop to np.equal and the other comparison ops - with that in place, you could overload the == operator to just force this loop to be used.

@EricCousineau-TRI
Copy link
Contributor

Per Slack convo with @hongkai-dai, this is one workaround that can be used with NumPy 1.13.3, used on Bionic, which predates upstream patches which are NumPy ~1.15:
33508e1

Output:

$ bazel run //bindings/pydrake:py/symbolic_test -- TestSymbolicExpression.test_relation_operators_workaround
[<Formula "(x < 1)"> <Formula "(y < 2)">]
[<Formula "(x < 1)"> <Formula "(y < 1)">]
[<Formula "(x < 1)"> <Formula "(x < 2)">]

Will write up a PR thing.

@RussTedrake
Copy link
Contributor Author

Just to leave a clear update (even for my own memory), the current status is that

from pydrake.all import MathematicalProgram, eq
prog = MathematicalProgram()
x = prog.NewContinuousVariables(2)
print(x)
print(x[0] == 0.)
print(eq(x,[0.,0.]))
print(x == [0.0])

results in

[Variable('x(0)', Continuous) Variable('x(1)', Continuous)]
(x(0) == 0)
[<Formula "(x(0) == 0)"> <Formula "(x(1) == 0)">]
---------------------------------------------------------------------------
DeprecationWarning                        Traceback (most recent call last)
<ipython-input-69-ff08b00d0643> in <module>()
      5 print(x[0] == 0.)
      6 print(eq(x,[0.,0.]))
----> 7 print(x == [0.0])

DeprecationWarning: elementwise == comparison failed; this will raise an error in the future.

@RussTedrake
Copy link
Contributor Author

btw @EricCousineau-TRI -- the "DeprecationWarning" above actually terminates my program.

@EricCousineau-TRI
Copy link
Contributor

Yup, that is intentional. If it didn't, then you'd just have a crappier error later on:

warnings.filterwarnings(
"error", category=DeprecationWarning,
message="elementwise == comparison failed")

@buoyancy99
Copy link

Now NEP 15 is done and numpy doc recommends __array_ufunc__ to achieve the functionality as quoted below,

The presence of __array_ufunc__ also influences how ndarray handles binary operations like arr + obj and arr < obj when arr is an ndarray and obj is an instance of a custom class.

Any updates on this? Any plan to subclass np.ndarray in the future so built-in operators like <, > will be supported (and thus allow vectorized constraints)?

@RussTedrake
Copy link
Contributor Author

if __array_ufunc__ can resolve this, that would be fantastic!

@jwnimmer-tri
Copy link
Collaborator

Note that our minimum required version of numpy is now >= 1.17.4. That might be of help in this issue?

@buoyancy99
Copy link

buoyancy99 commented Sep 25, 2022

I looked further into this today and found a temporary solution. If we want a high-performance version of this (actual multi-threaded vectorized creation of new formulas), overriding __array_ufunc__ won't work,

This is because of there is no signature for logical operators to return objects. from numpy documentation, it's always bool.
From the output of np.info(np.less_equal)

 Returns
 ------- out : bool or ndarray of bool
   Array of bools, or a single bool if `x1` and `x2` are scalars.

However, if we don't mind having the same performance as list comprehension, the solution is actually really easy, as shown in my code block below. We can use np.vectorize to wrap the overloaded logical operators in Expression, and put it in __array_ufunc__. The function np.vectorize is not an actual multi-threaded vectorized method. However, 1. for non-linear constraints, people need to use for loops to create and add them to MathematicalProgram anyway, and np.vectorize won't we worse than that. 2. usually we don't usually really have a huge amount of constraints. 3. np.vectorize handles broadcasting nicely.
Given this three factors, I actually think this is a pretty good solution that we should be satisfied with.

Here is a temporary workaround for us:

from pydrake.all import MathematicalProgram
import numpy as np
import operator

prog = MathematicalProgram()
x = prog.NewContinuousVariables(2)


class PatchedArray(np.ndarray):
    to_patch = {
        np.greater: operator.gt,
        np.greater_equal: operator.ge,
        np.less: operator.lt,
        np.less_equal: operator.le,
        np.equal: operator.eq,
        np.not_equal: operator.ne
    }

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        with np.testing.suppress_warnings() as sup:
            sup.filter(RuntimeWarning, "invalid value encountered in")
            casted_inputs = (item.view(np.ndarray) if isinstance(item, PatchedArray) else item for item in inputs)
            if ufunc in self.to_patch:
                return np.vectorize(self.to_patch[ufunc])(*casted_inputs, **kwargs)
            else:
                return getattr(ufunc, method)(*casted_inputs, **kwargs).view(PatchedArray)


x_patched = x.view(PatchedArray)
print(x_patched == np.zeros(2))
print(x_patched >= np.zeros(2))
print(x_patched <= np.zeros(2))
print(x_patched < np.zeros(2))
print(type(x_patched + 1))
print(type(x_patched < np.zeros(2)))
print(x_patched[None] * np.ones((3, 2)) == np.zeros((3, 2)))
print(x_patched[None] * np.ones((3, 2)) >= 1)

Output will be

[<Formula "(x(0) == 0)"> <Formula "(x(1) == 0)">]
[<Formula "(x(0) >= 0)"> <Formula "(x(1) >= 0)">]
[<Formula "(x(0) <= 0)"> <Formula "(x(1) <= 0)">]
[<Formula "(x(0) < 0)"> <Formula "(x(1) < 0)">]
<class '__main__.PatchedArray'>
<class 'numpy.ndarray'>
[[<Formula "(x(0) == 0)"> <Formula "(x(1) == 0)">]
 [<Formula "(x(0) == 0)"> <Formula "(x(1) == 0)">]
 [<Formula "(x(0) == 0)"> <Formula "(x(1) == 0)">]]
[[<Formula "(x(0) >= 1)"> <Formula "(x(1) >= 1)">]
 [<Formula "(x(0) >= 1)"> <Formula "(x(1) >= 1)">]
 [<Formula "(x(0) >= 1)"> <Formula "(x(1) >= 1)">]]

@eric-wieser
Copy link
Contributor

This is because of there is no signature for logical operators to return objects. from numpy documentation, it's always bool.

It shouldn't be hard to write a patch to change this. IIRC, numpy now has "object loops" for the logical_and and logical_or operators, so contributing them for comparisons (assuming they're not in fact already there) would seem uncontroversial.

@RussTedrake
Copy link
Contributor Author

Excellent! Thank you @buoyancy99 and @eric-wieser !

I would definitely like to resolve this. @buoyancy99 , does Eric's hint give you enough to go on to explore that version of the solution?

@eric-wieser
Copy link
Contributor

eric-wieser commented Sep 25, 2022

In fact, the object loop already exists as far back as numpy 1.16.5 (numpy/numpy#10745), which I referred to in #8315 (comment) above

In [30]: from sympy import symbols

In [31]: x, y = symbols('x y')

In [32]: np.greater([x], [y]) # errors
TypeError: cannot determine truth value of Relational

In [33]: np.greater([x], [y], dtype=object)  # works
Out[33]: array([x > y], dtype=object)

@buoyancy99
Copy link

buoyancy99 commented Sep 25, 2022

Excellent! Thank you @buoyancy99 and @eric-wieser !

I would definitely like to resolve this. @buoyancy99 , does Eric's hint give you enough to go on to explore that version of the solution?

Just updated the patched subclass code following Eric's suggestion and it works. However, I believe that with this update, we will still need to subclass np.ndarray and link all C++ Expression arrays to it. This has to be done in our fork of pybind11 I believe. I need some additional guidance on how to start there.

Updated subclass code

class PatchedArray(np.ndarray):
    to_patch = {
        np.greater: operator.gt,
        np.greater_equal: operator.ge,
        np.less: operator.lt,
        np.less_equal: operator.le,
        np.equal: operator.eq,
        np.not_equal: operator.ne
    }

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        with np.testing.suppress_warnings() as sup:
            sup.filter(RuntimeWarning, "invalid value encountered in")
            casted_inputs = (item.view(np.ndarray) if isinstance(item, PatchedArray) else item for item in inputs)
            if ufunc in self.to_patch:
                kwargs.update(dtype=object)
                return getattr(ufunc, method)(*casted_inputs, **kwargs)
            else:
                return getattr(ufunc, method)(*casted_inputs, **kwargs).view(PatchedArray)

x_patched = x.view(PatchedArray)
print(x_patched == np.zeros(2))
print(x_patched >= np.zeros(2))
print(x_patched <= np.zeros(2))
print(x_patched < np.zeros(2))
print(type(x_patched + 1))
print(type(x_patched < np.zeros(2)))
print(x_patched[None] * np.ones((3, 2)) == np.zeros((3, 2)))
print(x_patched[None] * np.ones((3, 2)) >= 1)

@eric-wieser
Copy link
Contributor

Probably you want the more conservative:

class PatchedArray(np.ndarray):
    to_patch = {  # no need for a `dict` any more
        np.greater,
        np.greater_equal,
        np.less,
        np.less_equal,
        np.equal,
        np.not_equal
    }

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        with np.testing.suppress_warnings() as sup:
            sup.filter(RuntimeWarning, "invalid value encountered in")
            casted_inputs = (item.view(np.ndarray) if isinstance(item, PatchedArray) else item for item in inputs)
            if ufunc in self.to_patch:
                kwargs.setdefault('dtype', object)
                return getattr(ufunc, method)(*casted_inputs, **kwargs)
            else:
                return getattr(ufunc, method)(*casted_inputs, **kwargs).view(PatchedArray)

which doesn't override the dtype if the user requests a different one explicitly

@buoyancy99
Copy link

Probably you want the more conservative:

class PatchedArray(np.ndarray):
    to_patch = {  # no need for a `dict` any more
        np.greater,
        np.greater_equal,
        np.less,
        np.less_equal,
        np.equal,
        np.not_equal
    }

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        with np.testing.suppress_warnings() as sup:
            sup.filter(RuntimeWarning, "invalid value encountered in")
            casted_inputs = (item.view(np.ndarray) if isinstance(item, PatchedArray) else item for item in inputs)
            if ufunc in self.to_patch:
                kwargs.setdefault('dtype', object)
                return getattr(ufunc, method)(*casted_inputs, **kwargs)
            else:
                return getattr(ufunc, method)(*casted_inputs, **kwargs).view(PatchedArray)

which doesn't override the dtype if the user requests a different one explicitly

any suggestion for the pybind11 part?

@eric-wieser
Copy link
Contributor

Not really, I haven't touched pybind11 for a few years

@jwnimmer-tri
Copy link
Collaborator

jwnimmer-tri commented May 6, 2024

This has to be done in our fork of pybind11 I believe.

Our goal in Drake is to get rid of our fork of pybind11, and instead use a vanilla upstream version (#19250). Solutions that involve patching pybind11 will not be accepted. However, I don't think we need it in this case.

[We] will still need to subclass np.ndarray and link all C++ Expression arrays to it.

As I understand it, the goal is to return PatchedArray from bindings like MathematicalProgram.NewContinuousVariables that return a Matrix<Variable, ...>? Even in the extreme, we could write a helper function to accomplish that and manually call it from all bindings that return arrays of variables, but probably a type caster template specialization could handle it automatically. Either way, it's do-able without changing pybind11 itself.

@jwnimmer-tri
Copy link
Collaborator

Actually, we want to patch Matrix<Expression, ...> and Matrix<Variable, ...> both, I think. All of the six comparison operators should return the Formula type when any of the operands are Expression or Variable.

@RussTedrake
Copy link
Contributor Author

Sounds great! I would love to finally resolve this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component: pydrake Python API and its supporting Starlark macros priority: low type: bug
Projects
None yet
Development

No branches or pull requests

6 participants