Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

solve_left and solve_right should use coercion #12406

Closed
robertwb opened this issue Jan 31, 2012 · 44 comments
Closed

solve_left and solve_right should use coercion #12406

robertwb opened this issue Jan 31, 2012 · 44 comments

Comments

@robertwb
Copy link
Contributor

This ticket uses coercion to find suitable parents for the arguments to Matrix.solve_right in order to solve these problems:

sage: A = matrix(QQ, 2, [1, 2 ,3, 4])
sage: b = vector(RDF, [pi, e]); b
(3.14159265359, 2.71828182846)
sage: A.solve_right(b)   # should return (-3.564903478720541, 3.353248066155167)
(-27594871646705519/7740706532848242, 17304339474632025/5160471021898828)
sage: R.<x> = QQ[]
sage: b = vector(R, [1, x]); b
(1, x)
sage: A.solve_right(b)   # should return (t - 2, -1/2*t + 3/2)
[ugly traceback]
TypeError: not a constant polynomial

This is implemented by moving the coercion code from Matrix_double_dense.solve_right, that was implemented in #17405, to the super class. As a result, Matrix_double_dense.solve_right is redundant and therefore removed.

The super method Matrix.solve_right is refactored a lot:

  • the check parameter is ignored for inexact rings (see also solve_right fails with floating-point matrices #13932)
  • the implementation is changed so that the rank of the matrix is not computed in solve_right, but only in _solve_right_nonsingular_square; this way, inexact rings overwriting the latter method can avoid computing the rank (which would not work over inexact rings)

Additionally, this ticket adds support for calling solve_right/solve_left with non-square matrices over RDF/CDF. In such cases, solve_right returns the least-squares solution of the equation system, similar to the !Matlab/Octave backslash operator (which our documentation already claims is implemented). This is implemented using scipy.linalg.lstsq.

sage: A = matrix(RDF, 3, 2, [1, 3, 4, 2, 0, -3])
sage: b = vector(RDF, [5, 6, 1])
sage: x = A.solve_right(b)
sage: (A * x - b).norm()
3.2692119900020438
sage: x == A \ b
True

CC: @orlitzky

Component: linear algebra

Author: Markus Wageringel

Branch/Commit: 93cea85

Reviewer: Michael Orlitzky

Issue created by migration from https://trac.sagemath.org/ticket/12406

@robertwb robertwb added this to the sage-5.11 milestone Jan 31, 2012
@jdemeyer jdemeyer modified the milestones: sage-5.11, sage-5.12 Aug 13, 2013
@sagetrac-vbraun-spam sagetrac-vbraun-spam mannequin modified the milestones: sage-6.1, sage-6.2 Jan 30, 2014
@sagetrac-vbraun-spam sagetrac-vbraun-spam mannequin modified the milestones: sage-6.2, sage-6.3 May 6, 2014
@sagetrac-vbraun-spam sagetrac-vbraun-spam mannequin modified the milestones: sage-6.3, sage-6.4 Aug 10, 2014
@mwageringel

This comment has been minimized.

@mwageringel
Copy link

Branch: u/gh-mwageringel/12406

@mwageringel
Copy link

comment:6

Based on #17405.


New commits:

95e4157Fix minor typo.
8c41acfFix hidden bug in RDF/CDF matrix multiplication dead error-checking.
5844bbeChange RDF/CDF matrix multiplication to accept matrices for B.
fab2bcdImprove exceptions
bd6e4fcMerge tag '9.1.beta4' into #17405
7b07a2717405: fix solve_right and solve_left over RDF and CDF
0ccb65a12406: implement coercion for generic solve_right

@mwageringel
Copy link

Commit: 0ccb65a

@mwageringel
Copy link

Author: Markus Wageringel

@mwageringel
Copy link

Dependencies: #17405

@mwageringel
Copy link

comment:8

A pynormaliz doctest fails in src/sage/geometry/polyhedron/library.py.

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Feb 28, 2020

Branch pushed to git repo; I updated commit sha1. New commits:

a815ff912406: explicitly convert to AA in generalized_permutahedron

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Feb 28, 2020

Changed commit from 0ccb65a to a815ff9

@orlitzky
Copy link
Contributor

orlitzky commented Mar 5, 2020

comment:22

Replying to @orlitzky:

Follow-up: I do see that the documentation for the subclass method promises different things for square and non-square matrices, but I don't think that's consistent any longer

Scratch that, I see that the generic matrix code disagreed with its documentation all along. The MATLAB backslash operator doesn't check the rank, it only looks at the dimensions and does a QR solve regardless of the rank for a non-square matrix. It insists that square systems actually be solvable. So I would expect an error from A = matrix(ZZ, [[1, 2, 3], [2, 0, 2], [3, 2, 5]]), if we really implemented the backslash properly. But we don't, because the check parameter directly contradicts it. This is a nightmare! I'm going to play around a bit more but then probably throw up my hands in frustration and stop wasting your time.

@orlitzky
Copy link
Contributor

orlitzky commented Mar 6, 2020

comment:23

Ok, I finally have my head wrapped around this.

The behavior promised by the solve_right() method is just fundamentally incompatible with inexact rings, and also with the promise that it makes to implement the MATLAB backslash operator. The existing code uses the "general" solver for rank-deficient matrices, which isn't how the MATLAB backslash works. That behavior is documented and doctested throughout sage, so it's difficult to change.

In hindsight, the reason for setting check=False and adding the (check or K.is_exact()) condition is clear: it let's you implement the backslash operator correctly for RDF and CDF matrices by hacking the superclass method to choose the "wrong" branch. Normally I would say this is a horrendous design, but you can only work with what you're given.

I made a post to the -devel list asking about the behavior of solve_right, but I don't expect to get an easy answer (if this was fun/easy to fix, it wouldn't look like this). It's apparent that to truly make everything consistent, we will have to change either the superclass documentation or its implementation. And changing the implementation (for ALL matrices) at this point is more than either of us have bargained for.

So, here's the best solution that I was able to come up with. Instead of changing the superclass implementation, we change the documentation to admit that (a) we don't really implement the backslash the way MATLAB does, specifically with respect to rank-deficient matrices; and (b) the check parameter will not be used over inexact rings. Item (a) is a given in the superclass, and I think (b) is justified because all solutions are approximate over inexact rings. Propagating item (a) into the matrix_double_dense subclass is a "breaking" change, but one that seems only to improve the answers we get.

Here's the summary:

  1. At the beginning of Matrix.solve_right, we set check=False whenever the ring is inexact.
  2. Since check is now False for RDF/CDF matrices, there's no reason for the subclass methods to force check=False, and therefore no reason for the subclass methods at all! We can drop them and move the documentation/examples into the superclass, to eliminate all of the weirdness that results from having two different sets of documentation on classes that silently convert into one another.
  3. The coercion and everything else proceeds basically as you had it, except without calling solve_right again, because now there is no subclass method to call.
  4. Instead of hacking the "if" statement to use the square-nonsingular solve for all square RDF/CDF matrices, we leave the logic alone and let sage choose to use least squares in that case. This breaks some doctests, but in reality it improves many of the answers. Some LinAlgErrors become near-solutions, and some ill-conditioned systems produce much better solutions this way.

This doesn't attempt to implement the backslash exactly, but it does solve systems in a consistent(ish) manner regardless of whether or not the ring is exact. It also passes all of the doctests that don't refer specifically to the behavior that was changed on purpose.

I've pushed a branch so you can see my proof-of-concept:

sagemath/sagetrac-mirror@d596711

Please, take a look and see what you think. If you think it's a better approach, we can clean it up. If not... I tried. I'll give this a positive review if you prefer the original approach.

@orlitzky
Copy link
Contributor

orlitzky commented Mar 6, 2020

comment:24

That branch is u/mjo/ticket/12406 if you'd rather look at it locally, by the way.

@mwageringel
Copy link

comment:25

As I mentioned on sage-devel, I would prefer an implementation for inexact rings where square matrices and non-square matrices are treated differently. Contrary to what I said on devel though, I now realize that this is not consistent with exact rings: if a square matrix is singular to machine precision, an error is raised regardless of whether a solution exists or not, whereas over exact rings an error is raised only if a solution does not exist.

Since we cannot in general detect such a degenerate situation over inexact rings however, this seems like the price we have to pay for inexact computations. Still, it seems important to get a predictable answer (the algorithm should not depend on numerical noise), so I would prefer to compute a least-squares solution only in the non-square case. This agrees with Matlab as well as SciPy treating the square situation differently. Getting a warning about a square matrix being singular can be a useful answer too, but it needs to be documented that this is different than in the exact case. Since the method is effectively broken for inexact rings currently, I do not expect these changes to cause many problems.

I agree that the check keyword should be ignored over inexact rings, and that the way I used check was bad. I tried to circumvent the rank computation for RDF/CDF by that, because computing the rank cannot work over inexact rings. I would have just written the if-statement as

if not self.is_square() or K.is_exact() and self.rank() != self.nrows():

if it was not for a test failure in src/sage/quivers/morphism.py involving RR, an inexact ring different from RDF/CDF. This is something that #13932 will hopefully resolve eventually.

The more I think about it though, it seems to me that the rank computation should be avoided even for exact rings. Computing the rank will usually involve some sort of echelonization I suppose which is already a major part of the work required to find a solution of a square system. Instead, we should remove the rank computation from the generic solve_right method entirely and leave it to the implementation of _solve_right_nonsingular_square to decide (possibly using the check_rank keyword) whether computing the rank is necessary or appropriate, depending on the ring or type of the matrix as well as on the algorithm used for solving. If appropriate (e.g. for an exact ring), _solve_right_nonsingular_square should signal a specific error in case of a singular matrix that would be caught in solve_right, so that the _solve_right_general can be tried instead. This gives all the control to the class implementing _solve_right_nonsingular_square as to which kind of solution will be computed.


A couple of other thoughts:

I still think it would be cleaner to call solve_right again after coercion, even if no subclass overwrites it, as I do not think there is much practical relevance for saving a stack frame. The main reason for calling solve_right again is that this is closer to how coercion is usually used in Sage (which is also relevant for the next point), but I will not insist on this if indeed we remove the subclass method.


Replying to @orlitzky:

All other concerns aside, we should now move the self.rank() computation up before the coercion. Rank computations in Sage aren't numerically stable

Indeed, this would be better for the rank computation, but it is different from how the coercion system usually works in Sage, so could lead to unexpected results. To the best of my knowledge, usually, the coercion system first searches for a common parent of two operands and only then invokes the appropriate binary operation on them. This is why the doctest

A.solve_right(b) == A.change_ring(RDF).solve_right(b)

is justified, as it merely asserts that the answer is the same in either case. Because of that, I think it would be better to avoid using information obtained from the old parent, even if this is counter-intuitive. If we move the rank computation to _solve_right_nonsingular_square, this does not matter anyway.


I also think we should change...

if K not in _Fields:
    K = K.fraction_field()
    self = self.change_ring(K)

to

if K not in _Fields:
    K = K.fraction_field()
    self = self.change_ring(K)
    B = B.change_ring(K)

for consistency, so that we always call either _solve_right_general or _solve_right_nonsingular_square with arguments that have the same base ring.

I agree. Ideally, it should be part of the coercion code (except in the non-integral-domain case), so changing rings happens only once.


Replying to @orlitzky:

The second is that if I start with

A = matrix(ZZ, ...)
b = vector(RDF, ...)

and then try to figure out what's going to happen from

A.solve_right?

I'm not going to see the subclass documentation, so we shouldn't silently convert to a subclass and then implement the subclass method to do something different.

Indeed, this is not optimal. The superclass documentation should state that coercion is used on the arguments, and what the subclass method does (if it exists) should be consistent with what the superclass documentation says.


There are two test failures with your branch.

File "src/sage/matrix/matrix_complex_ball_dense.pyx", line 553, in sage.matrix.matrix_complex_ball_dense.Matrix_complex_ball_dense._solve_right_nonsingular_square
Failed example:
    matrix(CBF, 2, 2, 0) \ vector([-1, 1])
Expected:
    Traceback (most recent call last):
    ...
    ValueError: matrix equation has no solutions
Got:
    (0, 0)

I am not sure how solve_right is implemented for CBF, but it is another case we may need to consider. This is interesting because, despite being inexact, for some systems over CBF it may be possible to decide with certainty whether or not a solution exists.

The other failure is the peculiar doctest in src/sage/quivers/morphism.py. It is caused by lifting a nontrivial element along the zero-morphism of a vector space over RR. A preimage does not exist, which cannot reliably tested over RR, so in this case I think the doctest is wrong in using RR rather than an exact ring like QQ as everywhere else in that file. Since lift is implemented using solve_right, this means that, if solve_right produces a poor approximation over an inexact ring, the result of lift will be just as bad.


I will try to see if I can make changes as outlined above if you agree.

@orlitzky
Copy link
Contributor

orlitzky commented Mar 8, 2020

comment:26

Replying to @mwageringel:

There are two test failures with your branch... in this case I think the doctest is wrong in using RR rather than an exact ring like QQ as everywhere else in that file.

Indeed, and I think the CBF doctest is wrong as well. It uses an indirect doctest instead of calling _solve_right_nonsingular_square, so it's one of those cases where a change in the superclass behavior means that the test isn't checking what it's supposed to check. But these may turn out to be irrelevant so I don't want to worry too much about it.

The more I think about it though, it seems to me that the rank computation should be avoided even for exact rings. Computing the rank will usually involve some sort of echelonization I suppose which is already a major part of the work required to find a solution of a square system.

Excellent point! I'm embarrassed that I stared at that rank() call so long without realizing it.

In light of that, I agree with everything else you said. For inexact rings, we should go with the MATLAB behavior, even if it disagrees with the exact-ring behavior, because you can't know if the system is rank-deficient without first solving the problem. For backwards compatibility, we have to keep the exact-ring behavior roughly as it is, and document the difference. If we can try the square-nonsingular method first and have it fall back to the generic solve when the matrix isn't full rank, then that sounds reasonable to me.

In the long term, I think it would be a better design to decree that subclasses should not override solve_right, and should instead only override _solve_right_nonsingular_square and _solve_right_general. That would still be in the spirit of the coercion framework (modulo the non-integral-domain stanza), in that it coerces the arguments and then "immediately" dispatches to the appropriate subclass method. In that way we avoid people implementing weird solve_right methods in subclasses whose documentation no one will see. I won't give you a hard time about this though. You're doing all the work, so do whatever you think is best.

@mwageringel

This comment has been minimized.

@mwageringel
Copy link

Changed commit from 6d0bbf6 to 99c2a7e

@mwageringel
Copy link

Changed dependencies from #17405 to none

@mwageringel
Copy link

comment:28

I have updated the branch accordingly. The rank computation is now done in _solve_right_nonsingular_square only, and the backslash operator now works as in Matlab and Octave, so that a least-squares solution is only computed in the non-square case. For exact rings, these changes should not make a difference.

Additionally, I have added a doctest for the problem in #13932, which is solved by this branch since the check parameter is now ignored over inexact rings.


New commits:

d751e78Trac # 12406: document that "check" means nothing over inexact rings.
71f0c28Trac #12406: absorb RDF/CDF solve_* into the superclass methods.
9aa2eb3Trac #12406: unbreak doctests for exact rings.
988e21512406: move rank checks to _solve_right_nonsingular_square
99c2a7e12406: test that `check` is ignored over inexact rings

@mwageringel
Copy link

Changed branch from u/gh-mwageringel/12406 to u/gh-mwageringel/12406v2

@orlitzky
Copy link
Contributor

comment:29

Instead of testing the string contents of the error message, what do you think about adding a new NotFullRankError in matrix2.py to be thrown in that case? The documentation could say something like "The fact that a square system is rank-deficient sometimes only becomes apparent while attempting to solve it. The solve_left and solve_right methods defer to _solve_right_nonsingular_square for square systems, and that method raises this error if the system turns out to be singular."

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Mar 30, 2020

Branch pushed to git repo; I updated commit sha1. New commits:

c1178d912406: add NotFullRankError for _solve_right_nonsingular_square

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Mar 30, 2020

Changed commit from 99c2a7e to c1178d9

@mwageringel
Copy link

comment:31

That is a good idea. I have implemented it.

@orlitzky
Copy link
Contributor

comment:32

Let's get this over with =)

The only other thing I see is in the new test,

Over inexact rings, the ``check`` parameter is ignored as the result is
only an approximate solution (:trac:`13932`)::

sage: RF = RealField(52)
sage: B = matrix(RF, 2, 2, 1)
sage: A = matrix(RF, [[0.24, 1, 0], [1, 0, 0]])
sage: (A * A.solve_right(B) - B).norm() < 1e-14
True

to ensure that check=True is ignored, we also want that small norm to not actually be zero. If you feel like adding that, just go ahead and set it back to positive review without the round-trip.

Thanks for sticking with this. I've been complaining about issues like this since I was an undergrad, but I never had the guts to tackle it myself.

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Mar 31, 2020

Branch pushed to git repo; I updated commit sha1 and set ticket back to needs_review. New commits:

93cea8512406: test that the solution is inexact

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Mar 31, 2020

Changed commit from c1178d9 to 93cea85

@mwageringel
Copy link

comment:34

Thank you for the review and all the valuable input you provided.

I have added the non-zeroness check to the doctest. Setting this back to positive.

@vbraun
Copy link
Member

vbraun commented Apr 9, 2020

Changed branch from u/gh-mwageringel/12406v2 to 93cea85

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants