Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[Numpy] Add qr backward part 2 for wide matrices with m < n #18197

Merged
merged 1 commit into from
Jul 17, 2020

Conversation

D-Roberts
Copy link
Contributor

@D-Roberts D-Roberts commented Apr 29, 2020

Description

This is the 2nd part of the QR backward implementation. The 1st part (merged) covered square and deep matrix shapes (nrows >= ncols) and part 2 now covers the remaining wide matrix shapes (ncols > nrows).

References:
Differential Programming Tensor Networks

The added test includes a numerical check (via central differences) of the analytical gradient since this is a novel implementation. The tests were run offline 1K times to insure against flakiness.

Checklist

  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Code is well-documented:
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

  • QR back part 2
  • Tests

@mxnet-bot
Copy link

Hey @D-Roberts , Thanks for submitting the PR
All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands:

  • To trigger all jobs: @mxnet-bot run ci [all]
  • To trigger specific jobs: @mxnet-bot run ci [job1, job2]

CI supported jobs: [miscellaneous, centos-cpu, sanity, windows-gpu, website, centos-gpu, unix-cpu, edge, clang, windows-cpu, unix-gpu]


Note:
Only following 3 categories can trigger CI :PR Author, MXNet Committer, Jenkins Admin.
All CI tests must pass before the PR can be merged.

@D-Roberts
Copy link
Contributor Author

@haojin2 PR ready for review, tnx.

@haojin2
Copy link
Contributor

haojin2 commented May 1, 2020

@D-Roberts Thanks for your contribution! From now and on please @yzhliu for reviews of NumPy-related contributions since I'm moving my gravity from this project.

@D-Roberts
Copy link
Contributor Author

Hi @yzhliu, can you take a look? thanks

Copy link
Contributor

@hzfan hzfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you elaborate a bit about how the backward works when m < n? Seems https://arxiv.org/pdf/1710.08717.pdf does not cover this case.

src/operator/numpy/linalg/np_qr-inl.h Show resolved Hide resolved
@D-Roberts
Copy link
Contributor Author

The code follows the idea in the reference Differential Programming Tensor Networks. At high level, partition/split the input A into 2 matrices X and Y and R (from A=QR decomposition) into 2 matrices U and V. Then X = QU and get X_grad by applying the gradient derivation from the square input case (m=n) with adjusted Q_grad. Also get Y_grad separately. Then A_grad is the concatenation of X_grad and Y_grad.

@D-Roberts
Copy link
Contributor Author

@mxnet-bot run ci [centos-cpu, unix-cpu, windows-gpu]

@mxnet-bot
Copy link

Jenkins CI successfully triggered : [unix-cpu, windows-gpu, centos-cpu]

@D-Roberts
Copy link
Contributor Author

@mxnet-bot run ci [centos-cpu, unix-cpu]

@mxnet-bot
Copy link

Jenkins CI successfully triggered : [centos-cpu, unix-cpu]

Copy link
Contributor

@hzfan hzfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work! I added one more comment, trying to be more memory-efficient.

src/operator/numpy/linalg/np_qr-inl.h Outdated Show resolved Hide resolved
Copy link
Contributor

@hzfan hzfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@D-Roberts
Copy link
Contributor Author

@mxnet-bot run ci [unix-cpu]

@mxnet-bot
Copy link

Jenkins CI successfully triggered : [unix-cpu]

@D-Roberts
Copy link
Contributor Author

@mxnet-bot run ci [unix-cpu]

@mxnet-bot
Copy link

Jenkins CI successfully triggered : [unix-cpu]

@D-Roberts
Copy link
Contributor Author

@mxnet-bot run ci [unix-cpu]

@mxnet-bot
Copy link

Jenkins CI successfully triggered : [unix-cpu]

@D-Roberts
Copy link
Contributor Author

Hi @yzhliu - is there anything else you'd like me to do on this? tnx

@D-Roberts
Copy link
Contributor Author

Any updates on this?

@D-Roberts
Copy link
Contributor Author

Hello @yzhliu - are we planning to merge this soon? This particular case of differentiable QR can be useful on batch, in place of LQ, or SVD in recent computer vision research for solving least squares.

@hzfan hzfan merged commit 60d0672 into apache:master Jul 17, 2020
@hzfan
Copy link
Contributor

hzfan commented Jul 17, 2020

Merged into master. Thanks @D-Roberts , @haojin2

@ptrendx
Copy link
Member

ptrendx commented Jul 17, 2020

This PR broke master CPU pipelines and blocks PRs (test_np_linalg_qr fails), see e.g. http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Fcentos-cpu/detail/master/2093/pipeline

@szha
Copy link
Member

szha commented Jul 17, 2020

@D-Roberts @hzfan could you look into the issue that @ptrendx mentioned? If it can't be fixed in a couple of hours let's revert the change first.

@D-Roberts
Copy link
Contributor Author

@hzfan Thank you for your prompt assistance, I appreciate it.

@leezu @szha @DickJC123 I will resubmit a separate PR. For my future reference - what are your recommendations to avoid the "stale PR" situation? CI passed when first submitted about 3 months ago and I rebased and CI passed about 2 months ago when the PR was reviewed. All along I followed up on the PR every 2-3 weeks or so.

@szha
Copy link
Member

szha commented Jul 19, 2020

@D-Roberts we will likely need to automate it so that stale CI checks are invalidated. In the meantime, if the PR sits for a long time, feel free to ping me or any other committer to get more attention on it.

@leezu
Copy link
Contributor

leezu commented Jul 20, 2020

@D-Roberts the recommendation is to comment with "at mxnet-bot run ci [all]" where at is @

@D-Roberts D-Roberts deleted the qr_back_two branch December 21, 2020 19:00
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants