[SPARK-8514] LU factorization on BlockMatrix #8563
…d methods in BlockMatrix. Added one more unit test
…xample near the top of the page.
I have been corresponding with Xiangrui Meng on this. (I'm with IBM …
On Tue, Sep 1, 2015 at 10:38 PM, UCB AMPLab wrote:
Jerome Nilmeier, PhD |
add to whitelist |
ok to test |
Thanks Xiangrui! Cheers, J
|
Test build #41914 has finished for PR 8563 at commit
|
@nilmeier Do you have a reference to a paper that analyses the running time and communication costs of the algorithm implemented here? |
This approach has some similarity to the CALU paper that you posted, and …
In terms of running time, I have some single node (i7 MacBook) data …
Sincerely, Jerome
On Wed, Sep 2, 2015 at 10:19 AM, Shivaram Venkataraman wrote:
Jerome Nilmeier, PhD |
Hello @shivaram: Here are some more timings from a 10 node cluster. npb in the legend is …
For these cases, I am seeing better than n^3 scaling, which suggests that …
We run into stack overflow errors for a large number of blocks. If we …
I'm using a cluster that is dedicated to another project for these timings, …
I have attached the scripts used to generate the data. Please forgive the …
Please let me know if these are helpful for you, or if you need anything …
Sincerely, Jerome
On Wed, Sep 2, 2015 at 11:05 AM, Jerome Nilmeier wrote:
Jerome Nilmeier, PhD |
@nilmeier Could you post a link to the timing graph? I can't seem to find it on the JIRA or here on GitHub. |
Hello @shivaram, sorry for the delay. Here are some first pass timings on our 10 node cluster. There is a README in this drive directory. Please feel free to contact me …
Sincerely, Jerome
On Tue, Sep 8, 2015 at 9:48 AM, Shivaram Venkataraman wrote:
Jerome Nilmeier, PhD |
Hi @shivaram, some of my colleagues here at IBM have noted that some of the BlockMatrix …
Sincerely, @nilmeier
On Tue, Sep 8, 2015 at 3:19 PM, Jerome Nilmeier wrote:
Jerome Nilmeier, PhD |
Over the next week I will be pushing some updates to this pull request in preparation for an internal code review. These will mostly be minor changes. I will also be adding a BlockMatrix.solve method that will use LU to solve AX=B for X. Should I include all of these updates in the current pull request? Cheers, Jerome |
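For readers unfamiliar with the approach, solving AX=B through an LU factorization can be sketched on a single node as follows. This is a minimal local illustration in plain Scala, not the distributed BlockMatrix code in this pull request: `LuSolveSketch`, `LuResult`, `luDecompose` and `luSolve` are hypothetical names, the matrices are plain nested arrays, and the exact-zero pivot check is deliberately simplistic.

```scala
object LuSolveSketch {
  /** Packed factors of P*A = L*U (hypothetical type, for illustration only).
    * `lu` holds U on and above the diagonal and the unit-lower-triangular L below it;
    * `perm(i)` is the index of the original row that ended up in row i. */
  final case class LuResult(lu: Array[Array[Double]], perm: Array[Int])

  /** LU factorization with partial (row) pivoting of a square matrix. */
  def luDecompose(a: Array[Array[Double]]): LuResult = {
    val n = a.length
    require(a.forall(_.length == n), s"matrix must be square, got $n rows")
    val lu = a.map(_.clone())
    val perm = Array.range(0, n)
    for (k <- 0 until n) {
      // Pick the row with the largest absolute value in column k as the pivot.
      val p = (k until n).maxBy(i => math.abs(lu(i)(k)))
      require(lu(p)(k) != 0.0, "matrix is singular")
      if (p != k) {
        val tmpRow = lu(k); lu(k) = lu(p); lu(p) = tmpRow
        val tmpIdx = perm(k); perm(k) = perm(p); perm(p) = tmpIdx
      }
      // Eliminate column k below the pivot, storing the multipliers in place of L.
      for (i <- k + 1 until n) {
        lu(i)(k) /= lu(k)(k)
        for (j <- k + 1 until n) lu(i)(j) -= lu(i)(k) * lu(k)(j)
      }
    }
    LuResult(lu, perm)
  }

  /** Solves A*X = B for X, given the factors of A. B may have several columns. */
  def luSolve(f: LuResult, b: Array[Array[Double]]): Array[Array[Double]] = {
    val n = f.lu.length
    require(b.length == n, s"B has ${b.length} rows but A has $n")
    val m = if (n == 0) 0 else b(0).length
    // Apply the row permutation: X starts as P*B.
    val x = Array.tabulate(n, m)((i, j) => b(f.perm(i))(j))
    // Forward substitution with the unit lower triangle: L*Y = P*B.
    for (i <- 0 until n; k <- 0 until i; j <- 0 until m)
      x(i)(j) -= f.lu(i)(k) * x(k)(j)
    // Back substitution with the upper triangle: U*X = Y.
    for (i <- (n - 1) to 0 by -1) {
      for (k <- i + 1 until n; j <- 0 until m) x(i)(j) -= f.lu(i)(k) * x(k)(j)
      for (j <- 0 until m) x(i)(j) /= f.lu(i)(i)
    }
    x
  }
}
```

A call such as `LuSolveSketch.luSolve(LuSolveSketch.luDecompose(a), b)` factors once and can then reuse the factors for further right-hand sides, which is the usual reason to solve through LU rather than forming an inverse.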
…e method. Refactored shiftIndices method. Updated docs.
Test build #42966 has finished for PR 8563 at commit
|
Review comments:
--> Try making the portion of the examples with input data more condensed; perhaps …
--> There's an orphan/duplicate ScalaDoc comment at line 356 of BlockMatrix.scala
--> Remove carriage return on lines 370, 481, 569, 732
--> Recommend adding a call to the new subtract method to the MLlib …
--> New API calls to BlockMatrix should have corresponding PySpark APIs
--> Error message at line 394 should print out the block sizes that don't match
--> The code at line 384 should multiply every element of b.head by -1 as far …
--> Line 456 and 465-471 have wrong indentation
--> Scaladoc at 474 should state that blockRowRange and blockColRange are block …
--> In lines 460-463, consider making a single pass over the blocks instead of …
--> Add a note to SchurComplement that the current implementation assumes that …
--> Might want to use a case class in return type of blockLUtoSolver
--> Take a close look at the performance impact of the chains of …
--> In recursiveSequencesBuild, you may want to break off more than one block
--> Might want to reuse buffers for the local data allocated at lines 623-629 |
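Two of the suggestions above (printing the mismatched block sizes in the error message, and returning a case class instead of a bare tuple) might look roughly like the following. This is only a sketch under stated assumptions: `ReviewSketch`, `LuFactors` and `requireSameBlockSizes` are hypothetical names, and the actual return type of blockLUtoSolver in the pull request is not shown in this conversation.

```scala
object ReviewSketch {
  /** Hypothetical named result type. Compared with a tuple, call sites read
    * `factors.l` and `factors.u` instead of `_2` and `_3`. The type parameter M
    * stands in for whatever matrix type the real method returns. */
  final case class LuFactors[M](p: M, l: M, u: M)

  /** Fails with a message that reports both block sizes when they differ.
    * The parameter names mirror BlockMatrix's rowsPerBlock / colsPerBlock. */
  def requireSameBlockSizes(
      aRowsPerBlock: Int,
      aColsPerBlock: Int,
      bRowsPerBlock: Int,
      bColsPerBlock: Int): Unit = {
    require(
      aRowsPerBlock == bRowsPerBlock && aColsPerBlock == bColsPerBlock,
      s"Block sizes must match: A uses $aRowsPerBlock x $aColsPerBlock, " +
        s"B uses $bRowsPerBlock x $bColsPerBlock")
  }
}
```

Because `require` throws an `IllegalArgumentException` carrying the interpolated message, a caller who passes incompatible matrices sees both sets of dimensions without needing a debugger.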
Thank you @frreiss for the review! I will address these issues and update the pull request in the next few days. Cheers, J |
I addressed Fred's review comments in the most recent update. I was not able to fully address all of the issues raised; the remaining ones are described below. Review comments:
O We did some studies, and the code would benefit from refactoring to remove recursion.
O This may require a significant rewrite to handle correctly. I would like to try this for the next revision.
O I would like to explore this as well in the future, but I didn't address it in this update. |
/** Computes the LU Decomposition of a Square Matrix. For a matrix A of size (n x n) |
For Scaladoc, we use
/**
* Start from here
* End here
*/
Added in a second commit.
Corrected in a second commit.
Test build #43537 has finished for PR 8563 at commit
|
Test build #43539 has finished for PR 8563 at commit
|
Test build #43545 has finished for PR 8563 at commit
|
Test build #43550 has finished for PR 8563 at commit
|
Test build #43549 has finished for PR 8563 at commit
|
@@ -312,7 +313,7 @@ class BlockMatrix @Since("1.3.0") (
   }

   /** Collects data and assembles a local dense breeze matrix (for test only). */
-  private[mllib] def toBreeze(): BDM[Double] = {
+  private [mllib] def toBreeze(): BDM[Double] = {
remove a space after private
Thank you for the comments Yu. I'll update these in the next few days. Best, Jerome |
@nilmeier thanks. This document would help you with contributing to Spark. |
I'll look through this, thank you. I usually follow the style guide
conventions while running the scripts, but I have missed some of the other
conventions.
Cheers, Jerome
|
Test build #46398 has finished for PR 8563 at commit
|
Test build #46395 has finished for PR 8563 at commit
|
Thanks for the pull request. I'm going through a list of pull requests to cut them down since the sheer number is breaking some of the tooling we have. Due to lack of activity on this pull request, I'm going to push a commit to close it. Feel free to reopen it or create a new one. @dbtsai there are a few pull requests that were waiting on your review. Can you revisit them even if they are closed? |
Yes, I hadn't heard back from anyone on this in some time... was this …
On Wed, Jun 15, 2016 at 3:08 PM, Reynold Xin wrote:
Jerome Nilmeier, PhD |
@dbtsai https://github.com/dbtsai Was this algorithm in okay shape? We …
On Wed, Jun 15, 2016 at 5:26 PM, Jerome Nilmeier wrote:
Jerome Nilmeier, PhD |
@mengxr and/or @shivaram may be reviewing.