[SPARK-10757] [MLlib] Java friendly constructor for distributed matrices #9159
Can one of the admins verify this patch? |
@mengxr Could you add @Javelinjs to the whitelist? |
@Javelinjs Could you add unit test cases for this function? Then Jenkins can run the tests for your patch once @mengxr has added you to the whitelist. |
@yanboliang Sure I'll work on it soon. Thank you. |
It looks good, but I'm curious whether we can find a more elegant way, like this:
class BlockMatrix[M <: Matrix] @Since("1.3.0") (
    @Since("1.3.0") val blocks: RDD[((Int, Int), M)],
    @Since("1.3.0") val rowsPerBlock: Int,
    @Since("1.3.0") val colsPerBlock: Int,
    private var nRows: Long,
    private var nCols: Long) extends DistributedMatrix with Logging {
}
Can we cover all of these cases together with one unified constructor? |
@yanboliang Your constructor does build. But this implies exposing [...]. In my opinion, BlockMatrix should hide the underlying implementation of its sub-matrices. Please let me know if you see any problem with this idea. |
8dbd4c5 to 7260df6
@yanboliang @mengxr I've added some test cases. |
Thanks for the pull request. I'm going through a list of pull requests to cut them down since the sheer number is breaking some of the tooling we have. Due to lack of activity on this pull request, I'm going to push a commit to close it. Feel free to reopen it or create a new one. |
This patch allows Java developers to construct BlockMatrix and RowMatrix more naturally.
Take BlockMatrix as an example: the type of 'blocks' in the current constructor is RDD[((Int, Int), Matrix)], while more precisely it should be RDD[((Int, Int), T <: Matrix)]. This can cause problems. If we add a Java-friendly constructor
and then construct a BlockMatrix with RDD[((Int, Int), DenseMatrix)], the compiler will complain about a conflict.
On the other hand, BlockMatrix should hide the underlying implementation of its sub-matrices, whether they are dense or sparse. Thus, a class-level covariant type parameter for Matrix is inappropriate. That's why I added a companion object method to construct BlockMatrix/RowMatrix.
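The trade-off described above can be illustrated with a minimal, Spark-free sketch. Everything in it is a stand-in and an assumption made for illustration: `FakeRDD` only mimics the invariance of `RDD[T]`, the `Matrix` types are toy classes, and the factory name `from` is hypothetical rather than the method this patch actually adds.

```scala
// Toy stand-ins, NOT Spark's real classes.
trait Matrix { def numRows: Int; def numCols: Int }
final case class DenseMatrix(numRows: Int, numCols: Int) extends Matrix
final case class SparseMatrix(numRows: Int, numCols: Int) extends Matrix

// Invariant in T, like RDD[T]: FakeRDD[DenseMatrix] is not a FakeRDD[Matrix].
final class FakeRDD[T](val items: Seq[T]) {
  def map[U](f: T => U): FakeRDD[U] = new FakeRDD(items.map(f))
}

// The class itself has no type parameter, so callers never see
// whether the blocks are dense or sparse.
class BlockMatrix(
    val blocks: FakeRDD[((Int, Int), Matrix)],
    val rowsPerBlock: Int,
    val colsPerBlock: Int)

object BlockMatrix {
  // Hypothetical factory: the type parameter lives on the method only.
  // Each block is widened to Matrix before the real constructor is called.
  def from[M <: Matrix](
      blocks: FakeRDD[((Int, Int), M)],
      rowsPerBlock: Int,
      colsPerBlock: Int): BlockMatrix = {
    val widened: FakeRDD[((Int, Int), Matrix)] =
      blocks.map { case (idx, m) => (idx, m: Matrix) }
    new BlockMatrix(widened, rowsPerBlock, colsPerBlock)
  }
}

object Demo {
  def main(args: Array[String]): Unit = {
    val dense = new FakeRDD(Seq(((0, 0), DenseMatrix(2, 2)), ((0, 1), DenseMatrix(2, 2))))
    // new BlockMatrix(dense, 2, 2)   // rejected: FakeRDD is invariant in T
    val bm = BlockMatrix.from(dense, 2, 2) // accepted: widened inside the factory
    println(bm.blocks.items.size)
  }
}
```

In this sketch the method-level type parameter accepts any sub-matrix type at the call site, while the resulting `BlockMatrix` still exposes only `Matrix`.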
For IndexedRowMatrix/CoordinateMatrix, the RDD parameters are simple and involve no type inheritance, so I don't see the need to add Java-friendly APIs for those two. Or do we need them so that Java users can construct IndexedRowMatrix/CoordinateMatrix in the same way as BlockMatrix/RowMatrix?