[LogisticRegressionMG] Support standardization with no data modification #5724
Conversation
Hey @lijinf2. Your changes look great for the most part. I'd still like to see the mg standardization pieces pulled out eventually. Most of my feedback is on the quality of the pytests, but I think the fixes should be straightforward.
cpp/src/glm/qn/mg/glm_base_mg.cuh (outdated):
true,
raft::mul_op(),
stream);
raft::resource::sync_stream(*(this->handle_p));
We shouldn't need to sync here since we're not copying anything to host to be read directly after.
Good point. Revised.
SimpleDenseMat<T> mean_mat(mean_vector, 1, D);

// calculate mean
rmm::device_uvector<T> ones(num_rows, stream);
It would be really nice if we could encapsulate this normalization computation so that it can be reused in RAFT. I understand now the complexities involved in refactoring. For example, I often forget about the SimpleMat because it's buried so deep in cuML's other APIs (and only used in the qn solvers).
Still, we are going to start moving some of the mnmg primitives over to RAFT, hopefully in the not-so-distant future. This also includes k-means and whatnot.
Sounds good. Seems we need a PR to RAFT, then a PR to cuML, to revise this part.
Thinking of getting it done in the next release. Created an issue ticket to track this: #5739.
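For context, the mean computation being discussed reduces to a ones-vector GEMV over the data matrix. A minimal numpy sketch of the idea (a host-side stand-in; the actual code runs on device through SimpleDenseMat, and the MG path would additionally allreduce the per-worker sums):

```python
import numpy as np

def column_means_via_ones(X):
    # ones^T @ X / n gives per-column means in a single GEMV,
    # mirroring the device-side ones-vector trick in the diff
    n_rows = X.shape[0]
    ones = np.ones(n_rows, dtype=X.dtype)
    return (ones @ X) / n_rows

X = np.arange(12, dtype=np.float32).reshape(4, 3)
assert np.allclose(column_means_via_ones(X), X.mean(axis=0))
```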
"max_iter": max_iter, | ||
} | ||
|
||
X_origin = np.array( |
This is going to lead to tests which are brittle and hard to maintain. Please generate this data and compute the naive version of the expected result (you can do the standardization up front and use single-GPU logistic regression). Please do this instead of hardcoding values. We seldom resort to hardcoding, and only when a reasonable way to write a naive test doesn't exist.
Tried generating a random sparse matrix of any size for classification. Please check!
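A sketch of the naive-reference pattern being suggested (names like make_naive_reference are illustrative, and sklearn stands in for the single-GPU reference; the real test would fit the dask-based LogisticRegressionMG with standardization enabled on the raw data and compare against this):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

def make_naive_reference(n_rows=1000, n_cols=100, n_classes=10, seed=0):
    # generate data instead of hardcoding it
    X, y = make_classification(
        n_samples=n_rows,
        n_features=n_cols,
        n_informative=n_cols // 2,
        n_classes=n_classes,
        random_state=seed,
    )
    X = X.astype(np.float32)
    # do the standardization up front ...
    X_std = (X - X.mean(axis=0)) / X.std(axis=0)
    # ... and fit a single-process model on it as the expected result
    ref = LogisticRegression().fit(X_std, y)
    return X, y, ref
```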
"max_iter": max_iter, | ||
} | ||
|
||
X = np.array( |
Please see below for comments on hardcoding these values and generating larger test cases.
Revised.
)
@pytest.mark.parametrize("datatype", [np.float32])
@pytest.mark.parametrize("delayed", [False])
@pytest.mark.parametrize("ncol_and_nclasses", [(2, 2), (6, 4)])
Can we test a few different variations here please? Even just a couple of higher numbers like (100, 10) would help.
Sure! Just added (100, 10).
datatype,
)

X_origin = np.ascontiguousarray(X_origin.T)
Please generate larger arrays for testing, especially when sparse. It doesn't have to be massive, but larger than 4x5 would be helpful (like 100x25 or 1000x100).
Yeah, added a function to generate a sparse matrix of any size for multi-class classification.
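A possible shape for such a generator (hypothetical helper name and defaults; the PR's actual function may differ):

```python
import numpy as np
import scipy.sparse

def make_sparse_classification(n_rows, n_cols, n_classes, density=0.1, seed=42):
    # random CSR matrix of any requested size plus multi-class labels
    rng = np.random.RandomState(seed)
    X = scipy.sparse.random(
        n_rows, n_cols, density=density, format="csr",
        dtype=np.float32, random_state=rng,
    )
    y = rng.randint(n_classes, size=n_rows).astype(np.float32)
    return X, y

X, y = make_sparse_classification(1000, 100, n_classes=10)
```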
LGTM! Thanks for creating the issue to pull out the normalization pieces.
Force-pushed:
…uous coef, support fit_intercept=False adaptation
…ed and all tests passed with and without standardization
The key idea is to modify coefficients in linearFwd to get the same predictions, and modify the gradients in linearBwd to get the same gradients.
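In other words, for standardized features x' = (x - mu) / sigma, the same logits and gradients can be produced from the raw data by rescaling the coefficients and the gradient. A small numpy check of that algebra (illustrative only; the PR implements this on device in linearFwd/linearBwd):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(2.0, 3.0, size=(8, 4))  # raw, unstandardized data
w = rng.normal(size=4)
b = 0.5
mu, sigma = X.mean(axis=0), X.std(axis=0)

# Forward: w @ x' + b == (w / sigma) @ x + (b - (w / sigma) @ mu),
# so rescaling the coefficients reproduces the standardized logits
z_standardized = ((X - mu) / sigma) @ w + b
z_no_copy = X @ (w / sigma) + (b - (w / sigma) @ mu)
assert np.allclose(z_standardized, z_no_copy)

# Backward: for upstream gradient g, the coefficient gradient on
# standardized data is X'^T g = (X^T g - mu * sum(g)) / sigma,
# so the raw-data gradient just needs the same rescaling
g = rng.normal(size=8)
grad_standardized = ((X - mu) / sigma).T @ g
grad_no_copy = (X.T @ g - mu * g.sum()) / sigma
assert np.allclose(grad_standardized, grad_no_copy)
```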