[SPARK-18693][ML][MLLIB] ML Evaluators should use weight column #16557
Test build #71247 has finished for PR 16557 at commit
Test build #71270 has finished for PR 16557 at commit
It looks like a random test timed out:
Jenkins retest this please
Jenkins, retest this please
Test build #71272 has finished for PR 16557 at commit
Test build #71278 has finished for PR 16557 at commit
Test build #71291 has started for PR 16557 at commit
Jenkins, retest this please
Jenkins, retest this please
Force-pushed from 397c26b to 808ca6b
Jenkins, retest this please
Jenkins doesn't seem to be working ...
@sethah @Lewuathe @thunterdb @WeichenXu123 @jkbradley would you be able to take a look at the changes to add a weight column to binary/multiclass/regression evaluators/metrics classes? It looks like you are familiar with this code. Thank you!
Jenkins, retest this please
Test build #71331 has finished for PR 16557 at commit
Test build #71334 has finished for PR 16557 at commit
Force-pushed from 808ca6b to 228bbfb
Jenkins, retest this please
Test build #71523 has finished for PR 16557 at commit
Test build #71531 has finished for PR 16557 at commit
Jenkins, retest this please
Test build #71544 has finished for PR 16557 at commit
Jenkins, retest this please
Jenkins, retest this please
Force-pushed from e2873f2 to d589dfe
Test build #71604 has finished for PR 16557 at commit
Test build #71607 has finished for PR 16557 at commit
Test build #71670 has finished for PR 16557 at commit
Force-pushed from c1f5f09 to ba68f72
Jenkins, retest this please
ping @sethah @Lewuathe @thunterdb @WeichenXu123 @jkbradley would you be able to take a look at the changes to add a weight column to binary/multiclass/regression evaluators/metrics classes? It looks like you are familiar with this code. Thank you!
Test build #71858 has finished for PR 16557 at commit
Test build #71860 has finished for PR 16557 at commit
ping @sethah @Lewuathe @thunterdb @WeichenXu123 @jkbradley would you be able to take a look at the changes to add a weight column to binary/multiclass/regression evaluators/metrics classes? It looks like you are familiar with this code. Thank you!
ping @sethah @Lewuathe @thunterdb @WeichenXu123 @jkbradley would you be able to take a look at the changes to add a weight column to binary/multiclass/regression evaluators/metrics classes? It looks like you are familiar with this code. Thank you!
@srowen @yanboliang might you be able to take a look at this PR? Is it possibly too large, and should I break it up into 3 PRs, one per evaluator/metrics class?
I wouldn't ping that frequently, please. I don't feel qualified to review this myself.
+1 for breaking it up, maybe starting with regression. Also, just because something hasn't been reviewed in two weeks does not mean that there is no interest in it. Two weeks is not all that long (I've seen valuable PRs sit for much longer than that), and it likely just means people are busy. As Sean pointed out, it's probably not necessary to ping once per day.
I agree, let's break up this PR. It will go faster, and some changes may require longer discussions.
Force-pushed from 6dbb0ad to a0fc4c3
OK, I will close this and create three new PRs, one for each of the evaluators.
Test build #73525 has finished for PR 16557 at commit
…ed weight column for multiclass classification evaluator

## What changes were proposed in this pull request?

The evaluators BinaryClassificationEvaluator, RegressionEvaluator, and MulticlassClassificationEvaluator and the corresponding metrics classes BinaryClassificationMetrics, RegressionMetrics and MulticlassMetrics should use sample weight data.

I've closed the PR: #16557 as recommended in favor of creating three pull requests, one for each of the evaluators (binary/regression/multiclass) to make it easier to review/update.

Note: I've updated the JIRA to: https://issues.apache.org/jira/browse/SPARK-24101 which is a child of JIRA: https://issues.apache.org/jira/browse/SPARK-18693

## How was this patch tested?

I added tests to the metrics class.

Closes #17086 from imatiach-msft/ilmat/multiclass-evaluate.

Authored-by: Ilya Matiach <[email protected]>
Signed-off-by: Sean Owen <[email protected]>
…ed weight column for regression evaluator

## What changes were proposed in this pull request?

The evaluators BinaryClassificationEvaluator, RegressionEvaluator, and MulticlassClassificationEvaluator and the corresponding metrics classes BinaryClassificationMetrics, RegressionMetrics and MulticlassMetrics should use sample weight data.

I've closed the PR: apache#16557 as recommended in favor of creating three pull requests, one for each of the evaluators (binary/regression/multiclass) to make it easier to review/update.

The updates to the regression metrics were based on (and updated with new changes based on comments): https://issues.apache.org/jira/browse/SPARK-11520 ("RegressionMetrics should support instance weights") but the pull request was closed as the changes were never checked in.

## How was this patch tested?

I added tests to the metrics class.

Closes apache#17085 from imatiach-msft/ilmat/regression-evaluate.

Authored-by: Ilya Matiach <[email protected]>
Signed-off-by: Sean Owen <[email protected]>
…ed weight column for binary classification evaluator

## What changes were proposed in this pull request?

The evaluators BinaryClassificationEvaluator, RegressionEvaluator, and MulticlassClassificationEvaluator and the corresponding metrics classes BinaryClassificationMetrics, RegressionMetrics and MulticlassMetrics should use sample weight data.

I've closed the PR: #16557 as recommended in favor of creating three pull requests, one for each of the evaluators (binary/regression/multiclass) to make it easier to review/update.

## How was this patch tested?

I added tests to the metrics and evaluators classes.

Closes #17084 from imatiach-msft/ilmat/binary-evalute.

Authored-by: Ilya Matiach <[email protected]>
Signed-off-by: Sean Owen <[email protected]>
What changes were proposed in this pull request?
The evaluators BinaryClassificationEvaluator, RegressionEvaluator, and MulticlassClassificationEvaluator and the corresponding metrics classes BinaryClassificationMetrics, RegressionMetrics and MulticlassMetrics should use sample weight data.
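For readers unfamiliar with sample weights, here is a minimal sketch in plain Python of what a weighted classification metric could look like. It is not the Spark implementation: the helper name `weighted_accuracy` and the `(prediction, label, weight)` tuple layout are assumptions made for this illustration. The idea is that each row counts in proportion to its weight, so with all weights equal the result reduces to ordinary accuracy.

```python
from typing import Iterable, Tuple


def weighted_accuracy(rows: Iterable[Tuple[float, float, float]]) -> float:
    """Accuracy in which each (prediction, label, weight) row counts `weight` times.

    Hypothetical helper for illustration only; not part of Spark's API.
    """
    correct = 0.0
    total = 0.0
    for prediction, label, weight in rows:
        total += weight
        if prediction == label:
            correct += weight
    if total == 0.0:
        raise ValueError("sum of weights must be positive")
    return correct / total


# Three rows; the misclassified middle row carries three times the weight.
rows = [
    (0.0, 0.0, 1.0),
    (1.0, 0.0, 3.0),
    (1.0, 1.0, 1.0),
]
print(weighted_accuracy(rows))
```

Here the heavily weighted wrong prediction pulls the score below the unweighted accuracy of the same three rows.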
The updates to the regression metrics are based on the earlier work for https://issues.apache.org/jira/browse/SPARK-11520 ("RegressionMetrics should support instance weights"), with new changes that address the review comments made there. That pull request was closed without its changes ever being checked in.
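As a rough illustration of instance weights in regression metrics, the following plain-Python sketch computes weighted MSE, RMSE, and MAE. It is an assumption-laden example, not the code in this PR: the function name `weighted_regression_metrics` and the `(prediction, label, weight)` row layout are hypothetical, and the weighting scheme shown (weighted sums divided by the sum of weights) is one common convention.

```python
import math
from typing import Dict, Iterable, Tuple


def weighted_regression_metrics(
    rows: Iterable[Tuple[float, float, float]]
) -> Dict[str, float]:
    """Weighted MSE, RMSE and MAE over (prediction, label, weight) rows.

    Hypothetical helper for illustration only; not part of Spark's API.
    """
    sum_weights = 0.0
    sum_squared = 0.0
    sum_absolute = 0.0
    for prediction, label, weight in rows:
        error = prediction - label
        sum_weights += weight
        sum_squared += weight * error * error
        sum_absolute += weight * abs(error)
    if sum_weights == 0.0:
        raise ValueError("sum of weights must be positive")
    mse = sum_squared / sum_weights
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mae": sum_absolute / sum_weights,
    }


# Errors are 1 and -2; the second row carries three times the weight.
metrics = weighted_regression_metrics([(2.0, 1.0, 1.0), (3.0, 5.0, 3.0)])
print(metrics)
```

Setting every weight to 1.0 gives the usual unweighted metrics, which is a convenient sanity check when adding tests for a weighted variant.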
How was this patch tested?
This is still a work in progress; I will be adding more tests soon. I took the regression tests from #9907, which was closed as a stale PR, and updated them with some changes.