This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Changes to mxnet.metric #18083

Merged
merged 27 commits into from
May 14, 2020

Conversation

@acphile (Contributor) commented Apr 16, 2020

Description

change based on #18046

  1. make improvements in metric:
    a. improve class MAE (and MSE, RMSE)
    b. improve class _BinaryClassification
    c. improve class TopKAccuracy
    d. add class MeanCosineSimilarity
    e. add class MeanPairwiseDistance
  2. move mxnet.metric to mxnet.gluon.metric
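As a rough illustration of the two new metrics, here is a minimal standalone sketch of the quantities MeanCosineSimilarity and MeanPairwiseDistance compute (plain NumPy, hypothetical helper names; the actual classes accumulate these values across `update()` calls):

```python
import numpy as np

def mean_cosine_similarity(labels, preds, eps=1e-12):
    # Mean cosine similarity between corresponding rows of labels and preds.
    labels = np.asarray(labels, dtype=np.float64)
    preds = np.asarray(preds, dtype=np.float64)
    dots = (labels * preds).sum(axis=-1)
    norms = np.linalg.norm(labels, axis=-1) * np.linalg.norm(preds, axis=-1)
    return float((dots / np.maximum(norms, eps)).mean())

def mean_pairwise_distance(labels, preds, p=2):
    # Mean p-norm distance between corresponding rows of labels and preds.
    labels = np.asarray(labels, dtype=np.float64)
    preds = np.asarray(preds, dtype=np.float64)
    return float(np.linalg.norm(labels - preds, ord=p, axis=-1).mean())
```

The `eps` guard mirrors the `numpy.maximum(..., 1e-12)` pattern used elsewhere in this PR to avoid division by zero.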

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain what the example does, the source of the dataset, expected performance on the test set, and a reference to the original paper if applicable
  • Check the API doc at https://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

  • Feature1, tests, (and when applicable, API doc)
  • Feature2, tests, (and when applicable, API doc)

Comments

  • If this change is a backward incompatible change, why must this change be made.
  • Interesting edge cases to note here

@acphile acphile requested a review from szha as a code owner April 16, 2020 09:56
@mxnet-bot

Hey @acphile , Thanks for submitting the PR
All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands:

  • To trigger all jobs: @mxnet-bot run ci [all]
  • To trigger specific jobs: @mxnet-bot run ci [job1, job2]

CI supported jobs: [centos-cpu, sanity, miscellaneous, centos-gpu, windows-cpu, website, unix-gpu, edge, windows-gpu, clang, unix-cpu]


Note:
Only the following 3 categories can trigger CI: PR Author, MXNet Committer, Jenkins Admin.
All CI tests must pass before the PR can be merged.

        for label, pred in zip(labels, preds):
            self.metrics.update_binary_stats(label, pred)

        if self.average == "macro":
Member:

In fact, macro averaging for F1 does not mean averaging the F1 of each batch. I think we should revise it to match https://scikit-learn.org/stable/modules/generated/sklearn.metrics.f1_score.html .

Contributor Author:

Averaging F1 per batch already existed in metric.py before I made changes. The same calculation also exists in MAE, MSE, RMSE, and PearsonCorrelation. Should I remove all of them accordingly? In sklearn, the "macro" average seems to be used when calculating the F1 score for multiclass/multilabel targets, but currently our F1 only supports binary classification. I think I need to extend F1.

Member:

Let's remove it and make it similar to sklearn. This is in fact the reason why I never use the metric class in MXNet.


mae = numpy.abs(label - pred).mean()

if self.average == "macro":

@leezu (Contributor) left a comment:

Thank you @acphile! Some comments

  1. @sxjscience suggested compatibility with sklearn's metrics. If so, we should have a mechanism to ensure compatibility/correctness. One way is to add tests that compare the output of sklearn to the output of the gluon metric for different inputs. Such tests may even include random data to cover edge cases (cf. https://en.wikipedia.org/wiki/Fuzzing)

  2. We currently support get() vs. get_global() and reset() vs. reset_local(), but the global functionality is not used anywhere in MXNet and there may be no good, widely-used use case for it. To make our metric API more pythonic and easier to understand, we may remove the global support.

  3. @sxjscience suggests removing the macro support because it is not correct and not widely used

  4. Your code needs to pass the sanity checks for coding style http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Fsanity/detail/PR-18083/1/pipeline

            return 2 * self.precision * self.recall / (self.precision + self.recall)
        else:
            return 0.
        return (1 + self.beta ** 2) * self.precision * self.recall / numpy.maximum(self.beta ** 2 * self.precision + self.recall, 1e-12)

@property
def global_fscore(self):
Contributor:

This method should be removed as you dropped the global states?

Contributor Author:

This method actually refers to the micro calculation of F1; it is not related to the original global support.

Contributor:

Would it make sense to adjust the name?

Contributor Author:

I think it is ok to use global_fscore since it is in a private container class.
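For reference, the "micro" calculation mentioned here pools true/false positive counts across classes before computing one F1, whereas "macro" averages per-class F1 scores. A hedged sketch of the sklearn-style definitions (plain NumPy, hypothetical helper names, not the PR's actual code):

```python
import numpy as np

def per_class_counts(labels, preds, num_classes):
    # Per-class true positives, false positives, false negatives.
    labels = np.asarray(labels)
    preds = np.asarray(preds)
    tp = np.array([np.sum((preds == c) & (labels == c)) for c in range(num_classes)], dtype=float)
    fp = np.array([np.sum((preds == c) & (labels != c)) for c in range(num_classes)], dtype=float)
    fn = np.array([np.sum((preds != c) & (labels == c)) for c in range(num_classes)], dtype=float)
    return tp, fp, fn

def f1_score(labels, preds, num_classes, average="micro", eps=1e-12):
    tp, fp, fn = per_class_counts(labels, preds, num_classes)
    if average == "micro":
        # Pool counts over all classes first, then compute a single F1.
        p = tp.sum() / max(tp.sum() + fp.sum(), eps)
        r = tp.sum() / max(tp.sum() + fn.sum(), eps)
        return 2 * p * r / max(p + r, eps)
    # "macro": compute F1 per class, then take the unweighted mean.
    p = tp / np.maximum(tp + fp, eps)
    r = tp / np.maximum(tp + fn, eps)
    return float((2 * p * r / np.maximum(p + r, eps)).mean())
```

For single-label multiclass inputs, micro F1 coincides with accuracy, while macro F1 weights every class equally regardless of its frequency.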

@@ -24,9 +24,9 @@

import numpy
Contributor:

Instead of using numpy, we can use mxnet.numpy, as it runs asynchronously and has GPU support.
The summary state of a metric should be stored on CPU, but if, for example, data and label are on GPU when passed to the metric, we can compute the sufficient statistics on GPU.
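The "sufficient statistics" idea, sketched in plain NumPy with a hypothetical StreamingMAE class (not the PR's actual code): only a running error sum and count are stored, so the per-batch reduction could equally run on GPU via mxnet.numpy, with just two scalars moved to the CPU-side state:

```python
import numpy as np

class StreamingMAE:
    # Keep only sufficient statistics (sum of absolute errors and element
    # count) instead of averaging per-batch MAEs, so the result is exact
    # regardless of batch sizes.
    def __init__(self):
        self.sum_abs_err = 0.0
        self.num = 0

    def update(self, labels, preds):
        labels = np.asarray(labels, dtype=np.float64)
        preds = np.asarray(preds, dtype=np.float64)
        # This reduction is the part that could run on GPU in MXNet.
        self.sum_abs_err += float(np.abs(labels - preds).sum())
        self.num += labels.size

    def get(self):
        return self.sum_abs_err / max(self.num, 1)
```

Note that averaging per-batch MAEs would weight small batches the same as large ones; accumulating the raw sum and count avoids that bias.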

        label = label.as_np_ndarray().astype('int32')
        if self.class_type == "binary":
            self._set(1)
            if len(numpy.unique(label)) > 2:
Contributor:

This will trigger synchronization (as we need to wait for the result of the np.unique operator). Could we make error checking that triggers synchronization optional?
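One way to make the check opt-in, sketched with a hypothetical helper (plain NumPy; in MXNet the unchecked path would avoid the device synchronization that np.unique forces):

```python
import numpy as np

def validate_binary_labels(label, check_labels=False):
    # Optionally verify that a binary-classification label array contains
    # at most two distinct values. The check has to read the data (forcing
    # a synchronization on asynchronous backends), so it is off by default.
    label = np.asarray(label)
    if check_labels and len(np.unique(label)) > 2:
        raise ValueError("binary classification expects at most 2 distinct label values")
    return label
```

With `check_labels=False` the metric update stays fully asynchronous; users can enable the flag while debugging their input pipeline.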


@acphile (Contributor Author) commented Apr 28, 2020

@mxnet-bot run ci [centos-cpu, sanity, centos-gpu, windows-cpu, unix-gpu, windows-gpu, unix-cpu]

@mxnet-bot

Jenkins CI successfully triggered : [centos-gpu, centos-cpu, windows-gpu, unix-cpu, sanity, unix-gpu, windows-cpu]

@leezu (Contributor) commented Apr 30, 2020

Let's disable it here, because it blocks this PR

@leezu (Contributor) commented Apr 30, 2020

You can go ahead and try to reproduce the issue locally:

A reproducer has been available for more than a month at #17886 (comment)

@marcoabreu (Contributor)

No, it's unrelated and should be a separate, isolated PR. Each PR should serve one purpose. That way we can focus discussions, keep commits single-purpose, and allow clean reverts.

@leezu (Contributor) commented Apr 30, 2020

Let's disable it in #18204

@acphile (Contributor Author) commented May 1, 2020

@mxnet-bot run ci [unix-gpu]

@mxnet-bot

Jenkins CI successfully triggered : [unix-gpu]

@acphile (Contributor Author) commented May 7, 2020

@mxnet-bot run ci [unix-cpu, windows-gpu]

@mxnet-bot

Jenkins CI successfully triggered : [unix-cpu, windows-gpu]

@@ -29,7 +28,12 @@
def get_classif_model(model_name, use_tensorrt, ctx=mx.gpu(0), batch_size=128):
    mx.contrib.tensorrt.set_use_fp16(False)
    h, w = 32, 32
    net = gluoncv.model_zoo.get_model(model_name, pretrained=True)
    model_url = "https://raw.githubusercontent.com/dmlc/web-data/master/gluoncv/models/"
@leezu (Contributor) commented May 7, 2020:

Please don't hardcode master in the URL here. The repository may change and will then break the CI. Instead, use the commit ID: https://raw.githubusercontent.com/dmlc/web-data/221ce5b7c6d5b0777a1e3471f7f03ff98da90a0a/gluoncv/models

@acphile (Contributor Author) commented May 8, 2020

@mxnet-bot run ci [windows-gpu]

@mxnet-bot

Jenkins CI successfully triggered : [windows-gpu]

tests/python/train/test_mlp.py (resolved)
tests/python/unittest/test_metric.py (outdated, resolved)
@acphile (Contributor Author) commented May 9, 2020

@mxnet-bot run ci [unix-cpu]

@mxnet-bot

Jenkins CI successfully triggered : [unix-cpu]

@leezu leezu merged commit effbb8b into apache:master May 14, 2020
leezu added a commit that referenced this pull request May 14, 2020
@mseth10 (Contributor) commented May 14, 2020

@acphile this PR fails nightly CD while running nightly python unit tests. The following tests fail:
test_mcc, test_multilabel_f1, test_binary_f1
http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/restricted-mxnet-cd%2Fmxnet-cd-release-job/detail/mxnet-cd-release-job/1119/pipeline/329

We'll need to revert this PR and fix the failures before re-merging it. Here's the link to revert PR: #18318

mseth10 added a commit to mseth10/incubator-mxnet that referenced this pull request May 14, 2020
@leezu (Contributor) commented May 14, 2020

The reason it fails is that the CI check on this PR ran too long ago; I should have restarted it before merging. In the meantime master changed, making some additional fixes necessary. They are in https://github.com/apache/incubator-mxnet/pull/18312/files

@leezu leezu mentioned this pull request May 27, 2020
AntiZpvoh pushed a commit to AntiZpvoh/incubator-mxnet that referenced this pull request Jul 6, 2020
* finish 5 changes

* move metric.py to gluon, replace mx.metric with mx.gluon.metric in python/mxnet/

* fix importError

* replace mx.metric with mx.gluon.metric in tests/python

* remove global support

* remove macro support

* rewrite BinaryAccuracy

* extend F1 to multiclass/multilabel

* add tests for new F1, remove global tests

* use mxnet.numpy instead of numpy

* fix sanity

* rewrite ce and ppl, improve some details

* use mxnet.numpy.float64

* remove sklearn

* remove reset_local() and get_global in other files

* fix test_mlp

* replace mx.metric with mx.gluon.metric in example

* fix context difference

* Disable -DUSE_TVM_OP on GPU builds

* Fix disable tvm op for gpu runs

* use label.ctx in metric.py; remove gluoncv dependency in test_cvnets

* fix sanity

* fix importError

* remove nose

Co-authored-by: Ubuntu <[email protected]>
Co-authored-by: Leonard Lausen <[email protected]>
chinakook added a commit to chinakook/mxnet that referenced this pull request Nov 23, 2020
7 participants