
test_metric_performance #18330

Open
leezu opened this issue May 15, 2020 · 4 comments


@leezu
Contributor

leezu commented May 15, 2020

http://jenkins.mxnet-ci.amazon-ml.com/blue/rest/organizations/jenkins/pipelines/mxnet-validation/pipelines/unix-cpu/branches/PR-18284/runs/6/nodes/364/steps/731/log/?start=0


[2020-05-14T23:42:14.444Z] =================================== FAILURES ===================================
[2020-05-14T23:42:14.444Z] ___________________________ test_metric_performance ____________________________
[2020-05-14T23:42:14.444Z] [gw1] linux -- Python 3.6.9 /usr/bin/python3
[2020-05-14T23:42:14.444Z] 
[2020-05-14T23:42:14.444Z]     def test_metric_performance():
[2020-05-14T23:42:14.444Z]         """ unittest entry for metric performance benchmarking """
[2020-05-14T23:42:14.444Z]         # Each dictionary entry is (metric_name:(kwargs, DataGenClass))
[2020-05-14T23:42:14.444Z]         metrics = [
[2020-05-14T23:42:14.444Z]             ('acc', ({}, MetricDataGen)),
[2020-05-14T23:42:14.444Z]             ('top_k_acc', ({'top_k': 5}, MetricDataGen)),
[2020-05-14T23:42:14.444Z]             ('F1', ({}, F1MetricDataGen)),
[2020-05-14T23:42:14.444Z]             ('Perplexity', ({'ignore_label': -1}, MetricDataGen)),
[2020-05-14T23:42:14.444Z]             ('MAE', ({}, MetricDataGen)),
[2020-05-14T23:42:14.444Z]             ('MSE', ({}, MetricDataGen)),
[2020-05-14T23:42:14.444Z]             ('RMSE', ({}, MetricDataGen)),
[2020-05-14T23:42:14.444Z]             ('ce', ({}, MetricDataGen)),
[2020-05-14T23:42:14.444Z]             ('nll_loss', ({}, MetricDataGen)),
[2020-05-14T23:42:14.444Z]             ('pearsonr', ({}, PearsonMetricDataGen)),
[2020-05-14T23:42:14.444Z]         ]
[2020-05-14T23:42:14.444Z]     
[2020-05-14T23:42:14.444Z]         data_size = 1024 * 128
[2020-05-14T23:42:14.444Z]     
[2020-05-14T23:42:14.444Z]         batch_sizes = [16, 64, 256, 1024]
[2020-05-14T23:42:14.444Z]         output_dims = [128, 1024, 8192]
[2020-05-14T23:42:14.444Z]         ctxs = [mx.cpu(), mx.gpu()]
[2020-05-14T23:42:14.444Z]     
[2020-05-14T23:42:14.444Z]         print("\nmx.gluon.metric benchmarks", file=sys.stderr)
[2020-05-14T23:42:14.444Z]         print(
[2020-05-14T23:42:14.444Z]             "{:15}{:10}{:12}{:12}{:15}{:15}{}".format(
[2020-05-14T23:42:14.444Z]                 'Metric', 'Data-Ctx', 'Label-Ctx', 'Data Size', 'Batch Size', 'Output Dim', 'Elapsed Time'),
[2020-05-14T23:42:14.444Z]             file=sys.stderr)
[2020-05-14T23:42:14.444Z]         print("{:-^90}".format(''), file=sys.stderr)
[2020-05-14T23:42:14.444Z]         for k, v in metrics:
[2020-05-14T23:42:14.444Z]             for c in output_dims:
[2020-05-14T23:42:14.444Z]                 for n in batch_sizes:
[2020-05-14T23:42:14.444Z]                     for pred_ctx, label_ctx in itertools.product(ctxs, ctxs):
[2020-05-14T23:42:14.444Z] >                       run_metric(k, v[1], (data_size * 128)//(n * c), n, c, pred_ctx, label_ctx, **v[0])
[2020-05-14T23:42:14.444Z] 
[2020-05-14T23:42:14.444Z] tests/python/unittest/test_metric_perf.py:118: 
[2020-05-14T23:42:14.444Z] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[2020-05-14T23:42:14.444Z] tests/python/unittest/test_metric_perf.py:76: in run_metric
[2020-05-14T23:42:14.444Z]     mx.nd.waitall()
[2020-05-14T23:42:14.444Z] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[2020-05-14T23:42:14.444Z] 
[2020-05-14T23:42:14.444Z]     def waitall():
[2020-05-14T23:42:14.444Z]         """Wait for all async operations to finish in MXNet.
[2020-05-14T23:42:14.444Z]     
[2020-05-14T23:42:14.444Z]         This function is used for benchmarking only.
[2020-05-14T23:42:14.444Z]     
[2020-05-14T23:42:14.444Z]         .. note::
[2020-05-14T23:42:14.444Z]     
[2020-05-14T23:42:14.444Z]            If your mxnet code throws an exception, then waitall can cause performance impact.
[2020-05-14T23:42:14.444Z]         """
[2020-05-14T23:42:14.444Z] >       check_call(_LIB.MXNDArrayWaitAll())
[2020-05-14T23:42:14.444Z] E       Failed: Timeout >1200.0s
[2020-05-14T23:42:14.444Z] 
[2020-05-14T23:42:14.444Z] python/mxnet/ndarray/ndarray.py:211: Failed
[2020-05-14T23:42:14.444Z] ---------------------------- Captured stderr setup -----------------------------

The metrics now rely on the MXNet numpy implementation and may be slower until MXNet overhead is reduced.

@acphile, should the timeout on this test be temporarily increased?
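
If so, a per-test override is one option. A minimal sketch, assuming the CI enforces the limit via pytest-timeout (the `Failed: Timeout >1200.0s` message matches that plugin's output); the 2400-second value is only illustrative:

```python
import pytest

# Assumes pytest-timeout is installed in the CI image; the marker overrides
# the global 1200 s limit for this single test. 2400 s is an illustrative value.
@pytest.mark.timeout(2400)
def test_metric_performance():
    ...
```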

@leezu
Contributor Author

leezu commented May 15, 2020

Notice that this timeout only happens on the Python3: MKL-CPU jobs. That job hits the 1200-second global timeout, whereas the test finishes in under 300 seconds on the non-MKL Python 3 CPU job. So the problem is due to #18244 (the test now relies on MXNet numpy, whereas it relied on upstream numpy before).

cc @TaoLv
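
For context, a rough sketch of what that change means for the metric updates (illustrative only, not the actual diff from #18244): element-wise work that previously ran on host NumPy arrays now goes through `mx.np`, so every update dispatches through the MXNet engine and picks up per-operator overhead.

```python
import mxnet as mx

labels = mx.np.array([0, 1, 1, 0])
preds = mx.np.array([0, 1, 0, 0])

# Previously (upstream numpy): computed eagerly on the host.
acc_host = (labels.asnumpy() == preds.asnumpy()).mean()

# Now (MXNet numpy): each operator is queued on the asynchronous MXNet engine.
acc_mx = (labels == preds).astype('float32').mean()
mx.nd.waitall()  # block until the queued work finishes, as the test does
print(acc_host, float(acc_mx))
```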

@leezu
Contributor Author

leezu commented May 15, 2020

I suggest we mark this test as xfail for MKL builds for now.
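
A sketch of what that could look like; the MKLDNN detection via `mx.runtime.Features()` is an assumption about how to identify an MKL build, not existing test code:

```python
import pytest
import mxnet as mx

# Assumption: mxnet.runtime.Features() reports compile-time build flags,
# so an MKL(-DNN) build can be detected at collection time.
MKL_BUILD = mx.runtime.Features().is_enabled('MKLDNN')

@pytest.mark.xfail(MKL_BUILD, reason='metric perf regression on MKL builds, see #18330')
def test_metric_performance():
    ...
```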

@leezu
Contributor Author

leezu commented May 15, 2020

The test doesn't actually enforce anything. Its output is not monitored (AFAIK), so it may be best to simply delete this test.
