[FLAKY] relay/test_op_level1.py::test_binary_op failing intermittently #6417

Closed
leandron opened this issue Sep 7, 2020 · 9 comments

Comments

@leandron
Contributor

leandron commented Sep 7, 2020

As part of #6302, it seems there is another flaky test that is unrelated to the changes being submitted there: tests/python/relay/test_op_level1.py::test_binary_op.

You can see the full stack trace on Jenkins: https://ci.tvm.ai/blue/organizations/jenkins/tvm/detail/PR-6302/14/pipeline/185

The error message I see is:

>               np.testing.assert_allclose(op_res.asnumpy(), ref_res, rtol=0.01)
E               AssertionError: 
E               Not equal to tolerance rtol=0.01, atol=0
E               
E               Mismatched elements: 1 / 250 (0.4%)
E               Max absolute difference: 0.00023794
E               Max relative difference: 0.01051613
E                x: array([[[1.909631e-01, 2.819712e-02, 6.732421e-01, 6.087055e-01,
E                        7.002974e-02],
E                       [3.610725e-03, 6.303363e-01, 2.848548e-03, 1.215679e-01,...
E                y: array([[[1.9092e-01, 2.8198e-02, 6.7334e-01, 6.0889e-01, 7.0007e-02],
E                       [3.6106e-03, 6.3037e-01, 2.8477e-03, 1.2158e-01, 7.1533e-01],
E                       [2.4643e-02, 6.6943e-01, 7.9880e-03, 3.1323e-01, 7.0996e-01],...

I believe we will need to increase the tolerance, but I don't know how large an increase would be adequate, so I'll just report it here.
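
For context on the tolerance question, here is a minimal sketch of the elementwise check that `np.testing.assert_allclose` documents: `|actual - desired| <= atol + rtol * |desired|`. The helper name `within_tolerance` and the sample values are hypothetical and are not taken from the CI log. The point is only that with `atol=0` the allowed absolute error shrinks with the magnitude of the reference value, so small elements are the first to fail.

```python
import numpy as np


def within_tolerance(actual, desired, rtol=0.01, atol=0.0):
    """Elementwise |actual - desired| <= atol + rtol * |desired| (hypothetical helper)."""
    actual = np.asarray(actual, dtype="float64")
    desired = np.asarray(desired, dtype="float64")
    return np.abs(actual - desired) <= atol + rtol * np.abs(desired)


# Illustrative values only: one small-magnitude element, one larger element.
actual = np.array([0.02286, 0.6732])
desired = np.array([0.02262, 0.6733])

# With atol=0 the small element is allowed an error of only about 2.3e-4,
# so it fails while the larger element passes.
print(within_tolerance(actual, desired))

# A small absolute tolerance makes both elements pass.
print(within_tolerance(actual, desired, atol=1e-3))
```

Whether a nonzero `atol` or a larger `rtol` is the right fix depends on the root cause, which this sketch does not establish.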

cc @tqchen @merrymercy

@tqchen
Member

tqchen commented Sep 7, 2020

My guess is that the problem is due to floordivide/floormod, where there is a boundary case, so increasing the tolerance may not help. We need to explicitly change the input generator to avoid such boundary cases. Some existing topi test cases already do this.
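
One way such a generator change could look, as a sketch: construct the inputs so that the quotient `x / y` always has a fractional part well away from 0 and 1, so that a last-bit rounding difference between two implementations cannot move `floor(x / y)` to a neighbouring integer. The function name, value ranges, and `margin` parameter are assumptions for illustration and are not taken from the TVM or topi test suites.

```python
import numpy as np


def sample_floor_div_inputs(shape, margin=0.1, seed=0):
    """Hypothetical generator: returns x, y and the intended floor(x / y)."""
    rng = np.random.default_rng(seed)
    y = rng.uniform(0.5, 2.0, size=shape)                  # divisor, kept away from zero
    q = rng.integers(0, 5, size=shape)                     # intended integer quotient
    frac = rng.uniform(margin, 1.0 - margin, size=shape)   # fractional part, away from 0 and 1
    x = (q + frac) * y                                     # so x / y == q + frac up to rounding
    return x.astype("float32"), y.astype("float32"), q.astype("float32")


x, y, expected = sample_floor_div_inputs((5, 10, 5))

# Rounding errors in float32 are many orders of magnitude smaller than the
# margin, so the floored quotient matches the intended integer exactly.
assert np.array_equal(np.floor_divide(x, y), expected)
```

Because the quotient is chosen first and the dividend derived from it, no rejection loop is needed and the array shape is preserved.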

@tqchen
Member

tqchen commented Sep 7, 2020

Related: #6106, #4378, #4210

@tqchen
Member

tqchen commented Sep 12, 2020

@leandron would you be interested in sending a fix?

@leandron
Contributor Author

@tqchen Yes, I’ll have a look

@tqchen
Member

tqchen commented Sep 13, 2020

There are known fixes; please take a look at the linked issues.

@leandron
Contributor Author

Just to follow up here: I investigated what a fix would look like, and it seems some manual filtering is needed on values around 0.5 for floor_divide and/or floor_mod, which might disagree when comparing relay vs. numpy.

I'll try to submit a PR with this filter (soon), similar to #4382, and also improve the diagnostics in that test case so that, if it fails, it tells us which function (out of the 6 tested in that test) is the offending one.
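
A rough sketch of what the filter and the per-function diagnostics could look like. It uses plain NumPy callables as stand-ins for the relay operators, and every name here (`safe_mask`, `make_safe`, `check_binary_ops`, `margin`) is hypothetical; it is not the code from #4382 or from the test file.

```python
import numpy as np


def safe_mask(x, y, margin=1e-2):
    """True where x / y is not within `margin` of an integer boundary."""
    q = x.astype("float64") / y.astype("float64")
    frac = q - np.floor(q)
    return (frac > margin) & (frac < 1.0 - margin)


def make_safe(x, y, margin=1e-2):
    """Replace boundary-prone elements of x with 0.5 * y (quotient exactly 0.5)."""
    return np.where(safe_mask(x, y, margin), x, 0.5 * y)


def check_binary_ops(ops, x, y, rtol=0.01):
    """Compare each (name, candidate, reference) triple; a failure names the op."""
    for name, candidate, reference in ops:
        np.testing.assert_allclose(
            candidate(x, y),
            reference(x, y),
            rtol=rtol,
            err_msg=f"binary op '{name}' disagrees with the NumPy reference",
        )


x = np.array([0.3, 1.0, 2.5], dtype="float32")
y = np.array([0.1, 0.5, 1.0], dtype="float32")
x = make_safe(x, y)  # the first two quotients sit on a boundary and are replaced

# In the real test the candidates would be the compiled relay operators.
check_binary_ops(
    [("floor_divide", np.floor_divide, np.floor_divide),
     ("floor_mod", np.mod, np.mod)],
    x, y,
)
```

Passing the operator name through `err_msg` keeps a single loop over all operators while still making the assertion output say which one failed.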

@tqchen
Member

tqchen commented Oct 15, 2020

ping @leandron :)

@leandron
Contributor Author

Sorry, I haven't managed to get to this yet.

@denise-k
Collaborator

denise-k commented Sep 27, 2021

Hi folks, is this still active?

@areusch added the needs-triage label Oct 19, 2022
@hpanda-naut added dev:ci and removed needs-triage labels Nov 28, 2022
@tqchen closed this as completed Sep 20, 2024