-
Notifications
You must be signed in to change notification settings - Fork 7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ONNX CI workflow is broken #5971
Comments
It does seem like primTorch would be to blame because we also use the "prim" or "prims" prefix, but we don't have a prim::Constant or any C++ code |
|
This comment was marked as resolved.
This comment was marked as resolved.
My bad, I linked the wrong PR 🤦 Sorry for the noise. |
After some painful bisection, I finally found the real offender: pytorch/pytorch#73284. After seeing that the PR title contains the phrase ONNX and our ONNX tests are failing, I have no idea how I missed that when looking at the PRs 🤦 |
@BowenBao Philip confirmed above that one of the PRs submitted last week broke TorchVision and @atalman helped us revert it at pytorch/pytorch#77349. Please review and let us know what you think. Since the PR is 1 week old already, other PRs might depend on it. So if you believe that a separate forward fix is needed, please feel free to close the PR and issue a new fix. We just need to resolve the issues on TorchVision CI as soon as possible. Thanks in advance ashen one. |
I'll look at this today. It took us 3 months to get that PR merged so even though it's generally bad practice, I'd really like to fix forward if possible. |
Thanks. I replied at pytorch/pytorch#73284 (comment) |
I've got this failure locally with torch and torchvision built from master / main. |
Fix has been merged in pytorch master. @datumbox please let us know if this fixes torchvision CI. |
We'll know when the fresh |
@BowenBao With today's nightly the tests are still failing. |
Hi @pmeier, if my understanding is correct, it appears my fix has not been included in yesterday's nightly yet. 5-13 nightly pytorch/pytorch@44bf440, head is pytorch/pytorch@65f71c0, the fix commit is pytorch/pytorch@a812c4c after it. |
@atalman I wonder if you could confirm the cutpoint of yesterday's nightly? |
Can confirm @BowenBao's assessment. You can verify yourself, by looking at the It will make its way into tomorrows nightly. I'll retest and close this if the fix worked. |
Summary: None as input is legal per ONNX spec for representing optional inputs. For [example](https://github.com/onnx/onnx/blob/main/docs/Operators.md#inputs-2---3-7) `constant_value` for `ONNX::Pad`. This PR removes such constraint check that was set prior to calling onnx shape inference. For the issue below, such constraint prevents the onnx shape inference of `ONNX::Pad`, which leads to falling back on an incorrect constant traced shape. For the unit test in this PR, prior to this PR, the ONNX shape inference for `ONNX::Pad` would be skipped, and would return `None` instead. Fixes pytorch/vision#5971 Pull Request resolved: #77379 Approved by: https://github.com/garymm Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/a812c4cd96d94d51627d2af290ae87de34169ec0 Reviewed By: atalman Differential Revision: D36378914 fbshipit-source-id: a11be0f9666dd637490db80725f7021b328c9f27
Fixed on latest nightly |
Since the 5th of May our CI workflow for ONNX is broken (commit 970ba35). Looking at the warnings emitted by the failing tests
Two models are affected
faster_rcnn
andmask_rcnn
. To reproduce run:I believe a recent patch to primtorch might be the offender here. cc @neginraoof @seemethere @mruberry
The text was updated successfully, but these errors were encountered: