-
Notifications
You must be signed in to change notification settings - Fork 4.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Revert "add CPU autotp UT (#4263)" #4419
Conversation
This reverts commit 388c848.
@delock and @Yejing-Lai do you have any idea on the problem with #4263? Thanks! |
@delock - it appears after that PR was merged we see the cpu_inference tests either failing or failing and timing out at the 6 hour mark. I'm not sure why/how this passed on the merge queue, but reverting the PR fixes it, so I'm trying to isolate what the change was that caused this, or if these are new test failures that should be skipped. Let me know if you have any thoughts on this? This looks to be when it ran in the merge queue, it seems to have failed but somehow GitHub marked that as a pass? Its also running more tests than the previous one that ran in the merge queue before it. Is it worth skipping these tests for now, or should we revert then re-apply this PR? |
Hi @loadams, sorry for breaking the workflow. From the error message (https://github.com/microsoft/DeepSpeed/actions/runs/6341645987/job/17235544538), it means the UT tries to start test with FP16 data format which is not supported on CPU. How did it escaped merge CI is not clear to me. A proper fix should skip these tests for FP16. From the error log,
|
@loadams @tjruwase This PR fix the not implemented error
|
Thanks, @delock - I'll close this PR and we can move any other discussion to your PR. I'm still not sure how this showed as passed on the GitHub checks, but thanks for looking into this. |
This PR fix UT test error as described in this PR and the following test job. This PR skips `TestModelTask` if dtype is not supported by accelerator, or `InferenceBuilder` is not implemented by accelerator. #4419 https://github.com/microsoft/DeepSpeed/actions/runs/6341645987/job/17235544538 --------- Co-authored-by: Logan Adams <[email protected]> Co-authored-by: Liangliang-Ma <[email protected]> Co-authored-by: Quentin Anthony <[email protected]> Co-authored-by: Dashiell Stander <[email protected]> Co-authored-by: Olatunji Ruwase <[email protected]> Co-authored-by: Ramya Ramineni <[email protected]> Co-authored-by: Xie Zejian <[email protected]> Co-authored-by: Conglong Li <[email protected]> Co-authored-by: Michael Wyatt <[email protected]>
This PR fix UT test error as described in this PR and the following test job. This PR skips `TestModelTask` if dtype is not supported by accelerator, or `InferenceBuilder` is not implemented by accelerator. microsoft#4419 https://github.com/microsoft/DeepSpeed/actions/runs/6341645987/job/17235544538 --------- Co-authored-by: Logan Adams <[email protected]> Co-authored-by: Liangliang-Ma <[email protected]> Co-authored-by: Quentin Anthony <[email protected]> Co-authored-by: Dashiell Stander <[email protected]> Co-authored-by: Olatunji Ruwase <[email protected]> Co-authored-by: Ramya Ramineni <[email protected]> Co-authored-by: Xie Zejian <[email protected]> Co-authored-by: Conglong Li <[email protected]> Co-authored-by: Michael Wyatt <[email protected]>
This reverts commit 388c848.