[ci] Default to n=2 for test parallelism #12376

Merged: 1 commit, Aug 12, 2022

@driazati driazati commented Aug 11, 2022

This decreases test times for most of the tests, except a few that did not run under pytest-xdist with 2 worker nodes. It doesn't decrease overall runtime, since CI is still bottlenecked on other jobs. However, it could save compute, which makes CI more sustainable, so it is still worthwhile. We should revert this if we start seeing "weird" errors like OOMs more often.
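
As a rough illustration of what a default of two workers could look like in a test launcher, here is a minimal sketch. The `-n` flag is pytest-xdist's worker-count option; the function name `pytest_parallel_args` and the `PYTEST_WORKERS` override variable are hypothetical and are not taken from this PR.

```python
import os


def pytest_parallel_args(env=None, default_workers=2):
    """Return the extra pytest arguments that select the xdist worker count.

    PYTEST_WORKERS is a hypothetical override variable used only for
    illustration; the real CI scripts may use a different mechanism.
    """
    env = os.environ if env is None else env
    workers = int(env.get("PYTEST_WORKERS", str(default_workers)))
    if workers < 1:
        raise ValueError("worker count must be at least 1")
    # "-n NUM" asks pytest-xdist to spread tests over NUM worker processes.
    return ["-n", str(workers)]


# Example: combine with the usual pytest invocation.
command = ["pytest", *pytest_parallel_args(env={}), "tests/"]
```

Keeping the worker count overridable makes it easy to fall back to a lower value if memory pressure (for example OOMs) becomes a problem.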

cc @Mousius @areusch @gigiblender

@driazati driazati force-pushed the x2 branch 3 times, most recently from 9a795c5 to 2a6d9ba, August 11, 2022 18:24
@github-actions

Built docs for commit 2a6d9ba can be found here.

@driazati driazati marked this pull request as ready for review August 11, 2022 20:53
@driazati driazati merged commit 369e8b2 into apache:main Aug 12, 2022
driazati added a commit to driazati/tvm that referenced this pull request Aug 12, 2022
)"

This reverts commit 369e8b2.

Certain tests need to be serialized before this can merge; otherwise
failures like
https://ci.tlcpack.ai/job/tvm/job/main/4040/display/redirect will happen
depending on which tests happen to run together.
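
To make the failure mode concrete, here is a small, purely hypothetical simulation (it is not the actual failing TVM test). It assumes two checks that implicitly share per-process state: they pass when run on one worker in order, but one fails once they are split across two workers. All names are invented for illustration.

```python
def run_on_worker(checks):
    """Run checks in order on one simulated worker with fresh state."""
    state = {}  # stands in for per-process globals of one xdist worker
    results = {}
    for check in checks:
        try:
            check(state)
            results[check.__name__] = "passed"
        except AssertionError:
            results[check.__name__] = "failed"
    return results


def populate_cache(state):
    # Leaves data behind that a later check silently relies on.
    state["cache"] = [1, 2, 3]


def read_cache(state):
    # Hidden dependency: only passes if populate_cache ran in this process.
    assert state.get("cache") == [1, 2, 3]


# Serial run: both checks share one process, so both pass.
serial = run_on_worker([populate_cache, read_cache])

# Parallel run: the scheduler happens to split them across two workers.
worker_a = run_on_worker([populate_cache])
worker_b = run_on_worker([read_cache])
```

Here `serial` reports both checks as passed, while `worker_b` reports `read_cache` as failed, which is the kind of scheduling-dependent result the revert message describes.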
masahi pushed a commit that referenced this pull request Aug 12, 2022
…12413)

Co-authored-by: driazati
driazati added a commit to driazati/tvm that referenced this pull request Aug 12, 2022
driazati added a commit to driazati/tvm that referenced this pull request Aug 12, 2022
This is attempt #2 of apache#12376 which was reverted in apache#12413. The changes
in `plugin.py` should keep all the tests on the same node so sporadic
failures don't happen due to scheduling.
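
One way to read "keep all the tests on the same node" is scope-based scheduling: tests that belong together are treated as a unit and never split across workers. The sketch below is an assumption about the general technique, not the contents of `plugin.py`; it groups pytest node IDs by file and assigns whole groups to workers. The helper names `scope_of` and `assign_to_workers` are hypothetical.

```python
from collections import defaultdict


def scope_of(nodeid):
    """Group key for a pytest node ID: the file part before the first '::'."""
    return nodeid.split("::", 1)[0]


def assign_to_workers(nodeids, num_workers=2):
    """Assign whole scopes to workers so tests from one file never split."""
    groups = defaultdict(list)
    for nodeid in nodeids:
        groups[scope_of(nodeid)].append(nodeid)

    workers = [[] for _ in range(num_workers)]
    # Place the largest groups first, each on the least-loaded worker.
    for scope in sorted(groups, key=lambda s: (-len(groups[s]), s)):
        target = min(range(num_workers), key=lambda i: len(workers[i]))
        workers[target].extend(groups[scope])
    return workers


nodeids = [
    "tests/test_a.py::test_one",
    "tests/test_a.py::test_two",
    "tests/test_b.py::test_one",
    "tests/test_c.py::test_one",
]
workers = assign_to_workers(nodeids)
```

pytest-xdist also ships scope-aware distribution modes such as `--dist loadfile` and `--dist loadscope`; whether `plugin.py` builds on one of them is not stated in this thread.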
driazati added a commit to driazati/tvm that referenced this pull request Aug 18, 2022
driazati added a commit to driazati/tvm that referenced this pull request Aug 18, 2022
driazati added a commit to driazati/tvm that referenced this pull request Aug 19, 2022
driazati added a commit to driazati/tvm that referenced this pull request Aug 19, 2022
driazati added a commit to driazati/tvm that referenced this pull request Aug 20, 2022
driazati added a commit to driazati/tvm that referenced this pull request Aug 22, 2022
driazati added a commit to driazati/tvm that referenced this pull request Aug 24, 2022
driazati added a commit to driazati/tvm that referenced this pull request Aug 24, 2022
areusch pushed a commit that referenced this pull request Aug 25, 2022
* Revert "[skip ci] Revert "[ci] Default to n=2 for test parallelism (#12376)" (#12413)"

This reverts commit 478b672.

* [ci] Default to n=2 for test parallelism

This is attempt #2 of #12376 which was reverted in #12413. The changes
in `plugin.py` should keep all the tests on the same node so sporadic
failures don't happen due to scheduling.

Co-authored-by: driazati
xinetzone pushed a commit to daobook/tvm that referenced this pull request Nov 25, 2022
xinetzone pushed a commit to daobook/tvm that referenced this pull request Nov 25, 2022
xinetzone pushed a commit to daobook/tvm that referenced this pull request Nov 25, 2022
mikeseven pushed a commit to mikeseven/tvm that referenced this pull request Sep 27, 2023