[CI Problem] x386 CI running out of RAM #10180

mbrookhart · 2022-02-07T18:28:32Z

On a recent PR that added a few extra tests to Relay, we discovered that pytest was running over the 4GB RAM limit on the i386 CI job. We fixed this by reducing the memory use of the failing test by ~10%, but we're getting to the point in our test size where running pytest tests/python/relay seems to accumulate too much in RAM, via the tests and pytest logs, to actually run on i386. I imagine we'll hit this again in the future. Should we perhaps write a bash script to run the test files one by one for the 32-bit job?
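A minimal sketch of the per-file idea, written in Python rather than bash. It is an illustration under stated assumptions, not what the CI actually does: `run_each_file` and `command_prefix` are hypothetical names, and the only pytest behavior relied on is its documented exit codes (0 for success, 5 for "no tests collected"). Each test file runs in its own subprocess, so whatever memory a file accumulates is released when that process exits.

```python
import subprocess
import sys
from pathlib import Path

# pytest exit codes treated as success here:
# 0 = all tests passed, 5 = no tests were collected.
OK_EXIT_CODES = {0, 5}


def run_each_file(test_dir, command_prefix=(sys.executable, "-m", "pytest")):
    """Run every test_*.py under test_dir in its own subprocess.

    Hypothetical helper: returns the list of test files whose run
    finished with an exit code outside OK_EXIT_CODES.
    """
    failures = []
    for test_file in sorted(Path(test_dir).rglob("test_*.py")):
        # A fresh process per file: its memory is returned to the OS
        # before the next file starts.
        result = subprocess.run([*command_prefix, str(test_file)])
        if result.returncode not in OK_EXIT_CODES:
            failures.append(str(test_file))
    return failures
```

Calling `run_each_file("tests/python/relay")` from the repository root would cover the directory mentioned above. The trade-off is that interpreter startup and imports are paid once per file, so the job would likely run slower than a single pytest invocation.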

cc @driazati @areusch

Also wondering if @leandron might have some thoughts.

Branch/PR Failing

#10026


areusch · 2022-02-07T19:03:59Z

We could also try to investigate why pytest-forked doesn't like GPUs. Could you post any information you have about that?

mbrookhart · 2022-02-07T19:19:24Z

I attempted to fix this using pytest --forked here: #10174

But it failed a lot of tests on a lot of jobs related to GPU. I got the feeling that initializing the GPU interface in the main process and then trying to access it from the forked process broke an assumption somewhere in the stack, but I didn't dig very deeply into what the root cause might be.

masahi · 2022-02-07T21:42:35Z

@mbrookhart Can you point to the failed log from a job in #10026?

FranckQC · 2022-02-08T00:21:48Z

Hi everyone.
It looks like I have the same issue on this i386 test on the CSE PR: #9482

Let's see how the current build ends up (in theory the docs should be OK this time, and the Windows build too; the earlier failures were just due to a URL change for the docs and GitHub maintenance for the Windows build).

mbrookhart · 2022-02-08T21:29:18Z

@masahi https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/PR-10026/20/pipeline

driazati · 2022-08-09T23:29:30Z

Cautiously closing this since we've changed the CI infra a good bit in the meantime. Please re-open if this happens again.

masahi mentioned this issue Feb 7, 2022

Implementation of Common Subexpression Elimination for TIR #9482

Merged

driazati closed this as completed Aug 9, 2022
