[CI] Ensure that running threads started as dependencies of a test are freed, even if the process has never been started #22715

Conversation
PR #22715: Size comparison from 989ad8e to 72bc3ae

Increases (3 builds for bl602, cc13x2_26x2, psoc6)
Decreases (3 builds for bl702, cc13x2_26x2, psoc6)
Full report (37 builds for bl602, bl702, cc13x2_26x2, cyw30739, efr32, esp32, k32w, linux, mbed, nrfconnect, psoc6, qpg, telink)
I have relaunched. The tests of interest here are:
Those tests are green on the first try; I will relaunch them to see if they stay green on a second attempt.
Again, the … Attached is the log of the …
Third time in a row that …
Issue Being Resolved
Trying to run the test suite locally with an iteration count > 100 never succeeds on my laptop:
At some point the tests start to get slower, and some output is dropped randomly.
I have tracked that down to a regression from #17727.
To summarize, #17727 changed the behaviour of the test suite so that all the applications declared as targets are passed into the test suite as dependencies of the current test. That is fine by itself; the issue is that the code waiting for a process to stop just loops forever if one of the dependencies has never been started - which happens for most of the dependencies.
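Below is a minimal, hypothetical sketch of the kind of wait loop that never terminates when the dependency process is never launched. The class and attribute names are illustrative only, not the actual test-runner code:

```python
import threading
import time

class DependencyWatcher:
    """One watcher per declared dependency (illustrative name, not the real class)."""

    def __init__(self):
        self.process = None  # subprocess.Popen, set only if the dependency is actually launched

    def wait_for_stop(self):
        # Problem: if `self.process` is never assigned (the dependency was
        # declared but never started), this loop never exits and the thread
        # keeps sleeping in 0.1 s increments for the rest of the test run.
        while self.process is None or self.process.poll() is None:
            time.sleep(0.1)

# With every declared application passed as a dependency of every test,
# each test can leave many of these threads behind; over a long run
# (iteration count > 100) that adds up to 1000+ sleeping threads.
watchers = [DependencyWatcher() for _ in range(5)]
for w in watchers:
    threading.Thread(target=w.wait_for_stop, daemon=True).start()
```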
As a result, we may end up with more than a thousand threads just sleeping (time.sleep(0.1)) until the point where the situation stops being tractable for the underlying hardware.

I suspect that to be the root cause of the intermittent failure that happens on darwin for Test_RH_2_1, Test_RH_2_1 being the one that is at the limit of what the CI hardware can support. Why does it happen more for darwin than for Linux? Again, that may depend on the hardware running the test suite. And why does it happen more for darwin-framework-tool than for chip-tool itself? I suspect the tests running under darwin-framework-tool to be more "chatty", which may result in some stdout buffers being completely filled up while the test is running, so we are more likely to miss the "interesting" part.

So all of this is just theory at the moment; I guess CI runs will prove me right or wrong...
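As a rough illustration of the behaviour described by the title - threads started as dependencies of a test being freed even if the process was never started - here is a hedged sketch, not the actual change, where teardown signals an event so the watcher loop always exits:

```python
import threading

class DependencyWatcher:
    """Same illustrative watcher, reworked so its thread can always be freed."""

    def __init__(self):
        self.process = None
        self._stop_event = threading.Event()

    def wait_for_stop(self):
        # Exit either when the process has terminated or when teardown asks
        # us to stop, even if the process was never started at all.
        while not self._stop_event.is_set():
            if self.process is not None and self.process.poll() is not None:
                return  # the dependency ran and has exited
            self._stop_event.wait(0.1)  # replaces the bare time.sleep(0.1)

    def stop(self):
        # Called from test teardown so every watcher thread is released,
        # whether or not its process was ever launched.
        self._stop_event.set()
```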
If my CI run is green for the darwin bits on the first run, I will tag this one as a hotfix, as the current redness on the CI is a pain.
Change overview