Intermittent RISC-V CI failures #6905
Comments
FYI: @AdamBrousseau |
I have built Debian 10 (buster, which is what CI is running as far as I can tell) and Debian 11 (bullseye) images as similar to the CI build node as I could and ran a few tests there:
@AdamBrousseau Is the CI node ( |
Yes, the Debian machine is a VM running on KVM.
#6913 has been merged but it did not help much. Now it hangs up in
When this happened, the build node was not swapping and CPU usage was low; the qemu process simply hung. I'm running out of ideas. It's hard for me to reproduce: essentially, I see this kind of failure only on Eclipse CI. QEMU does not implement the "extended remote" protocol, so one cannot debug multi-threaded programs running under user-mode emulation with QEMU. I can try to see where in QEMU it hangs, but I'm not sure how useful this would be. If anyone has an idea how to approach this, I'm all ears.
For the record, I run only
and:
I tried on the Eclipse CI node with the bit-identical static QEMU binary copied from my (working) deb10 image, to no avail: it still hangs.
Another observation: I also tried running tests on my freshly built deb10 image with the sysroot copied from the CI node, and it didn't hang once in 100 runs. I also noticed that the uptime of CI node ( Anyways, I'll try to update the sysroot on the CI node to a "fresh" one (with newer versions of libraries, most notably glibc), as it might reduce the hangups (but not fix them).
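For anyone who wants to repeat the "N runs, count the hangs" experiment, here is a minimal sketch of how it could be scripted. It is not the script used above: the function name `count_hangs`, the timeout value, and the placeholder command are all assumptions, and the real test command (run under QEMU) would have to be substituted.

```python
import subprocess
import sys


def count_hangs(cmd, runs=100, timeout_s=600):
    """Run cmd `runs` times and return (passed, failed, hung) counts.

    A run counts as "hung" when it does not finish within timeout_s
    seconds; subprocess.run() kills the child before raising.
    """
    passed = failed = hung = 0
    for _ in range(runs):
        try:
            result = subprocess.run(
                cmd,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            hung += 1
            continue
        if result.returncode == 0:
            passed += 1
        else:
            failed += 1
    return passed, failed, hung


if __name__ == "__main__":
    # Placeholder command (assumption): replace with the actual test
    # invocation, e.g. the test binary run under the QEMU user-mode emulator.
    placeholder_cmd = [sys.executable, "-c", "pass"]
    print(count_hangs(placeholder_cmd, runs=3, timeout_s=30))
```

Note that the timeout only kills the direct child process; if the test command spawns its own children, they may need separate cleanup.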
Rebooted. Let me know how it goes. |
@AdamBrousseau: Unfortunately, reboot did not help, but thanks anyway! I'm going to update |
I did it, but the test build hung just like before. Maybe it was just bad luck, but when I tried to build and run the tests manually, it was far more stable. Anyway, I'm going to keep the new versions there for some time (I left backups on the node, so reverting is a matter of redirecting the symlinks back).
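As an illustration of the symlink-based revert mentioned above, a sketch along these lines could repoint a link from the new directory back to the backup. The helper name `repoint_symlink` and the example paths are hypothetical; the actual layout on the CI node is not described in this thread.

```python
import os


def repoint_symlink(link_path, new_target):
    """Make link_path a symlink to new_target, replacing any existing link.

    The new link is created under a temporary name and then renamed over
    the old one, so link_path never disappears in between (POSIX rename).
    """
    tmp_path = link_path + ".tmp"
    if os.path.lexists(tmp_path):
        os.remove(tmp_path)
    os.symlink(new_target, tmp_path)
    os.replace(tmp_path, link_path)


if __name__ == "__main__":
    # Hypothetical paths (assumption), for illustration only:
    # repoint_symlink("/path/to/sysroot", "/path/to/sysroot.backup")
    pass
```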
With new sysroot, |
Just adding reference to PR #6912 as it might help with this. Maybe. |
For some time, we have been experiencing intermittent failures with the RISC-V CI cross-compiling job, see for example #6706 or #6704.
This issue is created to track progress on stabilising the RISC-V CI cross-compiling job.