High rate of spurious CI failures on macOS machines #14459
Comments
I believe it was waiting for xcode-locator, which can take a very long time if the machine has many Xcodes installed. Try this hack: https://www.smileykeith.com/2021/03/08/locking-xcode-in-bazel/
Thanks, I will try that. Do you think that this could also be related to Issue 2?
It could be, since that file is only created if Xcode is successfully detected (see bazel/tools/cpp/cc_configure.bzl, line 73 at b4b0c32).
I just checked again and found no new failures in the two weeks since I started using a checked-in
@sventiffe Reopening since the exact same issue appeared again today over at rules_jni. I started pinning with
@thii Are there any additional diagnostics that I could enable that would help diagnose the underlying issue?
That file actually looks pretty good to me, since yours didn't fail with osx_archs import, it's because that file doesn't contain I'm not sure where that's coming from but it should be darwin_x86_64 probably instead
Nice catch. I think I found the root cause and submitted a fix as #14796.
Nice, I'll keep an eye on that. I assume this is similar to the flaky CI case because the toolchain only falls back to this code path on macOS when Xcode cannot be found, which could happen if the discovery logic for it times out. Folks who need full Xcode could likely work around that issue with this: https://www.smileykeith.com/2021/03/08/locking-xcode-in-bazel/
Previously, if the xcode_locator failed and cc_autoconf_toolchain used the non-Xcode C++ toolchain as a fallback, its reference to `@local_config_cc//:cc-compiler-darwin`, where darwin is the legacy cpu value for x86_64 macOS, would be invalid.
Fixes #14459
Closes #14796.
PiperOrigin-RevId: 451860477
Change-Id: Iec115f600ebb7ac0786b2169276d25e3ff5d54bf
Co-authored-by: Fabian Meumertzheim <[email protected]>
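To make the failure mode in the commit message above easier to follow, here is a toy model in Python. It is only a sketch: `DEFINED_TARGETS`, `configure` and `resolve` are hypothetical names, not Bazel APIs, and the set of targets each configuration defines is an assumption based on the suggestion in this thread that the fallback should use darwin_x86_64. The actual change in #14796 may look different.

```python
# Toy model of the fallback problem described above. Nothing here is a
# Bazel API; all names and target sets are illustrative assumptions.

# ASSUMPTION: which cc-compiler-* targets each generated config defines.
DEFINED_TARGETS = {
    "xcode": {"cc-compiler-darwin"},
    "fallback": {"cc-compiler-darwin_x86_64"},
}


def configure(xcode_found):
    """Pick the config that gets generated; the fallback is used when
    Xcode discovery fails (for example because it timed out)."""
    return "xcode" if xcode_found else "fallback"


def resolve(config, cpu):
    """Resolve the compiler label for a cpu value within a config."""
    label = f"cc-compiler-{cpu}"
    if label not in DEFINED_TARGETS[config]:
        raise LookupError(f"{label} is not defined by the {config} config")
    return f"@local_config_cc//:{label}"
```

In this model, `resolve(configure(False), "darwin")` raises a `LookupError`, which mirrors why a build would only break on the runs where Xcode discovery happened to fail.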
Description of the problem / feature request:
I run daily CI checks in my rulesets' GitHub Actions pipeline. The macOS pipelines, running on macos-latest, fail every few days with two kinds of spurious failures that I have never been able to reproduce locally:
Issue 1:
Issue 2:
Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
I have no way to consistently reproduce this issue, but it happens every few days on rules_jni's CI schedule.
What operating system are you running Bazel on?
macOS 10.15
What's the output of bazel info release?
Over time, I have hit the issues on 4.2.2, 5.0.0rc3 and various last_green builds.
Have you found anything relevant by searching the web?
unix_cc_toolchain_config.bzl:cc_toolchain_config from @bazel_tools. bazel-contrib/toolchains_llvm#75 (comment)
Any other information, logs, or outputs that you want to share?
I can make arbitrary changes to the CI config if that helps to gather more information on the cause of these issues.
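One such change could be a CI step that logs how many Xcode bundles are installed on the runner, since a reply in this thread suspects that xcode-locator is slow on machines with many Xcodes. The following is a minimal sketch, not a verified diagnostic: it assumes that Xcode bundles live directly under /Applications with names starting with "Xcode", and `find_xcode_bundles` is a hypothetical helper name.

```python
from pathlib import Path


def find_xcode_bundles(apps_dir="/Applications"):
    """Return the sorted names of entries that look like Xcode app bundles.

    ASSUMPTION: bundles are named Xcode*.app and sit directly in apps_dir.
    """
    root = Path(apps_dir)
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.glob("Xcode*.app"))


if __name__ == "__main__":
    bundles = find_xcode_bundles()
    print(f"Found {len(bundles)} Xcode bundle(s)")
    for name in bundles:
        print(" ", name)
```

Running this as an early step on both passing and failing runs would show whether the failures correlate with the number of installed Xcodes.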