rustup-installed nightly compiler no longer works on MacOS Mojave #104570

Closed
ghost opened this issue Nov 18, 2022 · 27 comments · Fixed by #104650 or #105123
Labels
  • C-bug (Category: This is a bug.)
  • regression-from-stable-to-nightly (Performance or correctness regression from stable to nightly.)
  • T-infra (Relevant to the infrastructure team, which will review and decide on the PR/issue.)

Comments

@ghost

ghost commented Nov 18, 2022

Code

I tried this on the latest nightly compiler installed by rustup:

cargo build

I expected to see this happen: My project builds.

Instead, this happened: rustc crashed due to a dynamic library linkage problem.

Version it worked on

It most recently worked on: rustc 1.67.0-nightly (96ddd32c4 2022-11-14).

Version with regression

rustc --version --verbose:

It crashes without giving the version, but so far it affects the rustup toolchains nightly-2022-11-16, nightly-2022-11-17, and nightly (which I assume is the same as nightly-2022-11-17 for now.)

Backtrace

The rustc binary aborts at load time with the following output:

dyld: Library not loaded: @rpath/librustc_driver-dae4a1cad6347bf8.dylib
  Referenced from: /Users/abuse/.rustup/toolchains/nightly-x86_64-apple-darwin/bin/rustc
  Reason: no suitable image found.  Did find:
	/Users/abuse/.rustup/toolchains/nightly-x86_64-apple-darwin/bin/../lib/librustc_driver-dae4a1cad6347bf8.dylib: cannot load 'librustc_driver-dae4a1cad6347bf8.dylib' (load command 0x80000034 is unknown)
	/Users/abuse/.rustup/toolchains/nightly-x86_64-apple-darwin/bin/../lib/librustc_driver-dae4a1cad6347bf8.dylib: stat() failed with errno=1
	/Users/abuse/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/librustc_driver-dae4a1cad6347bf8.dylib: cannot load 'librustc_driver-dae4a1cad6347bf8.dylib' (load command 0x80000034 is unknown)
	/Users/abuse/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/librustc_driver-dae4a1cad6347bf8.dylib: stat() failed with errno=1
Abort trap: 6

Setting RUST_BACKTRACE=1 does not get further diagnostics, probably because the compiler isn't even being run due to the linkage problem.
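The `load command 0x80000034 is unknown` part of the dyld message can be decoded by hand. The following is a minimal sketch, assuming the constants from Apple's `<mach-o/loader.h>`; the helper name `decode_load_command` is made up for illustration. The high bit is the `LC_REQ_DYLD` flag, and the low bits select the command.

```python
# Load-command constants as defined in Apple's <mach-o/loader.h>.
LC_REQ_DYLD = 0x80000000  # flag: dyld must understand this command to load the image

LOAD_COMMANDS = {
    0x24: "LC_VERSION_MIN_MACOSX",
    0x32: "LC_BUILD_VERSION",
    0x33: "LC_DYLD_EXPORTS_TRIE",
    0x34: "LC_DYLD_CHAINED_FIXUPS",
}


def decode_load_command(cmd: int):
    """Split a raw load-command value into (name, required_by_dyld)."""
    required = bool(cmd & LC_REQ_DYLD)
    name = LOAD_COMMANDS.get(cmd & ~LC_REQ_DYLD, "unknown")
    return name, required


print(decode_load_command(0x80000034))
```

Under these assumptions the value in the error decodes to `LC_DYLD_CHAINED_FIXUPS` with the "required" bit set, which would explain why an older dyld refuses the library outright instead of ignoring the command.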

Likely cause

I am running MacOS Mojave. Yes, this is quite an old OS release, and no, I'm not intending to upgrade any time soon. It seems likely that whoever is building binaries for the x86_64-apple-darwin target has upgraded their build environment and/or changed some settings, inadvertently causing it to produce binaries which require a newer version of MacOS. (Xcode is unhelpful like that.) I haven't seen any announcement of dropping of support for older MacOS versions.

@ghost ghost added C-bug Category: This is a bug. regression-untriaged Untriaged performance or correctness regression. labels Nov 18, 2022
@rustbot rustbot added I-prioritize Issue: Indicates that prioritization has been requested for this issue. regression-from-stable-to-nightly Performance or correctness regression from stable to nightly. and removed regression-untriaged Untriaged performance or correctness regression. labels Nov 18, 2022
@thomcc
Member

thomcc commented Nov 18, 2022

We might drop older versions (#104385), but even after that, Mojave should still be supported (Mojave is 10.14, we'd be increasing the requirement to 10.12).

This is likely an unintentional regression from #103929, although I do not know for sure. It could also be related to CI changes, as you mention.

@thomcc
Member

thomcc commented Nov 18, 2022

Apparently we may have bumped the CI runners to macOS 12 around the same time, which is a CI change that is much more likely to be the root cause than that PR.

@thomcc thomcc added the T-infra Relevant to the infrastructure team, which will review and decide on the PR/issue. label Nov 18, 2022
@BlackHoleFox
Contributor

BlackHoleFox commented Nov 18, 2022

I'll take a look at this one and see if the issue comes from my target cleanup PR. @rustbot claim

@ehuss
Contributor

ehuss commented Nov 18, 2022

I confirmed with a bisect that it started with #104091 which was the first PR to start using XCode 14.

I'm wondering if it is necessary to start using something like -macos_version_min to the linker?

@BlackHoleFox
Contributor

which was the first PR to start using XCode 14.

For my own knowledge going forward, how'd you know that? The GA definition says we're using macos-latest but GitHub's blog says this became macOS 12 back in early October.

I'm wondering if it is necessary to start using something like -macos_version_min to the linker?

Since you figured out what actually introduced it (so not any code changes) I'll look at that instead. man ld isn't helpful telling you what -macos_version_min does that -platform_version wouldn't.

@ehuss
Contributor

ehuss commented Nov 18, 2022

For my own knowledge going forward, how'd you know that?

I've been tracking the issue in #103044. I updated my system to xcode 14, and that test started failing. I also monitor the GitHub changelog blog which announces these changes.

When GitHub starts changing the default for a -latest runner, it does it over a span of months, migrating individual repos along the way. Our CI (rust-lang-ci/rust) started failing on Monday with that error. We had to expedite the fix to get it working again. There is no advance notice when exactly a particular repo will migrate, it just happens (they do warn you, but it can happen at any time over a span of months).

As for determining which image is being used, you can look at the CI logs. For example, in the build for that PR here: https://github.com/rust-lang-ci/rust/actions/runs/3466400077/jobs/5790206511, at the top of the log is "Set up job" which has an entry "Runner image" which tells you which image is being used (and a link to what software is installed). I used that to verify that our CI had migrated to macos-12 when the builds started failing and someone reported it on Zulip.

@ehuss
Contributor

ehuss commented Nov 18, 2022

man ld isn't helpful telling you what -macos_version_min does that -platform_version wouldn't.

I think -macos_version_min=10.13 is the same as -platform_version macos 10.13 10.13.
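To make the suggested equivalence concrete: `-platform_version` takes a platform name, a minimum OS version, and an SDK version. The sketch below only builds the argument list; `platform_version_args` is a hypothetical helper, and whether ld treats the two spellings identically is the assumption stated in this comment, not something the code verifies.

```python
from typing import List, Optional


def platform_version_args(min_os: str, sdk: Optional[str] = None) -> List[str]:
    """Build ld's `-platform_version <platform> <min_version> <sdk_version>` arguments for macOS.

    When no SDK version is given, the minimum OS version is reused,
    mirroring the `macos 10.13 10.13` form quoted in this thread.
    """
    return ["-platform_version", "macos", min_os, sdk or min_os]


print(" ".join(platform_version_args("10.13")))
```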

@BlackHoleFox
Contributor

I think I found the issue and solution (though the fix is kinda obvious): As of XCode 14, the macOS SDK no longer supports any versions below 10.13 (High Sierra). While I've yet to test this conclusively, my thinking is that even though we're telling the linker -platform_version macos 10.7 10.7, it's just ignoring us and using the "running on system default", whatever that is.

On a similar note, @thomcc tried running rustc on 10.7 a few weeks ago and ended up getting a linker error as well. This would make sense since even XCode 13 only supports back to 10.9, so rustc hasn't supported back to before 10.9 regardless of what we've claimed for a while now (guess no one uses those for development :) )

So given that we're only going to 10.12 in the ongoing MCP, we're going to be stuck with XCode 13 for the time being to even support host tools on "older" macOS (deploying back should still work). We'll need to override that in any CI job that produces distributed artifacts. I'll test out the theory tomorrow and probably PR a rough fix for this repo to "confirm"...

@thomcc
Member

thomcc commented Nov 19, 2022

On a similar note, @thomcc tried running rustc on 10.7 a few weeks ago and ended up getting a linker error as well

Yes, I can confirm this. Although, cargo also doesn't support 10.7, so I had assumed it was due to that (and to be fair, it still might be!)

@ehuss
Contributor

ehuss commented Nov 19, 2022

Oh interesting, I missed the compatibility table on that xcode 14 page.

We used to have the ability to switch xcode, but it was removed in #89849. I would just resurrect that diff.

@BlackHoleFox
Contributor

Following up on that hypothesis from the other day, here's the dump from vtool -show-build on 2022-11-18 for the x86 stdlib (I just used it to get an x86 artifact easily on M1, but it should be the same for rustc):

Load command 10
      cmd LC_BUILD_VERSION
  cmdsize 32
 platform MACOS
    minos 12.0
      sdk 12.3
   ntools 1
     tool LD
  version 819.6

and then stable's:

Load command 8
      cmd LC_VERSION_MIN_MACOSX
  cmdsize 16
  version 10.7
      sdk 12.1

So it seems like the "it's just ignoring us and using the running on system default" theory is what's happening when you give it a minimum -platform_version that is too low. Also worth saying: this is broken on M1 too, but I guess not enough people are using macOS 11 with M1 to notice. Nightly:

Load command 10
      cmd LC_BUILD_VERSION
  cmdsize 32
 platform MACOS
    minos 12.0
      sdk 12.3
   ntools 1
     tool LD
  version 819.6

and then stable's:

Load command 9
      cmd LC_BUILD_VERSION
  cmdsize 32
 platform MACOS
    minos 11.0
      sdk 12.1
   ntools 1
     tool LD
  version 711.0
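The dumps above can also be compared mechanically. This is a rough sketch that parses the text of a single load command as printed by `vtool -show-build` and returns the minimum OS version it declares; the function names are made up for illustration, and the sample strings are the x86 dumps copied from this comment.

```python
def parse_load_command(dump: str) -> dict:
    """Collect the `key value` lines of one load-command dump."""
    fields = {}
    for line in dump.splitlines():
        parts = line.split()
        if len(parts) == 2:  # skips the three-word "Load command N" header
            # Keep the first occurrence of each key.
            fields.setdefault(parts[0], parts[1])
    return fields


def declared_min_os(fields: dict) -> str:
    """Return the minimum macOS version declared by the load command."""
    if fields["cmd"] == "LC_BUILD_VERSION":
        return fields["minos"]
    if fields["cmd"] == "LC_VERSION_MIN_MACOSX":
        return fields["version"]
    raise ValueError("not a version load command: " + fields["cmd"])


NIGHTLY_X86 = """\
Load command 10
      cmd LC_BUILD_VERSION
  cmdsize 32
 platform MACOS
    minos 12.0
      sdk 12.3
   ntools 1
     tool LD
  version 819.6
"""

STABLE_X86 = """\
Load command 8
      cmd LC_VERSION_MIN_MACOSX
  cmdsize 16
  version 10.7
      sdk 12.1
"""

print(declared_min_os(parse_load_command(NIGHTLY_X86)))
print(declared_min_os(parse_load_command(STABLE_X86)))
```

The two load commands store the minimum version under different keys (`minos` versus `version`), which is why the sketch dispatches on `cmd` first.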

@ehuss
Contributor

ehuss commented Nov 27, 2022

Reopening, as unfortunately #104650 did not solve the problem. I don't immediately see what else could be the problem, though.

@ehuss ehuss reopened this Nov 27, 2022
@BlackHoleFox
Contributor

BlackHoleFox commented Nov 27, 2022

Bleh, that seems correct sadly. Here's the latest nightly. It seems the same as before with no change:

vtool -show-build ~/.rustup/toolchains/nightly-2022-11-27-x86_64-apple-darwin/lib/librustc_driver-582242ced547d33f.dylib
...
Load command 10
      cmd LC_BUILD_VERSION
  cmdsize 32
 platform MACOS
    minos 12.0
      sdk 12.3
   ntools 1
     tool LD
  version 764.0

Maybe downgrading further to XCode 13.2 could at least tell us if the CI process is actually using our desired SDK, because that has the 12.1 SDK (not 12.3 like above), so there would be a visible difference.
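The check proposed here boils down to reading the `sdk` field of the dump and comparing it with the SDK the CI job was supposed to select. A small sketch, assuming the output format shown above; `sdk_version` and `EXPECTED_SDK` are illustrative names only.

```python
import re
from typing import Optional


def sdk_version(vtool_output: str) -> Optional[str]:
    """Return the value of the first `sdk` line in a vtool -show-build dump."""
    match = re.search(r"^\s*sdk\s+(\S+)\s*$", vtool_output, flags=re.MULTILINE)
    return match.group(1) if match else None


DUMP = """\
Load command 10
      cmd LC_BUILD_VERSION
  cmdsize 32
 platform MACOS
    minos 12.0
      sdk 12.3
   ntools 1
     tool LD
  version 764.0
"""

EXPECTED_SDK = "12.1"  # the SDK said above to ship with XCode 13.2

found = sdk_version(DUMP)
print(f"sdk={found} expected={EXPECTED_SDK} match={found == EXPECTED_SDK}")
```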

@thomcc
Member

thomcc commented Nov 27, 2022

Sorry bors, you're being too optimistic.

@thomcc thomcc reopened this Nov 27, 2022
@BlackHoleFox
Contributor

I'm not having much luck locally with XCode 13.2 and the 12.1 SDK either :rip: I'll try again a bit later.

@ehuss
Contributor

ehuss commented Nov 28, 2022

I feel like the XCode version has been a mostly false lead. The most recent beta was built on macos-12 with xcode 14, and it works just fine on macOS 10.13. I confirmed that older versions of rustc don't generate code with LC_DYLD_CHAINED_FIXUPS. I went back to 96ddd32, which was still failing. Going all the way back to 5e97720 seems to work. So there is some issue in between those two. If I find some time tomorrow, I might try bisecting with local builds. That will probably take several hours, though.

There's probably still the issue with 10.13 being the min deployment target in xcode 14, but I think that is a secondary issue.

@ehuss
Contributor

ehuss commented Nov 29, 2022

@BlackHoleFox I think I better understand what is going on. I bisected the change in behavior to #103929. I was a bit confused as there are two factors in play (that PR and the switch in XCode). From what I can tell, it's as if the deployment target is being completely ignored. I'll keep poking a bit, but maybe you can spot the issue. I see some questionable things about Cc::No vs Cc::Yes.

@ehuss
Contributor

ehuss commented Nov 29, 2022

I see several differences. apple_sdk_base had different settings from apple_base, but it looks like those were unified. x86_64-apple-darwin was using apple_base which had different settings from the sdk one. The Target option differences look like:

--- b	2022-11-28 16:53:46.718160978 -0800
+++ a	2022-11-28 16:53:36.398616799 -0800
@@ -20,6 +20,7 @@
     "ZERO_AR_DATE=1"
   ],
   "link-env-remove": [
+    "MACOSX_DEPLOYMENT_TARGET",
     "IPHONEOS_DEPLOYMENT_TARGET"
   ],
   "linker-is-gnu": false,
@@ -33,6 +34,14 @@
       "x86_64",
       "-m64"
     ],
+    "ld": [
+      "-arch",
+      "x86_64",
+      "-platform_version",
+      "macos",
+      "10.7",
+      "10.7"
+    ],
     "ld64.lld": [
       "-arch",
       "x86_64",
@@ -44,7 +53,12 @@
   },
   "split-debuginfo": "packed",
   "stack-probes": {
-    "kind": "call"
+    "kind": "inline-or-call",
+    "min-llvm-version-for-inline": [
+      16,
+      0,
+      0
+    ]
   },
   "supported-sanitizers": [
     "address",

For example, MACOSX_DEPLOYMENT_TARGET should not be added to link-env-remove.
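To illustrate why that entry matters: `link-env-remove` lists environment variables that are removed before the linker is invoked, so a deployment target set in the build environment never reaches the linker driver. The sketch below models only that filtering step. The specs are reduced to the one key from the diff above, the environment values are illustrative, and the helper names are hypothetical.

```python
def linker_env(env: dict, target_spec: dict) -> dict:
    """Return the environment the linker would see after `link-env-remove` is applied."""
    removed = set(target_spec.get("link-env-remove", []))
    return {key: value for key, value in env.items() if key not in removed}


def strips_deployment_target(target_spec: dict) -> bool:
    """True if the spec hides MACOSX_DEPLOYMENT_TARGET from the linker."""
    return "MACOSX_DEPLOYMENT_TARGET" in target_spec.get("link-env-remove", [])


# Reduced target specs: only the key that differs in the diff above.
OLD_SPEC = {"link-env-remove": ["IPHONEOS_DEPLOYMENT_TARGET"]}
NEW_SPEC = {"link-env-remove": ["MACOSX_DEPLOYMENT_TARGET", "IPHONEOS_DEPLOYMENT_TARGET"]}

# Illustrative build environment, not taken from the real CI configuration.
BUILD_ENV = {"MACOSX_DEPLOYMENT_TARGET": "10.7", "PATH": "/usr/bin"}

for name, spec in (("old", OLD_SPEC), ("new", NEW_SPEC)):
    print(name, strips_deployment_target(spec), sorted(linker_env(BUILD_ENV, spec)))
```

With the new spec the variable is filtered out, which is consistent with the observation that the deployment target looked as if it were being ignored entirely.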

@BlackHoleFox Will you have some time to look at fixing that?

@ehuss
Contributor

ehuss commented Nov 29, 2022

I'm also curious about the supposed increase of the minimum deployment target to 10.13 in Xcode 14. rustc 1.66.0-beta.2, which was built with Xcode 14, seems to work just fine on 10.12. The LC_VERSION_MIN_MACOSX still reports 10.7. I wonder to what degree that change in minimum deployment target affects Rust's use case (which is essentially using clang as a linker). In other words, I'm wondering if #104650 is necessary. My instinct is that it's probably better to be on the safe side and use Xcode 13, but I'm curious what Apple means when they say they raised the minimum.

@BlackHoleFox
Contributor

BlackHoleFox commented Nov 29, 2022

@ehuss Thanks for bisecting that down. Checking my PR for issues was next on my list since the XCode route just kept going nowhere. I have the time today to look more at it (and hopefully fix it) but thank you again for the diff and quick findings.

I'm also curious about the supposed increase of the minimum deployment target to 10.13 in Xcode 14.

Same here. I had originally interpreted this as what the whole SDK supported, let alone what clang still supports, but I am now doubting that too. Part of what I did last night was poke at the SDK metadata and I found that even XCode 14.0.1 with the 12.3 SDK still has older deployment versions far past what the docs seem to say.

<key>DEPLOYMENT_TARGET_SUGGESTED_VALUES</key>
<array>
	<string>10.9</string>
	<string>10.10</string>
	<string>10.11</string>
	<string>10.12</string>
	<string>10.13</string>
	<string>10.14</string>
	<string>10.15</string>
	<string>11.0</string>
	<string>11.1</string>
	<string>11.2</string>
	<string>11.3</string>
	<string>11.4</string>
	<string>11.5</string>
	<string>12.0</string>
	<string>12.2</string>
	<string>12.3</string>
</array>
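For anyone who wants to repeat that check, here is a minimal sketch using Python's standard `plistlib`. It assumes the SDK metadata is an XML property list with this key at the top level of the parsed fragment; the embedded plist is abbreviated from the list above and the surrounding `<plist>`/`<dict>` wrapper is added only to make the fragment parseable.

```python
import plistlib

# Abbreviated fragment; the wrapper elements are an assumption for parsing.
SDK_METADATA = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
    <key>DEPLOYMENT_TARGET_SUGGESTED_VALUES</key>
    <array>
        <string>10.9</string>
        <string>10.13</string>
        <string>11.0</string>
        <string>12.3</string>
    </array>
</dict>
</plist>
"""


def version_key(version: str):
    """Compare versions numerically so that 10.9 sorts before 10.13."""
    return tuple(int(part) for part in version.split("."))


values = plistlib.loads(SDK_METADATA)["DEPLOYMENT_TARGET_SUGGESTED_VALUES"]
oldest = min(values, key=version_key)
print(oldest)
```

The numeric key matters: a plain string comparison would rank `10.13` below `10.9`.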

@BlackHoleFox
Contributor

BlackHoleFox commented Nov 29, 2022

Looking at my other cleanup PR after seeing the target spec diff above, it was surprisingly obvious where the screwup happened around link-env-remove. That specific issue (and more cleanup to make this stuff less fragile) is done on this branch which I'll PR later after work.

I couldn't grasp enough about how rustc_target handles linker flavors to know why the new ld blob showed up or if that's harmful, so it's still that way with the changes so far. The obvious answer is "the pre_link_args from apple_sdk_base is being used too now" but the LinkerFlavor::Darwin() mixtures are throwing me off. Maybe during PR review someone can point me in a better direction?

@ehuss
Contributor

ehuss commented Nov 29, 2022

Yea, the different variants of LinkerFlavor are a little strange.

  • Darwin(Cc::Yes, Lld::No) — Use "gcc" (cc which is really clang), the default
  • Darwin(Cc::No, Lld::No) — Use Apple's ld64 (ld)
  • Darwin(Cc::No, Lld::Yes) — Use LLVM's ld64.lld (lld in ld64 mode)
  • Darwin(Cc::Yes, Lld::Yes) — I think this variant is unused, and means "gcc"

I don't know how many people use ld directly. #103929 did change its behavior in terms of what it links (previously it would pass -lSystem and all the other libraries, and now it doesn't), and I don't know what that will break. It also now passes -arch and -platform_version which probably seems fine.
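The four combinations listed above can be written down as a small lookup. This is only a model of the list in this comment, not rustc's actual code; the `Darwin` tuple and `linker_for` function are hypothetical stand-ins for `LinkerFlavor::Darwin(Cc, Lld)`, and the `Cc::Yes, Lld::Yes` case follows the "I think" guess above.

```python
from typing import NamedTuple


class Darwin(NamedTuple):
    """Stand-in for LinkerFlavor::Darwin(Cc, Lld); True means ::Yes."""
    cc: bool
    lld: bool


def linker_for(flavor: Darwin) -> str:
    """Map a flavor to the program described in the list above."""
    if flavor.cc:
        # Both Cc::Yes variants go through the compiler driver ("gcc", really clang).
        return "cc"
    return "ld64.lld" if flavor.lld else "ld"


for flavor in (Darwin(True, False), Darwin(False, False), Darwin(False, True), Darwin(True, True)):
    print(flavor, "->", linker_for(flavor))
```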

@BlackHoleFox
Contributor

previously it would pass -lSystem and all the other libraries, and now it doesn't

That on the other hand seems like it would break stuff yea. Is there somewhere you saw that I missed looking at just the target specification? I didn't see any library linking arguments set.

I diffed the target specifications on 3f11d39 (before I started touching apple_base.rs) and then against my current branch, and the only differences in just the targets seem to be harmless/minor.

aarch64:

diff --git a/Users/fox/Downloads/before_cleanup_aarch64.json b/Users/fox/Downloads/after_cleanup_aarch64.json
index 865bfa43837..41854d7f020 100644
--- a/Users/fox/Downloads/before_cleanup_aarch64.json
+++ b/Users/fox/Downloads/after_cleanup_aarch64.json
@@ -29,6 +29,7 @@
   "os": "macos",
   "pre-link-args": {
     "gcc": [
+      "-m64",
       "-arch",
       "arm64"
     ],

i686:

diff --git a/Users/fox/Downloads/before_cleanup_i686.json b/Users/fox/Downloads/after_cleanup_i686.json
index e39cb59d694..b4184032496 100644
--- a/Users/fox/Downloads/before_cleanup_i686.json
+++ b/Users/fox/Downloads/after_cleanup_i686.json
@@ -29,9 +29,9 @@
   "os": "macos",
   "pre-link-args": {
     "gcc": [
+      "-m32",
       "-arch",
-      "i386",
-      "-m32"
+      "i386"
     ],
     "ld": [
       "-arch",

x86_64:

diff --git a/Users/fox/Downloads/before_cleanup_x86_64.json b/Users/fox/Downloads/after_cleanup_x86_64.json
index f06bcb33e37..30c851ceeeb 100644
--- a/Users/fox/Downloads/before_cleanup_x86_64.json
+++ b/Users/fox/Downloads/after_cleanup_x86_64.json
@@ -29,9 +29,9 @@
   "os": "macos",
   "pre-link-args": {
     "gcc": [
+      "-m64",
       "-arch",
-      "x86_64",
-      "-m64"
+      "x86_64"
     ],
     "ld": [
       "-arch",

@BlackHoleFox
Contributor

@ehuss I compiled a combination of no_std, with-std, and with/without including some #[link] attributes but to the best of my knowledge everything looks the same across compilers. -lSystem is passed the same way across stable/nightly/#105123 (both when expected with std and when my test code asks to link it). Diffing the whole linker invocation doesn't show anything else amiss either.

#105123 should actually fix this problem. It doesn't relate to the diff in my previous comment since that turned out to be unnecessary.

@timmyjose

We might drop older versions (#104385), but even after that, Mojave should still be supported (Mojave is 10.14, we'd be increasing the requirement to 10.12).

This is nice to hear. Anxiously waiting for this nightly bug to be fixed soon!

@bors bors closed this as completed in 7fe9597 Dec 4, 2022
@timmyjose

Did a rustup update and verified on Mojave (macOS Intel):

$ rustc --version
rustc 1.67.0-nightly (53e4b9dd7 2022-12-04)

Thank you all!

@ghost
Author

ghost commented Dec 5, 2022

Confirmed fixed here too. Thanks for what turned out to be a lot more hard work than I was expecting when I raised a "simple" bug report!

RalfJung pushed a commit to RalfJung/rust-analyzer that referenced this issue Apr 20, 2024
…lacrum

Build macOS distribution artifacts with XCode 13

After all of the `rust-lang/rust` Apple runners started using macOS 12, the builds created by CI began to use XCode 14.0.1. Due to this (as far as we can tell), XCode's build tools started to ignore the `MACOSX_DEPLOYMENT_TARGET` being defined by us for the distributed builds that let both `rustc` and `libstd` work on older versions. The current idea is that since XCode 14's macOS SDK doesn't support deployment targets before 10.13, it uses some default of its own. You can see the difference between stable's and the most recent nightly's supported versions [here](rust-lang/rust#104570 (comment)).

I wasn't able to confirm my SDK versioning hypothesis locally since I think there's something jammed with my XCode installation, but hopefully this should still fix it for releases.

Closes rust-lang/rust#104570

r? `@Mark-Simulacrum`
6 participants