MacOS /nix unmount when reboot. /nix ownership change to root #4640
Comments
Can you see what (I'll go into a little more detail later...)
Edits:
I don't recall seeing this specific problem before, but we have had at least a few people turn up with nix+macOS+VM issues over the past few months. Nothing solid has come out of these reports yet, in part because everyone works around their immediate problem and forgets or loses interest in debugging it.
My general hunch for a while has been that some cloud providers are doing something ~weird (i.e., that desktop users don't do) when they set up their VMs, and that this is what causes the trouble. I tried Nix out in a VM several weeks back while I was suffering through trying to debug an unrelated issue that required repeatedly updating macOS. While it was a complete pain, I didn't run into any of the issues described so far.
But, in the process of researching that, I did stumble on this: https://mrmacintosh.com/securetoken-documentation/ From that description, it's at least plausible that they're setting up accounts without SecureToken, and that this causes the trouble. But, we still need to validate that thesis, find a fix, and figure out if it's practical for us to fix it automagically on install or if it's the sort of thing we'll just have to test for and complain about.
Since there's nothing solid to point to, I'll collect some links to existing discussions about this:
For completeness, here are IRC logs covering this troubleshooting attempt:
fstab still works fine in Catalina (and in Big Sur). Your /nix is probably owned by root because nothing is mounted there. AFAIK, any mount point described by /etc/synthetic.conf will be owned by root until/unless some other user successfully mounts something over it.
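One way to check the first half of that claim is to confirm that /etc/fstab actually contains an entry for /nix. The following is a minimal sketch, not part of the Nix installer: the helper name `fstab_entry_for` is hypothetical, and it assumes the usual whitespace-separated fstab layout in which the second field is the mount point.

```shell
# Hypothetical helper (not from the Nix installer): print the fstab line,
# if any, whose mount point (second field) equals the given path.
# Usage: fstab_entry_for MOUNT_POINT [FSTAB_FILE]
fstab_entry_for() {
  mount_point=$1
  fstab_file=${2:-/etc/fstab}

  # An unreadable or missing fstab counts as "no entry".
  [ -r "$fstab_file" ] || return 1

  # Skip lines whose first field starts with '#'; exit 0 only when a
  # matching entry was found.
  awk -v mp="$mount_point" '
    $1 !~ /^#/ && $2 == mp { print; found = 1 }
    END { exit !found }
  ' "$fstab_file"
}
```

If `fstab_entry_for /nix` prints a line but `ls -ld /nix` still shows root as the owner after boot, that would be consistent with the entry being present while the mount itself did not happen.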
@abathur thanks for getting back to me so quickly.
I am not super familiar with nix, nor with Macs in general; I am using it now for my new job. A correction to my initial report: when I said reboot, I actually meant starting a brand new mac1.metal instance from the AMI (Amazon Machine Image) created from the first machine. Basically, I am using Packer to provision nix onto a machine that is later baked into an AMI.
Do you have a "password" for that account? With the caveat that I don't really know what we're doing here, I think you can enable this with something like (I'm basing this on If
so after enabling the security token (run as root)
After a reboot, the nix volume is attached. (Not sure if this will work with the AMI process though, i.e. creating an image that already has the security token generated.) Now I am seeing a different issue on the build machine.
I am running the build agent as a daemon by having a plist in /Library/LaunchDaemons. The build agent gets launched when my mac1.metal EC2 machine boots. If I log in though and run the build agent locally then (tried this solution setting sandbox and extra path in my nix.conf)
Progress! This latter issue is at least something others have reported. I'm curious about the build agent and what you're using it for here? There's also a --daemon install of Nix (which will likely become the only supported install after #4289) that'll run as root and use nixbld users for builds, but maybe you've already ruled it out in your situation?
If you know you need a distinct build agent, I'm curious how your launchdaemon differs from the one a daemon install would use, and whether those differences matter here: https://github.com/NixOS/nix/blob/master/misc/launchd/org.nixos.nix-daemon.plist.in If you want something less async, we can also talk in #nix-darwin on IRC
I am using buildkite, so the buildkite-agent is the one invoking nix. I basically set all the nix env vars in the plist so the agent shell would have the nix context (equivalent of doing
(I have been using ec2-user in our conversation thus far in an attempt to simplify the discussion, since ec2-user is the default user, but it seems like the nix issue I am seeing is not system-wide. Something is funny with nix when I run the buildkite-agent daemon.) You can replace all instances of ec2-user above with buildkite-agent. I ran and installed nix as buildkite-agent. This is the plist of buildkite-agent
if I launch the daemon but I can successfully build with nix if I invoke the agent directly after ssh-ing into the machine. I will also give the multi-user install a shot and report back.
That would be great, can you give me directions or a link to the chat?
If you already have an IRC client, you can find us on freenode. If you don't, I gather you can use the webchat via https://webchat.freenode.net/#nix-darwin
After setting BUILDKITE_SHELL to /bin/sh, the dyld error went away. Now seeing
After setting /bin/sh as the BUILDKITE_SHELL env var, the nix env vars are somehow not set (they are still in the plist).
for my personal note - summary from yesterday's discussion
then a reboot also fixes the issue without the security token.
Interesting. I do have some comments in my installer PR about enableOwnership. Have you re-tried with my hosted installer, or is this still with the official one? I'm curious what the nix volume line in /etc/fstab says. Will also ping you on IRC.
Summary of yesterday's IRC chat, for documentation purposes
I'm not sure whether Nix does or doesn't in this context. My understanding is that a few people have worked around issues like this by adding an FDA exemption for /bin/sh (because the launchdaemon for nix-daemon uses /bin/sh). The last comment in the other thread asked about whether you'd added the FDA exemption for I had some thoughts late yesterday about removing/replacing the remaining homedir references from your launchd plist. I'm curious if you did try that (don't feel obliged, mainly wondering if we should follow up on that possibility).
do you mean this or something else? I am willing to test drive
current state of the plist
@klardotsh I think we've exhausted our current ideas for getting this to work without the full-disk-access security exemption--do you remember what hoops you needed to jump through to get a VNC session?
👋 @klardotsh, for the above reference
Here are the instructions on how to get a VNC session to mac1.metal instances https://gist.github.com/sebsto/6af5bf3acaf25c00dd938c3bbe722cc1
Those instructions look roughly like what I followed (which was https://www.lets-talk-about.tech/2020/12/aws-create-macos-desktop.html), so they should work. I've sadly been juggling a lot of things so haven't dived too far into the Nix-on-Mac rabbit hole lately (a coworker got Nix working on his Mac with far fewer issues than on my EC2 instance, so it fell down the priority list a bit)
I marked this as stale due to inactivity. → More info |
I also ran into build issues due to my /nix being owned by root. I'm not sure if it was initially owned by my user, because I didn't check when I installed it. I am on an M1 MacBook Pro (not a VM). I will try to reinstall and see if the owner is still root. |
Describe the bug
On an AWS EC2 Mac instance running Catalina 10.15.7, I installed nix with the recommended approach:
sh <(curl -L https://nixos.org/nix/install) --darwin-use-unencrypted-nix-store-volume
This works great. You can see /nix is mounted
and /nix is owned by ec2-user
Problem
However, when I reboot, the nix volume doesn't auto-mount (maybe /etc/fstab is no longer used by Catalina?) and /nix is now owned by root.
I can get around it by
sudo mount_apfs disk2s6 /nix
but I am using these Mac EC2 instances for CI purposes, and the process would fail due to
/Users/ec2-user/.nix-profile/etc/profile.d/nix.sh: Operation not permitted
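For a CI job, one option is to fail early with a clear message when nothing is mounted on /nix, rather than failing later while sourcing nix.sh. The following is a minimal sketch under stated assumptions: the function names `is_mounted_on` and `require_mounted` are hypothetical, and the check assumes that `mount` prints lines of the form `<device> on <mount point> ...`.

```shell
# Hypothetical helper: read a mount table on stdin and succeed if some
# line reports a filesystem mounted on exactly the given path.
# Assumes lines shaped like "<device> on <path> (...)".
is_mounted_on() {
  grep -qF -- " on $1 "
}

# Hypothetical guard for a CI step: return non-zero with a message on
# stderr when nothing is mounted on the given path.
require_mounted() {
  if mount | is_mounted_on "$1"; then
    return 0
  fi
  echo "$1 is not mounted; refusing to continue" >&2
  return 1
}

# Example use at the top of a build script (left commented out here):
#   require_mounted /nix || exit 1
#   . "$HOME/.nix-profile/etc/profile.d/nix.sh"
```

This only detects the missing mount; it does not attempt the `mount_apfs` workaround itself.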
Steps To Reproduce
described above
Expected behavior
The nix volume is mounted at boot and /nix is owned by the user who executed the install script.
nix-env --version output
nix (Nix) 2.3.10
Additional context
I am running these on AWS EC2 mac1.metal instances