cmd/go: slow "native" performance with Mac OS X 10.14.1 and 10.12.6 #28739
Comments
Here's what I got from running the above script on a 13-inch Mac (macOS Sierra 10.12.6).
Without Docker:
With Docker:
Sorry, forgot to cc @rsc in the original description.
Would a dtruss/strace log help here? Just wondering whether there is any OS-level variable at play.
cc @jayconrod
The same problem occurs on my Windows 10 machine:
What version of Go are you using (go version)?
What operating system and processor architecture are you using (go env)?
I don't have an answer to why macOS 10.12 and 10.14 performance would be different, but I think I know why Docker and macOS are different. As far as I understand, Docker on macOS actually runs a Linux kernel and userspace under a hypervisor. That means overhead for I/O and system calls is closer to native Linux overhead than to macOS overhead. "go list" does a lot of
Here are the results from a quick benchmark I ran.
macOS native:
macOS Docker:
Notably, the
We have a monorepo project with 40k LOC, 700k LOC of dependencies, and 30 binaries, with code generation before the build and so forth. Running all:
go build -o build/ ./gen/... # build code generators
go generate ./...
go test ./...
go build -o build/$(GOOS)_$(GOARCH)/ ./cmd/...
The build cache contains only this one project. When making a simple change to the project, build times are 5.3 sec vs. 2.8 sec. For building a single binary, the numbers are 1.35 sec vs. 0.85 sec. We are using modules, everything is downloaded, there is no cgo, and we are not cross-compiling. Using a RAM disk on macOS for
I'm testing this on a quite beefy machine, but the difference is more noticeable on my colleagues' smaller laptops. I'll double check, but the build times there were more in the range of 5-10 seconds. This can be quite noticeable, and I'm curious how we can debug this a bit further.
Update 1: The project and dependencies are about 1700 Go files at this point.
The speed difference was already noticeable with go1.13, when I started playing with a Fedora laptop a while ago. Not sure about go1.12.
@randall77 you mean that I/O syscalls are slower on Darwin than on Linux?
@magiconair, yes, syscalls will be somewhat slower because they go through libc now. But I was particularly thinking of fsync which is much more expensive on Darwin. The regular fsync is kinda broken, so we have to use ioctl with
Are we sure that what we are seeing here is not just Docker for Mac virtualization doing some very aggressive in-memory caching? We are not just comparing Darwin to Linux; we are comparing Darwin to Hyperkit+Linux. In the numbers shared by @myitcv in the issue description, native macOS is actually faster on the first cold run:
versus this for Docker:
Update: I assumed everybody was using Docker for Mac, but maybe you use another solution to run Docker on your Mac?
@randall77 is there some indication that this situation will improve in the future? Is Apple addressing the
I don't know of anyone working on it, or anyone who has plans in this area. So, no.
I have no idea.
I think if we want to make progress here we need to understand why it is so much slower. For build times, I don't think a slow fsync would be the cause. Building Go code is mostly reading; it's probably a stat or read call. But that's a guess; data would help.
Do you have a quick tip on how to collect this data? Otherwise, I'll try and ask the internet.
Run with strace?
Closing as obsolete, since this has been open a while. Comparing I/O performance between native macOS and Docker (virtualized Linux) is somewhat apples and oranges. For whatever reason, Linux seems to be a lot faster on the same hardware. There is certainly room for better I/O optimization in cmd/go, and it's probably worth studying further whether I/O changes much by macOS version (holding hardware and Go version constant).
Just FWIW, this is a problem with Apple NVMe drives (T2 and M1 Macs alike). Their cache flush performance is abysmal. This affects both macOS and (native) Linux (presumably existing VM solutions fail to forward this as F_FULLFSYNC if they run faster). You get about 46 IOPS of F_FULLFSYNC on macOS on an M1 machine, and about the same on native Linux with fsync(). So there isn't much to be done on the software side other than deciding whether you care about data integrity on macOS or not; on Linux there is no way for user processes to control this as far as I know (i.e. flush to cache but not the cache), though it can be emulated by setting the drive write cache type to "write through" in sysfs (which tells the kernel to never issue flush commands, so fsync() behaves like it does on macOS).
For the benefit of others, @marcan also posted this fantastic thread: https://twitter.com/marcan42/status/1494213855387734019
Following a debug session with @fatih
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
Yes
What operating system and processor architecture are you using (go env)?
go env Output
What did you do?
The symptom that @fatih and I were investigating was godef performing ~2x more slowly on his fast machine than on my slower machine.
@fatih's machine specs for reference:
We think we have a smaller reproduction that demonstrates what appears to be an OS X 10.14.n issue that may or may not be related to Go. But go list appears to be a good way to demonstrate the problem and hopefully therefore a good place to start further investigation.
The following was run in a Terminal on @fatih's setup, and then within a Docker container on the same machine. Run-times of the go list commands were compared:
The "native" terminal run shows "average" speeds of ~300ms:
Whereas the Docker run shows "average" speeds of ~180ms:
What did you expect to see?
Similar run-times in each, possibly even faster times in the "native" OS X environment.
What did you see instead?
The "native" run-times are almost 70% longer.
cc @bcmills (and FYI @ianthehat for any go/packages side-effects)