os: OpenFile may return EINTR on OS X #11180
Comments
I can reliably reproduce this on a recent Mac Pro running OS X 10.10.3 using a fuse file system that delays open(2) by sleeping and a program that opens files and runs sub-processes concurrently. Instructions:
This always results in output like:
@jacobsa this sounds like a bug in fuse; Go is doing the right thing.
@davecheney: Could you expand on your reasoning? I filed this bug thinking I'd probably hear that answer, but I would like to know why. The sigaction(2) man page doesn't explicitly guarantee that open(2) won't return
If a signal is installed with SA_RESTART, as is always true of signals installed by the Go runtime, then only a limited set of calls should ever return EINTR. On a GNU/Linux system the complete set can be found in signal(7). I don't know about Darwin. open is not on the list for GNU/Linux. Can you find the list for Darwin? On GNU/Linux it would be a clear bug if open ever returns EINTR for a signal whose handler is installed with SA_RESTART.

We do not want to sprinkle EINTR loops through the os and net packages unless they are documented as being required. I think we would prefer not to change the core Go libraries to work around bugs in FUSE.
I wonder if the inverse is true. Aaron, can you reproduce the issue under On Fri, Jun 12, 2015 at 4:16 PM, Ian Lance Taylor [email protected]
@ianlancetaylor: The OS X sigaction(2) man page says this:
The sentence structure is ambiguous, but I read that as saying that open(2) is not guaranteed to be restarted for regular files. Note that the Linux man page also unambiguously says that open(2) is only restarted when "it can block". Technically that is true of fuse files, but it is not true of regular files in general and so I wouldn't be too surprised if it didn't restart for any regular file.

@davecheney: I can't reproduce it on Linux with my minimized test case. The larger program where I found this issue deadlocks on Linux due to issue #11155, so I'm not sure about that one.
Could we regard this as a bug in the OSXFuse implementation?
If open/read/write can return EINTR, then I'm afraid a lot of (if not
all) syscall functions need to be rewritten to include a retry loop.
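
For readers unfamiliar with the pattern, the kind of retry loop being discussed can be sketched as follows. This is only an illustration under stated assumptions, not the code from any Go CL: the names retryOnEINTR, openFileRetry and attemptsWithInterrupts are hypothetical, and the interruption is simulated rather than produced by a real fuse file system.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"syscall"
)

// retryOnEINTR (hypothetical helper) calls fn again for as long as it
// reports EINTR, and returns the first result that is anything else.
func retryOnEINTR(fn func() error) error {
	for {
		err := fn()
		if !errors.Is(err, syscall.EINTR) {
			return err
		}
	}
}

// openFileRetry (hypothetical wrapper) is os.OpenFile inside that loop.
func openFileRetry(name string, flag int, perm os.FileMode) (*os.File, error) {
	var f *os.File
	err := retryOnEINTR(func() error {
		var e error
		f, e = os.OpenFile(name, flag, perm)
		return e
	})
	return f, err
}

// attemptsWithInterrupts simulates an operation whose first n calls are
// interrupted, and reports how many calls the loop needed in total.
func attemptsWithInterrupts(n int) (int, error) {
	calls := 0
	err := retryOnEINTR(func() error {
		calls++
		if calls <= n {
			// Same shape of error that os.OpenFile returns on failure.
			return &os.PathError{Op: "open", Path: "demo", Err: syscall.EINTR}
		}
		return nil
	})
	return calls, err
}

func main() {
	calls, err := attemptsWithInterrupts(2)
	fmt.Println(calls, err) // prints "3 <nil>"
}
```

A loop like this hides EINTR from every caller, which is the trade-off under debate: callers never see the error, at the cost of wrapping each affected syscall.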
EINTR, EAGAIN and EBUSY are special cases in POSIX, interestingly defined vaguely and implemented vaguely from filesystem to filesystem, and of course from operating system to operating system. OSXFUSE sending EINTR is perfectly in accordance with POSIX. It is the client application layered on top of OSXFUSE, perhaps your own filesystem built on OSXFUSE, that could handle this at the POSIX level and perhaps even want to make sense of it. For a simple example, see https://github.com/bazil/fuse/blob/master/fuse.go#L522. There are other good examples too: GlusterFS, which is built on top of FUSE, handles EINTR perfectly fine for all network operations, sockets, polling etc.

While writing this I stumbled upon Python PEP 0475. They thought convenient APIs on top of regular syscalls should be Python's responsibility (see http://bugs.python.org/issue23285 and https://www.python.org/dev/peps/pep-0475/), so Python 3.5 handles this automatically. Potentially it can be handled gracefully, perhaps to remove some of the burden of switch-casing through os.PathError*, but given a choice, personally, as a user of os.OpenFile(), I would prefer that it return the EINTR up the chain and let me handle it. Can be debated further .... :-)
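
As a rough illustration of the caller-side handling preferred above, a caller could inspect the error returned by os.OpenFile along these lines. This is a sketch for Unix-like systems only; classifyOpenError and simulatedEINTR are hypothetical names introduced here, and the EINTR is constructed by hand rather than triggered by a real signal.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"syscall"
)

// classifyOpenError (hypothetical) shows how a caller might switch on the
// errno wrapped in the *os.PathError that os.OpenFile returns on failure.
func classifyOpenError(err error) string {
	if err == nil {
		return "ok"
	}
	var pe *os.PathError
	if !errors.As(err, &pe) {
		return "other"
	}
	switch pe.Err {
	case syscall.EINTR:
		return "interrupted" // the caller decides whether to retry
	case syscall.ENOENT:
		return "missing"
	default:
		return "other"
	}
}

// simulatedEINTR (hypothetical) builds the kind of error a caller would
// see if open(2) were interrupted and the error passed up unchanged.
func simulatedEINTR() error {
	return &os.PathError{Op: "open", Path: "/mnt/fuse/demo", Err: syscall.EINTR}
}

func main() {
	fmt.Println(classifyOpenError(simulatedEINTR())) // prints "interrupted"
	fmt.Println(classifyOpenError(nil))              // prints "ok"
}
```

With this approach every call site that cares about EINTR has to carry its own check, which is the burden the PEP 0475 approach removes.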
@harshavardhana, addressing this:
To be clear, this is osxfuse returning

Thanks for pointing out PEP 0475; I agree with the logic and conclusion there. I do think it's funny that they call out Go as a language whose standard library retries on

I will file an issue with osxfuse about this to see what the author thinks. But I still do think that Go should handle this, given existing versions of osxfuse and given that the man page quoted above does not appear to exclude the possibility of
The man page for sigaction(2) doesn't guarantee that SA_RESTART will work for open(2) on regular files:

    The affected system calls include open(2), read(2), write(2), sendto(2), recvfrom(2), sendmsg(2) and recvmsg(2) on a communications channel or a slow device (such as a terminal, but not a regular file) and during a wait(2) or ioctl(2).

I've never observed EINTR from open(2) for a traditional file system such as HFS+, but it's easy to observe with a fuse file system that is slightly slow (cf. https://goo.gl/UxsVgB). After this change, the problem can no longer be reproduced when calling os.OpenFile.

Fixes golang#11180.

Change-Id: I967247430e20a7d29a285b3d76bf3498dc4773db
CL https://golang.org/cl/14484 mentions this issue.
runtime.setsig supports calling sigaction(2) with the SA_RESTART flag, which on OS X causes several system calls to be restarted when they would otherwise return EINTR. In particular, open(2) is one of these, but only for "slow devices" and not regular files. I don't have direct evidence, but I believe this is because on OS X an open of a regular file can't usually block in a way that can be interrupted.

I see that SA_RESTART is generally used for Go signal handlers: the default handlers all use it, and sigenable sets it. (The only exceptions appear to be SIGPIPE and SIGABRT.) From this I conclude that it's intended that os.OpenFile not leak EINTR out to the caller. I've never seen a retry loop around os.OpenFile, so this would make sense.

However, it seems on OS X open(2) of a file can be interrupted if the file is on a fuse file system. This came up for me when using os/exec.Command in parallel with opening a file backed by fuse. When the command completes the thread calling open(2) would sometimes receive SIGCHLD, causing errors like this:

Am I correct in assuming that os.OpenFile is not intended to leak EINTR so that users don't need to call in a loop? If so, should it be modified to contain an internal retry loop?