runtime: remove the mlock hack in Go 1.15 #40184
Comments
Can you link where this TODO is? The only relevant comment I found was on CL 223121, in the runtime/panic.go file, but it says:
and "after Go 1.15" means for Go 1.16. |
Since the kernel detection is done based off of major/minor numbers we actually have to apply a patch to internal builds of the compiler to include our own kernel builds which are not affected but fall in the affected range. I would love for this to be removed or at least optionally bypassed |
I don't think this needs further info from the author to be actionable... @gopherbot remove WaitingForInfo |
@nemith , can you explain why you need to patch the compiler? As of Go 1.14.1, the only effect of an mlock failure is an additional message if there's an uncaught panic, so I don't understand why that requires patching. |
If there is a way to detect the patched kernel, perhaps by reading /proc/version, I think we could implement that safely enough in 1.15. |
Re kernel version sources, see this and comments immediately preceding it: #37436 (comment) |
There's one deciding factor here: is the kernel patch widespread enough that the workaround is no longer necessary? I'm sure all the distros have it, but people aren't necessarily on top of upgrading their kernels. Unfortunately, I have no idea how to figure out the answer to this, and I think we should be fairly conservative about this given the nature of the kernel bug. |
On 7/13/20, Daniel Shaulov wrote:
The resolution to #35777 was to `mlock` the top of stack on "affected" kernels. But then it was discovered that major distros use kernels that appear by the version number to be affected but are actually patched (#37436).
It may be a bitch to do and contrary to Go best practices, but an "alert" (nag?) mode that notifies users that the kernel they run under needs a patch, or even better, gives them the ability to check if their kernel needs attention may be the only politically correct answer here.
Something that can easily be removed once the user has attended to it. Or, perhaps better, something they can add as part of the installation and drop as soon as possible.
Lucio. |
A compromise solution: omit the mlock workaround for a few major distros when the version is known to include the kernel fix. More distros could be added in later 1.15.1+ releases by folks who can test them. |
@DanielShaulov Would it address your problem if we provided an environment variable to disable the use of |
Change https://golang.org/cl/243658 mentions this issue: |
CL 243658 adds |
Hi, I am very sorry for not responding, didn't get notifications until I got pinged. @ALTree Here is the relevant TODO, it says "in Go 1.15" and not "after Go 1.15": go/src/runtime/os_linux_x86.go Lines 31 to 33 in 11f92e9
But I see that the other places used "after Go 1.15", so this one might have been a mistake? @networkimprov At the very least, all the 5.4.0 kernels that Ubuntu ever published were patched (I checked all the 5.4.0-* tags in their git). So I think just checking for that will cover a pretty large set of false positives. @ianlancetaylor See above about the confusion over which version this was supposed to be fixed in. I am OK with re-purposing the issue for 1.16; I was just worried that this had been forgotten and would be left like this for the whole duration of Ubuntu 20.04. |
@DanielShaulov Thanks. Since the I'm not sure how to proceed with checking major distros. What we need is some mapping from the |
Well - first we need to get our distro name, for that I think /etc/os-release is the most reliable way. We just need the Then, looking at #37436 (comment) we can probably just parse the end of the kernel release (the part after For Ubuntu, it is a dead simple |
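As a rough illustration of the /etc/os-release idea in the comment above, here is a minimal user-space sketch in Go. It is not the runtime's code (the runtime cannot import `os` or `strings`), and the helper name `parseOSRelease` is hypothetical. The comment's exact field name did not survive on this page, so reading the `ID` key is an assumption. Quote handling is simplified to trimming surrounding quote characters.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// parseOSRelease (hypothetical helper) turns the KEY=value lines of an
// os-release style file into a map. Blank lines and '#' comments are skipped,
// and surrounding single or double quotes are trimmed from values.
func parseOSRelease(content string) map[string]string {
	fields := make(map[string]string)
	for _, line := range strings.Split(content, "\n") {
		line = strings.TrimSpace(line)
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		key, value, found := strings.Cut(line, "=")
		if !found {
			continue // not a KEY=value line
		}
		fields[key] = strings.Trim(value, `"'`)
	}
	return fields
}

func main() {
	data, err := os.ReadFile("/etc/os-release")
	if err != nil {
		fmt.Println("could not read /etc/os-release:", err)
		return
	}
	fields := parseOSRelease(string(data))
	// Assumption: the distro identifier lives under the ID key.
	fmt.Println("ID:", fields["ID"])
}
```

A distro check built on this would still have to be paired with the kernel release string, since the distro name alone says nothing about which kernel build is running.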
https://golang.org/cl/244059 is code that skips the mlock on Ubuntu 5.4 systems, based on reading |
Change https://golang.org/cl/244059 mentions this issue: |
@ianlancetaylor you could leave this issue open for discussion of other distros during the 1.16 cycle. |
In the 1.16 cycle we're going to remove the |
But you may add other distros to the don't-mlock list in 1.15.x. |
@networkimprov True, but I don't see that as a reason to keep this issue open. There is no action that we would take to close the issue. |
Change https://golang.org/cl/246200 mentions this issue: |
Go 1.14 included a (rather awful) workaround for a Linux kernel bug that corrupted vector registers on x86 CPUs during signal delivery (https://bugzilla.kernel.org/show_bug.cgi?id=205663). This bug was introduced in Linux 5.2 and fixed in 5.3.15, 5.4.2 and all 5.5 and later kernels. The fix was also back-ported by major distros. This workaround was necessary, but had unfortunate downsides, including causing Go programs to exceed the mlock ulimit in many configurations (#37436).

We're reasonably confident that by the Go 1.16 release, the number of systems running affected kernels will be vanishingly small. Hence, this CL removes this workaround.

This effectively reverts CLs 209597 (version parser), 209899 (mlock top of signal stack), 210299 (better failure message), 223121 (soft mlock failure handling), and 244059 (special-case patched Ubuntu kernels). The one thing we keep is the osArchInit function. It's empty everywhere now, but is a reasonable hook to have.

Updates #35326, #35777 (the original register corruption bugs).
Updates #40184 (request to revert in 1.15).
Fixes #35979.

Change-Id: Ie213270837095576f1f3ef46bf3de187dc486c50
Reviewed-on: https://go-review.googlesource.com/c/go/+/246200
Run-TryBot: Austin Clements
TryBot-Result: Gobot Gobot
Reviewed-by: Ian Lance Taylor
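To make the version ranges in the commit message concrete, here is a small sketch in Go of a check based purely on the kernel release string. It is not the runtime's removed version parser; the function names `parseKernelVersion` and `looksAffected` are hypothetical, and the ranges are taken only from the text above (introduced in 5.2; fixed in 5.3.15, 5.4.2, and 5.5 and later). It also shows why version numbers alone produce false positives: a back-ported distro kernel such as a 5.4.0 build still "looks" affected.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseKernelVersion (hypothetical helper) extracts major, minor, and patch
// from a release string such as "5.4.0-42-generic". A missing patch component
// is treated as 0. ok is false if the numeric prefix cannot be parsed.
func parseKernelVersion(release string) (major, minor, patch int, ok bool) {
	// Keep only the leading run of digits and dots, dropping suffixes
	// like "-42-generic".
	end := strings.IndexFunc(release, func(r rune) bool {
		return r != '.' && (r < '0' || r > '9')
	})
	if end >= 0 {
		release = release[:end]
	}
	parts := strings.Split(release, ".")
	if len(parts) < 2 {
		return 0, 0, 0, false
	}
	nums := make([]int, 3)
	for i := 0; i < len(parts) && i < 3; i++ {
		n, err := strconv.Atoi(parts[i])
		if err != nil {
			return 0, 0, 0, false
		}
		nums[i] = n
	}
	return nums[0], nums[1], nums[2], true
}

// looksAffected reports whether the version number alone falls in the range
// described in the commit message. It cannot see distro back-ports.
func looksAffected(release string) bool {
	major, minor, patch, ok := parseKernelVersion(release)
	if !ok || major != 5 {
		return false
	}
	switch minor {
	case 2:
		return true // no fixed 5.2.x release is mentioned above
	case 3:
		return patch < 15
	case 4:
		return patch < 2
	}
	return false
}

func main() {
	for _, r := range []string{"5.3.14", "5.3.15", "5.4.0-42-generic", "5.5.0"} {
		fmt.Println(r, looksAffected(r))
	}
}
```

Because the check sees only numbers, it flags the patched Ubuntu 5.4.0 kernels discussed in this thread, which is exactly the false positive that motivated the special case and, later, the removal.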
The resolution to #35777 was to `mlock` the top of stack on "affected" kernels. But then it was discovered that major distros use kernels that appear by the version number to be affected but are actually patched (#37436). It was also discovered that docker under systemd has a very low limit on `mlock`ed pages (same issue). The resolution for that was to stop reporting failures of `mlock` and delay the warning until a crash is caught.
There is a TODO comment in the code to remove all of that hack for Go 1.15, since unpatched kernels are unlikely to be encountered, but Go 1.15 is due to be released in a short time and the code is still there.
I think it is important to remove the workaround since Ubuntu 20.04 LTS uses a patched 5.4.0 kernel. This means that any user on Ubuntu 20.04 will still unnecessarily mlock pages, and if they run in a docker container, that warning will be displayed for every crash, regardless of the fact that their kernel is not really buggy. So those users might be sent on a wild goose chase trying to understand and read all this info, and it will have nothing to do with their bug, probably for the entirety of the Ubuntu 20.04 life cycle.
Relevant code is in src/runtime/os_linux_x86.go