-
Notifications
You must be signed in to change notification settings - Fork 17.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
crypto/rand: crash process on error reading randomness #66821
Comments
Do you mean crypto/rand should panic on errors? I'm not very familiar with Go but I didn't think it used the terminology of "throwing" errors. I agree, on general language-independent robustness principles, that it should panic. |
Panics are recoverable, throw is an internal name for fatal errors. Think of it as a call to If we made crypto/rand panic that risks encouraging applications to wrap the calls in defer/recover "for robustness", when really we think it's so unlikely and so unrecoverable that applications shouldn't try. Lines 1008 to 1022 in 519f6a0
|
I think all our code already panics on crypto/rand errors (via wrapppers that don't return errors) so SGTM 😀 |
WASI 0.2 random interface thankfully does not return an error: https://github.com/WebAssembly/wasi-random/blob/main/wit/random.wit |
Sounds like we need a rand v3 |
This is the best argument for always using |
Should we keep the |
This is tempting, but I think making decisions based on kernel version is opening a can of worms. I think even the urandom fallback is reasonably reliable: the file is opened only once, so either crypto/rand never works in a given process or it always works (although it might flake across process executions, if you run out of fds before the first Read call, or if the file is removed). |
Ah I thought it opened a new file each time. Then can we use import time side effects to solve this (open the file in I get the std tries to not do that, but the overwhelming majority of cases init will start running, try |
I was thinking about that but I have no intuition as to whether the cost of calling getrandom (to check if it's available) on |
If crypto/rand is imported (even indirectly), it should be fine to make a single system call during init. I'm kinda surprised it doesn't do so already. |
Does it? On linux when there is random in auxv, then it does not even open it. Lines 44 to 58 in f17b28d
|
@mateusz834 True, on linux if we get auxv randomness we don't read /dev/urandom. |
This proposal has been added to the active column of the proposals project |
Have all remaining concerns about this proposal been addressed? The proposal is to document that crypto/rand.Read and crypto/rand.Reader.Read always return the full amount requested and never return errors. If the underlying OS returns an error, the Go process will runtime.throw, meaning the process crashes with no chance to recover. But no underlying OS’s actually return errors from random reads anymore. This lets callers simplify and delete their dead error handling paths. |
FYI, the Reader is a var and people might change it to a custom Reader implementation. I wonder whether that might be an issue for us here?
@rsc I think that we cannot document rand.Read as such. I think that, we can only document the default |
Also it seems like plan9 always opens go/src/crypto/rand/rand_plan9.go Lines 36 to 49 in 74a4918
CC @0intro |
Change https://go.dev/cl/602497 mentions this issue: |
Change https://go.dev/cl/602496 mentions this issue: |
Change https://go.dev/cl/602495 mentions this issue: |
Change https://go.dev/cl/608175 mentions this issue: |
A follow-up question from the CL: should we have a |
Change https://go.dev/cl/608435 mentions this issue: |
The fallback was reachable on - Linux, where starting in Go 1.24 we require a kernel with getrandom(2), see #67001. - FreeBSD, which added getrandom(2) in FreeBSD 12.0, which we require since Go 1.19. - OpenBSD, which added getentropy(2) in OpenBSD 5.6, and we only support the latest version. - DragonFly BSD, which has getrandom(2) and where we support only the latest version. - NetBSD, where we switched to kern.arandom in CL 511036, available since NetBSD 4.0. - illumos, which has getrandom(2). (Supported versions unclear.) - Solaris, which had getrandom(2) at least since Oracle Solaris 11.4. - AIX, which... ugh, fine, but that code is now in rand_aix.go. At the end of the day the platform-specific code is just a global func(b []byte) error, so simplified the package around that assumption. This also includes the following change, which used to be a separate CL. crypto/rand: improve getrandom batching and retry logic The previous logic assumed getrandom never returned short, and then applied stricter-than-necessary batch size limits, presumably to avoid short returns. This was still not sufficient because above 256 bytes getrandom(2) can be interrupted by a signal and return short *or* it can simply return EINTR if the pool is not initialized (regardless of buffer size). https://man.archlinux.org/man/getrandom.2#Interruption_by_a_signal_handler Whether this ever failed in practice is unknown: it would have been masked by the /dev/urandom fallback before. Instead, we apply buffer size limits only where necessary (really, only Solaris in practice and FreeBSD in theory) and then handle gracefully short returns and EINTR. Change-Id: I8677b457aab68a8fb6137a3b43538efc62eb7c93 It turns out that we now know that large getrandom calls *did* fail in practice, falling back on /dev/urandom, because when we removed the fallback TestBidiStreamReverseProxy with its 4KiB read started failing. https://cr-buildbucket.appspot.com/build/8740779846954406033 For #66821 Change-Id: Iaca62997604f326501a51401cdc2659c2790ff22 Reviewed-on: https://go-review.googlesource.com/c/go/+/602495 Reviewed-by: Roland Shoemaker <[email protected]> Reviewed-by: David Chase <[email protected]> Reviewed-by: Daniel McCarney <[email protected]> LUCI-TryBot-Result: Go LUCI <[email protected]>
Reintroduce the urandom fallback, but this time with a robust set of tests all pointing guns at each other, including a seccomp'd respawn simulating the lack of getrandom, to make sure the fallback both works and is never hit unexpectedly. Unlike the Go 1.23 fallback, the new one only triggers on ENOSYS (which is cached by unix.GetRandom) and doesn't handle the EAGAIN errors we never got an explanation for. We still crash the program from Read if we have to go to /dev/urandom and we fail to open it. For #67001 Updates #66821 Tested on legacy SlowBots (without plan9 and illumos, which don't work): TRY=aix-ppc64,dragonfly-amd64,freebsd-amd64,freebsd-386,netbsd-amd64 Cq-Include-Trybots: luci.golang.try:gotip-darwin-amd64_14,gotip-solaris-amd64,gotip-js-wasm,gotip-wasip1-wasm_wasmtime,gotip-wasip1-wasm_wazero,gotip-windows-amd64,gotip-windows-386,gotip-linux-386,gotip-linux-amd64-longtest-race,gotip-linux-amd64-boringcrypto Change-Id: Idecc96a18cd6363087f5b2a4671c6fd1c41a3b0e Reviewed-on: https://go-review.googlesource.com/c/go/+/608175 Reviewed-by: Daniel McCarney <[email protected]> Reviewed-by: Cherry Mui <[email protected]> Reviewed-by: Roland Shoemaker <[email protected]> LUCI-TryBot-Result: Go LUCI <[email protected]>
For #66821 Change-Id: I525c308d6d6243a2bc805e819dcf40b67e52ade5 Reviewed-on: https://go-review.googlesource.com/c/go/+/608435 LUCI-TryBot-Result: Go LUCI <[email protected]> Reviewed-by: Daniel McCarney <[email protected]> Reviewed-by: Michael Knyszek <[email protected]> Reviewed-by: Roland Shoemaker <[email protected]>
Change https://go.dev/cl/621979 mentions this issue: |
We realized a GODEBUG is actually a security risk here: most programs will start to ignore errors from Read because they can't happen (which is the intended behavior), but then if a program is run with Note that the proposal was accepted without the GODEBUG, which was only added later. |
Change https://go.dev/cl/622115 mentions this issue: |
A GODEBUG is actually a security risk here: most programs will start to ignore errors from Read because they can't happen (which is the intended behavior), but then if a program is run with GODEBUG=randcrash=0 it will use a partial buffer in case an error occurs, which may be catastrophic. Note that the proposal was accepted without the GODEBUG, which was only added later. This (partially) reverts CL 608435. I kept the tests. Updates #66821 Change-Id: I3fd20f9cae0d34115133fe935f0cfc7a741a2662 Reviewed-on: https://go-review.googlesource.com/c/go/+/622115 LUCI-TryBot-Result: Go LUCI <[email protected]> Reviewed-by: Carlos Amedee <[email protected]> Reviewed-by: Roland Shoemaker <[email protected]> Auto-Submit: Filippo Valsorda <[email protected]> Reviewed-by: Daniel McCarney <[email protected]>
After #66821 crypto/rand.Read can't return an error. Change-Id: I185063a25ef70986448f2a300e5578de17f6e61e Reviewed-on: https://go-review.googlesource.com/c/go/+/621979 Auto-Submit: Filippo Valsorda <[email protected]> LUCI-TryBot-Result: Go LUCI <[email protected]> Reviewed-by: Michael Knyszek <[email protected]> Reviewed-by: Russ Cox <[email protected]> Reviewed-by: Daniel McCarney <[email protected]>
@FiloSottile Does this change need to be mentioned in Go 1.24 release notes? Thanks. |
I think this does need to be in the release notes. Sent https://go.dev/cl/632036. Thanks. |
Change https://go.dev/cl/632036 mentions this issue: |
For #66821 Change-Id: Id9b640a57b9d4d1f9114769f607480b14961e7b3 Reviewed-on: https://go-review.googlesource.com/c/go/+/632036 Reviewed-by: Russ Cox <[email protected]> Reviewed-by: Dmitri Shuralyov <[email protected]> Auto-Submit: Ian Lance Taylor <[email protected]> LUCI-TryBot-Result: Go LUCI <[email protected]>
On almost all our platforms, we now have crypto/rand backends that ~never fail.
On Linux, we primarily use the getrandom(2) system call, which never fails.
getrandom()
was first available in Linux 3.17, released in October 2014. Debian oldstable is on Linux 5.10.getrandom()
can be blocked with seccomp. That's a bad (and weird) idea, and the default Docker profile doesn't do that. In that case we fall back to opening/dev/urandom
, which might fail if the file is not available or file descriptors run out.On macOS and iOS we use
arc4random()
since https://go.dev/cl/569655. From the man page:On Windows we use the ProcessPrng function. From the docs:
The BSDs use similar syscalls with similar properties (whether getrandom or getentropy) although we should switch the ones we can to arc4random.
On
js/wasm
we use getRandomValues which doesn't have documented failure modes.On WASIP1 there's
random_get
which regrettably has an error return value, making it the one platform (ignoring misconfigured Linux) where there might be errors getting platform random bytes. Since WASI rests on an underlying platform, and every underlying platform has failure-less CSPRNGs, it's hard to imagine whyrandom_get
should actually return an error.I'm proposing we make crypto/rand throw (irrecoverably crash the program) if an error occurs, and document that the error return values of
crypto/rand.Read
andcrypto/rand.Reader.Read
are always nil.This will free applications from having to do error handling for a condition that essentially can't happen, and that if it did happen is essentially not possible to handle securely by the application.
This will also allow introducing new APIs like a hypothetical
String(charset string) string
(not part of this proposal) without an error return, making them more usable and appealing.Based on a suggestion by @rsc.
/cc @golang/security @golang/proposal-review
The text was updated successfully, but these errors were encountered: