Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
runtime: fix racy allgs access on weak memory architectures
Currently, markroot is very clever about accessing the allgs slice to find stack roots. Unfortunately, on weak memory architectures, it's a little too clever and can sometimes read a nil g, causing a fatal panic. Specifically, gcMarkRootPrepare snapshots the length of allgs during STW and then markroot accesses allgs up to this length during concurrent marking. During concurrent marking, allgadd can append to allgs *without synchronizing with markroot*, but the argument is that the markroot access should be safe because allgs only grows monotonically and existing entries in allgs never change. This reasoning is insufficient on weak memory architectures. Suppose thread 1 calls allgadd during concurrent marking and that allgs is already at capacity. On thread 1, append will allocate a new slice that initially consists of all nils, then copy the old backing store to the new slice (write A), then allgadd will publish the new slice to the allgs global (write B). Meanwhile, on thread 2, markroot reads the allgs slice base pointer (read A), computes an offset from that base pointer, and reads the value at that offset (read B). On a weak memory machine, thread 2 can observe write B *before* write A. If the order of events from thread 2's perspective is write B, read A, read B, write A, then markroot on thread 2 will read a nil g and then panic. Fix this by taking a snapshot of the allgs slice header in gcMarkRootPrepare while the world is stopped and using that snapshot as the list of stack roots in markroot. This eliminates all read/write concurrency around the access in markroot. Alternatively, we could make markroot use the atomicAllGs API to atomically access the allgs list, but in my opinion it's much less subtle to just eliminate all of the interesting concurrency around the allgs access. Fixes #49686. Fixes #48845. Fixes #43824. (These are all just different paths to the same ultimate issue.) Change-Id: I472b4934a637bbe88c8a080a280aa30212acf984 Reviewed-on: https://go-review.googlesource.com/c/go/+/368134 Trust: Austin Clements <[email protected]> Trust: Bryan C. Mills <[email protected]> Run-TryBot: Austin Clements <[email protected]> TryBot-Result: Go Bot <[email protected]> Reviewed-by: Michael Knyszek <[email protected]> Reviewed-by: Cherry Mui <[email protected]>
- Loading branch information