-
Notifications
You must be signed in to change notification settings - Fork 17.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cmd/internal/obj: rework stack unwind metadata on LR machines #53609
Comments
Change https://go.dev/cl/412474 mentions this issue: |
…rames When we create a thread with signals blocked. But glibc's pthread_sigmask doesn't really allow us to block SIGSETXID. So we may get a signal early on before the signal stack is set. If we get a signal on the current stack, it will clobber anything below the SP. This CL makes it to save LR and decrement SP in a single MOVD.W instruction for small frames, so we don't write below the SP. We used to use a single MOVD.W instruction before CL 379075. CL 379075 changed to use an STP instruction to save the LR and FP, then decrementing the SP. This CL changes it back, just this part (epilogues and large frame prologues are unchanged). For small frames, it is the same number of instructions either way. This decreases the size of a "small" frame from 0x1f0 to 0xf0. For frame sizes in between, it could benefit from using an STP instruction instead of using the prologue for the "large" frame case. We don't bother it for now as this is a stop-gap solution anyway. This only addresses the issue with small frames. Luckily, all functions from thread entry to setting up the signal stack have samll frames. Other possible ideas: - Expand the unwind info metadata, separate SP delta and the location of the return address, so we can express "SP is decremented but the return address is in the LR register". Then we can always create the frame first then write the LR, without writing anything below the SP (except the frame pointer at SP-8, which is minor because it doesn't really affect program execution). - Set up the signal stack immediately in mstart in assembly. For Go 1.19 we do this simple fix. We plan to do the metadata fix in Go 1.20 ( #53609 ). Other LR architectures are addressed in CL 413428. Fix #53374. Change-Id: I9d6582ab14ccb06ac61ad43852943d9555e22ae5 Reviewed-on: https://go-review.googlesource.com/c/go/+/412474 Run-TryBot: Cherry Mui <[email protected]> TryBot-Result: Gopher Robot <[email protected]> Reviewed-by: Austin Clements <[email protected]> Reviewed-by: Eric Fang <[email protected]>
…rames When we create a thread with signals blocked. But glibc's pthread_sigmask doesn't really allow us to block SIGSETXID. So we may get a signal early on before the signal stack is set. If we get a signal on the current stack, it will clobber anything below the SP. This CL makes it to save LR and decrement SP in a single MOVD.W instruction for small frames, so we don't write below the SP. We used to use a single MOVD.W instruction before CL 379075. CL 379075 changed to use an STP instruction to save the LR and FP, then decrementing the SP. This CL changes it back, just this part (epilogues and large frame prologues are unchanged). For small frames, it is the same number of instructions either way. This decreases the size of a "small" frame from 0x1f0 to 0xf0. For frame sizes in between, it could benefit from using an STP instruction instead of using the prologue for the "large" frame case. We don't bother it for now as this is a stop-gap solution anyway. This only addresses the issue with small frames. Luckily, all functions from thread entry to setting up the signal stack have samll frames. Other possible ideas: - Expand the unwind info metadata, separate SP delta and the location of the return address, so we can express "SP is decremented but the return address is in the LR register". Then we can always create the frame first then write the LR, without writing anything below the SP (except the frame pointer at SP-8, which is minor because it doesn't really affect program execution). - Set up the signal stack immediately in mstart in assembly. For Go 1.19 we do this simple fix. We plan to do the metadata fix in Go 1.20 ( golang#53609 ). Other LR architectures are addressed in CL 413428. Fix golang#53374. Change-Id: I9d6582ab14ccb06ac61ad43852943d9555e22ae5 Reviewed-on: https://go-review.googlesource.com/c/go/+/412474 Run-TryBot: Cherry Mui <[email protected]> TryBot-Result: Gopher Robot <[email protected]> Reviewed-by: Austin Clements <[email protected]> Reviewed-by: Eric Fang <[email protected]>
Change https://go.dev/cl/424254 mentions this issue: |
Currently, gentraceback tracks the closure context of the outermost frame. This used to be important for "unstarted" calls to reflect function stubs, where "unstarted" calls are either deferred functions or the entry-point of a goroutine that hasn't run. Because reflect function stubs have a dynamic argument map, we have to reach into their closure context to fetch to map, and how to do this differs depending on whether the function has started. This was discovered in issue #25897. However, as part of the register ABI, "go" and "defer" were made much simpler, and any "go" or "defer" of a function that takes arguments or returns results gets wrapped in a closure that provides those arguments (and/or discards the results). Hence, we'll see that closure instead of a direct call to a reflect stub, and can get its static argument map without any trouble. The one case where we may still see an unstarted reflect stub is if the function takes no arguments and has no results, in which case the compiler can optimize away the wrapper closure. But in this case we know the argument map is empty: the compiler can apply this optimization precisely because the target function has no argument frame. As a result, we no longer need to track the closure context during traceback, so this CL drops all of that mechanism. We still have to be careful about the unstarted case because we can't reach into the function's locals frame to pull out its context (because it has no locals frame). We double-check that in this case we're at the function entry. I would prefer to do this with some in-code PCDATA annotations of where to find the dynamic argument map, but that's a lot of mechanism to introduce for just this. It might make sense to consider this along with #53609. Finally, we beef up the test for this so it more reliably forces the runtime down this path. It's fundamentally probabilistic, but this tweak makes it better. Scheduler testing hooks (#54475) would make it possible to write a reliable test for this. For #54466, but it's a nice clean-up all on its own. Change-Id: I16e4f2364ba2ea4b1fec1e27f971b06756e7b09f Reviewed-on: https://go-review.googlesource.com/c/go/+/424254 Run-TryBot: Austin Clements <[email protected]> Reviewed-by: Michael Pratt <[email protected]> Auto-Submit: Austin Clements <[email protected]> TryBot-Result: Gopher Robot <[email protected]> Reviewed-by: Cherry Mui <[email protected]>
Didn't get chance to do it in 1.20. Will do in 1.21. |
Change https://go.dev/cl/621775 mentions this issue: |
Change https://go.dev/cl/621459 mentions this issue: |
On LR machines, our function prologue generally looks like (stack bounds check and frame pointer handling omitted): store the LR at
-frame_size(SP)
, then decrement SP toSP-frame_size
(if they cannot be done on the same instruction, either due to the lack of the instruction on the architecture or the stack frame too large).If we decrement the SP first, if a signal arrives immediately after the instruction before saving the LR, the runtime will see the junk value at the LR slot and will fail to unwind the stack. So we store the LR first then decrement the SP.
This generally works. But in some cases if the signal stack is not set (e.g. #53374 , also iOS doesn't support
sigaltstack
), the signal will arrive on the current stack and the kernel will push the signal frame immediately below the SP, clobbering our saved LR. In those cases we have to store the LR again after decrementing the SP. This makes our function prologue a bit inefficient.We can avoid the problem if we expand the stack unwind metadata to express "the SP has decremented but the return address is still in the LR register". Currently, the metadata is just about SP delta and the runtime always assumes the return address can be found at
0(SP)
(for non-leaf functions). With the expanded metadata, we can decrement SP and then store the LR.This also makes our prologue more similar to C functions.
I plan to do it in Go 1.20.
The text was updated successfully, but these errors were encountered: