Assert failure: !CREATE_CHECK_STRING(pMT && pMT->Validate()) #67046
Comments
Tagging subscribers to this area: @JulieLeeMSFT Issue Details
GCStress wasn't set, but this is typically a GC info bug. JitStress is probably irrelevant.
After configuring a Linux/arm32 environment (using Docker on Linux/arm64), with Release libraries and Checked coreclr, I can easily reproduce this failure, with
The stack is:
and it looks like we're reporting slot SP offset 0x20 in
:
but I'm less sure of that. This does match the JitStress=2 codegen from TryFormat, but I don't currently have a working LLDB in my Linux/arm32 Docker setup, so no SOS either.
I set up a Raspberry Pi 4 with a 32-bit server Ubuntu installation (Ubuntu 20.04.4 LTS), and I have a working LLDB 10 there, but I can't get the bug to repro in that environment.
@janvorli @AndyAyersMS I probably need your help to make any more progress on this (or maybe, @janvorli, I can hand this off to you?)
According to the CI, these are the changes where the issue was introduced: https://dev.azure.com/dnceng/public/_traceability/runview/changes?currentRunId=1670086 (assuming it's consistently failing, which I believe it is). There are 189 changes. Not sure if it makes sense trying to identify a culprit. Some notable changes:
Ok, I've figured this out. The problem was introduced by the enabling of fast tail call optimization for arm32 (#66282). The code that fails is:
which does a fast tail call. When it is setting up the tail call arguments, it turns off GC reporting, since the incoming argument space GC info might be incompatible with the GC info needed by the tail call arguments put in the same location. However, one of the IR nodes in the argument setup is:
When we generate this, we disable GC tracking (again) and then re-enable GC tracking (which we shouldn't do). Thus, we end up reporting a range of the tail call argument setup as interruptible. Most of the slots end up being marked untracked, and GC pointers match up with GC pointers. However, one slot is a gcref incoming and a byref outgoing, but the untracked slot is marked gcref, not byref. So the byref value is not a legal object pointer, leading to a GC assert. I think the solution will be to have a "ref count" of "disable GC reporting" requests, not just a true/false. cc @jakobbotsch
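To make the failure mode concrete, here is a minimal simulation of a true/false "GC disabled" flag being toggled by two nested regions. It is only a sketch: `BoolGcTracker`, `disableGC`, `enableGC`, `emitInstruction`, and `simulateBoolTracker` are hypothetical names made up for illustration, not the JIT's actual emitter API.

```cpp
#include <string>

// Hypothetical stand-in for the emitter state; NOT the real JIT type.
struct BoolGcTracker {
    bool gcDisabled = false;
    // One char per emitted "instruction":
    // 'N' = non-interruptible (GC reporting off), 'I' = interruptible.
    std::string trace;

    void disableGC() { gcDisabled = true; }
    void enableGC() { gcDisabled = false; }
    void emitInstruction() { trace += gcDisabled ? 'N' : 'I'; }
};

// Simulates a fast tail call argument setup region that contains a nested
// region (standing in for an unrolled STORE_BLK) that also toggles the flag.
inline std::string simulateBoolTracker() {
    BoolGcTracker t;
    t.disableGC();        // outer region begins: tail call argument setup
    t.emitInstruction();  // argument move
    t.disableGC();        // inner region begins: flag is already true
    t.emitInstruction();  // block copy
    t.enableGC();         // inner region ends: flag is cleared too early
    t.emitInstruction();  // still argument setup, but recorded as interruptible
    t.enableGC();         // outer region ends
    return t.trace;
}
```

In this sketch the third instruction is still part of the argument setup, yet it is recorded as interruptible, which mirrors the range described above.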
When doing a fast tail call on arm32, we disable GC reporting. But some IR nodes, like unrolled STORE_BLK, also disable GC reporting. If one of those is within a fast tail call argument setup region, we'll attempt to disable GC reporting twice. Since we didn't keep track of nesting, we end up marking some of the tail call region after the STORE_BLK as interruptible, leading to bad GC info in the argument area. Change the enable/disable GC calls to keep a nesting count, and only re-enable GC reporting when the count reaches zero. Fixes dotnet#67046
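The nesting-count approach can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the actual change: `NestedGcTracker`, `disableCount`, `emitInstruction`, and `simulateNestedTracker` are hypothetical names, and the real emitter tracks far more state than a single counter.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the emitter state; NOT the real JIT type.
struct NestedGcTracker {
    unsigned disableCount = 0;  // number of outstanding "disable" requests
    // One char per emitted "instruction":
    // 'N' = non-interruptible (GC reporting off), 'I' = interruptible.
    std::string trace;

    void disableGC() { ++disableCount; }
    void enableGC() {
        assert(disableCount > 0);  // every enable must pair with a disable
        --disableCount;            // reporting resumes only when this hits zero
    }
    bool gcDisabled() const { return disableCount > 0; }
    void emitInstruction() { trace += gcDisabled() ? 'N' : 'I'; }
};

// Same nesting shape as the failing case: an outer argument setup region
// containing an inner region that also disables GC reporting.
inline std::string simulateNestedTracker() {
    NestedGcTracker t;
    t.disableGC();        // outer region begins (count = 1)
    t.emitInstruction();  // argument move
    t.disableGC();        // inner region begins (count = 2)
    t.emitInstruction();  // block copy
    t.enableGC();         // inner region ends (count = 1, still disabled)
    t.emitInstruction();  // remaining argument setup stays non-interruptible
    t.enableGC();         // outer region ends (count = 0)
    t.emitInstruction();  // code after the region is interruptible again
    return t.trace;
}
```

With a counter instead of a flag, the inner region's enable call no longer ends the outer region's protection; only the final, matching enable does.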
https://dev.azure.com/dnceng/public/_build/results?buildId=1675437&view=ms.vss-test-web.build-test-results-tab&runId=45952896&resultId=184681&paneView=dotnet-dnceng.dnceng-build-release-tasks.helix-test-information-tab