-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Error in Int16 variables when running in M1 macs #66720
Comments
This is unlikely to be a compiler issue, since the compiler emits IL (which is independent from the specific platform). I will move this to the runtime repo. If it turns out the IL emitted is wrong, then it would be a compiler issue. |
Thanks for letting me know, and sorry if it is posted in the wrong place. I guess it is likely indeed from the ARM Jitter, but I thought (incorrectly, as it seems) this group covered IL generation and conversion from IL to actual code. If that is not the case, can you tell me where I should report it? |
@agallero no problem at all, we have a lot of repos and it's easy to transfer. this is now in the right place and labeled for that team to find it. ps welcome to our repo. |
Tagging subscribers to this area: @JulieLeeMSFT Issue DetailsVersion Used: Steps to Reproduce:
Expected Behavior: Actual Behavior: Description It looks like the compiler passes the full int32 value of an int16 variable (filling the upper bits with garbage, which causes the overflow). If you change the "checked" line to "unchecked, then you get lines like this:0x2672 = 0x80F62672 0x37 = 0x80F60037 Which seem to indicate the int16 number is just the original number with garbage in the upper bits) When you debug the code, you see: So the value it is trying to set is 0x81080EEB But in the class constructor, aVar4 changed from 0xEEB to 0x8180EEB: I can reproduce it 100% in this machine, but I don't know if it will happen in others, even other arm macs. I hope this information is sufficient to figure out what is going on, but any extra information please ask.
|
Reproducible with .NET 7 as well (both, Debug and Release configurations). Reduced reprousing System;
using System.Threading.Tasks;
internal class BaseClass
{
internal Int16 X;
internal BaseClass(int x)
{
checked
{
X = (Int16) x;
}
}
}
internal class ChildClass : BaseClass
{
internal ChildClass(int a, int b, int c, int d, bool e, bool f, int i, int last)
: base(last) {}
internal ChildClass Clone() => new ChildClass(0, 0, 0, 0, false, false, 0, X);
}
internal static class Program
{
static void Main()
{
Parallel.For(0, 100, _ =>
{
var tk = new ChildClass(0, 0, 0, 0, false, false, 0, 0);
tk = tk.Clone();
});
Console.WriteLine("finished");
}
}
|
Thanks for looking into it. Just in case it can help, the original bug in my codebase happens without Parallel.ForEach, and it is indeed happening in a normal loop. But the code is way more complex than this test case, and when I start removing code to make a minimal reproducible example, then the bug disappears. So that's why I thought to put it into a parallel.foreach, since the problem seems to be a register or memory that isn't cleared, and high concurrency can help making sure the memory and registers are not zero. But with more complex code, this is reproducible in a single thread. |
Thanks for the additional input! It is indeed an interesting edge case where (apparently) this particular set/order of arguments seem to have side-effects (while narrowing or widening the number). I was trying to pinpoint whether it is related to threading, runtime or case of bad codegen. Note that on linux-arm64, this issue does not reproduce. |
This also repros with daily build on osx-arm64 / M1 (tested with 7.0.100-preview.4.22174.1; runtime commit: c5d40c9). |
Repro without Prallel.ForEach and Action (predicate): // osx-arm64 | .NET 7 & 6 | Debug & Release
using System;
using System.Runtime.CompilerServices;
using System.Threading;
internal sealed class C
{
private readonly short X;
internal C(int a, int b, int c, int d, byte e, byte f, int g, int last)
{
if (last > 0)
throw new ApplicationException($"Unexpected value: {last}");
X = (short) last;
}
[MethodImpl(MethodImplOptions.NoInlining)] // comment this line and ApplicationException will
// disappear from release (but not debug).
internal void M() => new C(0, 0, 0, 0, 0, 0, 0, X);
}
internal static class Program
{
static void Main()
{
var tk = new C(0, 0, 0, 0, 0, 0, 0, 0);
Thread.Sleep(1); // comment this line and ApplicationException will not be thrown
tk.M();
Console.WriteLine("finished");
}
} |
Think this is a different issue than what we see in #67188 / #67152-- here we do a narrow store and a wide load; there we do wide stores. In this case it looks like it's the caller's fault, we are passing an ;; caller M(.... X)
79C01000 ldrsh w0, [x0,#8]
790003E0 strh w0, [sp] // [V01 OutArgs]
;; callee C..ctor(....)
B94083A0 ldr w0, [fp,#128]
7100001F cmp w0, #0
|
|
|
FWIW, we have code that has some (non-macOS) handling for this during import, in the same place we insert runtime/src/coreclr/jit/importer.cpp Lines 912 to 923 in 5453951
|
I think this is "too early" as we only need this for memory args, and don't know this yet? @sandreenko any suggestions on where to try and fix this? My current thinking is to add a cast in morph if we have a stack |
I'm not sure that's the right fix, a small I think the problem is here: runtime/src/coreclr/jit/codegenarmarch.cpp Lines 826 to 840 in 2d4f2d0
This needs to handle the 4 case too. Alternatively we can fix it here: runtime/src/coreclr/jit/codegenarmarch.cpp Lines 744 to 751 in 2d4f2d0
For small types it is not right to use the source type, we need the actual type of the argument. In fact, maybe we should just stop special casing macOS arm64 here and the above switch will take care of the small types for us without any change. |
Version Used:
.NET 6.0.1 - Apple M1 Pro
.NET SDK 6.0.201
Steps to Reproduce:
Expected Behavior:
It should not have any errors. We are just assigning an Int16 to an Int16, so there is no possibility to overflow.
Actual Behavior:
Unhandled exception. System.AggregateException: One or more errors occurred. (Arithmetic operation resulted in an overflow.) (Arithmetic operation resulted in an overflow.) (Arithmetic operation resulted in an overflow.) (Arithmetic operation resulted in an overflow.) (Arithmetic operation resulted in an overflow.) (Arithmetic operation resulted in an overflow.) (Arithmetic operation resulted in an overflow.) (Arithmetic operation resulted in an overflow.) (Arithmetic operation resulted in an overflow.) (Arithmetic operation resulted in an overflow.) (Arithmetic operation resulted in an overflow.)....
Description
This bug happens in Apple M1 chips, not in intel machines. I am not sure about iphone chips or other arm targets.
It is happening in a big application without threads, but it was difficult to reduce to a simple program that could be analyzed. To be able to reproduce it always, I had to add a parallel loop to make the effect evident. But it also happens in the big app wihtout threads at all. Also in this example, if you disable the parallel loop and change it to a single loop, I can still reproduce while debugging but not from the command line.
It looks like the compiler passes the full int32 value of an int16 variable (filling the upper bits with garbage, which causes the overflow).
If you change the "checked" line to "unchecked, then you get lines like this:0x2672 = 0x80F62672
0x37 = 0x80F60037
0x396 = 0x80F60396
Which seem to indicate the int16 number is just the original number with garbage in the upper bits)
When you debug the code, you see:
So the value it is trying to set is 0x81080EEB
If you move up in the stack trace, in the clone method we pass 0xEEB:
But in the class constructor, aVar4 changed from 0xEEB to 0x8180EEB:
I can reproduce it 100% in this machine, but I don't know if it will happen in others, even other arm macs. I hope this information is sufficient to figure out what is going on, but any extra information please ask.
uint-bug.zip
The text was updated successfully, but these errors were encountered: