Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Removed JIT (uint) cast workaround to elide bound-checks in CoreLib #67448

Merged
merged 13 commits into from
Jul 12, 2022

Conversation

gfoidl
Copy link
Member

@gfoidl gfoidl commented Apr 1, 2022

Contributes to #67044 -- CoreLib only


Edit: see #67448 (comment)
So when a uint-cast is needed (for the index), then for clarity it should also be on the length-side (i.e. on both sides).
Hence the mentioned pattern for span is outdated now.

The pattern (uint)index < span.Length isn't recognized by the JIT, so the bound-checks aren't removed -- cf. #67044 (comment) -- so at the current state of the PR here the uint-casts are still there, but these places are annotated with a // TODO comment, so I (or anyone) can find them easier for an update once span.Length is handled as expected. That's the reason why the PR is marked as draft.

That issue should be fixed, then the removal of the workaround can be continued without introducing (temporary regressions).


In Visual Studio I see the project System.Private.CoreLib.Shared.
JIT's optimization for the bound-checks is for .NET 7 onwards, so my question is to which other targets is that project "shared"?
Not that this PR introduces some regression elsewhere.


The change was done rather mechanical by an Regex.Replace, then verifyed (in iterations / batches) through jit-diffs and manual inspection of the git-diffs.
So a few additional bound-checks got elided by changing the code a bit.

jit-diff
Found 2 files with textual diffs.

Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 3188507
Total bytes of diff: 3185674
Total bytes of delta: -2833 (-0.09 % of base)
Total relative delta: -3.50
    diff is an improvement.
    relative diff is an improvement.


Top file improvements (bytes):
       -2833 : System.Private.CoreLib.dasm (-0,09 % of base)

1 total files with Code Size differences (1 improved, 0 regressed), 0 unchanged.

Top method regressions (bytes):
         498 (7,61 % of base) : System.Private.CoreLib.dasm - System.Number:NumberToStringFormat(byref,byref,System.ReadOnlySpan`1[System.Char],System.Globalization.NumberFormatInfo)

Top method improvements (bytes):
        -538 (-6,38 % of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector256`1:ToString(System.String,System.IFormatProvider):System.String:this (12 methods)
        -538 (-6,61 % of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector64`1:ToString(System.String,System.IFormatProvider):System.String:this (12 methods)
        -534 (-7,16 % of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector128`1:ToString(System.String,System.IFormatProvider):System.String:this (12 methods)
        -115 (-7,65 % of base) : System.Private.CoreLib.dasm - StringSerializer:GetSerializedString(System.TimeZoneInfo):System.String
         -84 (-8,85 % of base) : System.Private.CoreLib.dasm - System.Number:NumberToString(byref,byref,ushort,int,System.Globalization.NumberFormatInfo)
         -77 (-7,11 % of base) : System.Private.CoreLib.dasm - System.Collections.Generic.HashSet`1:SymmetricExceptWithEnumerable(System.Collections.Generic.IEnumerable`1[System.__Canon]):this
         -75 (-11,06 % of base) : System.Private.CoreLib.dasm - StringSerializer:SerializeTransitionTime(System.TimeZoneInfo+TransitionTime,byref)
         -63 (-6,98 % of base) : System.Private.CoreLib.dasm - System.ApplicationId:ToString():System.String:this
         -60 (-3,72 % of base) : System.Private.CoreLib.dasm - System.Reflection.AssemblyNameFormatter:ComputeDisplayName(System.String,System.Version,System.String,System.Byte[],int,int):System.String
         -59 (-8,31 % of base) : System.Private.CoreLib.dasm - System.PasteArguments:AppendArgument(byref,System.String)
         -56 (-10,63 % of base) : System.Private.CoreLib.dasm - System.Number:FormatGeneral(byref,byref,int,System.Globalization.NumberFormatInfo,ushort,bool)
         -48 (-7,92 % of base) : System.Private.CoreLib.dasm - System.Reflection.RuntimeMethodInfo:ToString():System.String:this
         -45 (-8,46 % of base) : System.Private.CoreLib.dasm - System.Reflection.AssemblyNameFormatter:AppendQuoted(byref,System.String)
         -43 (-5,75 % of base) : System.Private.CoreLib.dasm - System.Reflection.CustomAttributeData:ToString():System.String:this
         -41 (-9,60 % of base) : System.Private.CoreLib.dasm - RTDynamicMethod:ToString():System.String:this
         -41 (-4,76 % of base) : System.Private.CoreLib.dasm - System.Collections.Generic.HashSet`1:CheckUniqueAndUnfoundElements(System.Collections.Generic.IEnumerable`1[System.__Canon],bool):System.ValueTuple`2[System.Int32, System.Int32]:this
         -40 (-6,20 % of base) : System.Private.CoreLib.dasm - System.String:Replace(System.String,System.String):System.String:this
         -36 (-5,00 % of base) : System.Private.CoreLib.dasm - System.Collections.Generic.HashSet`1:IntersectWithEnumerable(System.Collections.Generic.IEnumerable`1[System.__Canon]):this
         -34 (-1,72 % of base) : System.Private.CoreLib.dasm - System.Reflection.CustomAttributeTypedArgument:ToString(bool):System.String:this
         -33 (-7,60 % of base) : System.Private.CoreLib.dasm - System.Reflection.RuntimePropertyInfo:ToString():System.String:this
         -32 (-4,65 % of base) : System.Private.CoreLib.dasm - System.IO.Enumeration.FileSystemName:TranslateWin32Expression(System.String):System.String
         -28 (-2,64 % of base) : System.Private.CoreLib.dasm - System.Number:FormatFixed(byref,byref,int,System.Int32[],System.String,System.String)
         -27 (-2,85 % of base) : System.Private.CoreLib.dasm - System.Net.WebUtility:HtmlDecode(System.ReadOnlySpan`1[System.Char],byref)
         -27 (-3,13 % of base) : System.Private.CoreLib.dasm - System.String:Join(System.String,System.Collections.Generic.IEnumerable`1[System.String]):System.String
         -26 (-3,16 % of base) : System.Private.CoreLib.dasm - System.Globalization.CalendarData:NormalizeDatePattern(System.String):System.String
         -26 (-7,10 % of base) : System.Private.CoreLib.dasm - System.Reflection.RuntimeConstructorInfo:ToString():System.String:this
         -26 (-4,45 % of base) : System.Private.CoreLib.dasm - System.String:MakeSeparatorList(System.ReadOnlySpan`1[System.Char],byref):this
         -22 (-6,73 % of base) : System.Private.CoreLib.dasm - System.Number:FormatExponent(byref,System.Globalization.NumberFormatInfo,int,ushort,int,bool)
         -20 (-2,62 % of base) : System.Private.CoreLib.dasm - System.PasteArguments:Paste(System.Collections.Generic.IEnumerable`1[System.String],bool):System.String
         -19 (-2,94 % of base) : System.Private.CoreLib.dasm - System.IO.Path:Combine(System.String[]):System.String
         -19 (-4,51 % of base) : System.Private.CoreLib.dasm - System.Number:FormatPercent(byref,byref,int,System.Globalization.NumberFormatInfo)
         -19 (-0,22 % of base) : System.Private.CoreLib.dasm - System.Span`1:ToString():System.String:this (20 methods)
         -19 (-4,92 % of base) : System.Private.CoreLib.dasm - System.String:MakeSeparatorListVectorized(byref,ushort,ushort,ushort):this
         -18 (-3,75 % of base) : System.Private.CoreLib.dasm - System.Reflection.MethodBase:AppendParameters(byref,System.Type[],int)
         -18 (-2,69 % of base) : System.Private.CoreLib.dasm - System.String:ReplaceLineEndings(System.String):System.String:this
         -18 (-0,89 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:AppendFormatHelper(System.IFormatProvider,System.String,System.ParamsArray):this
         -17 (-9,24 % of base) : System.Private.CoreLib.dasm - StringSerializer:SerializeSubstitute(System.String,byref)
         -17 (-4,67 % of base) : System.Private.CoreLib.dasm - System.Number:FormatScientific(byref,byref,int,System.Globalization.NumberFormatInfo,ushort)
         -16 (-1,75 % of base) : System.Private.CoreLib.dasm - StringSerializer:GetNextStringValue():System.String:this
         -16 (-1,50 % of base) : System.Private.CoreLib.dasm - System.IO.Path:GetRelativePath(System.String,System.String,int):System.String
         -16 (-3,86 % of base) : System.Private.CoreLib.dasm - System.Number:FormatCurrency(byref,byref,int,System.Globalization.NumberFormatInfo)
         -16 (-1,81 % of base) : System.Private.CoreLib.dasm - System.String:JoinCore(System.ReadOnlySpan`1[System.Char],System.Collections.Generic.IEnumerable`1[System.__Canon]):System.String
         -16 (-2,42 % of base) : System.Private.CoreLib.dasm - TimeSpanTokenizer:GetNextToken():System.Globalization.TimeSpanParse+TimeSpanToken:this
         -15 (-3,92 % of base) : System.Private.CoreLib.dasm - System.Globalization.StringInfo:ParseCombiningCharacters(System.String):System.Int32[]
         -15 (-2,58 % of base) : System.Private.CoreLib.dasm - System.IO.Path:Join(System.String[]):System.String
         -15 (-0,22 % of base) : System.Private.CoreLib.dasm - System.ReadOnlySpan`1:ToString():System.String:this (16 methods)
         -15 (-3,08 % of base) : System.Private.CoreLib.dasm - System.String:JoinCore(System.ReadOnlySpan`1[System.Char],System.Object[]):System.String
         -15 (-3,80 % of base) : System.Private.CoreLib.dasm - System.TypeNameParser:EscapeTypeName(System.String):System.String
         -13 (-1,89 % of base) : System.Private.CoreLib.dasm - System.Net.WebUtility:HtmlEncode(System.ReadOnlySpan`1[System.Char],byref)
         -12 (-11,11 % of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ValueListBuilder`1:Append(System.__Canon):this
         -12 (-4,07 % of base) : System.Private.CoreLib.dasm - System.DateTimeParse:TryParseQuoteString(System.ReadOnlySpan`1[System.Char],int,byref,byref):bool
         -12 (-1,92 % of base) : System.Private.CoreLib.dasm - System.IO.PathInternal:RemoveRelativeSegments(System.ReadOnlySpan`1[System.Char],int,byref):bool
         -12 (-3,30 % of base) : System.Private.CoreLib.dasm - System.String:MakeSeparatorList(System.String[],byref,byref):this
         -11 (-2,26 % of base) : System.Private.CoreLib.dasm - System.IO.PathInternal:NormalizeDirectorySeparators(System.String):System.String
         -11 (-0,94 % of base) : System.Private.CoreLib.dasm - System.Reflection.AssemblyNameParser:GetNextToken(byref):int:this
         -11 (-12,50 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:Append(System.String):this
         -11 (-16,42 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:Append(ushort):this
         -10 (-11,76 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:GrowAndAppend(ushort):this
          -9 (-1,65 % of base) : System.Private.CoreLib.dasm - System.String:Concat(System.Collections.Generic.IEnumerable`1[System.__Canon]):System.String
          -9 (-1,87 % of base) : System.Private.CoreLib.dasm - System.String:Concat(System.Collections.Generic.IEnumerable`1[System.String]):System.String
          -9 (-2,56 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:AppendSpanFormattable(double,System.String,System.IFormatProvider):this
          -9 (-3,31 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:AppendSpanFormattable(int,System.String,System.IFormatProvider):this
          -9 (-3,33 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:AppendSpanFormattable(System.__Canon,System.String,System.IFormatProvider):this
          -9 (-3,19 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:AppendSpanFormattable(System.DateTime,System.String,System.IFormatProvider):this
          -7 (-11,48 % of base) : System.Private.CoreLib.dasm - System.Collections.Generic.BitHelper:MarkBit(int):this
          -7 (-2,76 % of base) : System.Private.CoreLib.dasm - System.String:MakeSeparatorList(System.String,byref):this
          -6 (-10,34 % of base) : System.Private.CoreLib.dasm - System.Collections.Generic.BitHelper:IsMarked(int):bool:this
          -6 (-2,00 % of base) : System.Private.CoreLib.dasm - System.Number:FormatNumber(byref,byref,int,System.Globalization.NumberFormatInfo)
          -5 (-6,02 % of base) : System.Private.CoreLib.dasm - System.Text.StringBuilder:Append(ushort):System.Text.StringBuilder:this (2 methods)
          -2 (-0,57 % of base) : System.Private.CoreLib.dasm - System.Collections.Generic.KeyValuePair:PairToString(System.Object,System.Object):System.String
          -2 (-0,37 % of base) : System.Private.CoreLib.dasm - System.Globalization.SortKey:ToString():System.String:this
          -2 (-1,72 % of base) : System.Private.CoreLib.dasm - System.Globalization.TimeSpanParse:ParseExactDigits(byref,int,int,byref,byref):bool
          -2 (-0,18 % of base) : System.Private.CoreLib.dasm - System.Memory`1:ToString():System.String:this (3 methods)
          -2 (-0,18 % of base) : System.Private.CoreLib.dasm - System.ReadOnlyMemory`1:ToString():System.String:this (3 methods)
          -2 (-0,35 % of base) : System.Private.CoreLib.dasm - System.Reflection.LocalVariableInfo:ToString():System.String:this
          -2 (-0,46 % of base) : System.Private.CoreLib.dasm - System.Runtime.Loader.AssemblyLoadContext:ToString():System.String:this
          -2 (-0,52 % of base) : System.Private.CoreLib.dasm - System.Text.Rune:get_DebuggerDisplay():System.String:this
          -1 (-0,11 % of base) : System.Private.CoreLib.dasm - System.ComponentModel.Win32Exception:ToString():System.String:this
          -1 (-0,74 % of base) : System.Private.CoreLib.dasm - System.Runtime.CompilerServices.DefaultInterpolatedStringHandler:AppendLiteral(System.String):this

Top method regressions (percentages):
         498 (7,61 % of base) : System.Private.CoreLib.dasm - System.Number:NumberToStringFormat(byref,byref,System.ReadOnlySpan`1[System.Char],System.Globalization.NumberFormatInfo)

Top method improvements (percentages):
         -11 (-16,42 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:Append(ushort):this
         -11 (-12,50 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:Append(System.String):this
         -10 (-11,76 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:GrowAndAppend(ushort):this
          -7 (-11,48 % of base) : System.Private.CoreLib.dasm - System.Collections.Generic.BitHelper:MarkBit(int):this
         -12 (-11,11 % of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ValueListBuilder`1:Append(System.__Canon):this
         -75 (-11,06 % of base) : System.Private.CoreLib.dasm - StringSerializer:SerializeTransitionTime(System.TimeZoneInfo+TransitionTime,byref)
         -56 (-10,63 % of base) : System.Private.CoreLib.dasm - System.Number:FormatGeneral(byref,byref,int,System.Globalization.NumberFormatInfo,ushort,bool)
          -6 (-10,34 % of base) : System.Private.CoreLib.dasm - System.Collections.Generic.BitHelper:IsMarked(int):bool:this
         -41 (-9,60 % of base) : System.Private.CoreLib.dasm - RTDynamicMethod:ToString():System.String:this
         -17 (-9,24 % of base) : System.Private.CoreLib.dasm - StringSerializer:SerializeSubstitute(System.String,byref)
         -84 (-8,85 % of base) : System.Private.CoreLib.dasm - System.Number:NumberToString(byref,byref,ushort,int,System.Globalization.NumberFormatInfo)
         -45 (-8,46 % of base) : System.Private.CoreLib.dasm - System.Reflection.AssemblyNameFormatter:AppendQuoted(byref,System.String)
         -59 (-8,31 % of base) : System.Private.CoreLib.dasm - System.PasteArguments:AppendArgument(byref,System.String)
         -48 (-7,92 % of base) : System.Private.CoreLib.dasm - System.Reflection.RuntimeMethodInfo:ToString():System.String:this
        -115 (-7,65 % of base) : System.Private.CoreLib.dasm - StringSerializer:GetSerializedString(System.TimeZoneInfo):System.String
         -33 (-7,60 % of base) : System.Private.CoreLib.dasm - System.Reflection.RuntimePropertyInfo:ToString():System.String:this
        -534 (-7,16 % of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector128`1:ToString(System.String,System.IFormatProvider):System.String:this (12 methods)
         -77 (-7,11 % of base) : System.Private.CoreLib.dasm - System.Collections.Generic.HashSet`1:SymmetricExceptWithEnumerable(System.Collections.Generic.IEnumerable`1[System.__Canon]):this
         -26 (-7,10 % of base) : System.Private.CoreLib.dasm - System.Reflection.RuntimeConstructorInfo:ToString():System.String:this
         -63 (-6,98 % of base) : System.Private.CoreLib.dasm - System.ApplicationId:ToString():System.String:this
         -22 (-6,73 % of base) : System.Private.CoreLib.dasm - System.Number:FormatExponent(byref,System.Globalization.NumberFormatInfo,int,ushort,int,bool)
        -538 (-6,61 % of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector64`1:ToString(System.String,System.IFormatProvider):System.String:this (12 methods)
        -538 (-6,38 % of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector256`1:ToString(System.String,System.IFormatProvider):System.String:this (12 methods)
         -40 (-6,20 % of base) : System.Private.CoreLib.dasm - System.String:Replace(System.String,System.String):System.String:this
          -5 (-6,02 % of base) : System.Private.CoreLib.dasm - System.Text.StringBuilder:Append(ushort):System.Text.StringBuilder:this (2 methods)
         -43 (-5,75 % of base) : System.Private.CoreLib.dasm - System.Reflection.CustomAttributeData:ToString():System.String:this
         -36 (-5,00 % of base) : System.Private.CoreLib.dasm - System.Collections.Generic.HashSet`1:IntersectWithEnumerable(System.Collections.Generic.IEnumerable`1[System.__Canon]):this
         -19 (-4,92 % of base) : System.Private.CoreLib.dasm - System.String:MakeSeparatorListVectorized(byref,ushort,ushort,ushort):this
         -41 (-4,76 % of base) : System.Private.CoreLib.dasm - System.Collections.Generic.HashSet`1:CheckUniqueAndUnfoundElements(System.Collections.Generic.IEnumerable`1[System.__Canon],bool):System.ValueTuple`2[System.Int32, System.Int32]:this
         -17 (-4,67 % of base) : System.Private.CoreLib.dasm - System.Number:FormatScientific(byref,byref,int,System.Globalization.NumberFormatInfo,ushort)
         -32 (-4,65 % of base) : System.Private.CoreLib.dasm - System.IO.Enumeration.FileSystemName:TranslateWin32Expression(System.String):System.String
         -19 (-4,51 % of base) : System.Private.CoreLib.dasm - System.Number:FormatPercent(byref,byref,int,System.Globalization.NumberFormatInfo)
         -26 (-4,45 % of base) : System.Private.CoreLib.dasm - System.String:MakeSeparatorList(System.ReadOnlySpan`1[System.Char],byref):this
         -12 (-4,07 % of base) : System.Private.CoreLib.dasm - System.DateTimeParse:TryParseQuoteString(System.ReadOnlySpan`1[System.Char],int,byref,byref):bool
         -15 (-3,92 % of base) : System.Private.CoreLib.dasm - System.Globalization.StringInfo:ParseCombiningCharacters(System.String):System.Int32[]
         -16 (-3,86 % of base) : System.Private.CoreLib.dasm - System.Number:FormatCurrency(byref,byref,int,System.Globalization.NumberFormatInfo)
         -15 (-3,80 % of base) : System.Private.CoreLib.dasm - System.TypeNameParser:EscapeTypeName(System.String):System.String
         -18 (-3,75 % of base) : System.Private.CoreLib.dasm - System.Reflection.MethodBase:AppendParameters(byref,System.Type[],int)
         -60 (-3,72 % of base) : System.Private.CoreLib.dasm - System.Reflection.AssemblyNameFormatter:ComputeDisplayName(System.String,System.Version,System.String,System.Byte[],int,int):System.String
          -9 (-3,33 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:AppendSpanFormattable(System.__Canon,System.String,System.IFormatProvider):this
          -9 (-3,31 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:AppendSpanFormattable(int,System.String,System.IFormatProvider):this
         -12 (-3,30 % of base) : System.Private.CoreLib.dasm - System.String:MakeSeparatorList(System.String[],byref,byref):this
          -9 (-3,19 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:AppendSpanFormattable(System.DateTime,System.String,System.IFormatProvider):this
         -26 (-3,16 % of base) : System.Private.CoreLib.dasm - System.Globalization.CalendarData:NormalizeDatePattern(System.String):System.String
         -27 (-3,13 % of base) : System.Private.CoreLib.dasm - System.String:Join(System.String,System.Collections.Generic.IEnumerable`1[System.String]):System.String
         -15 (-3,08 % of base) : System.Private.CoreLib.dasm - System.String:JoinCore(System.ReadOnlySpan`1[System.Char],System.Object[]):System.String
         -19 (-2,94 % of base) : System.Private.CoreLib.dasm - System.IO.Path:Combine(System.String[]):System.String
         -27 (-2,85 % of base) : System.Private.CoreLib.dasm - System.Net.WebUtility:HtmlDecode(System.ReadOnlySpan`1[System.Char],byref)
          -7 (-2,76 % of base) : System.Private.CoreLib.dasm - System.String:MakeSeparatorList(System.String,byref):this
         -18 (-2,69 % of base) : System.Private.CoreLib.dasm - System.String:ReplaceLineEndings(System.String):System.String:this
         -28 (-2,64 % of base) : System.Private.CoreLib.dasm - System.Number:FormatFixed(byref,byref,int,System.Int32[],System.String,System.String)
         -20 (-2,62 % of base) : System.Private.CoreLib.dasm - System.PasteArguments:Paste(System.Collections.Generic.IEnumerable`1[System.String],bool):System.String
         -15 (-2,58 % of base) : System.Private.CoreLib.dasm - System.IO.Path:Join(System.String[]):System.String
          -9 (-2,56 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:AppendSpanFormattable(double,System.String,System.IFormatProvider):this
         -16 (-2,42 % of base) : System.Private.CoreLib.dasm - TimeSpanTokenizer:GetNextToken():System.Globalization.TimeSpanParse+TimeSpanToken:this
         -11 (-2,26 % of base) : System.Private.CoreLib.dasm - System.IO.PathInternal:NormalizeDirectorySeparators(System.String):System.String
          -6 (-2,00 % of base) : System.Private.CoreLib.dasm - System.Number:FormatNumber(byref,byref,int,System.Globalization.NumberFormatInfo)
         -12 (-1,92 % of base) : System.Private.CoreLib.dasm - System.IO.PathInternal:RemoveRelativeSegments(System.ReadOnlySpan`1[System.Char],int,byref):bool
         -13 (-1,89 % of base) : System.Private.CoreLib.dasm - System.Net.WebUtility:HtmlEncode(System.ReadOnlySpan`1[System.Char],byref)
          -9 (-1,87 % of base) : System.Private.CoreLib.dasm - System.String:Concat(System.Collections.Generic.IEnumerable`1[System.String]):System.String
         -16 (-1,81 % of base) : System.Private.CoreLib.dasm - System.String:JoinCore(System.ReadOnlySpan`1[System.Char],System.Collections.Generic.IEnumerable`1[System.__Canon]):System.String
         -16 (-1,75 % of base) : System.Private.CoreLib.dasm - StringSerializer:GetNextStringValue():System.String:this
          -2 (-1,72 % of base) : System.Private.CoreLib.dasm - System.Globalization.TimeSpanParse:ParseExactDigits(byref,int,int,byref,byref):bool
         -34 (-1,72 % of base) : System.Private.CoreLib.dasm - System.Reflection.CustomAttributeTypedArgument:ToString(bool):System.String:this
          -9 (-1,65 % of base) : System.Private.CoreLib.dasm - System.String:Concat(System.Collections.Generic.IEnumerable`1[System.__Canon]):System.String
         -16 (-1,50 % of base) : System.Private.CoreLib.dasm - System.IO.Path:GetRelativePath(System.String,System.String,int):System.String
         -11 (-0,94 % of base) : System.Private.CoreLib.dasm - System.Reflection.AssemblyNameParser:GetNextToken(byref):int:this
         -18 (-0,89 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:AppendFormatHelper(System.IFormatProvider,System.String,System.ParamsArray):this
          -1 (-0,74 % of base) : System.Private.CoreLib.dasm - System.Runtime.CompilerServices.DefaultInterpolatedStringHandler:AppendLiteral(System.String):this
          -2 (-0,57 % of base) : System.Private.CoreLib.dasm - System.Collections.Generic.KeyValuePair:PairToString(System.Object,System.Object):System.String
          -2 (-0,52 % of base) : System.Private.CoreLib.dasm - System.Text.Rune:get_DebuggerDisplay():System.String:this
          -2 (-0,46 % of base) : System.Private.CoreLib.dasm - System.Runtime.Loader.AssemblyLoadContext:ToString():System.String:this
          -2 (-0,37 % of base) : System.Private.CoreLib.dasm - System.Globalization.SortKey:ToString():System.String:this
          -2 (-0,35 % of base) : System.Private.CoreLib.dasm - System.Reflection.LocalVariableInfo:ToString():System.String:this
         -19 (-0,22 % of base) : System.Private.CoreLib.dasm - System.Span`1:ToString():System.String:this (20 methods)
         -15 (-0,22 % of base) : System.Private.CoreLib.dasm - System.ReadOnlySpan`1:ToString():System.String:this (16 methods)
          -2 (-0,18 % of base) : System.Private.CoreLib.dasm - System.Memory`1:ToString():System.String:this (3 methods)
          -2 (-0,18 % of base) : System.Private.CoreLib.dasm - System.ReadOnlyMemory`1:ToString():System.String:this (3 methods)
          -1 (-0,11 % of base) : System.Private.CoreLib.dasm - System.ComponentModel.Win32Exception:ToString():System.String:this

80 total methods with Code Size differences (79 improved, 1 regressed), 25045 unchanged.

--------------------------------------------------------------------------------

System.Number:NumberToStringFormat wasn't touched, so maybe is the regression due to different inlinees?


Other issues related to this PR:

@ghost ghost added the community-contribution Indicates that the PR has been added by a community member label Apr 1, 2022
@dotnet-issue-labeler
Copy link

I couldn't figure out the best area label to add to this PR. If you have write-permissions please help me learn by adding exactly one area label.

@@ -392,7 +392,7 @@ internal T get_Item<T>(int index)
// ! Warning: "this" is an array, not an SZArrayHelper. See comments above
// ! or you may introduce a security hole!
T[] _this = Unsafe.As<T[]>(this);
if ((uint)index >= (uint)_this.Length)
if ((uint)index >= _this.Length)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this going to end up using 64-bit comparison? That may be slow on 32-bit platforms.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be bad 😢

If we need to change back to the uint-cast for the length in order to have uint-comparison on both 64-bit and 32-bit, then the "goal" to remove the workarounds for the bound-checks is somewhat missed -- even if the bound-checks get elided.

Here in CoreLib we can do it, as we know about the tricks, and workarounds the JIT needs. But the standard-developer who wants fast .NET code may know to cast the index to uint, to avoid the extra >= 0 check, but then they produce bad code for 32-bit.

So ideally and hopefully the JIT can treat the length here (and on {ReadOnly}Span) as unsigned, to produce proper code on every platform and avoid the extra bound-check.

@EgorBo can you please comment on this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I've fixed it for x86 unintentionally in https://github.com/dotnet/runtime/pull/62864/files#diff-5b83397bbbdd17bb9457998b520fdaaa474d165390985b66f32371561b6d0bacR13996-R14001

image

while it is indeed an issue in .NET 6.0 as you can see from this screenshot:
image

it's not an issue in .NET 7.0. I think I recall my PR had a huge negative diff on x86 but I didn't pay attention to it as I was focused on x64 codegen

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the standard-developer who wants fast .NET code may know to cast the index to uint, to avoid the extra >= 0 check

If one is using the cast to uint trick, I think it is better for clarity to cast both sides to uint to make it clear what you wanted to happen.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree.
Although it's possible to use (uint)index < array.Length now it reads a bit strange. I'd prefer to have (uint) on both sides (at least now, having seen the new pattern).

So for this PR:

  • revert to both sides having the uint-cast
  • keep the {array | span}.Length > constant work
  • keep the code work that removes bound-checks that were previously there

?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me.

Copy link
Member

@stephentoub stephentoub Apr 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer to have (uint) on both sides (at least now, having seen the new pattern).

FWIW, I've also taken to just always using it on both sides as I can't keep track of exactly in what situations the JIT comprehends, and it's simpler to just always be consistent and do the thing that has the best chance of being recognized.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, so that when uint-cast is done for the index, then the cast is for clarity on both sides.

@@ -19,19 +21,22 @@ internal BitHelper(Span<int> span, bool clear)

internal void MarkBit(int bitPosition)
{
int bitArrayIndex = bitPosition / IntSize;
if ((uint)bitArrayIndex < (uint)_span.Length)
(int bitArrayIndex, int position) = Math.DivRem(bitPosition, IntSize);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a lot of cruft that the JIT has to optimize out now. Is this code with DivRem as good as it was before this change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The division is done only once, thus less machine code is created.
But for "Final local variable assignments" I see that JIT has more work.
Diffs show(ed) an improvement by looking at that change in isolation.

I assume that

if ((uint)bitArrayIndex < (uint)span.Length)
will always be true, so saving the extra division should be beneficial.

asm (x64) current main
; Assembly listing for method System.Collections.Generic.BitHelper:IsMarked(int):bool:this
; Emitting BLENDED_CODE for X64 CPU with SSE2 - Windows
; ReadyToRun compilation
; optimized code
; rsp based frame
; partially interruptible
; No PGO data
; 0 inlinees with PGO data; 1 single block inlinees; 0 inlinees without PGO data
; Final local variable assignments
;
;  V00 this         [V00,T01] (  4,  3.50)   byref  ->  rcx         this single-def
;  V01 arg1         [V01,T00] (  7,  5.50)     int  ->  rdx         single-def
;  V02 loc0         [V02,T02] (  3,  2.50)     int  ->  rax        
;# V03 OutArgs      [V03    ] (  1,  1   )  lclBlk ( 0) [rsp+00H]   "OutgoingArgSpace"
;* V04 tmp1         [V04    ] (  0,  0   )   byref  ->  zero-ref    single-def "Span.get_Item ptrToSpan"
;* V05 tmp2         [V05    ] (  0,  0   )   byref  ->  zero-ref    "Inlining Arg"
;  V06 cse0         [V06,T03] (  2,  2   )     int  ->   r8         "CSE - aggressive"
;
; Lcl frame size = 0

G_M54322_IG01:
						;; size=0 bbWeight=0.50 PerfScore 0.00
G_M54322_IG02:
       mov      eax, edx
       sar      eax, 31
       and      eax, 31
       add      eax, edx
       sar      eax, 5
       mov      r8d, dword ptr [rcx+8]
       cmp      eax, r8d
       jae      SHORT G_M54322_IG05
						;; size=22 bbWeight=1    PerfScore 5.00
G_M54322_IG03:
       mov      rcx, bword ptr [rcx]
       mov      eax, eax
       mov      eax, dword ptr [rcx+4*rax]
       mov      ecx, edx
       sar      ecx, 31
       and      ecx, 31
       add      ecx, edx
       and      ecx, -32
       sub      edx, ecx
       bt       eax, edx
       setb     al
       movzx    rax, al
						;; size=32 bbWeight=0.50 PerfScore 3.88
G_M54322_IG04:
       ret      
						;; size=1 bbWeight=0.50 PerfScore 0.50
G_M54322_IG05:
       xor      eax, eax
						;; size=2 bbWeight=0.50 PerfScore 0.12
G_M54322_IG06:
       ret      
						;; size=1 bbWeight=0.50 PerfScore 0.50

; Total bytes of code 58, prolog size 0, PerfScore 15.80, instruction count 23, allocated bytes for code 58 (MethodHash=300a2bcd) for method System.Collections.Generic.BitHelper:IsMarked(int):bool:this
; ============================================================

; Assembly listing for method System.Collections.Generic.BitHelper:MarkBit(int):this
; Emitting BLENDED_CODE for X64 CPU with SSE2 - Windows
; ReadyToRun compilation
; optimized code
; rsp based frame
; partially interruptible
; No PGO data
; 0 inlinees with PGO data; 1 single block inlinees; 0 inlinees without PGO data
; Final local variable assignments
;
;  V00 this         [V00,T01] (  4,  3.50)   byref  ->  rcx         this single-def
;  V01 arg1         [V01,T00] (  7,  5.50)     int  ->  rdx         single-def
;  V02 loc0         [V02,T03] (  3,  2.50)     int  ->  rax        
;# V03 OutArgs      [V03    ] (  1,  1   )  lclBlk ( 0) [rsp+00H]   "OutgoingArgSpace"
;* V04 tmp1         [V04    ] (  0,  0   )   byref  ->  zero-ref    single-def "Span.get_Item ptrToSpan"
;  V05 tmp2         [V05,T02] (  3,  3   )   byref  ->  rax         single-def "dup spill"
;* V06 tmp3         [V06    ] (  0,  0   )   byref  ->  zero-ref    "Inlining Arg"
;  V07 cse0         [V07,T04] (  2,  2   )     int  ->   r8         "CSE - aggressive"
;
; Lcl frame size = 0

G_M49794_IG01:
						;; size=0 bbWeight=1    PerfScore 0.00
G_M49794_IG02:
       mov      eax, edx
       sar      eax, 31
       and      eax, 31
       add      eax, edx
       sar      eax, 5
       mov      r8d, dword ptr [rcx+8]
       cmp      eax, r8d
       jae      SHORT G_M49794_IG04
						;; size=22 bbWeight=1    PerfScore 5.00
G_M49794_IG03:
       mov      rcx, bword ptr [rcx]
       mov      eax, eax
       lea      rax, bword ptr [rcx+4*rax]
       mov      ecx, edx
       sar      ecx, 31
       and      ecx, 31
       add      ecx, edx
       and      ecx, -32
       sub      edx, ecx
       mov      r8d, 1
       mov      ecx, edx
       shl      r8d, cl
       or       dword ptr [rax], r8d
						;; size=38 bbWeight=0.50 PerfScore 5.00
G_M49794_IG04:
       ret      
						;; size=1 bbWeight=1    PerfScore 1.00

; Total bytes of code 61, prolog size 0, PerfScore 17.10, instruction count 22, allocated bytes for code 61 (MethodHash=de493d7d) for method System.Collections.Generic.BitHelper:MarkBit(int):this
; ============================================================
asm (x64) PR
; Assembly listing for method System.Collections.Generic.BitHelper:IsMarked(int):bool:this
; Emitting BLENDED_CODE for X64 CPU with SSE2 - Windows
; ReadyToRun compilation
; optimized code
; rsp based frame
; partially interruptible
; No PGO data
; 0 inlinees with PGO data; 3 single block inlinees; 0 inlinees without PGO data
; Final local variable assignments
;
;  V00 this         [V00,T01] (  4,  4   )   byref  ->  rcx         this single-def
;  V01 arg1         [V01,T00] (  5,  5   )     int  ->  rdx         single-def
;  V02 loc0         [V02,T03] (  3,  2.50)     int  ->  rax        
;  V03 loc1         [V03,T07] (  2,  1.50)     int  ->  rdx        
;* V04 loc2         [V04    ] (  0,  0   )  struct (16) zero-ref    ld-addr-op
;# V05 OutArgs      [V05    ] (  1,  1   )  lclBlk ( 0) [rsp+00H]   "OutgoingArgSpace"
;* V06 tmp1         [V06    ] (  0,  0   )  struct ( 8) zero-ref    "dup spill"
;  V07 tmp2         [V07,T02] (  3,  3   )     int  ->  rax         "Inline stloc first use temp"
;* V08 tmp3         [V08    ] (  0,  0   )  struct ( 8) zero-ref    "NewObj constructor temp"
;* V09 tmp4         [V09    ] (  0,  0   )     int  ->  zero-ref    "Inlining Arg"
;  V10 tmp5         [V10,T06] (  2,  1.50)   byref  ->   r8         single-def V04._pointer(offs=0x00) P-INDEP "field V04._pointer (fldOffset=0x0)"
;  V11 tmp6         [V11,T04] (  2,  2   )     int  ->  rcx         V04._length(offs=0x08) P-INDEP "field V04._length (fldOffset=0x8)"
;* V12 tmp7         [V12    ] (  0,  0   )     int  ->  zero-ref    V06.Item1(offs=0x00) P-INDEP "field V06.Item1 (fldOffset=0x0)"
;* V13 tmp8         [V13    ] (  0,  0   )     int  ->  zero-ref    V06.Item2(offs=0x04) P-INDEP "field V06.Item2 (fldOffset=0x4)"
;* V14 tmp9         [V14    ] (  0,  0   )     int  ->  zero-ref    V08.Item1(offs=0x00) P-INDEP "field V08.Item1 (fldOffset=0x0)"
;  V15 tmp10        [V15,T05] (  2,  2   )     int  ->  rdx         V08.Item2(offs=0x04) P-INDEP "field V08.Item2 (fldOffset=0x4)"
;
; Lcl frame size = 0

G_M54322_IG01:
						;; size=0 bbWeight=0.50 PerfScore 0.00
G_M54322_IG02:
       mov      eax, edx
       sar      eax, 31
       and      eax, 31
       add      eax, edx
       sar      eax, 5
       mov      r8d, eax
       shl      r8d, 5
       sub      edx, r8d
       mov      r8, bword ptr [rcx]
       mov      ecx, dword ptr [rcx+8]
       cmp      eax, ecx
       jae      SHORT G_M54322_IG05
						;; size=33 bbWeight=1    PerfScore 8.00
G_M54322_IG03:
       mov      eax, eax
       mov      eax, dword ptr [r8+4*rax]
       bt       eax, edx
       setb     al
       movzx    rax, al
						;; size=15 bbWeight=0.50 PerfScore 2.00
G_M54322_IG04:
       ret      
						;; size=1 bbWeight=0.50 PerfScore 0.50
G_M54322_IG05:
       xor      eax, eax
						;; size=2 bbWeight=0.50 PerfScore 0.12
G_M54322_IG06:
       ret      
						;; size=1 bbWeight=0.50 PerfScore 0.50

; Total bytes of code 52, prolog size 0, PerfScore 16.33, instruction count 20, allocated bytes for code 52 (MethodHash=300a2bcd) for method System.Collections.Generic.BitHelper:IsMarked(int):bool:this
; ============================================================

; Assembly listing for method System.Collections.Generic.BitHelper:MarkBit(int):this
; Emitting BLENDED_CODE for X64 CPU with SSE2 - Windows
; ReadyToRun compilation
; optimized code
; rsp based frame
; partially interruptible
; No PGO data
; 0 inlinees with PGO data; 3 single block inlinees; 0 inlinees without PGO data
; Final local variable assignments
;
;  V00 this         [V00,T01] (  4,  4   )   byref  ->  rcx         this single-def
;  V01 arg1         [V01,T00] (  5,  5   )     int  ->  rdx         single-def
;  V02 loc0         [V02,T04] (  3,  2.50)     int  ->  rax        
;  V03 loc1         [V03,T08] (  2,  1.50)     int  ->  rdx        
;* V04 loc2         [V04    ] (  0,  0   )  struct (16) zero-ref    ld-addr-op
;# V05 OutArgs      [V05    ] (  1,  1   )  lclBlk ( 0) [rsp+00H]   "OutgoingArgSpace"
;* V06 tmp1         [V06    ] (  0,  0   )  struct ( 8) zero-ref    "dup spill"
;  V07 tmp2         [V07,T02] (  3,  3   )   byref  ->  rax         single-def "dup spill"
;  V08 tmp3         [V08,T03] (  3,  3   )     int  ->  rax         "Inline stloc first use temp"
;* V09 tmp4         [V09    ] (  0,  0   )  struct ( 8) zero-ref    "NewObj constructor temp"
;* V10 tmp5         [V10    ] (  0,  0   )     int  ->  zero-ref    "Inlining Arg"
;  V11 tmp6         [V11,T07] (  2,  1.50)   byref  ->   r8         single-def V04._pointer(offs=0x00) P-INDEP "field V04._pointer (fldOffset=0x0)"
;  V12 tmp7         [V12,T05] (  2,  2   )     int  ->  rcx         V04._length(offs=0x08) P-INDEP "field V04._length (fldOffset=0x8)"
;* V13 tmp8         [V13    ] (  0,  0   )     int  ->  zero-ref    V06.Item1(offs=0x00) P-INDEP "field V06.Item1 (fldOffset=0x0)"
;* V14 tmp9         [V14    ] (  0,  0   )     int  ->  zero-ref    V06.Item2(offs=0x04) P-INDEP "field V06.Item2 (fldOffset=0x4)"
;* V15 tmp10        [V15    ] (  0,  0   )     int  ->  zero-ref    V09.Item1(offs=0x00) P-INDEP "field V09.Item1 (fldOffset=0x0)"
;  V16 tmp11        [V16,T06] (  2,  2   )     int  ->  rdx         V09.Item2(offs=0x04) P-INDEP "field V09.Item2 (fldOffset=0x4)"
;
; Lcl frame size = 0

G_M49794_IG01:
						;; size=0 bbWeight=1    PerfScore 0.00
G_M49794_IG02:
       mov      eax, edx
       sar      eax, 31
       and      eax, 31
       add      eax, edx
       sar      eax, 5
       mov      r8d, eax
       shl      r8d, 5
       sub      edx, r8d
       mov      r8, bword ptr [rcx]
       mov      ecx, dword ptr [rcx+8]
       cmp      eax, ecx
       jae      SHORT G_M49794_IG04
						;; size=33 bbWeight=1    PerfScore 8.00
G_M49794_IG03:
       mov      ecx, eax
       lea      rax, bword ptr [r8+4*rcx]
       mov      r8d, 1
       mov      ecx, edx
       shl      r8d, cl
       or       dword ptr [rax], r8d
						;; size=20 bbWeight=0.50 PerfScore 3.12
G_M49794_IG04:
       ret      
						;; size=1 bbWeight=1    PerfScore 1.00

; Total bytes of code 54, prolog size 0, PerfScore 17.53, instruction count 19, allocated bytes for code 54 (MethodHash=de493d7d) for method System.Collections.Generic.BitHelper:MarkBit(int):this
; ============================================================

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This division does not need to be signed. The input is always non-negative. Would it be better to cast the input to uint before the division? Then it is going to be just a single shift and a single mask without the additional instructions to get the sign right.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point 👍🏻. The divisor is constant too, so that becomes a cheap operation -- like you said.
(Will update later, tomorrow -- being on mobile now).

@@ -170,9 +170,10 @@ public void Insert(int index, string? s)
public void Append(char c)
{
int pos = _pos;
if ((uint)pos < (uint)_chars.Length)
Span<char> chars = _chars;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are all codegen deltas from this change possitive?

Append methods are often called a lot from a single caller. Extra aggresive inlined struct local may push us over JIT optimization limits.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before the bound-check didn't get elided, that change does it.
I think the JIT missed the optimization here, especially as it's a ref struct thus _chars can't be changed in between the length-check and the index-access.

The jit-diff shows for this change in isolation an improvement, only one method regresses. It's System.Number:NumberToStringFormat, and unfortunately a huge method, so I have troubles by manually checking the cause for this.

jit-diff for only this change
Found 2 files with textual diffs.

Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 3188507
Total bytes of diff: 3186045
Total bytes of delta: -2462 (-0.08 % of base)
Total relative delta: -2.60
    diff is an improvement.
    relative diff is an improvement.


Top file improvements (bytes):
       -2462 : System.Private.CoreLib.dasm (-0,08 % of base)

1 total files with Code Size differences (1 improved, 0 regressed), 0 unchanged.

Top method regressions (bytes):
         498 (7,61 % of base) : System.Private.CoreLib.dasm - System.Number:NumberToStringFormat(byref,byref,System.ReadOnlySpan`1[System.Char],System.Globalization.NumberFormatInfo)

Top method improvements (bytes):
        -538 (-6,38 % of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector256`1:ToString(System.String,System.IFormatProvider):System.String:this (12 methods)
        -538 (-6,61 % of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector64`1:ToString(System.String,System.IFormatProvider):System.String:this (12 methods)
        -534 (-7,16 % of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector128`1:ToString(System.String,System.IFormatProvider):System.String:this (12 methods)
        -115 (-7,65 % of base) : System.Private.CoreLib.dasm - StringSerializer:GetSerializedString(System.TimeZoneInfo):System.String
         -84 (-8,85 % of base) : System.Private.CoreLib.dasm - System.Number:NumberToString(byref,byref,ushort,int,System.Globalization.NumberFormatInfo)
         -75 (-11,06 % of base) : System.Private.CoreLib.dasm - StringSerializer:SerializeTransitionTime(System.TimeZoneInfo+TransitionTime,byref)
         -63 (-6,98 % of base) : System.Private.CoreLib.dasm - System.ApplicationId:ToString():System.String:this
         -60 (-3,72 % of base) : System.Private.CoreLib.dasm - System.Reflection.AssemblyNameFormatter:ComputeDisplayName(System.String,System.Version,System.String,System.Byte[],int,int):System.String
         -59 (-8,31 % of base) : System.Private.CoreLib.dasm - System.PasteArguments:AppendArgument(byref,System.String)
         -56 (-10,63 % of base) : System.Private.CoreLib.dasm - System.Number:FormatGeneral(byref,byref,int,System.Globalization.NumberFormatInfo,ushort,bool)
         -48 (-7,92 % of base) : System.Private.CoreLib.dasm - System.Reflection.RuntimeMethodInfo:ToString():System.String:this
         -45 (-8,46 % of base) : System.Private.CoreLib.dasm - System.Reflection.AssemblyNameFormatter:AppendQuoted(byref,System.String)
         -43 (-5,75 % of base) : System.Private.CoreLib.dasm - System.Reflection.CustomAttributeData:ToString():System.String:this
         -41 (-9,60 % of base) : System.Private.CoreLib.dasm - RTDynamicMethod:ToString():System.String:this
         -34 (-1,72 % of base) : System.Private.CoreLib.dasm - System.Reflection.CustomAttributeTypedArgument:ToString(bool):System.String:this
         -33 (-7,60 % of base) : System.Private.CoreLib.dasm - System.Reflection.RuntimePropertyInfo:ToString():System.String:this
         -32 (-4,65 % of base) : System.Private.CoreLib.dasm - System.IO.Enumeration.FileSystemName:TranslateWin32Expression(System.String):System.String
         -28 (-2,64 % of base) : System.Private.CoreLib.dasm - System.Number:FormatFixed(byref,byref,int,System.Int32[],System.String,System.String)
         -27 (-2,85 % of base) : System.Private.CoreLib.dasm - System.Net.WebUtility:HtmlDecode(System.ReadOnlySpan`1[System.Char],byref)
         -27 (-3,13 % of base) : System.Private.CoreLib.dasm - System.String:Join(System.String,System.Collections.Generic.IEnumerable`1[System.String]):System.String
         -26 (-3,16 % of base) : System.Private.CoreLib.dasm - System.Globalization.CalendarData:NormalizeDatePattern(System.String):System.String
         -26 (-7,10 % of base) : System.Private.CoreLib.dasm - System.Reflection.RuntimeConstructorInfo:ToString():System.String:this
         -22 (-6,73 % of base) : System.Private.CoreLib.dasm - System.Number:FormatExponent(byref,System.Globalization.NumberFormatInfo,int,ushort,int,bool)
         -20 (-2,62 % of base) : System.Private.CoreLib.dasm - System.PasteArguments:Paste(System.Collections.Generic.IEnumerable`1[System.String],bool):System.String
         -19 (-2,94 % of base) : System.Private.CoreLib.dasm - System.IO.Path:Combine(System.String[]):System.String
         -19 (-4,51 % of base) : System.Private.CoreLib.dasm - System.Number:FormatPercent(byref,byref,int,System.Globalization.NumberFormatInfo)
         -18 (-3,75 % of base) : System.Private.CoreLib.dasm - System.Reflection.MethodBase:AppendParameters(byref,System.Type[],int)
         -18 (-2,69 % of base) : System.Private.CoreLib.dasm - System.String:ReplaceLineEndings(System.String):System.String:this
         -18 (-0,89 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:AppendFormatHelper(System.IFormatProvider,System.String,System.ParamsArray):this
         -17 (-9,24 % of base) : System.Private.CoreLib.dasm - StringSerializer:SerializeSubstitute(System.String,byref)
         -17 (-4,67 % of base) : System.Private.CoreLib.dasm - System.Number:FormatScientific(byref,byref,int,System.Globalization.NumberFormatInfo,ushort)
         -16 (-1,75 % of base) : System.Private.CoreLib.dasm - StringSerializer:GetNextStringValue():System.String:this
         -16 (-1,50 % of base) : System.Private.CoreLib.dasm - System.IO.Path:GetRelativePath(System.String,System.String,int):System.String
         -16 (-3,86 % of base) : System.Private.CoreLib.dasm - System.Number:FormatCurrency(byref,byref,int,System.Globalization.NumberFormatInfo)
         -16 (-1,81 % of base) : System.Private.CoreLib.dasm - System.String:JoinCore(System.ReadOnlySpan`1[System.Char],System.Collections.Generic.IEnumerable`1[System.__Canon]):System.String
         -15 (-2,58 % of base) : System.Private.CoreLib.dasm - System.IO.Path:Join(System.String[]):System.String
         -15 (-3,08 % of base) : System.Private.CoreLib.dasm - System.String:JoinCore(System.ReadOnlySpan`1[System.Char],System.Object[]):System.String
         -15 (-3,80 % of base) : System.Private.CoreLib.dasm - System.TypeNameParser:EscapeTypeName(System.String):System.String
         -13 (-1,89 % of base) : System.Private.CoreLib.dasm - System.Net.WebUtility:HtmlEncode(System.ReadOnlySpan`1[System.Char],byref)
         -12 (-4,07 % of base) : System.Private.CoreLib.dasm - System.DateTimeParse:TryParseQuoteString(System.ReadOnlySpan`1[System.Char],int,byref,byref):bool
         -12 (-1,92 % of base) : System.Private.CoreLib.dasm - System.IO.PathInternal:RemoveRelativeSegments(System.ReadOnlySpan`1[System.Char],int,byref):bool
         -11 (-2,26 % of base) : System.Private.CoreLib.dasm - System.IO.PathInternal:NormalizeDirectorySeparators(System.String):System.String
         -11 (-0,94 % of base) : System.Private.CoreLib.dasm - System.Reflection.AssemblyNameParser:GetNextToken(byref):int:this
         -11 (-12,50 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:Append(System.String):this
         -11 (-16,42 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:Append(ushort):this
         -10 (-11,76 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:GrowAndAppend(ushort):this
          -9 (-1,65 % of base) : System.Private.CoreLib.dasm - System.String:Concat(System.Collections.Generic.IEnumerable`1[System.__Canon]):System.String
          -9 (-1,87 % of base) : System.Private.CoreLib.dasm - System.String:Concat(System.Collections.Generic.IEnumerable`1[System.String]):System.String
          -9 (-2,56 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:AppendSpanFormattable(double,System.String,System.IFormatProvider):this
          -9 (-3,31 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:AppendSpanFormattable(int,System.String,System.IFormatProvider):this
          -9 (-3,33 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:AppendSpanFormattable(System.__Canon,System.String,System.IFormatProvider):this
          -9 (-3,19 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:AppendSpanFormattable(System.DateTime,System.String,System.IFormatProvider):this
          -6 (-2,00 % of base) : System.Private.CoreLib.dasm - System.Number:FormatNumber(byref,byref,int,System.Globalization.NumberFormatInfo)

Top method regressions (percentages):
         498 (7,61 % of base) : System.Private.CoreLib.dasm - System.Number:NumberToStringFormat(byref,byref,System.ReadOnlySpan`1[System.Char],System.Globalization.NumberFormatInfo)

Top method improvements (percentages):
         -11 (-16,42 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:Append(ushort):this
         -11 (-12,50 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:Append(System.String):this
         -10 (-11,76 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:GrowAndAppend(ushort):this
         -75 (-11,06 % of base) : System.Private.CoreLib.dasm - StringSerializer:SerializeTransitionTime(System.TimeZoneInfo+TransitionTime,byref)
         -56 (-10,63 % of base) : System.Private.CoreLib.dasm - System.Number:FormatGeneral(byref,byref,int,System.Globalization.NumberFormatInfo,ushort,bool)
         -41 (-9,60 % of base) : System.Private.CoreLib.dasm - RTDynamicMethod:ToString():System.String:this
         -17 (-9,24 % of base) : System.Private.CoreLib.dasm - StringSerializer:SerializeSubstitute(System.String,byref)
         -84 (-8,85 % of base) : System.Private.CoreLib.dasm - System.Number:NumberToString(byref,byref,ushort,int,System.Globalization.NumberFormatInfo)
         -45 (-8,46 % of base) : System.Private.CoreLib.dasm - System.Reflection.AssemblyNameFormatter:AppendQuoted(byref,System.String)
         -59 (-8,31 % of base) : System.Private.CoreLib.dasm - System.PasteArguments:AppendArgument(byref,System.String)
         -48 (-7,92 % of base) : System.Private.CoreLib.dasm - System.Reflection.RuntimeMethodInfo:ToString():System.String:this
        -115 (-7,65 % of base) : System.Private.CoreLib.dasm - StringSerializer:GetSerializedString(System.TimeZoneInfo):System.String
         -33 (-7,60 % of base) : System.Private.CoreLib.dasm - System.Reflection.RuntimePropertyInfo:ToString():System.String:this
        -534 (-7,16 % of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector128`1:ToString(System.String,System.IFormatProvider):System.String:this (12 methods)
         -26 (-7,10 % of base) : System.Private.CoreLib.dasm - System.Reflection.RuntimeConstructorInfo:ToString():System.String:this
         -63 (-6,98 % of base) : System.Private.CoreLib.dasm - System.ApplicationId:ToString():System.String:this
         -22 (-6,73 % of base) : System.Private.CoreLib.dasm - System.Number:FormatExponent(byref,System.Globalization.NumberFormatInfo,int,ushort,int,bool)
        -538 (-6,61 % of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector64`1:ToString(System.String,System.IFormatProvider):System.String:this (12 methods)
        -538 (-6,38 % of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector256`1:ToString(System.String,System.IFormatProvider):System.String:this (12 methods)
         -43 (-5,75 % of base) : System.Private.CoreLib.dasm - System.Reflection.CustomAttributeData:ToString():System.String:this
         -17 (-4,67 % of base) : System.Private.CoreLib.dasm - System.Number:FormatScientific(byref,byref,int,System.Globalization.NumberFormatInfo,ushort)
         -32 (-4,65 % of base) : System.Private.CoreLib.dasm - System.IO.Enumeration.FileSystemName:TranslateWin32Expression(System.String):System.String
         -19 (-4,51 % of base) : System.Private.CoreLib.dasm - System.Number:FormatPercent(byref,byref,int,System.Globalization.NumberFormatInfo)
         -12 (-4,07 % of base) : System.Private.CoreLib.dasm - System.DateTimeParse:TryParseQuoteString(System.ReadOnlySpan`1[System.Char],int,byref,byref):bool
         -16 (-3,86 % of base) : System.Private.CoreLib.dasm - System.Number:FormatCurrency(byref,byref,int,System.Globalization.NumberFormatInfo)
         -15 (-3,80 % of base) : System.Private.CoreLib.dasm - System.TypeNameParser:EscapeTypeName(System.String):System.String
         -18 (-3,75 % of base) : System.Private.CoreLib.dasm - System.Reflection.MethodBase:AppendParameters(byref,System.Type[],int)
         -60 (-3,72 % of base) : System.Private.CoreLib.dasm - System.Reflection.AssemblyNameFormatter:ComputeDisplayName(System.String,System.Version,System.String,System.Byte[],int,int):System.String
          -9 (-3,33 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:AppendSpanFormattable(System.__Canon,System.String,System.IFormatProvider):this
          -9 (-3,31 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:AppendSpanFormattable(int,System.String,System.IFormatProvider):this
          -9 (-3,19 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:AppendSpanFormattable(System.DateTime,System.String,System.IFormatProvider):this
         -26 (-3,16 % of base) : System.Private.CoreLib.dasm - System.Globalization.CalendarData:NormalizeDatePattern(System.String):System.String
         -27 (-3,13 % of base) : System.Private.CoreLib.dasm - System.String:Join(System.String,System.Collections.Generic.IEnumerable`1[System.String]):System.String
         -15 (-3,08 % of base) : System.Private.CoreLib.dasm - System.String:JoinCore(System.ReadOnlySpan`1[System.Char],System.Object[]):System.String
         -19 (-2,94 % of base) : System.Private.CoreLib.dasm - System.IO.Path:Combine(System.String[]):System.String
         -27 (-2,85 % of base) : System.Private.CoreLib.dasm - System.Net.WebUtility:HtmlDecode(System.ReadOnlySpan`1[System.Char],byref)
         -18 (-2,69 % of base) : System.Private.CoreLib.dasm - System.String:ReplaceLineEndings(System.String):System.String:this
         -28 (-2,64 % of base) : System.Private.CoreLib.dasm - System.Number:FormatFixed(byref,byref,int,System.Int32[],System.String,System.String)
         -20 (-2,62 % of base) : System.Private.CoreLib.dasm - System.PasteArguments:Paste(System.Collections.Generic.IEnumerable`1[System.String],bool):System.String
         -15 (-2,58 % of base) : System.Private.CoreLib.dasm - System.IO.Path:Join(System.String[]):System.String
          -9 (-2,56 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:AppendSpanFormattable(double,System.String,System.IFormatProvider):this
         -11 (-2,26 % of base) : System.Private.CoreLib.dasm - System.IO.PathInternal:NormalizeDirectorySeparators(System.String):System.String
          -6 (-2,00 % of base) : System.Private.CoreLib.dasm - System.Number:FormatNumber(byref,byref,int,System.Globalization.NumberFormatInfo)
         -12 (-1,92 % of base) : System.Private.CoreLib.dasm - System.IO.PathInternal:RemoveRelativeSegments(System.ReadOnlySpan`1[System.Char],int,byref):bool
         -13 (-1,89 % of base) : System.Private.CoreLib.dasm - System.Net.WebUtility:HtmlEncode(System.ReadOnlySpan`1[System.Char],byref)
          -9 (-1,87 % of base) : System.Private.CoreLib.dasm - System.String:Concat(System.Collections.Generic.IEnumerable`1[System.String]):System.String
         -16 (-1,81 % of base) : System.Private.CoreLib.dasm - System.String:JoinCore(System.ReadOnlySpan`1[System.Char],System.Collections.Generic.IEnumerable`1[System.__Canon]):System.String
         -16 (-1,75 % of base) : System.Private.CoreLib.dasm - StringSerializer:GetNextStringValue():System.String:this
         -34 (-1,72 % of base) : System.Private.CoreLib.dasm - System.Reflection.CustomAttributeTypedArgument:ToString(bool):System.String:this
          -9 (-1,65 % of base) : System.Private.CoreLib.dasm - System.String:Concat(System.Collections.Generic.IEnumerable`1[System.__Canon]):System.String
         -16 (-1,50 % of base) : System.Private.CoreLib.dasm - System.IO.Path:GetRelativePath(System.String,System.String,int):System.String
         -11 (-0,94 % of base) : System.Private.CoreLib.dasm - System.Reflection.AssemblyNameParser:GetNextToken(byref):int:this
         -18 (-0,89 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:AppendFormatHelper(System.IFormatProvider,System.String,System.ParamsArray):this

54 total methods with Code Size differences (53 improved, 1 regressed), 25071 unchanged.

--------------------------------------------------------------------------------

Besides the elimination of the bound-checks I see a difference in codegen at the end of that method.

G_M27859_IG04:
       add      rsp, 40
       ret      

G_M27859_IG05:
-      movzx    rdx, dx
-      lea      rax, [(reloc)]
G_M27859_IG05:
+      movzx    rdx, dx
+      call     [System.Text.ValueStringBuilder:GrowAndAppend(ushort):this]
+      nop   

G_M27859_IG06:
       add      rsp, 40
-      tail.jmp [rax]System.Text.ValueStringBuilder:GrowAndAppend(ushort):this

-G_M27859_IG07:
-      call     [CORINFO_HELP_RNGCHKFAIL]
-      int3     

So a standard call instead of the tail-call.
When looking at the whole System.Private.CoreLib.dasm (as generated by jit-diff diff) shows that ValueStringBuilder.Append(ushort) gets inlined, so GrowAndAppend boils down a call anyway.

asm (x64) current main
; Assembly listing for method System.Text.ValueStringBuilder:Append(ushort):this
; Emitting BLENDED_CODE for X64 CPU with SSE2 - Windows
; ReadyToRun compilation
; optimized code
; rsp based frame
; fully interruptible
; No PGO data
; 0 inlinees with PGO data; 1 single block inlinees; 0 inlinees without PGO data
; Final local variable assignments
;
;  V00 this         [V00,T00] (  7,  5.50)   byref  ->  rcx         this single-def
;  V01 arg1         [V01,T01] (  4,  3   )  ushort  ->  rdx         single-def
;  V02 loc0         [V02,T02] (  5,  3.50)     int  ->  rax        
;  V03 OutArgs      [V03    ] (  1,  1   )  lclBlk (32) [rsp+00H]   "OutgoingArgSpace"
;  V04 tmp1         [V04,T03] (  3,  3   )   byref  ->   r8         single-def "Span.get_Item ptrToSpan"
;* V05 tmp2         [V05    ] (  0,  0   )   byref  ->  zero-ref    "Inlining Arg"
;
; Lcl frame size = 40

G_M27859_IG01:
       sub      rsp, 40
						;; size=4 bbWeight=0    PerfScore 0.00
G_M27859_IG02:
       mov      eax, dword ptr [rcx+8]
       cmp      eax, dword ptr [rcx+24]
       jae      SHORT G_M27859_IG05
						;; size=8 bbWeight=1    PerfScore 6.00
G_M27859_IG03:
       lea      r8, bword ptr [rcx+16]
       cmp      eax, dword ptr [r8+8]
       jae      SHORT G_M27859_IG07
       mov      r8, bword ptr [r8]
       mov      r9d, eax
       mov      word  ptr [r8+2*r9], dx
       inc      eax
       mov      dword ptr [rcx+8], eax
						;; size=26 bbWeight=0.50 PerfScore 4.50
G_M27859_IG04:
       add      rsp, 40
       ret      
						;; size=5 bbWeight=0.50 PerfScore 0.62
G_M27859_IG05:
       movzx    rdx, dx
       lea      rax, [(reloc)]
						;; size=10 bbWeight=0.50 PerfScore 0.38
G_M27859_IG06:
       add      rsp, 40
       tail.jmp [rax]System.Text.ValueStringBuilder:GrowAndAppend(ushort):this
						;; size=7 bbWeight=0.50 PerfScore 1.12
G_M27859_IG07:
       call     [CORINFO_HELP_RNGCHKFAIL]
       int3     
						;; size=7 bbWeight=0    PerfScore 0.00

; Total bytes of code 67, prolog size 4, PerfScore 19.33, instruction count 20, allocated bytes for code 67 (MethodHash=b9d9932c) for method System.Text.ValueStringBuilder:Append(ushort):this
; ============================================================
asm (x64) PR
; Assembly listing for method System.Text.ValueStringBuilder:Append(ushort):this
; Emitting BLENDED_CODE for X64 CPU with SSE2 - Windows
; ReadyToRun compilation
; optimized code
; rsp based frame
; partially interruptible
; No PGO data
; 0 inlinees with PGO data; 1 single block inlinees; 0 inlinees without PGO data
; Final local variable assignments
;
;  V00 this         [V00,T00] (  6,  5   )   byref  ->  rcx         this single-def
;  V01 arg1         [V01,T02] (  4,  3   )  ushort  ->  rdx         single-def
;  V02 loc0         [V02,T03] (  4,  3   )     int  ->  rax        
;* V03 loc1         [V03    ] (  0,  0   )  struct (16) zero-ref    ld-addr-op
;  V04 OutArgs      [V04    ] (  1,  1   )  lclBlk (32) [rsp+00H]   "OutgoingArgSpace"
;  V05 tmp1         [V05,T05] (  2,  1.50)   byref  ->   r9         single-def V03._pointer(offs=0x00) P-INDEP "field V03._pointer (fldOffset=0x0)"
;  V06 tmp2         [V06,T04] (  2,  2   )     int  ->   r8         V03._length(offs=0x08) P-INDEP "field V03._length (fldOffset=0x8)"
;  V07 tmp3         [V07,T01] (  3,  6   )   byref  ->   r8         single-def "BlockOp address local"
;
; Lcl frame size = 40

G_M27859_IG01:
       sub      rsp, 40
						;; size=4 bbWeight=0.50 PerfScore 0.12
G_M27859_IG02:
       mov      eax, dword ptr [rcx+8]
       lea      r8, bword ptr [rcx+16]
       mov      r9, bword ptr [r8]
       mov      r8d, dword ptr [r8+8]
       cmp      eax, r8d
       jae      SHORT G_M27859_IG05
						;; size=19 bbWeight=1    PerfScore 7.75
G_M27859_IG03:
       mov      r8d, eax
       mov      word  ptr [r9+2*r8], dx
       inc      eax
       mov      dword ptr [rcx+8], eax
						;; size=13 bbWeight=0.50 PerfScore 1.25
G_M27859_IG04:
       add      rsp, 40
       ret      
						;; size=5 bbWeight=0.50 PerfScore 0.62
G_M27859_IG05:
       movzx    rdx, dx
       call     [System.Text.ValueStringBuilder:GrowAndAppend(ushort):this]
       nop      
						;; size=10 bbWeight=0.50 PerfScore 1.75
G_M27859_IG06:
       add      rsp, 40
       ret      
						;; size=5 bbWeight=0.50 PerfScore 0.62

; Total bytes of code 56, prolog size 4, PerfScore 17.73, instruction count 18, allocated bytes for code 56 (MethodHash=b9d9932c) for method System.Text.ValueStringBuilder:Append(ushort):this
; ============================================================

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's System.Number:NumberToStringFormat, and unfortunately a huge method

We care about number formatting performance quite a bit. It would be useful to understand the performance impact of this change on number formatting.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NumberToStringFormat-asm.zip

Compared the asm block by block, and starting at G_M25536_IG74 I see more stack-work (spilling?) happening.
Tracked back to C# code this is at

if (number.IsNegative && (section == 0) && (number.Scale != 0))
sb.Append(info.NegativeSign);
, so the first time the ValueStringBuilder is used in that method.

I guess

Extra aggresive inlined struct local may push us over JIT optimization limits.

is the case here.

In conclusion I think it's the best to revert the change in ValueStringBuilder, and fix the JIT for

I think the JIT missed the optimization here, especially as it's a ref struct thus _chars can't be changed in between the length-check and the index-access.

so we get the best of both worlds.
If JIT is fixed we should get the improvements in the jit-diff anyway without regressing System.Number:NumberToStringFormat.

Hm, I don't find any issue for this missed optimization, but believe to remember that this was already mentioned somewhere else.

@gfoidl gfoidl marked this pull request as ready for review April 3, 2022 12:23
@gfoidl
Copy link
Member Author

gfoidl commented Apr 4, 2022

@jkotas updated to uint-division in BitHelper and reverted the span-local in ValueStringBuilder.

Now the jit-diff only shows improvements, without any regression.

jit-diff based on da31c0b
Found 2 files with textual diffs.

Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 3188507
Total bytes of diff: 3188057
Total bytes of delta: -450 (-0.01 % of base)
Total relative delta: -1.23
    diff is an improvement.
    relative diff is an improvement.


Top file improvements (bytes):
        -450 : System.Private.CoreLib.dasm (-0,01 % of base)

1 total files with Code Size differences (1 improved, 0 regressed), 0 unchanged.

Top method improvements (bytes):
        -111 (-10,25 % of base) : System.Private.CoreLib.dasm - System.Collections.Generic.HashSet`1:SymmetricExceptWithEnumerable(System.Collections.Generic.IEnumerable`1[System.__Canon]):this
         -56 (-7,78 % of base) : System.Private.CoreLib.dasm - System.Collections.Generic.HashSet`1:IntersectWithEnumerable(System.Collections.Generic.IEnumerable`1[System.__Canon]):this
         -49 (-5,68 % of base) : System.Private.CoreLib.dasm - System.Collections.Generic.HashSet`1:CheckUniqueAndUnfoundElements(System.Collections.Generic.IEnumerable`1[System.__Canon],bool):System.ValueTuple`2[System.Int32, System.Int32]:this
         -40 (-6,20 % of base) : System.Private.CoreLib.dasm - System.String:Replace(System.String,System.String):System.String:this
         -26 (-4,45 % of base) : System.Private.CoreLib.dasm - System.String:MakeSeparatorList(System.ReadOnlySpan`1[System.Char],byref):this
         -19 (-0,22 % of base) : System.Private.CoreLib.dasm - System.Span`1:ToString():System.String:this (20 methods)
         -19 (-4,92 % of base) : System.Private.CoreLib.dasm - System.String:MakeSeparatorListVectorized(byref,ushort,ushort,ushort):this
         -17 (-2,57 % of base) : System.Private.CoreLib.dasm - TimeSpanTokenizer:GetNextToken():System.Globalization.TimeSpanParse+TimeSpanToken:this
         -15 (-24,59 % of base) : System.Private.CoreLib.dasm - System.Collections.Generic.BitHelper:MarkBit(int):this
         -15 (-3,92 % of base) : System.Private.CoreLib.dasm - System.Globalization.StringInfo:ParseCombiningCharacters(System.String):System.Int32[]
         -15 (-0,22 % of base) : System.Private.CoreLib.dasm - System.ReadOnlySpan`1:ToString():System.String:this (16 methods)
         -14 (-24,14 % of base) : System.Private.CoreLib.dasm - System.Collections.Generic.BitHelper:IsMarked(int):bool:this
         -12 (-11,11 % of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ValueListBuilder`1:Append(System.__Canon):this
         -12 (-3,30 % of base) : System.Private.CoreLib.dasm - System.String:MakeSeparatorList(System.String[],byref,byref):this
          -7 (-2,76 % of base) : System.Private.CoreLib.dasm - System.String:MakeSeparatorList(System.String,byref):this
          -5 (-6,02 % of base) : System.Private.CoreLib.dasm - System.Text.StringBuilder:Append(ushort):System.Text.StringBuilder:this (2 methods)
          -2 (-0,57 % of base) : System.Private.CoreLib.dasm - System.Collections.Generic.KeyValuePair:PairToString(System.Object,System.Object):System.String
          -2 (-0,37 % of base) : System.Private.CoreLib.dasm - System.Globalization.SortKey:ToString():System.String:this
          -2 (-1,72 % of base) : System.Private.CoreLib.dasm - System.Globalization.TimeSpanParse:ParseExactDigits(byref,int,int,byref,byref):bool
          -2 (-0,18 % of base) : System.Private.CoreLib.dasm - System.Memory`1:ToString():System.String:this (3 methods)
          -2 (-0,18 % of base) : System.Private.CoreLib.dasm - System.ReadOnlyMemory`1:ToString():System.String:this (3 methods)
          -2 (-0,35 % of base) : System.Private.CoreLib.dasm - System.Reflection.LocalVariableInfo:ToString():System.String:this
          -2 (-0,46 % of base) : System.Private.CoreLib.dasm - System.Runtime.Loader.AssemblyLoadContext:ToString():System.String:this
          -2 (-0,52 % of base) : System.Private.CoreLib.dasm - System.Text.Rune:get_DebuggerDisplay():System.String:this
          -1 (-0,11 % of base) : System.Private.CoreLib.dasm - System.ComponentModel.Win32Exception:ToString():System.String:this
          -1 (-0,74 % of base) : System.Private.CoreLib.dasm - System.Runtime.CompilerServices.DefaultInterpolatedStringHandler:AppendLiteral(System.String):this

Top method improvements (percentages):
         -15 (-24,59 % of base) : System.Private.CoreLib.dasm - System.Collections.Generic.BitHelper:MarkBit(int):this
         -14 (-24,14 % of base) : System.Private.CoreLib.dasm - System.Collections.Generic.BitHelper:IsMarked(int):bool:this
         -12 (-11,11 % of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ValueListBuilder`1:Append(System.__Canon):this
        -111 (-10,25 % of base) : System.Private.CoreLib.dasm - System.Collections.Generic.HashSet`1:SymmetricExceptWithEnumerable(System.Collections.Generic.IEnumerable`1[System.__Canon]):this
         -56 (-7,78 % of base) : System.Private.CoreLib.dasm - System.Collections.Generic.HashSet`1:IntersectWithEnumerable(System.Collections.Generic.IEnumerable`1[System.__Canon]):this
         -40 (-6,20 % of base) : System.Private.CoreLib.dasm - System.String:Replace(System.String,System.String):System.String:this
          -5 (-6,02 % of base) : System.Private.CoreLib.dasm - System.Text.StringBuilder:Append(ushort):System.Text.StringBuilder:this (2 methods)
         -49 (-5,68 % of base) : System.Private.CoreLib.dasm - System.Collections.Generic.HashSet`1:CheckUniqueAndUnfoundElements(System.Collections.Generic.IEnumerable`1[System.__Canon],bool):System.ValueTuple`2[System.Int32, System.Int32]:this
         -19 (-4,92 % of base) : System.Private.CoreLib.dasm - System.String:MakeSeparatorListVectorized(byref,ushort,ushort,ushort):this
         -26 (-4,45 % of base) : System.Private.CoreLib.dasm - System.String:MakeSeparatorList(System.ReadOnlySpan`1[System.Char],byref):this
         -15 (-3,92 % of base) : System.Private.CoreLib.dasm - System.Globalization.StringInfo:ParseCombiningCharacters(System.String):System.Int32[]
         -12 (-3,30 % of base) : System.Private.CoreLib.dasm - System.String:MakeSeparatorList(System.String[],byref,byref):this
          -7 (-2,76 % of base) : System.Private.CoreLib.dasm - System.String:MakeSeparatorList(System.String,byref):this
         -17 (-2,57 % of base) : System.Private.CoreLib.dasm - TimeSpanTokenizer:GetNextToken():System.Globalization.TimeSpanParse+TimeSpanToken:this
          -2 (-1,72 % of base) : System.Private.CoreLib.dasm - System.Globalization.TimeSpanParse:ParseExactDigits(byref,int,int,byref,byref):bool
          -1 (-0,74 % of base) : System.Private.CoreLib.dasm - System.Runtime.CompilerServices.DefaultInterpolatedStringHandler:AppendLiteral(System.String):this
          -2 (-0,57 % of base) : System.Private.CoreLib.dasm - System.Collections.Generic.KeyValuePair:PairToString(System.Object,System.Object):System.String
          -2 (-0,52 % of base) : System.Private.CoreLib.dasm - System.Text.Rune:get_DebuggerDisplay():System.String:this
          -2 (-0,46 % of base) : System.Private.CoreLib.dasm - System.Runtime.Loader.AssemblyLoadContext:ToString():System.String:this
          -2 (-0,37 % of base) : System.Private.CoreLib.dasm - System.Globalization.SortKey:ToString():System.String:this
          -2 (-0,35 % of base) : System.Private.CoreLib.dasm - System.Reflection.LocalVariableInfo:ToString():System.String:this
         -19 (-0,22 % of base) : System.Private.CoreLib.dasm - System.Span`1:ToString():System.String:this (20 methods)
         -15 (-0,22 % of base) : System.Private.CoreLib.dasm - System.ReadOnlySpan`1:ToString():System.String:this (16 methods)
          -2 (-0,18 % of base) : System.Private.CoreLib.dasm - System.Memory`1:ToString():System.String:this (3 methods)
          -2 (-0,18 % of base) : System.Private.CoreLib.dasm - System.ReadOnlyMemory`1:ToString():System.String:this (3 methods)
          -1 (-0,11 % of base) : System.Private.CoreLib.dasm - System.ComponentModel.Win32Exception:ToString():System.String:this

26 total methods with Code Size differences (26 improved, 0 regressed), 25099 unchanged.

--------------------------------------------------------------------------------

The JIT should try to elide the bound-check in the ValueStringBuilder case (field in ref struct) so we get the improvement "for free".

{
_span[bitArrayIndex] |= (1 << (bitPosition % IntSize));
span[(int)bitArrayIndex] |= 1 << (int)position;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bound-check is still elided, even when casted back to int here.
(good job from the JIT)

@@ -19,19 +21,22 @@ internal BitHelper(Span<int> span, bool clear)

internal void MarkBit(int bitPosition)
{
int bitArrayIndex = bitPosition / IntSize;
if ((uint)bitArrayIndex < (uint)_span.Length)
(uint bitArrayIndex, uint position) = Math.DivRem((uint)bitPosition, IntSize);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the code with DivRem as efficient as it would be if this used unsigned div and mod directly?

DivRem is not treated as the intrinsic yet. My guess is that some unnecessary instructions compared to just using unsigned div and mod directly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉 this gives really nice codegen. Thanks!

The jit-diff from this change allone (based on da31c0b to d9f4e01) shows good improvements.

Top method improvements (bytes):
        -118 (-12,14 % of base) : System.Private.CoreLib.dasm - System.Collections.Generic.HashSet`1:SymmetricExceptWithEnumerable(System.Collections.Generic.IEnumerable`1[System.__Canon]):this
         -84 (-12,65 % of base) : System.Private.CoreLib.dasm - System.Collections.Generic.HashSet`1:IntersectWithEnumerable(System.Collections.Generic.IEnumerable`1[System.__Canon]):this
         -30 (-3,69 % of base) : System.Private.CoreLib.dasm - System.Collections.Generic.HashSet`1:CheckUniqueAndUnfoundElements(System.Collections.Generic.IEnumerable`1[System.__Canon],bool):System.ValueTuple`2[System.Int32, System.Int32]:this
         -10 (-22,73 % of base) : System.Private.CoreLib.dasm - System.Collections.Generic.BitHelper:IsMarked(int):bool:this
         -10 (-21,74 % of base) : System.Private.CoreLib.dasm - System.Collections.Generic.BitHelper:MarkBit(int):this

I tried without the local Span, but then there were some regressions in larger methods.
In SharpLab, etc. no bound check was shown for the method alone, but in the jit-diff I saw some, so brought the local Span back.

@jeffhandley jeffhandley added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Apr 9, 2022
@ghost
Copy link

ghost commented Apr 9, 2022

Tagging subscribers to this area: @JulieLeeMSFT
See info in area-owners.md if you want to be subscribed.

Issue Details

Contributes to #67044 -- CoreLib only


Edit: see #67448 (comment)
So when a uint-cast is needed (for the index), then for clarity it should also be on the length-side (i.e. on both sides).
Hence the mentioned pattern for span is outdated now.

The pattern (uint)index < span.Length isn't recognized by the JIT, so the bound-checks aren't removed -- cf. #67044 (comment) -- so at the current state of the PR here the uint-casts are still there, but these places are annotated with a // TODO comment, so I (or anyone) can find them easier for an update once span.Length is handled as expected. That's the reason why the PR is marked as draft.

That issue should be fixed, then the removal of the workaround can be continued without introducing (temporary regressions).


In Visual Studio I see the project System.Private.CoreLib.Shared.
JIT's optimization for the bound-checks is for .NET 7 onwards, so my question is to which other targets is that project "shared"?
Not that this PR introduces some regression elsewhere.


The change was done rather mechanical by an Regex.Replace, then verifyed (in iterations / batches) through jit-diffs and manual inspection of the git-diffs.
So a few additional bound-checks got elided by changing the code a bit.

jit-diff
Found 2 files with textual diffs.

Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 3188507
Total bytes of diff: 3185674
Total bytes of delta: -2833 (-0.09 % of base)
Total relative delta: -3.50
    diff is an improvement.
    relative diff is an improvement.


Top file improvements (bytes):
       -2833 : System.Private.CoreLib.dasm (-0,09 % of base)

1 total files with Code Size differences (1 improved, 0 regressed), 0 unchanged.

Top method regressions (bytes):
         498 (7,61 % of base) : System.Private.CoreLib.dasm - System.Number:NumberToStringFormat(byref,byref,System.ReadOnlySpan`1[System.Char],System.Globalization.NumberFormatInfo)

Top method improvements (bytes):
        -538 (-6,38 % of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector256`1:ToString(System.String,System.IFormatProvider):System.String:this (12 methods)
        -538 (-6,61 % of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector64`1:ToString(System.String,System.IFormatProvider):System.String:this (12 methods)
        -534 (-7,16 % of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector128`1:ToString(System.String,System.IFormatProvider):System.String:this (12 methods)
        -115 (-7,65 % of base) : System.Private.CoreLib.dasm - StringSerializer:GetSerializedString(System.TimeZoneInfo):System.String
         -84 (-8,85 % of base) : System.Private.CoreLib.dasm - System.Number:NumberToString(byref,byref,ushort,int,System.Globalization.NumberFormatInfo)
         -77 (-7,11 % of base) : System.Private.CoreLib.dasm - System.Collections.Generic.HashSet`1:SymmetricExceptWithEnumerable(System.Collections.Generic.IEnumerable`1[System.__Canon]):this
         -75 (-11,06 % of base) : System.Private.CoreLib.dasm - StringSerializer:SerializeTransitionTime(System.TimeZoneInfo+TransitionTime,byref)
         -63 (-6,98 % of base) : System.Private.CoreLib.dasm - System.ApplicationId:ToString():System.String:this
         -60 (-3,72 % of base) : System.Private.CoreLib.dasm - System.Reflection.AssemblyNameFormatter:ComputeDisplayName(System.String,System.Version,System.String,System.Byte[],int,int):System.String
         -59 (-8,31 % of base) : System.Private.CoreLib.dasm - System.PasteArguments:AppendArgument(byref,System.String)
         -56 (-10,63 % of base) : System.Private.CoreLib.dasm - System.Number:FormatGeneral(byref,byref,int,System.Globalization.NumberFormatInfo,ushort,bool)
         -48 (-7,92 % of base) : System.Private.CoreLib.dasm - System.Reflection.RuntimeMethodInfo:ToString():System.String:this
         -45 (-8,46 % of base) : System.Private.CoreLib.dasm - System.Reflection.AssemblyNameFormatter:AppendQuoted(byref,System.String)
         -43 (-5,75 % of base) : System.Private.CoreLib.dasm - System.Reflection.CustomAttributeData:ToString():System.String:this
         -41 (-9,60 % of base) : System.Private.CoreLib.dasm - RTDynamicMethod:ToString():System.String:this
         -41 (-4,76 % of base) : System.Private.CoreLib.dasm - System.Collections.Generic.HashSet`1:CheckUniqueAndUnfoundElements(System.Collections.Generic.IEnumerable`1[System.__Canon],bool):System.ValueTuple`2[System.Int32, System.Int32]:this
         -40 (-6,20 % of base) : System.Private.CoreLib.dasm - System.String:Replace(System.String,System.String):System.String:this
         -36 (-5,00 % of base) : System.Private.CoreLib.dasm - System.Collections.Generic.HashSet`1:IntersectWithEnumerable(System.Collections.Generic.IEnumerable`1[System.__Canon]):this
         -34 (-1,72 % of base) : System.Private.CoreLib.dasm - System.Reflection.CustomAttributeTypedArgument:ToString(bool):System.String:this
         -33 (-7,60 % of base) : System.Private.CoreLib.dasm - System.Reflection.RuntimePropertyInfo:ToString():System.String:this
         -32 (-4,65 % of base) : System.Private.CoreLib.dasm - System.IO.Enumeration.FileSystemName:TranslateWin32Expression(System.String):System.String
         -28 (-2,64 % of base) : System.Private.CoreLib.dasm - System.Number:FormatFixed(byref,byref,int,System.Int32[],System.String,System.String)
         -27 (-2,85 % of base) : System.Private.CoreLib.dasm - System.Net.WebUtility:HtmlDecode(System.ReadOnlySpan`1[System.Char],byref)
         -27 (-3,13 % of base) : System.Private.CoreLib.dasm - System.String:Join(System.String,System.Collections.Generic.IEnumerable`1[System.String]):System.String
         -26 (-3,16 % of base) : System.Private.CoreLib.dasm - System.Globalization.CalendarData:NormalizeDatePattern(System.String):System.String
         -26 (-7,10 % of base) : System.Private.CoreLib.dasm - System.Reflection.RuntimeConstructorInfo:ToString():System.String:this
         -26 (-4,45 % of base) : System.Private.CoreLib.dasm - System.String:MakeSeparatorList(System.ReadOnlySpan`1[System.Char],byref):this
         -22 (-6,73 % of base) : System.Private.CoreLib.dasm - System.Number:FormatExponent(byref,System.Globalization.NumberFormatInfo,int,ushort,int,bool)
         -20 (-2,62 % of base) : System.Private.CoreLib.dasm - System.PasteArguments:Paste(System.Collections.Generic.IEnumerable`1[System.String],bool):System.String
         -19 (-2,94 % of base) : System.Private.CoreLib.dasm - System.IO.Path:Combine(System.String[]):System.String
         -19 (-4,51 % of base) : System.Private.CoreLib.dasm - System.Number:FormatPercent(byref,byref,int,System.Globalization.NumberFormatInfo)
         -19 (-0,22 % of base) : System.Private.CoreLib.dasm - System.Span`1:ToString():System.String:this (20 methods)
         -19 (-4,92 % of base) : System.Private.CoreLib.dasm - System.String:MakeSeparatorListVectorized(byref,ushort,ushort,ushort):this
         -18 (-3,75 % of base) : System.Private.CoreLib.dasm - System.Reflection.MethodBase:AppendParameters(byref,System.Type[],int)
         -18 (-2,69 % of base) : System.Private.CoreLib.dasm - System.String:ReplaceLineEndings(System.String):System.String:this
         -18 (-0,89 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:AppendFormatHelper(System.IFormatProvider,System.String,System.ParamsArray):this
         -17 (-9,24 % of base) : System.Private.CoreLib.dasm - StringSerializer:SerializeSubstitute(System.String,byref)
         -17 (-4,67 % of base) : System.Private.CoreLib.dasm - System.Number:FormatScientific(byref,byref,int,System.Globalization.NumberFormatInfo,ushort)
         -16 (-1,75 % of base) : System.Private.CoreLib.dasm - StringSerializer:GetNextStringValue():System.String:this
         -16 (-1,50 % of base) : System.Private.CoreLib.dasm - System.IO.Path:GetRelativePath(System.String,System.String,int):System.String
         -16 (-3,86 % of base) : System.Private.CoreLib.dasm - System.Number:FormatCurrency(byref,byref,int,System.Globalization.NumberFormatInfo)
         -16 (-1,81 % of base) : System.Private.CoreLib.dasm - System.String:JoinCore(System.ReadOnlySpan`1[System.Char],System.Collections.Generic.IEnumerable`1[System.__Canon]):System.String
         -16 (-2,42 % of base) : System.Private.CoreLib.dasm - TimeSpanTokenizer:GetNextToken():System.Globalization.TimeSpanParse+TimeSpanToken:this
         -15 (-3,92 % of base) : System.Private.CoreLib.dasm - System.Globalization.StringInfo:ParseCombiningCharacters(System.String):System.Int32[]
         -15 (-2,58 % of base) : System.Private.CoreLib.dasm - System.IO.Path:Join(System.String[]):System.String
         -15 (-0,22 % of base) : System.Private.CoreLib.dasm - System.ReadOnlySpan`1:ToString():System.String:this (16 methods)
         -15 (-3,08 % of base) : System.Private.CoreLib.dasm - System.String:JoinCore(System.ReadOnlySpan`1[System.Char],System.Object[]):System.String
         -15 (-3,80 % of base) : System.Private.CoreLib.dasm - System.TypeNameParser:EscapeTypeName(System.String):System.String
         -13 (-1,89 % of base) : System.Private.CoreLib.dasm - System.Net.WebUtility:HtmlEncode(System.ReadOnlySpan`1[System.Char],byref)
         -12 (-11,11 % of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ValueListBuilder`1:Append(System.__Canon):this
         -12 (-4,07 % of base) : System.Private.CoreLib.dasm - System.DateTimeParse:TryParseQuoteString(System.ReadOnlySpan`1[System.Char],int,byref,byref):bool
         -12 (-1,92 % of base) : System.Private.CoreLib.dasm - System.IO.PathInternal:RemoveRelativeSegments(System.ReadOnlySpan`1[System.Char],int,byref):bool
         -12 (-3,30 % of base) : System.Private.CoreLib.dasm - System.String:MakeSeparatorList(System.String[],byref,byref):this
         -11 (-2,26 % of base) : System.Private.CoreLib.dasm - System.IO.PathInternal:NormalizeDirectorySeparators(System.String):System.String
         -11 (-0,94 % of base) : System.Private.CoreLib.dasm - System.Reflection.AssemblyNameParser:GetNextToken(byref):int:this
         -11 (-12,50 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:Append(System.String):this
         -11 (-16,42 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:Append(ushort):this
         -10 (-11,76 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:GrowAndAppend(ushort):this
          -9 (-1,65 % of base) : System.Private.CoreLib.dasm - System.String:Concat(System.Collections.Generic.IEnumerable`1[System.__Canon]):System.String
          -9 (-1,87 % of base) : System.Private.CoreLib.dasm - System.String:Concat(System.Collections.Generic.IEnumerable`1[System.String]):System.String
          -9 (-2,56 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:AppendSpanFormattable(double,System.String,System.IFormatProvider):this
          -9 (-3,31 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:AppendSpanFormattable(int,System.String,System.IFormatProvider):this
          -9 (-3,33 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:AppendSpanFormattable(System.__Canon,System.String,System.IFormatProvider):this
          -9 (-3,19 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:AppendSpanFormattable(System.DateTime,System.String,System.IFormatProvider):this
          -7 (-11,48 % of base) : System.Private.CoreLib.dasm - System.Collections.Generic.BitHelper:MarkBit(int):this
          -7 (-2,76 % of base) : System.Private.CoreLib.dasm - System.String:MakeSeparatorList(System.String,byref):this
          -6 (-10,34 % of base) : System.Private.CoreLib.dasm - System.Collections.Generic.BitHelper:IsMarked(int):bool:this
          -6 (-2,00 % of base) : System.Private.CoreLib.dasm - System.Number:FormatNumber(byref,byref,int,System.Globalization.NumberFormatInfo)
          -5 (-6,02 % of base) : System.Private.CoreLib.dasm - System.Text.StringBuilder:Append(ushort):System.Text.StringBuilder:this (2 methods)
          -2 (-0,57 % of base) : System.Private.CoreLib.dasm - System.Collections.Generic.KeyValuePair:PairToString(System.Object,System.Object):System.String
          -2 (-0,37 % of base) : System.Private.CoreLib.dasm - System.Globalization.SortKey:ToString():System.String:this
          -2 (-1,72 % of base) : System.Private.CoreLib.dasm - System.Globalization.TimeSpanParse:ParseExactDigits(byref,int,int,byref,byref):bool
          -2 (-0,18 % of base) : System.Private.CoreLib.dasm - System.Memory`1:ToString():System.String:this (3 methods)
          -2 (-0,18 % of base) : System.Private.CoreLib.dasm - System.ReadOnlyMemory`1:ToString():System.String:this (3 methods)
          -2 (-0,35 % of base) : System.Private.CoreLib.dasm - System.Reflection.LocalVariableInfo:ToString():System.String:this
          -2 (-0,46 % of base) : System.Private.CoreLib.dasm - System.Runtime.Loader.AssemblyLoadContext:ToString():System.String:this
          -2 (-0,52 % of base) : System.Private.CoreLib.dasm - System.Text.Rune:get_DebuggerDisplay():System.String:this
          -1 (-0,11 % of base) : System.Private.CoreLib.dasm - System.ComponentModel.Win32Exception:ToString():System.String:this
          -1 (-0,74 % of base) : System.Private.CoreLib.dasm - System.Runtime.CompilerServices.DefaultInterpolatedStringHandler:AppendLiteral(System.String):this

Top method regressions (percentages):
         498 (7,61 % of base) : System.Private.CoreLib.dasm - System.Number:NumberToStringFormat(byref,byref,System.ReadOnlySpan`1[System.Char],System.Globalization.NumberFormatInfo)

Top method improvements (percentages):
         -11 (-16,42 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:Append(ushort):this
         -11 (-12,50 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:Append(System.String):this
         -10 (-11,76 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:GrowAndAppend(ushort):this
          -7 (-11,48 % of base) : System.Private.CoreLib.dasm - System.Collections.Generic.BitHelper:MarkBit(int):this
         -12 (-11,11 % of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ValueListBuilder`1:Append(System.__Canon):this
         -75 (-11,06 % of base) : System.Private.CoreLib.dasm - StringSerializer:SerializeTransitionTime(System.TimeZoneInfo+TransitionTime,byref)
         -56 (-10,63 % of base) : System.Private.CoreLib.dasm - System.Number:FormatGeneral(byref,byref,int,System.Globalization.NumberFormatInfo,ushort,bool)
          -6 (-10,34 % of base) : System.Private.CoreLib.dasm - System.Collections.Generic.BitHelper:IsMarked(int):bool:this
         -41 (-9,60 % of base) : System.Private.CoreLib.dasm - RTDynamicMethod:ToString():System.String:this
         -17 (-9,24 % of base) : System.Private.CoreLib.dasm - StringSerializer:SerializeSubstitute(System.String,byref)
         -84 (-8,85 % of base) : System.Private.CoreLib.dasm - System.Number:NumberToString(byref,byref,ushort,int,System.Globalization.NumberFormatInfo)
         -45 (-8,46 % of base) : System.Private.CoreLib.dasm - System.Reflection.AssemblyNameFormatter:AppendQuoted(byref,System.String)
         -59 (-8,31 % of base) : System.Private.CoreLib.dasm - System.PasteArguments:AppendArgument(byref,System.String)
         -48 (-7,92 % of base) : System.Private.CoreLib.dasm - System.Reflection.RuntimeMethodInfo:ToString():System.String:this
        -115 (-7,65 % of base) : System.Private.CoreLib.dasm - StringSerializer:GetSerializedString(System.TimeZoneInfo):System.String
         -33 (-7,60 % of base) : System.Private.CoreLib.dasm - System.Reflection.RuntimePropertyInfo:ToString():System.String:this
        -534 (-7,16 % of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector128`1:ToString(System.String,System.IFormatProvider):System.String:this (12 methods)
         -77 (-7,11 % of base) : System.Private.CoreLib.dasm - System.Collections.Generic.HashSet`1:SymmetricExceptWithEnumerable(System.Collections.Generic.IEnumerable`1[System.__Canon]):this
         -26 (-7,10 % of base) : System.Private.CoreLib.dasm - System.Reflection.RuntimeConstructorInfo:ToString():System.String:this
         -63 (-6,98 % of base) : System.Private.CoreLib.dasm - System.ApplicationId:ToString():System.String:this
         -22 (-6,73 % of base) : System.Private.CoreLib.dasm - System.Number:FormatExponent(byref,System.Globalization.NumberFormatInfo,int,ushort,int,bool)
        -538 (-6,61 % of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector64`1:ToString(System.String,System.IFormatProvider):System.String:this (12 methods)
        -538 (-6,38 % of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector256`1:ToString(System.String,System.IFormatProvider):System.String:this (12 methods)
         -40 (-6,20 % of base) : System.Private.CoreLib.dasm - System.String:Replace(System.String,System.String):System.String:this
          -5 (-6,02 % of base) : System.Private.CoreLib.dasm - System.Text.StringBuilder:Append(ushort):System.Text.StringBuilder:this (2 methods)
         -43 (-5,75 % of base) : System.Private.CoreLib.dasm - System.Reflection.CustomAttributeData:ToString():System.String:this
         -36 (-5,00 % of base) : System.Private.CoreLib.dasm - System.Collections.Generic.HashSet`1:IntersectWithEnumerable(System.Collections.Generic.IEnumerable`1[System.__Canon]):this
         -19 (-4,92 % of base) : System.Private.CoreLib.dasm - System.String:MakeSeparatorListVectorized(byref,ushort,ushort,ushort):this
         -41 (-4,76 % of base) : System.Private.CoreLib.dasm - System.Collections.Generic.HashSet`1:CheckUniqueAndUnfoundElements(System.Collections.Generic.IEnumerable`1[System.__Canon],bool):System.ValueTuple`2[System.Int32, System.Int32]:this
         -17 (-4,67 % of base) : System.Private.CoreLib.dasm - System.Number:FormatScientific(byref,byref,int,System.Globalization.NumberFormatInfo,ushort)
         -32 (-4,65 % of base) : System.Private.CoreLib.dasm - System.IO.Enumeration.FileSystemName:TranslateWin32Expression(System.String):System.String
         -19 (-4,51 % of base) : System.Private.CoreLib.dasm - System.Number:FormatPercent(byref,byref,int,System.Globalization.NumberFormatInfo)
         -26 (-4,45 % of base) : System.Private.CoreLib.dasm - System.String:MakeSeparatorList(System.ReadOnlySpan`1[System.Char],byref):this
         -12 (-4,07 % of base) : System.Private.CoreLib.dasm - System.DateTimeParse:TryParseQuoteString(System.ReadOnlySpan`1[System.Char],int,byref,byref):bool
         -15 (-3,92 % of base) : System.Private.CoreLib.dasm - System.Globalization.StringInfo:ParseCombiningCharacters(System.String):System.Int32[]
         -16 (-3,86 % of base) : System.Private.CoreLib.dasm - System.Number:FormatCurrency(byref,byref,int,System.Globalization.NumberFormatInfo)
         -15 (-3,80 % of base) : System.Private.CoreLib.dasm - System.TypeNameParser:EscapeTypeName(System.String):System.String
         -18 (-3,75 % of base) : System.Private.CoreLib.dasm - System.Reflection.MethodBase:AppendParameters(byref,System.Type[],int)
         -60 (-3,72 % of base) : System.Private.CoreLib.dasm - System.Reflection.AssemblyNameFormatter:ComputeDisplayName(System.String,System.Version,System.String,System.Byte[],int,int):System.String
          -9 (-3,33 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:AppendSpanFormattable(System.__Canon,System.String,System.IFormatProvider):this
          -9 (-3,31 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:AppendSpanFormattable(int,System.String,System.IFormatProvider):this
         -12 (-3,30 % of base) : System.Private.CoreLib.dasm - System.String:MakeSeparatorList(System.String[],byref,byref):this
          -9 (-3,19 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:AppendSpanFormattable(System.DateTime,System.String,System.IFormatProvider):this
         -26 (-3,16 % of base) : System.Private.CoreLib.dasm - System.Globalization.CalendarData:NormalizeDatePattern(System.String):System.String
         -27 (-3,13 % of base) : System.Private.CoreLib.dasm - System.String:Join(System.String,System.Collections.Generic.IEnumerable`1[System.String]):System.String
         -15 (-3,08 % of base) : System.Private.CoreLib.dasm - System.String:JoinCore(System.ReadOnlySpan`1[System.Char],System.Object[]):System.String
         -19 (-2,94 % of base) : System.Private.CoreLib.dasm - System.IO.Path:Combine(System.String[]):System.String
         -27 (-2,85 % of base) : System.Private.CoreLib.dasm - System.Net.WebUtility:HtmlDecode(System.ReadOnlySpan`1[System.Char],byref)
          -7 (-2,76 % of base) : System.Private.CoreLib.dasm - System.String:MakeSeparatorList(System.String,byref):this
         -18 (-2,69 % of base) : System.Private.CoreLib.dasm - System.String:ReplaceLineEndings(System.String):System.String:this
         -28 (-2,64 % of base) : System.Private.CoreLib.dasm - System.Number:FormatFixed(byref,byref,int,System.Int32[],System.String,System.String)
         -20 (-2,62 % of base) : System.Private.CoreLib.dasm - System.PasteArguments:Paste(System.Collections.Generic.IEnumerable`1[System.String],bool):System.String
         -15 (-2,58 % of base) : System.Private.CoreLib.dasm - System.IO.Path:Join(System.String[]):System.String
          -9 (-2,56 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:AppendSpanFormattable(double,System.String,System.IFormatProvider):this
         -16 (-2,42 % of base) : System.Private.CoreLib.dasm - TimeSpanTokenizer:GetNextToken():System.Globalization.TimeSpanParse+TimeSpanToken:this
         -11 (-2,26 % of base) : System.Private.CoreLib.dasm - System.IO.PathInternal:NormalizeDirectorySeparators(System.String):System.String
          -6 (-2,00 % of base) : System.Private.CoreLib.dasm - System.Number:FormatNumber(byref,byref,int,System.Globalization.NumberFormatInfo)
         -12 (-1,92 % of base) : System.Private.CoreLib.dasm - System.IO.PathInternal:RemoveRelativeSegments(System.ReadOnlySpan`1[System.Char],int,byref):bool
         -13 (-1,89 % of base) : System.Private.CoreLib.dasm - System.Net.WebUtility:HtmlEncode(System.ReadOnlySpan`1[System.Char],byref)
          -9 (-1,87 % of base) : System.Private.CoreLib.dasm - System.String:Concat(System.Collections.Generic.IEnumerable`1[System.String]):System.String
         -16 (-1,81 % of base) : System.Private.CoreLib.dasm - System.String:JoinCore(System.ReadOnlySpan`1[System.Char],System.Collections.Generic.IEnumerable`1[System.__Canon]):System.String
         -16 (-1,75 % of base) : System.Private.CoreLib.dasm - StringSerializer:GetNextStringValue():System.String:this
          -2 (-1,72 % of base) : System.Private.CoreLib.dasm - System.Globalization.TimeSpanParse:ParseExactDigits(byref,int,int,byref,byref):bool
         -34 (-1,72 % of base) : System.Private.CoreLib.dasm - System.Reflection.CustomAttributeTypedArgument:ToString(bool):System.String:this
          -9 (-1,65 % of base) : System.Private.CoreLib.dasm - System.String:Concat(System.Collections.Generic.IEnumerable`1[System.__Canon]):System.String
         -16 (-1,50 % of base) : System.Private.CoreLib.dasm - System.IO.Path:GetRelativePath(System.String,System.String,int):System.String
         -11 (-0,94 % of base) : System.Private.CoreLib.dasm - System.Reflection.AssemblyNameParser:GetNextToken(byref):int:this
         -18 (-0,89 % of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:AppendFormatHelper(System.IFormatProvider,System.String,System.ParamsArray):this
          -1 (-0,74 % of base) : System.Private.CoreLib.dasm - System.Runtime.CompilerServices.DefaultInterpolatedStringHandler:AppendLiteral(System.String):this
          -2 (-0,57 % of base) : System.Private.CoreLib.dasm - System.Collections.Generic.KeyValuePair:PairToString(System.Object,System.Object):System.String
          -2 (-0,52 % of base) : System.Private.CoreLib.dasm - System.Text.Rune:get_DebuggerDisplay():System.String:this
          -2 (-0,46 % of base) : System.Private.CoreLib.dasm - System.Runtime.Loader.AssemblyLoadContext:ToString():System.String:this
          -2 (-0,37 % of base) : System.Private.CoreLib.dasm - System.Globalization.SortKey:ToString():System.String:this
          -2 (-0,35 % of base) : System.Private.CoreLib.dasm - System.Reflection.LocalVariableInfo:ToString():System.String:this
         -19 (-0,22 % of base) : System.Private.CoreLib.dasm - System.Span`1:ToString():System.String:this (20 methods)
         -15 (-0,22 % of base) : System.Private.CoreLib.dasm - System.ReadOnlySpan`1:ToString():System.String:this (16 methods)
          -2 (-0,18 % of base) : System.Private.CoreLib.dasm - System.Memory`1:ToString():System.String:this (3 methods)
          -2 (-0,18 % of base) : System.Private.CoreLib.dasm - System.ReadOnlyMemory`1:ToString():System.String:this (3 methods)
          -1 (-0,11 % of base) : System.Private.CoreLib.dasm - System.ComponentModel.Win32Exception:ToString():System.String:this

80 total methods with Code Size differences (79 improved, 1 regressed), 25045 unchanged.

--------------------------------------------------------------------------------

System.Number:NumberToStringFormat wasn't touched, so maybe is the regression due to different inlinees?


Other issues related to this PR:

Author: gfoidl
Assignees: -
Labels:

area-CodeGen-coreclr, community-contribution

Milestone: -

@JulieLeeMSFT
Copy link
Member

@EgorBo Please review the community PR.

@JulieLeeMSFT
Copy link
Member

@gfoidl There is a merge conflict. Please resolve it in order to merge the code. Thanks.

@JulieLeeMSFT JulieLeeMSFT added the needs-author-action An issue or pull request that requires more info or actions from the author. label May 23, 2022
@ghost ghost removed the needs-author-action An issue or pull request that requires more info or actions from the author. label May 23, 2022
@gfoidl
Copy link
Member Author

gfoidl commented May 23, 2022

@JulieLeeMSFT conflict resolved.

@jeffhandley
Copy link
Member

@tannergooding could you take a look at this next week please?

@JulieLeeMSFT
Copy link
Member

Ping @tannergooding for the stale community PR review.

@@ -19,19 +19,21 @@ internal BitHelper(Span<int> span, bool clear)

internal void MarkBit(int bitPosition)
{
Copy link
Member

@tannergooding tannergooding Jul 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a Debug.Assert(bitPosition >= 0)? Same for the other two methods here.

The previous logic, would've done say -1 / 32 == 0 and now will do uint.MaxValue / 32 = 134217727

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might even be better to just keep this one "as is" since its really doing the same thing, just with more casts now (that is comparing uint to uint)

}
}

internal bool IsMarked(int bitPosition)
{
int bitArrayIndex = bitPosition / IntSize;
uint bitArrayIndex = (uint)bitPosition / IntSize;
Copy link
Member

@tannergooding tannergooding Jul 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same for this one, its doing the same thing but via "more code" now. I'd also expect the JIT was already hoisting the span since its a field of a struct.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not the same thing. Unsigned division is a simple shift. Signed division is more complicated.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd also expect the JIT was already hoisting the span since its a field of a struct.

Unfortunately that's not the case*. Codegen was validated for this change.
* (at least when the PR was created, at the moment I don't have enough time to check again)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The operation is "the same" even if the codegen is different. If we're doing this specifically to take advantage of the codegen, there should be a comment on it.

More generally, we also need some assert that the input is definitely positive -or- should be taking the input as uint and expecting the caller handle the validation/assertions. The current usages all look to be but its also always the output of GetIndex so without the corresponding outer checks, it'd be pretty easy for someone to accidentally pass in -1 here from some refactoring.

If we're going to be making the method less readable, we should include a comment explaining why these tricks are being used (just as all the current cases have a small // uint cast, per https://github.com/dotnet/runtime/issues/10596 or similar comment). Ideally the issue tracks ways to improve the JIT handling where we can't improve it ourselves via signature changes, locals, or other tweaks.

@@ -591,7 +591,7 @@ public static bool IsControl(string s, int index)
{
ThrowHelper.ThrowArgumentNullException(ExceptionArgument.s);
}
if (((uint)index) >= ((uint)s.Length))
if ((uint)index >= (uint)s.Length)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These ones seem to be touching code "stylistically", we probably should exclude it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

W/o the superfluos paranthesis it's easier to read, thus I changed it.

Copy link
Member

@tannergooding tannergooding Jul 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't disagree, but its also a "stylistic" change and so makes it harder to review the thing we actually care about. The typical PR policy is to not include such things and leave them to a separate PR, especially when its the only or primary change to a given file.

some operand swapping, etc. was done to get proper codegen. By reverting these to only have a mechanical remove of the uint-casts, we give up some (runtime) benefits.

We should have issues tracking these. We really shouldn't be getting codegen differences between x < y and y > x, especially for the cases where one input is constant.

@tannergooding
Copy link
Member

There is a lot of stylistic change and even operand swapping in this PR that makes it harder to see where its "just removing" the uint cast. Ideally this would be minified down to just the cases where we are removing now unnecessary casts to improve readability.

@gfoidl
Copy link
Member Author

gfoidl commented Jul 11, 2022

There is a lot of stylistic change and even operand swapping in this PR

The changes were motivated by removing the uint-casts. By doing jit-diffs, to prevent regressions, some operand swapping, etc. was done to get proper codegen. By reverting these to only have a mechanical remove of the uint-casts, we give up some (runtime) benefits.

Copy link
Member

@tannergooding tannergooding left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look correct to me.

However, I think we are missing some tracking issues and in code comments explaining that we're introducing new tweaks to improve codegen.

There are also, notably, several places that are being touched where its not clear that the fix is related to "Removed JIT (uint) cast workaround to elide bound-checks in CoreLib". Several look to be new optimizations/improvements that would've ideally been in separate PRs so the change can be more easily tracked and validated.

Debug.Assert(bitPosition >= 0);

uint bitArrayIndex = (uint)bitPosition / IntSize;
Span<int> span = _span;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one can use a tracking issue. The JIT should be smart enough to make it unnecessary to cache the Span here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Filed #72004 for this.

@gfoidl
Copy link
Member Author

gfoidl commented Jul 12, 2022

@tannergooding I totally agree that PRs should be for isolated changes ideally. That's my prefered policy too.

So why not here? When working on this PR there were lots of files touched, codegen inspected, so some low-hanging-fruits were found and collected. Being in a mental tunnel then, looking for ideal codegen, I wasn't able to draw a distinction between the PR's goal and the actual changes of the PR.
This is not a justification, just my memories on how it came to this.
If this feedback (split to distinct PRs) would have come earlier, it would also be easier to manage for me, as all test-setup, etc. is right there, and thoughts about that code are fresh (in the brain).

Nevertheless, if you'd like to split the other parts out, I'll do it. But not before two weeks due to my schedule.

@jkotas jkotas merged commit 6e33ec9 into dotnet:main Jul 12, 2022
@gfoidl gfoidl deleted the remove-bound-check-hack branch July 12, 2022 18:09
@ghost ghost locked as resolved and limited conversation to collaborators Aug 11, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants