.NET 6 - R2R Composite image crashes the app #65879

Closed
FirehawkV21 opened this issue May 26, 2021 · 25 comments

@FirehawkV21

  • .NET Core Version: 6.0.0-preview.4.21253.7
  • Windows version: Windows 10 (Build 19043.1023)
  • Does the bug reproduce also in WPF for .NET Framework 4.8?: No
  • Is this bug related specifically to tooling in Visual Studio (e.g. XAML Designer, Code editing, etc...)? No.

Problem description:
When <PublishReadyToRunComposite> is set to true, the WPF app crashes as soon as it calls into System.Windows.Media.FontFamily.

Actual behavior:
The app crashes at startup.

Expected behavior:
The app starts normally.

Minimal repro:

  • Create a simple WPF app (with a bit of text and buttons).
  • Add <PublishReadyToRunComposite> and <PublishReadyToRun> to the csproj file and set both to true.
  • Run dotnet publish -c Release -r win-x64 (it may also occur on win-x86, but I haven't tested that).
@FirehawkV21 FirehawkV21 added the untriaged New issue has not been triaged by the area owner label May 26, 2021
@ryalanms ryalanms removed the untriaged New issue has not been triaged by the area owner label Jun 3, 2021
@ryalanms ryalanms self-assigned this Jun 3, 2021
@ryalanms
Member

ryalanms commented Jun 3, 2021

@acemod13: Could you share the entire .csproj? Thanks.

@FirehawkV21
Author

Test.zip
Here's a small project for testing.

@singhashish-wpf
Member

singhashish-wpf commented Jan 12, 2022

@FirehawkV21 Do you still see the crash when you publish with --self-contained true?
I ask because PublishReadyToRunComposite is only supported with self-contained deployment in .NET 6.

@lindexi
Member

lindexi commented Jan 13, 2022

How about adding the following to your csproj file?

  <ItemGroup>
    <!--
    The SDK will precompile the assemblies that are distributed with the application. For self-contained applications, this set of assemblies will include the framework. C++/CLI binaries are not eligible for ReadyToRun compilation.
    -->
    <PublishReadyToRunExclude Include="DirectWriteForwarder.dll" />
  </ItemGroup>

See https://docs.microsoft.com/en-us/dotnet/core/deploying/ready-to-run#:~:text=In%20.NET%206%2C%20Composite%20ReadyToRun%20is%20only%20supported%20for%20self%2Dcontained%20deployment.

@lindexi
Member

lindexi commented Jan 13, 2022

Sorry, PublishReadyToRunExclude does not help here.

@singhashish-wpf
Member

@FirehawkV21 Could you please confirm if the crash is observed even with --self-contained?

@FirehawkV21
Author

I can confirm that the app still crashes. Even with --self-contained.

@lindexi
Member

lindexi commented Jan 20, 2022

I can confirm that the app still crashes. Even with --self-contained.

Me too.

@dotnet-issue-labeler dotnet-issue-labeler bot added the untriaged New issue has not been triaged by the area owner label Feb 25, 2022
@dotnet-issue-labeler

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@mangod9
Member

mangod9 commented Jul 6, 2022

Hi @FirehawkV21 does this still repro on .NET 6?

@mangod9 mangod9 removed the untriaged New issue has not been triaged by the area owner label Jul 6, 2022
@mangod9 mangod9 added this to the Future milestone Jul 6, 2022
@FirehawkV21
Author

@mangod9 , It still happens in .NET 6. Tested with SDK 6.0.302.

@mangod9
Member

mangod9 commented Jul 18, 2022

@AntonLapounov could you please take a look?

@mangod9 mangod9 removed this from the Future milestone Jul 18, 2022
@mangod9 mangod9 added this to the 7.0.0 milestone Jul 18, 2022
@AntonLapounov
Member

AntonLapounov commented Jul 18, 2022

The issue is caused by an incorrect layout of the module's static fields used to initialize DirectWriteForwarder.dll. The _initterm_m call below invokes all initializer functions placed between __xc_ma_a and __xc_ma_z:

.method assembly static void  '<CrtImplementationDetails>.LanguageSupport.InitializePerAppDomain'(valuetype '<CrtImplementationDetails>'.LanguageSupport* modopt([System.Runtime]System.Runtime.CompilerServices.IsConst) modopt([System.Runtime]System.Runtime.CompilerServices.IsConst) A_0) cil managed
{
  ...
  ldsflda    valuetype '<CppImplementationDetails>'.__xc_ma_a$$BY0A@Q6MPEBXXZ modopt([System.Runtime]System.Runtime.CompilerServices.IsConst) __xc_ma_a
  ldsflda    valuetype '<CppImplementationDetails>'.__xc_ma_z$$BY0A@Q6MPEBXXZ modopt([System.Runtime]System.Runtime.CompilerServices.IsConst) __xc_ma_z
  call       void _initterm_m(method void modopt([System.Runtime]System.Runtime.CompilerServices.IsConst)* *() modopt([System.Runtime]System.Runtime.CompilerServices.IsConst)*,
                              method void modopt([System.Runtime]System.Runtime.CompilerServices.IsConst)* *() modopt([System.Runtime]System.Runtime.CompilerServices.IsConst)*)
  ...
}

In the generated R2R code for the non-composite mode this list is non-empty — the range 0x9870..0xa2d0 contains 331 initializer functions:

void <Module>.<CrtImplementationDetails>.LanguageSupport.InitializePerAppDomain(<CrtImplementationDetails>.LanguageSupport* modopt(System.Runtime.CompilerServices.IsConst) modopt(System.Runtime.CompilerServices.IsConst))
    ...
    lea     rsi, [0x9870]
L1:
    mov     rcx, qword ptr [rsi]
    test    rcx, rcx
    je      L2
    call    qword ptr [...]   // method void modopt(System.Runtime.CompilerServices.IsConst)* *() <Module>.<CrtImplementationDetails>.ThisModule.ResolveMethod<void const * __clrcall(void)>(method void modopt(System.Runtime.CompilerServices.IsConst)* *()) (METHOD_ENTRY_DEF_TOKEN)
    call    rax
L2:
    add     rsi, 8
    lea     rax, [0xa2d0]
    cmp     rsi, rax
    jb      L1
    ...

However, in the generated R2R for the composite mode, it becomes empty — the range 0xf82098..0xf820a0 contains only the initial zero pointer:

void <Module>.<CrtImplementationDetails>.LanguageSupport.InitializePerAppDomain(<CrtImplementationDetails>.LanguageSupport* modopt(System.Runtime.CompilerServices.IsConst) modopt(System.Runtime.CompilerServices.IsConst))
    ...
    lea     rsi, [0xf82098]
L1:
    mov     rcx, qword ptr [rsi]
    test    rcx, rcx
    je      L2
    call    qword ptr [...] // method void modopt(System.Runtime.CompilerServices.IsConst)* *() <Module>.<CrtImplementationDetails>.ThisModule.ResolveMethod<void const * __clrcall(void)>(method void modopt(System.Runtime.CompilerServices.IsConst)* *()) (METHOD_ENTRY_DEF_TOKEN)
    call    rax
L2:
    add     rsi, 8
    lea     rax, [0xf820a0]
    cmp     rsi, rax
    jb      L1
    ...

@trylek, do you know what part of the compiler may be responsible for that?

The crash is caused by __uuidof(IDWriteFactory) not being initialized in the Factory::Initialize function: https://github.com/dotnet/wpf/blob/f5c5469a67aeafb9548b2b6300f0601413685dd9/src/Microsoft.DotNet.Wpf/src/DirectWriteForwarder/CPP/DWriteWrapper/Factory.cpp#L84-L88

@trylek
Member

trylek commented Jul 18, 2022

Thanks Anton for sharing the detailed analysis. The code dealing with static fields is here:

public override ComputedStaticFieldLayout ComputeStaticFieldLayout(DefType defType, StaticLayoutKind layoutKind)

This C# implementation must match the C++ CoreCLR runtime code residing at:

VOID MethodTableBuilder::PlaceRegularStaticFields()

VOID MethodTableBuilder::PlaceThreadStaticFields()

in combination with

void Module::BuildStaticsOffsets(AllocMemTracker *pamTracker)

The process has two phases. First, when a module is loaded, the code in ceeload.cpp does a quick pass over statics and calculates the sizes of the arrays to allocate for this module (DomainLocalModule and ThreadLocalModule). Please note that in this phase "other modules" are generally not available, so the method cannot count on the exact sizes of all types; that's why it uses various tricks like boxing structs. Second, methodtablebuilder calculates the exact static layout for a single class.

@trylek
Member

trylek commented Jul 18, 2022

If I understand you correctly, the discrepancy is related to the special <ModuleType> type. That can easily be a problem: normal C# code doesn't use this type, so I believe we have next to zero Crossgen2 coverage for the rare situation of ModuleType containing fields. Technically it should work just like any other type, but I can easily imagine it getting silently skipped for some reason.

@EgemenCiftci

EgemenCiftci commented Nov 11, 2022

It still happens in .NET 7.

Publish Configuration:
Self-contained
win-x86 or win-x64
Produce single file => True
Enable ReadyToRun compilation => True

Details:
CoreCLR Version: 7.0.22.51805
.NET Version: 7.0.0
Description: The process was terminated due to an unhandled exception.
Exception Info: System.TypeInitializationException: The type initializer for 'System.Windows.Media.FontFamily' threw an exception.
---> System.TypeInitializationException: The type initializer for 'MS.Internal.FontCache.DWriteFactory' threw an exception.
---> System.InvalidCastException: Specified cast is not valid.
at MS.Internal.Text.TextInterface.Native.Util.ConvertHresultToException(Int32 hr)
at MS.Internal.Text.TextInterface.Factory.Initialize(FactoryType factoryType)
at MS.Internal.Text.TextInterface.Factory..ctor(FactoryType factoryType, IFontSourceCollectionFactory fontSourceCollectionFactory, IFontSourceFactory fontSourceFactory)
at MS.Internal.FontCache.DWriteFactory..cctor()
--- End of inner exception stack trace ---
at MS.Internal.FontCache.DWriteFactory.get_SystemFontCollection()
at System.Windows.Media.FontFamily..cctor()
--- End of inner exception stack trace ---
at MS.Internal.Text.DynamicPropertyReader.GetTypeface(DependencyObject element)
at MS.Internal.Text.TextProperties.InitCommon(DependencyObject target)
at MS.Internal.Text.TextProperties..ctor(FrameworkElement target, Boolean isTypographyDefaultValue)
at System.Windows.Controls.TextBlock.GetLineProperties()
at System.Windows.Controls.TextBlock.EnsureTextBlockCache()
at System.Windows.Controls.TextBlock.MeasureOverride(Size constraint)
at System.Windows.FrameworkElement.MeasureCore(Size availableSize)
at System.Windows.UIElement.Measure(Size availableSize)
at MS.Internal.Helper.MeasureElementWithSingleChild(UIElement element, Size constraint)
at System.Windows.FrameworkElement.MeasureCore(Size availableSize)
at System.Windows.UIElement.Measure(Size availableSize)
at System.Windows.Controls.Border.MeasureOverride(Size constraint)
at System.Windows.FrameworkElement.MeasureCore(Size availableSize)
at System.Windows.UIElement.Measure(Size availableSize)
at System.Windows.Controls.Control.MeasureOverride(Size constraint)
at System.Windows.FrameworkElement.MeasureCore(Size availableSize)
at System.Windows.UIElement.Measure(Size availableSize)
at System.Windows.Controls.Grid.MeasureOverride(Size constraint)
at System.Windows.FrameworkElement.MeasureCore(Size availableSize)
at System.Windows.UIElement.Measure(Size availableSize)
at MS.Internal.Helper.MeasureElementWithSingleChild(UIElement element, Size constraint)
at System.Windows.FrameworkElement.MeasureCore(Size availableSize)
at System.Windows.UIElement.Measure(Size availableSize)
at System.Windows.Controls.Decorator.MeasureOverride(Size constraint)
at System.Windows.Documents.AdornerDecorator.MeasureOverride(Size constraint)
at System.Windows.FrameworkElement.MeasureCore(Size availableSize)
at System.Windows.UIElement.Measure(Size availableSize)
at System.Windows.Controls.Border.MeasureOverride(Size constraint)
at System.Windows.FrameworkElement.MeasureCore(Size availableSize)
at System.Windows.UIElement.Measure(Size availableSize)
at System.Windows.Window.MeasureOverrideHelper(Size constraint)
at System.Windows.Window.MeasureOverride(Size availableSize)
at System.Windows.FrameworkElement.MeasureCore(Size availableSize)
at System.Windows.UIElement.Measure(Size availableSize)
at System.Windows.Interop.HwndSource.SetLayoutSize()
at System.Windows.Interop.HwndSource.set_RootVisualInternal(Visual value)
at System.Windows.Window.SetRootVisualAndUpdateSTC()
at System.Windows.Window.SetupInitialState(Double requestedTop, Double requestedLeft, Double requestedWidth, Double requestedHeight)
at System.Windows.Window.CreateSourceWindow(Boolean duringShow)
at System.Windows.Window.ShowHelper(Object booleanBox)
at System.Windows.Threading.ExceptionWrapper.InternalRealCall(Delegate callback, Object args, Int32 numArgs)
at System.Windows.Threading.ExceptionWrapper.TryCatchWhen(Object source, Delegate callback, Object args, Int32 numArgs, Delegate catchHandler)
at System.Windows.Threading.DispatcherOperation.InvokeImpl()
at MS.Internal.CulturePreservingExecutionContext.CallbackWrapper(Object obj)
at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
--- End of stack trace from previous location ---
at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
at MS.Internal.CulturePreservingExecutionContext.Run(CulturePreservingExecutionContext executionContext, ContextCallback callback, Object state)
at System.Windows.Threading.DispatcherOperation.Invoke()
at System.Windows.Threading.Dispatcher.ProcessQueue()
at System.Windows.Threading.Dispatcher.WndProcHook(IntPtr hwnd, Int32 msg, IntPtr wParam, IntPtr lParam, Boolean& handled)
at MS.Win32.HwndWrapper.WndProc(IntPtr hwnd, Int32 msg, IntPtr wParam, IntPtr lParam, Boolean& handled)
at MS.Win32.HwndSubclass.DispatcherCallbackOperation(Object o)
at System.Windows.Threading.ExceptionWrapper.InternalRealCall(Delegate callback, Object args, Int32 numArgs)
at System.Windows.Threading.ExceptionWrapper.TryCatchWhen(Object source, Delegate callback, Object args, Int32 numArgs, Delegate catchHandler)
at System.Windows.Threading.Dispatcher.LegacyInvokeImpl(DispatcherPriority priority, TimeSpan timeout, Delegate method, Object args, Int32 numArgs)
at MS.Win32.HwndSubclass.SubclassWndProc(IntPtr hwnd, Int32 msg, IntPtr wParam, IntPtr lParam)
at MS.Win32.UnsafeNativeMethods.DispatchMessage(MSG& msg)
at System.Windows.Threading.Dispatcher.PushFrameImpl(DispatcherFrame frame)
at System.Windows.Application.RunDispatcher(Object ignore)
at System.Windows.Application.RunInternal(Window window)
at WpfApp2.App.Main()

@trylek trylek assigned trylek and unassigned AntonLapounov Nov 15, 2022
@trylek
Member

trylek commented Nov 20, 2022

I think I finally understand what's going on; most of it was actually discovered by Anton this summer. The problem is the algorithm for copying RVA fields in Crossgen2. They are basically copied in random order based on when they are encountered in the dependency analysis process, which doesn't work for Managed C++, which uses the C++ style of global initializers: an array of method pointers that is traversed in order by the method _initterm_m.

Interestingly enough even the Field RVA ECMA metadata table doesn't guarantee the fields to come in RVA order so that, when constructing the CopiedMetadataBlobNode while rewriting component assemblies in composite mode, we also have the potential to mess up ordering of the RVA fields.

At this moment I believe that the correct fix is to introduce a new node type representing the entire file of copied RVA fields for a given module that will be internally used by CopiedMetadataBlobNode and by CopiedFieldRvaNode so that, by the time dependency analysis has finished and we're about to start laying out the output file, we make sure to emit the marked RVA fields for each module in their RVA order. As next step I'm going to work on the actual fix.

@trylek
Member

trylek commented Nov 20, 2022

I have realized there were several holes in my above reasoning so I investigated the matter further. The problem is not ordering of the RVA fields as I originally thought, that should be dealt with by means of CopiedRvaFieldNode comparison. The fundamental issue has to do with the core logic we use to access RVA fields. When R2R-compiling a single assembly, we potentially relocate RVA fields in CopiedMetadataBlob but we always retain all of them. On the other hand, when compiling the R2R composite image, we selectively copy those RVA fields that get marked in the dependency analysis to the output image - while we use the correct order, we lose the individual initializer method pointers between __xc_ma_a and __xc_ma_z because they aren't explicitly visible to the dependency analysis.

In light of this fact I believe the problem is caused by an incomplete design w.r.t. this corner case, i.e. it requires additional design work, not a mere hotfix. In fact the only way to hotfix this would be to duplicate the RVA field file in the composite image, doubling its size in the R2R compilation, and I think that is wasteful and unnecessary. By virtue of the R2R design we always have the primary component assemblies available at runtime, even in composite mode, so we shouldn't need to copy the RVA fields over to the composite image; we can (and should) access them directly in the primary component assembly metadata blobs.

I think this should be dealt with by means of a new RVA field fixup comprising the ECMA module and index of the RVA field in its metadata that would return the address of the field within the component assembly module via indirection into its RVA field metadata table. The JIT interface would then simply state that the address of the RVA field is accessible via this new fixup and it would no longer need to copy RVA fields to the composite image.

In non-composite build mode this ends up as a mere optimization where we directly know the RVA of the field based on the (copied) metadata blob. We could theoretically use the same optimization in composite mode but it would be more tricky as it would require the composite compilation to have knowledge of the process of rewriting the individual component assemblies so that it could use the rewritten RVA of the field in question.

For now I tend to think that is undesirable as otherwise rewriting of the component assemblies is completely orthogonal to composite compilation and can be potentially parallelized; I guess it is also somewhat questionable why we relocate the RVA fields at all during single-assembly compilation. From a broader perspective I would assume we should strive to mostly keep the original ECMA module intact and that includes the location of RVA fields (that can be potentially hard-coded in some native Managed C++ code although we don't support that scenario just yet). If we fixed single-module compilation to stop tampering with RVA field addresses, the described optimization could be easily implemented in composite mode too.

@dragnilar

We're having this problem too when we try to publish with Single File and R2R enabled together.

In our application we have a XAML markup extension that changes the font application wide to Segoe UI (we include Segoe UI's font file inside our application as a resource). On startup our application tries to display its splash screen and the extension gets called to change the font on the splash screen. Before the splash screen can even be displayed, the application now crashes with the stack trace being identical to @EgemenCiftci's report.

What is interesting is that the problem doesn't happen with .NET 6, but it does occur with .NET 7. For now we've worked around this by turning off R2R.

@harikrishnan-p-v

The problem still exists. Once the target framework is changed from 7.0 to 6.0, it works.

@harikrishnan-p-v

@EgemenCiftci The problem is now resolved with the latest Dotnet 7 update.

@trylek
Member

trylek commented Apr 11, 2023

Hmm, it's great to hear that you're unblocked, but I'm not aware of any recent changes to Crossgen2 that would contribute to this. I believe the core underlying problem with RVA fields I described above still exists; I'm now working on fixing it fully in .NET 8.

@trylek trylek mentioned this issue May 3, 2023
46 tasks
@dragnilar

@trylek Indeed it's still there (just verified it with 7.0.8). Do you still anticipate this will get fixed for .NET 8, or do you think we may have to wait until .NET 9? As you said on #85736, it looks like there is still a lot of work left.

@trylek
Member

trylek commented Aug 1, 2023

Hello @dragnilar!

I believe I fixed the underlying issue with the PR #78723 that got merged in May; in other words, it should be included in the latest .NET 8 previews (starting with Preview 5, I believe). It would be great if you or someone on this thread could confirm that the bug is indeed gone in .NET 8 so that we can close this issue.

Thanks

Tomas

@trylek
Member

trylek commented Aug 8, 2023

Closed as presumably fixed; please reopen or create a new issue if you're still hitting this.

@trylek trylek closed this as completed Aug 8, 2023
@ghost ghost locked as resolved and limited conversation to collaborators Sep 8, 2023