Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[mono][wasm] TrimMode=full is trimming UnmanagedCallersOnly #101434

Open
mkhamoyan opened this issue Apr 23, 2024 · 23 comments
Open

[mono][wasm] TrimMode=full is trimming UnmanagedCallersOnly #101434

mkhamoyan opened this issue Apr 23, 2024 · 23 comments
Assignees
Labels
arch-wasm WebAssembly architecture area-Tools-ILLink .NET linker development as well as trimming analyzers os-browser Browser variant of arch-wasm os-wasi Related to WASI variant of arch-wasm
Milestone

Comments

@mkhamoyan
Copy link
Contributor

Description

After #100288 still for browser UnmanagedCallersOnly is not called after publish.

Reproduction Steps

create browser sample app

   public unsafe partial class Test
   {
       public unsafe static int Main(string[] args)
       {
           UnmanagedFunc();
           return 0;
       }

     [UnmanagedCallersOnly(EntryPoint = "ManagedFunc")]
     public static int ManagedFunc(int number)
     {
         // called from UnmanagedFunc
         Console.WriteLine($"ManagedFunc({number}) -> 42");
         return 42;
     }

     [DllImport("local", EntryPoint = "UnmanagedFunc")]
     public static extern void UnmanagedFunc(); // calls ManagedFunc
  }

and in local.c

#include <stdio.h>

int ManagedFunc(int number);

void UnmanagedFunc()
{
    int ret = 0;
    printf("UnmanagedFunc calling ManagedFunc\n");
    ret = ManagedFunc(123);
    printf("ManagedFunc returned %d\n", ret);
}

Expected behavior

ManagedFunc should be called

Actual behavior

No build errors, in pinvoke-table.h wrapper is there, but ManagedFunc is not being called.

Regression?

No response

Known Workarounds

Calling it explicitly

((IntPtr)(delegate* unmanaged<int,int>)&ManagedFunc).ToString();

Configuration

No response

Other information

No response

@mkhamoyan mkhamoyan added arch-wasm WebAssembly architecture os-browser Browser variant of arch-wasm labels Apr 23, 2024
@mkhamoyan mkhamoyan self-assigned this Apr 23, 2024
Copy link
Contributor

Tagging subscribers to 'arch-wasm': @lewing
See info in area-owners.md if you want to be subscribed.

@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Apr 23, 2024
Copy link
Contributor

Tagging subscribers to this area: @dotnet/interop-contrib
See info in area-owners.md if you want to be subscribed.

@lewing
Copy link
Member

lewing commented Apr 24, 2024

NativeAOT appears to use UnmanagedCallersOnlyAssembly which feeds into

public class UnmanagedEntryPointsRootProvider : ICompilationRootProvider
@sbomer is there something similar for the trimmer?

@lewing lewing added area-Tools-ILLink .NET linker development as well as trimming analyzers os-wasi Related to WASI variant of arch-wasm and removed untriaged New issue has not been triaged by the area owner labels Apr 24, 2024
Copy link
Contributor

Tagging subscribers to this area: @dotnet/interop-contrib
See info in area-owners.md if you want to be subscribed.

@lewing lewing added this to the 9.0.0 milestone Apr 24, 2024
Copy link
Contributor

Tagging subscribers to this area: @agocke, @sbomer, @vitek-karas
See info in area-owners.md if you want to be subscribed.

@sbomer
Copy link
Member

sbomer commented Apr 24, 2024

ILLink doesn't have such support. Do we need to add support for UnmanagedCallersOnlyAssembly for this scenario to work? Or could the tooling that generates the interop logic for the UnmanagedCallersOnly method be made to pass along some extra roots to ILLink?

@MichalStrehovsky @AaronRobinsonMSFT

@MichalStrehovsky
Copy link
Member

NativeAOT appears to use UnmanagedCallersOnlyAssembly which feeds into

NativeAOT would only export these if we're building a library. Is this building an executable that also acts as a library? Feels like a pretty special thing and I agree with Sven that maybe whatever is generating pinvoke-table.h could also generate a roots descriptor that could be passed to illink to keep these.

@MichalStrehovsky
Copy link
Member

Alternatively if we expect to support trimmed native libraries at some point, a general purpose mechanism could be added to illink in preparation for that - I could see a new command argument allowing to specify a list of assemblies where UnmanagedCallersOnly methods with a non-empty EntryPoint should be considered as roots. This command line argument could be used for libraries, but also for these executables that also export things.

@agocke agocke added area-Interop-mono and removed area-System.Runtime.InteropServices area-Tools-ILLink .NET linker development as well as trimming analyzers labels Aug 5, 2024
@agocke agocke removed this from AppModel Aug 5, 2024
@agocke agocke removed this from the 9.0.0 milestone Aug 5, 2024
@lewing lewing added this to the 10.0.0 milestone Aug 9, 2024
@pavelsavara pavelsavara added area-Tools-ILLink .NET linker development as well as trimming analyzers and removed area-Interop-mono labels Aug 19, 2024
Copy link
Contributor

Tagging subscribers to this area: @agocke, @sbomer, @vitek-karas
See info in area-owners.md if you want to be subscribed.

@pavelsavara pavelsavara assigned sbomer and unassigned mkhamoyan Aug 19, 2024
@pavelsavara
Copy link
Member

It seems this is gap in ILink, please re-assign

@MichalStrehovsky
Copy link
Member

It seems this is gap in ILink, please re-assign

Is #101434 (comment) not feasible?

@pavelsavara
Copy link
Member

pavelsavara commented Aug 20, 2024

My understanding is that _GenerateManagedToNative happens after _RunILLink in Publish nested build.
Therefore the method is already gone when we get to it there.

Binlog
wasi-uco-publish.zip

But overall it feels like the UnmanagedCallersOnly is expressing the intent pretty well and that ILLink is the right tool which should honor that intent.

Those methods are created

  • for linking native code
  • for creating WASM or WASI export
  • by hands by app or lib developer
  • by code gen
    • which could be 3rd party, for example could be wit-bindgen for WASI
    • Roslyn, for NAOT + JavaScript

We could mandate that it's not enough, for those tools to do UnmanagedCallersOnly + sometimes also WasmImportLinkageAttribute, and that they have to author the link descriptors too. Is that the right DX ?

For a library in a nuget, we still don't have solution for running _GenerateManagedToNative on that either.

What are the challenges to implement it in ILLInk ? What makes it "pretty special thing" ?

Why this is not a problem for NAOT on windows when they want to publish native DLL with exported native methods ? Are they not trimming it ?

@MichalStrehovsky
Copy link
Member

Why this is not a problem for NAOT on windows when they want to publish native DLL with exported native methods ? Are they not trimming it ?

If I understand it correctly, WASM is hitting this when building an executable. UnmanagedCallersOnly methods are not considered exported roots on native AOT when building an executable either. They are considered roots only when building <OutputType>Library</OutputType>.

ILLink currently doesn't support trimming <OutputType>Library</OutputType>, AFAIK we don't have scenarios where it would be useful right now (these would not be downstream consumable) and nobody defined what trimming a self-contained library means. Without that, there's no need to even look at UnmanagedCallersOnly within ILLink (it would not be desirable to root UnmanagedCallersOnly with non-empty EntryPoint in the general case).

If I understand it correctly, this would be a one-off ILLink feature for WASM only. That's why the question of whether we can do it elsewhere so that WASM specific things stay together with WASM.

@pavelsavara
Copy link
Member

pavelsavara commented Aug 21, 2024

Why this is not a problem for NAOT on windows when they want to publish native DLL with exported native methods ? Are they not trimming it ?

After discussing it offline, I understood that NAOT doesn't need to trim native DLL, because DLLs are never self-contained.
Because they share the runtime installed elsewhere on the OS, the trimming doesn't make much sense.

ILLink currently doesn't support trimming <OutputType>Library</OutputType>

<OutputType>Library</OutputType> for Browser it makes sense to trim it when it's part of the whole application.

  • If it bundled some native code (C code or object file) to be staticaly linked into .wasm file, we also need to protect UCO. Because that C code would expect to find the native symbols during (LLVM) linking.
  • if there are JSExport, currently we find them via reflection, but it would be better to also make them into WASM export via UCO. That would unify it with NAOT LLVM branch.

For WASI <OutputType>Library</OutputType> would mean something else than for the rest of dotnet.
It would mean that we produce WASI component, which has self-contained dotnet runtime inside. If you have two such WASI components, they would not share the runtime but we would have two runtimes.
For such component it makes sense to trim it as if it was the whole APP.
The difference is that there is not just one entrypoint, but a WASI component could have multiple exports.

The next question is should we protect all UCOs ?
I think the answer is yes, because they represent explicit public native/WASI API of the component.

And if we need to express more nuance to trim some UCOs, I think that ILLInk's feature="EnableOptionalFeature" featurevalue="false" is way how we can trim the functionality, including specific UCOs.
It would be opt-in trimming, rather than opt-out.

If I understand it correctly, this would be a one-off ILLink feature for WASM only. That's why the question of whether we can do it elsewhere so that WASM specific things stay together with WASM.

I see that now, thanks. It still feels to me that ILLink is the right place to implement UCO rooting.
So that we don't have to implement another msbuild pass, which would do cecil/reflection over all the assemblies one more time.
We already have too many passes in WASM publish and it's already slow.

@sbomer do you have objections ? And guidance ?

@pavelsavara pavelsavara changed the title [mono][wasm] UnmanagedCallersOnly is not being called after publish [mono][wasm] TrimMode=full is trimming UnmanagedCallersOnly Aug 21, 2024
@sbomer
Copy link
Member

sbomer commented Aug 21, 2024

My main objection is that keeping all UCO in the app and its dependencies would lead to unnecessary bloat.

Are you aware of any scenarios where the managed code genuinely has no insight into whether the native code will call back into managed? In the sample app above, I think a better solution would be for UnmanagedFunc to have a DynamicDependencyAttribute pointing to ManagedFunc. Using that approach in a library would allow the both methods to be removed if the library code path is unused in a trimmed app.

The next question is should we protect all UCOs ?
I think the answer is yes, because they represent explicit public native/WASI API of the component.

How will this work if I'm trying to implement a WASI component, and I have some nuget dependencies with UCO, but I don't want those to be exports of my WASI Component?

@pavelsavara
Copy link
Member

pavelsavara commented Aug 26, 2024

Are you aware of any scenarios where the managed code genuinely has no insight into whether the native code will call back into managed?

I guess that's true with any WASI component which has more than one export. At build time, you don't know which of them would be used by other WASI component (for example written in python) that is consuming your component at runtime.

Also in case that there is nuget/assembly with UCOs + WasmImportLinkageAttribute.
It means the nuget library wants to contribute into list of exports of the final WASI component.
If such nuget would contain list of ILLink descriptors for all exports, we didn't learn anything new from such list.

More specific example I could imagine is WASI HTTP (server) handler, wrapper translating WASI API into C# API.
Such nuget would probably contain WASI (UCO) export on behalf of the C# application logic assembly.

How will this work if I'm trying to implement a WASI component, and I have some nuget dependencies with UCO, but I don't want those to be exports of my WASI Component?

I suggest that UCOs should be protected by default and that trimming them would be opt-in via ILLink substitution/stub.

@sbomer
Copy link
Member

sbomer commented Aug 26, 2024

Thank you, that helps me understand the scenarios a little better. A few more questions/comments:

I agree that when building a library, UCO should be preserved, so that they exist when building an application that depends on the library (but whether they should be trimmed as part of the app build is a separate question). If we added a library trimming mode, I think it would make sense to preserve UCO in that mode - so for the rest of my comments I'll assume we're talking about trimming an app or final WASI component.

Also in case that there is nuget/assembly with UCOs + WasmImportLinkageAttribute.
It means the nuget library wants to contribute into list of exports of the final WASI component.

This isn't entirely clear to me. Could it not be the case that a library has UCO for interop with library-specific native code, but doesn't want to contribute it to the exports of the final WASI component?

I suggest that UCOs should be protected by default and that trimming them would be opt-in via ILLink substitution/stub.

We don't currently have a way to do this. The feature substitutions are able to:

  • conditionally substitute method bodies to a constant
  • conditionally root a method

But there's not a way to conditionally remove a method (other than making sure it's not called). We could probably come up with a way, but the model that currently makes most sense to me is to allow UCO methods to be removed by default. This allows libraries to preserve UCO methods only when certain code paths are reachable (for example via DynamicDependencyAttribute), allowing them to be removed if the library is unused or partially unused.

@pavelsavara
Copy link
Member

Could it not be the case that a library has UCO for interop with library-specific native code

Yes, you are right that there could be UCOs aimed at native code and we don't know if that native code uses it or not.
It is actually part of this issue. Right now you have to have link descriptor for those you want to use from native code.

I don't know how bad it would be if we rooted all of them. Are there (native) interop scenarios which generate UCOs opportunistically assuming it would be trimmed later ?
(Leading to unexpected bloat as opposed to necessary functionality)

The feature substitutions are able to:
* conditionally substitute method bodies to a constant

That should be enough to trim extra functionality which UCO rooted.
I see now that it would not remove the UCO/export.

If such nuget would contain list of ILLink descriptors for all exports, we didn't learn anything new from such list.

@lewing suggested that wit-bindgen is in better position to categorize the methods into groups by interface they implement.

So that the app developer would have easier time to express what is to be preserved or trimmed by that group/interface.
They would still have to do something (root or trim the group), it can't be automagic end to end.

Let's continue thinking about it.

@SingleAccretion
Copy link
Contributor

After discussing it offline, I understood that NAOT doesn't need to trim native DLL, because DLLs are never self-contained.
Because they share the runtime installed elsewhere on the OS, the trimming doesn't make much sense.

NAOT does "trim" code from DLLs, because from the NAOT perspective, the output is always a self-contained native artifact. The difference between executables and libraries then becomes only about what are the entrypoints.

The current NAOT model is that you need to specify the assemblies for which EntryPointed UCOs are to be considered roots. The assembly from the project you're publishing is considered to be in this set implicitly, and additional ones can be specified via UnmanagedEntryPointsAssembly (as yet undocumented). This works fine for NAOT native libraries because they're not widely used (yet), and the exports naturally tend to live in one central place - that top-level project.

It works less well for WASI components, where it's expected that you can have exported UCOs in referenced assemblies. It does not work for JSExports.

The NativeAOT-LLVM experience with this problem suggests that some more automatic solution is needed; this is easily the top recurring problem people open issues about. See e. g. dotnet/runtimelab#2626.

@SingleAccretion
Copy link
Contributor

This is to above point about the need for some UnmanagedEntryPointsAssembly-like mechanism:

We've talked a bit with Pavel on Discord, and it was pointed out that it would be undesirable to specify the capability for rooting UCOs on an "unconditional" basis in a way that would force the trimmer (or ILC) to look at all of the referenced assemblies for unmanaged exports, many of which don't need to be looked at today.

@sbomer
Copy link
Member

sbomer commented Aug 26, 2024

Are there (native) interop scenarios which generate UCOs opportunistically assuming it would be trimmed later?

My understanding of the xamarin-macios managed static registrar, after reading through some of the discussion and code around #80912, is that it:

  • generates UCO wrappers for all methods that may need to be exported to Objective-C
  • runs trimming to determine which of the wrapped methods are actually needed
  • has custom ILLink logic (that runs after marking) to ensure the UCO methods are preserved only for the preserved wrapped methods
  • passes the ILLink output to NativeAot, and sets UnmanagedEntryPointsAssembly to include all assemblies so that all of the remaining UCO methods are kept in the native code

@lewing
Copy link
Member

lewing commented Aug 29, 2024

@sbomer my understanding was that we want to move away from custom linker passes as a solution, are you suggesting that is the correct approach?

@pavelsavara
Copy link
Member

Notes from discussion with @SingleAccretion earlier this week

  • use UnmanagedEntryPointsAssembly to tag limited set of assemblies that ILLink should load and should preserve UCO's in
    • this already works for NAOT and it's passing --generateunmanagedentrypoints to ILLink
    • we need to add it also to WASM publish/trim targets
    • it probably doesn't float with ProjectReference and there could be challenge
    • it probably doesn't float with libraries inside of nuget
  • we could add (Roslyn analyzer?) warning for missing UnmanagedEntryPointsAssembly in library projects with UCOs targeting WASM
  • we could add new --exclude-unmanaged-entrypoints <namespace-prefix> to allow users to opt-out
    • this is similar to wit-bindgen generated descriptor groups, but simpler and it would also work for non-WASI UCOs
    • we could also do it for class name or method name in same format as ILLink
    • we could also introduce new XML file for this
    • we can't really automatically do what macios does, because we generally don't have visibility into what will be used (as dicussed above)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arch-wasm WebAssembly architecture area-Tools-ILLink .NET linker development as well as trimming analyzers os-browser Browser variant of arch-wasm os-wasi Related to WASI variant of arch-wasm
Projects
Status: No status
Development

No branches or pull requests

7 participants