Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NativeAOT Data flow annotations are not applied to DI created generic types #81358

Closed
vitek-karas opened this issue Jan 30, 2023 · 5 comments · Fixed by #82392
Closed

NativeAOT Data flow annotations are not applied to DI created generic types #81358

vitek-karas opened this issue Jan 30, 2023 · 5 comments · Fixed by #82392

Comments

@vitek-karas
Copy link
Member

This only occurs in NativeAOT if we stop doing generic parameter data flow validation
from generic dictionaries and instead switch to the approach taken in #80956.

Problem description

If there's a service with a generic parameter which has data flow annotations (DynamicallyAccessedMembers)
and such a service is injected into the application due to some other service having a .ctor parameter
of its type with specific instantiation, the annotation is not applied to the instantiation.

This can lead to failures due to missing members or similar problems. This most readily reproes itself
in many of the Microsoft.Extensions.* tests. For example the Microsoft.Extensions.Logging tests
hit this problem because they have a logger factory type which takes IOptions<LoggerOptions> parameter
in its constructor. IOptions<T> has a data flow annotation on the T.

Repro

Sample code which reproes the problem:

using System;
using System.Diagnostics.CodeAnalysis;
using System.Reflection;

namespace DIBreak
{
    public class Program
    {
        public static void Main()
        {
            Services services = new();
            services.RegisterService(typeof(INameProvider<>), typeof(NameProviderService<>));
            services.RegisterService(typeof(IDataObjectPrinter), typeof(DataObjectPrinterService));

            var printer = services.GetService<IDataObjectPrinter>();
            printer.Print(new DataObject());
        }
    }

    public class DataObject {
        public string Name => "Hello World";
    }

    // Simplistic implementation of DI which is comparable in behavior to our DI
    class Services {
        private Dictionary<Type, Type> _services = new Dictionary<Type, Type>();

        public void RegisterService(Type interfaceType, [DynamicallyAccessedMembers(DynamicallyAccessedMemberTypes.PublicConstructors)] Type implementationType)
        {
            // TODO: Validate that implementationType implements interfaceType

            _services.Add(interfaceType, implementationType);
        }

        public T GetService<T>() {
            return (T)GetService(typeof(T));
        }

        public object GetService(Type interfaceType) {
            Type typeDef = interfaceType.IsGenericType ? interfaceType.GetGenericTypeDefinition() : interfaceType;
            Type implementationType = GetImplementationType(typeDef);

            if (implementationType.IsGenericTypeDefinition) {
                for (int i = 0; i < implementationType.GetGenericArguments().Length; i++) {
                    Type genericArgument = implementationType.GetGenericArguments()[i];
                    Type genericParameter = interfaceType.GetGenericArguments()[i];

                    // Validate that DAM annotations match
                    if (!DamAnnotationsMatch(genericArgument, genericParameter))
                        throw new InvalidOperationException();

                    if (genericParameter.IsValueType)
                        throw new InvalidOperationException();
                }

                implementationType = InstantiateServiceType(implementationType, interfaceType.GetGenericArguments());
            }

            ConstructorInfo constructor = implementationType.GetConstructors()[0]; // Simplification
            if (constructor.GetParameters().Length > 0) {
                List<object> instances = new();
                foreach (var parameter in constructor.GetParameters()) {
                    instances.Add(GetService(parameter.ParameterType));
                }

                return Activator.CreateInstance(implementationType, instances.ToArray())!;
            }
            else {
                return Activator.CreateInstance(implementationType)!;
            }

            [UnconditionalSuppressMessage("", "IL2068", Justification = "We only add types with the right annotation to the dictionary")]
            [return: DynamicallyAccessedMembers(DynamicallyAccessedMemberTypes.PublicConstructors)]
            Type GetImplementationType(Type interfaceType) {
                if (!_services.TryGetValue(interfaceType, out Type? implementationType))
                    throw new NotImplementedException();

                return implementationType;
            }

            [UnconditionalSuppressMessage("", "IL2055", Justification = "We validated that the type parameters match - THIS IS WRONG")]
            [UnconditionalSuppressMessage("", "IL3050", Justification = "We validated there are no value types")]
            [return: DynamicallyAccessedMembers(DynamicallyAccessedMemberTypes.PublicConstructors)]
            Type InstantiateServiceType([DynamicallyAccessedMembers(DynamicallyAccessedMemberTypes.PublicConstructors)] Type typeDef, Type[] typeParameters) {
                return typeDef.MakeGenericType(typeParameters);
            }
        }

        private bool DamAnnotationsMatch(Type argument, Type parameter) {
            // ....
            return true;
        }
    }

    interface INameProvider<[DynamicallyAccessedMembers(DynamicallyAccessedMemberTypes.PublicProperties)] T>
    {
        string? GetName(T instance);
    }

    class NameProviderService<[DynamicallyAccessedMembers(DynamicallyAccessedMemberTypes.PublicProperties)] T>
        : INameProvider<T> {
        public string? GetName(T instance) {
            return (string?)typeof(T).GetProperty("Name")?.GetValue(instance);
        }
    }

    interface IDataObjectPrinter {
        void Print(DataObject instance);
    }

    class DataObjectPrinterService : IDataObjectPrinter {
        // The data flow is not applied on the INameProvider<DataObject> here, or in the method parameter
        // or in the call to the GetName below inside Print.
        INameProvider<DataObject> _nameProvider;

        public DataObjectPrinterService(INameProvider<DataObject> nameProvider) {
            _nameProvider = nameProvider;
        }

        public void Print(DataObject instance) {
            // This throws because DataObject.Name is not preserved
            string? name = _nameProvider.GetName(instance);
            Console.WriteLine(name);
        }
    }
}

Root cause

The real root cause is that the suppression on InstantiateServiceType is wrong. Annotations can't be carried over from one
generic argument to another, even if the generic parameter is the same. This is because the compiler doesn't guarantee
that the annotations are applied if it doesn't see the type in question used to run code (so calling its constructor, or calling a static method).
In the above example, the compiler doesn't see the equivalent of new NameProviderService<DataObject>() and thus it
won't necessarily keep all the necessary things to make it work.

Why does it work in main

The code above actually works in 7.0 and main just fine. This is because:

  • There's a call to INameProvider<DataObject>.GetName instance method
  • There's a visible access to constructor on one of the implementation types of INameProvider<>
    • Namely the call RegisterService(typeof(INameProvider<>), typeof(NameProviderService<>)) applies the PublicConstructors annotation to the second parameter
    • That means we make NameProviderService<>..ctor reflection accessible
    • That means we treat NameProviderService<__Canon> as constructed
    • Which means that all instantiations of INameProvider<> are also considered constructed (as in newed up)
  • This requires generation of a generic dictionary for INameProvider<DataObject>
  • In the current implementation the creation of the generic dictionary implies processing generic parameter data flow on the type in question
    • So this applies the PublicProperties annotation to the DataObject type

How can we fix this

Option 1 - fix the suppression

The most correct way to fix this would be to get rid of the wrong suppression. Unfortunately this is not feasible.
For one we've already shipped like this. And second we don't have a solution to the problem in DI without modifying public APIs
and even with that it's not clear how such a solution would look.
Additionally it's likely that other DI systems have a similar problem without a good solution.

Option 2 - rely on detecting instance code access to the specific generic type

In the current implementation the whole thing works only because we see a call to fully instantiated interface method
like INameProvider<DataObject>.GetName. This is not detected by the changes in #80956 because there we reasoned
that it's not necessary to perform the generic parameter data flow on instance member accesses as it's enough to apply
them to constructors and static members. But this assumption is broken by the DI implementation above.
So we could detect even instance member accesses just like we do static member accesses.

This is a very simple change to the #80956 and adds little additional complexity to the compiler.
Downside is that this will have performance implications, as the compiler will have to do additional checks
on all calls (not just static calls). This will produce more warnings in case of a data flow violation
as the same warning would be reported on all instance callsites, and not just the static/ctor callsites.
This is the same behavior ILLink already has though.

Option 3 - guarantee generic data flow on all types in signatures of reflected members

Alternative solution would be to detect all reflectable members (methods, constructors and fields) and
go over all of the types in their signatures and apply generic data flow. In the above sample the constructor
of DataObjectPrinterService is reflectable and thus we would go over its parameters and apply generic data flow
on all of the parameter types - fixing the problem.

We should implement this for all members, not just constructors, because other DI systems may inject
parameters/calls/accesses to fields and/or methods (property accessors). For example it's a highly requested feature
for our DI to support auto-injecting values to properties.

This would again have a performance impact on the compiler but smaller then in Option 2, because the compiler
already tracks reflectable members. The amount of reflectable members is relatively small to all members
and thus this would only trigger to relatively small number of cases.

The implementation is more complex though as it needs more additions to the #80956.

It has a similar downside by producing more warnings, but the number here is probably smaller than in Option 2
and it is again consistent with ILLink behavior.

Recommendation

Option 3 feels as the best solution due to its low impact on compiler performance and lower number of additional warnings.
Also the warnings this would produce are in a better place (declaration of the type instantiation as oppose to callsites).
That said it requires more work and it's hard to prove that it's complete. It's not clear that it covers all possible
ways to get to a Type of the generic argument with the annotation. But it should cover our DI usages and all of the DI usages
we anticipate in the future. But other DI systems might find a "hole" in it.

Note though that even Option 2 is not clearly completely. There might be ways to get to call the affected methods
via some obscure reflection without causing additional warnings and still side stepping the requirement of a visible
instantiated access. The risk is higher than Option 3, because the accesses are in user code, not inside the DI. So we can't guarantee
that it works for "our DI" at least.

@ghost ghost added the untriaged New issue has not been triaged by the area owner label Jan 30, 2023
@ghost
Copy link

ghost commented Jan 30, 2023

Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas
See info in area-owners.md if you want to be subscribed.

Issue Details

This only occurs in NativeAOT if we stop doing generic parameter data flow validation
from generic dictionaries and instead switch to the approach taken in #80956.

Problem description

If there's a service with a generic parameter which has data flow annotations (DynamicallyAccessedMembers)
and such a service is injected into the application due to some other service having a .ctor parameter
of its type with specific instantiation, the annotation is not applied to the instantiation.

This can lead to failures due to missing members or similar problems. This most readily reproes itself
in many of the Microsoft.Extensions.* tests. For example the Microsoft.Extensions.Logging tests
hit this problem because they have a logger factory type which takes IOptions<LoggerOptions> parameter
in its constructor. IOptions<T> has a data flow annotation on the T.

Repro

Sample code which reproes the problem:

using System;
using System.Diagnostics.CodeAnalysis;
using System.Reflection;

namespace DIBreak
{
    public class Program
    {
        public static void Main()
        {
            Services services = new();
            services.RegisterService(typeof(INameProvider<>), typeof(NameProviderService<>));
            services.RegisterService(typeof(IDataObjectPrinter), typeof(DataObjectPrinterService));

            var printer = services.GetService<IDataObjectPrinter>();
            printer.Print(new DataObject());
        }
    }

    public class DataObject {
        public string Name => "Hello World";
    }

    // Simplistic implementation of DI which is comparable in behavior to our DI
    class Services {
        private Dictionary<Type, Type> _services = new Dictionary<Type, Type>();

        public void RegisterService(Type interfaceType, [DynamicallyAccessedMembers(DynamicallyAccessedMemberTypes.PublicConstructors)] Type implementationType)
        {
            // TODO: Validate that implementationType implements interfaceType

            _services.Add(interfaceType, implementationType);
        }

        public T GetService<T>() {
            return (T)GetService(typeof(T));
        }

        public object GetService(Type interfaceType) {
            Type typeDef = interfaceType.IsGenericType ? interfaceType.GetGenericTypeDefinition() : interfaceType;
            Type implementationType = GetImplementationType(typeDef);

            if (implementationType.IsGenericTypeDefinition) {
                for (int i = 0; i < implementationType.GetGenericArguments().Length; i++) {
                    Type genericArgument = implementationType.GetGenericArguments()[i];
                    Type genericParameter = interfaceType.GetGenericArguments()[i];

                    // Validate that DAM annotations match
                    if (!DamAnnotationsMatch(genericArgument, genericParameter))
                        throw new InvalidOperationException();

                    if (genericParameter.IsValueType)
                        throw new InvalidOperationException();
                }

                implementationType = InstantiateServiceType(implementationType, interfaceType.GetGenericArguments());
            }

            ConstructorInfo constructor = implementationType.GetConstructors()[0]; // Simplification
            if (constructor.GetParameters().Length > 0) {
                List<object> instances = new();
                foreach (var parameter in constructor.GetParameters()) {
                    instances.Add(GetService(parameter.ParameterType));
                }

                return Activator.CreateInstance(implementationType, instances.ToArray())!;
            }
            else {
                return Activator.CreateInstance(implementationType)!;
            }

            [UnconditionalSuppressMessage("", "IL2068", Justification = "We only add types with the right annotation to the dictionary")]
            [return: DynamicallyAccessedMembers(DynamicallyAccessedMemberTypes.PublicConstructors)]
            Type GetImplementationType(Type interfaceType) {
                if (!_services.TryGetValue(interfaceType, out Type? implementationType))
                    throw new NotImplementedException();

                return implementationType;
            }

            [UnconditionalSuppressMessage("", "IL2055", Justification = "We validated that the type parameters match - THIS IS WRONG")]
            [UnconditionalSuppressMessage("", "IL3050", Justification = "We validated there are no value types")]
            [return: DynamicallyAccessedMembers(DynamicallyAccessedMemberTypes.PublicConstructors)]
            Type InstantiateServiceType([DynamicallyAccessedMembers(DynamicallyAccessedMemberTypes.PublicConstructors)] Type typeDef, Type[] typeParameters) {
                return typeDef.MakeGenericType(typeParameters);
            }
        }

        private bool DamAnnotationsMatch(Type argument, Type parameter) {
            // ....
            return true;
        }
    }

    interface INameProvider<[DynamicallyAccessedMembers(DynamicallyAccessedMemberTypes.PublicProperties)] T>
    {
        string? GetName(T instance);
    }

    class NameProviderService<[DynamicallyAccessedMembers(DynamicallyAccessedMemberTypes.PublicProperties)] T>
        : INameProvider<T> {
        public string? GetName(T instance) {
            return (string?)typeof(T).GetProperty("Name")?.GetValue(instance);
        }
    }

    interface IDataObjectPrinter {
        void Print(DataObject instance);
    }

    class DataObjectPrinterService : IDataObjectPrinter {
        // The data flow is not applied on the INameProvider<DataObject> here, or in the method parameter
        // or in the call to the GetName below inside Print.
        INameProvider<DataObject> _nameProvider;

        public DataObjectPrinterService(INameProvider<DataObject> nameProvider) {
            _nameProvider = nameProvider;
        }

        public void Print(DataObject instance) {
            // This throws because DataObject.Name is not preserved
            string? name = _nameProvider.GetName(instance);
            Console.WriteLine(name);
        }
    }
}

Root cause

The real root cause is that the suppression on InstantiateServiceType is wrong. Annotations can't be carried over from one
generic argument to another, even if the generic parameter is the same. This is because the compiler doesn't guarantee
that the annotations are applied if it doesn't see the type in question used to run code (so calling its constructor, or calling a static method).
In the above example, the compiler doesn't see the equivalent of new NameProviderService<DataObject>() and thus it
won't necessarily keep all the necessary things to make it work.

Why does it work in main

The code above actually works in 7.0 and main just fine. This is because:

  • There's a call to INameProvider<DataObject>.GetName instance method
  • There's a visible access to constructor on one of the implementation types of INameProvider<>
    • Namely the call RegisterService(typeof(INameProvider<>), typeof(NameProviderService<>)) applies the PublicConstructors annotation to the second parameter
    • That means we make NameProviderService<>..ctor reflection accessible
    • That means we treat NameProviderService<__Canon> as constructed
    • Which means that all instantiations of INameProvider<> are also considered constructed (as in newed up)
  • This requires generation of a generic dictionary for INameProvider<DataObject>
  • In the current implementation the creation of the generic dictionary implies processing generic parameter data flow on the type in question
    • So this applies the PublicProperties annotation to the DataObject type

How can we fix this

Option 1 - fix the suppression

The most correct way to fix this would be to get rid of the wrong suppression. Unfortunately this is not feasible.
For one we've already shipped like this. And second we don't have a solution to the problem in DI without modifying public APIs
and even with that it's not clear how such a solution would look.
Additionally it's likely that other DI systems have a similar problem without a good solution.

Option 2 - rely on detecting instance code access to the specific generic type

In the current implementation the whole thing works only because we see a call to fully instantiated interface method
like INameProvider<DataObject>.GetName. This is not detected by the changes in #80956 because there we reasoned
that it's not necessary to perform the generic parameter data flow on instance member accesses as it's enough to apply
them to constructors and static members. But this assumption is broken by the DI implementation above.
So we could detect even instance member accesses just like we do static member accesses.

This is a very simple change to the #80956 and adds little additional complexity to the compiler.
Downside is that this will have performance implications, as the compiler will have to do additional checks
on all calls (not just static calls). This will produce more warnings in case of a data flow violation
as the same warning would be reported on all instance callsites, and not just the static/ctor callsites.
This is the same behavior ILLink already has though.

Option 3 - guarantee generic data flow on all types in signatures of reflected members

Alternative solution would be to detect all reflectable members (methods, constructors and fields) and
go over all of the types in their signatures and apply generic data flow. In the above sample the constructor
of DataObjectPrinterService is reflectable and thus we would go over its parameters and apply generic data flow
on all of the parameter types - fixing the problem.

We should implement this for all members, not just constructors, because other DI systems may inject
parameters/calls/accesses to fields and/or methods (property accessors). For example it's a highly requested feature
for our DI to support auto-injecting values to properties.

This would again have a performance impact on the compiler but smaller then in Option 2, because the compiler
already tracks reflectable members. The amount of reflectable members is relatively small to all members
and thus this would only trigger to relatively small number of cases.

The implementation is more complex though as it needs more additions to the #80956.

It has a similar downside by producing more warnings, but the number here is probably smaller than in Option 2
and it is again consistent with ILLink behavior.

Recommendation

Option 3 feels as the best solution due to its low impact on compiler performance and lower number of additional warnings.
Also the warnings this would produce are in a better place (declaration of the type instantiation as oppose to callsites).
That said it requires more work and it's hard to prove that it's complete. It's not clear that it covers all possible
ways to get to a Type of the generic argument with the annotation. But it should cover our DI usages and all of the DI usages
we anticipate in the future. But other DI systems might find a "hole" in it.

Note though that even Option 2 is not clearly completely. There might be ways to get to call the affected methods
via some obscure reflection without causing additional warnings and still side stepping the requirement of a visible
instantiated access. The risk is higher than Option 3, because the accesses are in user code, not inside the DI. So we can't guarantee
that it works for "our DI" at least.

Author: vitek-karas
Assignees: -
Labels:

area-NativeAOT-coreclr

Milestone: -

@vitek-karas
Copy link
Member Author

/fyi @eerhardt, @MichalStrehovsky

@MichalStrehovsky
Copy link
Member

I agree that option 3 sounds the best. I'm not worried about "holes" - this is already an invalid suppression that we just try to accommodate. Anybody who finds a hole in it is already in bad-suppressions-are-your-problem territories.

vitek-karas added a commit to vitek-karas/runtime that referenced this issue Feb 2, 2023
…cyInjection library

See dotnet#81358 for details.

Functional change:
When method or field are reflectable (we produce metadata for them) also go over all of their signature types and perform generic parameter data flow on them.

Added tests for these cases into the data flow suite.
Added a smoke test which is a simplified version of the DI scenario which was broken.
vitek-karas added a commit that referenced this issue Feb 7, 2023
[This is a revert of a revert of #80956 with additional fixes for #81358)

This reworks how generic parameter data flow validation is done in the NativeAOT compiler.

Previously generic data flow was done from generic dictionary nodes. Problem with that approach is that there's no origin information at that point. The warnings can't point to the place where the problematic instantiation is in the code - we only know that it exists.
Aside from it being unfriendly for the users, it means any RUC or suppressions don't work on these warnings the same way they do in linker/analyzer.

This change modifies the logic to tag the method as "needs data flow" whenever we spot an instantiation of an annotated generic in it somewhere.
Then the actual validation/marking is done from data flow using the trim analysis patterns.

The only exception to this is generic data flow for base types and interface implementations, that one is done on the EEType nodes.

Note that AOT implements a much more precise version of the generic data flow validation as compared to linker/analyzer. See the big comment at the beginning of `GenericParameterWarningLocation.cs` for how that works.

Due to an issue with DependencyInjection, this change also implements a behavior where if a method or field is reflection accessible, the compiler will perform generic argument data flow on all types in the signature of the method/field (which it normally wouldn't do). See #81358 for details about the issue and discussions on the fix approach.

Test changes:
Adds the two tests from linker which cover this functionality.

Change the test infra to use token to compare message origins for expected warnings. Consistently converting generic types/methods into strings across two type systems is just very difficult - the tokens are simple and reliable.

Changes the tests to avoid expecting specific generic types/methods formatting in the messages - again, it's too hard to make this consistent without lot of effort. And the tests don't really need it.

Adds a smoke test which has a simplified version of the DI problem from #81358.
@MichalStrehovsky
Copy link
Member

@vitek-karas is this one fixed?

@vitek-karas
Copy link
Member Author

Not yet - the latest version of the fix has been reverted again. #81783

@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Feb 20, 2023
vitek-karas added a commit that referenced this issue Feb 21, 2023
[This is a revert of a revert of #81532 with additional fixes for #81779]

This reworks how generic parameter data flow validation is done in the NativeAOT compiler.

Previously generic data flow was done from generic dictionary nodes. Problem with that approach is that there's no origin information at that point. The warnings can't point to the place where the problematic instantiation is in the code - we only know that it exists.
Aside from it being unfriendly for the users, it means any RUC or suppressions don't work on these warnings the same way they do in linker/analyzer.

This change modifies the logic to tag the method as "needs data flow" whenever we spot an instantiation of an annotated generic in it somewhere.
Then the actual validation/marking is done from data flow using the trim analysis patterns.

The only exception to this is generic data flow for base types and interface implementations, that one is done on the EEType nodes.

Note that AOT implements a much more precise version of the generic data flow validation as compared to linker/analyzer. See the big comment at the beginning of `GenericParameterWarningLocation.cs` for how that works.

Due to an issue with DependencyInjection, this change also implements a behavior where if a method or field is reflection accessible, the compiler will perform generic argument data flow on all types in the signature of the method/field (which it normally wouldn't do). See #81358 for details about the issue and discussions on the fix approach.

Due to the DI behavior described above, there's also the problem with nested generics. If a nested generic applies annotation on a specific type and this whole thing is done from within a DI, the compiler will not apply the annotation, since it doesn't see the type being used anywhere for real. See #81779 for detailed description of the issue. The fix for this is to extend the "needs data flow analysis" logic to look into generic arguments recursively and finding any annotation then triggers the data flow processing of the calling code. Then in that processing when applying generic argument data flow, do so recursively over all generic parameters.

Test changes:
Adds the two tests from linker which cover this functionality.

Change the test infra to use token to compare message origins for expected warnings. Consistently converting generic types/methods into strings across two type systems is just very difficult - the tokens are simple and reliable.

Changes the tests to avoid expecting specific generic types/methods formatting in the messages - again, it's too hard to make this consistent without lot of effort. And the tests don't really need it.

Adds a test for marking behavior related to generic argument data flow. This is to catch issues like #81779.

Adds a smoke test which has a simplified version of the DI problem from #81358.

Fixes #77455
Fixes #75898
Fixes #81358
Fixes #81779
@ghost ghost removed in-pr There is an active PR which will close this issue when it is merged untriaged New issue has not been triaged by the area owner labels Feb 21, 2023
@ghost ghost locked as resolved and limited conversation to collaborators Mar 23, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants