-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Proposal]: Practical existential types for interfaces #5556
Comments
I can see the utility but I really don't like how this causes the source declarations and the metadata to differ so much in what I think would be non-obvious ways. IIRC this was a serious sticking point in the earlier discussions on shapes, that the witness type had to be a generic type parameter, and that the team overall would have preferred that the runtime offer a solution that did not require something like that. |
Could "optional type parameters" solve this? A bit borrowed from C++ templates, not sure how this would be represented in metadata though. interface Iface<Ext = MemoryStream> where Ext : IDisposable
{
Ext P { get; }
}
class A : Iface // implicitly: Iface<MemoryStream>
{
MemoryStream P => new MemoryStream();
}
void Dispose<T>(T t)
where T : Iface
{
t.P.Dispose();
}
Dispose(new A()); Or for the fast enumerable case: public interface IFastEnum<TElement, TEnumerator = int>
{
TEnumerator Start { get; }
bool TryGetNext(ref TEnumerator enumerator, out TElement value);
}
class List<TElement> : IFastEnum<TElement> // implicitly: IFastEnum<TElement, int>
{
int Start => 0;
public bool TryGetNext(ref int enumerator, out TElement value)
{
if (index >= Count) {
value = default(T);
return false;
}
value = _array[index++];
return true;
}
}
// implicit: any IFastEnum<string, int>
void M<TEnum>(TEnum e) where TEnum : IFastEnum<string>
{
foreach (var elem in e)
{
Console.WriteLine(elem);
}
}
// explicit? it does allow overriding the value, while "type Foo" would not
// would only be called when the implemetation is specialized and Fancy(TM)
// question about overload resolution though.
// and how TEnumerator could be hoisted up to be a type parameter of M.
void M<TEnum>(TEnum e) where TEnum : IFastEnum<string, FancyEnumerator<string>>
{
foreach (var elem in e)
{
Console.WriteLine(elem);
}
} |
@BhaaLseN This seems wrong to me I think that "optional type parameters" is a separated feature that doesn't really solve the problem here but enhances generics that the proposed solution take advantage of and might benefit from the enhancement as well should it ever added to the language. |
I'm confused by the interface IFastEnumerator<T>
{
bool TryMoveNext(out T item);
}
interface IFastEnumerable<T>
{
IFastEnumerator<T> GetFastEnumerator();
}
class List<T> : IFastEnumerable<T>
{
class FastEnumerator : IFastEnumerator<T>
{
public FastEnumerator(List<T> list) => this._list = list;
int _index;
List<T> _list;
// or
T[] _array; int _size; // to get rid of the indirection
public bool TryMoveNext(out T item)
{
// ...
}
}
public IFastEnumerator<T> GetFastEnumerator() => new FastEnumerator(this);
}
foreach(var x in list) { ... } C# is expanded to look for |
What about abstract classes? |
I wonder if there's space here to do somethign akin to how VB works with extension methods and type parameters. Where tehre can be the 'original def' with some number, and the 'reduced' version with less. Except that here it woudl be somewhat the reverse. I'm just speaking at the Language level how we might represent this, while the emitted level could def look different. |
Improved version of my code sample: interface IFastEnumerator<T>
{
bool TryMoveNext(out T item);
}
interface IFastEnumerable<T>
{
Type GetFastEnumeratorType(); // return type which has a ctor accepting one parameter, the collection; and has TryMoveNext method
}
class List<T> : IFastEnumerable<T>
{
struct FastEnumerator : IFastEnumerator<T>
{
public FastEnumerator(List<T> list) => this._list = list;
int _index;
List<T> _list;
// or
T[] _array; int _size; // to get rid of the indirection
public bool TryMoveNext(out T item)
{
// ...
}
}
public Type GetFastEnumeratorType() => typeof(FastEnumerator);
}
foreach(var x in list) { ... }
// becomes
var enumerator = new List<T>.FastEnumerator(list); // look for ctor which takes the list as a parameter
// the enumerator now lives on the stack
while(enumerator.TryGetNext(out readonly x)) // this call can be inlined, virtually removing any overhead
{
} And the best part, all this is possible today, with minor changes to the C# compiler (recognize new types/method names/etc.) |
That Enumerator type is not something that the compiler can determine at compile time, meaning it would have to be reflection. Associated types is how you achieve your interface proposal with actual compile time knowledge. |
How about: interface IFastEnumerator<T>
{
bool TryMoveNext(out T item);
}
interface IFastEnumerable<T>
{
IFastEnumerator<T> GetFastEnumerator(); // this is for regular usage
}
class List<T> : IFastEnumerable<T>
{
public struct FastEnumerator : IFastEnumerator<T> // this can be public - who cares
{
public FastEnumerator(List<T> list) => this._list = list;
int _index;
List<T> _list;
// or
T[] _array; int _size; // to get rid of the indirection
public bool TryMoveNext(out T item)
{
// ...
}
}
public IFastEnumerator<T> GetFastEnumerator() => new FastEnumerator(this); // this is for regular usage
public FastEnumerator GetFastEnumeratorDirect() => new FastEnumerator(this); // this is for highest performance
}
foreach(var x in list) { ... }
// becomes
var enumerator = list.GetFastEnumeratorDirect(); // C# looks for a method named GetFastEnumeratorDirect
// the enumerator now lives on the stack
while(enumerator.TryGetNext(out readonly x)) // this call can be inlined, virtually removing any overhead
{
} |
That's pretty much what we have today, with the same performance issues in fully generic code. |
Oh yea, I didn't even realize it's there. Given that's already there, what are the performance issues? How can that be improved any further? (Other than getting rid of the |
the problem is when you are workign with IFastEnumerable. The compiler doesn't know about GetFastEnumeratorDirect to do this quickly. With existential types, it could. |
I'm completely confused now. The compiler already looks for |
It can do that when workign with the concrete type. But if you're workign with the interface it has no idea. That's what existential types aims to solve. You can both be working with the interface, but pass along enough information that the compielr/jit can statically know all the information necessary to generate more optimal code (namely code that doesn't need to call through virtual-dispatch or allocate on the heap). |
Using Andy's example, you could pass around an IFastEnum, but get the same perf as if you passed around a concrete impl of that interface. |
Oh I see what you mean. Still, is it really that important to get rid of one allocation per entire loop? My first example - basically adding |
Yes. |
Again, this is what we have today :) However, you only get the benefits if you know the concrete types. But that's highly problematic. In many domains you do not. And now there are tons of allocations happening as you have to work over abstract interfaces with no additional information present to help the runtime/jit/etc. avoid them. |
@CyrusNajmabadi @333fred
.NET FW 4.8:
In .NET 5, the first 2 approaches ( |
@TahirAhmadov please show a BenchmarkDotnet result, including memory allocations. FUrthermore, this only shows TryGetNext appraoch which is not extendible to all the cases where existential types may be used. |
This proposal is also about the ergonomics of generic types, not performance. Those scenarios where performance benefits can be achieved but currently necessitate painful generic signatures are just a bonus. |
My BDN results:
Gist: https://gist.github.com/CyrusNajmabadi/9a1acd7f5c2f2b63c194870533e6af62 So anywhere from 5-7x faster (not even counting things like allocations). |
The code is self explanatory, I don't have the time now to make a new project with that BenchmarkDotnet. I do agree that the allocation will add some memory consumption, but again, we're talking about something like 20 bytes; this only becomes a problem if the loop is executed many times - which is usually a code smell for a whole bunch of reasons, not only allocating the enumerator object. In short, I think there may be a miniscule performance benefit to squeeze out of this improvement; but the cost of creating a whole new syntax constructs plus necessary runtime changes just doesn't seem to be worth it right now. |
The code is not acceptable as it does hand written measurements without any of the efforts or smarts that BDN puts into reliable, repeatable, and statistically significant measurements.
No it's not. That's the common case. Components are producing enumerables and other parts of hte system are consuming it.
You are incorrect. Even in the basic examples you showed it's 5-7x, and no allocs. |
The FastEnumerator2 is the concrete type struct approach - I just included it for comparison sake. |
No. The existential type approach would allow for the speed of FastEnumerator2 as there will be no virtual dispatch. The jit will literally know that 'start' is an int, and that there is a not-virt call to: public bool TryGetNext(ref int enumerator, out TElement value)
{
if (index >= Count) {
value = default(T);
return false;
}
value = _array[index++];
return true;
} which will likely get entire inlined and optimized. |
@jnm2 Ok for that, but my other points still remains. |
@FaustVX I gave a real-world example here: #5556 (comment) |
@fabianoliver I read through a bit of that proposal. It's basically existential types, but without the restrictions that make them efficiently implementable. Consider the example void IterateIfPossible(object obj)
{
if (obj is IEnumerable<?> items)
{
foreach (var item in items)
Console.WriteLine(item);
}
} This can't be implemented without boxing. It's easy to see if you try to type-check it at the CLR level. Despite the fact that In general, if you want efficient value type representations, they must remain unboxed. |
Well, addendum, you can do it efficiently, but the only way I know of is a transformation to Skolem normal form as I proposed in #1328, and this proposal was explicitly designed to be much easier to implement and add fewer new typing rules to the language (i.e., path-dependent types) |
Agreed, I think with wildcard matching, you could do
How would IterateOptimisedIfPossible look under #5556 - would it require explicit reflection to Invoke "Iterate" with the actual type of obj as a generic arg? The fact you could do a more unoptimised iteration using wildcards:
might be a blessing and a curse at the same time. I'd think the downside is that it'd be very easy to miss this isn't optimal. You're probably right that a fully fledged wildcard matching system is probably a lot harder to implement/get right. |
@fabianoliver That doesn't work. At the call to |
…per types" With "action wrapper types" this means the ROS messages and services that add the `GoalId`, `TimeStamp` or other fields on top of the user defined types for the action. The introduced methods in `IRosActionDefinition` are needed to work around the missing existential type support in C# (dotnet/csharplang#5556).
Just sharing some random thoughts - if I'm reading this correctly, it seems like the proposed syntax with the associated type being an "interface member" would be preferrable as it'd make things a lot more ergonomic when using source generators? Let me make an example so I can also double check this actually make sense 😄 In ComputeSharp I have a couple of interfaces (eg. Conceptually, imagine something like this: interface IFoo
{
void SomeMethodUsersShouldImplement();
void SourceGeneratedMethod1();
void SourceGeneratedMethod2();
// ...
} What users do is the following: partial struct MyType : IFoo
{
public void SomeMethodUsersShouldImplement() { /* ... */ }
} The generator then kicks in and generates all those additional methods. With associated types, this could be: interface IFoo
{
type TGenerated : IGenerated;
void SomeMethodUsersShouldImplement();
interface IGenerated
{
static abstract void SourceGeneratedMethod1();
static abstract void SourceGeneratedMethod2();
// ...
}
} And users would then write: partial struct MyType : IFoo
{
public void SomeMethodUsersShouldImplement() { /* ... */ }
} And the generator would declare a partial declaration of this file, with some file-local type implementing the interface, and would declare that type as the associated type for this interface. Then APIs using this would just use that directly, eg.: // Before
static void UseType<T>(T type)
where T : IFoo
{
type.SourceGeneratedMethod1();
}
// After
static void UseType<T>(T type)
where T : IFoo
{
type.TGenerated.SourceGeneratedMethod1();
} Which seems pretty nice to me. Couple things:
|
…per types" With "action wrapper types" this means the ROS messages and services that add the `GoalId`, `TimeStamp` or other fields on top of the user defined types for the action. The introduced methods in `IRosActionDefinition` are needed to work around the missing existential type support in C# (dotnet/csharplang#5556).
A few more thoughts on this (and comment above) after discussing the issue of "adding new members" from source generators with @CyrusNajmabadi and @jkoritzinsky the other day. Under this proposal, would existential types support being declared as partial in implementing types? Because if that was the case, the pattern mentioned in #5556 (comment) could be supported by having the user declare the partial type themselves, and the generator would just implement it, instead of also declaring it from the start (which I understand makes the generator less efficient, in theory). That is, would this be supported? // Interface declaration shipped in a library
interface IFoo
{
type TBar : IBar;
void SomeMethodUsersShouldImplement();
interface IBar
{
void SourceGeneratedMethod();
}
}
// User code in a project referencing the library
partial class Foo : IFoo
{
[GeneratedBar]
partial type TBar;
public void SomeMethodUsersShouldImplement() { }
}
// Generated code
partial class Foo
{
partial type TBar = Bar;
private class Bar
{
public void SourceGeneratedMethod() { }
}
} This seems like a very powerful and flexible pattern in general 👀 |
@Sergio0694 you almost certainly wouldn't be able to use a file type there. |
@333fred updated to use a private type, just to keep things simpler and not risk getting sidetracked 😄 But in general, ignore the specific accessibility, could also just be public, doesn't matter. |
Probably |
I know this may be a late opinion, but I'm more inclined towards "use site" existential types, which would allow this functionality to be used for existing interfaces with no change to them whatsoever. Furthermore, the programmer can choose what he makes an explicit type parameter, an implicit type parameter, and an existential type parameter. On top of that, we wouldn't need a new concept of interfaces that can't be used like regular ones (that is, can only be used as a generic constraint). public static class LinqExtensionMethods
{
// `TEnumerable` and `TReturn` are explicit.
// `TItem` is implicit and inferred by the compiler from the `where` constraints.
public static <TItem> TReturn Select<TEnumerable, TReturn>(this TEnumerable source, Func<TItem, TReturn> func) where TEnumerable : IEnumerable<TItem>
{
// Implementation doesn't matter.
}
} And if I don't need to know about public static class PrinterExtensionMethods
{
// Here `T` from `IEnumerable<T>` is existential.
public static void PrintAll<TEnumerable>(this TEnumerable source) where TEnumerable : IEnumerable<?>
{
// Implementation doesn't matter.
}
} I believe this feature would not require any changes to the CLR, we can just make everything syntax sugar and during lowering the compiler could just make everything an explicit parameter. How these methods show up in reflection, as well as the order of the compiler-generated parameters, however, are another discussion. |
@Sergio0694 The problem with partial associated types (they are associated types in this proposal, the original proposal did have something akin to existential types with full Skolemization, and this proposal was a response to that one, but this one doesn't really have existentials) is that ordering is important, so you couldn't have, for instance partial interface I1
{
type T1;
}
partial interface I1
{
type T2;
} As the ordering of those members is semantically important in the lowering. That's not to say that we couldn't choose the member syntax, but we would have to restrict members to appearing in only one declaration. @abner-commits Your proposal doesn't really solve one of the main things that this one tries to, which is that writing signatures with tons of generic parameters really sucks. |
Except it does. I fail to see how it would "suck" to discard parameters I don't care about like this: void DoSomething<T>(T target) where T : IInterface<?, ?, ?, ?>
{
} Is that any less ergonomic than having to declare an interface just for usage within generics? Plus there's the advantage that this applies to already existing interfaces, and I can add constraints to the implicit types if I want, further enhancing what we can accomplish with genetics. Here's an example of what would be possible to do with implicit type parameters + existential types: public <TEnum> void DoSomething<TDict>(TDict target)
where TDict : IReadOnlyDictionary<TEnum, ?>
where TEnum : struct, Enum
{
} |
@agocke apologize for the potentially dumb question, but could you elaborate why the ordering would be important here? As in, why would the various associated types for the interface not be handled like other members (eg. fields, methods, etc.) that can be scattered around over multiple partial declarations, and then they're just handled accordingly by name? I'm not sure I'm following why would a |
@Sergio0694 The lowered encoding is as type parameters, so it matters whether @abner-commits Writing out discards is worse than not writing out discards. The other problem is that the purpose of the associated type are different. When you have an associated type that can only be referred to by the implementor, it creates a contract where the type parameter is abstract -- an implementation detail. If you allow them to be manually substituted, then you remove the "hiding" functionality of making it an implementation detail. There's a fundamental question of where the abstraction sits. |
Ooh, of course, yeah that makes perfect sense now. Thank you! 😄 |
I believe having to write out discards is a minor inconvenience that in return allows you to have fine-grained control over generic parameters. I can have Your point about the existential type being an implementation detail I would argue depends on the context. I may want to use the interface in a context where I do care about the existential type, but I still don't want to be tied to a specific implementation of that interface. Likewise, I may want to use that same interface and not care about its generic types. In my opinion, the level of abstraction should be up to the API consumer to decide. If I feel like the generic type doesn't matter for the context I'm in, I can discard it, otherwise I can use it as an implicit parameter, which will even make consuming my API simpler. |
…per types" With "action wrapper types" this means the ROS messages and services that add the `GoalId`, `TimeStamp` or other fields on top of the user defined types for the action. The introduced methods in `IRosActionDefinition` are needed to work around the missing existential type support in C# (dotnet/csharplang#5556).
…per types" With "action wrapper types" this means the ROS messages and services that add the `GoalId`, `TimeStamp` or other fields on top of the user defined types for the action. The introduced methods in `IRosActionDefinition` are needed to work around the missing existential type support in C# (dotnet/csharplang#5556).
…per types" With "action wrapper types" this means the ROS messages and services that add the `GoalId`, `TimeStamp` or other fields on top of the user defined types for the action. The introduced methods in `IRosActionDefinition` are needed to work around the missing existential type support in C# (dotnet/csharplang#5556).
With "action wrapper types" this means the ROS messages and services that add the `GoalId`, `TimeStamp` or other fields on top of the user defined types for the action. The introduced methods in `IRosActionDefinition` are needed to work around the missing existential type support in C# (dotnet/csharplang#5556).
With "action wrapper types" this means the ROS messages and services that add the `GoalId`, `TimeStamp` or other fields on top of the user defined types for the action. The introduced methods in `IRosActionDefinition` are needed to work around the missing existential type support in C# (dotnet/csharplang#5556).
Practical existential types for interfaces
Terminology note: What's called "existential types" in this proposal is more correctly called "associated types" in other languages. The previous proposal, referenced below, did not have the same restrictions and thus implemented something much closer to a pure existential type without type erasure. This proposal is more restrictive and doesn't support using interfaces with associated types in all locations. You can think of every mention of "existential type" in this proposal as meaning "associated type."
Intro
Previously I proposed adding some form of existential types to C#. In that proposal, I describe existential types at a high level and describe how some features could be added to C#. However, I didn't propose any particular implementation strategy and left many questions open, including the mechanism for enforcing type safety.
In this proposal I will describe a full implementation and type safety mechanism, as well as a metadata representation.
To start, the syntactic presentation. I propose that existential types be represented as abstract type members of interfaces. For example,
The existential type is substituted when the interface is implemented, e.g.
This syntax is similar to other languages with existential types and provides a clear separation from existing type parameters on types.
This syntax also emphasizes the differences described in the earlier proposal: unlike regular type parameters, which are provided by the creator of a type, existential types are provided by the implementer of the interface and are invisible at type creation.
This does raise the question left open in the previous proposal: how to define type equality. Since each interface implementation may have a unique subsitution for the existential type, type equality depends on the exact type of the implementation. Notably, the interface itself is not an implementation, so the following would not type check:
Worse, the type of
x
is difficult to express in the language, as-is. It is in some sense a type parameter, but there isn't a named type parameter in scope to use to refer to it. Inside the interface we call itIface.Ext
, but this is not actually a type, it is a type parameter. The actual type is whatever was substituted by the implementation. In the case of our exampleC
above, the type isint
.However, we can improve the power of the feature using a different feature: existing C# generics. If we avoid using the type parameter as a type, and instead use it as a constraint, things get simpler:
With this usage, we can be confident that the implementations will produce "compatible" types. This leads to the following proposal: interfaces with type members should only be usable as generic constraints. With this restriction, we can treat type members as relatively standard C# types, usable in the places where type parameters would be permitted. That this is type safe may not be obvious, but the proposed reduction to existing .NET metadata will verify that the resulting code is type safe.
Motivating example
The examples above demonstrate simple usages, but don't give an example of practical advantages. One opportunity is improved optimizations. Consider LINQ. As Jared Parsons described in a blog post, two of the biggest weaknesses of
IEnumerable<T>
are the repeated interface dispatches, and the abstraction of the enumerator type. As he describes, we could improve the pattern using generics:One big problem with this pattern is it makes the enumerator into either an additional type parameter which needs to be manually propagated, or public surface area. This is a job much better left to the compiler. This is how it could be written with existential types:
The enumerator type is now appropriately elided for everyone except the implementor. A user might write
This is more verbose than not using generics, but that is a more general concern about verbosity of generics.
And on the implementor side, it would look like this:
This is much the same code that Jared wrote, and should provide the same performance benefits.
Compilation
In the previous proposal, I described a lowering strategy based on logic theorems around existential and universal type equivalence. This technique is powerful and flexible, but much more complicated and difficult to implement. The above design has substantial limitations on how existential types can be used, therefore the implementation can be much simpler. If these limitations prove too onerous in the future, some restrictions may be loosened with a more complex compilation strategy.
The proposed compilation strategy is broadly quite simple: turn existential types into hidden generic parameters. This may seem extreme, but note that the language rule for type members requires the interface which contains them to only be used as constraints. This means that compilation may add generic parameters, but it will never make a method generic which wasn't before, and the type parameters will not spread past the introduction of the constraint.
Here's a simple example of the code before and after.
Before:
After:
The important transformations are:
in the interface implementation.
with type members, all type members must be added to the parameter list.
inferred.
Most of the transformations are simple, but the synthesizing and inferring of type parameters may be worth some elaboration.
First, we need to establish what is necessary for inference. To do so, we need to determine which synthesized type parameters "belong" to which type parameter. This should be doable using synthesized attributes or modreqs to point to the "owning" parameter's index.
Once we know the owning parameters and the synthesized parameters, we can temporarily remove the synthesized parameters and perform type inference as currently specified in C#, or use the manual substitutions. Once the substitution is identified, we can identify the substitutions for the synthesized parameters. First, identify the needed interface by walking the constraint list in order. Next, determine the substitutions in the interface implementation. As currently proposed, the syntax only allows for a single implementation of a given interface for a given type. By searching types from most to least derived for the first implementation of the target interface, we can identify the substitutions on the argument. By matching the substitutions of the argument with the synthesized type parameters, synthesized type arguments can be generated.
The process above can be repeated for all type definitions and substitutions.
Consider also banning re-implementation of the same interface across inheritance. It's not a type safety violation (and therefore shouldn't need a runtime check), but it could lead to confusing behavior on which implementation is chosen, especially since it is always fully inferred.
Conclusion
Advantages
Drawbacks
Design Meetings
https://github.com/dotnet/csharplang/blob/main/meetings/2022/LDM-2022-02-16.md#practical-existential-types-for-interfaces
The text was updated successfully, but these errors were encountered: