[API Proposal]: Add IList<T> extension methods providing in-place update operations #76375
Comments
Tagging subscribers to this area: @dotnet/area-system-collections
Issue Details
Background and motivation
A possible alternative to the above would be to roll your own extension method, but it would be even better if we shipped extension methods like that out of the box.
API Proposal
namespace System.Collections.Generic;
public static partial class CollectionExtensions
{
    public static void RemoveAll<T>(this IList<T> list, Predicate<T> predicate);
    public static void RemoveRange<T>(this IList<T> list, int index, int count);
    public static void Reverse<T>(this IList<T> list);
    // List.Sort is not stable -- should this extension be stable?
    public static void Sort<T>(this IList<T> list);
    public static void Sort<T>(this IList<T> list, Comparison<T> comparison);
    public static void Sort<T>(this IList<T> list, IComparer<T> comparer);
    public static void Sort<T>(this IList<T> list, int index, int count, IComparer<T> comparer);
    // TOCONSIDER BinarySearch extension methods?
}
API Usage
Removing properties of a given type in System.Text.Json can now be expressed as follows:
public void ModifyTypeInfo(JsonTypeInfo ti)
{
    if (ti.Kind != JsonTypeInfoKind.Object)
        return;
    ti.Properties.RemoveAll(prop => _ignoredTypes.Contains(prop.PropertyType));
}
Alternative Designs
Risks
No response
|
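For readers wondering what "rolling your own" looks like, here is a minimal sketch of a RemoveAll-style helper over IList<T>. The class name ListSketchExtensions and the method name RemoveAllInPlace are hypothetical (chosen so they cannot clash with List<T>.RemoveAll), and returning the removed count is an assumption borrowed from List<T>.RemoveAll rather than from the proposal, which returns void.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper type for illustration; not part of the BCL or of the proposal.
public static class ListSketchExtensions
{
    // Removes every element matching the predicate and returns how many were removed.
    public static int RemoveAllInPlace<T>(this IList<T> list, Predicate<T> match)
    {
        if (list is null) throw new ArgumentNullException(nameof(list));
        if (match is null) throw new ArgumentNullException(nameof(match));

        // Fast path: List<T> already ships an optimized RemoveAll.
        if (list is List<T> concrete) return concrete.RemoveAll(match);

        // Compact the surviving elements towards the front of the list.
        int write = 0;
        for (int read = 0; read < list.Count; read++)
        {
            T item = list[read];
            if (match(item)) continue;
            if (write != read) list[write] = item;
            write++;
        }

        // Trim the leftover tail from the end, so no RemoveAt has to shift elements.
        int removed = list.Count - write;
        for (int i = list.Count - 1; i >= write; i--)
            list.RemoveAt(i);
        return removed;
    }
}
```

Note that fixed-size lists such as arrays will throw NotSupportedException from RemoveAt, and collections that raise change notifications will see one event per indexer write.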
If we add such extension methods, they should have the same semantics as
IList<T> list = new List<T>() { ... };
list.Sort(...);
to:
List<T> list = new List<T>() { ... };
list.Sort(...);
is a very subtle and unexpected source of bugs. If we update |
Assuming it doesn't regress perf, we could do that. To my knowledge, though, there is no stable in-place sorting implementation available. |
And if not, we might consider using a different name, e.g. |
There's been a long discussion of this in #68436. |
How are the Sort methods expected to be implemented? How efficient is the implementation going to be, and is it going to require yet another copy of the whole Sort algorithm? In-place sorting of IList will require a virtual method call for each element get and set. I expect that it is often going to be faster to convert the IList to an array first and then sort the array. |
Possibly. We might consider using an implementation that sorts using a temporary buffer, i.e.
public static void Sort<T>(this IList<T> list, IComparer<T>? comparer = null)
{
    // Copy the elements into a temporary buffer and sort the buffer.
    T[] array = new T[list.Count];
    list.CopyTo(array, 0);
    Array.Sort(array, comparer);
    // Write the sorted elements back in place. The list must not be cleared first:
    // assigning through the indexer of an emptied list would throw.
    for (int i = 0; i < array.Length; i++)
        list[i] = array[i];
}
This might reduce the number of virtual calls, but does more allocations and copying. Benchmarking that vs. a bespoke |
... for that matter, checking if it is an actual |
Checking for List/Array would definitely be added to the implementation if we do follow that approach. |
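A rough sketch of how the type checks mentioned above could combine with the temporary-buffer fallback. The names SortSketchExtensions and SortInPlace are hypothetical, and whether the real implementation would take this shape is an assumption; the thread only says that checks for List/Array would be added.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical helper type for illustration; not part of the BCL or of the proposal.
public static class SortSketchExtensions
{
    public static void SortInPlace<T>(this IList<T> list, IComparer<T>? comparer = null)
    {
        if (list is null) throw new ArgumentNullException(nameof(list));

        // Fast paths: defer to the existing optimized sorts when the concrete type is known.
        switch (list)
        {
            case List<T> concrete:
                concrete.Sort(comparer);
                return;
            case T[] array:
                Array.Sort(array, comparer);
                return;
        }

        // Fallback: one bulk copy out, an array sort, then one indexer write per element.
        T[] buffer = new T[list.Count];
        list.CopyTo(buffer, 0);
        Array.Sort(buffer, comparer);
        for (int i = 0; i < buffer.Length; i++)
            list[i] = buffer[i];
    }
}
```

The fallback inherits the semantics of Array.Sort, so it is not a stable sort either.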
namespace System.Collections.Generic;
public static partial class CollectionExtensions
{
    public static void InsertRange<T>(this IList<T> list, int index, IEnumerable<T> collection);
    public static void RemoveAll<T>(this IList<T> list, Predicate<T> predicate);
    public static void RemoveRange<T>(this IList<T> list, int index, int count);
    public static void Reverse<T>(this IList<T> list);
    public static void Sort<T>(this IList<T> list);
    public static void Sort<T>(this IList<T> list, Comparison<T> comparison);
    public static void Sort<T>(this IList<T> list, IComparer<T> comparer);
    public static void Sort<T>(this IList<T> list, int index, int count, IComparer<T> comparer);
} |
One thing that came to mind as I was watching the API review: Introducing At least, I think that could happen, please correct me if I'm wrong. |
Yes, although that's the case pretty much any time we add any instance method or extension method. For instance methods, if someone else had named an extension method the same thing, the newly-added instance method would start winning. And for extension methods, if someone else also has an extension method with the same shape, it would introduce ambiguities that break compilation. |
Possibly they could live in a different namespace to avoid this problem (e.g. System.Collections.Generic.Extensions) |
Default Interface Methods... basically adding virtual methods to an interface. |
Collection capabilities could be added with interfaces such as:
public interface ISortable<T> //No base type
{
    void Sort(...);
}
And Conceptual sample implementation:
public static void Sort<T>(this IList<T> list, ...)
{
    if (list is ISortable<T> sortable) sortable.Sort(...);
    else SortIList(list); //Generalized algorithm for any IList<T>
}
This idea could be applied to other algorithms such as binary search as well. Others that come to mind: This has some overheads but it results in reasonable behavior and performance in most cases. It is hard for users to really go wrong with this. Invocations on small data sets will feel some overhead, but performance at scale will be near optimal both asymptotically as well as practically. This should also play nice with overload resolution. The compiler will automatically prefer ways of calling that have less indirection in them. It also should be a highly compatible design. It's easy to upgrade types and add new algorithms. |
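To make the capability-interface idea concrete, here is one possible compilable version of the conceptual sample. ISortable<T> is the hypothetical interface from the comment above (no such type exists in the BCL), its single-comparer signature is an assumption filling in the elided parameters, and the insertion-sort fallback is merely a stand-in for "a generalized algorithm for any IList<T>".

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical capability interface; nothing like this exists in the BCL today.
public interface ISortable<T>
{
    void Sort(IComparer<T>? comparer);
}

// Hypothetical helper type for illustration.
public static class CapabilitySketch
{
    public static void SortViaCapability<T>(this IList<T> list, IComparer<T>? comparer = null)
    {
        if (list is null) throw new ArgumentNullException(nameof(list));

        // Prefer the collection's own algorithm when it advertises the capability.
        if (list is ISortable<T> sortable)
        {
            sortable.Sort(comparer);
            return;
        }

        // Generalized fallback: insertion sort through the indexer.
        // It is stable but O(n^2), so it only illustrates the dispatch shape.
        comparer ??= Comparer<T>.Default;
        for (int i = 1; i < list.Count; i++)
        {
            T item = list[i];
            int j = i - 1;
            while (j >= 0 && comparer.Compare(list[j], item) > 0)
            {
                list[j + 1] = list[j];
                j--;
            }
            list[j + 1] = item;
        }
    }
}
```

The cost of this design is one type check per call; a collection opts into a faster path simply by implementing the interface.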
* I would love if the implementation attempted to cast to |
@CyrusNajmabadi, @cston, if we add an AddRange extension method on |
When constructing a collection for a collection expression targeting Currently, the compiler is not using
class MyList<T> : IList<T> { ... }
IEnumerable<int> x = [1, 2, 3];
MyList<int> y = [..x]; // y = new MyList<int>(); ((IList<int>)y).AddRange(x); |
What is the type of the underlying instance? According to our guidelines, we don't expect libraries to expose their data as |
@stephentoub @cston why does it matter if |
@terrajobst, it's about populating the destination type. If you have:
class Foo : IList<T>
{
    public Foo() { }
    public void Add(T item) { ... }
    ... // IList implementation
}
and someone writes
IEnumerable<T> data = ...;
Foo foo = [..data];
does the C# compiler generate:
Foo foo = new Foo();
foreach (T item in data) foo.Add(item);
or:
Foo foo = new Foo();
CollectionExtensions.AddRange(foo, data);
or... |
Right, so what matters is whether |
I don't understand. Here But there's also the case where the target type is an interface, e.g. this is perfectly legal: IList<T> list = [..data]; and the question there applies as well: would the compiler use an extension AddRange? Chuck's answer was that the planned answer to both questions is "yes". |
Ah, got it. That makes sense. I'd hope that the compiler would prefer an |
In most cases it's |
See these, specifically
|
Ah, yeah. That one. The biggest regret of my life :-) |
Seems to cut both ways though. Often times where we shipped a collection property as |
Pulling in @jnm2 @cston @RikkiGibson and @captainsafia for the collection-expr side of this. I have thoughts (and my initial thinking is that this is actually a good thing for collection-exprs). But i'm heads down on fire, so i'll write things out later :) |
Ok. So the topic of
// Where SomeDestCollection does *not* have the CollectionBuilderAttribute on it.
SomeDestCollection x = [x, y, z, .. w, .. v];
The default emit for this will effectively be:
SomeDestCollection __result = new();
__result.EnsureCapacity(3 + __w.Length + __v.Length); // if EnsureCapacity exists and `w/v` are statically known to expose lengths.
__result.Add(__x);
__result.Add(__y);
__result.Add(__z);
foreach (var __t in __w)
    __result.Add(__t);
foreach (var __t in __v)
    __result.Add(__t);
However, if we do find available
SomeDestCollection __result = new();
__result.EnsureCapacity(3 + __w.Length + __v.Length); // if EnsureCapacity exists and `w/v` are statically known to expose lengths.
__result.AddRange(new InlineArray3<> { __x, __y, __z }.AsSpan()); // If there's an AddRange that takes spans.
__result.AddRange(__w); // if there are AddRange's that support the instances being spread.
__result.AddRange(__v);
However, we do have some general concerns about this. Consider a real world example:
HashSet<int> hashSet = ...;
List<int> list = [1, 2, 3, .. hashSet];
With this proposal, we would now see an As such, while we're still only in the preliminary thinking stage, we're leaning toward only using these helpers when the exact types match (so we're not losing information, or the ability to enumerate without allocation), or perhaps when we know the input types, and we're calling some well known BCL AddRange method. For example, if we knew the above AddRange was fine when you passed in a |
I'd be hesitant to let the compiler call an
However, it is less IL emitted at the collection expression site, and I'm not sure if that's interesting as a code size benefit. |
Instead of extension methods, maybe these could be
They could be non-DIM on the interface, if we're afraid of DIMs on core types. It would also mean they don't show up on concrete implementations, thus alleviating the compiler concern. |
I usually define these extension methods so that they could be specialized for common types like |
I mean that's what DIMs are for 😄 I think it's really time for the DIMs. They are already used all over the place for core primitive types like |
I wouldn't add all of these overloads as DIMs though:
public static void Sort<T>(this IList<T> list);
public static void Sort<T>(this IList<T> list, Comparison<T> comparison);
public static void Sort<T>(this IList<T> list, IComparer<T> comparer);
public static void Sort<T>(this IList<T> list, int index, int count, IComparer<T> comparer);
I think only one of them should be a DIM and virtual, probably the last one. The rest should be non-virtual (sealed) and delegate to that one. |
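The "one virtual core method, sealed conveniences" shape could look roughly like the sketch below. ISortableList<T> and SketchList<T> are hypothetical types used only for illustration (IList<T> itself has no such members), the default body reuses the temporary-buffer approach from earlier in the thread, and range validation is left out for brevity. It assumes a runtime with default interface method support (.NET Core 3.0 or later, C# 8).

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

// Hypothetical interface for illustration; IList<T> itself has no such members.
public interface ISortableList<T> : IList<T>
{
    // The single virtual core method. Implementers may override it with a faster algorithm.
    void Sort(int index, int count, IComparer<T>? comparer)
    {
        T[] buffer = new T[count];
        for (int i = 0; i < count; i++) buffer[i] = this[index + i];
        Array.Sort(buffer, comparer);
        for (int i = 0; i < count; i++) this[index + i] = buffer[i];
    }

    // Non-virtual (sealed) conveniences that all delegate to the core method.
    sealed void Sort() => Sort(0, Count, null);
    sealed void Sort(IComparer<T> comparer) => Sort(0, Count, comparer);
    sealed void Sort(Comparison<T> comparison) => Sort(0, Count, Comparer<T>.Create(comparison));
}

// Hypothetical implementer that relies entirely on the default implementation.
public class SketchList<T> : Collection<T>, ISortableList<T> { }
```

One consequence worth noting: default interface members are only callable through the interface type, so a variable typed as SketchList<T> would not see Sort without a cast to ISortableList<T>.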
I don't agree that we shouldn't add the default implementation because it would be slow. It would be no slower than what people can already do today on top of the interface manually (and what they do do), or what they could possibly ever do on top of the interface. If we had a second interface like EDIT:
Or you'd add your own extension method that does the appropriate thing. At which point, we might as well have the DIM, since this would be equivalent to that. Actually, even the default DIM would be much better than people doing their own custom logic or their own extension methods like today, because they will probably implement it naively by repeatedly doing Insert or Remove from the middle when it's possible to implement it in a better way by copying and removing from the end. |
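As an illustration of the "copy and remove from the end" approach mentioned above, here is a minimal RemoveRange sketch over IList<T>. The names RemoveRangeSketch and RemoveRangeInPlace are hypothetical, and the claim that this beats repeated RemoveAt calls from the middle assumes an array-backed list, where each middle removal shifts the whole tail.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper type for illustration; not part of the BCL or of the proposal.
public static class RemoveRangeSketch
{
    public static void RemoveRangeInPlace<T>(this IList<T> list, int index, int count)
    {
        if (list is null) throw new ArgumentNullException(nameof(list));
        if (index < 0) throw new ArgumentOutOfRangeException(nameof(index));
        if (count < 0) throw new ArgumentOutOfRangeException(nameof(count));
        if (list.Count - index < count) throw new ArgumentException("Range exceeds the list bounds.");

        // Shift the tail down over the removed range, one indexer write per tail element.
        for (int i = index + count; i < list.Count; i++)
            list[i - count] = list[i];

        // Drop the now-duplicated tail from the end, where removal does not shift anything.
        for (int i = 0; i < count; i++)
            list.RemoveAt(list.Count - 1);
    }
}
```

For an array-backed list this does work proportional to the tail length plus count, instead of shifting the tail once per removed element.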
Maybe it's worth adding more methods:
public void Move(int oldIndex, int newIndex);
public void MoveRange(int fromIndex, int toIndex, int count); |
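A possible shape for the suggested Move, sketched as an extension method. The names MoveSketch and MoveInPlace are hypothetical, and the semantics (equivalent to removing the item at oldIndex and inserting it at newIndex) are an assumption modelled on ObservableCollection<T>.Move. MoveRange is left out because the comment does not pin down its semantics.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper type for illustration; not part of the BCL or of the proposal.
public static class MoveSketch
{
    // Moves one element, shifting the elements in between by a single position.
    public static void MoveInPlace<T>(this IList<T> list, int oldIndex, int newIndex)
    {
        if (list is null) throw new ArgumentNullException(nameof(list));
        if (oldIndex < 0 || oldIndex >= list.Count) throw new ArgumentOutOfRangeException(nameof(oldIndex));
        if (newIndex < 0 || newIndex >= list.Count) throw new ArgumentOutOfRangeException(nameof(newIndex));
        if (oldIndex == newIndex) return;

        T item = list[oldIndex];
        if (oldIndex < newIndex)
        {
            // Moving towards the end: pull the in-between elements one slot forward.
            for (int i = oldIndex; i < newIndex; i++)
                list[i] = list[i + 1];
        }
        else
        {
            // Moving towards the start: push the in-between elements one slot back.
            for (int i = oldIndex; i > newIndex; i--)
                list[i] = list[i - 1];
        }
        list[newIndex] = item;
    }
}
```

Because it only uses the indexer, this also works on fixed-size lists such as arrays, which a RemoveAt plus Insert implementation would not.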
@terrajobst is there more context on why this specific guideline was created? I've seen it before but I'll be honest in saying I never follow it, instead preferring to return either I'm aware there is some overhead with the fact it's an interface (with the virtual calls and such), but does the advice still hold as strongly as it did way back when it was defined at the "beginnings" of .NET, considering all the advancements made to performance? Were there more concerns other than performance itself?
I noticed nobody commented on this proposal from @GSPP . It was one of the first things that came to my mind when reading this proposal too. Is this not a good approach for some reason? Maybe there would need to be a distinction between "sortable in-place" vs the standard concept of being sortable which is achievable just by implementing |
Background and motivation
The List<T> type provides a number of methods providing bulk in-place updates, such as RemoveAll, RemoveRange, Reverse and Sort. The same cannot be said about IList<T>, where common in-place operations are cumbersome to achieve. Consider for instance the following System.Text.Json example used in our own docs repo: https://github.com/dotnet/docs/blob/6c6942bcb4594dc063db293e5fff33c6bea0b331/docs/standard/serialization/system-text-json/snippets/custom-contracts/IgnoreType.cs#L26-L38
A possible alternative to the above would be to roll your own extension method, but it would be even better if we shipped extension methods like that out of the box.
API Proposal
API Usage
Removing properties of a given type in System.Text.Json can now be expressed as follows:
Alternative Designs
Should the Sort implementation be stable? Or match List/Array.Sort semantics?
BinarySearch extension methods?
Risks
No response
Related to dotnet/core#2199