Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make it possible to omit memory allocation for closures #18

Open
xzxzxc opened this issue Jun 17, 2024 · 4 comments
Open

Make it possible to omit memory allocation for closures #18

xzxzxc opened this issue Jun 17, 2024 · 4 comments

Comments

@xzxzxc
Copy link

xzxzxc commented Jun 17, 2024

Problem

Consider the following code

using Cathei.LinqGen;
 
int[] array = new int[] { 1, 2, 3, 4, 5 };
int upperLimit = 3;

int result = array.Gen()
                  .Where(x => x < upperLimit) // <- closure, memory allocation
                  .Sum();

A new instance of a compiler-generated class will be created each time this code executes to store closure values (upperLimit in this case). We could omit an allocation if we pass the upperLimit variable to the predicate. For example, GetOrAdd method of a concurrent dictionary has a factoryArgument parameter to achieve the same improvement.

Proposed solution

Add an overload for the Gen method with an argument representing the state needed for delegates. Then, the previous code can be replaced with the

using Cathei.LinqGen;
 
int[] array = new int[] { 1, 2, 3, 4, 5 };
int upperLimit = 3;

int result = array.Gen(upperLimit)
                  .Where(static (x, upperLimit) => x < upperLimit) // <-  no closure, delegate will be cached
                  .Sum();
@cathei
Copy link
Owner

cathei commented Jun 17, 2024

Thank you for your proposal. This is interesting idea- I have two concerns about it.

  1. If we put this argument in Gen, it may should effect all the subsequent Linq query, and that could be complex when there are multiple object to pass. We might end up with verbose query like:
array.Gen(context1, context2)
   .Where(static (x, c1, _) => x < c1)
   .Select(static (x, _, c2) => x + c2)

Ofc, this can be bypassed if we pass context to each query instead.

array.Gen()
   .Where(context1, static (x, c1) => x < c1)
   .Select(context2, static (x, c2) => x + c2)
  1. This could be done with current IStructFunction support, and faster.
struct LimitPredicate : IStructFunction<int, bool>
{
    private int _limit;
    public Predicate(int limit) => _limit = limit;
    public bool Invoke(int value) => value < _limit;
}

int result = array.Gen()
                  .Where(new LimitPredicate(upperLimit))
                  .Sum();

This way, we can pass context and enable inlining via value type generic specialization.

@xzxzxc
Copy link
Author

xzxzxc commented Jun 17, 2024

Thank you for considering my idea! I have a few comments about the concerns.

  1. We can use value tuples with named items to simplify delegates' code. Then, the example you provided can be changed to
array.Gen((context1, context2))
   .Where(static (x, tuple) => x < tuple.context1)
   .Select(static (x, tuple) => x + tuple.context2)

Also, it would be nice to have an overload accepting a delegate without the second argument:

array.Gen((context1, context2))
   .Where(static (x, tuple) => x < tuple.context1)
   .Select(static (x, tuple) => x + tuple.context2)
   .Select(static x => x + 5)
  1. IStructFunction is excellent for performance but requires more code. Also, LINQ has a well-known API, and adding something like LimitPredicate can confuse the uninitiated reader.

@xzxzxc
Copy link
Author

xzxzxc commented Jun 17, 2024

After I wrote the previous comment, I understood that there is another concern - we already have an overload of the Select method which accepts a delegate with two arguments - Select<TSource,TResult>(IEnumerable, Func<TSource,Int32,TResult>. And I'm not sure what is the best way to resolve this collision. Perhaps it's to have the following overloads:

  • Select<TSource,TResult>(IEnumerable<TSource>, Func<TSource,TResult>)
  • Select<TSource,TContext,TResult>(IEnumerable<TSource>, Func<TSource,TContext,TResult>)
  • Select<TSource,TContext,TResult>(IEnumerable<TSource>, Func<TSource,TContext,int,TResult>)

I assume that if you pass the "context argument" to the Gen, you would need to use an index less often than the "context".

@timcassell
Copy link

I implemented similar functionality in my async linq. You can pass a capture value to each query. Basically you just pass an extra value right before the delegate, and include that value as the first argument to the delegate. I think you could do the same here. It should be very easy to translate it to IStructFunction under the hood.

array.Gen()
   .Where(context1, static (c1, x) => x < c1)
   .Select(context2, static (c2, x) => x + c2)
   .Select(static x => x + 5)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants