-
Notifications
You must be signed in to change notification settings - Fork 414
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add new batch overload (bucket factory) #833
Conversation
var newArray = bucketFactory(count); | ||
Array.Copy(bucket, 0, newArray, 0, count); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does the same as Array.Resize(ref bucket, count)
but manually to use bucketFactory
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can certainly understand the argument for reducing memory pressure when working with very large batch sizes and thus arrays, but I think the proposed solution of simply adding an array factory argument to Batch
is very tricky to get right for the caller. In the simples of cases, it will lead to incorrect usage and bugs because it requires side effects (like lastSize
in your example and test), additional work like clipping each bucket (because the actual array length may be greater than the requested bucket size, especially when renting from a pool) and finally returning each array to the pool at the right time for the reuse to be effective through the iterations. You can see that even in the example you posted in your opening comment, only the last bucket is returned to the pool whereas the rest leak. All this also make the composing with Batch
very hard.
I think a sequence of buckets is probably the wrong return type for such an overload because it claims that each bucket is independent of another when that's not true, especially if the backing array is pooled. Each bucket will end up looking at the same array. This can cause even more severe logical bugs.
I've experimented with what a design could look for a version of Batch
using pooled arrays and will share it through another PR. Unfortunately, it's not a simple set of changes or tweaks so it's best to share, discuss and review the code in a PR.
@leandromoh See #856 and review welcome.
I think I managed to workaround this to still return a sequence. |
This is a squashed merge of PR #856 that supersedes #833. Co-authored-by: Stuart Turner <[email protected]>
This has been superseded by #856. @leandromoh Happy to get your review in future follow-up issue or PR when you get the time. |
It is useful to have an overload where user can pass the bucket factory. Pratical example:
In both cases are desirable to have the option to custome Batch behaviour, expecially when dealing with large arrays (e.g. +100k elements) where allocate a new array for every new batch is expensive.
Once added this new overload, users are able to create their customizated
Batch
version in their own applications, like the bellow for example.