-
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize Channel.select_impl
#12756
Comments
This could be a patch for this, but I'd like to get it a bit more concise and it's not sufficiently tested. The spec suite has only one example with duplicates and removing the uniqueness filter doesn't break it. --- i/src/channel.cr
+++ w/src/channel.cr
@@ -458,12 +458,30 @@ class Channel(T)
end
private def self.select_impl(ops : Indexable(SelectAction), non_blocking)
+ # Avoid heap allocation for typical small ops sizes (< 10 actions)
+ ops_buffer = uninitialized StaticArray(typeof(ops.each{|o| break o}.not_nil!), 10)
+ if ops.size <= ops_buffer.size
+ ops.each_with_index do |op, i|
+ ops_buffer.to_unsafe[i] = op
+ end
+ ops_locks = ops_buffer.to_slice[0, ops.size]
+ else
+ ops_locks = ops.to_a
+ end
+
# Sort the operations by the channel they contain
# This is to avoid deadlocks between concurrent `select` calls
- ops_locks = ops
- .to_a
- .uniq!(&.lock_object_id)
- .sort_by!(&.lock_object_id)
+ ops_locks.unstable_sort_by!(&.lock_object_id)
+ duplicates = 0
+ ops_locks.each_with_index do |lock, i|
+ next if i == 0
+ if lock.lock_object_id == ops_locks[i - duplicates - 1].lock_object_id
+ duplicates += 1
+ else
+ ops_locks[i - duplicates] = lock
+ end
+ end
+ ops_locks = ops_locks[0, ops_locks.size - duplicates]
ops_locks.each &.lock We need two separate features for implementing this which I think could be extracted because they are useful on their own:
|
I don't think this is the way to go. The compiler rewrites a |
A bigger refactoring is suggested in #12694 (comment)
|
It shouldn't take much to switch the compiler to emit On its own, this doesn't change anything though. In order to improve performance we have to avoid heap allocations in This array is basically an ordered list of distinct lock objects. It prevents deadlocks from trying to lock the same lock twice (uniqueness) as well as deadlocks between concurrent calls (deduplication). Lines 423 to 459 in d40b44b
The uniqueness can easily be avoided when the list is sorted anyways. Identical lock_ids are ordered in sequence and can be skipped on repeat. If the sorting would apply to the original slice, it disrupts the order of actions so that their index wouldn't correlate to the index of the |
We could take inspiration from the I believe this could be a good solution. It's comparatively trivial to implement (compared to the current complexity of chosing the appropriate data type) and doesn't need a compiler change. |
Channel.select_impl
allocates an array for all action locks, even if there is only a single one (#12694 (comment)). This can be avoided.Most typically it should only be a handful of actions (< 10 or), so we could consider using StaticArray for stack allocation.
The text was updated successfully, but these errors were encountered: