[3.x] Faster queue free #62444
Looks good to me.
I believe the overflow when calculating child_list_id is destined to happen as long as the numbers are large enough. The bit operation removed quite a lot of possibilities, and it now requires a huge number of sibling nodes to overflow. It's relatively safe even when it overflows (not undefined behavior), so I think this is okay :)
Made me wonder what the maximum number of children might be! Not sure this is correct, but 😀 the group can be a maximum of 536,870,911 (29 bits), so I think we'd need 2,147,483,647 - 536,870,911 children to overflow. That's enough for just over one and a half billion children of a node. (It could happen!)
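The arithmetic in the comment above can be checked in a couple of lines. This is only a sketch of the estimate as stated there: it assumes child_list_id is a signed 32-bit value and that the group part can occupy up to 29 bits, both taken from the comment rather than from the engine source. The constant names are made up for illustration.

```python
# Assumptions taken from the comment, not verified against the engine source:
# a signed 32-bit id, with up to 29 bits used by the group part.
INT32_MAX = 2**31 - 1   # 2,147,483,647
GROUP_MAX = 2**29 - 1   # 536,870,911 (29 bits all set)

# Room left for the child index before the sum would exceed INT32_MAX.
headroom = INT32_MAX - GROUP_MAX
print(f"{headroom:,}")  # prints 1,610,612,736
```

So the estimate of roughly one and a half billion children holds under those assumptions; the exact figure is about 1.6 billion.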
I'm wondering if we should consider this a compat-breaking change, since it can affect observable behavior (e.g., user scripts handling
Also, I'm wondering whether it would be better if, instead of sorting only when about to mass-delete, the delete queue were backed by an already sorted container, such as a hash set.
Sure I can put this in. I would strongly suggest that any sort of guarantee of order should be discouraged - I'll add this in the classref for
I'm not super familiar with hash set - what advantage are you thinking of for a hash set over sorting? I'm assuming the standard Godot sort is something like qsort (i.e. in place and not using dynamic allocation), but I haven't looked in detail.
Since we're speculating here, I prefer going with the "ask for forgiveness" approach. Change the behavior without opt-out, see if anyone complains, and consider adding an option for those users if they exist. Otherwise we're just adding configuration options for things which may not be needed. Betas don't have to be production ready, so if this behavior change is a showstopper for some users, they should report it after testing the beta and stick to using 3.5.x until their feedback is addressed in later 3.6 betas.
I actually agree here... and I'd just added the project setting lol. I'll change it back, but keep a comment in the EDIT: Done.
Fair enough.
To be honest, it's hard to tell for sure without benchmarking, but my idea with a sorted container (it'd be
+1 for this. I am developing a hex turn-based strategy game with thousands upon thousands of tiles and need a fast
We discussed this yesterday with @reduz and @lawnjelly.
Thanks!
Nice!
Calling queue_free() for large numbers of siblings could previously be very slow, with the time taken rising much faster than linearly with the number of children (roughly quadratically in the measurements below). This looked partly due to ordered_remove from the child list and notifications. This PR identifies objects that are nodes, and sorts the deletion queue so that children are deleted in reverse child order. This minimizes the costs of reordering.
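To see why the deletion order matters, here is a toy model rather than engine code. It assumes the child list behaves like a plain array, where removing index i moves the n - 1 - i elements after it, and it ignores notifications entirely. The function name `elements_shifted` is hypothetical.

```python
def elements_shifted(n_children: int, reverse: bool) -> int:
    """Count elements moved while removing every child from an array-backed list.

    Toy model (an assumption, not the engine's implementation): removing
    index i from a list of length n moves the n - 1 - i elements after it.
    """
    children = list(range(n_children))
    shifted = 0
    while children:
        # Forward order always removes the first child, reverse order the last.
        i = len(children) - 1 if reverse else 0
        shifted += len(children) - 1 - i
        children.pop(i)
    return shifted


print(elements_shifted(1000, reverse=False))  # prints 499500
print(elements_shifted(1000, reverse=True))   # prints 0
```

In this model, front-to-back removal moves n(n-1)/2 elements in total, while back-to-front removal moves none, which matches the idea of deleting children in reverse child order.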
Fixes #61929
Supersedes #61932
Performance testing (number of child nodes deleted):
50000 nodes: before 137017 ms, after 422 ms (325x faster)
20000 nodes: before 19425 ms, after 250 ms (78x faster)
5000 nodes: before 1228 ms, after 169 ms (7.3x faster)
1000 nodes: before 194 ms, after 150 ms (1.3x faster)
(At lower node counts the measurement from the MRP in #61929 becomes less useful, as there are some fixed delays. However, the pattern is that the benefit really starts to show with a decent number of children.)
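As a rough way to read the "before" column, the growth rate can be estimated from pairs of measurements. The sketch below assumes time scales as n**k and solves for k. It uses only the figures listed above and leaves out the 1000-node row, since that one is dominated by the fixed delays mentioned. The helper name `growth_exponent` is made up for illustration.

```python
import math

# (child count, time before this PR in ms), copied from the figures above.
before_ms = [(5000, 1228), (20000, 19425), (50000, 137017)]


def growth_exponent(a, b):
    """Estimate k in time ~ n**k from two (n, time) measurements."""
    (n1, t1), (n2, t2) = a, b
    return math.log(t2 / t1) / math.log(n2 / n1)


# One estimate per pair of neighbouring measurements.
exponents = [growth_exponent(a, b) for a, b in zip(before_ms, before_ms[1:])]
print([round(k, 2) for k in exponents])
```

Both estimates come out close to 2, so the old behavior looks approximately quadratic in the number of children, at least over this range.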
Notes
[...] data->pos, and sorting by child id is a superior approach. It achieves the same upside of deleting in reverse order, but eliminates the possible downside of deleting small numbers of children from a parent that has a large number of children.
[...] NOTIFICATION_MOVED_IN_PARENT. While this increased performance over baseline, it wasn't nearly as effective as the reverse ordering in this PR.
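The idea of sorting the queue by child id can be sketched as follows. This is an illustration only: `QueuedNode`, `parent_id`, `child_index` and `sort_for_deletion` are hypothetical names, and the engine's actual key (the child_list_id discussed in the review) is built differently.

```python
from typing import NamedTuple


class QueuedNode(NamedTuple):
    # Hypothetical record for illustration; not the engine's data layout.
    parent_id: int
    child_index: int


def sort_for_deletion(queue):
    """Group entries by parent, with each parent's highest child index first."""
    return sorted(queue, key=lambda q: (q.parent_id, -q.child_index))


# Three children of parent 1 and one child of parent 2, queued in arbitrary order.
queue = [QueuedNode(1, 2), QueuedNode(2, 0), QueuedNode(1, 7), QueuedNode(1, 5)]
ordered = sort_for_deletion(queue)
print(ordered)
```

With this ordering, removing index 7 first leaves indices 5 and 2 unchanged, so each later removal only disturbs siblings that sit after it, even when only a few children of a large parent are queued.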