-
-
Notifications
You must be signed in to change notification settings - Fork 21.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Godot Physics solver optimization #47846
Conversation
Several optimizations in the way solver islands are processed in both 2D and 3D physics: - Use LocalVector instead of linked list to avoid cache misses (with persistent storage based on worst case scenario) - Remove pairs when setup fails (no valid contact) to avoid unnecessary solving of non-colliding rigid bodies just to return immediately
d16d353
to
b65d6b5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code seems good, just few comments.
|
||
for (const List<Pair<Constraint2DSW *, int>>::Element *E = p_body->get_constraint_list().front(); E; E = E->next()) { | ||
// Faster with reversed iterations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting, if you can add why it's faster, it would be useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The truth is I'm not sure exactly why, but I could see a clear difference in all the tests I've run :D
I'll check a bit more if I can figure it out, otherwise I'll leave a note with a TODO to investigate that more later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm that's even more interesting then. Most likely this improvement is just related to this specific project, so if you change project setup the forward may be faster. I think that this highlight a possible optimization: If you can understand what's the determining factor, you could consider to add a sorting mechanism (somehow) so you can apply this optimization to any project.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that makes sense. I agree it would be interesting to investigate that specifically with different use cases. For now I'm keeping a reversed order, because it's closer to what happened before this PR, due to how islands where fed in a reversed order (each new pair was set as the new root, and the previous root was set as the next pair).
} | ||
LocalVector<Constraint2DSW *> &constraint_island = constraint_islands[island_count - 1]; | ||
constraint_island.clear(); | ||
constraint_island.reserve(ISLAND_SIZE_RESERVE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clear doesn't de-allocate the memory, like reset
does, probably you can avoid call reserve
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still using reserve
here because of the case below where an island is created for just one area and this island would then be re-used here, to make sure it allocates for enough constraints.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, make sense, another thing you could consider is pre-initialize those vectors. I'm saying this because the first time I saw such call there, it felt strange. However, it's not really important.
void _solve_island(Constraint3DSW *p_island, int p_iterations, real_t p_delta); | ||
void _check_suspend(Body3DSW *p_island, real_t p_delta); | ||
LocalVector<LocalVector<Body3DSW *>> body_islands; | ||
LocalVector<LocalVector<Constraint3DSW *>> constraint_islands; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The local vector doesn't deallocate the memory when you remove an object, so it works really well when you use it a lot.
When you copy a vector, you lost any pre-allocated memory: https://github.com/godotengine/godot/blob/master/core/templates/local_vector.h#L238-L251
By calling constraint_islands.resize(island_count);
, you may lose the pre-allocated memory for the sub vectors. Each time you add a new object to the isle, since the vector is ""new"", it may need to re-allocate the memory. If you can avoid that, it would be better.
Have you considered to create a LocalVector
pool, where you reuse the vectors instead? A naive solution may be use new
:
- LocalVector<LocalVector<>>
+ LocalVector<LocalVector<> *>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using a LocalVector
pool is a great idea, I'll see if I can update to add that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After testing this again, it doesn't look like any copy happens in the situation where constraint_islands.resize(island_count);
has to re-allocate. It just does a memrealloc
and the sub vectors are still keeping their capacity intact:
godot/core/templates/local_vector.h
Lines 141 to 150 in b4b7c97
if (unlikely(p_size > capacity)) { | |
if (capacity == 0) { | |
capacity = 1; | |
} | |
while (capacity < p_size) { | |
capacity <<= 1; | |
} | |
data = (T *)memrealloc(data, capacity * sizeof(T)); | |
CRASH_COND_MSG(!data, "Out of memory"); | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's correct for the vectors that are not removed, but in that function we have:
https://github.com/godotengine/godot/blob/master/core/templates/local_vector.h#L136
for (U i = p_size; i < count; i++) {
data[i].~T();
}
and
https://github.com/godotengine/godot/blob/master/core/templates/local_vector.h#L153
for (U i = count; i < p_size; i++) {
memnew_placement(&data[i], T);
}
That take care to remove and allocate the new sub vectors, in this case.
This mean that if in frame 1 you have 10 islands, the following frame it drops to 5, then it reach 10 again basically you have to reallocate 5 vectors again. Which is a waste.
If, it doesn't take too much time, you could research a way to avoid that: maybe by never resizing the main vector, but using an external variable to track the "active" isles.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yeah, it shouldn't be too hard to only grow the vector's size and keep track of the active islands as you suggested. I'll test in a scenario with many islands to see if it helps when I have a chance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AndreaCatania Just as a follow-up, I've checked this scenario, and I didn't remember but the logic is already that it only grows. Basically the main vector can only resize following this logic:
if (constraint_islands.size() < island_count) {
constraint_islands.resize(island_count);
}
Which will prevent any previous sub-vectors from being deallocated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the code looks good, and my feedbacks are not blocking since could even be implemented in separate PRs or not at all.
Good job!
Thanks! |
Several optimizations in the way solver islands are processed in both 2D and 3D physics:
LocalVector
instead of linked list to avoid cache misses (with persistent storage based on worst case scenario)Tested in:
https://github.com/godotengine/godot-demo-projects/blob/master/3d/physics_tests/tests/performance/test_perf_contacts.tscn
Performance improvement: 5-10% of the physics step time with 500 bodies.