-
-
Notifications
You must be signed in to change notification settings - Fork 158
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change/Expand Rect.collideobjectsall
for optimization
#2787
Comments
I would be okay with this, as it wouldn't break existing code. Callables should continue to be handled the way they are, we are just adding an extra string handle path. |
I'm skeptical of adding API surface like this for optimization. It's more branches, more code to test, more to maintain, and it provides no benefit to existing users. Especially since this is a niche function. Plus, I've found a far easier solution-- Rewrite it in Python. from pygame import Rect
import random
import time
random.seed(36)
use_python = False
class Obj:
def __init__(self, x, y, w, h):
self.xa = Rect(x, y, w, h)
r = Rect(-20, -20, 100, 100)
objs = [
Obj(
random.randint(-100, -100),
random.randint(-100, 100),
random.randint(-100, 100),
random.randint(-100, 100),
)
for _ in range(5000)
]
start = time.time()
if use_python:
for _ in range(1000):
colliding_indices = r.collidelistall([obj.xa for obj in objs])
colliding_objs = [objs[colliding_index] for colliding_index in colliding_indices]
else:
for _ in range(1000):
colliding_objs = r.collideobjectsall(objs, key=lambda e: e.xa)
print(time.time() - start) use_python = True does the same thing as collideobjectsall but is more than twice as fast. 0.14 seconds versus 0.34 seconds, Python 3.12. |
Some thoughts:
|
Your program on my machine (Python 3.12):
So my proposed change is still 77.5% faster than the python alternative while not requiring more python. |
You don’t get it? We should replace our implementation of this function with a Python one. There’s no reason our implementation has to be written in C. Might be a little tricky, but we’ve done a couple similar things before. This improves our implementation’s speed for existing code by taking a complex C function and rewriting it in 2 lines of Python. This would be a huge win. It’s faster on 3.9 too, but it’s more faster on 3.12. And maybe even better with future versions! |
Alright, Hopefully someone will do this eventually. Actually your idea of using collidelistall will get even faster with my PR: #2786 so the future is bright for a python implementation. Now that we're on this, should i remove the optimization for collideobjects/collideobjectsall in this PR or no? I guess a ltl speedup with no drawbacks isn't really an issue but i'm not rly attacched to it or anything so it's whatever. |
Hmm, my example was unrealistic because I didn't show how it could be made compatible with the existing arguments. It didn't use the lambda function. This is more realistic, but doesn't show nearly as good of a speedup: from pygame import Rect
import random
import time
random.seed(36)
use_python = True
class Obj:
def __init__(self, x, y, w, h):
self.xa = Rect(x, y, w, h)
r = Rect(-20, -20, 100, 100)
objs = [
Obj(
random.randint(-100, -100),
random.randint(-100, 100),
random.randint(-100, 100),
random.randint(-100, 100),
)
for _ in range(5000)
]
start = time.time()
def collideobjectsall(objects, key):
colliding_indices = r.collidelistall([key(obj) for obj in objects])
return [objects[colliding_index] for colliding_index in colliding_indices]
if use_python:
for _ in range(1000):
collideobjectsall(objs, key=lambda e: e.xa)
else:
for _ in range(1000):
colliding_objs = r.collideobjectsall(objs, key=lambda e: e.xa)
print(time.time() - start) It is still faster to do it in Python in Python 3.12, but not in Python 3.9.
Also good point these will all be sped up with your optimization PR, even the Python version. I'm bummed I didn't test it correctly and it's not as good as I thought, but I still think it's a good opportunity to pursue. |
I mean I guess we could drop the python 3.12 version down to about 0.15 with the optimization but if optimizing C still gives us 0.08s of runtime AND it's python version independent then i think it's better to bite the bullet and implement the C version. I still think you're making it look worse than it actually is on the C side but these are my 2 cents anyway. |
I've managed to get a good improvement inside the current status quo here by using vectorcall: #3048 |
The current Rect.collideobjectsall() method is slow, like really slow, especially when compared with all the other multicollision methods:
After analysis, it became evident that the bulk of the runtime is attributed to parsing and invoking the key argument, which only accepts a callable. Strikingly, this is also the slowest scenario:
Surprisingly, invoking a lambda function to access an object's attribute in C is markedly slower than passing the attribute's name as a string directly and retrieving the attribute.
Hence, I propose either of the following solutions:
key
keyword argument to accept strings as well, enabling direct attribute retrieval in such instances.from_attr
, which exclusively accepts strings for direct attribute retrieval.In practice, this would allow the following alternatives:
Either of these would work and make it possible to massively optimize the algorithm. From my testing this change alone brings a 190% improvement on average:
PS. this is also true for
.collideobjects()
since it shares most of this function's behaviour.The text was updated successfully, but these errors were encountered: