-
-
Notifications
You must be signed in to change notification settings - Fork 161
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize vector parsing in math.c #2443
Optimize vector parsing in math.c #2443
Conversation
It combines the functionality of pgVectorCompatible_Check and PySequence_AsVectorCoords, which optimizes the overall runtime because those two functions have duplicate checks.
9ab246c
to
f6e5d4c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice PR. I'm sure we can build on it further in the future, the whole module is full of places we can work on.
I'd like to suggest you use mean(timeit.repeat(...)) with a high repeat number for benchmarking instead of a single timeit that accumulates runtimes. This is because it yields smoother and better results overall since outliers won't count towards increasing the times in a suboptimal way.
And a side note.
Generally speaking, the idea of having 2 pointers, one with a stack allocated array in the case of a sequence being passed and a single pointer that switches between the array and an actual vector's coords memory address is a valid optimization across the whole module. It avoids having to use memcpy in the vector case, which is probably what matters most.
I swapped out my timeit.timeit call with Results
I'm happy to see it shows a higher speedup, with a +14.87% average now. |
You've suggested this in 4 separate comments on this PR, I replied when you brought it up in Ankith's thread: #2443 (comment). My initial testing actually found the code was slower without the memcpy, somehow-- maybe the compiler is optimizing a fixed side memcpy away? Assuming this PR gets merged, you can try this yourself and maybe show a speedup and maybe make a follow up PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for the PR 🎉
Left a review for your consideration, resolve at will
Alright @itzpr3d4t0r now it's on you: further optimizations 😄 📈 |
Strategy-- The vector compatible check and sequence_vectorcoords functions duplicated some checks, so a unified function could optimize the overall runtime.
Conclusions: I had hoped to optimize vector_generic_math, but the results are mixed as you can see below. The runs are too variable for me to confidently say it's a speedup or slowdown in that area. However, this is definitely a speedup for Vector methods that take in and process vectors, especially when called with sequences instead of other vectors.
Vector methods are now 7% faster on average when taking vector arguments, 18% faster when taking sequence arguments.Vector methods are now 5.5% faster on average when taking vector arguments, 20% faster when taking sequence arguments.There are more instances of pgVectorCompatible_Check + PySequence_AsVectorCoords coupled in the code, but in harder to test areas, like in the elementwise proxy, so I'm leaving those be for now.
Benchmarking results (several runs before/after averaged):
Benchmarking script