SPU: Inline and batch MFC list transfers #12763
51dc9ad to 8c85bb3
My suggestion here is to move the method to its own file to keep SPUThread.cpp navigable (not that it's in good shape even now). The inlining just makes the function way too large. Create a SPUListTransfer.cpp or something similar and move it there. This whole file needs some refactoring for the large meta-functions.
rpcs3/Emu/Cell/SPUThread.cpp
Outdated
constexpr usz _128 = 128;

// Force constexpr std::max
#define mov_t(type, index, _ea) { const usz ea = _ea; *reinterpret_cast<type*>(dst + index * std::integral_constant<u64, std::max<u64>(sizeof(type), sizeof(v128))>::value + (ea & 0xf)) = *reinterpret_cast<const type*>(src + ea); } void()
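For readers unfamiliar with the trick in the macro above, here is a minimal sketch of the same idea written as a template instead of a macro. It assumes a 16-byte vector type; `vec128`, `slot_stride`, and `copy_slot` are hypothetical names for illustration and are not taken from RPCS3. Passing `std::max` through `std::integral_constant` makes the stride a template argument, so it has to be evaluated at compile time. The sketch also swaps the `reinterpret_cast` store for `std::memcpy`, which avoids alignment and aliasing assumptions.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for the 16-byte v128 type used in the snippet above.
struct alignas(16) vec128
{
    std::uint8_t bytes[16];
};

// Slot stride: never smaller than one 16-byte vector.
// Routing std::max through std::integral_constant forces compile-time evaluation,
// because a template argument must be a constant expression.
template <typename T>
constexpr std::size_t slot_stride =
    std::integral_constant<std::size_t, std::max<std::size_t>(sizeof(T), sizeof(vec128))>::value;

// Copies one T from src + ea into slot Index of dst,
// keeping the low 4 bits of the effective address as the offset inside the slot.
template <typename T, std::size_t Index>
void copy_slot(std::uint8_t* dst, const std::uint8_t* src, std::size_t ea)
{
    std::memcpy(dst + Index * slot_stride<T> + (ea & 0xf), src + ea, sizeof(T));
}
```

With a 4-byte type and `Index` 1, an effective address of 20 lands at destination offset 16 + (20 & 0xf) = 20.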
Use ALL_CAPS style for the macro name.
Done.
u64 addr = begin;

// Optimization: if range_locked is not used, the addr check will always pass
Where did this code go?
The complete check is now inside the function, so the register limit won't be reached. Most of the time it's unlocked, so the initial check suffices. I've optimized the underlying function for that case.
Fully enabled the optimization for Atomic RSX FIFO, which makes the perf boost more significant without sacrificing stability.
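The split described in the reply above (a cheap initial check in the caller, with the complete check kept inside a separate function) can be sketched roughly as follows. This is only an illustration under stated assumptions: the packed lock word layout (base in the low 32 bits, size in the high 32 bits, zero meaning unlocked) and the names `overlaps_locked_range` and `can_transfer` are hypothetical and do not reflect RPCS3's actual range-lock implementation.

```cpp
#include <atomic>
#include <cstdint>

// Slow path: the complete overlap check lives in its own function,
// so the caller's hot loop does not have to keep these temporaries in registers.
// Assumed layout (hypothetical): low 32 bits = locked base, high 32 bits = locked size.
bool overlaps_locked_range(std::uint64_t packed, std::uint32_t addr, std::uint32_t size)
{
    const std::uint64_t lock_begin = packed & 0xffffffffu;
    const std::uint64_t lock_end = lock_begin + (packed >> 32);
    const std::uint64_t end = std::uint64_t{addr} + size;
    return addr < lock_end && lock_begin < end;
}

// Fast path: if nothing is locked (the common case), a single load and compare suffices.
inline bool can_transfer(const std::atomic<std::uint64_t>& lock, std::uint32_t addr, std::uint32_t size)
{
    const std::uint64_t packed = lock.load(std::memory_order_acquire);

    if (packed == 0)
    {
        return true; // unlocked: the initial check is enough
    }

    return !overlaps_locked_range(packed, addr, size);
}
```

The design choice here is that the common unlocked case costs one atomic load, while the rarely needed range arithmetic stays out of the inlined transfer code.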
166c905 to b1ae277
Bad conditions led to missed optimizations and overly long generated code.
Reduces the overall CPU profiling load of this function from 9.6% to 6.8% and gains me about 2 fps in Sly 4.
This has a bit more effect when disabling Atomic RSX FIFO although it's partially active without it.