Execute_Prim: Add a smaller "inner interpreter" to speed up long sequences of PRIM commands #10658

Merged
merged 2 commits into from
Feb 28, 2018

Owner

@hrydgard hrydgard commented Feb 27, 2018

This avoids going back to the runloop during long sequences of PRIM commands, which also skips the various checks (current framebuffer, etc.) that are part of Execute_Prim.

Improves performance by a measurable 1-5% in several PRIM-heavy games.

Not particularly pretty, but effective.
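
A rough sketch of the idea, not the code from this PR: once the outer run loop dispatches a PRIM command, a small inner loop keeps consuming the following commands for as long as they are also PRIM, so the per-command overhead is paid once per run rather than once per primitive. Everything here (`ToyGPU`, `MakeOp`, the `CMD_*` values, the bit layout of the command word) is a hypothetical stand-in chosen for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy encoding: top 8 bits = command, low 24 bits = argument.
constexpr uint32_t CMD_PRIM = 0x04;
constexpr uint32_t CMD_OTHER = 0x10;

constexpr uint32_t MakeOp(uint32_t cmd, uint32_t arg) {
    return (cmd << 24) | (arg & 0x00FFFFFF);
}

struct ToyGPU {
    int primsDrawn = 0;
    int slowPathEntries = 0;      // how often the expensive setup ran
    uint32_t verticesSubmitted = 0;

    void SubmitPrim(uint32_t op) {
        primsDrawn++;
        verticesSubmitted += op & 0xFFFF;  // assumed: low 16 bits = vertex count
    }

    // Handles the PRIM at list[0], then keeps going while the next
    // commands are also PRIM. Returns the number of words consumed.
    size_t ExecutePrim(const uint32_t *list, size_t remaining) {
        slowPathEntries++;  // stands in for framebuffer checks and similar work
        SubmitPrim(list[0]);

        size_t consumed = 1;
        // The "inner interpreter": no trip back to the outer loop.
        while (consumed < remaining && (list[consumed] >> 24) == CMD_PRIM) {
            SubmitPrim(list[consumed]);
            consumed++;
        }
        return consumed;
    }

    // The outer run loop: dispatches one command at a time.
    void Run(const std::vector<uint32_t> &list) {
        size_t pc = 0;
        while (pc < list.size()) {
            if ((list[pc] >> 24) == CMD_PRIM) {
                pc += ExecutePrim(&list[pc], list.size() - pc);
            } else {
                pc++;  // other commands are ignored in this toy
            }
        }
    }
};
```

For a list of three PRIMs, one other command, and two more PRIMs, this toy submits five primitives but enters the slow path only twice.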

This is also a first step towards fixing the performance of Earth Defense Force 2 and, I believe, possibly a couple of other games (Gundam?) that flip cull direction every few primitives.
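
One way the cull-flip case could be handled, sketched under assumptions and not taken from this PR: instead of breaking the batch when the cull direction changes, reverse the winding of the affected triangles as their indices are appended, so every triangle in the batch can be drawn with a single cull setting. The helper name `AppendTriangles` and the plain triangle-list layout are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Appends a triangle list (three indices per triangle) to a shared batch.
// If flipWinding is set, the last two indices of each triangle are swapped,
// which reverses its winding order.
void AppendTriangles(std::vector<uint16_t> &batch,
                     const std::vector<uint16_t> &tris,
                     bool flipWinding) {
    for (size_t i = 0; i + 2 < tris.size(); i += 3) {
        uint16_t a = tris[i];
        uint16_t b = tris[i + 1];
        uint16_t c = tris[i + 2];
        if (flipWinding) {
            std::swap(b, c);
        }
        batch.push_back(a);
        batch.push_back(b);
        batch.push_back(c);
    }
}
```

Swapping two indices is enough because it turns a clockwise triangle into a counter-clockwise one and vice versa, without changing which vertices it uses.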

@hrydgard hrydgard added this to the v1.6.0 milestone Feb 28, 2018
…o the runloop during long sequences of PRIM commands.

Helps performance by a quite measurable 1-4% in several PRIM-heavy games.
@hrydgard
Owner Author

Rebased on master.

@hrydgard hrydgard merged commit 995a1cf into master Feb 28, 2018
@hrydgard hrydgard deleted the inner-prim-interpreter branch February 28, 2018 20:50
@@ -1543,6 +1543,9 @@ void GPUCommon::Execute_Prim(u32 op, u32 diff) {
// PRIM commands with other commands. A special case that might be interesting is that game
// that changes culling mode between each prim, we could just change the triangle winding
// right here to still be able to join draw calls.
if (debugRecording_)
Collaborator

Might be nice to add || host->GPUDebuggingActive() here so that step prim works.

-[Unknown]

Owner Author

I guess that depends on what you're debugging. It's kind of nice when stepping prim to see that the inner interpreter succeeds in running them in one pass, plus we don't flush after each prim when it does anyway, so there's not much to see. But yeah, I guess in the usual case it's confusing to have it jump like that.
