Execute_Prim: Add a smaller "inner interpreter" to speed up long sequences of PRIM commands #10658

hrydgard · 2018-02-27T23:25:27Z

This avoids going back to the runloop during long sequences of PRIM commands, and thus also skips the various checks (current framebuffer, etc.) that are part of Execute_Prim.

Helps performance by a quite measurable 1-5% in several PRIM-heavy games.

Not particularly pretty, but effective.

This is also a first step towards fixing the performance of Earth Defense Force 2 and, I believe, possibly a couple of other games (Gundam?) that flip cull direction every few primitives.
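To illustrate the idea in the description, here is a minimal, self-contained sketch of an outer run loop plus an "inner interpreter" that keeps consuming consecutive PRIM commands. It is not the code from this PR: `MiniGpu`, `MakeCmd`, `CheckFramebufferEtc`, `SubmitPrim` and the counters are hypothetical names, and the sketch assumes a simplified command word with the opcode in the top 8 bits and the vertex count in the low 16 bits. The real Execute_Prim does far more work per draw.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative opcodes only; everything in this sketch is hypothetical.
constexpr uint32_t kCmdNop = 0x00;
constexpr uint32_t kCmdPrim = 0x04;

// Packs an opcode (top 8 bits) and a 24-bit argument into one command word.
constexpr uint32_t MakeCmd(uint32_t cmd, uint32_t arg) {
	return (cmd << 24) | (arg & 0x00FFFFFF);
}

struct MiniGpu {
	int dispatches = 0;     // commands dispatched by the outer run loop
	int stateChecks = 0;    // times the per-draw setup ran
	int primsDrawn = 0;     // PRIM commands submitted
	int verticesDrawn = 0;  // sum of the vertex counts of all PRIMs

	// Stand-in for the framebuffer and state checks done at the top of a draw.
	void CheckFramebufferEtc() { ++stateChecks; }

	void SubmitPrim(uint32_t op) {
		++primsDrawn;
		verticesDrawn += static_cast<int>(op & 0xFFFF);
	}

	// Handles the PRIM at list[pc] and returns how many commands it consumed.
	size_t ExecutePrim(const std::vector<uint32_t> &list, size_t pc, bool useInner) {
		assert(pc < list.size());
		CheckFramebufferEtc();
		SubmitPrim(list[pc]);
		size_t consumed = 1;
		if (!useInner)
			return consumed;
		// Inner interpreter: while the next command is also a PRIM, submit it
		// here instead of returning to the run loop and redoing the checks.
		while (pc + consumed < list.size() && (list[pc + consumed] >> 24) == kCmdPrim) {
			SubmitPrim(list[pc + consumed]);
			++consumed;
		}
		return consumed;
	}

	// Outer run loop: dispatches one command at a time.
	void Run(const std::vector<uint32_t> &list, bool useInner) {
		size_t pc = 0;
		while (pc < list.size()) {
			++dispatches;
			if ((list[pc] >> 24) == kCmdPrim)
				pc += ExecutePrim(list, pc, useInner);
			else
				++pc;  // every other command is a no-op in this sketch
		}
	}
};
```

In this toy model, a run of three PRIMs costs one dispatch and one state check with the inner loop enabled, versus three of each without it. Any other command between PRIMs ends the run, which is why games that change state (such as cull direction) every few primitives benefit less.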


hrydgard · 2018-02-28T08:19:20Z

Rebased on master.

unknownbrackets · 2018-03-02T02:11:39Z

GPU/GPUCommon.cpp

@@ -1543,6 +1543,9 @@ void GPUCommon::Execute_Prim(u32 op, u32 diff) {
 	// PRIM commands with other commands. A special case that might be interesting is that game
 	// that changes culling mode between each prim, we could just change the triangle winding
 	// right here to still be able to join draw calls.
+	if (debugRecording_)


Might be nice to add || host->GPUDebuggingActive() here so that step prim works.

-[Unknown]

I guess that depends on what you're debugging. It's kind of nice when stepping prim to see that the inner interpreter succeeds in running them in one pass, plus we don't flush after each prim when it does anyway, so there's not much to see. But yeah, I guess in the usual case it's confusing to have it jump like that.
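The trade-off in this thread can be sketched as a small guard function. This is only an illustration: `DebugState` and `CanUseInnerInterpreter` are hypothetical names, with `debugRecording` standing in for `debugRecording_` and `debuggerActive` standing in for whatever `host->GPUDebuggingActive()` would report. The `alsoCheckDebugger` flag models the suggested extra condition; the diff above only shows the `debugRecording_` check.

```cpp
// Hypothetical stand-ins for the two flags discussed in the review.
struct DebugState {
	bool debugRecording = false;  // models debugRecording_
	bool debuggerActive = false;  // models the result of host->GPUDebuggingActive()
};

// Returns true when consecutive PRIMs may be run by the inner interpreter.
// With alsoCheckDebugger set, an attached debugger disables batching as well,
// so that stepping by prim stops at every PRIM instead of jumping over a run.
inline bool CanUseInnerInterpreter(const DebugState &state, bool alsoCheckDebugger) {
	if (state.debugRecording)
		return false;
	if (alsoCheckDebugger && state.debuggerActive)
		return false;
	return true;
}
```

Without the extra check, a prim step can skip over a whole run of PRIMs, which also shows that the inner interpreter handled them in one pass. With the check, stepping behaves as usual but the fast path is no longer visible while debugging.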

hrydgard added this to the v1.6.0 milestone Feb 28, 2018

hrydgard added 2 commits February 28, 2018 09:18

Execute_Prim: Add a smaller "inner interpreter" to avoid going back to the runloop during long sequences of PRIM commands. Helps performance by a quite measurable 1-4% in several PRIM-heavy games.

292f116

Don't use the inner interpreter when debug recording

ef3341e

hrydgard force-pushed the inner-prim-interpreter branch from 318cbab to ef3341e February 28, 2018 08:19

hrydgard merged commit 995a1cf into master Feb 28, 2018

hrydgard deleted the inner-prim-interpreter branch February 28, 2018 20:50

unknownbrackets reviewed Mar 2, 2018


unknownbrackets mentioned this pull request Apr 22, 2018

Update README.md for 1.6 #10954

Merged
