Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Project Diva Extend - Speed regression since 1.6.3-221 #11227

Closed
Leopard20 opened this issue Jun 27, 2018 · 26 comments
Closed

Project Diva Extend - Speed regression since 1.6.3-221 #11227

Leopard20 opened this issue Jun 27, 2018 · 26 comments
Labels

Comments

@Leopard20
Copy link
Contributor

Leopard20 commented Jun 27, 2018

The game lags when there's a scene transition. This didn't use to happen in builds 1.6.3 and prior.
Version 1.6.3:
20180628_013444.mp4.zip

Version 1.6.3-271:
20180628_013030.mp4.zip

It may well be happening in other games too.

I'll see if I can narrow it down to the commit that caused this.

@Leopard20
Copy link
Contributor Author

Leopard20 commented Jun 27, 2018

@unknownbrackets
It's caused by your commit in build 221.
06340bf

@Leopard20 Leopard20 changed the title Project Diva Extend - Speed Regression Project Diva Extend - Speed regression since 1.6.3-221 Jun 27, 2018
@hrydgard
Copy link
Owner

Very surprising that this change would cause such a noticeable problem...

@unknownbrackets
Copy link
Collaborator

What device?

-[Unknown]

@Leopard20
Copy link
Contributor Author

Leopard20 commented Jun 28, 2018

Xiaomi Mi 5s Plus. Android 7.0. Snapdragon 821

My vulkan extensions and stuff:
screenshot_2018-06-28-11-21-21-684_org ppsspp ppsspp
screenshot_2018-06-28-11-21-28-601_org ppsspp ppsspp

My OGL extensions:
screenshot_2018-06-28-17-19-38-319_com finalwire aida64
screenshot_2018-06-28-17-19-30-424_com finalwire aida64
screenshot_2018-06-28-17-19-24-609_com finalwire aida64

@Leopard20
Copy link
Contributor Author

@unknownbrackets If you need more info let me know.

@unknownbrackets
Copy link
Collaborator

unknownbrackets commented Jun 29, 2018

Based on #9830, could it just be that we were drawing incorrectly before (due to the driver bug), and now that we're doing the testing, it's slower (which it was probably already slower on say OpenGL)?

If that's true, once the driver is fixed, the buggy performance will be gone for good.

If this happens on a device without Adreno, not affected by that bug, then it might be an argument against.

Also possible that fixing #11214 might affect this in some way, either positively or negatively...

-[Unknown]

@unknownbrackets
Copy link
Collaborator

Actually - is it possible to capture a GE dump during the time it is slower? Maybe it's possible to detect and optimize out whatever it's doing.

-[Unknown]

@Leopard20
Copy link
Contributor Author

@unknownbrackets
I captured 5 GE dumps around the same time when this happened on Android:
DUMP.zip
But it didn't happen on PC in that particular scene.

However, the PC version had lots of slowdowns in other scenes (which didn't happen on Android). I've captured a few of those scenes here:
DUMP2.zip

P.S: Sorry about the delay. I've been busy and I forgot about this completely :(

@unknownbrackets
Copy link
Collaborator

unknownbrackets commented Jul 8, 2018

The first and largest dump (0007) uses one shader that activates the new code:

00000000:0000d022 Tex TexAlpha TFuncMod AlphaTest >

Which is a pretty bog-standard shader. There are only four flushes that use this shader. The flushes are:

  • The text at the top (the part with "(DIVA edit)", drawn from basically a font atlas.)
  • The same again, for shadow.
  • A pointless draws where all X and U coords are equal, two triangles. Nothing is drawn on screen.
  • The same again. Probably it's a string "" being drawn.

So then, it should be the text. Hopefully the pointless draw isn't an issue (it's drawn with the same blending as the text anyway.) I'm surprised it's significant because it's only 144x12.

Alpha test: a > 0x19
Alpha blend: standard
Stencil test: disabled
Depth test: disabled

Two things come to mind here:

  1. In this case, because there's no stencil or depth activity, the workaround isn't needed. It's detectable by adding more shader bits.

  2. In theory, this could be changed to standard blending by pushing 0x19 as a uniform, and setting fragment alpha to 0 when lower. This would eliminate the discard for PowerVR/etc. Not sure how common depth test being off is, though.

Still, it's very weird that two small 144x12 draws had a significant perf impact.

Another dump (0008) for Android shows more draws which use the shader:

  • Lots of shrubs. Alpha > 7f, no blend, with depth >= test. Technically not safe to eliminate, may cause weird z issues for pixels that should be discarded.
  • A flower. Drawn the same way as the shrubs.
  • The same four draws as before

This second dump looks like it should be a lot slower if the change impacted everything evenly (there's so many shrubs), but they also seem like they would be drawn throughout - not only during scene transitions. So I guess they aren't responsible?

Is the text at the top only shown during transitions?

The PC dump is similar, at least in respect to draws changed by #11197, in that it is just text. This time the text at the bottom is not zero width, but those are the only draws that should be changed.

Unless layout (early_fragment_tests) in; has somehow made things slower, which I've been assuming wouldn't be the case. To validate this, can you try:

  1. Comment out WRITE(p, "layout (early_fragment_tests) in;\n");. If there's no change in speed, we might as well keep it. If it hurts speed, then let's get rid of it.

  2. Comment out WRITE(p, " gl_FragDepth = gl_FragCoord.z;\n");. If there's no change in speed, try commenting out WRITE(p, "layout (depth_unchanged) out float gl_FragDepth;\n"); too (this will tell us if we might have "ubershader" problems.)

Just to confirm, are the PC slowdowns new also since the same commit? Which GPU was it (sorry)?

-[Unknown]

@Leopard20
Copy link
Contributor Author

Leopard20 commented Jul 9, 2018

This second dump looks like it should be a lot slower if the change impacted everything evenly (there's so many shrubs), but they also seem like they would be drawn throughout - not only during scene transitions. So I guess they aren't responsible?

No they are not drawn throughout. They are completely separate scenes. The flower and the shrub are drawn at another scene two, but it looked different than this one.
BTW you can check out the two videos at the top to see how the scene changes.

Is the text at the top only shown during transitions?

No it's always there.

Just to confirm, are the PC slowdowns new also since the same commit? Which GPU was it (sorry)?

No it's always been a little laggy for me, at least since v1.5.4-900-ish. (the first version I tried PD on PC)
PPSSPP version during these captures was 1.6.3-271.
My GPU: Nvidia GTX 760m (mobile), using driver version 398.xx.

Another dump (0008) for Android...
The PC dump...

Just to be clear, all captures are made on PC. Is it possible to do this on Android too?

Unfortunately I can't build the app because I don't have Android Studio and stuff like that. If you could build the app yourself and post it here (or upload it somewhere else and put the link here) I'll test it for you. :)

@unknownbrackets
Copy link
Collaborator

What I mean is: if the shrubs or text are causing the slowdown, they should do it in every single frame when they are visible. You said the slowdown is during transitions, which tells me that it doesn't just chug for the entire duration the shrubs are displayed on screen.

Meaning: if the change to the shader made drawing the text silly slow, that shader is used EVERY FRAME that the text is drawn. It doesn't make a ton of sense that the text (and therefore shader) shows 100% of the time, but the slowdown doesn't happen 100% of the time.

If PC is laggy but not since the same commit, it could easily be texture scaling or render resolution or anything else. Sounds completely unrelated, then.

It's fine that the dumps were created on PC, I'm focusing on the slowdown.

Android Studio is free.

-[Unknown]

@Leopard20
Copy link
Contributor Author

Leopard20 commented Jul 11, 2018

I know it's free. But I don't feel like installing something just to build an app for a test. Even more so now that I have no use for it after.
Plus, last time I installed and uninstalled a SDK it totally f***ed up my PC and I certainly don't want to go through that again.

unknownbrackets added a commit to unknownbrackets/ppsspp that referenced this issue Jul 28, 2018
Where possible, replace alpha and color testing with a zero alpha value.

This allows early fragment tests more often, which may help hrydgard#11227.  It
may also generally help performance on PowerVR devices.
@unknownbrackets
Copy link
Collaborator

Does the latest git build change this behavior?

-[Unknown]

@Leopard20
Copy link
Contributor Author

Leopard20 commented Jul 31, 2018

@unknownbrackets
Sadly no. Though I feel like it's slightly better.

@Leopard20
Copy link
Contributor Author

@unknownbrackets Another thing I found is that every time I play that same scene again, it gets even worse (the first time it is close to lag-free). So you might be on the right track regarding discard and similar stuff.

@Leopard20
Copy link
Contributor Author

BTW, any chance you would want to revert that commit? It's clearly not helping anything, is it? There may not be an "adreno discard bug" after all.

@unknownbrackets
Copy link
Collaborator

unknownbrackets commented Aug 6, 2018

Why do you say it's clearly not helping anything? There is definitely an Adreno discard bug and that commit is helping the bug. The driver bug causes it to draw incorrectly, which can in some cases draw faster.

Basically, it's kinda like the "disable alpha test" setting that we removed a while ago (but more like half of it.) Your driver has part of that old setting forced on, by accident.

It's possible that in these specific scenes, it wasn't causing obvious rendering artifacts. It's very hard to detect which games would be affected how much (if we could, the disable alpha test setting would've been a bunch better.) It's also likely Adreno will at some point fix this bug, meaning you'll always get the slower version.

-[Unknown]

@unknownbrackets
Copy link
Collaborator

Maybe we should change this into a feature request to add a setting that will force "early_layout_tests" even when discard is used. That would make it more predictable and probably affect more GPUs than just the affected Adreno cards, if the performance gain is significant. It'd cause rendering glitches though.

It doesn't really make sense that it is slower the more times that scene renders. It should be equally lag free or slow each time. Am I crazy @hrydgard?

-[Unknown]

@hrydgard
Copy link
Owner

hrydgard commented Aug 6, 2018

It really doesn't make sense indeed, barring extremely odd driver bugs, or driver falling back to some low performance mode, or similar ... bizarre.

@Leopard20
Copy link
Contributor Author

Leopard20 commented Aug 6, 2018

@unknownbrackets I meant I didn't notice any difference in any game whatsoever (except for this where it got slower)!

I'll conduct more tests to make absolutely sure it wasn't because of my phone clocking down or something!

@unknownbrackets
Copy link
Collaborator

Does it even get slower if you save state right before the scene, and then load state and repeat the scene over and over? Or is it only slow if you get to it a second time from the same session?

-[Unknown]

@Leopard20
Copy link
Contributor Author

Leopard20 commented Aug 9, 2018

The slowdowns from running the scene multiple times in the same session and without savestates have a random pattern. So I guess they are caused by CPU clockspeed variations.
But I can confirm that saving the state before the scene and loading it causes an even bigger slowdown. They plateau after a few tries.

@Leopard20
Copy link
Contributor Author

Leopard20 commented Aug 9, 2018

Here's another scene with slowdown. I captured 4 GE dumps. Each pair consisting of one dump right before the slowdown and one after (you can just check the first two, use the other two if I captured incorrectly).
DUMP.zip
Not sure if it matters, but the backend was set to Vulkan.

Maybe you can find something in common with the previous ones.

@unknownbrackets
Copy link
Collaborator

0016, the not slow one - uses the shader on flowers (you can see them near the bottom.) I think it's just one draw for all of them.

0015, the slow one - uses the shader on exactly the same flowers. Still looks like a single draw, but in this case, they are off screen (the camera is not pointing at them.) It's common for games to draw things that are actually off screen, they get automatically clipped by the GPU.

They are similar to the shrubs before, but those shrubs were visible in both cases. Seems strange.

-[Unknown]

@Leopard20
Copy link
Contributor Author

Leopard20 commented Sep 21, 2018

@unknownbrackets Actually, the above commit did help a little (or maybe it was some other commit!). The lag is now just barely noticeable.

@Leopard20
Copy link
Contributor Author

@unknownbrackets Good news! You hit two birds with one stone! This one is fixed too!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants