Final Fantasy 4 - dialog boxes have "holes" #4617

Closed
unknownbrackets opened this issue Nov 22, 2013 · 41 comments · Fixed by #6737
Comments

@unknownbrackets
Collaborator

This may also be affecting Final Fantasy 3 and Patapon, which have similar "lines" in various places. I have not checked those yet.

Rendering resolution does not impact this. Enabling the vertex cache makes it flicker, but with vertex cache off, the issue is persistent. It looks like this:

+?-----------------------+
|????????????????????????|
|?XXXXXXXXXXXXXXXXXXXXXXX|
|?XXXXXXXXXXXXXXXXXXXXXXX|
|?XXXXXXXXXXXXXXXXXXXXXXX|
|?XXXXXXXXXXXXXXXXXXXXXXX|
|?XXXXXXXXXXXXXXXXXXXXXXX|
+------------------------+

Where the ?s represent transparency. Clearly it seems to be an off-by-one sort of issue.

Anyway, FF4's drawing philosophy appears to be "the spoon does not bend." I haven't checked the actual vertices, but it generally adjusts the world matrix for every tile it draws, and this applies to the dialogs as well (for the letters, it adjusts the offset for each letter... this results in a lot of flushes, heh.)

Anyway, the middle center matrix looks like this:

World data # 880.000000
World data # 0.000000
World data # 0.000000
World data # 0.000000
World data # 156.000000
World data # 0.000000
World data # 0.000000
World data # 0.000000
World data # 1.000000
World data # 20.000000
World data # 7.000000
World data # 10.000000

And here's the top left (note, this doesn't meet the middle center above):

World data # 31.000000
World data # 0.000000
World data # 0.000000
World data # 0.000000
World data # 31.000000
World data # 0.000000
World data # 0.000000
World data # 0.000000
World data # 1.000000
World data # 8.500000
World data # -6.500000
World data # 0.000000

And here's the top middle (doesn't meet the left side or center):

World data # 432.000000
World data # 0.000000
World data # 0.000000
World data # 0.000000
World data # 31.000000
World data # 0.000000
World data # 0.000000
World data # 0.000000
World data # 1.000000
World data # 240.000000
World data # -6.500000
World data # 0.000000

And the left middle (which DOES meet the top left, but not the middle center):

World data # 31.000000
World data # 0.000000
World data # 0.000000
World data # 0.000000
World data # 72.000000
World data # 0.000000
World data # 0.000000
World data # 0.000000
World data # 1.000000
World data # 8.500000
World data # 45.000000
World data # 0.000000

For good measure, here's the bottom right (this does meet):

World data # 432.000000
World data # 0.000000
World data # 0.000000
World data # 0.000000
World data # 31.000000
World data # 0.000000
World data # 0.000000
World data # 0.000000
World data # 1.000000
World data # 240.000000
World data # 96.500000
World data # 0.000000

My matrix math may suck, but the .5's there are conspicuous. Also, to note the order in which it draws the box, in case it matters:

92223
41115
67778

Where 9, 4, and 2 do not connect to 1.

I suppose this either means an FPU rounding difference, or (more likely?) a difference in the way the GE rounds coordinates.

-[Unknown]

@unknownbrackets
Collaborator Author

Oops, forgot - actually the top middle and top left corner aren't connected either. Editing with details.

-[Unknown]

@dbz400
Contributor

dbz400 commented Nov 23, 2013

I think the dialog issue is something like this

screen00067

@dbz400
Contributor

dbz400 commented Nov 23, 2013

However, if I turn off HW T&L, it looks correct.

screen00068

@hrydgard
Owner

Hm, are these coordinates supplied as s8, s16 or float? It could be that OpenGL's interpretation of say s16 attributes is off by half or something. In software transform we do the conversion ourselves.

@dbz400
Contributor

dbz400 commented Nov 23, 2013

How do I check whether these coordinates are supplied as s8, s16 or float? (VertexDecoder?)

@hrydgard
Owner

Just check the vertex type command before the drawcall in the ge debugger

@dbz400
Contributor

dbz400 commented Nov 23, 2013

SetVertexType: through, ARGB 8888 colors, u16 coords

@hrydgard
Owner

It can just as well be the frame that is off as the box filling background. Are both really through? If so it shouldn't be bothering to set transform matrices per quad..

@dbz400
Contributor

dbz400 commented Nov 25, 2013

@hrydgard, I think it should be this screenshot. It is Float UV, u8 coords.

1

@hrydgard
Owner

u8 coords, OK (actually S8). I can imagine that our software mode might be scaling them slightly differently than OpenGL does in hw mode..
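
For reference, a minimal sketch of the kind of scaling mismatch suspected here. It compares two signed-normalized conversion rules that OpenGL specifications have defined with a plain divide by 128. Older desktop GL and ES 2.0 specify (2c + 1) / (2^b - 1), while GL 4.2+ and ES 3.0 specify max(c / (2^(b-1) - 1), -1). Which rule a particular driver applies, and how PPSSPP's software path scaled values at the time, are assumptions here; the function names are made up for illustration.

```cpp
#include <algorithm>
#include <cstdint>

// Older rule: (2c + 1) / (2^8 - 1). Note that raw 0 does not map to 0.0.
inline float snormLegacy(int8_t c) {
    return (2.0f * c + 1.0f) / 255.0f;
}

// Newer rule: c / 127, clamped so that -128 maps to -1.0.
inline float snormModern(int8_t c) {
    return std::max(c / 127.0f, -1.0f);
}

// Plain power-of-two scale, for comparison.
inline float scale128(int8_t c) {
    return c * (1.0f / 128.0f);
}
```

With these definitions, a raw value of 64 comes out slightly above 0.5 under either GL rule but exactly 0.5 under the divide by 128, so two tiles positioned by different paths could disagree by a fraction of a unit before the world matrix scales that difference up.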

@thedax
Collaborator

thedax commented Dec 17, 2013

Not sure if this is of interest, but two draws before it begins drawing the text box, the game squishes the screen like this (on the next draw, the screen is restored to its regular width):
Screenshot 01

When it's restored, one draw before the text box background draw:
Screenshot 02

@thedax
Collaborator

thedax commented Dec 17, 2013

FF3 is also using through for its menu:

Screenshot 01

If you look closely, you can even see the prim is weird looking (seems to have a diagonal line on either side)..

@solarmystic
Contributor

Well it looks like #5134 6eb493a has made it worse somehow, the issue is still present when HW TnL is enabled:-

screen00296

But now SW TnL (disabling HW TnL) doesn't rectify it anymore. Instead, you'll get the following screen:-

screen00297

In previous revisions (e.g. v0.9.6-463-g091ddd9 091ddd9), disabling HW TnL would solve the holes in the dialogue box:-

screen00329

@unknownbrackets
Collaborator Author

Oh, maybe I need to ignore hasTexcoord in software.

-[Unknown]

@unknownbrackets
Collaborator Author

I had actually tested software transform before, but clearly not in the right places. Does it work better now?

2e91adc

-[Unknown]

@solarmystic
Contributor

@unknownbrackets

Much better now with SW TnL, thanks!

screen00330

@unknownbrackets
Collaborator Author

This also affects the world map, and causes a white line to appear persistently (from it drawing the sky.)

The game uses raw positions (0,0)-(64,64) using a triangle strip to draw the sky in both cases.

The first draw (right side) uses this world matrix:
World 0 1024.000000 0.000000 0.000000 0.000000
World 1 512.000000 0.000000 0.000000 0.000000
World 2 1.000000 113.000000 -88.927734 400.000000

The second draw (left side) uses this world matrix:
World 0 1024.000000 0.000000 0.000000 0.000000
World 1 512.000000 0.000000 0.000000 0.000000
World 2 1.000000 -399.000000 -88.927734 400.000000

Unfortunately, this doesn't work properly with or without hardware transform. In both cases, there's overlap. Software actually has greater overlap. The softgpu has the same overlap. Possibly the world matrix is being calculated incorrectly?

This hack "fixes" it:

    if (dirty & DIRTY_WORLDMATRIX) {
        if (gstate.worldMatrix[9] == 113.0f) {
            NOTICE_LOG(HLE, "HACK");
            gstate.worldMatrix[9] = 115.0f;
        }
        SetMatrix4x3(u_world, gstate.worldMatrix);
    }

114 isn't sufficient, but 115 looks perfect. So it seems like it's pretty far off...

This happens with jit on or off.

-[Unknown]

@hrydgard
Owner

hrydgard commented Jun 8, 2014

Is the game using fixed point coordinates (8-bit or 16-bit)? We may be off by 1/2 unit or so as we rely on OpenGL to convert them to float and it might do it wrong. Although if it's drawing rectangles, we're doing that transform in software ourselves. If it's a mix of draws that go through the two pipes, it may mismatch enough that it becomes a real issue...

@unknownbrackets
Collaborator Author

It's drawing using triangle strips, and yes, it's using fixed point (8 bit.) But even in software transform it's not right.... in this case, both the halves of the sky are drawn the same way.

A screenshot might be clearer:
ff4-skyline

The two draws are basically identical in all settings except for the world matrix.

-[Unknown]

@hrydgard
Owner

hrydgard commented Jun 8, 2014

Hm, but maybe we are dequantizing 8-bit coordinates slightly wrong in all modes. I don't have a better explanation that would be plausible...

@unknownbrackets
Collaborator Author

With:

                    for (int i = 0; i < 3; i++)
                        pos[i] = b[i] * (1.f / 128.f);

The problem goes away in software transform. Also in the softgpu.
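
To see why the divisor matters for the sky seam, here is a small sketch using the numbers quoted earlier in this thread: raw X positions 0 and 64, a world X scale of 1024, and X translations of -399 (left half) and 113 (right half). It assumes the previous divisor was 127, which the thread implies but does not show, and it only looks at world-space X before view and projection. The helper names are hypothetical.

```cpp
#include <cstdint>

// World-space X: scale * (raw / divisor) + translation.
inline double worldX(int8_t raw, double divisor, double scaleX, double translateX) {
    return scaleX * (raw / divisor) + translateX;
}

// How far the right edge of the left half extends past the
// left edge of the right half, in world units.
inline double skySeamOverlap(double divisor) {
    const double scaleX = 1024.0;   // first value of the quoted world matrix
    const double leftTx = -399.0;   // X translation, second draw (left side)
    const double rightTx = 113.0;   // X translation, first draw (right side)
    double leftHalfRightEdge = worldX(64, divisor, scaleX, leftTx);
    double rightHalfLeftEdge = worldX(0, divisor, scaleX, rightTx);
    return leftHalfRightEdge - rightHalfLeftEdge;
}
```

Under these assumptions, skySeamOverlap(128.0) is exactly zero because 64 / 128 * 1024 = 512 and -399 + 512 = 113, while skySeamOverlap(127.0) leaves the halves overlapping by roughly four world units. How many screen pixels that becomes depends on the rest of the transform.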

-[Unknown]

@hrydgard
Owner

hrydgard commented Jun 8, 2014

Maybe it's that simple then. But then we need to compensate in the shader for OpenGL's 8-bit and 16-bit conversions being wrong, and in that case maybe we might as well expand them to float like that manually in the vertex decoder...

@unknownbrackets
Collaborator Author

Okay, I've written a test to verify it. For s8, non-through (these are x coords):

-125 - 127: COLOR= first=ffffffff, total=127440, x=6 - 477, y = 1 - 270
-126 - 127: COLOR= first=ffffffff, total=127980, x=4 - 477, y = 1 - 270
-127 - 127: COLOR= first=ffffffff, total=128520, x=2 - 477, y = 1 - 270
-128 - 127: COLOR= first=ffffffff, total=129060, x=0 - 477, y = 1 - 270
0 - 127: COLOR= first=ffffffff, total=64260, x=240 - 477, y = 1 - 270
0 - 128: COLOR= first=ffffffff, total=64800, x=0 - 239, y = 1 - 270
0 - 255: COLOR= first=ffffffff, total=540, x=238 - 239, y = 1 - 270
0 - 1: COLOR= first=ffffffff, total=540, x=240 - 241, y = 1 - 270
0 - 2: COLOR= first=ffffffff, total=1080, x=240 - 243, y = 1 - 270

Changing it to 128 makes these match. I still have no idea what I'm doing wrong for s8/through, I can only get it to draw 1 pixel at 0,0 with any coords I throw at it.

Still figuring out s16. Throughmode there works as one would expect. Maybe s8/through is simply unsupported, since you couldn't use it very usefully anyway.
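
The s8 results above can be reproduced with a simple model. This is a sketch only: it assumes the test viewport maps [-1, 1] to [0, 480] horizontally, that raw values above 127 wrap as a signed byte would, and that a pixel column is filled when its center lies in the half-open span between the two edges. None of that is confirmed hardware behavior, and the names are invented for illustration.

```cpp
#include <algorithm>
#include <cmath>

struct PixelRange {
    int first;  // first filled column
    int last;   // last filled column
};

// Screen X for a raw coordinate, wrapping it into the s8 range first.
inline double screenX(int raw, double divisor) {
    int s = ((raw + 128) & 0xFF) - 128;  // 128 -> -128, 255 -> -1
    return 240.0 + 240.0 * (s / divisor);
}

// Columns whose center (x + 0.5) falls in [left, right).
inline PixelRange coveredColumns(int rawA, int rawB, double divisor) {
    double a = screenX(rawA, divisor);
    double b = screenX(rawB, divisor);
    double left = std::min(a, b);
    double right = std::max(a, b);
    int first = static_cast<int>(std::ceil(left - 0.5));
    int last = static_cast<int>(std::ceil(right - 0.5)) - 1;
    return {first, last};
}
```

With a divisor of 128 this model gives 0 to 477 for the -128 to 127 case and 240 to 241 for 0 to 1, matching the x ranges listed. With 127 the right edge would land on column 479 instead of 477.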

-[Unknown]

@unknownbrackets
Collaborator Author

So s16, these should be the cutoffs:

-32768 - 32767: COLOR= first=aa000000, total=130560, x=0 - 479, y = 0 - 271
-32692 - 32767: COLOR= first=aa000000, total=130560, x=0 - 479, y = 0 - 271
-32691 - 32767: COLOR= first=ffffffff, total=130288, x=1 - 479, y = 0 - 271
-32555 - 32767: COLOR= first=ffffffff, total=130288, x=1 - 479, y = 0 - 271
-32554 - 32767: COLOR= first=ffffffff, total=130016, x=2 - 479, y = 0 - 271
-32419 - 32767: COLOR= first=ffffffff, total=130016, x=2 - 479, y = 0 - 271
-32418 - 32767: COLOR= first=ffffffff, total=129744, x=3 - 479, y = 0 - 271

This is with a "normal" viewport (2048, 2048, 480, 272) and offset (2048 - 480/2, 2048 - 272/2), like most games use.

On the other side, 32708 is the last pixel that should fill 478. 32709 - 32767 fill all through 479. I guess that means they reference 480, since I think (even in throughmode) polygons are non-inclusive or something.

Anyway, this suggests that it could be not 128 as above but something more involved. Our cutoffs are wrong: 1 is at -32698, 2 is at -32561, and 3 is at -32425. Using 32768 makes the last one -32426, so that's clearly not the right fix.

-[Unknown]

@hrydgard
Owner

hrydgard commented Jun 9, 2014

s8/through is not too useful, dunno if it's used in any games at all.

Strange that /128 would work for 8-bit but 32768 not for 16-bit.

Polygons probably follow standard fill conventions like OpenGL so that adjacent polys that share some vertices don't overlap (avoids double-drawn pixels which matters when blending).

@unknownbrackets
Collaborator Author

I don't understand normals well. Is there any way I could test normals between 127 and 128?

-[Unknown]

@hrydgard
Owner

Normals can only be tested indirectly, by how they affect lighting and texcoord generation... Should be possible to set up a test but in both cases there will be other factors/precision issues involved too.

Getting the lighting very slightly wrong matters a lot less than getting positions wrong though.

@unknownbrackets
Collaborator Author

That's true enough. By the way, we've logged s8 being used in throughmode, but maybe they were all corrupted displaylists:

http://report.ppsspp.org/logs/kind/526

-[Unknown]

@unknownbrackets
Collaborator Author

If we assume it's converted to fixed s.11 point... it might be scaled to ((x + 32768) * 480/4096) on width? I think the drawing space is that size.

-32768 -> 0 <-- 0

-32692 -> 8.9063 <-- 0
-32691 -> 9.0234 <-- 1

-32555 -> 24.9609 <-- 1
-32554 -> 25.0781 <-- 2

-32419 -> 40.8984 <-- 2
-32418 -> 41.0156 <-- 3

32708 -> 7672.9688 <-- 478
32709 -> 7673.0859 <-- 479

Those numbers look sorta promising, but I'm just guessing. Obviously it's scaled to the wrong cap, though...

The space between pixels (above) would be 16 (4 bits, maybe it's s.11.4?) The point it goes over is at 9, 9+16, 9+16+16, etc. so that would make sense...
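
The guess above can be checked against the measured s16 cutoffs with a few lines of code. This sketch takes the scaled value (x + 32768) * 480 / 4096 as a position in 1/16-pixel units and counts how many thresholds (9, 9 + 16, 9 + 32, ...) it has reached. Treating a value that lands exactly on a threshold as already crossed is an assumption, since none of the measurements hit one exactly. Function names are hypothetical.

```cpp
#include <cmath>
#include <cstdint>

// Position in 1/16-pixel units, per the formula guessed in the thread.
inline double subpixelX(int16_t x) {
    return (static_cast<double>(x) + 32768.0) * 480.0 / 4096.0;
}

// Number of thresholds 9 + 16k (k >= 0) at or below the scaled value,
// which is the index of the first pixel column the edge reaches.
inline int firstCoveredColumn(int16_t x) {
    return static_cast<int>(std::floor((subpixelX(x) + 7.0) / 16.0));
}
```

For a left edge, firstCoveredColumn gives the first filled column (0 for -32692, 1 for -32691, and so on). For a right edge the last filled column is one less, which matches 32708 filling through 478 and 32709 filling through 479.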

-[Unknown]

@unknownbrackets
Collaborator Author

Well, I think that's basically it. Here's some data:

https://docs.google.com/spreadsheets/d/1b2nocvhQw4wfrein6YUxOt8uIvSS4bp7WBtqimZXCBo/edit#gid=0

-[Unknown]

@unknownbrackets
Collaborator Author

Floating point seems similar:

https://docs.google.com/spreadsheets/d/1b2nocvhQw4wfrein6YUxOt8uIvSS4bp7WBtqimZXCBo/edit#gid=116555188

So basically, it's first scaled to a fixed point with 4 bits behind the decimal, essentially int(pos * 8 * viewport_width) / 16, with the viewport offset applied. It rounds up at 10/16 (1010) which seems odd...

Anything that goes outside the bounds of the drawing area [0, 4095] is entirely clipped (no portion of it is drawn.) Games generally center within the drawing area.
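
The whole-primitive clipping described above could be sketched roughly as follows. This is an illustration under assumptions, not PPSSPP code: coordinates are taken to be in drawing-area units after the viewport transform, both ends of [0, 4095] are treated as inside, and the type and function names are invented.

```cpp
#include <cstddef>

struct DrawPos {
    float x;
    float y;
};

// True when a single coordinate lies within the drawing area.
inline bool insideDrawingArea(float v) {
    return v >= 0.0f && v <= 4095.0f;
}

// True when the primitive should be dropped entirely because at
// least one vertex falls outside the drawing area.
inline bool entirelyClipped(const DrawPos *verts, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        if (!insideDrawingArea(verts[i].x) || !insideDrawingArea(verts[i].y))
            return true;
    }
    return false;
}
```

Unlike conventional clipping, nothing is trimmed to the boundary here: a triangle with one vertex at x = 4096 would be skipped as a whole.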

Haven't tested depth yet.

-[Unknown]

@hrydgard
Owner

That roundup sounds odd, and not sure how we can emulate that (maybe it's enough to apply a very small lateral offset and then let things round?), but having 4 bits of subpixel precision is pretty normal for the age of hardware the PSP is (2006+ -era chips usually do 8 bits).

@unknownbrackets
Collaborator Author

Depth is definitely scaled wrong. Still trying to wrap my head around what z1/z2/minz/maxz do.

-[Unknown]

@unknownbrackets
Collaborator Author

Alright, so minz/maxz just clip the entire fragment based on depth (only outside throughmode.)

Depth (only tested s8 so far) is definitely scaled by 128 as well, not 127. See:
https://docs.google.com/spreadsheets/d/1b2nocvhQw4wfrein6YUxOt8uIvSS4bp7WBtqimZXCBo/edit?pli=1#gid=661508165

However, there are some rounding issues I'm not sure about. It seems like it rounds the same way as positions (at 0.625 fixed point .4), perhaps so that 0.5 is the "center" of 0? Hmm, actually, not sure.

I guess it would make it easier to test this if memcpy() supported depth. That's tricky due to overlapping framebuffers... but maybe best effort would help anyway. Obviously won't work on mobile...

-[Unknown]

@thedax
Collaborator

thedax commented Aug 16, 2014

Should I add this to the depth metaissue?

@unknownbrackets
Collaborator Author

So here's a change that fixes it. Not sure what the performance impact will be on various configurations.

https://github.com/unknownbrackets/ppsspp/compare/pos-scale

-[Unknown]

@hrydgard
Owner

My guess would be a tiny, tiny, maybe not even noticeable performance impact on most hardware, but hard to be sure without testing.

@solarmystic
Contributor

@unknownbrackets @hrydgard
Did some prelim testing in FFIV after applying your pos-scale branch on my ATI/Core 2 laptop and performance is pretty much identical before and after it was applied. (328 VPS on the world map before and after.)

@unknownbrackets
Collaborator Author

Well, @solarmystic this will affect pretty much ALL games. I'm most concerned about games that push a lot of vertices (maybe God of War?), and also not sure how mobile chipsets will react. But it sounds like ATI is reacting well enough.

I forget, do you have SSE 4 or not? The s16/s8 stuff uses SSE4 so on supported processors I'm expecting performance not to be much different on desktop, gpu drivers willing.

-[Unknown]

@solarmystic
Contributor

@unknownbrackets My laptop's mobile Core 2 CPU (T9550) is of the Penryn family which supports SSE4.1 so that could be why the performance wasn't visibly impacted

menu_00000

@solarmystic
Contributor

@unknownbrackets @hrydgard In case you needed a more "holistic" view of the potential impact of your pos-scale branch in other games on my hardware configuration (pretty much ancient tech by 2014 standards):-

capture

As you will note, most of the games tested before and after the branch was merged are unaffected performance-wise and the few that are (e.g. Gods Eater Burst, DOA:P, Crisis Core and GTA:VCS) aren't heavily affected. (x<5%)

System Specs:-
sysspec
