Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes for rendering in multiple windows #5497

Merged

Conversation

eoineoineoin
Copy link
Contributor

Two items:

  • When rendering non-main windows on worker threads, a semaphore was released too early, resulting in a race condition changing the gl context, potential crashes/junk rendering. Probably fixes Ahelp Popout Button is lets say, fucked space-station-14#31752

  • Workaround steam overlay being non-functional when multiple windows are open - doesn't seem to be anything that Robust is doing wrong here, but opening multiple 3.3 window contexts seems to cause problems, so just move that renderer down the priority list. Hopefully this is temporary.

Comment on lines 215 to 219
window.BlitDoneEvent?.Set();
Clyde._windowing!.WindowSwapBuffers(window.Reg);
window.BlitDoneEvent?.Set();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a semaphore was released too early, resulting in a race condition changing the gl context

I don't understand what you mean by this or what the race condition is supposed to be. First of all, the GL context never gets changed, that's the whole point of the worker thread system.

BlitDoneEvent only guards the window's RenderTexture, not the copy of it that gets made with the DrawArrays up there, so my understanding is that having the semaphore before the swap buffers is good enough.

Is the bug that the main GL context could write to the render texture for the next frame, before the DrawArrays is done? In that case surely the proper fix would be another GL fence, not a rough semaphore. Hell I'm not even sure this semaphore is a reliable way to even fix that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was too eager on that explanation - thought I saw MakeContextCurrent() being called using the worker thread's contexts at the same time. There's some race condition which is fixed by this change, though, as, when activating the overlay, I was seeing crashes, hangs, or gl errors binding objects with bonkers numbers. All of those issues went away with this fix or by setting display.thread_window_blit false.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm. So the thing is, as far as I know there's nothing out-of-spec we're strictly doing here (barring maybe the thing with another GL wait I mentioned). That said, I do suppose effectively serializing the presentation as this change effectively does would maybe reduce the chance of something else bugging out, and I honestly doubt the perf impact is significant.

Maybe make this a CVar whether the semaphore is before or after, so we can experiment with it easier? I'd be fine with defaulting it to after.

FWIW, I tested and display.thread_window_blit=false still causes Steam overlay to break :(

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I've made this a cvar.

FWIW, I tested and display.thread_window_blit=false still causes Steam overlay to break :(

Right, the steam overlay is still broken, but you stop seeing crashes/hangs, at least? I suspect that this lock is the cause of space-wizards/space-station-14#31752, since the reporter explicitly mentions not using any overlay.

Comment on lines 46 to 52
GetVersionSpec(RendererOpenGLVersion.GL33),
GetVersionSpec(RendererOpenGLVersion.GL31),
GetVersionSpec(RendererOpenGLVersion.GLES3),
GetVersionSpec(RendererOpenGLVersion.GLES2),
// Preferentially create renderers with versions other than 3.3
// As activating the Steam Overlay with multiple 3.3 windows
// causes the overlay to get corrupted.
GetVersionSpec(RendererOpenGLVersion.GL33),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tested and this doesn't fix steam overlay for me.

Without any analysis of how Steam overlay's GL hooking works to justify why this magic hail mary fix would work, I would say this isn't worth investing in.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's unfortunate - it works on my machines, one using Mesa and one using NVIDIA GL implementations. Are you sure it doesn't end up falling back to 3.3 on your machine anyway?

Obviously, I can't dig into how the Steam Overlay works, but the kind of rendering errors it's generating feel like it's reading "random" data, which is consistent with incorrect offsets into the GL context. I wrote a bare-minimum test (https://gist.github.com/eoineoineoin/7873ad11bff891e0ae7629e12a0acc3b) to compare different configurations and on both machines, 3.3 was the only one with overt overlay problems. If it's broken for you with a different configuration, that would be a useful datapoint to have.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likely this is platform specific then, I'm on AMD Windows. The GL context for me is always 4.6 even if I request 3.1

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, when you limit it to OpenGL 3.1, is your GL context still giving you GL_ARB_sync? You can check based on the log on startup:

[DEBG] clyde.ogl: OpenGL capabilities:
[DEBG] clyde.ogl:   khr_debug: True
[DEBG] clyde.ogl:   sampler_objects: True
[DEBG] clyde.ogl:   texture_swizzle: True
[DEBG] clyde.ogl:   vertex_array_object: True
[DEBG] clyde.ogl:   fence_sync: True     <----- THIS SHOULD SAY TRUE
[DEBG] clyde.ogl:   map_buffer: True
[DEBG] clyde.ogl:   map_buffer_range: True
[DEBG] clyde.ogl:   pixel_buffer_object: True
[DEBG] clyde.ogl:   standard_derivatives: True
[DEBG] clyde.ogl:   GLES: False

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok no I did some more testing. For starters apparently when requesting 3.1, I actually get a 4.6 Compatibility profile. When requesting 3.3 I do get a 3.3 Core. Whatever.

I really can't find any pattern other than "suddenly it might just break" even with various mixing and matching of these params. (Windows, AMD, Steam overlay)

[display]
thread_window_blit = false
ogl_block_fence_sync = true
opengl_version = 2

At the end of the day, when this shit goes wrong, it goes wrong bad and could likely cause epileptic seizures. That alone means I don't feel like relying on any hail mary fixes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, agree; I had observed that 3.1 was working fine on both my machines running different OpenGL implementations, but if it's inconsistent between implementations, we can't rely on it. I've reverted the change which shuffles the preferred renderer order.

@PJB3005 PJB3005 merged commit 5a82df2 into space-wizards:master Oct 19, 2024
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Ahelp Popout Button is lets say, fucked
2 participants