-
Notifications
You must be signed in to change notification settings - Fork 408
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixes for rendering in multiple windows #5497
Fixes for rendering in multiple windows #5497
Conversation
window.BlitDoneEvent?.Set(); | ||
Clyde._windowing!.WindowSwapBuffers(window.Reg); | ||
window.BlitDoneEvent?.Set(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a semaphore was released too early, resulting in a race condition changing the gl context
I don't understand what you mean by this or what the race condition is supposed to be. First of all, the GL context never gets changed, that's the whole point of the worker thread system.
BlitDoneEvent
only guards the window's RenderTexture
, not the copy of it that gets made with the DrawArrays
up there, so my understanding is that having the semaphore before the swap buffers is good enough.
Is the bug that the main GL context could write to the render texture for the next frame, before the DrawArrays
is done? In that case surely the proper fix would be another GL fence, not a rough semaphore. Hell I'm not even sure this semaphore is a reliable way to even fix that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was too eager on that explanation - thought I saw MakeContextCurrent() being called using the worker thread's contexts at the same time. There's some race condition which is fixed by this change, though, as, when activating the overlay, I was seeing crashes, hangs, or gl errors binding objects with bonkers numbers. All of those issues went away with this fix or by setting display.thread_window_blit false.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm. So the thing is, as far as I know there's nothing out-of-spec we're strictly doing here (barring maybe the thing with another GL wait I mentioned). That said, I do suppose effectively serializing the presentation as this change effectively does would maybe reduce the chance of something else bugging out, and I honestly doubt the perf impact is significant.
Maybe make this a CVar whether the semaphore is before or after, so we can experiment with it easier? I'd be fine with defaulting it to after.
FWIW, I tested and display.thread_window_blit=false
still causes Steam overlay to break :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I've made this a cvar.
FWIW, I tested and display.thread_window_blit=false still causes Steam overlay to break :(
Right, the steam overlay is still broken, but you stop seeing crashes/hangs, at least? I suspect that this lock is the cause of space-wizards/space-station-14#31752, since the reporter explicitly mentions not using any overlay.
GetVersionSpec(RendererOpenGLVersion.GL33), | ||
GetVersionSpec(RendererOpenGLVersion.GL31), | ||
GetVersionSpec(RendererOpenGLVersion.GLES3), | ||
GetVersionSpec(RendererOpenGLVersion.GLES2), | ||
// Preferentially create renderers with versions other than 3.3 | ||
// As activating the Steam Overlay with multiple 3.3 windows | ||
// causes the overlay to get corrupted. | ||
GetVersionSpec(RendererOpenGLVersion.GL33), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've tested and this doesn't fix steam overlay for me.
Without any analysis of how Steam overlay's GL hooking works to justify why this magic hail mary fix would work, I would say this isn't worth investing in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's unfortunate - it works on my machines, one using Mesa and one using NVIDIA GL implementations. Are you sure it doesn't end up falling back to 3.3 on your machine anyway?
Obviously, I can't dig into how the Steam Overlay works, but the kind of rendering errors it's generating feel like it's reading "random" data, which is consistent with incorrect offsets into the GL context. I wrote a bare-minimum test (https://gist.github.com/eoineoineoin/7873ad11bff891e0ae7629e12a0acc3b) to compare different configurations and on both machines, 3.3 was the only one with overt overlay problems. If it's broken for you with a different configuration, that would be a useful datapoint to have.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likely this is platform specific then, I'm on AMD Windows. The GL context for me is always 4.6 even if I request 3.1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait, when you limit it to OpenGL 3.1, is your GL context still giving you GL_ARB_sync
? You can check based on the log on startup:
[DEBG] clyde.ogl: OpenGL capabilities:
[DEBG] clyde.ogl: khr_debug: True
[DEBG] clyde.ogl: sampler_objects: True
[DEBG] clyde.ogl: texture_swizzle: True
[DEBG] clyde.ogl: vertex_array_object: True
[DEBG] clyde.ogl: fence_sync: True <----- THIS SHOULD SAY TRUE
[DEBG] clyde.ogl: map_buffer: True
[DEBG] clyde.ogl: map_buffer_range: True
[DEBG] clyde.ogl: pixel_buffer_object: True
[DEBG] clyde.ogl: standard_derivatives: True
[DEBG] clyde.ogl: GLES: False
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok no I did some more testing. For starters apparently when requesting 3.1, I actually get a 4.6 Compatibility profile. When requesting 3.3 I do get a 3.3 Core. Whatever.
I really can't find any pattern other than "suddenly it might just break" even with various mixing and matching of these params. (Windows, AMD, Steam overlay)
[display]
thread_window_blit = false
ogl_block_fence_sync = true
opengl_version = 2
At the end of the day, when this shit goes wrong, it goes wrong bad and could likely cause epileptic seizures. That alone means I don't feel like relying on any hail mary fixes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, agree; I had observed that 3.1 was working fine on both my machines running different OpenGL implementations, but if it's inconsistent between implementations, we can't rely on it. I've reverted the change which shuffles the preferred renderer order.
Two items:
When rendering non-main windows on worker threads, a semaphore was released too early, resulting in a race condition changing the gl context, potential crashes/junk rendering. Probably fixes Ahelp Popout Button is lets say, fucked space-station-14#31752
Workaround steam overlay being non-functional when multiple windows are open - doesn't seem to be anything that Robust is doing wrong here, but opening multiple 3.3 window contexts seems to cause problems, so just move that renderer down the priority list. Hopefully this is temporary.