Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DxEngine::WaitUntilCanRender() should not have a hard Sleep() #13268

Closed
Corillian opened this issue Jun 10, 2022 · 3 comments
Closed

DxEngine::WaitUntilCanRender() should not have a hard Sleep() #13268

Corillian opened this issue Jun 10, 2022 · 3 comments
Labels
Area-Performance Performance-related issue Area-Rendering Text rendering, emoji, complex glyph & font-fallback issues Help Wanted We encourage anyone to jump in on these. Issue-Task It's a feature request, but it doesn't really need a major design. Needs-Tag-Fix Doesn't match tag requirements Product-Terminal The new Windows Terminal.

Comments

@Corillian
Copy link

Corillian commented Jun 10, 2022

I work as an engine/graphics programmer in the games industry and seeing this causes some amount of stress. I've gone through a bit of the rendering code and I have a couple of observations, for whatever they're worth:

  1. Your DxEngine rendering code seems to be written based on assumptions that are true when working with traditional CPU-based software rasterizers - i.e. old GDI based rendering. These assumptions are frequently not true when working with GPU's. For example, it can actually be faster to submit a single draw call that does a lot of work, like touching every pixel by redrawing the entire window, instead of sending a whole bunch of draw calls to do small amounts of work, like updating multiple disparate dirty rect's.
  2. There should not be any circumstance under which your rendering thread is causing throughput issues. The logical view of text should update and the rendering thread should render this view as quickly as possible by requesting an immutable snapshot. Given that this application isn't a game, and maxing FPS is entirely irrelevant, you should wait on v-sync instead of a hardcoded period of time that could, ironically, still result in tearing if you aren't waiting on v-sync anyway for final submission. If you want to be extra high-speed about it your rendering thread does not have to be an actual dedicated OS thread that blocks. You can have an event get set on v-sync that notifies a worker thread in a thread pool so that your rendering thread is just a logical thread.
  3. You seem to go through some effort to avoid clearing the background and this doesn't make any sense to me. A solid color fill of a rectangle is pretty much the fastest thing a GPU can do and it seems like your time is better spent on optimizing the rendering of the foreground. For example, you could cache command buffers with all of the state necessary to render a single line of text. I haven't ever used D2D so I don't know how flexible it is in this regard but given that I can render millions of objects significantly more complex than text at 60fps+ I would think, worst case, you could brute force the entire screen every frame without issue.
@ghost ghost added Needs-Triage It's a new issue that the core contributor team needs to triage at the next triage meeting Needs-Tag-Fix Doesn't match tag requirements labels Jun 10, 2022
@zadjii-msft
Copy link
Member

There's a lot to unpack here and it's almost the weekend, so sorry for the delays in a longer, better response.

It's important to know that most of the stack here has evolved very organically over the course of the last decade. The GDI engine came from the collection of disparate draw calls all over the codebase. The DX engine came from the GDI engine and (some other internal dx renderer that worked well enough). The DX engine was hacked together for conhost long before Terminal even existed.

IIRC, the original sleep is descended from the GDI engine where yea, the rendering needs to happen on the CPU. There was certainly a point in that engine where trying to repaint more often than 1/60s did create a detrimental impact on the throughput. The way it existed originally, there was only ever a Renderer and a GdiEngine, and the Sleep was originally in the Renderer. Many refactors later left that in the DxEngine, which we might be able to just get rid on now, yea.

I believe we once experimented with syncing the DxEngine's frames to the v-sync. That was for #649. The original experiment didn't pan out from what I remember, and we haven't really revisited since. We could probably try that again here, in place of the Sleep().

Overall, I think a lot of what you've mentioned here is stuff that we're hoping to address with the new atlas renderer. Especially changing the renderer interface to give the engine a snapshot of the viewport's contents. That's one of the big things we're hoping to expose around the #8000 timeframe.

@zadjii-msft zadjii-msft added Help Wanted We encourage anyone to jump in on these. Area-Rendering Text rendering, emoji, complex glyph & font-fallback issues Area-Performance Performance-related issue Product-Terminal The new Windows Terminal. Issue-Task It's a feature request, but it doesn't really need a major design. labels Jun 10, 2022
@ghost ghost removed the Needs-Tag-Fix Doesn't match tag requirements label Jun 10, 2022
@zadjii-msft zadjii-msft added the Needs-Discussion Something that requires a team discussion before we can proceed label Jun 13, 2022
@lhecker
Copy link
Member

lhecker commented Jun 13, 2022

Overall, I think a lot of what you've mentioned here is stuff that we're hoping to address with the new atlas renderer.

FYI you can find it here: https://github.com/microsoft/terminal/tree/main/src/renderer/atlas
This engine doesn't make the mistake mentioned in your 1st point. The code isn't particularly beautiful or well documented yet unfortunately. I'm currently working on rewriting our text buffer (#8000) instead to support - among others - fast buffer snapshots during rendering and so my time is a bit "stretched thin". Afterwards we can remove the Sleep() for sure. There's a lot of highly exciting performance optimization work left to do in this area! 🙂
Here are instructions about how to enable this engine in Terminal Preview.

If you or anyone else has any tips, advice, or just feedback for the newer AtlasEngine, that'd be absolutely amazing!

@zadjii-msft zadjii-msft removed Needs-Triage It's a new issue that the core contributor team needs to triage at the next triage meeting Needs-Discussion Something that requires a team discussion before we can proceed labels Jun 13, 2022
@zadjii-msft
Copy link
Member

Alrighty so we discussed this as a team yesterday.

Turns out, if you totally remove that Sleep in the DxEngine, it does start to negatively impact throughput. Without it, the DX engine spends too much time holding the terminal lock, doing too much CPU-side work, and ultimately slows the Terminal down. This could theoretically be replaced by a vsync wait there - we'd want to carefully measure any throughput changes that might cause. However, our long term strategy is to migrate to the atlas renderer entirely, and sunset the DxEngine. As discussed above, the Atlas renderer doesn't have this sort of issue. Overall, that work is being tracked in #9999, so I'll direct follow-up there.

Thanks!

@zadjii-msft zadjii-msft closed this as not planned Won't fix, can't repro, duplicate, stale Jun 14, 2022
@ghost ghost added the Needs-Tag-Fix Doesn't match tag requirements label Jun 14, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-Performance Performance-related issue Area-Rendering Text rendering, emoji, complex glyph & font-fallback issues Help Wanted We encourage anyone to jump in on these. Issue-Task It's a feature request, but it doesn't really need a major design. Needs-Tag-Fix Doesn't match tag requirements Product-Terminal The new Windows Terminal.
Projects
None yet
Development

No branches or pull requests

3 participants