Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core: Optimize BMD.fillRect() #17946

Merged
merged 1 commit into from
Sep 16, 2024

Conversation

adrian17
Copy link
Collaborator

@adrian17 adrian17 commented Sep 16, 2024

This removed repeated range checks on each pixel, replacing them by possibly-memcpy.
I originally only added set_pixel32_row_raw, which improved 640x480 fills by 3x (FF) - 5x (desktop).
Then I tried adding a full-fill fast path, which (as expected) didn't do a noticeable difference on desktop, but somehow improved FF times by another 2x.

EDIT: I just remembered that I tested browser perf without wasm extensions, but bulk-memory might also do a difference here. Will check it soon.
With wasm-opt and bulk-memory enabled, the line-fill path and full-fill path are equally fast, and another 2x faster than the best result without extensions. Just slightly slower than desktop now (though weirdly, the FF profiler doesn't show memory.fill like it used to).

This only helps the CPU side, it won't help is most time is spent waiting for GPU.

@adrian17 adrian17 added A-rendering Area: Rendering & Graphics T-perf labels Sep 16, 2024
@torokati44
Copy link
Member

torokati44 commented Sep 16, 2024

Ooh, really nice!

I think, instead of detecting a "full fill", you could, with a little bit more code, make it ever so slightly more generic, by detecting "full width fill of rows from i through j" - then the full fill would be just an accidental edge case of this, while still performing the same, I reckon.

nit: I think this is A-core, not A-rendering... :P Maybe both...? 👀 - See the discussion on Discord about this...

@adrian17
Copy link
Collaborator Author

I think, instead of detecting a "full fill", you could, with a little bit more code, make it ever so slightly more generic, by detecting "full width fill of rows from i through j" - then the full fill would be just an accidental edge case of this, while still performing the same, I reckon.

Yeah, but I'd say at this point the difference between 1 big copy and H width-sized copies is already quite small, so I'm not sure if it's worth it. I was already on a fence about adding the fill() path.

@torokati44
Copy link
Member

Yeah, but I'd say at this point the difference between 1 big copy and H width-sized copies is already quite small, so I'm not sure if it's worth it. I was already on a fence about adding the fill() path.

That's fair - it's not like I can find any real world content right off the top of my head that does "fill all but the last 3 lines of a 5x1700 BitmapData repeatedly". And even if such content does exist ... hopefully we'll notice...?

Copy link
Member

@torokati44 torokati44 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@adrian17 adrian17 added A-core Area: Core player, where no other category fits and removed A-rendering Area: Rendering & Graphics labels Sep 16, 2024
@adrian17 adrian17 enabled auto-merge (rebase) September 16, 2024 19:06
@adrian17 adrian17 merged commit 4ab0910 into ruffle-rs:master Sep 16, 2024
17 checks passed
@adrian17 adrian17 deleted the optimize-bmd-fillrect branch September 16, 2024 19:06
@danielhjacobs danielhjacobs added the T-perf Type: Performance Improvements label Sep 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-core Area: Core player, where no other category fits T-perf Type: Performance Improvements
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants