forked from bevyengine/bevy
Start running systems while prepare_systems is running (bevyengine#4919)
# Objective

While using the ParallelExecutor, systems do not actually start until `prepare_systems` completes. In stages where there are large numbers of "empty" systems with very little work to do, this delay adds significant overhead, which can add up over many stages.

## Solution

Immediately and synchronously signal the start of systems that can run without dependencies inside `prepare_systems` instead of waiting for the first executor iteration after `prepare_systems` completes. Any system that is dependent on them still cannot run until after `prepare_systems` completes, but there are a large number of unconstrained systems in the base engine where this is a general benefit in almost every case.

## Performance

This change was tested against `many_foxes` in the default configuration. As this change is sensitive to the overhead around scheduling systems, the spans for measuring system timing, system overhead, and system commands were all commented out for these measurements.

The median stage timings between `main` and this PR are as follows:

|stage|main|this PR|
|:--|:--|:--|
|First|75.54 us|61.61 us|
|LoadAssets|51.05 us|42.32 us|
|PreUpdate|54.6 us|55.56 us|
|Update|61.89 us|51.5 us|
|PostUpdate|7.27 ms|6.71 ms|
|AssetEvents|47.82 us|35.95 us|
|Last|39.19 us|37.71 us|
|reserve_and_flush|57.83 us|48.2 us|
|Extract|1.41 ms|1.28 ms|
|Prepare|554.49 us|502.53 us|
|Queue|216.29 us|207.51 us|
|Sort|67.03 us|60.99 us|
|Render|1.73 ms|1.58 ms|
|Cleanup|33.55 us|30.76 us|
|Clear Entities|18.56 us|17.05 us|
|**full frame**|**11.9 ms**|**10.91 ms**|

For the first few stages, the benefit is small but cumulative over each. For PostUpdate in particular, this allows `parent_update` to run while `prepare_systems` is running, which is required for the animation and transform propagation systems, which dominate the time spent in the stage, but also frontloads the contention as the other "empty" systems are also running while `parent_update` is running.
For Render, where there is just a single large exclusive system, the benefit comes from not waiting on a spuriously scheduled task on the task pool to kick off the system: it's immediately scheduled to run.
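
The idea in the Solution section can be sketched in isolation. The following is a minimal, standard-library-only illustration and not the actual executor code: `SystemNode`, `PrepareOutcome`, `example_stage`, and the system name `transform_propagate` are hypothetical names invented for this example, and plain scoped threads stand in for the task pool. It only shows the ordering change: a system with no dependencies is started from inside the preparation loop, while a dependent system is left for the later executor pass.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Hypothetical stand-in for a scheduled system (not Bevy's real type).
struct SystemNode {
    name: &'static str,
    /// Indices of systems that must finish before this one may start.
    dependencies: Vec<usize>,
}

/// What the illustrative prepare pass reports back.
struct PrepareOutcome {
    /// Systems started from inside the prepare loop.
    started_early: Vec<&'static str>,
    /// Systems left for the executor loop that runs after preparation.
    deferred: Vec<&'static str>,
    /// How many early systems had finished by the time the scope was joined.
    completed_during_prepare: usize,
}

/// Sketch of a prepare pass that signals dependency-free systems immediately.
fn prepare_systems(systems: &[SystemNode]) -> PrepareOutcome {
    let completed = AtomicUsize::new(0);
    let mut started_early = Vec::new();
    let mut deferred = Vec::new();

    // Simplification: `thread::scope` joins every spawned thread before it
    // returns, so in this sketch all early systems finish inside "prepare".
    thread::scope(|scope| {
        for system in systems {
            // Per-system preparation work would happen here.
            if system.dependencies.is_empty() {
                // No dependencies: start right away instead of waiting
                // for the first executor iteration.
                started_early.push(system.name);
                let completed = &completed;
                scope.spawn(move || {
                    // Stand-in for the system's real work.
                    completed.fetch_add(1, Ordering::SeqCst);
                });
            } else {
                // Has dependencies: cannot start until preparation is done.
                deferred.push(system.name);
            }
        }
    });

    PrepareOutcome {
        started_early,
        deferred,
        completed_during_prepare: completed.load(Ordering::SeqCst),
    }
}

/// Illustrative stage: two unconstrained systems and one dependent system.
fn example_stage() -> Vec<SystemNode> {
    vec![
        SystemNode { name: "parent_update", dependencies: vec![] },
        SystemNode { name: "transform_propagate", dependencies: vec![0] },
        SystemNode { name: "empty_system", dependencies: vec![] },
    ]
}
```

Calling `prepare_systems(&example_stage())` starts `parent_update` and `empty_system` from inside the loop and defers `transform_propagate`, which mirrors the PostUpdate case described above.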
Showing 2 changed files with 114 additions and 43 deletions.