Implement shader caching #49050

Merged 1 commit on May 31, 2021

@reduz reduz commented May 25, 2021

  • Shader compilation is now cached. Subsequent loads take less than a millisecond.
  • Improved game, editor and project manager startup time.
  • Editor uses .godot/shader_cache to store shaders.
  • Game uses user://shader_cache
  • Project manager uses $config_dir/shader_cache
  • Options to tweak shader caching in project settings.
  • Editor path configuration moved from EditorSettings to new class, EditorPaths, so it can be available early on (before shaders are compiled).
  • Reworked ShaderCompilerRD to ensure deterministic shader code creation (else shader may change and cache will be invalidated).
  • Added shader compression with SMOLV: https://github.com/aras-p/smol-v
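
The caching flow described in the list above can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: `ShaderCacheSketch`, `fnv1a64` and the in-memory map (standing in for files under a directory such as `.godot/shader_cache`) are all hypothetical, and hash collisions are ignored. The point it shows is why deterministic shader code generation matters: the key is derived from the generated source, so any change in the text yields a new key and a cache miss.

```cpp
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// FNV-1a is used here only because its result is stable across runs and
// platforms (std::hash gives no such guarantee). The PR may hash differently.
inline std::uint64_t fnv1a64(const std::string &data) {
    std::uint64_t h = 14695981039346656037ULL; // FNV offset basis
    for (unsigned char c : data) {
        h ^= c;
        h *= 1099511628211ULL; // FNV prime
    }
    return h;
}

// Hypothetical cache: an in-memory map stands in for files on disk.
class ShaderCacheSketch {
public:
    using Bytes = std::vector<std::uint8_t>;
    using CompileFn = std::function<Bytes(const std::string &)>;

    explicit ShaderCacheSketch(std::string version_key)
        : version_key_(std::move(version_key)) {}

    // Returns cached bytecode, invoking the compiler only on a miss.
    const Bytes &get_or_compile(const std::string &source, const CompileFn &compile) {
        const std::uint64_t key = fnv1a64(version_key_ + '\n' + source);
        auto it = entries_.find(key);
        if (it == entries_.end()) {
            ++misses_;
            it = entries_.emplace(key, compile(source)).first;
        }
        return it->second;
    }

    int misses() const { return misses_; }

private:
    std::string version_key_;
    std::unordered_map<std::uint64_t, Bytes> entries_;
    int misses_ = 0;
};
```

Requesting the same source twice compiles once; changing even whitespace in the source counts as a different shader, which is the invalidation the ShaderCompilerRD rework is meant to avoid.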

Please test.

@reduz reduz requested review from a team as code owners May 25, 2021 00:33
@Calinou Calinou added this to the 4.0 milestone May 25, 2021
static String _get_cache_key_function_glsl(const RenderingDevice::Capabilities *p_capabilities) {
	String version;
	// Everything that can change the compiled output goes into the key.
	version = "SpirVGen=" + itos(glslang::GetSpirvGeneratorVersion()) + ", major=" + itos(p_capabilities->version_major) + ", minor=" + itos(p_capabilities->version_minor) + " , subgroup_size=" + itos(p_capabilities->subgroup_size) + " , subgroup_ops=" + itos(p_capabilities->subgroup_operations) + " , subgroup_in_shaders=" + itos(p_capabilities->subgroup_in_shaders);
	return version;
}

Multiview settings need to be added here too, but since a few more are coming, I'll do that as part of the multiview stereo render PR.
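
To illustrate why such settings belong in the key, here is a minimal sketch under stated assumptions: `CapabilitiesSketch` is a hypothetical stand-in for `RenderingDevice::Capabilities` whose fields mirror the labels in the key string above, and `supports_multiview` is an assumed name, not one taken from this PR or the multiview PR.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-in for RenderingDevice::Capabilities.
struct CapabilitiesSketch {
    std::uint32_t version_major = 1;
    std::uint32_t version_minor = 0;
    std::uint32_t subgroup_size = 0;
    std::uint32_t subgroup_operations = 0;
    std::uint32_t subgroup_in_shaders = 0;
    bool supports_multiview = false; // assumed field name, for illustration only
};

// Builds a key string; any capability that affects compilation is included,
// so devices that differ in that capability never share cache entries.
inline std::string make_cache_key(const CapabilitiesSketch &caps) {
    return "major=" + std::to_string(caps.version_major) +
            ", minor=" + std::to_string(caps.version_minor) +
            ", subgroup_size=" + std::to_string(caps.subgroup_size) +
            ", subgroup_ops=" + std::to_string(caps.subgroup_operations) +
            ", subgroup_in_shaders=" + std::to_string(caps.subgroup_in_shaders) +
            ", multiview=" + std::to_string(caps.supports_multiview ? 1 : 0);
}
```

Two otherwise identical capability sets that differ only in the multiview flag produce different keys, so shaders compiled for one are not reused for the other.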

@BastiaanOlij BastiaanOlij left a comment

Super cool! This looks like a sound approach to me. Nothing obvious stood out during a quick code review.

Calinou commented May 25, 2021

I benchmarked it. Note that the times reported also include shutdown time (which takes about 3.5 seconds on average on the master branch).

I didn't notice any visual regressions so far in my testing.

System information

  PROCESSOR:          Intel Core i7-6700K @ 4.40GHz
    Core Count:       4                                        
    Thread Count:     8                                        
    Extensions:       SSE 4.2 + AVX2 + AVX + RDRAND + FSGSBASE 
    Cache Size:       8 MB                                     
    Microcode:        0xe2                                     
    Core Family:      Skylake                                  
    Scaling Driver:   intel_pstate powersave                   

  GRAPHICS:           Gigabyte NVIDIA GeForce GTX 1080 8GB
    Frequency:        1873/5005MHz     
    OpenGL:           4.6.0            
    Vulkan:           1.2.168          
    Display Driver:   NVIDIA 465.27    
    Monitor:          3 x Q32G1WG4     
    Screen:           7680x1440        

  MOTHERBOARD:        MSI Z170A GAMING PRO CARBON
    BIOS Version:     1.30                          
    Chipset:          Intel Xeon E3-1200 v5/E3-1500 
    Audio:            Intel 100 /C230               
    Network:          Intel I219-V                  

  MEMORY:             32GB

  DISK:               1000GB Samsung SSD 860 + 500GB Samsung SSD 850 + 1000GB Samsung SSD 850
    File-System:      ext4             
    Mount Options:    relatime rw      
    Disk Scheduler:   BFQ              

  OPERATING SYSTEM:   Fedora 33
    Kernel:           5.11.17-200.fc33.x86_64 (x86_64)                                                                     
    Desktop:          KDE Plasma 5.20.5                                                                                    
    Display Server:   X Server 1.20.11                                                                                     
    Compiler:         Clang 11.0.0 + LLVM 11.0.0                                                                           
    Security:         itlb_multihit: KVM: Mitigation of VMX disabled                                                       
                      + l1tf: Mitigation of PTE Inversion; VMX: vulnerable                                                 
                      + mds: Vulnerable; SMT vulnerable                                                                    
                      + meltdown: Vulnerable                                                                               
                      + spec_store_bypass: Vulnerable                                                                      
                      + spectre_v1: Vulnerable: __user pointer sanitization and usercopy barriers only; no swapgs barriers 
                      + spectre_v2: Vulnerable IBPB: disabled STIBP: disabled                                              
                      + srbds: Vulnerable                                                                                  
                      + tsx_async_abort: Vulnerable   

Godot versions

All binaries were compiled with the same Clang version and SCons options.

  • bin/godot.linuxbsd.tools.64.llvm: 4.0.dev.custom_build.f11219175 (this branch)
  • bin/godot.linuxbsd.tools.64.llvm.vanilla: 4.0.dev.custom_build.af03e9c83 (master branch)
  • bin/godot.x11.tools.64.llvm: 3.4.beta.custom_build.a38b44741

Project manager

❯ hyperfine -i "bin/godot.linuxbsd.tools.64.llvm --quit" "bin/godot.linuxbsd.tools.64.llvm.vanilla --quit" "bin/godot.x11.tools.64.llvm --quit"

Benchmark #1: bin/godot.linuxbsd.tools.64.llvm --quit
  Time (mean ± σ):      3.959 s ±  0.354 s    [User: 3.090 s, System: 0.159 s]
  Range (min … max):    3.282 s …  4.260 s    10 runs
 
  Warning: Ignoring non-zero exit code.
 
Benchmark #2: bin/godot.linuxbsd.tools.64.llvm.vanilla --quit
  Time (mean ± σ):      5.573 s ±  0.094 s    [User: 8.848 s, System: 0.195 s]
  Range (min … max):    5.417 s …  5.693 s    10 runs
 
  Warning: Ignoring non-zero exit code.
 
Benchmark #3: bin/godot.x11.tools.64.llvm --quit
  Time (mean ± σ):      2.577 s ±  0.145 s    [User: 1.665 s, System: 0.112 s]
  Range (min … max):    2.459 s …  2.868 s    10 runs
 
Summary
  'bin/godot.x11.tools.64.llvm --quit' ran
    1.54 ± 0.16 times faster than 'bin/godot.linuxbsd.tools.64.llvm --quit'
    2.16 ± 0.13 times faster than 'bin/godot.linuxbsd.tools.64.llvm.vanilla --quit'

Empty project

❯ hyperfine -i "bin/godot.linuxbsd.tools.64.llvm /tmp/c/project.godot --quit" "bin/godot.linuxbsd.tools.64.llvm.vanilla /tmp/c/project.godot --quit" "bin/godot.x11.tools.64.llvm /tmp/d/project.godot --quit"

Benchmark #1: bin/godot.linuxbsd.tools.64.llvm /tmp/c/project.godot --quit
  Time (mean ± σ):     12.142 s ±  0.154 s    [User: 10.030 s, System: 0.329 s]
  Range (min … max):   11.909 s … 12.268 s    10 runs
 
  Warning: Ignoring non-zero exit code.
 
Benchmark #2: bin/godot.linuxbsd.tools.64.llvm.vanilla /tmp/c/project.godot --quit
  Time (mean ± σ):     15.639 s ±  1.005 s    [User: 31.154 s, System: 0.389 s]
  Range (min … max):   14.500 s … 18.368 s    10 runs
 
  Warning: Ignoring non-zero exit code.
  Warning: The first benchmarking run for this command was significantly slower than the rest (18.368 s). This could be caused by (filesystem) caches that were not filled until after the first run. You should consider using the '--warmup' option to fill those caches before the actual benchmark. Alternatively, use the '--prepare' option to clear the caches before each timing run.
 
Benchmark #3: bin/godot.x11.tools.64.llvm /tmp/d/project.godot --quit
  Time (mean ± σ):      6.516 s ±  0.274 s    [User: 3.201 s, System: 0.198 s]
  Range (min … max):    5.749 s …  6.653 s    10 runs
 
Summary
  'bin/godot.x11.tools.64.llvm /tmp/d/project.godot --quit' ran
    1.86 ± 0.08 times faster than 'bin/godot.linuxbsd.tools.64.llvm /tmp/c/project.godot --quit'
    2.40 ± 0.18 times faster than 'bin/godot.linuxbsd.tools.64.llvm.vanilla /tmp/c/project.godot --quit'

Conclusion

The new shader caching helps noticeably, but there's still more work to do to get back to 3.x speeds.

@TokisanGames

> Conclusion
>
> The new shader caching helps noticeably, but there's still more work to do to get back to 3.x speeds.

While this doesn't sound good, I'm not sure of the value of benchmarking an empty project. You need a complex reference scene with lighting and many materials.

Godot 3 has severe game lag (as in no one would play it), where it will seize the engine to recompile the shaders. No logging, no debugging, monitors don't update. No information unless you build a custom engine with printk logging directly in the renderer. It even lags in the editor, 20-300ms each time.

The work done so far to cache shaders in #46330 hasn't yet been more effective than using hacky workarounds.

My hope is that Vulkan and the shader caching here finally eliminate this performance lag. However, if it's not tested against a complex reference scene that currently lags in Godot 3, how will we know whether the root problem this and other PRs attempt to address is resolved?

nathanfranke commented May 25, 2021

Edit: Made wording clearer

Note that this PR is a great improvement, but as of now it still doesn't fully resolve the underlying startup performance problems. As Calinou noted, startup times are still not optimal, and first-time startups can also still be improved. Otherwise, thank you very much for the hard work that went into this.

akien-mga commented May 25, 2021

@nathanfranke This is not advertised as a "fix" for startup time issues. It improves startup times significantly, but there are many other fixes that can be made to further reduce startup times.

That doesn't mean that shader caching shouldn't be merged as is.

The real potential gain from this is not so much startup time, but avoiding shader compilation stutters in-game (or at most having them happen only the first time the game is played, if we can't do shader pre-compilation at startup).

@akien-mga

> Project manager uses $config_dir/shader_cache

Not super important, but this should likely use the cache dir instead of the config dir (~/.cache/godot/shader_cache)?
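
The path in that suggestion matches the XDG Base Directory convention on Linux, where the cache base is `$XDG_CACHE_HOME` and falls back to `$HOME/.cache` when that variable is unset or empty. A minimal sketch of that lookup, assuming C++17; `shader_cache_dir` is a hypothetical helper, and Godot's own path resolution lives in its OS layer and may differ:

```cpp
#include <filesystem>
#include <string>

// The environment values are passed in (rather than read with std::getenv)
// so the function is easy to test. Pass an empty string for an unset variable.
inline std::filesystem::path shader_cache_dir(const std::string &xdg_cache_home,
        const std::string &home) {
    namespace fs = std::filesystem;
    // XDG convention: use $XDG_CACHE_HOME if set, otherwise $HOME/.cache.
    const fs::path base = xdg_cache_home.empty()
            ? fs::path(home) / ".cache"
            : fs::path(xdg_cache_home);
    return base / "godot" / "shader_cache";
}
```

A caller would typically feed it the results of `std::getenv("XDG_CACHE_HOME")` and `std::getenv("HOME")`, converting a null pointer to an empty string first.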

@Chaosus Chaosus self-requested a review May 25, 2021 17:22
@reduz reduz force-pushed the implement-spirv-cache branch from f112191 to dbbf888 Compare May 25, 2021 17:39
@reduz reduz requested a review from a team as a code owner May 25, 2021 17:39
@Chaosus Chaosus removed their request for review May 25, 2021 17:45
@akien-mga akien-mga force-pushed the implement-spirv-cache branch from dbbf888 to 0d2e029 Compare May 31, 2021 08:13
@akien-mga akien-mga merged commit 596eb78 into godotengine:master May 31, 2021
@akien-mga

Thanks!

@Zireael07

I just noticed that Calinou also reported results for 3.4 - does that mean this is not Vulkan-only, as I originally guessed?

aaronfranke commented May 31, 2021

@Zireael07 No, Calinou was comparing against the current 3.x branch, which doesn't have this change, just as he compared against the then-current master without this PR.

@YuriSizov

Also, there is a huge performance decrease between 3.x and master, and this PR addresses some of it. So the comparison was made to see how close we are getting to the old performance.

JanWerder added a commit to JanWerder/gitignore that referenced this pull request Jul 29, 2022
With Godot 4 the .godot folder is introduced for import and shader_cache files.
See e.g. godotengine/godot#49050

When Godot 4 is the de facto standard, all other entries can most likely be removed.