Ogre 2.3 preparations #101
Why isn't this valid for shadow textures? They need |
Could you please detail this point a bit more? The utility I use to take screenshots does exactly what you mention on a keypress (creates a separate workspace that renders in a rtt instead of a window and manually invokes |
It should be valid for shadow textures. If it's being rejected, it's a parser bug. Dynamic shadow maps don't need this flag
If you don't touch the render window, then nothing. This flag is specifically for the last pass that touches the render window. The last pass in a frame is responsible for preparing the render window for presentation to the screen, attempting to use it afterwards (whether for reading or writing) is undefined behaviour. Thus if pass N prepares the window for presentation but you manually execute pass N+1 afterwards which needs the render window, then pass N must set This is a bit user hostile, thus I'm still thinking of better ways to solve it. Also because this flag is very specific to Vulkan, and taking screenshots from the Render Window is still WIP, exactly what needs to be done is subject to change. But if you render to a temporary RTT and take a screenshot of that RTT, then nothing needs to be done. The render window is not involved at all. |
Behavior of TextureGpu::notifyDataIsReady changed. Calling it excessively is now considered incorrect. Documented in #101.
Added mDataPreparationsPending, which is required. Otherwise calling scheduleReupload (one or more times) and then destroying the texture would immediately destroy the pointer while it may still be in use by the worker thread.
Previously this was not needed because if a texture was in the worker thread, then it meant notifyDataIsReady had never been called yet since the last time we transitioned to Resident; thus destroyTexture would automatically delay the destruction.
Since commit 95c09d7 there are three changes or upcoming changes that warrant bumping the minor version:
shader templates mix with the new ones
These are minimal changes, but taken together it is not reasonable to expect someone porting from 2.2.x to consider this "100% hassle free".
Nonetheless we should be careful not to scare anyone, since these changes are not major at all. Porting should still be easy.
Tasks remaining:
`root_layout` option and `uses_array_bindings` for Vulkan
Porting:
Switch importV1 to createByImportingV1
In 2.2.2 and earlier we had a function called Mesh::importV1 which would populate a v2 mesh by filling it with data from a v1 mesh, effectively importing it.
In 2.2.3 users should use MeshManager::createByImportingV1 instead. This function 'remembers' which meshes have been created through a conversion process, which allows device lost handling to repeat this import process and recreate the resources.
Aside from this little difference, there are no major functionality changes and the function arguments are the same.
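To illustrate why 'remembering' the conversion matters, here is a minimal, self-contained sketch of the idea. It is not Ogre's implementation: `GpuMesh`, `ImportRegistry` and all of its methods are hypothetical names invented for this example. The registry stores the recipe used to import each mesh, so the same import can be replayed once the device comes back.

```cpp
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for GPU-side mesh data; not an Ogre type.
struct GpuMesh
{
    std::vector<float> vertices;
};

// Hypothetical registry: remembers how each mesh was imported so the
// import can be repeated after the device (and its resources) is lost.
class ImportRegistry
{
public:
    typedef std::function<GpuMesh()> ImportFn;

    // Runs the import now and keeps the recipe for later.
    GpuMesh &createByImporting( const std::string &name, ImportFn importFn )
    {
        mRecipes[name] = importFn;
        mMeshes[name] = importFn();
        return mMeshes[name];
    }

    // Simulates device loss: the contents of every GPU resource are gone.
    void onDeviceLost()
    {
        for( auto &entry : mMeshes )
            entry.second.vertices.clear();
    }

    // Replays every remembered import to rebuild the resources.
    void onDeviceRestored()
    {
        for( const auto &entry : mRecipes )
            mMeshes[entry.first] = entry.second();
    }

    const GpuMesh &get( const std::string &name ) const { return mMeshes.at( name ); }

private:
    std::map<std::string, ImportFn> mRecipes;
    std::map<std::string, GpuMesh> mMeshes;
};
```

A mesh populated directly from a v1 mesh, with no recipe stored anywhere, could not be rebuilt this way; that appears to be the motivation for routing the import through the manager.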
Shadow's Normal Offset Bias
We've had a couple of complaints, but it wasn't until user SolarPortal did more exhaustive research that we realized we were not using state-of-the-art shadow mapping techniques.
We were relying on `hlmsManager->setShadowMappingUseBackFaces( true )` to hide most self-occlusion errors, but this caused other visual errors.
Normal Offset Bias is a technique from 2011 (yes, it's old!) which drastically reduces self-occlusion and shadow acne while improving overall shadow quality; and is much more robust than using inverted-culling during the caster pass.
Therefore this technique replaced the old one and the function `HlmsManager::setShadowMappingUseBackFaces()` has been removed.
Users can globally control normal-offset and constant biases per cascade by tweaking `ShadowTextureDefinition::normalOffsetBias` and `ShadowTextureDefinition::constantBiasScale` respectively.
You can also control them via compositor scripts in the shadow node declaration, using the new keywords constant_bias_scale and normal_offset_bias
Users porting from 2.2.x may notice their shadows are a bit different (for the better!), but may encounter some self-shadowing artifacts, in which case they may have to adjust these two biases.
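For readers unfamiliar with the technique, the following is a minimal sketch of one common formulation of normal offset bias. It is not Ogre's shader code, and how Ogre actually applies `normalOffsetBias` may differ. `Vec3`, `normalOffset`, `biasedReceiverPos` and the `texelWorldSize` parameter are assumptions made for this example. The receiver position is pushed along its surface normal before the shadow map lookup, and the push grows as the surface turns away from the light.

```cpp
#include <algorithm>

struct Vec3
{
    float x, y, z;
};

inline float dot( const Vec3 &a, const Vec3 &b )
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Offset to add to the receiver position before sampling the shadow map.
// normal and dirToLight are assumed to be normalized.
// texelWorldSize is the world-space size of one shadow map texel
// (an assumption of this sketch).
inline Vec3 normalOffset( const Vec3 &normal, const Vec3 &dirToLight,
                          float normalOffsetBias, float texelWorldSize )
{
    const float nDotL = dot( normal, dirToLight );
    // 0 when the surface faces the light, 1 at grazing angles.
    const float slopeScale = std::min( std::max( 1.0f - nDotL, 0.0f ), 1.0f );
    const float scale = normalOffsetBias * texelWorldSize * slopeScale;
    return Vec3{ normal.x * scale, normal.y * scale, normal.z * scale };
}

// Position actually used for the shadow map lookup.
inline Vec3 biasedReceiverPos( const Vec3 &pos, const Vec3 &normal, const Vec3 &dirToLight,
                               float normalOffsetBias, float texelWorldSize )
{
    const Vec3 offset = normalOffset( normal, dirToLight, normalOffsetBias, texelWorldSize );
    return Vec3{ pos.x + offset.x, pos.y + offset.y, pos.z + offset.z };
}
```

In this formulation a surface facing the light gets no offset at all, which is why the technique tends to disturb contact shadows less than a large constant bias would.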
Unlit vertex and pixel shaders unified
The Unlit shaders were still duplicating their code 3 times (once for each RenderSystem). All of their vertex & pixel shader code has now been unified into a single .any file.
Although this shouldn't impact you at all, users porting from 2.2.x need to make sure old Hlms shader templates from Unlit don't linger and get mixed with the new files.
Make sure the files from `Samples/Media/Hlms/Unlit` match 1:1 the ones in your project and that there aren't stray .glsl/.hlsl/.metal files from an older version.
If you have customized the Unlit implementation, you may find your customizations to be broken. But they're easy to fix. For reference look at Colibri's two commits which ported its Unlit customizations from 2.2.x to 2.3.0
Added HlmsMacroblock::mDepthClamp
It is now possible to toggle Depth Clamp on/off. Check if it's supported via RSC_DEPTH_CLAMP. All desktop GPUs should support it unless you're using extremely old OpenGL drivers.
iOS has supported it since the A11 chip (iPhone 8 or newer).
Users upgrading from older Ogre versions should be careful their libraries and headers don't get out of sync. A full rebuild is recommended.
The reason is that HlmsMacroblock (which is used almost everywhere in Ogre) added a new member variable. If a DLL or header gets out of sync, it likely won't crash but the artifacts will be very funny (most likely the depth buffer will be disabled).
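The out-of-sync failure mode can be sketched with two hypothetical struct layouts. The member names and their order below are assumptions for illustration only; they are not the real `HlmsMacroblock` definition. Code compiled against the stale header reads each member at its old byte offset, so it silently picks up the wrong field instead of crashing.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical layout seen by code compiled against the old header.
struct MacroblockOldHeader
{
    bool scissorTestEnabled;
    bool depthCheck;
    bool depthWrite;
};

// Hypothetical layout used by the rebuilt library (one new member).
struct MacroblockNewLibrary
{
    bool scissorTestEnabled;
    bool depthClamp;
    bool depthCheck;
    bool depthWrite;
};

// Reads depthCheck the way stale code would: at the OLD byte offset,
// from memory that was actually filled in using the NEW layout.
inline bool depthCheckSeenThroughStaleHeader( const MacroblockNewLibrary &block )
{
    unsigned char raw = 0;
    std::memcpy( &raw,
                 reinterpret_cast<const unsigned char *>( &block ) +
                     offsetof( MacroblockOldHeader, depthCheck ),
                 1u );
    return raw != 0u;
}
```

With depth clamp off and depth check on, the stale reader lands on the `depthClamp` byte and concludes that depth checking is disabled. Nothing crashes, the rendering is just wrong, which is why a full rebuild is the safe option.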
Added shadow pancaking
With the addition of depth clamp, we are now able to push the near plane of directional shadow maps in PSSM (non-stable variant). This greatly enhances depth buffer precision and reduces self-occlusion and acne bugs.
This improvement may make it possible for users to try using PFG_D16_UNORM instead of PFG_D32_FLOAT for shadow mapping, halving memory consumption.
Shadow pancaking is automatically disabled when depth clamp is not supported.
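A rough back-of-the-envelope sketch of why a tighter depth range makes a 16-bit format more viable. It assumes depth is stored linearly over the near-to-far range, which is a reasonable model for an orthographic (directional light) shadow map but is an assumption here, not a statement about Ogre's exact encoding. Both helper functions are hypothetical.

```cpp
#include <cstdint>

// Bytes used by one square, depth-only shadow map.
inline std::uint64_t shadowMapBytes( std::uint32_t resolution, std::uint32_t bitsPerTexel )
{
    return std::uint64_t( resolution ) * resolution * ( bitsPerTexel / 8u );
}

// Smallest depth difference a UNORM depth buffer can represent when depth
// is spread linearly over nearToFarRange world units.
inline double unormDepthStep( double nearToFarRange, std::uint32_t bits )
{
    return nearToFarRange / double( ( std::uint64_t( 1 ) << bits ) - 1u );
}
```

For example, a 2048x2048 map takes 8 MiB at 16 bits per texel versus 16 MiB at 32 bits. If pancaking shrinks the depth range from 1000 to 100 world units, the 16-bit step size drops from roughly 0.015 to roughly 0.0015 units.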
Other relevant information when porting
`HlmsListener::hlmsTypeChanged` added an argument. Beware of it if you are overriding this function.
`HlmsListener::propertiesMergedPreGenerationStep` changed its arguments. Beware of it if you are overriding this function.
`PFG_D24_UNORM_S8_UINT`, now it uses `PFG_D32_FLOAT_S8X24_UINT`
`RenderSystem::_setTexture` added an argument which should almost always be set to false (it should only be set to true if rendering to a depth buffer without writing to it while also binding that same depth buffer as a texture for sampling).
`TextureGpu` now has `getInternalWidth` and `getInternalHeight`. This happens because Vulkan on Android may require us to rotate the window ourselves to avoid performance degradation instead of letting the OS or HW do it (see `TextureGpu::setOrientationMode`). If the orientation mode is 90° or 270°, then `getInternalWidth` returns the height and `getInternalHeight` returns the width. It is only relevant for Vulkan on Android. This is important if you need to perform copy operations or use AsyncTextureTickets on oriented textures.
`CompositorManager2::addWorkspace` removed the last parameters. `ResourceLayoutMap` and `ResourceAccessMap` are no longer needed, they're automatic. `addWorkspace` now accepts a `ResourceStatusMap` in case the workspace needs to assume the texture is in a specific initial layout (very unlikely).
Old:
texture prevFrameDepthBuffer target_width target_height PFG_R32_FLOAT uav
New:
texture prevFrameDepthBuffer target_width target_height PFG_R32_FLOAT uav keep_content
This is normal for textures whose contents are meant to be carried over from the previous frame.
keep_content is for RenderTextures and/or UAVs. The question is whether they can start fresh clean in a new frame; or we must preserve their changes between frames.
Is keep_content for UAV textures or normal textures? And what about a UAV buffer whose contents have to be preserved across multiple frames?
Normal textures: No, their contents are always kept
UAV textures: Yes, if you need to preserve their contents
UAV buffers: This flag is only for textures, thus there is no need
Another way to look at it is to think of Multi-GPU rendering:
If GPU0 works on frame A and GPU1 works on frame B... can we assume each GPU can have its own independent version? or must GPU1 wait for GPU0 to finish and transfer the contents before GPU1 can start working with that texture?
This only applies to textures (for the time being), so UAV buffers don't have this flag. However they might in the future should we support multi-GPU rendering.
We discard by default in the compositor because the vast majority of textures are either temporary (i.e. used for ping-pong FXs) or reconstructed from scratch every frame.
However `textureManager->createOrRetrieve` disables this flag by default (the C++ flag is TextureFlags::DiscardableContent) to avoid breaking more complex algorithms.
`keep_content` isn't just for multi-GPU rendering (i.e. it allows the HW to reset all compression flags, start working on a resource before other components within the GPU are finished), but it is the easiest way to understand it.
Dynamic shadow maps don't need this flag.
But fully or partially static shadow maps (like shown in StaticShadowMaps sample, see `StaticShadowMaps.compositor`) need this flag, since contents are kept between frames until manually updated.
Do not call notifyDataIsReady more than needed
In Ogre 2.2 you could call notifyDataIsReady as many times as you want. In fact we gave the following example which is now possibly wrong:
However in Ogre 2.3 every `notifyDataIsReady` call must have been preceded by a call to `scheduleTransitionTo` (Resident) or `scheduleReupload`, assuming you didn't cancel the load, e.g. assuming you didn't do this:
Additionally, for `ManualTexture` textures, Ogre automatically calls `notifyDataIsReady` as soon as the texture becomes resident. Thus `notifyDataIsReady` should not be called by the user if the `TextureFlags::ManualTexture` flag is set.
The solution is simple: Remove the call to `notifyDataIsReady`, since it wasn't previously needed, and now it must not be there.
If the assert is triggering inside Ogre, then it means you previously called `notifyDataIsReady` on that texture and now Ogre is doing it again. Find where you're calling `notifyDataIsReady` and remove that line.
Terra, SSAO, Postprocessing samples and v1 Overlays were updated to reflect this change.
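The pairing rule can be pictured as a simple counter. The class below is a hypothetical model, not Ogre's `TextureGpu`: only the method names and `mDataPreparationsPending` come from the text, and everything else (including returning `false` where Ogre would assert) is an assumption made so the sketch can run on its own.

```cpp
#include <cstdint>

// Hypothetical model of the bookkeeping; not the real TextureGpu.
class TextureModel
{
public:
    // Each scheduled load or reupload promises exactly one future notification.
    void scheduleTransitionToResident() { ++mDataPreparationsPending; }
    void scheduleReupload() { ++mDataPreparationsPending; }

    // Returns false for an unmatched call. Ogre raises an assert instead;
    // this model returns a bool so the behaviour can be tested.
    bool notifyDataIsReady()
    {
        if( mDataPreparationsPending == 0u )
            return false;
        --mDataPreparationsPending;
        return true;
    }

    // While preparations are pending a worker thread may still be using the
    // texture, so destruction has to be delayed.
    bool isSafeToDestroyNow() const { return mDataPreparationsPending == 0u; }

private:
    std::uint32_t mDataPreparationsPending = 0u;
};
```

In this model, two `scheduleReupload` calls need two notifications before the texture is safe to destroy, and any extra `notifyDataIsReady` is rejected. That is the situation the new assert is meant to catch.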
Global changes for Vulkan compatibility:
`GraphicsSystem::initialize` has changed slightly. Look for `SDL2x11` and `VSync Method`
`RenderPassDescriptor::mReadyWindowForPresent` (Vulkan requirement). This is handled automatically in `CompositorManager2::prepareRenderWindowsForPresent` whenever the compositor chain changes. However if the workspace that presents to screen is disabled (i.e. you call `CompositorWorkspace::_update` manually) then you'll have to set this yourself. The same happens if the last pass is a custom pass that calls `initialize( rtv )` but never calls `setRenderPassDescToCurrent`.
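As a rough mental model of what "the last pass that touches the window prepares it for presentation" means, consider the sketch below. It is not Ogre's code: `PassModel` and `markLastWindowPass` are hypothetical, and real passes carry far more state. The only point is that exactly one pass per window carries the flag, and that it has to move whenever another window pass is appended or executed manually afterwards.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, heavily simplified pass description.
struct PassModel
{
    std::string target;          // name of the texture or window rendered to
    bool readyWindowForPresent;  // stand-in for mReadyWindowForPresent
};

// Clears the flag everywhere, then sets it on the last pass whose target is
// the given window. Returns that pass' index, or -1 if no pass uses the window.
inline int markLastWindowPass( std::vector<PassModel> &passes, const std::string &windowName )
{
    int last = -1;
    for( std::size_t i = 0u; i < passes.size(); ++i )
    {
        passes[i].readyWindowForPresent = false;
        if( passes[i].target == windowName )
            last = static_cast<int>( i );
    }
    if( last >= 0 )
        passes[static_cast<std::size_t>( last )].readyWindowForPresent = true;
    return last;
}
```

When a workspace is updated manually, nothing re-runs this kind of bookkeeping on your behalf, which is why the flag then has to be set by hand.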
ReadOnlyBuffers
This affects all RS and is a performance optimization (also we needed this to support Android). To quote documentation:
Represents the best way to access read-only data in shaders.
But how it is implemented depends largely on which HW/API it is running on:
`Buffer<>` aka texture buffer in D3D11 on D3D10+ HW
`samplerBuffer` aka texture buffer in GL3 on GL3/D3D10 HW
In short, `ReadOnlyBufferPacked` either behaves as a `TexBufferPacked` or as a `UavBufferPacked` (but read only) depending on HW and API being used.
Some existing code has changed to use ReadOnlyBufferPacked where it originally used TexBufferPacked.
This means in shader code, these buffers should be accessed via the readOnlyFetch() macro.
i.e.
In HLSL the difference is less noticeable, but variables are now declared as `StructuredBuffer<floatN>` instead of `Buffer<floatN>`
Additionally, Matrix functions in Matrix_piece_all.glsl/hlsl have been modified to only work with matrices stored in ReadOnly buffers instead of Tex buffers.
If your custom modifications to Hlms access buffers that are now `ReadOnlyBufferPacked`, you need to change your shaders to use `readOnlyFetch`. This may sound scary but it's just routinely changing code that refuses to compile with the new syntax.
Is Vulkan RenderSystem stable?
Yes. It's not beta. It has been tested on multiple platforms and GPUs (NVIDIA, AMD Mesa, AMD Proprietary on Linux, AMD Proprietary on Windows, Intel Mesa, Intel Windows, Qualcomm, ARM) and it works very well, sometimes even better than GL3+.
But given that it is the newest RenderSystem, please report any issue you encounter.
However please note:
`Samples/2.0/Tutorials/Tutorial_Memory` on how to monitor and tweak memory. Also see `Samples/2.0/Tests/MemoryCleanup`
See known issues.
Known issues
`VulkanVaoManager::cleanupEmptyPools` is not implemented and will raise an exception when called. This may be implemented at a later date.