Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add MSAA option for Webgl2 #10052

Merged
merged 37 commits into from
Feb 18, 2022
Merged

Add MSAA option for Webgl2 #10052

merged 37 commits into from
Feb 18, 2022

Conversation

ebogo1
Copy link
Contributor

@ebogo1 ebogo1 commented Jan 28, 2022

Fixes #9900.

This PR adds a Scene.msaaSamples option to enable multisampling in CesiumJS. This option is only available with Webgl2 contexts, which are accessible with requestWebgl2 in the context options of the Viewer constructor.

After this PR, the FramebufferManager class added in #9966 is able to construct a MultisampleFramebuffer in place of a regular Framebuffer. The MultisampleFramebuffer class is a wrapper around two framebuffers - one with multisampled render targets and another with textures that copy multisampled pixels during the blit. When multisampling is enabled, a FramebufferManager will create both textures and corresponding (multisampled) renderbuffers for each attachment type and pass them to the MultisampleFramebuffer constructor, which splits the attachments between its two framebuffers.

The following framebuffers get multisampled when MSAA is enabled:

  • GlobeDepth._colorFramebuffer: when globe depth is enabled, acts as the scene framebuffer with depth-stencil attachments. Since the other framebuffers in GlobeDepth are used to store copies of the color framebuffer's textures, it is unnecessary to multisample the other framebuffers.
  • InvertClassification._fbo: when invert classification is enabled and translucent, it creates new depth stencil attachments for _fbo that are later sampled in its translucent frag shader. Since the geometry is rendered to the main _fbo, _fboClassified does not need to be multisampled because it only needs access to the depth-stencil texture once it's ready.
  • SceneFramebuffer._colorFramebuffer: acts as the scene framebuffer when globe depth is disabled or when post process is enabled.

To make testing easier, I made a small git patch to automatically add an MSAA dropdown to all Sandcastles and enable Webgl2 by default. There are a couple Sandcastles like Billboards that call viewer.entities.removeAll() in their reset functions, which will remove the entities if the MSAA mode is changed.

Known issues:


To-do:

  • Fix known issue with OIT (1bfb1f1)
  • Fix HDR with MSAA (895c641)
  • Consider adding to MSAA to eye dome lighting, OIT framebuffers?
  • Add Sandcastle example
  • CHANGES.md entry, docs
  • Check performance - at the very least I plan to do a sanity check that the multisample framebuffers and attachments are being created as expected. FPS and memory usage would be good to know as well. (see Add MSAA option for Webgl2 #10052 (comment))

@cesium-concierge
Copy link

Thanks for the pull request @ebogo1!

  • ✔️ Signed CLA found.
  • CHANGES.md was not updated.
    • If this change updates the public API in any way, please add a bullet point to CHANGES.md.

Reviewers, don't forget to make sure that:

  • Cesium Viewer works.
  • Works in 2D/CV.

Source/Scene/Scene.js Outdated Show resolved Hide resolved
@sanjeetsuhag
Copy link
Contributor

@ebogo1 Just double checking - is this issue shown here also caused by the same issue that causes the Globe Translucency Sandcastle to fail?
msaa4

@ebogo1
Copy link
Contributor Author

ebogo1 commented Jan 31, 2022

@sanjeetsuhag It's related - ideally I'd like to only recreate OIT's framebuffer resources when the number of samples has changed, but because OIT's update gets skipped during pick frames, this logic conflicts with disabling multisampling during pick frames. The OIT bugs can be fixed by not disabling multisampling during pick frames and recreating OIT resources only as needed, but picking is wonky when the scene color texture is antialiased - which is why I thought of the current approach initially.

@ggetz
Copy link
Contributor

ggetz commented Jan 31, 2022

Awesome @ebogo1!

We should update the jsdoc comments to include this option, and make it clear that for this option to work, requestWebgl2 also needs to be enabled. I think we should also be clear about the trade off, briefly mentioning the visual quality improvement but noting the performance implications, if noticeable. As an aside, I agree with what @lilleyse mentioned about needing to have a more clear plan about plans for webgl 2 options and features, but that's outside of the scope of this PR.

Also for the top level option name, is it obvious that numberSamples applies to AA, or should we consider making the name more descriptive?

@ebogo1
Copy link
Contributor Author

ebogo1 commented Feb 1, 2022

is it obvious that numberSamples applies to AA, or should we consider making the name more descriptive?

Babylon.js uses pipeline.samples for MSAA (clamping to the hardware limit would be good here too). I don't feel strongly about using Scene.numberSamples, and maybe it's a bit odd to have the private classes like GlobeDepth refer to it as numSamples internally. Happy to change the name to something more descriptive if that'd be better.

Copy link
Contributor Author

@ebogo1 ebogo1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1bfb1f1 adds _pickColorFramebuffer to GlobeDepth for use only during pick frames. Whether or not its used is controlled by GlobeDepth._picking, which is updated at the start of every frame in Scene.js (in updateAndClearFramebuffers). There's also a new getter property in GlobeDepth for colorFramebufferManager, which returns either the picking FramebufferManager or the main one.

The idea behind the change is to avoid recreating framebuffers and textures/renderbuffers whenever Scene switches to and from picking frames. On the Invert Classification Sandcastle, I see frames stay pretty consistent around 60 fps after this change, whereas before they dropped as low as 15 on my MacBook when MSAA was turned on and Picking frames were rendered (i.e. when clicking and dragging to pan the camera). This change should also fix the OIT issues from before.

Source/Scene/GlobeDepth.js Show resolved Hide resolved
@ebogo1
Copy link
Contributor Author

ebogo1 commented Feb 2, 2022

We should update the jsdoc comments to include this option, and make it clear that for this option to work, requestWebgl2 also needs to be enabled.

@ggetz @lilleyse Any ideas about where the option should live in JSDoc? Currently it's only a member variable of Scene - maybe msaaSamples can become a one of the constructor options for Viewer?

Also I think that somewhere there should be a warning (similar to pickPosition/sampleHeights being unsupported when context.depthTexture isn't available) when MSAA isn't supported or when the number of samples exceeds the hardware limit. Should these warnings only pop up in Sandcastles, or should Context/FramebufferManager throw them too?

@lilleyse
Copy link
Contributor

lilleyse commented Feb 2, 2022

maybe msaaSamples can become a one of the constructor options for Viewer?

Yes, let's add it to the Viewer and Scene constructors.

Also I think that somewhere there should be a warning (similar to pickPosition/sampleHeights being unsupported when context.depthTexture isn't available) when MSAA isn't supported or when the number of samples exceeds the hardware limit. Should these warnings only pop up in Sandcastles, or should Context/FramebufferManager throw them too?

Good idea, Scene should have an msaaSupported function and these warnings should pop up in Sandcastles. Context/FramebufferManager should probably default to 1 sample if MSAA is not supported.

@ggetz
Copy link
Contributor

ggetz commented Feb 3, 2022

There should be an option on CesiumWidget as well as Viewer and Scene, right?

@ebogo1
Copy link
Contributor Author

ebogo1 commented Feb 3, 2022

I added JSDoc lines for msaaSamples in Viewer, Scene, and CesiumWidget in 0fa4811:

@param {Number} [options.msaaSamples=1] If provided, this value controls the rate of multisample antialiasing. Typical multisampling rates are 2, 4, and sometimes 8 samples per pixel. Higher sampling rates of MSAA may impact performance in exchange for improved visual quality. This value only applies to WebGL2 contexts that support multisample render targets.

WebGL isn't picky with the number of samples passed to functions like gl.renderbufferStorageMultisample as long as the number doesn't exceed the limit -

Based on this, I didn't find issues on my MacBook by taking the minimum of GL_MAX_SAMPLES and the user-supplied msaaSamples value in the setter. Maybe GL_MAX_SAMPLES is a loose bound if GL_MAX_DEPTH_TEXTURE_SAMPLES is smaller, but I don't think WebGL has access to that number. Passing negative values like -1 to the glRenderbufferStorage functions seems to be equivalent to disabling MSAA.

@ebogo1
Copy link
Contributor Author

ebogo1 commented Feb 3, 2022

I added a Sandcastle example with the BIM tileset, the Statue of Liberty, and the Hot Air Balloon model and MSAA options for 1x, 2x, 4x, and 8x in fd84fd9. I can also add an example with primitives like polylines if that'd be helpful. Should the Sandcastle live in the "Post Processing" category with FXAA?

@ebogo1
Copy link
Contributor Author

ebogo1 commented Feb 4, 2022

@lilleyse After CI passes this should be good for final review. I had it on my to-do list, but I don't think MSAA would work nicely with the OIT and EDL framebuffers for different reasons. OIT's transparent textures don't blend with the opaque scene texture until resolveFramebuffers in Scene.js calls OIT.execute(), which executes the composite command. By this point the accumulation texture must be ready for CompositeOITFS, but if it is blitted earlier the MSAA resolve won't have access to the opaque texture that translucent surfaces should be blended with. Without OIT, translucent surfaces should be AA'd properly. As for EDL, my understanding is that the commands are added to the frameState as Pass.CESIUM_3D_TILE passes, and similar to OIT, the shader requires the framebuffer's texture to be ready when the draw command is executed. I think this would mean performing the blit while Scene is executing the 3D tile passes after the color renderbuffer has been drawn to but before it is used to copy into the scene framebuffer. The details here are a bit hazy to me but it seems like a different use case than the other blits in this PR.

The only change in 496b1c1 was to remove an extra GlobeDepth blit before PickDepth's update. I think I added this blit just in case it was needed but even during pick frames the textures should be ready after executeUpdateDepth is called. I ran through the pickDepth/translucent picking Sandcastles again and didn't run into issues so I think this is the right change.

@ebogo1
Copy link
Contributor Author

ebogo1 commented Feb 4, 2022

One user on the forum reported low frame rates on a 3090 (!) I checked that blitting and multisample framebuffers were being called as expected so I don't think there's an obvious reason for this. I get consistent performance on my MacBook and XPS with an NVIDIA card but I can reproduce low frames on a 3070 PC. After removing the last blit before pickDepth's update I gained around 10fps on the 3070 machine, but I still drop from ~144fps to 40 when in full screen after enabling MSAA.

edit - @sanjeetsuhag was able to reproduce the frame drops on a 3080 and 1650M (XPS 15) running Windows 10 and 11. Interestingly, the same XPS did not drop frames on Linux, which is consistent with my XPS running Ubuntu. Not sure how common it is to blit depth attachments but I wonder if it's possible there's something there only on Windows.

Apps/Sandcastle/gallery/MSAA.html Show resolved Hide resolved
Apps/Sandcastle/gallery/MSAA.html Outdated Show resolved Hide resolved
Source/Scene/Scene.js Outdated Show resolved Hide resolved
@lilleyse
Copy link
Contributor

lilleyse commented Feb 16, 2022

Offline we discussed that it might be possible to avoid the stencil blit in most cases. The only place where it might be needed is executeUpdateDepth when it copies the 3D Tiles depth into the globe depth texture.

Even then, I'm not sure we need to do a stencil blit if we restructure the command a bit. Pasting a comment from slack that goes into more detail about these steps - executeUpdateDepth currently uses the stencil to determine which depth values to copy over to czm_globeDepthTexture, but it might be equally fine to copy over any non-1.0 depth values. I think this will work whether clearGlobeDepth is true or false. It might also simplify the code a lot.

executeCopyDepth - takes a snapshot of the depth after the globe is rendered. Afterwards the depth attachment may be cleared, like when depthTestAgainstTerrain is false so that primitives don't z-fight with terrain.

Textured classification primitives however still need access to the globe depth, even though the depth attachment may have been cleared. The globe depth snapshot is made available to shaders as the built-in variable czm_globeDepthTexture. This texture is a packed RGBA texture instead of a regular depth texture (hence the calls to czm_unpackDepth).

executeCopyDepth is responsible for converting the depth attachment to the packed RGBA texture. copyDepthFramebuffer owns the packed RGBA texture. copyDepthCommand takes the depth texture as input and the copyDepthFramebuffer color texture as output. When MSAA is enabled the depth attachment needs to be blitted to a depth texture before the command can be executed.

executeUpdateDepth - updates the depth snapshot after the 3D Tiles pass. Since the globe depth may have been cleared earlier we can't just naively copy the depth texture into the packed RGBA texture again, otherwise the globe depth values would get lost. Instead we need to overwrite the packed RGBA texture only where the 3D Tiles bit is set. The end result is that czm_globeDepthTexture has globe depth and 3D Tiles depth.

First tempCopyDepthCommand converts the depth texture to a packed RGBA texture owned by tempCopyDepthFramebuffer. Then updateDepthCommand takes the temporary packed RGBA texture as input and the packed RGBA texture owned by copyDepthFramebuffer as output, and copies pixels where the 3D Tiles bit is set. updateDepthFramebuffer doesn't own any of its own resources - its color attachment is copyDepthFramebuffer's color texture and it's depth-stencil attachment is the scene's depth stencil attachment (which is almost always the depth stencil attachment owned by GlobeDepth's colorFramebuffer, but is sometimes the invert classification depth-stencil attachment). The depth-stencil attachment is only used for its stencil buffer.

The reason this is two commands (depth -> temporary packed, temporary packed -> packed (where stencil bit is set)) instead of just one command (depth -> packed (where stencil bit is set)) is because a depth-stencil attachment can't be both an input and output of a shader. The depth portion would be the input (via a uniform) and the stencil portion would be the outpt (because the stencil needs to be attached to the output framebuffer in order to be used, even if it is conceptually input).

For MSAA the depth attachment would need to be blitted to a depth texture prior to executing the tempCopyDepthCommand.

@lilleyse
Copy link
Contributor

Even then, I'm not sure we need to do a stencil blit if we restructure the command a bit. Pasting a comment from slack that goes into more detail about these steps - executeUpdateDepth currently uses the stencil to determine which depth values to copy over to czm_globeDepthTexture, but it might be equally fine to copy over any non-1.0 depth values. I think this will work whether clearGlobeDepth is true or false. It might also simplify the code a lot.

I realized the flaw with this idea. When executeUpdateDepth is called the depth attachment will have the depth plane and 3D Tiles commands rendered into it. The stencil bit was the only way to differentiate between the two. The approach above will overwrite the real terrain depth with the depth plane depth which will cause czm_globeDepthTexture to be incorrect.

@lilleyse
Copy link
Contributor

Silver lining though - the stencil blit only needs to be cleared in executeUpdateDepth if clearGlobeDepth is false. The stencil isn't needed if the function takes the "fast path" and executes _copyDepthCommand.

@ebogo1
Copy link
Contributor Author

ebogo1 commented Feb 17, 2022

@lilleyse If CI passes this should be good for another look. After 06190d3 I am seeing no frame drops on my 3070 machine. The change was to disable the STENCIL_BUFFER_BIT in the blit bitmask by default and to enable it only when executeUpdateDepth needs a resolved stencil texture, which is two places:

  1. When executeUpadeDepth uses InvertClassification's texture, the stencil buffer bit is enabled in executeClassified.
  2. When clearGlobeDepth is enabled, globeDepth.prepareTextures will blit the stencil buffer bit before the normal call to executeUpdateDepth. I verified that this should works as expected by testing this Sandcastle with Front Translucent Back Opaque selected and Depth test against terrain unchecked.

Comment on lines 1600 to 1607
/**
* The sample rate of multisample antialiasing (values greater than 1 enable MSAA).
* @memberof Scene.prototype
* @type {Boolean}
* @readonly
* @default false
*/
msaaSamples: {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The type and default need to be fixed as well

@lilleyse
Copy link
Contributor

It was easier to just push a fix for that 🤷‍♂️

@lilleyse If CI passes this should be good for another look. After 06190d3 I am seeing no frame drops on my 3070 machine. The change was to disable the STENCIL_BUFFER_BIT in the blit bitmask by default and to enable it only when executeUpdateDepth needs a resolved stencil texture, which is two places:

Great!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement MSAA with WebGL 2
5 participants