Fifo present mode occasionally returns VK_TIMEOUT without blocking in wgpu-rs examples on certain systems #1218

Closed
Imberflur opened this issue Feb 14, 2021 · 6 comments
Labels
  • api: vulkan (Issues with Vulkan)
  • area: wsi (Issues with swapchain management or windowing)
  • external: driver-bug (A driver is causing the bug, though we may still want to work around it)
  • help required (We need community help to make this happen.)

Comments

@Imberflur
Contributor

Imberflur commented Feb 14, 2021

Description
wgpu provides a 1s timeout, which ends up being passed to Swapchain::acquire_next_image in ash. However, on my system VK_TIMEOUT is sometimes returned within ~20-200µs when using the Fifo present mode. This occurs every few seconds for me on the wgpu-rs water example. (Note: most of the examples use Mailbox by default, so the present mode needs to be changed to test this.)

Repro steps
Modify the example framework of wgpu-rs to use PresentMode::Fifo and to print errors from swapchain.get_current_frame(). Then run the water example and look for any timeout errors.

Expected vs observed behavior
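Expected: when using Fifo, get_current_frame() blocks until a frame is available, or returns a timeout error after 1 second.
Observed: a timeout error is sometimes returned within ~20-200µs, without blocking.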
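
To tell an early timeout apart from a genuine one when reproducing this, it may help to time the acquire call and compare the elapsed time against the 1s budget. The following is only a rough sketch: `AcquireError`, `timed_acquire`, and `is_spurious_timeout` are hypothetical names declared locally (not wgpu or ash APIs), and the "less than half the budget" threshold is an arbitrary assumption.

```rust
use std::time::{Duration, Instant};

/// Hypothetical stand-in for the error returned by the acquire call.
/// Only the distinction "timeout vs. anything else" is modeled here.
#[derive(Debug, PartialEq)]
pub enum AcquireError {
    Timeout,
    Other,
}

/// Runs `acquire` and returns its result together with the time it took.
pub fn timed_acquire<T>(
    acquire: impl FnOnce() -> Result<T, AcquireError>,
) -> (Result<T, AcquireError>, Duration) {
    let start = Instant::now();
    let result = acquire();
    (result, start.elapsed())
}

/// Returns true if a timeout was reported well before the budget elapsed.
/// Assumption: "well before" means less than half of the budget.
pub fn is_spurious_timeout<T>(
    result: &Result<T, AcquireError>,
    elapsed: Duration,
    budget: Duration,
) -> bool {
    matches!(result, Err(AcquireError::Timeout)) && elapsed < budget / 2
}
```

In a real test, the closure passed to `timed_acquire` would wrap the `swap_chain.get_current_frame()` call and map its error into the stand-in type.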

Extra materials

Platform

OS: Manjaro 20.2.1 Nibia
Kernel: x86_64 Linux 5.10.7-3-MANJARO
DE: Xfce4
CPU: Intel Core i5-4690 @ 4x 3.9GHz
GPU: AMD Radeon HD 7900 Series (TAHITI, DRM 3.40.0, 5.10.7-3-MANJARO, LLVM 11.0.1)
drivers tested: radv, amdvlk, vulkan-amdgpu-pro

wgpu version: the git version from shortly after the 0.7 release

I was also able to test on a newish intel iGPU laptop with the same OS and did not see the issue. I believe @kvark also tested on another device and the issue wasn't present. Thus, so far it seems like a quirk of this particular device/drivers.

@kvark added the external: driver-bug and help required labels Feb 14, 2021
@Imberflur
Contributor Author

Here is the diff for testing to see if it reproduces:

diff --git a/examples/framework.rs b/examples/framework.rs
index 69d7ed4..208916a 100644
--- a/examples/framework.rs
+++ b/examples/framework.rs
@@ -205,7 +205,7 @@ fn start<E: Example>(
         format: adapter.get_swap_chain_preferred_format(&surface),
         width: size.width,
         height: size.height,
-        present_mode: wgpu::PresentMode::Mailbox,
+        present_mode: wgpu::PresentMode::Fifo,
     };
     let mut swap_chain = device.create_swap_chain(&surface, &sc_desc);
 
@@ -280,7 +280,8 @@ fn start<E: Example>(
             event::Event::RedrawRequested(_) => {
                 let frame = match swap_chain.get_current_frame() {
                     Ok(frame) => frame,
-                    Err(_) => {
+                    Err(err) => {
+                        dbg!(err);
                         swap_chain = device.create_swap_chain(&surface, &sc_desc);
                         swap_chain
                             .get_current_frame()

@hannobraun
Contributor

I think I'm running into the same issue. My platform:

  • wgpu version: 0.12.0
  • OS: Arch Linux
  • Kernel: 5.17.1-arch1-1 x86_64
  • Desktop environment: Gnome/Wayland
  • CPU/GPU: AMD Ryzen 7 5700G with Radeon Graphics
  • Driver: amdvlk

Only happens with PresentMode::Fifo. Goes away with Immediate or Mailbox.

hannobraun added a commit to hannobraun/fornjot that referenced this issue Apr 7, 2022
I'm running into this issue:
gfx-rs/wgpu#1218

I'm not aware of any adverse effects of switching to
`PresentMode::Mailbox`, except that it's "not optimal for mobile",
according to the wgpu documentation. Since we don't currently support
any mobile platforms, I think this is fine for now.
@Imberflur
Contributor Author

FYI my current workaround for this is to skip the frame if this error occurs https://gitlab.com/veloren/veloren/-/blob/d9825d1d38c1d36503142a648880632871f52c58/voxygen/src/render/renderer.rs#L1088
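
For readers who do not want to follow the link, the skip-the-frame idea could look roughly like the sketch below. This is not the Veloren code. `SurfaceError` is a local stand-in that only mirrors the variant names of `wgpu::SurfaceError`, so the snippet compiles without wgpu, and `FrameAction` and `on_acquire` are hypothetical names. Treating `Outdated` and `Lost` as "reconfigure" is an assumption of this sketch.

```rust
/// Local stand-in mirroring the variant names of `wgpu::SurfaceError`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SurfaceError {
    Timeout,
    Outdated,
    Lost,
    OutOfMemory,
}

/// What the render loop should do after trying to acquire a frame
/// (hypothetical type, for illustration only).
#[derive(Debug, PartialEq, Eq)]
pub enum FrameAction<T> {
    Render(T),
    SkipFrame,
    Reconfigure,
    Fatal(SurfaceError),
}

/// Maps the result of a frame acquisition to an action for the render loop.
pub fn on_acquire<T>(result: Result<T, SurfaceError>) -> FrameAction<T> {
    match result {
        Ok(frame) => FrameAction::Render(frame),
        // Possibly spurious timeout: drop this frame and retry on the next redraw.
        Err(SurfaceError::Timeout) => FrameAction::SkipFrame,
        // Assumption: these are handled by reconfiguring the surface.
        Err(SurfaceError::Outdated | SurfaceError::Lost) => FrameAction::Reconfigure,
        Err(e @ SurfaceError::OutOfMemory) => FrameAction::Fatal(e),
    }
}
```

The point of separating `SkipFrame` from `Reconfigure` is that a timeout then costs one dropped frame rather than a surface or swap chain rebuild.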

@hannobraun
Contributor

That's good to know, thanks. I just switched to PresentMode::Mailbox, which seems to work fine on all the machines I have access to. But it might be better to specifically ignore the error.

@cwfitzgerald added the area: wsi and api: vulkan labels Jun 5, 2022
bors bot pushed a commit to bevyengine/bevy that referenced this issue Nov 12, 2022
# Objective

- Fix #3606
- Fix #4579
- Fix #3380

## Solution

When running on a Linux machine with some AMD or Intel device, when calling
`surface.get_current_texture()`, ignore `wgpu::SurfaceError::Timeout` errors.


## Alternative

An alternative solution found in the `wgpu` examples is:

```rust
let frame = surface
    .get_current_texture()
    .or_else(|_| {
        render_device.configure_surface(surface, &swap_chain_descriptor);
        surface.get_current_texture()
    })
    .expect("Error reconfiguring surface");
window.swap_chain_texture = Some(TextureView::from(frame));
```

See: <https://github.com/gfx-rs/wgpu/blob/94ce76391b560a66e36df1300bd684321e57511a/wgpu/examples/framework.rs#L362-L370>

Veloren [handles the Timeout error the way this PR proposes to handle it](gfx-rs/wgpu#1218 (comment)).

The reason I went with this PR's solution is that `configure_surface` seems to be quite an expensive operation, and it would run every frame with the wgpu framework solution, despite the fact it works perfectly fine without `configure_surface`.

I know this looks super hacky with the linux-specific line and the AMD check, but my understanding is that the `Timeout` occurrence is specific to a quirk of some AMD drivers on linux, and if otherwise met should be considered a bug.


Co-authored-by: Carter Anderson
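
The platform gate described in that commit message (ignore `Timeout` only on Linux with an AMD or Intel device) might be sketched as below. This is not the bevy implementation. `may_ignore_timeout` and `running_on_linux` are hypothetical helpers, and matching on substrings of the adapter name is an assumption made purely for illustration.

```rust
/// Returns true when the program was compiled for Linux.
pub fn running_on_linux() -> bool {
    cfg!(target_os = "linux")
}

/// Hypothetical predicate: may a `Timeout` from frame acquisition be ignored?
/// Assumption: the vendor can be guessed from the adapter name string.
pub fn may_ignore_timeout(is_linux: bool, adapter_name: &str) -> bool {
    let name = adapter_name.to_ascii_lowercase();
    let affected_vendor =
        name.contains("amd") || name.contains("radeon") || name.contains("intel");
    is_linux && affected_vendor
}
```

A caller would pass `running_on_linux()` and the adapter name, skip the frame when the predicate is true, and treat a `Timeout` as a real error otherwise.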
@ghost

ghost commented Nov 14, 2022

Enabling DPMS reliably causes this timeout to happen with Intel mesa drivers on Linux.

ItsDoot pushed a commit to ItsDoot/bevy that referenced this issue Feb 1, 2023
@cwfitzgerald
Member

I'm not sure if there's anything we can actually do about this in wgpu; this is something the user potentially needs to deal with.
