Pipelined: 3D examples crash with AMDVLK on Wayland #3288

Open
inodentry opened this issue Dec 10, 2021 · 21 comments

Labels
A-Rendering Drawing game state to the screen C-Bug An unexpected or incorrect behavior C-Startup A crash that occurs when first attempting to run a Bevy app O-Linux Specific to the Linux desktop operating system P-Crash A sudden unexpected crash

Comments

@inodentry
Contributor

Bevy version

Current main (commit cf48132e).

Operating system & version

Linux 5.15.4 / Sway Wayland Compositor / AMDVLK 2021.Q3.7

Mesa RADV driver works with both wayland and x11.
AMDVLK driver works with x11, but not wayland.

What you did

cargo run --example 3d_scene_pipelined --features bevy/wayland --release

(or any other pipelined 3D example)

2D seems to work. bevymark_pipelined works correctly.

Old renderer works. The old 3d_scene works correctly.

What you expected to happen

The new 3D examples should not crash in Wayland mode with the AMDVLK driver. :)

What actually happened

Bevy manages to render at least one frame (I don't know how to tell whether it renders more than one before crashing). I can tell because the window flashes briefly.

The window opens, a frame is rendered and displayed, and then the app almost immediately crashes with a panic (and sometimes a segfault, but I think that's irrelevant).

Additional information

2021-12-10T15:01:56.213959Z  INFO bevy_render2::renderer: AdapterInfo { name: "Radeon RX Vega", vendor: 4098, device: 26751, device_type: DiscreteGpu, backend: Vulkan }

thread 'main' panicked at 'Failed to acquire next swap chain texture!: Timeout', pipelined/bevy_render2/src/view/window.rs:159:24
stack backtrace:
   0: rust_begin_unwind
             at /rustc/0e07bcb68b82b54c0c4ec6fe076e9d75b02109cf/library/std/src/panicking.rs:498:5
   1: core::panicking::panic_fmt
             at /rustc/0e07bcb68b82b54c0c4ec6fe076e9d75b02109cf/library/core/src/panicking.rs:107:14
   2: core::result::unwrap_failed
             at /rustc/0e07bcb68b82b54c0c4ec6fe076e9d75b02109cf/library/core/src/result.rs:1613:5
   3: bevy_render2::view::window::prepare_windows
   4: <bevy_ecs::system::function_system::FunctionSystem<In,Out,Param,Marker,F> as bevy_ecs::system::system::System>::run_unsafe
   5: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
   6: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
   7: async_task::raw::RawTask<F,T,S>::run
   8: async_executor::LocalExecutor::try_tick
   9: bevy_tasks::task_pool::TaskPool::scope
  10: <bevy_ecs::schedule::executor_parallel::ParallelExecutor as bevy_ecs::schedule::executor::ParallelSystemExecutor>::run_systems
  11: <bevy_ecs::schedule::stage::SystemStage as bevy_ecs::schedule::stage::Stage>::run
  12: <bevy_render2::RenderPlugin as bevy_app::plugin::Plugin>::build::{{closure}}
  13: bevy_app::app::App::update
  14: bevy_winit::winit_runner_with::{{closure}}
  15: winit::platform_impl::platform::wayland::event_loop::EventLoop<T>::run_return
  16: winit::platform_impl::platform::wayland::event_loop::EventLoop<T>::run
  17: winit::platform_impl::platform::EventLoop<T>::run
  18: winit::event_loop::EventLoop<T>::run
  19: bevy_winit::run
  20: bevy_winit::winit_runner_with
  21: core::ops::function::Fn::call
  22: bevy_app::app::App::run
  23: _3d_scene_pipelined::main

Seems like this happens in the window presentation code and its swapchain management.
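
The backtrace above shows the panic coming from `core::result::unwrap_failed` inside `prepare_windows`, meaning the `Timeout` error from acquiring the next swap chain texture is unwrapped rather than handled. As a rough illustration only, the sketch below shows one way a timeout could be treated as recoverable. `AcquireError`, `FrameAction`, and `acquire_or_skip` are hypothetical names made up for this example. This is not Bevy's or wgpu's actual code, and nothing in this report establishes that retrying would avoid the underlying driver problem.

```rust
/// Hypothetical error type, loosely modelled on the ways a swap chain
/// acquire can fail. This is NOT the real wgpu error type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AcquireError {
    Timeout,
    Outdated,
    Lost,
}

/// Hypothetical result of one attempt to start a frame.
#[derive(Debug, PartialEq, Eq)]
pub enum FrameAction<T> {
    /// A frame was acquired and can be rendered to.
    Render(T),
    /// Every attempt timed out: skip this frame instead of panicking.
    SkipFrame,
    /// The surface is outdated or lost and should be reconfigured.
    Reconfigure,
}

/// Calls `acquire` once, plus up to `max_retries` more times while it keeps
/// timing out. Never panics on `Timeout`.
pub fn acquire_or_skip<T>(
    mut acquire: impl FnMut() -> Result<T, AcquireError>,
    max_retries: u32,
) -> FrameAction<T> {
    for _ in 0..=max_retries {
        match acquire() {
            Ok(frame) => return FrameAction::Render(frame),
            // A timeout is treated as transient: try again.
            Err(AcquireError::Timeout) => continue,
            // These cannot be fixed by retrying the same call.
            Err(AcquireError::Outdated) | Err(AcquireError::Lost) => {
                return FrameAction::Reconfigure;
            }
        }
    }
    FrameAction::SkipFrame
}
```

For example, `acquire_or_skip(|| Err::<u32, _>(AcquireError::Timeout), 2)` yields `FrameAction::SkipFrame` instead of panicking.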

I know that @cart is reworking all of this stuff with the upcoming "actual pipelining" implementation. When that code lands, we should check if this bug still persists. It might turn out to just be fixed. :)

@inodentry inodentry added C-Bug An unexpected or incorrect behavior A-Rendering Drawing game state to the screen P-Crash A sudden unexpected crash O-Linux Specific to the Linux desktop operating system labels Dec 10, 2021
@cart cart added this to the Bevy 0.6 milestone Dec 10, 2021
@heavyrain266

heavyrain266 commented Dec 15, 2021

Hello, have you tried using the Vulkan backend for Sway? (It requires the master branch, AFAIK.) I'm able to run anything in Bevy with my compositor, which uses Vulkan for rendering.

The thing is, a Vulkan program on Wayland requires some additional (client-side) extensions like VK_KHR_wayland_surface, which obviously should be provided by winit.

To run Sway (master) with the Vulkan backend, run WLR_RENDERER=vulkan sway from a TTY.

EDIT: Added instructions for Vulkan backend in Sway.

@alice-i-cecile alice-i-cecile added the C-Startup A crash that occurs when first attempting to run a Bevy app label Dec 21, 2021
@cart
Member

cart commented Dec 22, 2021

Removing from the 0.6 milestone. This is important to investigate and fix, but it is also niche enough that I don't think it should block the 0.6 release.

@cart cart removed this from the Bevy 0.6 milestone Dec 22, 2021
@turboMaCk
Contributor

turboMaCk commented Jan 3, 2022

I'm actually getting this error on many of the examples on the main branch. For example, sprite_flipping crashes with this panic for me.

I'm also using amdgpu on Linux, but with Xorg instead of Wayland. I think this problem is not specific to Wayland, only to AMDVLK. 0.5 runs just fine for me, so this seems to be a regression in the 0.6 rewrite.

My full setup is Linux 5.10.88, nixos-unstable (SMP Wed Dec 22 08:31:00 UTC 2021):

sudo lshw -C video
  *-display                 
       physical id: 0
       bus info: pci@0000:0b:00.0
       version: c1
       width: 64 bits
       clock: 33MHz
       capabilities: pm pciexpress msi bus_master cap_list rom
       configuration: driver=amdgpu latency=0
       resources: irq:105 memory:d0000000-dfffffff memory:e0000000-e01fffff ioport:e000(size=256) memory:fcc00000-fcc7ffff memory:fcc80000-fcc9ffff

@heavyrain266

> I'm actually getting this error on many of the examples on the main branch. For example, sprite_flipping crashes with this panic for me.
>
> I'm also using amdgpu on Linux, but with Xorg instead of Wayland. I think this problem is not specific to Wayland, only to AMDVLK. 0.5 runs just fine for me, so this seems to be a regression in the 0.6 rewrite.

I think it's all related to AMDVLK. Someone on a Discord server observed really low FPS in games with AMDVLK (around 30-40), but after switching to the regular drivers (Mesa or amdgpu) it jumped to 120+ on an RX 6600.

@turboMaCk
Contributor

In my setup I make both Mesa RADV and AMDVLK available, so it's Bevy choosing to use AMDVLK. What I can do is remove AMDVLK completely and see whether that resolves the issue.

@turboMaCk
Contributor

turboMaCk commented Jan 3, 2022

Removing AMDVLK doesn't fix this.

[nix-shell:~/Projects/bevy]$ vulkaninfo | grep driver
	driverVersion     = 8388815 (0x8000cf)
	driverID           = DRIVER_ID_AMD_OPEN_SOURCE
	driverName         = AMD open-source driver
	driverInfo         = 
	driverUUID      = 414d442d-4c49-4e55-582d-445256000000
	driverUUID                        = 414d442d-4c49-4e55-582d-445256000000
	driverID                                             = DRIVER_ID_AMD_OPEN_SOURCE
	driverName                                           = AMD open-source driver
	driverInfo                                           = 
	VK_KHR_driver_properties                    : extension revision 1
	
[marek@nixos-mainframe:~]$ cat /run/opengl-driver/share/vulkan/icd.d/radeon_icd.x86_64.json
{
    "ICD": {
        "api_version": "1.2.195",
        "library_path": "/nix/store/js1b9763648ghaczis17g4qpl06r0s83-mesa-21.3.2-drivers/lib/libvulkan_radeon.so"
    },
    "file_format_version": "1.0.0"
}

[nix-shell:~/Projects/bevy]$ RUST_BACKTRACE=1 cargo run --example sprite_flipping --release
    Finished release [optimized] target(s) in 0.15s
     Running `target/release/examples/sprite_flipping`
2022-01-03T12:53:27.252479Z  INFO winit::platform_impl::platform::x11::window: Guessed window scale factor: 1.6666666666666667    
2022-01-03T12:53:27.309893Z  INFO bevy_render::renderer: AdapterInfo { name: "AMD Radeon RX 5700 XT", vendor: 4098, device: 29471, device_type: DiscreteGpu, backend: Vulkan }
thread 'main' panicked at 'Failed to acquire next swap chain texture!: Timeout', crates/bevy_render/src/view/window.rs:159:24
stack backtrace:
   0: rust_begin_unwind
             at /rustc/78fd0f633faaa5b6dd254fc1456735f63a1b1238/library/std/src/panicking.rs:498:5
   1: core::panicking::panic_fmt
             at /rustc/78fd0f633faaa5b6dd254fc1456735f63a1b1238/library/core/src/panicking.rs:107:14
   2: core::result::unwrap_failed
             at /rustc/78fd0f633faaa5b6dd254fc1456735f63a1b1238/library/core/src/result.rs:1661:5
   3: bevy_render::view::window::prepare_windows
   4: <bevy_ecs::system::function_system::FunctionSystem<In,Out,Param,Marker,F> as bevy_ecs::system::system::System>::run_unsafe
   5: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
   6: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
   7: async_task::raw::RawTask<F,T,S>::run
   8: async_executor::LocalExecutor::try_tick
   9: std::thread::local::LocalKey<T>::with
  10: <bevy_ecs::schedule::executor_parallel::ParallelExecutor as bevy_ecs::schedule::executor::ParallelSystemExecutor>::run_systems
  11: <bevy_ecs::schedule::stage::SystemStage as bevy_ecs::schedule::stage::Stage>::run
  12: <bevy_render::RenderPlugin as bevy_app::plugin::Plugin>::build::{{closure}}
  13: bevy_app::app::App::update
  14: bevy_winit::winit_runner_with::{{closure}}
  15: winit::platform_impl::platform::x11::EventLoop<T>::run_return
  16: winit::platform_impl::platform::x11::EventLoop<T>::run
  17: winit::platform_impl::platform::EventLoop<T>::run
  18: winit::event_loop::EventLoop<T>::run
  19: bevy_winit::run
  20: bevy_winit::winit_runner_with
  21: core::ops::function::Fn::call
  22: bevy_app::app::App::run
  23: sprite_flipping::main
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

I can try switching from amdgpu to amdgpu-pro (the proprietary driver), but it will require some work: I would need to fix that package in NixOS. We don't support it much, especially on the unstable branch, which is what I'm using, since it has little to no advantage over the open-source driver.

@turboMaCk
Contributor

I can confirm that switching to amdgpu-pro fixes the issue. So the issue is specific to amdgpu (the open-source AMD driver). AMDVLK or RADV doesn't make a difference.

Is there anything else I could do to help debug this one?

@turboMaCk
Contributor

@cart I think you should rename this issue. Or if you want me to create another one just let me know. I'm also able to test all combinations of these things in my setup in case you'll need some help testing this.

@zeroeightysix

zeroeightysix commented Jan 9, 2022

> I can confirm that switching to amdgpu-pro fixes the issue. So the issue is specific to amdgpu (the open-source AMD driver). AMDVLK or RADV doesn't make a difference.

Using the amdgpu-pro driver does resolve the issue using the X backend, but the issue persists (on the proprietary driver!) with the wayland feature enabled.

Edit: looks like this was a false negative, as comments below mine might suggest. Issue persists on the amdgpu-pro driver, using the X backend, as well - although its rarity has increased in my tests.

@Deukhoofd

I'm seeing this exact issue on X11, using both amdgpu and amdgpu-pro (run through progl). The application opens a window, runs a single frame, and panics.

2022-01-09T12:38:46.561792Z  INFO winit::platform_impl::platform::x11::window: Guessed window scale factor: 1    
2022-01-09T12:38:46.693942Z  INFO bevy_render::renderer: AdapterInfo { name: "AMD Radeon RX 580 Series", vendor: 4098, device: 26591, device_type: DiscreteGpu, backend: Vulkan }
thread 'main' panicked at 'Failed to acquire next swap chain texture!: Timeout', bevy_render-0.6.0/src/view/window.rs:161:24
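
The `AdapterInfo` lines in this thread print `vendor` and `device` as decimal numbers. `vendor: 4098` is `0x1002`, the PCI vendor ID assigned to AMD, which matches every log posted here. A small helper like the following (the function names are made up for this example, and the vendor table is deliberately tiny) converts the values to the hexadecimal form shown by tools such as `lspci -nn`:

```rust
/// Maps a PCI vendor ID, as printed in decimal by `AdapterInfo`, to a
/// vendor name. Only three well-known IDs are listed here.
pub fn pci_vendor_name(vendor: u32) -> &'static str {
    match vendor {
        0x1002 => "AMD",
        0x10DE => "NVIDIA",
        0x8086 => "Intel",
        _ => "unknown",
    }
}

/// Formats the decimal `vendor` and `device` values from a log line
/// as four-digit hexadecimal IDs.
pub fn describe_adapter(vendor: u32, device: u32) -> String {
    format!(
        "{} (vendor 0x{:04X}, device 0x{:04X})",
        pci_vendor_name(vendor),
        vendor,
        device
    )
}
```

`describe_adapter(4098, 26591)` gives `AMD (vendor 0x1002, device 0x67DF)` for the RX 580 log above.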

@heavyrain266

Looks like all of these issues are caused either by the AMD drivers directly or by bugs in wgpu's Vulkan backend: on Nvidia (RTX 3060) it works fine, but when I swap in an AMD card (RX 570) it panics on both Wayland and X11. It doesn't matter if I use amdgpu-pro or amdvlk.

@turboMaCk
Contributor

In 6 days of using amdgpu-pro 21.30 I didn't get a single crash.

Which version did you use? Are you sure you've managed to replace amdgpu with amdgpu-pro? Did you reboot the system after switching drivers (it's required)?

@heavyrain266

> In 6 days of using amdgpu-pro 21.30 I didn't get a single crash.
>
> Which version did you use? Are you sure you've managed to replace amdgpu with amdgpu-pro? Did you reboot the system after switching drivers (it's required)?

Yep, I did everything and also checked everything, and I'm using amdgpu-pro 21.30 as well.

@turboMaCk
Contributor

Hmm, then I have no idea what is different. Anyway, I agree this is most likely a bug in a driver or (perhaps more likely) in wgpu/gfx.

For full disclosure, I have an RX 5700 XT and run the 5.10.88 kernel.

> doesn't matter if I use amdgpu-pro or amdvlk

Just be aware that amdgpu-pro and AMDVLK are two different things. AMDVLK is a Vulkan driver. You can use either that or Mesa RADV, and this choice doesn't matter for this issue. amdgpu vs. amdgpu-pro (the hardware drivers) does seem to matter in my testing. They both share the same kernel-level component, so I don't think the kernel version matters in this case.

This is what I use to confirm what kernel space and user space driver I run at the moment:

$ lspci -k | grep -EA3 'VGA'
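
The `lspci -k` check shows the kernel driver, while the user-space Vulkan driver is what the `vulkaninfo | grep driver` output posted earlier in this thread reports. That output repeats each field several times, so a small filter can help when comparing setups. The sketch below assumes `key = value` lines as in that paste, and `distinct_values` is a name made up for this example. One thing that may be worth double-checking with it: in the Vulkan `VkDriverId` enumeration, `VK_DRIVER_ID_AMD_OPEN_SOURCE` is the ID used by AMDVLK, while Mesa RADV has its own ID (`VK_DRIVER_ID_MESA_RADV`).

```rust
use std::collections::BTreeSet;

/// Collects the distinct, non-empty values reported for `key` in
/// `vulkaninfo`-style "key = value" lines. Lines without '=' are ignored.
pub fn distinct_values(output: &str, key: &str) -> BTreeSet<String> {
    output
        .lines()
        .filter_map(|line| {
            // Split at the first '=' only; the value may contain more text.
            let (k, v) = line.split_once('=')?;
            if k.trim() == key {
                Some(v.trim().to_string())
            } else {
                None
            }
        })
        .filter(|value| !value.is_empty())
        .collect()
}
```

Calling `distinct_values(text, "driverName")` on the paste from this thread returns a single entry, `AMD open-source driver`.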

@Deukhoofd

Deukhoofd commented Jan 11, 2022

I appear to be able to run the renderer if I tax it heavily enough. By spawning in 10_000 cubes, the renderer starts running. Once it gets sufficiently low load however (through a system that randomly deletes cubes), it immediately crashes again.

Edit: After actually using my mind after the previous comment, I disabled vsync on the window, and tried it again, this time with no crashes.

@turboMaCk
Contributor

I can no longer reproduce the issue with the open-source driver on the current main branch.

@lebocra

lebocra commented Mar 7, 2022

I had this problem too, but only on Vulkan. When I switched WGPU_BACKEND to Gl it was fixed. I still don't know what could cause the problem on AMDVLK. This happened on 0.6.1.
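
For reference, `WGPU_BACKEND` is an environment variable, so the backend can be switched per run without rebuilding. The sketch below only illustrates how such a value might be interpreted. `Backend` and `backend_from_value` are hypothetical, the accepted spellings are assumptions, and this is not wgpu's actual parsing code.

```rust
/// Hypothetical backend choice, limited to the two backends discussed here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    Vulkan,
    Gl,
}

/// Interprets an optional backend name case-insensitively.
/// Returns `None` when the value is absent or not recognised.
/// The accepted spellings are assumptions made for this example.
pub fn backend_from_value(value: Option<&str>) -> Option<Backend> {
    match value?.trim().to_ascii_lowercase().as_str() {
        "vulkan" | "vk" => Some(Backend::Vulkan),
        "gl" | "gles" | "opengl" => Some(Backend::Gl),
        _ => None,
    }
}
```

`backend_from_value(std::env::var("WGPU_BACKEND").ok().as_deref())` would read the value set for the current process.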

@heavyrain266

> I had this problem too, but only on Vulkan. When I switched WGPU_BACKEND to Gl it was fixed. I still don't know what could cause the problem on AMDVLK. This happened on 0.6.1.

Pretty sure the GL backend defaults to LLVMpipe, which is a CPU-based implementation of OpenGL (at least for now).

@inodentry
Contributor Author

inodentry commented Apr 4, 2022

YES, I can confirm that this issue still occurs with AMD's official drivers, with the newest bevy main.

Tested with AMDGPU-PRO proprietary driver (version 21.50.2.1384496) now, which is mostly the same driver as the official open source AMDVLK ... I don't have access to AMDVLK anymore from my current distro.

I still get the issue as described originally. Panic: "Failed to acquire next swapchain texture".

The unofficial open-source RADV works.

@jivvy

jivvy commented Aug 4, 2022

Can confirm this occurs on Bevy 0.8 [07d576]
Arch Linux 5.18.16-arch1-1
sway/Wayland, also tested GNOME
mesa 22.1.4-1

Using the following command:
cargo run --release --example sprite

I don't know what is useful, or what is not, so I'll keep including everything:

With Vulkan backend:
2022-08-04T12:38:19.032983Z  INFO bevy_render::renderer: AdapterInfo { name: "AMD Radeon RX 6700 XT", vendor: 4098, device: 29663, device_type: DiscreteGpu, backend: Vulkan }
thread 'main' panicked at 'Failed to acquire next swap chain texture!: Timeout', crates/bevy_render/src/view/window.rs:190:24
Vulkan backend with `RUST_BACKTRACE=1`:
thread 'main' panicked at 'Failed to acquire next swap chain texture!: Timeout', crates/bevy_render/src/view/window.rs:190:24
stack backtrace:
   0: rust_begin_unwind
             at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/std/src/panicking.rs:584:5
   1: core::panicking::panic_fmt
             at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/panicking.rs:142:14
   2: core::result::unwrap_failed
             at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/result.rs:1785:5
   3: bevy_render::view::window::prepare_windows
   4: <bevy_ecs::system::function_system::FunctionSystem<In,Out,Param,Marker,F> as bevy_ecs::system::system::System>::run_unsafe
   5: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
   6: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
   7: async_task::raw::RawTask<F,T,S>::run
   8: async_executor::Executor::try_tick
   9: std::thread::local::LocalKey<T>::with
  10: <bevy_ecs::schedule::executor_parallel::ParallelExecutor as bevy_ecs::schedule::executor::ParallelSystemExecutor>::run_systems
  11: <bevy_ecs::schedule::stage::SystemStage as bevy_ecs::schedule::stage::Stage>::run
  12: <bevy_render::RenderPlugin as bevy_app::plugin::Plugin>::build::{{closure}}
  13: bevy_app::app::App::update
  14: bevy_winit::winit_runner_with::{{closure}}
  15: winit::platform_impl::platform::x11::EventLoop<T>::run_return
  16: winit::platform_impl::platform::x11::EventLoop<T>::run
  17: winit::platform_impl::platform::EventLoop<T>::run
  18: winit::event_loop::EventLoop<T>::run
  19: bevy_winit::run
  20: bevy_winit::winit_runner_with
  21: core::ops::function::Fn::call
  22: bevy_app::app::App::run
  23: sprite::main
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
With OpenGL backend through `WGPU_BACKEND=gl`:
2022-08-04T12:44:02.057143Z  INFO bevy_render::renderer: AdapterInfo { name: "AMD Radeon RX 6700 XT (navy_flounder, LLVM 14.0.6, DRM 3.46, 5.18.16-arch1-1)", vendor: 4098, device: 0, device_type: Other, backend: Gl }
2022-08-04T12:44:02.131961Z ERROR wgpu_core::device: surface configuration failed: incompatible window kind
thread 'main' panicked at 'Error in Surface::configure: invalid surface', /home/jivvy/.cargo/registry/src/github.com-1ecc6299db9ec823/wgpu-0.13.1/src/backend/direct.rs:281:9
I also tested it by running GNOME on Xorg, but the same crash occurred:
2022-08-04T13:11:49.927280Z  INFO bevy_render::renderer: AdapterInfo { name: "AMD Radeon RX 6700 XT", vendor: 4098, device: 29663, device_type: DiscreteGpu, backend: Vulkan }
thread 'main' panicked at 'Failed to acquire next swap chain texture!: Timeout', crates/bevy_render/src/view/window.rs:190:24

@ghost

ghost commented Jan 6, 2023

This should probably be closed, AMD makes horrible quality proprietary drivers for Linux.

9 participants