Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

noop system with commands argument added to Render schedule exhibits racing behavior #12498

Closed
MeoMix opened this issue Mar 15, 2024 · 6 comments
Labels
A-ECS Entities, components, systems, and events A-Rendering Drawing game state to the screen C-Bug An unexpected or incorrect behavior S-Needs-Investigation This issue requires detective work to figure out what's going wrong

Comments

@MeoMix
Copy link

MeoMix commented Mar 15, 2024

Bevy version

0.13

[Optional] Relevant system information

AdapterInfo { name: "D3D12 (NVIDIA GeForce GTX 1080 Ti)", vendor: 0, device: 0, device_type: Other, driver: "", driver_info: "", backend: Gl }
1.78.0-nightly (7065f0ef4 2024-03-12)

What you did

I updated to Bevy 0.13 from Bevy 0.12. I attempted to render a tilemap using bevy-ecs-tilemap in a WASM/singlethreaded environment.

What went wrong

Nothing rendered. The issue occurs approximately 1 in every 5 loads of the app.

Additional information

MRE: https://github.com/MeoMix/bevy_bug_example/blob/master/src/main.rs

// This code should draw a white square in the center of the window if it runs properly.
// One in every 5-10 runs no white square will render.
// If the `multithreaded` feature is enabled in bevy's Cargo.toml then the square "always" renders (if it's a timing issue it becomes sufficiently difficult to surface with multithreading)
use bevy::prelude::*;
use bevy::render::{Render, RenderApp, RenderSet};
use bevy_ecs_tilemap::prelude::*;

fn main() {
    App::new()
    .add_plugins((DefaultPlugins, TilemapPlugin))
    .add_systems(Startup, (spawn_camera, spawn_tilemap).chain())
    .add_plugins(ExamplePlugin)
    .run();
}

fn spawn_camera(mut commands: Commands) {
    commands.spawn(Camera2dBundle::default());
}

pub fn spawn_tilemap(mut commands: Commands) {
    let tilemap_entity = commands.spawn_empty().id();
    let map_size =  TilemapSize { x: 144, y: 144 };
    let mut tile_storage = TileStorage::empty(map_size);

    let tile_pos = TilePos { x: 0, y: 0 };

    let tile_entity = commands.spawn(TileBundle {
        position: tile_pos,
        tilemap_id: TilemapId(tilemap_entity),
        ..default()
    }).id();

    tile_storage.set(&tile_pos, tile_entity);

    commands.entity(tilemap_entity).insert(
        TilemapBundle {
            size: map_size,
            storage: tile_storage,
            tile_size: TilemapTileSize { x: 128.0, y: 128.0 },
            ..default()
        },
    );
}

pub struct ExamplePlugin;

impl Plugin for ExamplePlugin {
    fn build(&self, app: &mut App) {
    }

    fn finish(&self, app: &mut App) {
        if let Ok(render_app) = app.get_sub_app_mut(RenderApp) {
            render_app.add_systems(Render, test_system.in_set(RenderSet::Queue));
        }
    }
}

// IMPORTANT: need to have commands as an argument here or bug doesn't reproduce!
pub fn test_system(mut commands: Commands) {}

The crucial thing to note about the MRE is that test_system accepts mut commands: Commands but is otherwise empty. test_system is added to RenderApp in the set RenderSet::Queue.

The issue does not reproduce if I remove the unused argument.

I see that bevy-ecs-tilemap also inserts a system into the set RenderSet::Queue here: https://github.com/StarArawn/bevy_ecs_tilemap/blob/main/src/render/material.rs#L130

My assumption is that this bug relates to #9822 but only because I have awareness of this change and lack awareness of all changes made in Bevy 0.13.

Correct Output:
image

Incorrect Output:

image

@MeoMix MeoMix added C-Bug An unexpected or incorrect behavior S-Needs-Triage This issue needs to be labelled labels Mar 15, 2024
@MeoMix
Copy link
Author

MeoMix commented Mar 15, 2024

Perhaps this isn't actually a bug in Bevy and is just considered a breaking change. I'll explore configuring auto_insert_sync_points to mitigate the issue. It seems scary though because the need only arises when multiple, independent plugins introduce systems dependent on commands to the same schedule.

Humm.. if I do:

  render_app.edit_schedule(Render, |schedule| {
      schedule.set_build_settings(ScheduleBuildSettings {
          auto_insert_apply_deferred: false,
          ..default()
      });
  });

then stuff panics and errors out:

2024-03-15T17:44:00.628664Z ERROR present_frames: log: No work has been submitted for this frame    
thread 'main' panicked at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/bevy_ecs-0.13.0/src/system/commands/mod.rs:1075:13:
error[B0003]: Could not insert a bundle (of type `bevy_render::view::ViewTarget`) for entity 1v1 because it doesn't exist in this World.
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
2024-03-15T17:44:00.730635Z  WARN bevy_ecs::system::commands::command_queue: CommandQueue has un-applied commands being dropped.
2024-03-15T17:44:00.730694Z  WARN bevy_ecs::system::commands::command_queue: CommandQueue has un-applied commands being dropped.
2024-03-15T17:44:00.730778Z  WARN bevy_ecs::system::commands::command_queue: CommandQueue has un-applied commands being dropped.
2024-03-15T17:44:00.730816Z  WARN bevy_ecs::system::commands::command_queue: CommandQueue has un-applied commands being dropped.
2024-03-15T17:44:00.730824Z  WARN bevy_ecs::system::commands::command_queue: CommandQueue has un-applied commands being dropped.
2024-03-15T17:44:00.730830Z  WARN bevy_ecs::system::commands::command_queue: CommandQueue has un-applied commands being dropped.

..

Nah, as I think about it more, I do think something is wrong here. I don't think the automatic apply_deferred logic should ever result in racing-like behavior. The whole point is to only apply it in scenarios where the user has conveyed a fixed ordering to their systems. Here there's no .chain() or ordering - just adding systems to a set. The order in which they run is up in the air and, depending on that order, the sync point stuff breaks / doesn't break expected behavior.

@djeedai
Copy link
Contributor

djeedai commented Mar 15, 2024

  1. I agree that adding a noop system shouldn't change anything
  2. I don't know that Bevy is smart enough to detect the system is noop, or if seeing Commands in arguments is enough to insert a sync point even if the command queue is untouched/empty at the end of the system run

I'd add as a side note that we recently found a similar bug in 🎆 Hanabi due to weak ordering (under-constrained systems) and it's entirely possible seeing the symptoms that bevy-ecs-tilemap has a bug which 0.13 now exposes (the 🎆 Hanabi bug predates 0.13 I think, but changes in 0.13 made it prominent, once every few runs).

@djeedai djeedai added A-Rendering Drawing game state to the screen A-ECS Entities, components, systems, and events S-Needs-Investigation This issue requires detective work to figure out what's going wrong and removed S-Needs-Triage This issue needs to be labelled labels Mar 15, 2024
@MeoMix
Copy link
Author

MeoMix commented Mar 15, 2024

  1. I agree that adding a noop system shouldn't change anything
  2. I don't know that Bevy is smart enough to detect the system is noop, or if seeing Commands in arguments is enough to insert a sync point even if the command queue is untouched/empty at the end of the system run

I'd add as a side note that we recently found a similar bug in 🎆 Hanabi due to weak ordering (under-constrained systems) and it's entirely possible seeing the symptoms that bevy-ecs-tilemap has a bug which 0.13 now exposes (the 🎆 Hanabi bug predates 0.13 I think, but changes in 0.13 made it prominent, once every few runs).

I'm not expecting Bevy to be smart enough to detect the noop system. I think it's fine to assume that the signature defined in the system's arguments is indicative of the systems side-effects.

I just don't think there is an intent to apply command buffers automatically in an unordered set of systems. If test_system were added to a schedule, and made use of .before, .after, or .chain to fix its ordering relative to other systems in that schedule, then I would expect to see behavior similar to what is occurring as that's the intent of #9822

In the MRE's code, test_system has been added to a set and that set contains other systems, but the ordering of those systems is not fixed. In this scenario, I expect no change in behavior from bevy 0.12.

It's possible that my understanding here is wrong though. If you could share the Hanabi weak ordering code that might be illuminating to me.

Also, I noted this PR exists - #11669 which makes me wonder if there's more at play here with the subapp's Render schedule.

EDIT: I might be incorrect about my belief that the set isn't ordered. One sec.

  schedule.configure_sets(
      (
          ExtractCommands,
          ManageViews,
          Queue,
          PhaseSort,
          Prepare,
          Render,
          Cleanup,
      )
          .chain(),
  );

So Queue has its ordering chained relative to other sets, but the systems being added to that set (test_system and queue_material_tilemap_meshes aren't ordered relative to each other)

... so I guess apply_deferred doesn't occur automatically between the two systems, but does occur after the set itself because a system within the set introduced a mutable command.

Okayyy so this is a bug with bevy-ecs-tilemap and not bevy. I'll keep this open for a little while longer just to mull it over.

@MeoMix MeoMix closed this as completed Mar 16, 2024
@MeoMix
Copy link
Author

MeoMix commented Mar 16, 2024

Am confident this isn't an issue with Bevy at this point

@djeedai
Copy link
Contributor

djeedai commented Mar 17, 2024

Hanabi had a system in RenderSet::Prepare, and another in RenderSet::Queue which depended on the first one, but without any explicit ordering. At the time of writing (Bevy 0.6 or so) those sets where strictly ordered, with Prepare first and Queue last. Today in Bevy 0.13, not only are sets not ordered since stageless landed, as I understand Queue is now typically for systems running before Prepare (the order was swapped). Adding an explicit system ordering .after() solved the issue.

@djeedai
Copy link
Contributor

djeedai commented Mar 17, 2024

In the MRE's code, test_system has been added to a set and that set contains other systems, but the ordering of those systems is not fixed. In this scenario, I expect no change in behavior from bevy 0.12.

Agreed.

So Queue has its ordering chained relative to other sets, but the systems being added to that set (test_system and queue_material_tilemap_meshes aren't ordered relative to each other)

I don't know what sets being ordered means or does, but yes, systems in those sets are not ordered. That was the Hanabi bug, and that behavior was confirmed with the rendering team.

So yes this looks like the same kind of bug in bevy-ecs-tilemap than in Hanabi.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-ECS Entities, components, systems, and events A-Rendering Drawing game state to the screen C-Bug An unexpected or incorrect behavior S-Needs-Investigation This issue requires detective work to figure out what's going wrong
Projects
None yet
Development

No branches or pull requests

2 participants