Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use string interning to avoid boxing labels #7760

Closed
wants to merge 2 commits into from

Conversation

maniwani
Copy link
Contributor

@maniwani maniwani commented Feb 20, 2023

Objective

Box<dyn MyLabelTrait> labels aren't too great. Schedules could build much faster if we didn't allocate from the heap for every instance of every label. (Probably. I need to benchmark. TODO.)

I believe an improved label type would have these properties:

  1. Implement Copy.
  2. Produce a unique ID for each value (i.e. Enum::VariantA and Enum::VariantB should be different labels).
  3. Inherently print the same Debug string as the original label.

In the past, we switched away from Box<dyn MyLabelTrait> but had to revert in 0.10 to have OnEnter(state) and OnExit(state) be distinct labels (3), which the label implementation at the time (#4957) did not support.

This is a simpler, "one-size-fits-all" approach than #5715. I think #7762 might be the preferable alternative though. It's very similar but has some internal differences, including special-casing in its macros for unit structs and fieldless enums.

Solution

  • Intern (allocate and leak) the Debug string of every label just once (on first use).
  • struct MyLabelTraitId(&'static str) just wraps the reference to the interned string.
  • Implement Debug for MyLabelTraitId to print the string.
  • Use a (TypeId, hash) tuple for uniqueness instead of TypeId so that different values become different labels.

@maniwani maniwani added A-ECS Entities, components, systems, and events C-Performance A change motivated by improving speed, memory usage or compile times C-Usability A targeted quality-of-life change that makes Bevy easier to use labels Feb 20, 2023
@github-actions
Copy link
Contributor

Example alien_cake_addict failed to run, please try running it locally and check the result.

@github-actions
Copy link
Contributor

Example alien_cake_addict failed to run, please try running it locally and check the result.

1 similar comment
@github-actions
Copy link
Contributor

Example alien_cake_addict failed to run, please try running it locally and check the result.

@geieredgar
Copy link
Contributor

geieredgar commented Feb 20, 2023

Hi, I have taken some ideas of this PR and #5715 and put them together and expanded them in #7762. With that, no allocations are required in many common use cases. One possible issue i see with this approach here is that if two different sets produce the same hash value for whatever reason, we cannot differentiate between them without also taking the string into account. And that string could in theory also be made equal for different sets, in which case this would also not be sufficient.

@alice-i-cecile alice-i-cecile added the X-Controversial There is active debate or serious implications around merging this PR label Feb 20, 2023
@github-actions
Copy link
Contributor

Example alien_cake_addict failed to run, please try running it locally and check the result.

@maniwani maniwani changed the title Simple labels that don't allocate (much) Use interning to avoid boxing labels Jul 19, 2023
@maniwani maniwani marked this pull request as ready for review July 19, 2023 21:07
Copy link
Member

@JoJoJet JoJoJet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After a quick scan, this definitely looks promising as a "best of both worlds" approach.

Comment on lines 82 to 88
/// Type-elided struct whose methods return the same values as its original [`SystemSet`].
#[derive(Clone, Copy, Eq, PartialEq, Hash)]

pub struct SystemSetUntyped {
id: SystemSetId,
kind: SystemSetKind,
}
Copy link
Member

@JoJoJet JoJoJet Jul 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the presence of SystemSetId and SystemSetUntyped is a little confusing -- could these types be merged? (Of course, doing this would probably mean you can't use define_label! for SystemSet).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SystemSetId is just the ID. To replace the boxing in schedule construction, we also need to remember the distinction between normal system sets, system type sets, and anonymous system sets.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, but is it necessary to have SystemSetId as a separate public type? Can we just use SystemSetUntyped for any user-facing APIs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, yeah, we can probably keep SystemSetId internal.

crates/bevy_ecs/src/schedule/set.rs Show resolved Hide resolved
crates/bevy_ecs/src/world/mod.rs Outdated Show resolved Hide resolved
T: Debug + DynHash + ?Sized,
{
let key = UniqueValue::of(value);
let mut map = INTERNER.get_or_init(default).write().unwrap();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like this to have a more descriptive error message instead of just using unwrap.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any suggestions? Is the standard message about the RwLock being poisoned not descriptive enough?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC, .unwrap() uses the Debug implementation of the error type, so the error message will just be PoisonError { ... }.

Maybe it would be better to bypass poisoning entirely by using .unwrap_or_else(PoisonError::into_inner)? If a panic occurs while the lock is being held, I believe it would still leave the interner in a valid state.

/// subsequent calls with the same `value` will return the same reference.
pub fn intern_debug_string<T>(value: &T) -> &'static str
where
T: Debug + DynHash + ?Sized,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to use Hash instead of DynHash?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure. We can't use Hash if the label traits still need to be object-safe, but otherwise, yeah.

Comment on lines +72 to +74
/// This is expected to have sufficient entropy to be different for each value.
#[derive(Clone, Copy, Eq, PartialEq, Hash)]
struct UniqueValue(TypeId, u64);
Copy link
Member

@JoJoJet JoJoJet Jul 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My number one concern with this approach is whether or not we can rely on this being a truly unique value. What would the error case look like if an unlucky user saw two of their labels. Can we estimate the probability of this occurring?

Copy link
Contributor Author

@maniwani maniwani Jul 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theory, yes we can estimate it. I don't know the exact math though. It depends on the hash functions used.

I think if we can assume all hashes are equally likely, then we can approximate the number of different things n we'd need to hash to have p probability of a collision (where p <= 0.5).

n ≈ sqrt(2 * 2^(bits in hash) * p)

So if a hash is 64 bits and all hashes are equally likely, according to the table you'd need to hash over 180 million things to have about a one-in-a-thousand chance of seeing a collision.

p = 0.001 ≈ 2^-10
n ≈ sqrt(2^1 * 2^64 * 2^-10) = sqrt(2^(1 + 64 - 9)) = sqrt(2^55) = 2^27.5 = 189,812,531

So it's extremely unlikely.

Rust recently extended TypeId to 128-bits to shrink the already astronomically small odds of collisions because there were a couple collisions reported. I think we basically have to assume TypeId is sound anyway (and that collision is impossible), So we can worry about value hash collisions, but I don't know if that concern can be justified with a mathematical argument.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#7762 might be better about this, idk.

crates/bevy_macro_utils/src/lib.rs Outdated Show resolved Hide resolved
crates/bevy_ecs/src/schedule/set.rs Outdated Show resolved Hide resolved
@maniwani
Copy link
Contributor Author

maniwani commented Jul 19, 2023

If reviewing this, please also look at #7762. It's a very similar technique but also does more to avoid allocating at all in the common case where the label is a unit struct or fieldless enum.

@maniwani maniwani changed the title Use interning to avoid boxing labels Use string interning to avoid boxing labels Jul 19, 2023
@JoJoJet
Copy link
Member

JoJoJet commented Jul 20, 2023

See also #5410, which improves the benchmarks relevant to labels.

T: Debug + DynHash + ?Sized,
{
let key = UniqueValue::of(value);
let mut map = INTERNER.get_or_init(default).write().unwrap();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC, .unwrap() uses the Debug implementation of the error type, so the error message will just be PoisonError { ... }.

Maybe it would be better to bypass poisoning entirely by using .unwrap_or_else(PoisonError::into_inner)? If a panic occurs while the lock is being held, I believe it would still leave the interner in a valid state.

Comment on lines +11 to +12
type Interner = RwLock<HashMap<UniqueValue, &'static str>>;
static INTERNER: OnceLock<Interner> = OnceLock::new();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can tell, you only ever call .write() on this, so it would be better to just use a mutex here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think trading the RwLock for a Mutex could end up causing some weird behavior in hypothetical situations where systems or sub-apps in different threads try to construct or run schedules at the same time.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It isn't currently being used like a RwLock, though. Since you only call .write(), this is basically just a mutex with more overhead.

Copy link
Contributor Author

@maniwani maniwani Jul 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I mean is, doesn't that prevent different threads from reading the interned strings at the same time? How much more overhead does a RwLock have?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A RwLock only allows multiple threads to read it at once if you use .read(). Since you're only using .write(), it behaves like a mutex and only one thread can access the interner at once.

In order to allow multiple threads to read it at once you'd need to refactor this to only call .write() when a string is not already interned.

if let Some(str) = INTERNER.read().unwrap().get(key) {
    *str
} else {
    let str: &'static str = // format and leak the value
    INTERNER.write().unwrap().insert(key, str);
    str
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see. It went over my head what you meant by "not using it like a RwLock".

@alice-i-cecile alice-i-cecile added this to the 0.12 milestone Sep 1, 2023
@alice-i-cecile alice-i-cecile modified the milestones: 0.12, 0.13 Sep 29, 2023
@cart
Copy link
Member

cart commented Oct 25, 2023

Closing in favor of #7762

@cart cart closed this Oct 25, 2023
github-merge-queue bot pushed a commit that referenced this pull request Oct 25, 2023
# Objective

First of all, this PR took heavy inspiration from #7760 and #5715. It
intends to also fix #5569, but with a slightly different approach.


This also fixes #9335 by reexporting `DynEq`.

## Solution

The advantage of this API is that we can intern a value without
allocating for zero-sized-types and for enum variants that have no
fields. This PR does this automatically in the `SystemSet` and
`ScheduleLabel` derive macros for unit structs and fieldless enum
variants. So this should cover many internal and external use cases of
`SystemSet` and `ScheduleLabel`. In these optimal use cases, no memory
will be allocated.

- The interning returns a `Interned<dyn SystemSet>`, which is just a
wrapper around a `&'static dyn SystemSet`.
- `Hash` and `Eq` are implemented in terms of the pointer value of the
reference, similar to my first approach of anonymous system sets in
#7676.
- Therefore, `Interned<T>` does not implement `Borrow<T>`, only `Deref`.
- The debug output of `Interned<T>` is the same as the interned value.

Edit: 
- `AppLabel` is now also interned and the old
`derive_label`/`define_label` macros were replaced with the new
interning implementation.
- Anonymous set ids are reused for different `Schedule`s, reducing the
amount of leaked memory.

### Pros
- `InternedSystemSet` and `InternedScheduleLabel` behave very similar to
the current `BoxedSystemSet` and `BoxedScheduleLabel`, but can be copied
without an allocation.
- Many use cases don't allocate at all.
- Very fast lookups and comparisons when using `InternedSystemSet` and
`InternedScheduleLabel`.
- The `intern` module might be usable in other areas.
- `Interned{ScheduleLabel, SystemSet, AppLabel}` does implement
`{ScheduleLabel, SystemSet, AppLabel}`, increasing ergonomics.

### Cons
- Implementors of `SystemSet` and `ScheduleLabel` still need to
implement `Hash` and `Eq` (and `Clone`) for it to work.

## Changelog

### Added

- Added `intern` module to `bevy_utils`.
- Added reexports of `DynEq` to `bevy_ecs` and `bevy_app`.

### Changed

- Replaced `BoxedSystemSet` and `BoxedScheduleLabel` with
`InternedSystemSet` and `InternedScheduleLabel`.
- Replaced `impl AsRef<dyn ScheduleLabel>` with `impl ScheduleLabel`.
- Replaced `AppLabelId` with `InternedAppLabel`.
- Changed `AppLabel` to use `Debug` for error messages.
- Changed `AppLabel` to use interning.
- Changed `define_label`/`derive_label` to use interning. 
- Replaced `define_boxed_label`/`derive_boxed_label` with
`define_label`/`derive_label`.
- Changed anonymous set ids to be only unique inside a schedule, not
globally.
- Made interned label types implement their label trait. 

### Removed

- Removed `define_boxed_label` and `derive_boxed_label`. 

## Migration guide

- Replace `BoxedScheduleLabel` and `Box<dyn ScheduleLabel>` with
`InternedScheduleLabel` or `Interned<dyn ScheduleLabel>`.
- Replace `BoxedSystemSet` and `Box<dyn SystemSet>` with
`InternedSystemSet` or `Interned<dyn SystemSet>`.
- Replace `AppLabelId` with `InternedAppLabel` or `Interned<dyn
AppLabel>`.
- Types manually implementing `ScheduleLabel`, `AppLabel` or `SystemSet`
need to implement:
  - `dyn_hash` directly instead of implementing `DynHash`
  - `as_dyn_eq`
- Pass labels to `World::try_schedule_scope`, `World::schedule_scope`,
`World::try_run_schedule`. `World::run_schedule`, `Schedules::remove`,
`Schedules::remove_entry`, `Schedules::contains`, `Schedules::get` and
`Schedules::get_mut` by value instead of by reference.

---------

Co-authored-by: Joseph <[email protected]>
Co-authored-by: Carter Anderson <[email protected]>
ameknite pushed a commit to ameknite/bevy that referenced this pull request Nov 6, 2023
# Objective

First of all, this PR took heavy inspiration from bevyengine#7760 and bevyengine#5715. It
intends to also fix bevyengine#5569, but with a slightly different approach.


This also fixes bevyengine#9335 by reexporting `DynEq`.

## Solution

The advantage of this API is that we can intern a value without
allocating for zero-sized-types and for enum variants that have no
fields. This PR does this automatically in the `SystemSet` and
`ScheduleLabel` derive macros for unit structs and fieldless enum
variants. So this should cover many internal and external use cases of
`SystemSet` and `ScheduleLabel`. In these optimal use cases, no memory
will be allocated.

- The interning returns a `Interned<dyn SystemSet>`, which is just a
wrapper around a `&'static dyn SystemSet`.
- `Hash` and `Eq` are implemented in terms of the pointer value of the
reference, similar to my first approach of anonymous system sets in
bevyengine#7676.
- Therefore, `Interned<T>` does not implement `Borrow<T>`, only `Deref`.
- The debug output of `Interned<T>` is the same as the interned value.

Edit: 
- `AppLabel` is now also interned and the old
`derive_label`/`define_label` macros were replaced with the new
interning implementation.
- Anonymous set ids are reused for different `Schedule`s, reducing the
amount of leaked memory.

### Pros
- `InternedSystemSet` and `InternedScheduleLabel` behave very similar to
the current `BoxedSystemSet` and `BoxedScheduleLabel`, but can be copied
without an allocation.
- Many use cases don't allocate at all.
- Very fast lookups and comparisons when using `InternedSystemSet` and
`InternedScheduleLabel`.
- The `intern` module might be usable in other areas.
- `Interned{ScheduleLabel, SystemSet, AppLabel}` does implement
`{ScheduleLabel, SystemSet, AppLabel}`, increasing ergonomics.

### Cons
- Implementors of `SystemSet` and `ScheduleLabel` still need to
implement `Hash` and `Eq` (and `Clone`) for it to work.

## Changelog

### Added

- Added `intern` module to `bevy_utils`.
- Added reexports of `DynEq` to `bevy_ecs` and `bevy_app`.

### Changed

- Replaced `BoxedSystemSet` and `BoxedScheduleLabel` with
`InternedSystemSet` and `InternedScheduleLabel`.
- Replaced `impl AsRef<dyn ScheduleLabel>` with `impl ScheduleLabel`.
- Replaced `AppLabelId` with `InternedAppLabel`.
- Changed `AppLabel` to use `Debug` for error messages.
- Changed `AppLabel` to use interning.
- Changed `define_label`/`derive_label` to use interning. 
- Replaced `define_boxed_label`/`derive_boxed_label` with
`define_label`/`derive_label`.
- Changed anonymous set ids to be only unique inside a schedule, not
globally.
- Made interned label types implement their label trait. 

### Removed

- Removed `define_boxed_label` and `derive_boxed_label`. 

## Migration guide

- Replace `BoxedScheduleLabel` and `Box<dyn ScheduleLabel>` with
`InternedScheduleLabel` or `Interned<dyn ScheduleLabel>`.
- Replace `BoxedSystemSet` and `Box<dyn SystemSet>` with
`InternedSystemSet` or `Interned<dyn SystemSet>`.
- Replace `AppLabelId` with `InternedAppLabel` or `Interned<dyn
AppLabel>`.
- Types manually implementing `ScheduleLabel`, `AppLabel` or `SystemSet`
need to implement:
  - `dyn_hash` directly instead of implementing `DynHash`
  - `as_dyn_eq`
- Pass labels to `World::try_schedule_scope`, `World::schedule_scope`,
`World::try_run_schedule`. `World::run_schedule`, `Schedules::remove`,
`Schedules::remove_entry`, `Schedules::contains`, `Schedules::get` and
`Schedules::get_mut` by value instead of by reference.

---------

Co-authored-by: Joseph <[email protected]>
Co-authored-by: Carter Anderson <[email protected]>
rdrpenguin04 pushed a commit to rdrpenguin04/bevy that referenced this pull request Jan 9, 2024
# Objective

First of all, this PR took heavy inspiration from bevyengine#7760 and bevyengine#5715. It
intends to also fix bevyengine#5569, but with a slightly different approach.


This also fixes bevyengine#9335 by reexporting `DynEq`.

## Solution

The advantage of this API is that we can intern a value without
allocating for zero-sized-types and for enum variants that have no
fields. This PR does this automatically in the `SystemSet` and
`ScheduleLabel` derive macros for unit structs and fieldless enum
variants. So this should cover many internal and external use cases of
`SystemSet` and `ScheduleLabel`. In these optimal use cases, no memory
will be allocated.

- The interning returns a `Interned<dyn SystemSet>`, which is just a
wrapper around a `&'static dyn SystemSet`.
- `Hash` and `Eq` are implemented in terms of the pointer value of the
reference, similar to my first approach of anonymous system sets in
bevyengine#7676.
- Therefore, `Interned<T>` does not implement `Borrow<T>`, only `Deref`.
- The debug output of `Interned<T>` is the same as the interned value.

Edit: 
- `AppLabel` is now also interned and the old
`derive_label`/`define_label` macros were replaced with the new
interning implementation.
- Anonymous set ids are reused for different `Schedule`s, reducing the
amount of leaked memory.

### Pros
- `InternedSystemSet` and `InternedScheduleLabel` behave very similar to
the current `BoxedSystemSet` and `BoxedScheduleLabel`, but can be copied
without an allocation.
- Many use cases don't allocate at all.
- Very fast lookups and comparisons when using `InternedSystemSet` and
`InternedScheduleLabel`.
- The `intern` module might be usable in other areas.
- `Interned{ScheduleLabel, SystemSet, AppLabel}` does implement
`{ScheduleLabel, SystemSet, AppLabel}`, increasing ergonomics.

### Cons
- Implementors of `SystemSet` and `ScheduleLabel` still need to
implement `Hash` and `Eq` (and `Clone`) for it to work.

## Changelog

### Added

- Added `intern` module to `bevy_utils`.
- Added reexports of `DynEq` to `bevy_ecs` and `bevy_app`.

### Changed

- Replaced `BoxedSystemSet` and `BoxedScheduleLabel` with
`InternedSystemSet` and `InternedScheduleLabel`.
- Replaced `impl AsRef<dyn ScheduleLabel>` with `impl ScheduleLabel`.
- Replaced `AppLabelId` with `InternedAppLabel`.
- Changed `AppLabel` to use `Debug` for error messages.
- Changed `AppLabel` to use interning.
- Changed `define_label`/`derive_label` to use interning. 
- Replaced `define_boxed_label`/`derive_boxed_label` with
`define_label`/`derive_label`.
- Changed anonymous set ids to be only unique inside a schedule, not
globally.
- Made interned label types implement their label trait. 

### Removed

- Removed `define_boxed_label` and `derive_boxed_label`. 

## Migration guide

- Replace `BoxedScheduleLabel` and `Box<dyn ScheduleLabel>` with
`InternedScheduleLabel` or `Interned<dyn ScheduleLabel>`.
- Replace `BoxedSystemSet` and `Box<dyn SystemSet>` with
`InternedSystemSet` or `Interned<dyn SystemSet>`.
- Replace `AppLabelId` with `InternedAppLabel` or `Interned<dyn
AppLabel>`.
- Types manually implementing `ScheduleLabel`, `AppLabel` or `SystemSet`
need to implement:
  - `dyn_hash` directly instead of implementing `DynHash`
  - `as_dyn_eq`
- Pass labels to `World::try_schedule_scope`, `World::schedule_scope`,
`World::try_run_schedule`. `World::run_schedule`, `Schedules::remove`,
`Schedules::remove_entry`, `Schedules::contains`, `Schedules::get` and
`Schedules::get_mut` by value instead of by reference.

---------

Co-authored-by: Joseph <[email protected]>
Co-authored-by: Carter Anderson <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-ECS Entities, components, systems, and events C-Performance A change motivated by improving speed, memory usage or compile times C-Usability A targeted quality-of-life change that makes Bevy easier to use X-Controversial There is active debate or serious implications around merging this PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Revisit labels again
5 participants