Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Source of truth for Transform data #229

Closed
andreasterrius opened this issue Aug 18, 2020 · 27 comments
Closed

Source of truth for Transform data #229

andreasterrius opened this issue Aug 18, 2020 · 27 comments
Labels
A-ECS Entities, components, systems, and events C-Docs An addition or correction to our documentation

Comments

@andreasterrius
Copy link

andreasterrius commented Aug 18, 2020

Currently the Transform is not considered the ssot for position, rotation, and scale (based on the comments in transform.rs). However in the examples (3d_scene, load_model) translation and rotation for the Camera3dComponents is usually never set. This could cause confusion in regards of the ssot of the position, rotation and scale data.

The position data will be needed in the future to calculate more advanced pbr shaders.

@karroffel karroffel added C-Docs An addition or correction to our documentation A-ECS Entities, components, systems, and events labels Aug 18, 2020
@MarekLg
Copy link
Contributor

MarekLg commented Aug 24, 2020

I would propose another solution to this problem (copied from another issue that has been closed as duplicate)

Wherever Transform is required, if there is none attached, build it (maybe with #266) from Translation, Rotation or Scale, or Trasform::identity(), if nothing else is present. DynamicBundles would have to restrict creation to just Transform, which could however incorporate Translation, Rotation and/or Scale by using the above mentioned TransformBuilder.

@cart
Copy link
Member

cart commented Aug 24, 2020

Yeah currently transform is the "output" used in shaders, but position, rotation, and scale will be used to set the Transform, if they exist. The benefits are that systems can pull in exactly what they want, which is faster (less memory being accessed) and easier to parallelize.

The downside is that Transform needs to be updated after app logic finishes, which means Transform will be a frame behind in the UPDATE stage. It also creates issues like "setting the transform does not automatically update source components.

The alternative is to use a combined transform for everything. We are heavily considering this option.

The Amethyst folks have already discussed the issue at length, so please read that thread first before weighing in here: https://community.amethyst.rs/t/legion-transform-design-discussion

@GrantMoyer
Copy link
Contributor

GrantMoyer commented Aug 25, 2020

Couldn't we update the Transform every time Scale, NonUniformScale, Rotation, or Translation are mutated (what we do now), and visa versa (which we don't do now)? That way, the Transform and the individual components are always kept in sync, and we could throw out the sync member of Transform. What am I missing?

@aclysma
Copy link
Contributor

aclysma commented Aug 25, 2020

+1 for using a single transform component (my preference is that it contains a simple 4x4 matrix). My rationale was on the thread linked by @cart here: https://community.amethyst.rs/t/legion-transform-design-discussion/1582/16?u=aclysma

@MarekLg
Copy link
Contributor

MarekLg commented Aug 25, 2020

@cart great thread! I think this post summarizes it fairly well.

@GrantMoyer That's kind of the idea, but instead we would have one Transform-component that implements e.g. translate(). The problem lies more in the hierarchy: When a Transform is translated, for example, all children would have to be moved as well, which is not possible without access to them. You can look at the linked post for how the amethyst folks want to solve this.

As I see it, the biggest drawback of their solution, in terms of performance, is that there are more matrix calculations needed to be done by the CPU. In terms of usability it would be nice to be able to mutate the global-part of the Transform.

I think, however, those drawbacks are negligible, at least for the moment. A possible solution for the performance-part could be to load it off to a compute shader, if it all happens at the same time and I'm not terribly mistaken.

Another extension to this solution could still be to allow just e.g. a Translation-component on the entity and create a Transform based on that. Also, could it be possible to perform those global transform calculations in visible_entities_system(..)?.

What do you guys think about the linked solution?

@MarekLg
Copy link
Contributor

MarekLg commented Sep 16, 2020

I think this issue can be closed now.

@cart cart closed this as completed Sep 16, 2020
@AThilenius
Copy link

@cart @MarekLg @aclysma I'm sorry I didn't know about Bevy until a few days ago, so didn't know about this conversation. I wrote the original legion_transform library. I'm all for a 'simpler' API if you guys feel that's a better fit for Bevy (I personally much prefer separate Translation/Rotation/Scale components as keeping entity sizes down is crucial for ECS performance and being able to 'throw entities at the problem'). I'm sorry so much work has already gone into this, but...

You cannot store the source of truth for an Affine transform as a mat4, it accumulates significant errors over time as you manipulate rotation. This is because rotation and scale are encoded together in the upper left 3x3, rotation will end up effecting scale and it will drift, see this article for example. This must be reverted back to separate Translation (Vec2/3), Rotation (Quat3/4), Scale(Float) and/or NonUniformScale (float2/3) even if you chose to have all of those live in one component (which again, I disagree with, but that's your call 🤷‍♂️).

Also NonUniformScale was chosen as an exception not the default because Affine transforms (ones with non-uniforms scale) are evil in games. They incur significant precision errors when converted to a mat4 (especially in a hierarchy), plus they cannot be used by a physics engine in any meaningful way without a nightmarish amount of corner-cases and performance degradation.

@cart
Copy link
Member

cart commented Sep 25, 2020

@AThilenius thanks for the heads up! You have a lot of context here, so your perspective is much appreciated. We currently make no stability guarantees, so I'm more than happy to make breaking changes in the interest of improving the engine. I'm re-opening this issue and I encourage everyone to weigh in on @AThilenius's comments.

You cannot store the source of truth for an Affine transform as a mat4, it accumulates significant errors over time as you manipulate rotation

This is certainly useful information 😄
Sounds like a problem we should solve, but how we do that is up for debate:

  1. Completely replace Mat4 with pos/rot/scale inside the Transform component and re-compute matrix on demand when it is asked for. Direct matrix read/writes are now "more expensive" and accessing rotation and scale are now "less expensive".
  2. Revert to the old model. "Transform" is now back to being out of sync during the UPDATE stage. "directly setting" matrix values is now cumbersome because the individual components are the source of truth. I would probably want to work out a nice way to "sync back" direct Transform changes to the "source of truth" components.
  3. Store pos/rot/scale/mat4 together and make sure they stay in-sync internally. This makes the component "very big" and is almost certainly not what we should shoot for. It might be worth benchmarking this just to see how much performance we lose.

NonUniformScale was chosen as an exception not the default

Yeah I totally agree on this. I tried to mirror that in the new transform api by using the "scale" term to refer to a uniform scale:

pub fn from_scale(scale: f32) -> Self {

@cart cart reopened this Sep 25, 2020
@GrantMoyer
Copy link
Contributor

GrantMoyer commented Sep 25, 2020

In any case, I don't think we should have a non-source of truth component for the local transform. I think helper methods to set the source of truth are a much more intuitive interface.

@cart
Copy link
Member

cart commented Sep 25, 2020

Pulling in @termhn because of her linear algebra / ultraviolet experience

@AThilenius
Copy link

AThilenius commented Sep 25, 2020

Here's a few more considerations to throw into the mix (why legion_transform was laid out the way that it was 😁)

  • If LocalToWorld is computed from N transformation types, and LocalToWorld is only requirement for most end-frame systems like rendering, then pre-baked static objects need only that one component, regardless of their original transforms / hierarchies. This is both a saving on memory, and more importantly runtime did-move checks as these static objects are simply never processed by any transform system. They don't even need to be checked for 'did you move' because they are incapable of moving without an archetype change or someone directly modifying the LocalToWorld which is generally not a good idea because of physics needing access to non-drifting translation/rotation components.
  • Computing a local-space mat4 isn't "expensive" per se, but it's a very common operation. If you split transformation types into Translation/Rotation/Scale/NonUniformScale then you only pay the cost of incorporating that single transformation into the matrix. For Translation (which is the most common) that's just a vec3 copy.
  • If all 'movable' objects all have some combination of Translation/Rotation/Scale/NonUniformScale then networking becomes simpler, you don't need any sleep or 'did-move' logic as entities that lack these components will not move. Nor does the entire 4x4 matrix need to be sent over the network, only the Vec3 for Translation in most cases.
  • Storing transforms for hierarchies (LocalToParent, Parent and Children) separate is also a good idea, because it's surprisingly uncommon once you bake out static objects in a scene. Very few entities actually use hierarchies at runtime, mostly it's used as an organizational tool in editor. Meaning you want to pay as little cost per-entity that doesn't use hierarchies here as possible (in the case of legion_transform that cost is exactly zero).
  • Without splitting the components apart into Translation/Rotation/Scale/NonUniformScale you will need to provide a fast way to check for identity, and uniform vs. non-uniform scale. Either wrapping them all in enums / optional, or paying the cost of running expensive checks each time you want to check "does this entity have a non-uniform scale". This will be required for things like Physics, where Affine transforms cannot be applied to colliders because they are not tessellated.

All of these requirements is why I felt separate components for everything was the right call. Your mileage may vary of course. If you really want local transforms to be one component, I would recommend something along the lines of:

pub enum Scale {
    Identity,
    Uniform(f32),
    NonUniform(Vec3)
}

pub struct Transform3D {
    translation: Vec3,
    rotation: Quat,
    scale: Scale,
}

// Plus nice `impl` methods for `Transform3D`

with separate components to store local space and global space matrices (these are the same on root objects). This still doesn't help with the problem of hierarchies being out of date though. There isn't a perfect solution to that one sadly. For that I would say... make recomputing the hierarchy (updating Children and LocalToWorld on hierarchy entities) very easy API wise, and let users recompute (or partially recompute) the graph when they need to. The engine has no way of knowing ahead of time when that is, without doing nasty 'dirty checks' and conservative re-computation all over the place.

@fu5ha
Copy link
Contributor

fu5ha commented Sep 25, 2020

I am personally a fan of "option 1". I think it's the best compromise in terms of the things that @AThilenius has said with usability from the user's perspective. A Similarity3 (which allows for translation, rotation, and uniform scale) or something similar (no pun intended...) takes up only 8 floats (nonuniform scale would be 10 floats instead) and I don't think splitting these things into separate Components is going to make a big enough perf difference to account for the usability hindrance. Separating them makes composition harder or impossible, which while it's not a super-common use case as much as simply changing the transform, it's still common.

Another thing to consider is that the composition (i.e. combining two together to get the result of applying their transformations in sequence) of Similarities (or transforms stored in the same manner) is faster than the composition of a full Mat4, but transforming a Point3/Vec3 with a Similarity is slower than transforming it with a Mat4 because rotating a Vec/Point with a Quaternion is slower than with a rotation matrix... however, this perf difference can be made negligible by transforming multiple points at once because you can compute some 'state' upfront and then only do a small amount of computation per point.

Finally, one more thing to toss into the hat is that I would consider restricting any scaling at all to be only possible on entities that are "leaves" of the hierarchy graph. This could be done by using an Isometry3 (allows translation and rotation, also called a rigid body transformation or a "motor") or similar as the universal Transform type and then allowing leaf nodes to also have a Scale component like the one @AThilenius proposed above. The benefits would be:

  • Cheaper to store, less memory usage (by a small amount)
  • Simplifies transformations due to hierarchy by a great amount, particularly since you no longer have to account for different scaling types during hierachy, you just apply it at the leaf and then leave it, which is good for physics, simpler code, and it's cheaper to compute.
  • Though this isn't particularly relevant currently, I'll probably start evangelizing adopting (Projective) Geometric Algebra eventually, due to its awesome elegance and simplicity plus several other benefits mentioned in that video. PGA has elements called "motors" (which I alluded to above) that are the combination of a "translator" and "rotor", and which do rigid body transformations (translation and rotation). They can be stored as 8 floats. These motors are really awesome because you can apply them to any other element of the algebra (points, lines, planes, other motors, other translators, other rotors...) in a straight forward way and it just works. In addition, they are isomorphic (meaning they are the same thing, just a different formulation) to Dual Quaternions, and much like how Quaternions are useful for clean interpolation of rotations, we can use motors to do much better interpolation of full rigid body transforms, which is particularly useful for animations but can be useful for a number of other things as well. You can even use them to do the job of a perspective projection matrix. However, they don't do any type of direct scaling at all, so when using them if you need/want to scale, the best way to do it is to simply add a scaling step which just multiplies things at the beginning and then uses motors from then on in the transform "pipeline".

@bitshifter
Copy link
Contributor

If you are interested in trying Scale, Rotation and Translation in a single 3D transform these currently exist in glam at the moment using the transform-types feature (see https://github.com/bitshifter/glam-rs/blob/master/src/f32/transform.rs). They're on a feature because it's just something I was playing around with as a 3D transform type and haven't committed to supporting in the library. There are benchmarks for them in glam if you want to see the performance trade offs. Generally they performed worse than a Mat4 for transforming points/vectors and other transforms but are faster for inverses. I think in particular in glam they are slower than the Mat4 because they're using Vec3 instead of Vec3A.

@aclysma
Copy link
Contributor

aclysma commented Sep 26, 2020

Regarding position/rotation/scale being in a single component or spread across components:

  • Unreal and other engines I've used combine these into a single struct/component and it seems to work well for them. Unity DOTS plans to separate them, but it sounds like they are considering other options (https://gist.github.com/joeante/79d25ec3a0e86436e53eb74f3ac82c0c).
  • Separating these components potentially introduces new costs (fetching additional components and running extra systems) and I don't feel confident to assume either approach would be faster than the other. I'd like to see data from a real game that's meant to be played. (i.e. not a microbenchmark or tech demo that has a sole purpose of making transform fetches a bottleneck.)

The internals of how position/rotation/scale are stored in a component that contains all three is a changeable implementation detail, so I think that takes a lot of pressure off of picking "the right way" today (as if there is such a thing.) I prefer approach (1). I think there are options for a Mat4 approach (like stashing scale in unused w components) but this seems like something that could be investigated later, hopefully with the benefit of profiling something a little more "real" than we have today.

@AThilenius
Copy link

AThilenius commented Sep 26, 2020

Regarding position/rotation/scale being in a single component or spread across components:

  • Unreal and other engines I've used combine these into a single struct/component and it seems to work well for them. Unity DOTS plans to separate them, but it sounds like they are considering other options (https://gist.github.com/joeante/79d25ec3a0e86436e53eb74f3ac82c0c).
  • Separating these components potentially introduces new costs (fetching additional components and running extra systems) and I don't feel confident to assume either approach would be faster than the other. I'd like to see data from a real game that's meant to be played. (i.e. not a microbenchmark or tech demo that has a sole purpose of making transform fetches a bottleneck.)

The internals of how position/rotation/scale are stored in a component that contains all three is a changeable implementation detail, so I think that takes a lot of pressure off of picking "the right way" today (as if there is such a thing.) I prefer approach (1). I think there are options for a Mat4 approach (like stashing scale in unused w components) but this seems like something that could be investigated later, hopefully with the benefit of profiling something a little more "real" than we have today.

Unreal (and other engines) don't use an ECS so you're comparing apples to race cars. Unity DOTs has been separate components from the start and they don't plan on changing that because the performance is exceptional. That gist is only talking about aspects, which are just 'views' on top of set of components.

Breaking things apart rather than combining them is almost always faster. Instead of the hierarchy system needing to loop over every chunk with a Transform, it can simply loop over chunks that include a Parent. Ie. the former requires more looping AND a conditional branch for each entity, the latter is a non-op which is the beauty of an archetypical ECS.

These are all separate concepts and must be considered independently (I'm seeing these getting blurred):

  • What format is the source-of-truth for local space transforms in
    • This means separate or single component (and @termhn can talk more to Protective GA because I'm out of my depth there but it looks awesome).
    • This should never be stored as a mat4, including 'stashing scale in unused w components' which defeats the entire purpose of using a mat4 (ie they can be used by linear algebra equations directly). You're basically just packing a normal vec3 into a mat4 struct at that point, which isn't useful.
  • Where the computed space transforms are stored
    • There are very few reasons for this to be stored in the above component(s), and lots of very good reasons for it to be separate.
    • Both a local-space and a global-space transform are needed, with the local-space being conditional on presence of a hierarchy (ie we really want to keep LocalToWorld and LocalToParent separate, because the former is orders of magnitude more common).
    • I am almost certain these must be mat4, because all our existing hardware (GPUs for example) work on mat4. @termhn can chime in here again if Projective GA can offer an alternative. But multiplying matrices is also really fast, even on the CPU.
  • Where are hierarchies stored
    • While not ideal, after 3 totally disparate implementations, I finally agreed with Unity DOTs and went the way of separate Parent and Children components. Hierarchies in ECS are hard, because we lack reference types. There is no easy solution to this one, they all have some super nasty trade-offs.

Edit: One thing I never explored that has some interesting possibilities, is to integrate the concept on hierarchies directly into the ECS core. We might be able to significantly clean up the builder/manipulation/query API if that's done. It also might incur some overhead, but I would GLADLY accept that for a better hierarchy API. Never looked into it, but this is a path I would personally throw some time into because it sounds like heaven after 8 months of Unity DOTs for my day job.

@MarekLg
Copy link
Contributor

MarekLg commented Sep 27, 2020

I'd like to propose a solution to this problem, which IMO is in line with the general consensus here, as well as the current solution in regards changes that would be needed to make.

Geometric data is currently stored as Transform and GlobalTransform:

  • Transform represents local data
  • GlobalTransform is used as the global data and gets computed after every update.

Seeing as only GlobalTransform gets used for any rendering, it can keep being represented as a Mat4 - minus the mutating methods except matrix_mut(..) ..

Transform could look like this

struct Transform {
  pub translation: Vec3,
  pub rotation: Quat,
  pub scale: Vec3,
}

With something like pub fn compute_matrix(mut self) -> Mat4 to compute the respective Mat4 used to compute the GlobalTransform.

This doesn't have to be the definitive answer, but can be seen as 'good for now' and can be altered if need be. One extension could be to compute GlobalTransform only from a Tuple of e.g. (&Translation, &Rotation), if no Transform is present.

The main reasons I see to not convert to a system like that right now:

  • It either relies on newtypes, which IMO are more wonky to use as a single Transform, or we would need to rewrite a lot of the glam stuff.
  • We could not provide methods like Transform::orbit_around(&mut self, center: Vec3, axis: Vec3, t: f32) without a system, because they operate on more than one component.

These points could be irrelevant for other engines, and rightfully so, because they don't prioritize simplicity.

Regarding scale: keeping ease of use in mind, we can just leave it a Vec3, which will likely stay Vec3::one() all the time. If a more performat solution is needed here, I like the enum approach shown by @AThilenius.
Regarding hierarchy: We can just leave it as it is, because it works. As above, if we need to rewrite it that would likely be decoupled from Transforms anyway.

I'll try to get a PR ready in the next week to see what you guys think!

@onehundredfeet
Copy link

onehundredfeet commented Sep 27, 2020 via email

@MarekLg
Copy link
Contributor

MarekLg commented Sep 27, 2020

@onehundredfeet thanks for the article! It's always nice to see different approaches.
That seems to already be the approach bevy is taking, however. At least concerning hierarchy. I might look into it a bit more after I've got my first solution ready, but I think potential improvements to the hierarchy system should be a different issue/PR as it works quite independent from the Transform composition.

@AThilenius
Copy link

With something like pub fn compute_matrix(mut self) -> Mat4 to compute the respective Mat4 used to compute the GlobalTransform.

What happens when a parent has N children? Do you recompute the mat4 for each child? What happens when N systems need access to the global transform for M entities? Do you just accept the O(MN) complexity, or do you cache it somewhere? If you cache it somewhere... then how is that different from the way I originally setup legion_transform?

@MarekLg
Copy link
Contributor

MarekLg commented Sep 27, 2020

@AThilenius bevy has a transform propagation system which queries all Transforms and GlobalTransforms starting by the ones without Parent and, after computing their own, recursively compute the GlobalTransform of the Children. So the GlobalTransform gets computed just once and represents the cache of the Mat4 for each entity for the next frame.

I don't know how this is different from legions implementation, because I didn't look at it.

@AThilenius
Copy link

bevy has a transform propagation system which...

I'm aware, I wrote the original impl 😋 I'm confused though, can you clarify exactly which components you intend to have, and what they contain?

@MarekLg
Copy link
Contributor

MarekLg commented Sep 28, 2020

I'm aware, I wrote the original impl

That's good to know 😅

I just adopted the system as it was when we had both Transforms and Translation, etc. components, so I don't know the ins and outs, but this is how I understand it.

Relevant components are:

  • Parent stores the parent Entity
  • Children stores the children Entitys
  • Transform
  • GlobalTransform absolute global transform used for rendering etc.

That's about it, really. A quick look into bevy-transform::transform_propagate_system should get you a clearer picture (at least clearer than I could paint it).

@bitshifter
Copy link
Contributor

@MarekLg

It either relies on newtypes, which IMO are more wonky to use as a single Transform, or we would need to rewrite a lot of the glam stuff.

I was wondering why it would be necessary to rewrite a lot of glam stuff?

@MarekLg
Copy link
Contributor

MarekLg commented Sep 29, 2020

@bitshifter
Maybe I expressed that poorly 😅. I don't think we should stop using glam at all! Let me try to explain what I mean.

For example: Vec3 is currently used as translation and scale, so in order to use it as a component it needs to be wrapped in either a newtype, or in a wrapper. My problem is that newtype feels, again IMO, wonky to use and a wrapper would need a lot of boilerplate code (e.g. implementing all the traits of Vec3 on a supposed Translation etc.).

I may be wrong, however, and there is a neat solution after all.. If that's the case I'd love to be corrected!

@bitshifter
Copy link
Contributor

@MarekLg ah thanks. I was just wondering if you were hitting some limitation in glam but it's more a general Rust new type thing.

@MarekLg
Copy link
Contributor

MarekLg commented Sep 29, 2020

@bitshifter absolutely correct

@cart
Copy link
Member

cart commented Oct 18, 2020

Resolved by #596

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-ECS Entities, components, systems, and events C-Docs An addition or correction to our documentation
Projects
None yet
Development

No branches or pull requests

10 participants