-
-
Notifications
You must be signed in to change notification settings - Fork 119
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Clean up object construction code #1249
Comments
Here's another idea: Simplify
|
I've implemented the most recent idea: #1287 And I've updated the issue description to include a list of the ideas. |
I have more ideas! Partial Object API: Explicit InferenceBuilding a full object from a partial one isn't always a trivial endeavor. Most objects reference other objects, and for a partial objects, some of those other objects might have been provided as full objects, others as partial ones (which might be dealing with a similar level of complexity themselves), yet others weren't provided at all and must be inferred from the rest of the data. There's quite a bit of ad-hoc logic for dealing with this, and that can get gnarly. I think it might be possible to simplify this a bit by making inference fully explicit. Right now, when building a full object from a partial one, the code that does that will look at what it has available, and infer the rest (if it can). Instead, the builder code could just never infer anything and produce and error ,if anything is missing. Inference could happen explicitly, with the user having to call a method: vertex
.with_position(position)
.with_curve(curve)
.infer_surface_vertex() // inference happens here
.build() // this would fail, if we hadn't inferred the surface vertex, or provided one I've already experimented with this, but ran into some trouble. I'm not sure if what I've been running into were inherent limitations of this concept, or if I was just using it the wrong way. I'm writing this down for later, hoping the confusion will lift at some point. Partial Object API: Merge OperationI already mentioned the ad-hoc logic that's going when building objects, and how it can get gnarly. For example, the code building a As I said, non-trivial stuff. And (of course!) there's a bug in there: What if curve and surface vertex are both partial, and only the surface vertex has a We could use a more structured approach here: instead of propagating objects from one place to another using ad-hoc code, we could define a merge operation with two inputs that could be either partial or full objects of the same type. It could work something like this:
I'm not super-clear on how useful this operation would be in practice, but I think there's some potential. |
I think trying to raise the level of abstraction of the partial object API is a bust. It ended up adding more complications than it removed, so far, and attempts to push it further had to be aborted because I ran into various problems. I've started to make everything lower-level and maximally stupid again, and have developed other ideas instead. Make partial object types fully public and mutableI initially chose a builder-style API which moves If all fields of partial objects are public, and all methods take Partial Object API: Merge++The merge operation I previously suggested exists now and is helpful. I have the suspicion that it is under-used though. Many |
I've already been working on this issue, developing and trying out ideas, but I've realized that I need to focus on this. I keep running into this problem, no matter what else I'm trying to do. This needs to get fixed. Here's the latest idea that I'm currently exploring: Partial Object API: Replace operationOne common pattern that I keep seeing is that an object needs to be replaced within a partial object recursively (meaning, including all objects it references). This is currently implemented in haphazard way. I suspect that turning this into a well-defined operation on partial objects that is used consistently, will provide a lot of cleanup potential. |
I've had some new ideas that could help solve this issue. One doesn't involve the partial object API directly, but instead simplifies the transform code, so that it doesn't use the partial object API anymore, relieving some pressure from the partial object API. This one is already implemented: #1406 This has already resulted in a minor cleanup that I failed to do previously: #1407 The second idea, I'm currently working on. It differs from the ones presented here before, in that it isn't an incremental improvement over the existing partial object API, but a completely new approach to it. Consequently, I couldn't figure out a way to implement this as a series of incremental improvements, and my current prototype is a complete rewrite of the partial object API. Here's the idea:
This means, the partial object graph is correctly connected from the beginning. Theoretically, there is no need to take care when updating or building the partial objects. You just update the partial objects that need updating, and the changes are propagated to where they are needed. I'm still in the process of prototyping this, and there might be reasons why it won't work out. I'm very optimistic however. I believe this new approach has the potential to solve this issue completely. |
As you can see from all those merged pull requests that reference this issue, work here has been ongoing. I've implemented the idea I presented in my previous comment. The design ended up a bit differently than I initially imagined, but the core of the idea, assigning an identity to partial objects and preserving that identity when constructing full objects from them, has been kept. It looks like the new approach can indeed address this issue. I've made substantial cleanups, among those replacing the whole partial object API with a new one, and updating a lot of the code using the partial object API to construct objects. So far, there have been no blockers, and no indication that the new approach won't stack up. There is still a lot of problematic code that I haven't touched (in fact, some of it got worse during the transition to the new partial object API), and I guess it's possible that there are still some hidden problems that can't be solved using the new approach. What I can say for sure though, is the new partial object API is better than the old one, and a lot of code has already been cleaned up successfully. I expect that it will take some more time to update all the horribly complicated code that's still left, which will likely result in some adjustments to the new partial object API. But at this point, I would be very surprised if this work wouldn't result in this issue being addressed completely. Since the partial object API works based on a completely new approach now, most of the ideas I've presented here before are no longer valid. To reflect that, I'm going to remove the list that tracks their implementation status from the original issue description. |
It's been a while since I wrote an update here, and again, you can see that quite a bit of work happened, by the list of pull requests referencing this issue. In my last update, I was very optimistic that the new approach to the partial object API would be the solution I was looking for, and this has turned out to be correct. While not all messy code has been fully cleaned up, the situation is vastly improved. Cleaning up the rest is more of an "is it worth the effort" problem, and no longer a "don't know how to do it" problem. As such, I've been considering to declare victory and close this issue. I might still do that, but I haven't done it yet, as I'm still working on a distinct, but closely related problem in the context of the object construction code. Namely, while the current object construction code is well under control now, we are missing the right APIs to write the next batch of object construction code. As I've been shifting my attention to #1162, I've been hindered by the partial object and builder APIs lack of completeness. Building up these APIs has basically been what I've been doing all along as part of working on this issue, and this has been a success so far. Building a cycle or a face works effortlessly now, but now building solids and shells needs to become just as effortless. This is what's needed to take the next steps I need to take to fix #1162. That hasn't been going so well, to be honest. Not because the problem is inherently hard (at least I don't think it is), but rather because I've been staring at object construction code for too long, and probably need a break. I need to work on something different for a bit, and have started to look into other problems and opportunities for that reason. I still try to make a little bit of progress here every day, but I expect that to be slow going for a while. |
I mentioned yesterday that I'm considering to close this issue, and I decided to that now. To be clear, this is purely an administrative decision. It doesn't change anything about the work I'm doing, just how I'm tracking it and where I'm posting my updates. The original problem that this issue was opened for has been addressed, and further work in the same area is currently driven by #1162, so I'm going to post my updates there. If there's still more work to do after #1162 has been addressed, I can open a more specific follow-up issue. Before I press |
Due to the recent advancements of #1021, it has become possible to make a lot of validation code much more strict, by basing comparisons between objects on identity instead of equality. There's a thorough write-up in the kernel documentation that explains this concept, but the gist of it is, two objects can look the same (be equal), while having been constructed by different pieces of code, from different input data (i.e. not be identical).
Relying on such equal-but-not-identical objects to always be equal is dangerous and a source of bugs. Think about a cube, for example. Each face of the cube shares its 4 vertices with its neighboring faces. If you construct those vertices by computing their 3D coordinates from the local surface coordinates of each face, that will totally work in my unit test, because I use neat and tidy coordinates there. But when you construct a cube in an actual model, that same code will break horribly, because the cube is located at some weird point in space and slightly rotated, and suddenly those vertices that are supposed to be equal have a distance of
1e-16
or something between them.This is a big problem, because a CAD kernel provides lots of opportunities for subtly broken code like that. If that problem is left unaddressed, there will be a million little bugs, that will only ever be discovered by users in production, and will be a pain to debug, because they can only be reproduced in that user's specific model. And there are lots of ways to address it. The way I've chosen for Fornjot is to prevent this problem outright, by simply disallowing these equal-but-not-identical objects. If you have two references, and expect the objects they reference to be the same, then them being equal is not enough. They must be identical, so you know this is one object, constructed by one piece of code, from one set of input data. It will always be the same.
The Problem
We have most objects managed in centralized storage now (#1021), which makes it possible to compare references to them based on identity, and the existing (and still very incomplete) validation code has already uncovered a lot of potential issues, which I've fixed. So far, so good.
The problem is, that fixed code is messy. Really messy! Just constructing objects wherever you need them from whatever input data is available is easy, while only constructing them once and making sure you put the constructed object everywhere it's needed is much harder. I've had to create a whole new API to make it possible at all, and the resulting code is still extremely complicated and hard to understand.
This is far from ideal, but it's fine for now. While that code is complicated and hard to understand, it is not fragile. Its functionality is fixed in place by the validation code that found those issues in the first place (and that validation code will become more expansive as time goes on). So I'm not really worried about that code breaking, but of course I'd still prefer to have code that is easy to understand and modify.
From the beginning, I suspected that under all that mess, there was a cleaner approach waiting to be discovered. I didn't know what that approach was, so I kept writing messy code. This issue is meant to be the start of finding this cleaner approach, by collecting ideas for solutions that might help.
Before I get into those ideas, I'd like to note that the existence of a cleaner approach wasn't just blind hope, even when I had no idea what it might look like. It is my experience that when writing really complicated code, a test suite that covers this complicated code completely, tends to be large and complicated itself. My messy object construction code doesn't have a test suite per se, but it is covered by that validation code I mentioned, which is relatively small and simple.
That the functionality of this huge mess is held in place so completely by relatively little validation code, indicates to me that the complexity of the code is somewhat needless, not inherent to the problem itself. The ideas I'm collecting here have yet to prove themselves, but I'm optimistic that they will confirm this.
Solutions
Custom Iterators Adapters
I expect at least part of the solution to be not one big thing, but a collection of smaller techniques applied all over the place. One such technique are custom iterator adapters. Iterating over objects is something that object construction code does all the time. That this isn't always simple, is one source of complexity.
Introducing custom iterator adapters to encode common use cases was suggested to me by @HackerFoo on Matrix (thanks again!), and he even provided code for one such iterator adapter.
Partial Object API: Extracting a minimal core
The partial object API was introduced to make all this work possible. It allows you to start building an object by providing just some of its attributes, then finish building it later, inferring the attributes that weren't provided. For example, when creating a
Vertex
, you can infer theGlobalVertex
. Or, if thatGlobalVertex
is already available, you can provide it to the partial object before building the finished one.The partial object API has, to a large degree, supplanted the builder API, which was also capable of building objects, inferring parts of them, but was less flexible. I now suspect that reducing the partial object API to a minimal core, that's easier to make simple and correct, and building the more convenience-focus builder-like methods on top of that, is going to be part of the solution.
Partial Object API: Raising the level of abstraction
I think that one problem with the partial object API, that makes its implementation so complicated, is that it doesn't properly represent the higher-level principles that are at play. To give an example,
Vertex
references aCurve
which references aGlobalCurve
.GlobalEdge
also references aGlobalCurve
. AHalfEdge
referencesVertex
instances and aGlobalEdge
, and theHalfEdge
is only valid, if allGlobalCurve
s referenced by it indirectly are identical. This is not hard to verify ([1], [2]), but making sure theHalfEdge
is built that way turned out really messy.A caller using a
PartialHalfEdge
to build aHalfEdge
could provide theGlobalCurve
through one of theVertex
instances, or theGlobalEdge
, or none, or both. At any time, a caller could update one of the fields of thePartialHalfEdge
to add or remove aGlobalCurve
, or modify a partially constructed one. When it comes to building theHalfEdge
, all of these combinations need to be handled correctly, which is quite complicated.It makes sense to provide this level of flexibility when looking at, for example, a
PartialVertex
in isolation. But when thatPartialVertex
is part of aPartialHalfEdge
, it is subject to restrictions, like the aforementioned need to reference a sharedGlobalCurve
. It makes sense to represent those restrictions in the API ofPartialHalfEdge
, and I think this could contribute to making the whole thing simpler.That's all I have for now! I intend to update this issue as I implement those ideas (or try to), and as I come up with new ideas. For now, I don't expect this to be an area of focus. Rather, I'll do this work piece-meal, where it benefits other work I need to do.
If anyone's been reading through the whole thing, I hope you enjoyed it. As always, writing this out helped making my thoughts much clearer (at least to me), as they would have otherwise been.
The text was updated successfully, but these errors were encountered: