Is there a specific reason we are storing the datasets in their serialised form rather than static fields/properties in the assembly? #1941

Closed
epignatelli opened this issue Aug 20, 2020 · 7 comments

Comments

@epignatelli
Member

Is there a specific reason we are storing the datasets in their serialised form rather than static fields/properties in the assembly?

@epignatelli epignatelli added the type:question Ask for further details or start conversation label Aug 20, 2020
@IsakNaslundBh
Contributor

Not sure I fully understand the question?

Are you asking if there is a reason we are storing json files at all? Or is the question about the inner workings of the Library_Engine?

@epignatelli
Member Author

The dataset is a collection of constant objects. Why do we store this collection in JSON format, in a separate file, rather than in a class (e.g. public static class Gradients : Dataset)?

@IsakNaslundBh
Contributor

IsakNaslundBh commented Aug 20, 2020

Some varying reasons:

  1. Ease of use for less experienced developers, as well as more experienced ones. Having it in JSON files and letting the menu be dictated purely by the file structure means that all you have to do to create a dataset is to generate the objects you like, in whatever UI you like, then plug them into a Dataset object and push that object through the file adapter. Then just put the file in the correct folder and raise a PR. No need to change any code. Easy to do, easy to understand.

  2. Those datasets can be massive. If we were to put all of the SteelSections on the steel section class, it would be thousands of lines. It would also require constant updates to that specific class, which takes far more code knowledge.

  3. IMO, keeping the schema and the data separate makes more sense in general. Partly for the reasons above, but also as a general concept. The class is just the class definition, not a lot of specific instances of it.

Saying all of this, we have a few cases of static properties on the class (as I know you know). For example the Vector class and Point class (getting X, Y, and Z vectors as well as the origin point). Something like that might make sense for the gradient case, but I still think that in general data is better kept separate from the class definition.
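
As a rough illustration of point 1, here is a minimal sketch of how a menu could be derived from the folder layout alone when each dataset is a JSON file. It is written in Python for brevity and is not the Library_Engine implementation; the folder names, file names, sample values, and the `build_menu` helper are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


def build_menu(root: Path) -> dict:
    """Map every JSON file under root to a menu path derived from its location."""
    menu = {}
    for path in sorted(root.rglob("*.json")):
        # e.g. <root>/Structure/SteelSections.json -> "Structure/SteelSections"
        key = path.relative_to(root).with_suffix("").as_posix()
        menu[key] = json.loads(path.read_text(encoding="utf-8"))
    return menu


def demo() -> dict:
    """Create two hypothetical dataset files in a temp folder and build the menu."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "Structure").mkdir()
        (root / "Graphics").mkdir()
        # Made-up sample data, only here to make the sketch runnable.
        (root / "Structure" / "SteelSections.json").write_text(
            json.dumps([{"Name": "ExampleSection", "Height": 0.2}]), encoding="utf-8"
        )
        (root / "Graphics" / "Gradients.json").write_text(
            json.dumps([{"Name": "ExampleGradient"}]), encoding="utf-8"
        )
        return build_menu(root)


if __name__ == "__main__":
    for entry in demo():
        print(entry)
```

Under this assumption, adding a dataset means adding a file in the right folder; no code changes are needed for it to show up in the menu.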

@epignatelli
Member Author

I am not sure how much easier and more intuitive that is. Unless the data in the dataset can be generated procedurally with an algorithm, I don't see the benefits.

The menu organization is a superstructure that we generated. We could do the same if datasets were organised in a different way, e.g. in classes (we already do this to cluster methods in components). There is a lot of knowledge in what you're describing: why should I plug everything into a Dataset object? Then use a FileAdapter? What's the correct folder? Again, when the data is not procedurally generated, that's not easier than writing the same thing you would write in Grasshopper into a cs file.

Yep, this I understand, it makes sense. Maybe that's the case for procedurally generated data, I suppose.
It has a drawback, though. If you change a schema without versioning it, the project still compiles, and you find out that you broke the dataset only at runtime, if you use it.

I am not proposing to mix schema and data instances, quite the opposite. Ideally, what I would do is add another angle and have a MachineLearning_Datasets project (and the same for any other namespace) that holds the instances.

My whole point here is that I can't use datasets in their simplest form, by typing the data that's in there. I need to open the UI, create the objects, serialise them, create a file adapter and push them into the correct folder.
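
The schema-drift drawback mentioned above can be sketched as follows. This is a Python approximation, not BHoM code: the `Gradient` class, its fields, and the stored JSON are hypothetical. The field `stops` is assumed to have been renamed to `markers` after the JSON file was written.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Gradient:
    # Hypothetical schema after a rename: this field used to be called "stops".
    name: str
    markers: tuple


# Hypothetical dataset file content, serialised before the rename.
STORED_JSON = '[{"name": "ExampleGradient", "stops": [0.0, 0.5, 1.0]}]'


def load_gradients(text: str) -> list:
    """Deserialise stored items into Gradient objects."""
    return [Gradient(**item) for item in json.loads(text)]


def try_load(text: str) -> str:
    """Report whether the stored data still matches the current schema."""
    try:
        load_gradients(text)
        return "ok"
    except TypeError as exc:
        # The stale "stops" key is only discovered when the file is read.
        return f"broken at load time: {exc}"


# An instance written in code has to use the current field names.
EXAMPLE = Gradient(name="ExampleGradient", markers=(0.0, 0.5, 1.0))

if __name__ == "__main__":
    print(try_load(STORED_JSON))
```

In a compiled language such as C#, an in-code instance that still used the old property name would stop the build, whereas the stale file is only noticed when something loads it.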

@FraserGreenroyd
Contributor

Hi guys, this is a perfect opportunity to use Discussions - could we move this conversation there until we get to a stage of having an actionable issue? 😄

@IsakNaslundBh
Contributor

Good point @FraserGreenroyd. @epignatelli, want to port this across so we can continue there? :)

@epignatelli
Member Author

Here you go guys!
BHoM/BHoM#973
