Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Split library into targeted use cases #322

Open
thompson-tomo opened this issue Aug 25, 2024 · 6 comments
Open

Split library into targeted use cases #322

thompson-tomo opened this issue Aug 25, 2024 · 6 comments

Comments

@thompson-tomo
Copy link
Contributor

When we use CycloneDX.Core nuget, this package contains multiple different serializers even though a user is likely to only ever use 1 format and a large number of times a user will never need the spdx functionality.

As such it would be beneficial to split the library into smaller targeted libraries as follows:

  • CycloneDX.Core
  • CycloneDX.Json
  • CycloneDX.Protobuf
  • CycloneDX.Spdx
  • CycloneDX.Xml

In the end this will also lead to a reduction in dependencies ie protobuf is only a dependency for the protobuf package

@andreas-hilti
Copy link
Contributor

While I see the value of a reduced set of dependencies, based on the current implementation this goal is hard to achieve IMO, mainly because
a) the support for the 3 serialization formats is mostly based on a single code decorated with different attributes
b) for some functionality, the different serialization formats are also used internally, e.g. copying of BOMs is using the protobuf serialization, equality checks are currently based on the JSON serialization, etc.

@thompson-tomo
Copy link
Contributor Author

What I propose is that we take stepwise approach to splitting such as follows:

  • Split serialization into dedicated classes (1 for each) which inherit from common abstract class
  • Evaluate if XML can become standalone or tackle blockers
  • switch copy bom to json (seperate issue)
  • identify any blockers to splitting proto out (if necessary seperate issue or previous one)
  • identify any blockers to splitting json out (if necessary seperate issue)
  • switch to generated classes based on schemas (seperate issue)

@mtsfoni
Copy link
Contributor

mtsfoni commented Aug 27, 2024

I think copying BOMs should happen in memory. Same for equality checks. That we use serialization and deserialization to copy the object-structure was a probably a lazy short cut taken at some point.

That would be the first step in my oppinion.

Then the question is how much the serialization-annotation can be seperated from the actual serializers. Alternatively custom annotation could work with some extra complexity.

switch to generated classes based on schemas (seperate issue)

I don't see this working. After 1.6 release we have something like 14 schemas that are all using the same model - I don't see how anybody generates that automatically.

However we are limited by the actual work capacity that people put into the project. It's already a challange to keep up with the CycloneDX releases.

@thompson-tomo
Copy link
Contributor Author

thompson-tomo commented Aug 28, 2024

Ok, how about this.

Create 2x issues:

We use this issue to track splitting the package which we can discuss how to achieve ie should it be dedicated classes or custom attributes.

Am aware that the biggest Blocker is someone to do the work, as such I am currently looking for something to work on next week as I have some days free in my schedule for OSS work and at the same time it helps me progress with #323

@mtsfoni
Copy link
Contributor

mtsfoni commented Aug 28, 2024

Sounds good to me.
@andreas-hilti What do you think? Do you see any major roadblocks in moving to in memory clone and comparision? I remember we discussed that topic in june in the frame of another issue.

@andreas-hilti
Copy link
Contributor

andreas-hilti commented Sep 7, 2024

@mtsfoni I don't see major roadblocks in this respect. However, the aspect that you brought in, namely the work capacity, is actually a big issue. For me, this implies whatever you plan to implement needs to be very lean and robust (i.e. it should not introduce other issues or major additional testing needs). We don't want this to be burden in terms of maintainability.

For the future plans, I don't like the idea too much of wrapping the serialization attributes yet into another layer. IMO already the xml and json serialization that we use are not that well documented. However, at least people that are familiar with it can easily understand the serialization mechanism used by the cyclone-dotnet-library. If we were to wrap it (apart from the fact that I don't believe it is so easy), then only people diving deep into the code will be able to understand the meaning and how to extend it. In addition, if we need a special attribute, this would lead to a large machinery of wrapping...

What one could maybe do is to provide special builds, for which say all the protobuf attributes are defined as dummy attributes such that you can skip the protobuf dependency (however, I'm not yet really convinced by this idea).

As a side-note, I think the term in-memory clone/comparison is actually misleading. Already now, we don't write files (i.e. it is in memory), the comparsion/clone is just not done directly on the objects (and in particular for the protobuf serialization, I would argue that the overhead is not that crucial).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants