Overhaul data model of dependency graphs for better query performance and relationship type support #3452
Labels: cdx-1.6 (Related to CycloneDX specification v1.6), enhancement (New feature or request), p2 (Non-critical bugs, and features that help organizations to identify and reduce risk), performance, size/L (High effort), spike / research
Current Behavior
How Dependency Graphs Work Today
Both the PROJECT and COMPONENT objects have a DIRECT_DEPENDENCIES column, which contains a JSON array of serialized ComponentIdentity objects.
This roughly resembles how dependencies are represented in CycloneDX v1.5 and earlier.
While this approach works, it has a few downsides:
- Traversing the graph relies on LIKE conditions such as "DIRECT_DEPENDENCIES" LIKE ('%' || :childComponentUuid || '%').
- [...] DIRECT_DEPENDENCIES column [...]
- A LIKE condition with wildcards on both ends requires a special index, such as GIN in PostgreSQL, which in turn requires an extension that is not enabled by default.
To give an example of what traversal queries look like in the current data model, here's one from Hyades that identifies whether a specific component (identified by the parameter :leadComponentUuid) is introduced through another component with the name foo:
Example Query
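As a rough illustration of this pattern (not the Hyades query itself), the following sketch walks a graph upwards using the LIKE condition quoted above. It uses an in-memory SQLite database purely to stay self-contained. The column set, the shape of the serialized JSON, the UUID values, and the helper names `add`, `parents_of` and `is_introduced_through` are all assumptions made for the example.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE "COMPONENT" '
    '("UUID" TEXT PRIMARY KEY, "NAME" TEXT, "DIRECT_DEPENDENCIES" TEXT)'
)


def add(uuid, name, deps):
    # Store direct dependencies as a JSON array in a text column.
    # The {"uuid": ...} object shape is an assumption for illustration.
    payload = json.dumps([{"uuid": d} for d in deps])
    conn.execute('INSERT INTO "COMPONENT" VALUES (?, ?, ?)', (uuid, name, payload))


add("uuid-foo", "foo", ["uuid-bar"])
add("uuid-bar", "bar", ["uuid-leaf"])
add("uuid-leaf", "leaf", [])
add("uuid-other", "other", [])


def parents_of(child_uuid):
    # Substring match against the serialized JSON: wildcards on both ends,
    # so a regular B-tree index cannot be used for this lookup.
    sql = """
        SELECT "UUID", "NAME"
          FROM "COMPONENT"
         WHERE "DIRECT_DEPENDENCIES" LIKE ('%' || :childComponentUuid || '%')
    """
    return conn.execute(sql, {"childComponentUuid": child_uuid}).fetchall()


def is_introduced_through(leaf_uuid, ancestor_name):
    # Walk upwards one LIKE query at a time until a matching ancestor is found.
    seen, pending = set(), [leaf_uuid]
    while pending:
        current = pending.pop()
        for parent_uuid, parent_name in parents_of(current):
            if parent_name == ancestor_name:
                return True
            if parent_uuid not in seen:
                seen.add(parent_uuid)
                pending.append(parent_uuid)
    return False


print(is_introduced_through("uuid-leaf", "foo"))
```

Every hop requires another scan of the serialized column, and nothing prevents the JSON from referring to a component that no longer exists.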
CycloneDX v1.6
CycloneDX v1.6 will introduce the provides attribute to its Dependency model (source):
In order to support this, and potential future additions to relationships, we need to rework how we handle dependency graphs.
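To make the impact concrete, here is a small sketch of how dependency entries carrying both dependsOn and provides could be flattened into typed edges. The bom-ref values and the relationship names DEPENDS_ON and PROVIDES are made up for the example, and `to_edges` is a hypothetical helper, not part of any existing codebase.

```python
# Dependency entries in the shape used by CycloneDX v1.6 BOMs.
# The refs below are illustrative placeholders.
bom_dependencies = [
    {"ref": "app", "dependsOn": ["crypto-lib"]},
    {"ref": "crypto-lib", "dependsOn": ["base-lib"], "provides": ["tls-spec"]},
]


def to_edges(dependencies):
    # Emit one (source, relationship type, target) tuple per listed ref.
    edges = []
    for entry in dependencies:
        for key, rel in (("dependsOn", "DEPENDS_ON"), ("provides", "PROVIDES")):
            for target in entry.get(key, []):
                edges.append((entry["ref"], rel, target))
    return edges


for edge in to_edges(bom_dependencies):
    print(edge)
```

A single DIRECT_DEPENDENCIES array has no place for the relationship type in the middle of each tuple, which is why a second kind of edge does not fit the current model well.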
Proposed Behavior
I propose to revisit how we store and traverse dependency graphs.
Ideally, the graph structure would live in separate tables, and would be able to refer to multiple object types, such as Component, Service, Data, etc.
Instead of JSON, more efficient data types should be used: types that are trivial to index and that allow referential integrity to be verified.
Just to put something out there:
Or a less strict variant using JSON to store properties:
However, JSON support and the ability to index such columns vary wildly between RDBMSes.
We need to ensure that whatever data model we end up using works well with the queries we're going to execute.
In the best case, the new graph structure would also allow relationships between projects to be represented.
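For discussion, a minimal sketch of what a separate relationship table with foreign keys and a recursive traversal might look like. This is not the schema proposed in this issue: the table names `node` and `relationship`, their columns, the type values, and the `ancestors` helper are all hypothetical, and SQLite is used only to keep the example runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE node (
    id   INTEGER PRIMARY KEY,
    type TEXT NOT NULL CHECK (type IN ('PROJECT', 'COMPONENT', 'SERVICE')),
    name TEXT NOT NULL
);
CREATE TABLE relationship (
    source_id INTEGER NOT NULL REFERENCES node (id) ON DELETE CASCADE,
    target_id INTEGER NOT NULL REFERENCES node (id) ON DELETE CASCADE,
    type      TEXT NOT NULL CHECK (type IN ('DEPENDS_ON', 'PROVIDES')),
    PRIMARY KEY (source_id, target_id, type)
);
-- Plain B-tree index that serves child-to-parent lookups.
CREATE INDEX relationship_target_idx ON relationship (target_id, type);
""")

conn.executemany("INSERT INTO node VALUES (?, ?, ?)", [
    (1, "PROJECT", "acme-app"),
    (2, "COMPONENT", "foo"),
    (3, "COMPONENT", "bar"),
    (4, "COMPONENT", "leaf"),
])
conn.executemany("INSERT INTO relationship VALUES (?, ?, ?)", [
    (1, 2, "DEPENDS_ON"),
    (2, 3, "DEPENDS_ON"),
    (3, 4, "DEPENDS_ON"),
])


def ancestors(node_id):
    # Collect every node that transitively depends on node_id in one query.
    sql = """
        WITH RECURSIVE ancestor (id) AS (
            SELECT source_id FROM relationship
             WHERE target_id = :leaf AND type = 'DEPENDS_ON'
            UNION
            SELECT r.source_id FROM relationship r
              JOIN ancestor a ON r.target_id = a.id
             WHERE r.type = 'DEPENDS_ON'
        )
        SELECT n.name FROM ancestor a JOIN node n ON n.id = a.id ORDER BY n.name
    """
    return [name for (name,) in conn.execute(sql, {"leaf": node_id})]


print(ancestors(4))
```

In this sketch, equality joins replace the substring matching, the foreign keys reject edges that point at missing rows, and projects can take part in relationships like any other node type. Whether a single node table or per-type tables works better is an open design question.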