Diffing and Hashing – guide for developers
This page gives a more in-depth technical explanation about some diffing methods, and also serves as a guide for developers to build functionality on top of existing diffing code.
See the Diffing and the Hash wiki pages for a quick-start guide.
- Developing Toolkit-specific diffing methods
- IDiffing() method: internal workings
- Other Diffing methods inner workings
- Customising the Diffing output: ComparisonInclusion() extension method
- Customising the Hash: HashString() extension method
- Toolkit-specific ComparisonConfig options
- Testing and profiling
The IDiffing() method is designed to be a "universal" entry point for users wanting to diff their objects; for this reason, it has an automated mechanism to call any Toolkit-specific diffing method that is compatible with the input objects. This works similarly to the Extension Method discovery pattern that is often leveraged in many BHoM methods.
A Toolkit-specific Diffing method is defined as a method:
- that is public;
- whose name ends with Diffing;
- that has the following inputs:
  - a first IEnumerable<object> for the past objects;
  - a second IEnumerable<object> for the following objects;
  - any number of optional parameters;
  - a final DiffingConfig parameter (that should default to null, and be auto-initialised if null within the implementation).
Any method that respects these criteria is discovered and stored during assembly loading through this method. It gets invoked by the IDiffing() as explained here.
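The criteria above can be sketched as a reflection check. This is a minimal illustration, not the actual BHoM discovery code: DiffingConfig and Diff are empty stand-ins declared only so the sketch compiles on its own, and ExampleToolkitCompute, ExampleDiffing and DiffingMethodFinder are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Stand-ins for the BHoM types (assumption: declared here only to keep the sketch self-contained).
public class DiffingConfig { }
public class Diff { }

public static class ExampleToolkitCompute
{
    // A hypothetical Toolkit-specific diffing method that satisfies all the criteria:
    // public, name ending in "Diffing", two IEnumerable<object> inputs,
    // optional parameters in between, and a final DiffingConfig defaulting to null.
    public static Diff ExampleDiffing(IEnumerable<object> pastObjects, IEnumerable<object> followingObjects,
        bool someOption = false, DiffingConfig diffingConfig = null)
    {
        // Auto-initialise the config if it was not provided.
        diffingConfig = diffingConfig ?? new DiffingConfig();

        return new Diff();
    }
}

public static class DiffingMethodFinder
{
    // Returns true if the method matches the criteria listed above.
    public static bool IsToolkitDiffingMethod(MethodInfo method)
    {
        ParameterInfo[] p = method.GetParameters();

        return method.IsPublic
            && method.Name.EndsWith("Diffing")
            && p.Length >= 3
            && p[0].ParameterType == typeof(IEnumerable<object>)
            && p[1].ParameterType == typeof(IEnumerable<object>)
            && p.Skip(2).Take(p.Length - 3).All(x => x.IsOptional) // everything in between must be optional
            && p[p.Length - 1].ParameterType == typeof(DiffingConfig);
    }

    // Collects all matching public static methods declared in an assembly.
    public static List<MethodInfo> Find(Assembly assembly)
    {
        return assembly.GetTypes()
            .SelectMany(t => t.GetMethods(BindingFlags.Public | BindingFlags.Static))
            .Where(IsToolkitDiffingMethod)
            .ToList();
    }
}
```

Calling DiffingMethodFinder.Find(typeof(ExampleToolkitCompute).Assembly) would return ExampleDiffing among its results. How the real implementation stores and looks up the discovered methods is not covered by this sketch.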
The IDiffing method does a series of automated steps to ensure that the most appropriate diffing method gets invoked for the input objects.
The IDiffing first looks for any Toolkit-specific diffing method that is compatible with the input objects (relevant code here). This is done by checking if there is an IPersistentAdapterId stored on the objects; if there is, the namespace to which that IPersistentAdapterId object belongs is taken as the source namespace to get a compatible Toolkit-specific diffing method. For example, if the input objects own a RevitIdentifier fragment (which implements IPersistentAdapterId), then the namespace BH.oM.Adapters.Revit.Parameters is taken. This namespace, which is an .oM one, is "modified" to an .Engine one, so the related Toolkit Engine is searched for a diffing method.
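As a rough illustration of the namespace switch described above, the sketch below swaps the oM segment of a namespace for Engine. The NamespaceMapper class is hypothetical, and the segment replacement is an assumption about what "modified" means here; the actual IDiffing code may derive or match the Engine namespace differently (for example by prefix).

```csharp
public static class NamespaceMapper
{
    // Assumption: the ".oM" -> ".Engine" switch is a plain replacement of the "oM" namespace segment.
    public static string ToEngineNamespace(string omNamespace)
    {
        string[] segments = omNamespace.Split('.');

        for (int i = 0; i < segments.Length; i++)
        {
            if (segments[i] == "oM")
            {
                segments[i] = "Engine";
                break; // only the first "oM" segment is replaced
            }
        }

        return string.Join(".", segments);
    }
}
```

With this sketch, the input namespace would typically come from fragment.GetType().Namespace, and NamespaceMapper.ToEngineNamespace("BH.oM.Adapters.Revit.Parameters") returns "BH.Engine.Adapters.Revit.Parameters".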
If a Toolkit-specific diffing method match is found, it is then invoked. For example, this is how RevitDiffing() gets called by the IDiffing.
Note that only the first matching method gets invoked. This is because we only allow one Toolkit-specific diffing method. If you overload your Toolkit-specific Diffing method (for example, because you want to give users multiple choices when they invoke your Toolkit-specific diffing method directly), you must ensure that all overloads are equally valid and that any of them can be picked by the IDiffing with the same results (as happens for RevitDiffing(): all methods end up calling a single, private Diffing method, and additional inputs are optional, so they all behave the same if called by the IDiffing).
If the previous step does not find any Toolkit-specific diffing method compatible with the input objects, then a variety of steps are taken to try possible diffing methods. In a nutshell, a series of checks are done on the input objects to see what diffing method is most suitable. This is better described in the following diagram. For more details on each individual diffing method, see https://github.com/BHoM/documentation/wiki/Diffing%3A-tracking-changes-in-your-BHoM-objects/#other-diffing-methods.
In addition to the main Diffing method IDiffing(), there are several other methods that can be used to perform Diffing. These are a bit more advanced and should be used only for specific cases. All diffing methods can be found in the Compute folder of Diffing_Engine.
Most diffing methods simply rely on an ID that is associated with the input objects, or on a similar way to determine which object should be compared to which. Once a match is found, the two matched objects (one from the pastObjects set and one from the followingObjects set) are sent to the ObjectDifferences() method, as illustrated by the following diagram.
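The matching step can be pictured as a dictionary lookup by ID. The sketch below is a simplified illustration under stated assumptions: IdMatching and MatchById are hypothetical names, the ID is supplied through a delegate rather than read from any specific BHoM property, and IDs are assumed to be unique within the past set.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IdMatching
{
    // Pairs each following object with the past object that has the same ID.
    // Objects without a counterpart are simply not returned (they would be "new" or "old" objects).
    // Note: ToDictionary throws if two past objects share the same ID.
    public static List<(T Past, T Following)> MatchById<T>(
        IEnumerable<T> pastObjects, IEnumerable<T> followingObjects, Func<T, string> id)
    {
        Dictionary<string, T> pastById = pastObjects.ToDictionary(id);
        List<(T Past, T Following)> pairs = new List<(T Past, T Following)>();

        foreach (T following in followingObjects)
        {
            T past;
            if (pastById.TryGetValue(id(following), out past))
                pairs.Add((past, following)); // each pair would then be compared property by property
        }

        return pairs;
    }
}
```

In this picture, each returned pair is what would be handed over to ObjectDifferences() for the property-level comparison.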
This diagram also illustrates that only the DiffWithHash() method does not rely on the ObjectDifferences() method. The DiffWithHash() is a rather simple and limited method, in that it cannot identify Modified objects but only new/old ones, and it is described here.
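To see why a purely hash-based diff cannot report Modified objects, consider the following minimal sketch. It is not the DiffWithHash() implementation: HashDiffSketch and DiffByHash are hypothetical names, and the hash is passed in as a delegate instead of being computed by the BHoM Hash() method.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class HashDiffSketch
{
    // Matches objects purely by their hash string.
    // An object whose hash is found in only one of the two sets is reported as added or removed.
    public static (List<T> Added, List<T> Removed) DiffByHash<T>(
        IEnumerable<T> pastObjects, IEnumerable<T> followingObjects, Func<T, string> hash)
    {
        List<T> past = pastObjects.ToList();
        List<T> following = followingObjects.ToList();

        HashSet<string> pastHashes = new HashSet<string>(past.Select(hash));
        HashSet<string> followingHashes = new HashSet<string>(following.Select(hash));

        List<T> added = following.Where(o => !pastHashes.Contains(hash(o))).ToList();
        List<T> removed = past.Where(o => !followingHashes.Contains(hash(o))).ToList();

        return (added, removed);
    }
}
```

Because a changed object produces a different hash, it shows up as one removed object plus one added object; without an ID there is nothing to tell the method that the two are the same object in different states.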
As shown above, the method that does most of the work in diffing is the BH.Engine.Diffing.Query.ObjectDifferences() method.
This is the method that has the task of finding all the differences between two input objects. This method currently leverages an open-source, free library called CompareNETObjects by Kellerman Software. It maps our ComparisonConfig options to the equivalent class in the CompareNETObjects library, and then executes the comparison using it.
Because not all of the options available in the ComparisonConfig are mappable to Kellerman's, ObjectDifferences() has to adopt a workaround. For example, our numerical approximation options are not directly compatible.
The general compatibility strategy is:
- if an option is mappable/convertible, map/convert it from our ComparisonConfig to Kellerman's CompareLogic object. This is true for most of them.
- if an option is not compatible with Kellerman (like our numerical approximation options), set Kellerman's CompareLogic so it finds all possible differences with regard to that option (like we do for numerical differences), then iterate over the differences found and cull out those that are not relevant (example for the numerical differences).
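The second strategy (find everything, then cull) can be sketched as a filter over the reported differences. This is an illustration only: FoundDifference is a simplified, hypothetical stand-in for a difference reported by the comparison library, and a single tolerance value is an assumed simplification of the numerical approximation options.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified stand-in for a difference reported by the comparison library.
public class FoundDifference
{
    public string PropertyName;
    public object PastValue;
    public object FollowingValue;
}

public static class DifferenceCulling
{
    // Keeps only the differences that are relevant under the given numerical tolerance.
    public static List<FoundDifference> CullNumericalNoise(IEnumerable<FoundDifference> differences, double tolerance)
    {
        return differences.Where(d => !IsWithinTolerance(d, tolerance)).ToList();
    }

    // A difference is culled only if both values are doubles and they differ by no more than the tolerance.
    private static bool IsWithinTolerance(FoundDifference d, double tolerance)
    {
        if (!(d.PastValue is double) || !(d.FollowingValue is double))
            return false;

        return Math.Abs((double)d.PastValue - (double)d.FollowingValue) <= tolerance;
    }
}
```

Non-numerical differences always pass through the filter untouched; only numerical differences that fall within the tolerance are discarded.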
The loop to iterate over the differences found by Kellerman is also useful to further customise the output, as shown by the following section.
In order to customise our diffing output, we want to customise how the ObjectDifferences() method determines the differences between objects.
This is done through a specific ComparisonInclusion() extension method that is invoked when we loop through the differences found by the Kellerman library. This is essentially an application of the Extension Method discovery pattern that is often leveraged in many BHoM methods.
You can implement a ComparisonInclusion() method in your Toolkit to customise how the difference between two specific objects is to be considered by the diffing. This method must have the following inputs, in this order:
- a first object input (which will be the object coming from the pastObjs set);
- a second object input, of the same type as the first object (which will be the object coming from the followingObjs set);
- a string input, which will contain the Full Name of the property difference found by the ObjectDifferences() method;
- a BaseComparisonConfig input, which will be passed in by the ObjectDifferences() method.
The method must return a ComparisonInclusion object, which will contain information on whether the difference should be included or not, and how to display it.
Here is an example of ComparisonInclusion() for RevitParameters:
public static ComparisonInclusion ComparisonInclusion(this RevitParameter parameter1, RevitParameter parameter2, string propertyFullName, BaseComparisonConfig comparisonConfig)
{
    // Initialise the result.
    ComparisonInclusion result = new ComparisonInclusion();

    // Differences in any property of RevitParameters will be displayed like this.
    result.DisplayName = parameter1.Name + " (RevitParameter)";

    // Check if we have a RevitComparisonConfig input.
    RevitComparisonConfig rcc = comparisonConfig as RevitComparisonConfig;

    // Other logic
    ...
}
Note that this method supports Toolkit-specific ComparisonConfig objects, e.g. RevitComparisonConfig. See the section below for more details.
If you want a specific object to be Hashed in a particular way, you can implement a HashString() extension method for that object in your Toolkit. The HashString() method will get invoked when computing the Hash(). This is essentially an application of the Extension Method discovery pattern that is often leveraged in many BHoM methods.
This method must have the following inputs, in this order:
- An object input, which will be the object for which we are calculating the Hash.
- A string input, which will indicate the FullName of the property being analysed by the Hash() method (for example when the input object is a property of another object; this can be useful in certain cases, and if not useful can simply be ignored).
- A BaseComparisonConfig input, which will be passed in by the Hash() method.
Here is an example of HashString() for RevitParameters:
public static string HashString(this RevitParameter revitParameter, string propertyFullName = null, BaseComparisonConfig comparisonConfig = null)
{
    // Null check.
    if (revitParameter == null) return null;

    string hashString = revitParameter.Name + revitParameter.Value;

    // Check if we have a RevitComparisonConfig input.
    RevitComparisonConfig rcc = comparisonConfig as RevitComparisonConfig;

    // Other logic
    ...
}
Note that this method supports Toolkit-specific ComparisonConfig objects, e.g. RevitComparisonConfig. See the section below for more details.
There are cases where you may need more options to further customise the Hash or Diffing process, to refine how they work with your Toolkit's objects.
The "default" ComparisonConfig object gives all the default options, and it inherits from the BaseComparisonConfig abstract class. This abstract class can be extended by "Toolkit-specific" ComparisonConfig objects, so you can include additional options to deal with certain objects in your Toolkit.
See an example with Revit's RevitComparisonConfig.
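The general shape of such an extension is sketched below. All names here are illustrative: BaseComparisonConfig and ComparisonConfig are empty stand-ins for the BHoM classes (the real ones carry the shared options), while ExampleToolkitComparisonConfig, its ParametersToConsider option and ExampleToolkitQuery are hypothetical.

```csharp
using System.Collections.Generic;

// Stand-ins for the BHoM classes (assumption: left empty here to keep the sketch self-contained).
public abstract class BaseComparisonConfig { }
public class ComparisonConfig : BaseComparisonConfig { }

// A hypothetical Toolkit-specific config that adds one extra option on top of the base ones.
public class ExampleToolkitComparisonConfig : BaseComparisonConfig
{
    // If not empty, only the parameters with these names are considered.
    public List<string> ParametersToConsider { get; set; } = new List<string>();
}

public static class ExampleToolkitQuery
{
    // Toolkit code receives a BaseComparisonConfig and checks whether it is the Toolkit-specific one,
    // in the same way the examples above cast to RevitComparisonConfig.
    public static bool IsParameterConsidered(string parameterName, BaseComparisonConfig comparisonConfig)
    {
        ExampleToolkitComparisonConfig tcc = comparisonConfig as ExampleToolkitComparisonConfig;

        // With a default config, or with no restriction set, every parameter is considered.
        if (tcc == null || tcc.ParametersToConsider.Count == 0)
            return true;

        return tcc.ParametersToConsider.Contains(parameterName);
    }
}
```

The "as" cast keeps the Toolkit methods usable with the default config too: when the cast yields null, the method falls back to the default behaviour.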
If you implement your own Toolkit-specific ComparisonConfig object, you will need to implement the functions that deal with it too, which should include at least one of:
- A toolkit-specific Diffing() method (example in Revit), which your users can call independently, or that may be automatically called by the IDiffing method, as shown here.
- A toolkit-specific HashString() method (example in Revit), which will get invoked when computing the Hash().
- Any number of ComparisonInclusion() methods that you might need to customise the diffing output for each object (example in Revit for RevitParameters), as explained here.
We have a DiffingTests repo which contains Unit Tests and profiling functions. These are required given the number of options and use cases that both Diffing and Hashing offer.