-
Notifications
You must be signed in to change notification settings - Fork 6
Synthesis guide
There are many ways to synthesize the tree. Here is a guide to the code that allows for flexible construction of synthetic trees. All classes discussed here are in the opentree.synthesis package.
First, here is an example of use (the draft synthesis method modified from GraphExplorer.java):
To define a synthesis protocol, first we initialize a RelationshipEvaluator
object:
RelationshipEvaluator draftSynthesisMethod = new RelationshipEvaluator();
To set filtering criteria, we can initialize a RelationshipFilter
object, and assign individual filtering criteria to it. Here we define a filter that will only accept relationships from studies after the year 1999. Source property filters must be supplied with an the node index for the source metadata nodes, from which they will extract the metadata to filter upon. Finally, we assign the filtering method to the synthesis protocol.
RelationshipFilter rf = new RelationshipFilter();
rf.addCriterion(new SourcePropertyFilterCriterion(
SourceProperty.YEAR, FilterComparisonMethod.GREATEROREQUAL,
new TestValue(2000), sourceMetaIndex));
draftSynthesisMethod.setFilter(rs);
To set ranking criteria, we first initialize a RelationshipRanker
object, to which we can assign ranking criteria. We will rank first using a priority list, which we need to build. The datatype of the priority list will be inferred by the ranking criterion. The values of the priority list should be of the same type as the property used for ranking, but for now we must store them as a list of <Object>
.
RelationshipRanker rs = new RelationshipRanker();
ArrayList<Object> sourceIdPriorityList = new ArrayList<Object>();
for (String sourceId : preferredSourceIds) {
sourceIdPriorityList.add((Object) sourceId);
}
Now we can add the ranking criteria to the ranker. The orderings will be executed in reverse order, so that ranking criteria added first have higher priority. Here we define a criterion that will rank relationships by source property "ot:studyId", according to the specified priority list. We specify the source property used by passing an element of the SourceProperty
enum. All available source properties are defined there.
rs.addCriterion(new SourcePropertyPrioritizedRankingCriterion(
SourceProperty.STUDYID, sourceIdPriorityList, sourceMetaIndex));
Here we assign a secondary ranking according to the source property "ot:studyYear" (this is probably meaningless in this case, since rels the same study id should have the same year as well), and then assign the entire ranking method to the synthesis protocol.
rs.addCriterion(new SourcePropertyRankingCriterion(
SourceProperty.YEAR, RankingOrder.DECREASING, sourceMetaIndex));
draftSynthesisMethod.setRanker(rs);
Finally, we set the conflict resolution criterion for the synthesis protocol.
RelationshipConflictResolver rcr = new RelationshipConflictResolver(new AcyclicRankPriorityResolution());
draftSynthesisMethod.setConflictResolver(rcr);
We can now use the RelationshipEvaluator
that defines our synthesis method to do a synthesis traversal. Below we supply it to the draftSynthesisRecur
method (defined in GraphExplorer.java), which will traverse the graph and store the synthesized relationships. Alternatively we could define other recursive functions to store the synthesized rels in other ways, etc. Refer to GraphExplorer.java for more details.
// set empty parameters for initial recursion:
Node originalParent = null;
// since we are saving rels to the graph, we need to have a transaction
Transaction tx = graphDb.beginTx();
try {
draftSynthesisRecur(startNode, originalParent, draftSynthesisMethod, "drafttree.version.alpha");
tx.success();
} catch (Exception ex) {
tx.failure();
ex.printStackTrace();
} finally {
tx.finish();
}
At each step in the recursive process, we call the RelationshipEvaluator.evaluateBestPaths(Node n)
method to get the set of optimum paths given the query parameters. The following code snippet is taken from within the draftSynthesisRecur
method of GraphExplorer.java.
// use the RelationshipEvaluator `re` to get the preferred relationships for this node
Iterable<Relationship> relsToFollow = re.evaluateBestPaths(curNode);
// continue recursion
for (Relationship rel : relsToFollow) {
draftSynthesisRecur(rel.getStartNode(), curNode, re, synthTreeName);
}
Synthesis is performed as a traversal that makes decisions at each node about which relationships to follow from that node to its children. A synthesis query protocol is defined within a RelationshipEvaluator
object. There are essentially three elements of a synthesis query, which correspond to specific objects that can be assigned to a RelationshipEvaluator
:
- A relationship filter (class
RelationshipFilter
) - A relationship ranker (class
RelationshipRanker
) - A conflict resolution method (class
RelationshipConflictResolver
)
Decisions are made using information supplied to each of these elements. The decision making-process occurs at each node in the traversal. First, the filters defined in the supplied RelationshipFilter
object are applied to the set of all incoming STREECHILDOF
relationships for the given node (these originate from the node's children). Then, the remaining relationships are ranked according to the protocols defined in the supplied RelationshipRanker
object, and then some set of these ranked relationship are chosen according to the method defined in the supplied RelationshipConflictResolver
object. Any of these objects can be omitted, if so then that step is skipped. For example, if no filter, ranker, or resolver were supplied, then all STREECHILDOF
relationships would be included in the final synthesized graph.
Any number of methods could be defined to perform each of the three steps above in various ways. There are (very simple) Java interfaces defined for the filtering and ranking criteria (FilterCriterion
and RankingCriterion
interfaces respectively), and the conflict resolution method (ConflictResolutionMethod
). New methods of filtering, ranking, and conflict resolution must implement the appropriate interface.
Currently, there is one available filtering method (SourcePropertyFilterCriterion
) that allows relationships to be filtered based on the comparison of their source properties to some specified TestValue
using a comparison method defined in the FilterComparisonType
enum (available methods are >, <, <=, >=, ==, and !=).
Available ranking methods allow ranking based on source properties in a regular order (SourcePropertyRankingCriterion
, which accepts an ordering method defined in the RankingOrder
enum), or ranking based on source properties using an arbitrary order that is pre-defined (SourcePropertyPrioritizedRankingCriterion
, which accepts a priority list). Filtering and ranking criteria can be combined. See examples above for how to use these classes to define a synthesis query.
There is currently one available conflict-resolution method: AcyclicRankPriorityResolution
(see example above). This method uses the order-preference defined by the RelationshipRanker
to prefer paths and guarantees a fully acyclic result. Currently, conflict resolution methods cannot be combined in the same way that filtering and ranking methods can.
IF this organization and approach is compatible with the intended future directions of other synthesis methods (e.g. the synthesis library), several additional methods that should be included are:
- Conflict resolution using Stephen's branch and bound approach to maximizing the number of included taxa
- Node-based filtering
- More specific relationship ranking (e.g. if condition A is met, then do ranking X, otherwise do ranking Y).