Skip to content
This repository has been archived by the owner on Sep 22, 2019. It is now read-only.

Synthesis guide

chinchliff edited this page Apr 26, 2013 · 17 revisions

Tree synthesis methods

There are many ways to synthesize the tree. Here is a guide to the code that allows for flexible construction of synthetic trees. All classes discussed here are in the opentree.synthesis package.

Example

First, here is an example of use (the draft synthesis method modified from GraphExplorer.java):

To define a synthesis protocol, first we initialize a RelationshipEvaluator object:

    RelationshipEvaluator draftSynthesisMethod = new RelationshipEvaluator();

To set filtering criteria, we can initialize a RelationshipFilter object, and assign individual filtering criteria to it. Here we define a filter that will only accept relationships from studies after the year 1999. Source property filters must be supplied with an the node index for the source metadata nodes, from which they will extract the metadata to filter upon. Finally, we assign the filtering method to the synthesis protocol.

    RelationshipFilter rf = new RelationshipFilter();

    rf.addCriterion(new SourcePropertyFilterCriterion(
        SourceProperty.YEAR, FilterComparisonMethod.GREATEROREQUAL,
        new TestValue(2000), sourceMetaIndex));

    draftSynthesisMethod.setFilter(rs);

To set ranking criteria, we first initialize a RelationshipRanker object, to which we can assign ranking criteria. We will rank first using a priority list, which we need to build. The datatype of the priority list will be inferred by the ranking criterion. The values of the priority list should be of the same type as the property used for ranking, but for now we must store them as a list of <Object>.

    RelationshipRanker rs = new RelationshipRanker();

    ArrayList<Object> sourceIdPriorityList = new ArrayList<Object>();
    for (String sourceId : preferredSourceIds) {
    	sourceIdPriorityList.add((Object) sourceId);
    }

Now we can add the ranking criteria to the ranker. The orderings will be executed in reverse order, so that ranking criteria added first have higher priority. Here we define a criterion that will rank relationships by source property "ot:studyId", according to the specified priority list. We specify the source property used by passing an element of the SourceProperty enum. All available source properties are defined there.

    rs.addCriterion(new SourcePropertyPrioritizedRankingCriterion(
        SourceProperty.STUDYID, sourceIdPriorityList, sourceMetaIndex));

Here we assign a secondary ranking according to the source property "ot:studyYear" (this is probably meaningless in this case, since rels the same study id should have the same year as well), and then assign the entire ranking method to the synthesis protocol.

    rs.addCriterion(new SourcePropertyRankingCriterion(
        SourceProperty.YEAR, RankingOrder.DECREASING, sourceMetaIndex));
    
    draftSynthesisMethod.setRanker(rs);

Finally, we set the conflict resolution criterion for the synthesis protocol.

    RelationshipConflictResolver rcr = new RelationshipConflictResolver(new AcyclicRankPriorityResolution());
    draftSynthesisMethod.setConflictResolver(rcr);

We can now use the RelationshipEvaluator that defines our synthesis method to do a synthesis traversal. Below we supply it to the draftSynthesisRecur method (defined in GraphExplorer.java), which will traverse the graph and store the synthesized relationships. Alternatively we could define other recursive functions to store the synthesized rels in other ways, etc. Refer to GraphExplorer.java for more details.

    // set empty parameters for initial recursion:
    Node originalParent = null;
    
    // since we are saving rels to the graph, we need to have a transaction
    Transaction tx = graphDb.beginTx();
    try {
    	draftSynthesisRecur(startNode, originalParent, draftSynthesisMethod, "drafttree.version.alpha");
    	tx.success();

    } catch (Exception ex) {
    	tx.failure();
    	ex.printStackTrace();

    } finally {
    	tx.finish();
    }

At each step in the recursive process, we call the RelationshipEvaluator.evaluateBestPaths(Node n) method to get the set of optimum paths given the query parameters. The following code snippet is taken from within the draftSynthesisRecur method of GraphExplorer.java.

    // use the RelationshipEvaluator `re` to get the preferred relationships for this node
    Iterable<Relationship> relsToFollow = re.evaluateBestPaths(curNode);
    
    // continue recursion
    for (Relationship rel : relsToFollow) {
        draftSynthesisRecur(rel.getStartNode(), curNode, re, synthTreeName);
    }

More information

Synthesis is performed as a traversal that makes decisions at each node about which relationships to follow from that node to its children. A synthesis query protocol is defined within a RelationshipEvaluator object. There are essentially three elements of a synthesis query, which correspond to specific objects that can be assigned to a RelationshipEvaluator:

  1. A relationship filter (class RelationshipFilter)
  2. A relationship ranker (class RelationshipRanker)
  3. A conflict resolution method (class RelationshipConflictResolver)

Decisions are made using information supplied to each of these elements. The decision making-process occurs at each node in the traversal. First, the filters defined in the supplied RelationshipFilter object are applied to the set of all incoming STREECHILDOF relationships for the given node (these originate from the node's children). Then, the remaining relationships are ranked according to the protocols defined in the supplied RelationshipRanker object, and then some set of these ranked relationship are chosen according to the method defined in the supplied RelationshipConflictResolver object. Any of these objects can be omitted, if so then that step is skipped. For example, if no filter, ranker, or resolver were supplied, then all STREECHILDOF relationships would be included in the final synthesized graph.

Any number of methods could be defined to perform each of the three steps above in various ways. There are (very simple) Java interfaces defined for the filtering and ranking criteria (FilterCriterion and RankingCriterion interfaces respectively), and the conflict resolution method (ConflictResolutionMethod). New methods of filtering, ranking, and conflict resolution must implement the appropriate interface.

Available synthesis methods

Currently, there is one available filtering method (SourcePropertyFilterCriterion) that allows relationships to be filtered based on the comparison of their source properties to some specified TestValue using a comparison method defined in the FilterComparisonType enum (available methods are >, <, <=, >=, ==, and !=).

There are two available ranking methods. SourcePropertyRankingCriterion allows ranking based on source properties in one of the regular orders defined in the RankingOrder enum. SourcePropertyPrioritizedRankingCriterion allows ranking based on source properties using an arbitrary order that is pre-defined, and provided as a List<Object> object defining the order of priority (elements closer to the start of the list are higher priority). Filtering and ranking criteria can be combined. See examples above for how to use these classes to define a synthesis query.

There is currently one available conflict-resolution method: AcyclicRankPriorityResolution (see example above). This method uses the order-preference defined by the RelationshipRanker to prefer paths and guarantees a fully acyclic result. Currently, conflict resolution methods cannot be combined in the same way that filtering and ranking methods can.

Future work

IF this organization and approach is compatible with the intended future directions of other synthesis methods (e.g. the synthesis library), several additional methods that should be included are:

  • Conflict resolution using Stephen's branch and bound approach to maximizing the number of included taxa
  • Node-based filtering
  • More specific relationship ranking (e.g. if condition A is met, then do ranking X, otherwise do ranking Y).