Document graph cache and bindings for JG clusters
[skip ci]

Issues: JanusGraph#938
Signed-off-by: David Pitera <[email protected]>
dpitera committed Mar 9, 2018
1 parent 894488c commit c06f7f3
Showing 1 changed file with 71 additions and 0 deletions.
71 changes: 71 additions & 0 deletions docs/multinodejanusgraphcluster.adoc
@@ -0,0 +1,71 @@
[[things-to-consider-in-a-multi-node-janusgraph-cluster]]
== Things to Consider in a Multi-Node JanusGraph Cluster

JanusGraph is a distributed graph database, which means it can be set up in a multi-node cluster. However, working in such an environment raises some important considerations. If configured properly, JanusGraph handles some of these considerations for you.

[[dynamic-graphs]]
=== Dynamic Graphs

JanusGraph supports <<configuredgraphfactory.adoc#configuredgraphfactory,dynamically creating graphs>>. This is a deviation from the way in which standard GremlinServer implementations allow one to access a graph. Traditionally, users create bindings to graphs at server start by configuring the `gremlin-server.yaml` file accordingly. For example, if the `graphs` section of your YAML file looks like this:

[source, properties]
----
graphs: {
  graph1: conf/graph1.properties,
  graph2: conf/graph2.properties
}
----

then you access your graphs on the GremlinServer through these bindings: the string `graph1` is bound to the graph opened on the server from its supplied properties file, and the same holds true for `graph2`.

However, if we use the `ConfiguredGraphFactory` to dynamically create graphs, then those graphs are managed by the <<configuredgraphfactory.adoc#JanusGraphmanager,JanusGraphManager>> and the graph configurations are managed by the <<configuredgraphfactory.adoc#iconfigurationmanagementgraph,ConfigurationManagementGraph>>. This is especially useful for two reasons: it allows you to define graph configurations after the server has started, and it allows the graph configurations to be persisted and distributed across your JanusGraph cluster.

To properly use the `ConfiguredGraphFactory`, you must configure every GremlinServer in your cluster to use the `JanusGraphManager` and the `ConfigurationManagementGraph`. This procedure is explained in detail <<configuredgraphfactory.adoc#configuring-JanusGraph-server-for-configuredgraphfactory,here>>.

[[ensuring-graph-references-are-up-to-date-across-the-cluster]]
==== Ensuring Graph References Are Up-To-Date Across All JanusGraph Nodes

If you configure all your JanusGraph servers to use the <<configuredgraphfactory.adoc#configuring-JanusGraph-server-for-configuredgraphfactory,ConfiguredGraphFactory>>, JanusGraph will ensure all graph representations are up-to-date across all JanusGraph nodes in your cluster.

For example, if you update or delete the configuration of a graph on one JanusGraph node, then that graph must be evicted from the cache on _every JanusGraph node in the cluster_. Otherwise, graph representations may become inconsistent across your cluster. JanusGraph handles this eviction automatically using a messaging log queue through the backend system that the graph in question is configured to use.

If one of your servers is configured incorrectly, it may fail to remove the graph from its cache.
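
The eviction flow described above can be illustrated with a toy model. This is a minimal sketch in plain Python, not JanusGraph code: the `Node` and `EvictionLog` classes, the `reads_log` flag, and the configuration values are all hypothetical names introduced for illustration, and the real mechanism runs through the storage backend's log rather than an in-memory list.

```python
class Node:
    """Toy stand-in for a JanusGraph node with a local cache of opened graphs."""

    def __init__(self, name, reads_log=True):
        self.name = name
        self.reads_log = reads_log  # False models an incorrectly configured server
        self.cache = {}

    def evict(self, graph_name):
        # Drop the cached reference so the next access reopens the graph
        # from its current persisted configuration.
        self.cache.pop(graph_name, None)


class EvictionLog:
    """Toy stand-in for the backend-hosted messaging log (hypothetical)."""

    def __init__(self):
        self.nodes = []

    def register(self, node):
        self.nodes.append(node)

    def publish(self, graph_name):
        # Only nodes that actually read the log receive the eviction notice.
        for node in self.nodes:
            if node.reads_log:
                node.evict(graph_name)


def update_configuration(configs, log, graph_name, new_config):
    """Persist the new configuration, then announce the eviction cluster-wide."""
    configs[graph_name] = new_config
    log.publish(graph_name)


configs = {"graph1": {"storage.backend": "cql"}}
log = EvictionLog()
nodes = [Node("node-a"), Node("node-b"), Node("node-c", reads_log=False)]
for node in nodes:
    log.register(node)
    node.cache["graph1"] = dict(configs["graph1"])  # every node has opened graph1

update_configuration(configs, log, "graph1", {"storage.backend": "hbase"})

# Names of nodes still holding the outdated graph reference.
stale = [node.name for node in nodes if "graph1" in node.cache]
```

In this model only the misconfigured `node-c` remains in `stale`, which mirrors the inconsistency the text warns about.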

[IMPORTANT]
====
Updates to your <<configuredgraphfactory.adoc#template-configuration,TemplateConfiguration>> do not update graphs or graph configurations that were previously created from that template configuration. If you want to update the individual graph configurations, you must do so using the <<configuredgraphfactory.adoc#updating-configuration,available update APIs>>. These update APIs will _then_ result in the graph cache eviction across all JanusGraph nodes in your cluster.
====
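
The behaviour in the note above can be pictured as copy-at-creation. The following is a minimal sketch in plain Python under that assumption, not the JanusGraph API: `template`, `graph_configs`, and `create_from_template` are hypothetical names, and the configuration keys are only examples.

```python
import copy

# Hypothetical template configuration and per-graph configuration store.
template = {"storage.backend": "cql", "cache.db-cache": False}
graph_configs = {}


def create_from_template(graph_name):
    """Create a graph configuration by copying the template as it is right now."""
    config = copy.deepcopy(template)
    config["graph.graphname"] = graph_name
    graph_configs[graph_name] = config
    return config


create_from_template("graph1")

# A later change to the template only affects graphs created afterwards.
template["cache.db-cache"] = True
create_from_template("graph2")

graph1_cache_setting = graph_configs["graph1"]["cache.db-cache"]
graph2_cache_setting = graph_configs["graph2"]["cache.db-cache"]
```

Here `graph1` keeps the value it was created with, while `graph2` picks up the changed template; changing `graph1` would require an explicit update of its own configuration.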

[[accessing-graph-and-traversal-objects-through-bindings]]
==== Accessing Graph and Traversal Objects Through String Bindings Across Your Cluster

JanusGraph can bind dynamically created graphs and their traversal references to `<graph.graphname>` and `<graph.graphname>_traversal`, respectively, across all JanusGraph nodes in your cluster, with a lag of at most 20 seconds before the binding takes effect on any node in the cluster. Read more about this <<configuredgraphfactory.adoc#graph-and-traversal-bindings, here>>.

JanusGraph accomplishes this by having each node in your cluster poll the `ConfigurationManagementGraph` for all graphs for which you have created configurations. For each such graph, the `JanusGraphManager` then opens the graph with its persisted configuration, stores it in its graph cache, and binds `<graph.graphname>` to the graph reference and `<graph.graphname>_traversal` to the graph's traversal reference on the `GremlinExecutor`.
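
One polling pass of this kind can be sketched as follows. This is an illustrative model in plain Python, not the `JanusGraphManager` implementation: `poll_and_bind`, `open_graph`, and the dictionaries standing in for the configuration store, graph cache, and executor bindings are all hypothetical.

```python
def open_graph(config):
    """Toy stand-in for opening a graph from its persisted configuration."""
    graph = {"config": config}
    graph["traversal"] = ("traversal-of", config["graph.graphname"])
    return graph


def poll_and_bind(configured_graphs, graph_cache, bindings):
    """One polling pass: open any uncached graph, then (re)bind both names."""
    for name, config in configured_graphs.items():
        if name not in graph_cache:
            graph_cache[name] = open_graph(config)
        graph = graph_cache[name]
        bindings[name] = graph                              # <graph.graphname>
        bindings[f"{name}_traversal"] = graph["traversal"]  # <graph.graphname>_traversal


# Hypothetical persisted configurations, as one node would see them.
configured_graphs = {
    "graph1": {"graph.graphname": "graph1"},
    "graph2": {"graph.graphname": "graph2"},
}
graph_cache = {}
bindings = {}

poll_and_bind(configured_graphs, graph_cache, bindings)
bound_names = sorted(bindings)
```

A real node would repeat such a pass on a timer, which is where the lag before a new binding becomes visible comes from.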

This allows you to access a dynamically created graph and its traversal reference by their string bindings on every node in your JanusGraph cluster, which is particularly important when working with GremlinServer clients.

[[set-up]]
===== Set Up

To set up your cluster to bind dynamically created graphs and their traversal references, you must:

1. Configure each node to use the <<configuredgraphfactory.adoc#configuring-JanusGraph-server-for-configuredgraphfactory,ConfiguredGraphFactory>>.

2. Configure each node to use a `JanusGraphChannelizer`, which injects lower-level GremlinServer components, like the `GremlinExecutor`, into the JanusGraph project, giving us greater control of the GremlinServer.

To configure each node to use a `JanusGraphChannelizer`, update `gremlin-server.yaml` accordingly:

[source, properties]
----
channelizer: org.janusgraph.channelizers.JanusGraphWebSocketChannelizer
----

There are a few channelizers you can choose from:

1. `org.janusgraph.channelizers.JanusGraphWebSocketChannelizer`
2. `org.janusgraph.channelizers.JanusGraphHttpChannelizer`
3. `org.janusgraph.channelizers.JanusGraphNioChannelizer`
4. `org.janusgraph.channelizers.JanusGraphWsAndHttpChannelizer`

All of the channelizers share the exact same functionality as their TinkerPop counterparts.
