Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bind all graphs to gremlin executor context #884

Merged
merged 1 commit into from
Mar 16, 2018

Conversation

dpitera
Copy link
Contributor

@dpitera dpitera commented Jan 11, 2018

Issues: #883

Signed-off-by: David Pitera [email protected]


Thank you for contributing to JanusGraph!

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

For all changes:

  • Is there an issue associated with this PR? Is it referenced in the commit message?
  • Does your PR body contain #xyz where xyz is the issue number you are trying to resolve?
  • Has your PR been rebased against the latest commit within the target branch (typically master)?
  • Is your initial contribution a single, squashed commit?

For code changes:

  • Have you written and/or updated unit tests to verify your changes?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE.txt file, including the main LICENSE.txt file in the root of this repository?
  • If applicable, have you updated the NOTICE.txt file, including the main NOTICE.txt file found in the root of this repository?

For documentation related changes:

  • Have you ensured that format looks appropriate for the output in which it is rendered?
  • If this PR is a documentation-only change, have you added a [skip ci]
    tag to the first line of your commit message to avoid spending CPU cycles in
    Travis CI when no code, tests, or build configuration are modified?

Note:

Please ensure that once the PR is submitted, you check Travis CI for build issues and submit an update to your PR as soon as possible.

@janusgraph-bot janusgraph-bot added the cla: yes This PR is compliant with the CLA label Jan 11, 2018
@dpitera dpitera force-pushed the dynamic-bindings-new branch 3 times, most recently from cfb3eeb to f9511c7 Compare January 16, 2018 16:31
Copy link
Contributor

@sjudeng sjudeng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can these classes be moved to the new janusgraph-server module?

@@ -0,0 +1,27 @@
// Copyright 2017 JanusGraph Authors
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mbrukman For files updated/added in 2018 should copyright be updated to "2018" or "2017-2018"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mbrukman What is the final word here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is a brand new file created in 2018 (which is what it appears to be since it's not branched and appears to be added in this PR), it should be (c) 2018.

If this is a modification of an existing file which as created in 2017 and (c) 2017, there's no need to update the year of copyright, just keep it as it was originally.

public void init(final ServerGremlinExecutor serverGremlinExecutor) {
this.serverGremlinExecutor = serverGremlinExecutor;
super.init(serverGremlinExecutor);
GraphManager graphManager = serverGremlinExecutor.getGraphManager();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

final here and throughout PR

/**
* @author Stephen Mallette (http://stephen.genoprime.com)
*/
public final class TestClientFactory {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you just use TestClientFactory from org.apache.tinkerpop:gremlin-server directly here? This looks identical to that class and the module is already a dependency in janusgraph-test. If it does need to be duplicated here then the original copyright header should be preserved.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a part of this work, I have pulled in the
AbstractGremlinServerIntegrationTest from TinkerPop as it is part of the
src/test/** resources under the gremlin-server distribution, and these
test resources are not packaged in the jar. As a quick short term
solution, this allows us to test inner server components, like we have
to do to test this changeset.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this file is from Apache TinkerPop project, it should retain the original license / copyright header, not inherit the one from JanusGraph.

@janusgraph-bot janusgraph-bot added the cla: yes This PR is compliant with the CLA label Feb 3, 2018
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra blank line (one blank line between license header and package, please).

// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra blank line (one blank line between license header and package, please).

// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra blank line (one blank line between license header and package, please).

// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra blank line (one blank line between license header and package, please).

@dpitera dpitera force-pushed the dynamic-bindings-new branch from f9511c7 to c9dcbff Compare February 7, 2018 17:04
@dpitera
Copy link
Contributor Author

dpitera commented Feb 7, 2018

@mbrukman @sjudeng Thanks for your advice. Pushed new code with suggested updates.

@dpitera dpitera force-pushed the dynamic-bindings-new branch from c9dcbff to 261c750 Compare February 7, 2018 20:49
@@ -67,6 +67,8 @@ You can either use `ConfiguredGraphFactory.create("graphName")`
or `ConfiguredGraphFactory.open("graphName")`. Learn more about the difference
between these two options by reading the section below about the `ConfigurationManagementGraph`.

You can also access your graphs by using the bindings. Read more about this in the <<graphandtraversalbindings>> section.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name graphandtraversalbindings will become either an anchor or a full page, but it's not very readable.

Can you use dashes to separate the words, please? graph-and-traversal-bindings is a valid identifier (like listing-the-graphs you can see below), and is much more readable and scannable.


IMPORTANT: The JanusGraphManager rebinds every graph stored on the ConfigurationManagementGraph (or those for which you have created configurations) every 20 seconds. This means your graph and traversal bindings for graphs created using the ConfigredGraphFactory will be available on all JanusGraph nodes with a maximum of a 20 second lag. It also means that a binding will still be available on a node after a server restart.

[[bindingexample]]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar comment for the name bindingexample -> binding-example as above.

@dpitera dpitera force-pushed the dynamic-bindings-new branch from 261c750 to 4295fe1 Compare February 8, 2018 18:15
@dpitera
Copy link
Contributor Author

dpitera commented Feb 8, 2018

@mbrukman Pushed doc updates.

Copy link
Contributor

@sjudeng sjudeng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Hopefully we can get another review by someone more familiar with Gremlin Server.

@sjudeng sjudeng assigned pluradj and unassigned pluradj Feb 9, 2018
@sjudeng sjudeng requested a review from pluradj February 9, 2018 01:04
@@ -0,0 +1,130 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file was taken (and modified, I assume?) from the TinkerPop project? Would it make sense to document:

  • where it came from (e.g., path to file in GitHub or ASF Git)
  • what local modifications were made
  • whether it would make sense to remove it and depend on the original source file in the future

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -0,0 +1,62 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment here as for the other file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@dpitera dpitera force-pushed the dynamic-bindings-new branch 2 times, most recently from 6c51d80 to 579de33 Compare February 21, 2018 15:49
Copy link
Member

@pluradj pluradj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added my review


Graphs created using the ConfiguredGraphFactory are bound to the GremlinServer's Executor's Context by the "graph.graphname" property, and the graph's traversal reference is bound to the context by "<graphname>_traversal". This means, on subsequent connections to the server after the first time you create/open a graph, you can access the graph and traversal references by the "<graphname>" and "<graphname>_traversal" properties.

IMPORTANT: If you are connected to a remote GremlinServer using the gremlin-console and a *sessioned* connection, then you will have to reconnect to the server to bind the variables. This is probably the case for any sessioned WebSocket connection.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/GremlinServer/Gremlin Server
s/gremlin-console/Gremlin Console
s/probably the case/is also true

[[graph-and-traversal-bindings]]
=== Graph and Traversal Bindings

Graphs created using the ConfiguredGraphFactory are bound to the GremlinServer's Executor's Context by the "graph.graphname" property, and the graph's traversal reference is bound to the context by "<graphname>_traversal". This means, on subsequent connections to the server after the first time you create/open a graph, you can access the graph and traversal references by the "<graphname>" and "<graphname>_traversal" properties.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/GremlinServer's Executor's Context/executor context on the Gremlin Server


IMPORTANT: If you are connected to a remote GremlinServer using the gremlin-console and a *sessioned* connection, then you will have to reconnect to the server to bind the variables. This is probably the case for any sessioned WebSocket connection.

IMPORTANT: The JanusGraphManager rebinds every graph stored on the ConfigurationManagementGraph (or those for which you have created configurations) every 20 seconds. This means your graph and traversal bindings for graphs created using the ConfigredGraphFactory will be available on all JanusGraph nodes with a maximum of a 20 second lag. It also means that a binding will still be available on a node after a server restart.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the 20 second interval configurable?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does the rebinding operation affect incoming requests? Is there a failure scenario possible if a request comes in during rebinding?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe so. Basically, before the rebinding, the user cannot access graphs and traversals by their bindings, post rebinding completion, they can.

final ScheduledExecutorService bindExecutor = Executors.newScheduledThreadPool(1);
// Dynamically created graphs created with the ConfiguredGraphFactory are
// bound across all nodes in the cluster and in the face of server restarts
bindExecutor.scheduleWithFixedDelay(new GremlinExecutorGraphBinder(this.gremlinExecutor), 0, 20L, TimeUnit.SECONDS);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it make sense to make the 20 second interval be configurable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be awkward to do this, as this is not tied to a specific graph and the only way to inject configuration settings into the JG server is by using graph configuration options.

We could make it a graph config option, but it would be awkward cause it would only be relevant upon the call to configureGremlinExecutor which would happen once at server start. We could make it only a config option read off of the ConfigurationManagementGraph's configuration settings, but I worry that will be confusing as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I'm ok with leaving it as a constant. I don't think there is an immediate need to configure it.

this.gremlinExecutor.getScriptEngineManager().put(it, graph);
this.gremlinExecutor.getScriptEngineManager().put(it + "_traversal", graph.traversal());
} catch (Exception e) {
// cannot open graph, do nothing
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

log the error

*/
public interface JanusGraphChannelizer {

public void init(final ServerGremlinExecutor serverGremlinExecutor);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

javadoc for this method


/*
* This file was pulled from the Apache TinkerPop project:
* https://github.com/apache/tinkerpop/blob/master/gremlin-server/src/test/java/org/apache/tinkerpop/gremlin/server/AbstractGremlinServerIntegrationTest.java
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

point at a specific version instead of master


/*
* This file was pulled from the Apache TinkerPop project:
* https://github.com/apache/tinkerpop/blob/master/gremlin-server/src/test/java/org/apache/tinkerpop/gremlin/server/TestClientFactory.java
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

point at a specific version instead of master

@@ -1,7 +1,7 @@
host: 0.0.0.0
port: 8182
scriptEvaluationTimeout: 30000
channelizer: org.apache.tinkerpop.gremlin.server.channel.WebSocketChannelizer
channelizer: org.janusgraph.channelizers.JanusGraphWebSocketChannelizer
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going to cause a lot of confusion when configuring the Gremlin Server. Adding comments in all the Gremlin Server yaml files would be helpful to tell users when to use a TinkerPop WebSocketChannelizer instead of JanusGraphWebSocketChannelizer. Also add similar information in the documentation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be more confusing if I put comments in the yaml files. Currently there are two YAMl files, one when using the CMG and one when not. The JG channelizers are in place in the CMG YAML because this is when you can make use of the channelizer, and not in the normal YAML because it can't work there. I think the existence of the separate files is enough differentiation between when one could/should/can't use the JG channelizers (of course along with the appropriate documentation on the matter)-- and I have already added to the CGF setting on when it is appropriate to use the JG channelizers and how to configure the YAMl to use so.

Does that make sense?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. Make sure to clarify in the docs then. Is there a migration hit for apps that were using CGF in JG 0.2.0?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any app previously using the CGF can make use of this new binding functionality by making use of the new channelizers. That's about it in terms of steps forward for people previously using the CGF.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's optional though, correct? It is not required to use the new channelizers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct.

"org.janusgraph.core.ConfiguredGraphFactory.createTemplateConfiguration(new MapConfiguration(map));" +
"org.janusgraph.core.ConfiguredGraphFactory.create(\"newGraph\")");
//assert newGraph is indeed bound
assertEquals(0, client.submit("newGraph.vertices().size()").all().get().get(0).getInt());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add tests to verify that newGraph_traversal is bound and newGraph_traversal.getGraph() is the same as newGraph.

@dpitera dpitera force-pushed the dynamic-bindings-new branch from 579de33 to 51df174 Compare February 22, 2018 20:04
@dpitera
Copy link
Contributor Author

dpitera commented Feb 22, 2018

@pluradj Repushed with requested changes.


Graphs created using the ConfiguredGraphFactory are bound to the executor context on the Gremlin Server by the "graph.graphname" property, and the graph's traversal reference is bound to the context by "<graphname>_traversal". This means, on subsequent connections to the server after the first time you create/open a graph, you can access the graph and traversal references by the "<graphname>" and "<graphname>_traversal" properties.

IMPORTANT: If you are connected to a remote Gremlin Server using the Gremlkin Console and a *sessioned* connection, then you will have to reconnect to the server to bind the variables. This is also true for any sessioned WebSocket connection.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/Gremlkin/Gremlin


/*
* This file was pulled from the Apache TinkerPop project:
* https://github.com/apache/tinkerpop/blob/tp32/gremlin-server/src/test/java/org/apache/tinkerpop/gremlin/server/AbstractGremlinServerIntegrationTest.java
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/tp32/3.2.7
tp32 is a branch, 3.2.7 is a tagged version


/*
* This file was pulled from the Apache TinkerPop project:
* https://github.com/apache/tinkerpop/blob/tp32/gremlin-server/src/test/java/org/apache/tinkerpop/gremlin/server/TestClientFactory.java
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/tp32/3.2.7
tp32 is a branch, 3.2.7 is a tagged version

@pluradj
Copy link
Member

pluradj commented Feb 22, 2018

@dpitera I didn't see any doc updates in configuredgraphfactory.adoc for the channelizers.

@dpitera
Copy link
Contributor Author

dpitera commented Feb 23, 2018

@dpitera I didn't see any doc updates in configuredgraphfactory.adoc for the channelizers.

Oh u are right. Looking back, I remember that I just assumed everyone using 0.3.0 CFG would use the provided YAML distr and thus have this functionality, but I will update the section I added to include a notion of how this happens, i.e. by configuring the server to use the JG channelizers.

Furthermore, any graphs stored on the ConfigurationManagementGraph are
bound every 20 seconds using a ScheduledExecutorService. This is useful
for 2 reasons:

1. This allows graph bindings to be global across all JanusGraph nodes
in a deployment. A user can therefore "create" a graph once and expect
the binding to find itself on every node in the cluster in <20 seconds.
This is particularly useful for users using services hidden behind
third-party managed routers.

2. If a server is restart, the binding will be rebound on the server
without having to reopen the graph again.

As a part of this work, I have pulled in the
AbstractGremlinServerIntegrationTest from TinkerPop as it is part of
the
src/test/** resources under the gremlin-server distribution, and these
test resources are not packaged in the jar. As a quick short term
solution, this allows us to test inner server components, like we have
to do to test this changeset.

Issues: JanusGraph#883

Signed-off-by: David Pitera <[email protected]>
@dpitera dpitera force-pushed the dynamic-bindings-new branch from 51df174 to 0071e7c Compare February 26, 2018 17:43
@dpitera
Copy link
Contributor Author

dpitera commented Feb 26, 2018

@pluradj Pushed with suggested updates

@janusgraph-bot janusgraph-bot added the cla: yes This PR is compliant with the CLA label Mar 13, 2018
Copy link
Member

@pluradj pluradj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks david

@dpitera dpitera merged commit 7afe62d into JanusGraph:master Mar 16, 2018
@pluradj pluradj added this to the Release v0.3.0 milestone Mar 29, 2018
bwatson-rti-org pushed a commit to bwatson-rti-org/janusgraph that referenced this pull request Mar 9, 2019
Bind all graphs to gremlin executor context
micpod pushed a commit to micpod/janusgraph that referenced this pull request Nov 5, 2019
Bind all graphs to gremlin executor context
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes This PR is compliant with the CLA
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants