0.7.4 OrientDB in Jupyter
#TAP 0.7.4 Using OrientDB with Jupyter
Starting with TAP 0.7.4, you can access OrientDB from a Jupyter Notebook. This page shows you how to do that.
##Connecting OrientDB with Jupyter
- Create an instance of OrientDB in TAP via Services > Marketplace. See Creating a service instance.
- After the OrientDB instance has been created, download the OrientDB keys to use in the Jupyter notebook:
a. In the TAP Console, navigate to Services > Instances.
b. Locate your OrientDB instance and click it. TAP displays Create Key.
c. Click Create Key, enter a name for your key, and click the Add button.
d. The name is displayed along with a + Add to exports option. Click + Add to exports to add these OrientDB keys to the export queue for the services on the page.
e. Scroll up and click the Export keys button at the top right of the screen to export the keys.
The Export keys button exports all the keys in the export queue. Typically, these are the keys for just one service. If you need to export keys for multiple services, create the keys and add them to the export queue for all the desired services first, then click the Export keys button.
f. Scroll down to see the OrientDB keys in JSON format. Click the Download JSON file button to download the keys as a JSON file, so you can copy and paste them into your Jupyter notebook.
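Instead of copying values by hand, the downloaded file can also be read in the notebook. The following is a minimal sketch, not the documented TAP workflow: the layout of the exported JSON is not described on this page, so the sample document and the field names `hostname`, `port`, and `password` are assumptions. Check them against your own file. The helper `find_values` is hypothetical and searches the whole document, so it does not depend on how deeply the settings are nested.

```python
import json


def find_values(node, wanted):
    """Recursively collect the first scalar value found for each key in `wanted`."""
    found = {}
    if isinstance(node, dict):
        for key, value in node.items():
            if key in wanted and not isinstance(value, (dict, list)):
                found.setdefault(key, value)
            else:
                for k, v in find_values(value, wanted).items():
                    found.setdefault(k, v)
    elif isinstance(node, list):
        for item in node:
            for k, v in find_values(item, wanted).items():
                found.setdefault(k, v)
    return found


# Hypothetical sample; the real exported file may be structured differently.
sample_keys = json.loads("""
{"orientdb": [{"name": "my-orientdb",
               "credentials": {"hostname": "10.0.0.5",
                               "port": "2424",
                               "password": "secret"}}]}
""")

# Assumed field names; adjust them to match your downloaded keys.
settings = find_values(sample_keys, {"hostname", "port", "password"})
print(settings)
```

To use it with a real file, replace the sample with `json.load(open(path))`, where `path` points to the JSON file you downloaded.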
##Export/Import a graph to/from OrientDB database
These steps assume you have already exported your OrientDB instance keys, as previously described. They also assume you have already created a Jupyter notebook.
See Creating a Jupyter Notebook Instance if you are new to working with Jupyter notebooks.
- Import the spark-tk library into your Jupyter notebook and establish a spark-tk context.
```
import sparktk as tk
tc = tk.TkContext()
```
- The `create_orientdb_conf` API creates a connection to the OrientDB container. Copy the required settings from the previously downloaded OrientDB instance keys (connection settings) for use with this API.
```
hostname = "localhost"
portnumber = "xxxx"
root_password = "rxkp094rbtvbli6d"
orient_conf = tc.graph.create_orientdb_conf(hostname, portnumber, "admin", "admin", root_password)
orient_conf
```
- The `export_to_orientdb` API creates an OrientDB database with the specified name and exports the spark-tk graph to that database. The API returns summary statistics for the exported data, as the example that follows shows.
- The `import_orientdb_graph` API imports the graph from the named OrientDB database into a spark-tk graph, as the example that follows shows.
#Example
The code snippets here show a graph being exported to the OrientDB database, then imported back.
For the following graph dataset:
```
v = tc.frame.create([("a", "Alice", 34, "female"),
                     ("b", "Bob", 36, "male"),
                     ("c", "Charlie", 30, "male"),
                     ("d", "David", 29, "male"),
                     ("e", "Esther", 32, "female"),
                     ("f", "Fanny", 36, "female")], ["id", "name", "age", "gender"])
e = tc.frame.create([("a", "b", "friend"),
                     ("b", "c", "follow"),
                     ("c", "b", "follow"),
                     ("f", "c", "follow"),
                     ("e", "f", "follow"),
                     ("e", "d", "friend"),
                     ("d", "a", "friend"),
                     ("a", "e", "friend")], ["src", "dst", "relationship"])
```
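Before building a graph from these frames, it can help to confirm that every edge points at a vertex that exists. The sketch below is a plain-Python check on copies of the same rows. It does not use spark-tk, and the names `vertex_rows`, `edge_rows`, and `dangling` are illustrative only.

```python
# Plain-Python copies of the rows above (no Spark needed for this check).
vertex_rows = [("a", "Alice", 34, "female"),
               ("b", "Bob", 36, "male"),
               ("c", "Charlie", 30, "male"),
               ("d", "David", 29, "male"),
               ("e", "Esther", 32, "female"),
               ("f", "Fanny", 36, "female")]
edge_rows = [("a", "b", "friend"),
             ("b", "c", "follow"),
             ("c", "b", "follow"),
             ("f", "c", "follow"),
             ("e", "f", "follow"),
             ("e", "d", "friend"),
             ("d", "a", "friend"),
             ("a", "e", "friend")]

# The first column of each vertex row is its id.
vertex_ids = {row[0] for row in vertex_rows}

# Edges whose source or destination does not match any vertex id.
dangling = [(src, dst) for src, dst, _ in edge_rows
            if src not in vertex_ids or dst not in vertex_ids]

print("vertices:", len(vertex_ids))
print("edges:", len(edge_rows))
print("dangling edges:", dangling)
```

For this dataset the list of dangling edges is empty, so every `src` and `dst` value refers to a row in the vertex frame.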
Create a spark-tk graph from the vertex and edge frames, then display its contents:
```
graph = tc.graph.create(v, e)
graph.graphframe.vertices.show()
graph.graphframe.edges.show()
```
Export the graph to the OrientDB database:
```
graph.export_to_orientdb(orient_conf,
                         db_name="Demo",
                         vertex_type_column_name="gender",
                         edge_type_column_name="relationship",
                         batch_size=1000,
                         db_properties=({"db.validation":"false"}))
```
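To sanity-check the summary statistics that the export returns, you can tally what the dataset should produce. This sketch assumes that the `gender` and `relationship` columns named in `vertex_type_column_name` and `edge_type_column_name` determine the vertex and edge types in OrientDB. That is an inference from the parameter names, not something this page states. The counts themselves are computed in plain Python from the example rows.

```python
from collections import Counter

# "gender" column of the vertex frame, in row order (Alice .. Fanny).
genders = ["female", "male", "male", "male", "female", "female"]

# "relationship" column of the edge frame, in row order.
relationships = ["friend", "follow", "follow", "follow",
                 "follow", "friend", "friend", "friend"]

# Assumption: one vertex type per distinct gender, one edge type per relationship.
vertex_type_counts = Counter(genders)
edge_type_counts = Counter(relationships)

print(dict(vertex_type_counts))  # {'female': 3, 'male': 3}
print(dict(edge_type_counts))    # {'friend': 4, 'follow': 4}
print(len(genders), len(relationships))  # 6 8
```

If the totals reported by the export differ from 6 vertices and 8 edges, some rows were probably not written.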
Import the data back from OrientDB to the spark-tk graph:
```
orient_graph = tc.graph.import_orientdb_graph(orient_conf,
                                              db_name="Demo",
                                              db_properties=({"db.validation":"false"}))
```
The `db_properties` parameter is optional and configures the default settings of the OrientDB database. For more information on the available OrientDB database properties, see http://orientdb.com/docs/2.1/Configuration.html
To access OrientDB Studio from TAP, see https://github.com/trustedanalytics/platform-wiki-0.7/wiki/OrientDB#accessing-the-orientdb-dashboard