Setting up a Spark 2.0 notebook with MLeap and Toree
Mikhail Semeniuk edited this page Jan 23, 2017 · 7 revisions
We are going to assume you already have the following installed:
- Python 2.x
- Docker (required to install Toree)
Create a virtual environment and install Jupyter in it:
$ virtualenv venv
$ source ./venv/bin/activate
$ pip install jupyter
Clone the master branch of Toree from its GitHub repo into your working directory.
For the next step, make sure that Docker is running.
$ cd incubator-toree
$ make release
$ cd dist/toree-pip
$ pip install .
Then install the Toree kernel, pointing it at your Spark 2.0 installation:
$ SPARK_HOME=<path to spark> jupyter toree install
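Before running the install command, it can help to confirm that the path you pass as SPARK_HOME really is a Spark distribution. The sketch below is not part of Toree. It assumes only that a Spark distribution contains bin/spark-submit, and the helper name looks_like_spark_home is made up for illustration.

```python
import os


def looks_like_spark_home(path):
    """Return True if `path` contains bin/spark-submit (hypothetical helper)."""
    return os.path.isfile(os.path.join(path, "bin", "spark-submit"))


if __name__ == "__main__":
    # Read the same variable that is passed to `jupyter toree install`.
    spark_home = os.environ.get("SPARK_HOME", "")
    if looks_like_spark_home(spark_home):
        print("SPARK_HOME looks valid: " + spark_home)
    else:
        print("No bin/spark-submit found under: " + repr(spark_home))
```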
The most reliable way to add MLeap to your project is to modify the kernel directly (or create a new one for Toree and Spark 2.0).
The kernel config file is typically located at /usr/local/share/jupyter/kernels/apache_toree_scala/kernel.json
Add or modify __TOREE_SPARK_OPTS__ like so:
"__TOREE_SPARK_OPTS__": "--packages com.databricks:spark-avro_2.11:3.0.1,ml.combust.mleap:mleap-spark_2.11:0.5.0",
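The same edit can be scripted instead of made by hand. The following is a minimal sketch, assuming that __TOREE_SPARK_OPTS__ sits under the top-level "env" object of kernel.json. The function name set_toree_spark_opts is hypothetical, and the sketch overwrites any options already stored in that entry.

```python
import json


def set_toree_spark_opts(kernel_json_path, packages):
    """Write a --packages option into the kernel's __TOREE_SPARK_OPTS__ entry."""
    with open(kernel_json_path) as f:
        spec = json.load(f)

    # Assumption: the option lives under the top-level "env" object.
    env = spec.setdefault("env", {})
    # Note: this replaces whatever the entry held before.
    env["__TOREE_SPARK_OPTS__"] = "--packages " + ",".join(packages)

    with open(kernel_json_path, "w") as f:
        json.dump(spec, f, indent=2)
    return env["__TOREE_SPARK_OPTS__"]


# Example call (path taken from the text above; adjust it for your system):
# set_toree_spark_opts(
#     "/usr/local/share/jupyter/kernels/apache_toree_scala/kernel.json",
#     ["com.databricks:spark-avro_2.11:3.0.1",
#      "ml.combust.mleap:mleap-spark_2.11:0.5.0"],
# )
```

If the file is not at the path above, running jupyter kernelspec list prints the directory of each installed kernel. Restart the notebook kernel after changing the file so the new options are picked up.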
An alternative is to use the %AddDeps magic, but we have run into dependency collisions with it, so use it at your own risk:
%AddDeps ml.combust.mleap mleap-spark_2.11 0.5.0 --transitive
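The two approaches name the same artifact in different formats: --packages takes colon-separated group:artifact:version coordinates, while %AddDeps takes the three parts separated by spaces. The sketch below shows the mapping. The helper to_adddeps is hypothetical and only builds the text of the magic line; it does not run anything in Toree.

```python
def to_adddeps(coordinate, transitive=True):
    """Turn 'group:artifact:version' into an %AddDeps line (hypothetical helper)."""
    parts = coordinate.split(":")
    if len(parts) != 3:
        raise ValueError("expected group:artifact:version, got " + repr(coordinate))
    line = "%AddDeps " + " ".join(parts)
    if transitive:
        # --transitive asks the magic to also fetch the artifact's dependencies.
        line += " --transitive"
    return line


print(to_adddeps("ml.combust.mleap:mleap-spark_2.11:0.5.0"))
# prints: %AddDeps ml.combust.mleap mleap-spark_2.11 0.5.0 --transitive
```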