Spark in TAP
Todd Lisonbee edited this page May 24, 2016 · 12 revisions
Apache Spark is a general-purpose engine for cluster-scale computing. It provides APIs for multiple languages, including Python, Scala, and SQL.
The easiest way to get started with Spark on TAP is within a Jupyter notebook.
- First, create a Jupyter notebook.
- Open Jupyter and navigate to examples/spark/README.ipynb
The README notebook demonstrates how to create a SparkContext and run some simple Spark code. The other example notebooks show how to use Spark DataFrames, RDDs, streaming, SQL, and machine learning with KMeans and Linear Regression.
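For orientation, the kind of code the README notebook covers might look like the sketch below. It is not taken from the notebook itself: the helper names `tokenize`, `make_context`, and `word_counts` and the application name are hypothetical, and it assumes the notebook kernel can import pyspark. It uses `SparkContext.getOrCreate`, so it should reuse a context if the notebook already has one.

```python
def tokenize(line):
    """Split one line of text into lowercase words."""
    return line.lower().split()


def make_context(app_name="tap-notebook-example"):
    """Return the active SparkContext, creating one if none exists."""
    # Imported here so tokenize() stays usable even where pyspark is missing.
    from pyspark import SparkConf, SparkContext

    conf = SparkConf().setAppName(app_name)  # app name is an illustrative choice
    return SparkContext.getOrCreate(conf)


def word_counts(sc, lines):
    """Count word occurrences in `lines` using the RDD API."""
    return (
        sc.parallelize(lines)              # distribute the input as an RDD
        .flatMap(tokenize)                 # one record per word
        .map(lambda word: (word, 1))       # pair each word with a count of 1
        .reduceByKey(lambda a, b: a + b)   # sum the counts for each word
        .collectAsMap()                    # return the result as a dict
    )
```

In a notebook cell, calling `sc = make_context()` and then `word_counts(sc, ["spark on tap", "spark in jupyter"])` would return a dictionary of per-word counts, with "spark" counted twice.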
More information about Spark is available on the Spark website.
### Accessing a Terminal from Jupyter
- From the Jupyter dashboard, select the New button located in the upper right.
- Select Terminal from the submenu to open a new terminal within Jupyter.
Within the terminal, Spark commands such as spark-shell and spark-submit are available.
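As a rough illustration of what could be handed to spark-submit from that terminal, here is a minimal standalone script. The file name `sum_of_squares.py`, the application name, and the helper `expected_sum_of_squares` are all hypothetical. The script sums the squares of the integers below `n` on the cluster and compares the result with the closed-form value as a sanity check.

```python
import sys


def expected_sum_of_squares(n):
    """Closed form for 0^2 + 1^2 + ... + (n-1)^2, used to check Spark's answer."""
    return (n - 1) * n * (2 * n - 1) // 6


def main(n):
    # spark-submit puts pyspark on the path, so import it only when running.
    from pyspark import SparkConf, SparkContext

    sc = SparkContext(conf=SparkConf().setAppName("sum-of-squares"))
    try:
        # Square each integer in [0, n) in parallel, then add the squares up.
        total = sc.parallelize(range(n)).map(lambda x: x * x).sum()
        print("sum of squares below %d: %d" % (n, total))
        print("matches closed form: %s" % (total == expected_sum_of_squares(n)))
    finally:
        sc.stop()  # release cluster resources even if the job fails


if __name__ == "__main__":
    if len(sys.argv) != 2:
        sys.stderr.write("usage: spark-submit sum_of_squares.py <n>\n")
    else:
        main(int(sys.argv[1]))
```

Saved under that name, the script could be launched with something like `spark-submit sum_of_squares.py 1000`. Any master or resource options would depend on how the TAP cluster is configured.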