Spark in TAP
Apache Spark is a general-purpose engine for cluster-scale computing. It provides APIs for multiple languages, including Python, Scala, and SQL.
The easiest way to get started with Spark on TAP is within a Jupyter notebook, as follows:
- First, create a Jupyter notebook.
- Open Jupyter and navigate to examples/spark/README.ipynb.
The README notebook demonstrates how to create a SparkContext and run some simple Spark code.
The other example notebooks show how to use Spark dataframes, RDDs, streaming, SQL, and machine learning with K-Means and Linear Regression.
More information about Spark is available on the Spark website.
### Accessing a terminal from Jupyter

1. From the Jupyter dashboard, select the New button located in the upper right.
2. Select Terminal from the submenu to open a new terminal within Jupyter.
You can enter Spark commands (spark-shell, spark-submit, etc.) in the terminal window.