Python programming guide
MLnick committed May 23, 2014
1 parent 7caa73a commit d0f52b6
Showing 1 changed file with 4 additions and 5 deletions.
9 changes: 4 additions & 5 deletions docs/python-programming-guide.md
@@ -141,6 +141,10 @@ conf = (SparkConf()
sc = SparkContext(conf = conf)
{% endhighlight %}

`spark-submit` supports launching Python applications on standalone, Mesos or YARN clusters, through
its `--master` argument. However, it currently requires the Python driver program to run on the local
machine, not the cluster (i.e. the `--deploy-mode` parameter cannot be `cluster`).

# SequenceFile and Hadoop InputFormats

In addition to reading text files, PySpark supports reading Hadoop SequenceFile and arbitrary InputFormats.
@@ -214,11 +218,6 @@ Future support for 'wrapper' functions for keys/values that allows this to be wr
and called from Python, as well as support for writing data out as SequenceFile format
and other OutputFormats, is forthcoming.

`spark-submit` supports launching Python applications on standalone, Mesos or YARN clusters, through
its `--master` argument. However, it currently requires the Python driver program to run on the local
machine, not the cluster (i.e. the `--deploy-mode` parameter cannot be `cluster`).


# API Docs

[API documentation](api/python/index.html) for PySpark is available as Epydoc.
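
The `spark-submit` paragraph moved by this commit can be illustrated with a small sketch. It assumes only the `--master` and `--deploy-mode` options named in that paragraph; the helper `build_submit_command`, the script name `my_app.py`, and the master URL are hypothetical and exist only for illustration. The check reflects the restriction as the guide described it at the time of this commit, not necessarily the behaviour of later Spark releases.

```python
def build_submit_command(script, master, deploy_mode="client"):
    """Return an argv list for launching a Python application with spark-submit.

    Hypothetical helper: it only assembles the command line and does not run it.
    """
    if deploy_mode == "cluster":
        # Per the paragraph above, the Python driver must run on the local
        # machine, so cluster deploy mode is rejected here.
        raise ValueError("Python applications cannot use --deploy-mode cluster")
    return ["spark-submit", "--master", master, "--deploy-mode", deploy_mode, script]


if __name__ == "__main__":
    # "my_app.py" and the standalone master URL are placeholders.
    print(" ".join(build_submit_command("my_app.py", "spark://host:7077")))
```

Building the command as a list rather than a single string keeps it suitable for passing to `subprocess.run` without shell quoting issues, should the command actually be executed.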