Add documentation for the HistoryServer
andrewor14 committed Apr 9, 2014
1 parent 567474a commit 2282300
Showing 2 changed files with 62 additions and 6 deletions.
@@ -76,6 +76,8 @@ class HistoryServer(
*
* If a log check is invoked manually in the middle of a period, this thread re-adjusts the
* time at which it performs the next log check to maintain the same period as before.
*
* TODO: Add a mechanism to update manually.
*/
private val logCheckingThread = new Thread {
override def run() {
@@ -292,7 +294,7 @@ object HistoryServer {
val UPDATE_INTERVAL_MS = conf.getInt("spark.history.updateInterval", 10) * 1000

// How many applications to retain
val RETAINED_APPLICATIONS = conf.getInt("spark.deploy.retainedApplications", 250)
val RETAINED_APPLICATIONS = conf.getInt("spark.history.retainedApplications", 250)

val STATIC_RESOURCE_DIR = SparkUI.STATIC_RESOURCE_DIR

64 changes: 59 additions & 5 deletions docs/monitoring.md
@@ -12,17 +12,71 @@ displays useful information about the application. This includes:

* A list of scheduler stages and tasks
* A summary of RDD sizes and memory usage
* Information about the running executors
* Environmental information
* Information about the running executors

You can access this interface by simply opening `http://<driver-node>:4040` in a web browser.
If multiple SparkContexts are running on the same host, they will bind to succesive ports
If multiple SparkContexts are running on the same host, they will bind to successive ports
beginning with 4040 (4041, 4042, etc).

Spark's Standalone Mode cluster manager also has its own
[web UI](spark-standalone.html#monitoring-and-logging).
Note that this information is only available for the duration of the application by default.
To view the web UI after the fact, set `spark.eventLog.enabled` to true before starting the
application. This configures Spark to log Spark events that encode the information displayed
in the UI to persisted storage.
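
As a minimal sketch of the setting described above, the property can be written to a properties file that is handed to the application at launch. The temporary file location and the `/tmp/spark-events` value for `spark.eventLog.dir` are assumptions for illustration, and whether a properties file is read at all depends on the Spark version and launcher in use.

```shell
# Sketch only: the file location and log directory are hypothetical.
conf_file="$(mktemp -d)/spark-defaults.conf"

# Enable event logging so the UI can be reconstructed after the application exits.
cat > "$conf_file" <<'EOF'
spark.eventLog.enabled  true
spark.eventLog.dir      /tmp/spark-events
EOF

# Show the resulting configuration.
cat "$conf_file"
```

The setting must be in place before the application starts, since events are only recorded while logging is enabled.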

Note that in both of these UIs, the tables are sortable by clicking their headers,
## Viewing After the Fact

Spark's Standalone Mode cluster manager also has its own
[web UI](spark-standalone.html#monitoring-and-logging). If an application has logged events over
the course of its lifetime, then the Standalone master's web UI will automatically re-render the
application's UI after the application has finished.

If Spark is run on Mesos or YARN, it is still possible to reconstruct the UI of a finished
application through Spark's history server, provided that the application's event logs exist.
You can start the history server by executing:

./sbin/start-history-server.sh <base-logging-directory>

The base logging directory must be supplied, and should contain sub-directories that each
represent an application's event logs. The history server creates a web interface at
`http://<server-url>:18080` by default, but the port can be changed by supplying an extra
parameter to the start script. The history server depends on the following environment variables:

<table class="table">
<tr><th style="width:21%">Environment Variable</th><th>Meaning</th></tr>
<tr>
<td><code>SPARK_DAEMON_MEMORY</code></td>
<td>Memory to allocate to the history server (default: 512m).</td>
</tr>
<tr>
<td><code>SPARK_DAEMON_JAVA_OPTS</code></td>
<td>JVM options for the history server (default: none).</td>
</tr>
</table>

Further, the history server can be configured as follows:

<table class="table">
<tr><th>Property Name</th><th>Default</th><th>Meaning</th></tr>
<tr>
<td>spark.history.updateInterval</td>
<td>10</td>
<td>
The period, in seconds, at which information displayed by this history server is updated. Each update
checks for any changes made to the event logs in persisted storage.
</td>
</tr>
<tr>
<td>spark.history.retainedApplications</td>
<td>250</td>
<td>
The number of application UIs to retain. If this cap is exceeded, then the least recently
updated applications will be removed.
</td>
</tr>
</table>
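
Putting the environment variables and properties above together, a launch might look like the following sketch. The memory size, update interval, retention cap, and log directory are made-up values. Passing the `spark.history.*` properties as `-D` system properties through `SPARK_DAEMON_JAVA_OPTS` is an assumption about how the daemon picks up its configuration, not something this page confirms.

```shell
# Hypothetical values throughout; adjust for your deployment.
export SPARK_DAEMON_MEMORY=1g

# Assumption: spark.history.* settings are picked up as JVM system properties.
export SPARK_DAEMON_JAVA_OPTS="-Dspark.history.updateInterval=30 -Dspark.history.retainedApplications=100"

# Base logging directory containing one sub-directory per application.
log_dir=/tmp/spark-events

if [ -x ./sbin/start-history-server.sh ]; then
  ./sbin/start-history-server.sh "$log_dir"
else
  # Not inside a Spark checkout: just show the command that would run.
  echo "would run: ./sbin/start-history-server.sh $log_dir"
fi
```

With these values the server would check the event logs every 30 seconds and keep at most 100 application UIs.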

Note that in all of these UIs, the tables are sortable by clicking their headers,
making it easy to identify slow tasks, data skew, etc.

# Metrics