[SPARK-6880][CORE] Use properties from ActiveJob associated with a Stage
Author: Mark Hamstra
Apache Spark master PR: apache#6291

This issue was addressed in apache#5494, but the fix in that PR, while safe in the
sense that it will prevent the SparkContext from shutting down, misses the
actual bug. The intent of submitMissingTasks should be understood as "submit
the Tasks that are missing for the Stage, and run them as part of the ActiveJob
identified by jobId". Because of a long-standing bug, the jobId parameter was
never being used. Instead, we were trying to use the jobId with which the Stage
was created -- which may no longer exist as an ActiveJob, hence the crash
reported in SPARK-6880.

The correct fix is to use the ActiveJob specified by the supplied jobId
parameter, which is guaranteed to exist at the call sites of
submitMissingTasks.

This fix should be applied to all maintenance branches, since the bug has
existed since 1.0.
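
Note, in the diff below, how the old code compounded the bug: it guarded the
lookup with jobIdToActiveJob.contains(jobId) but then indexed the map with
stage.jobId, so the check and the access could refer to two different jobs. As
a rough illustration, here is a minimal, self-contained Scala sketch of the
corrected lookup, using hypothetical, simplified stand-ins for Spark's
ActiveJob, Stage, and jobIdToActiveJob (this is not the real DAGScheduler
code):

```scala
import java.util.Properties
import scala.collection.mutable

// Hypothetical, simplified stand-ins for the real Spark classes.
case class ActiveJob(jobId: Int, properties: Properties)
class Stage(val id: Int, val jobId: Int) // jobId = the job that first created this stage

object DAGSchedulerSketch {
  private val jobIdToActiveJob = new mutable.HashMap[Int, ActiveJob]

  // `jobId` names the ActiveJob this submission runs under; per the commit
  // message, the call sites of submitMissingTasks guarantee it is present.
  def submitMissingTasks(stage: Stage, jobId: Int): Unit = {
    // Before the fix, the lookup used stage.jobId, which may belong to a job
    // that has already finished and been removed from the map (the crash
    // reported in SPARK-6880). The supplied jobId parameter is the correct key.
    val properties: Properties = jobIdToActiveJob(jobId).properties
    // ... tasks for the stage would then be submitted with these properties
    // (scheduling pool, job group, description, etc.) ...
  }
}
```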
pankaj arora authored and mbautin committed May 21, 2015
1 parent aedf957 commit a6e2663
Showing 1 changed file with 3 additions and 6 deletions.
@@ -808,12 +808,9 @@ class DAGScheduler(
       }
     }
 
-    val properties = if (jobIdToActiveJob.contains(jobId)) {
-      jobIdToActiveJob(stage.jobId).properties
-    } else {
-      // this stage will be assigned to "default" pool
-      null
-    }
+    // Use the scheduling pool, job group, description, etc. from an ActiveJob associated
+    // with this Stage
+    val properties = jobIdToActiveJob(jobId).properties
 
     runningStages += stage
     // SparkListenerStageSubmitted should be posted before testing whether tasks are
