[SPARK-23147][UI] Fix task page table IndexOutOfBound Exception #20315

Closed
wants to merge 1 commit into from

Contributor

jerryshao commented Jan 18, 2018

What changes were proposed in this pull request?

The stage's task page table throws an exception when there are no completed tasks. Furthermore, because the dataSize does not take running tasks into account, the UI sometimes cannot show the running tasks. In addition, with the default sortColumn("index"), the table is displayed only after the first task has finished.
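
The failure mode described above can be illustrated with a minimal, self-contained sketch. This is a simplified model written for this discussion, not Spark's actual paging code: the names `SimplePagedSource`, `FinishedOnlySource`, and `PageData` are hypothetical, and the exception message is only illustrative. The assumption is that the table asks for page 1 while the source reports a size of zero.

```scala
// Simplified model of a paged data source; NOT Spark's real implementation.
// All class names here are hypothetical and exist only for illustration.
final case class PageData[T](totalPages: Int, data: Seq[T])

abstract class SimplePagedSource[T](pageSize: Int) {
  // Number of rows the source claims to have.
  def dataSize: Int

  // Rows in the half-open range [from, to).
  def sliceData(from: Int, to: Int): Seq[T]

  // Returns one page of rows, rejecting page numbers outside 1..totalPages.
  def pageData(page: Int): PageData[T] = {
    val totalPages = (dataSize + pageSize - 1) / pageSize
    if (page <= 0 || page > totalPages) {
      // With dataSize == 0, totalPages is 0, so even page 1 is rejected.
      throw new IndexOutOfBoundsException(
        s"Page $page is out of range (total pages: $totalPages)")
    }
    val from = (page - 1) * pageSize
    val to = math.min(dataSize, page * pageSize)
    PageData(totalPages, sliceData(from, to))
  }
}

// Hypothetical source whose size counts finished tasks only.
class FinishedOnlySource(finished: Seq[String], pageSize: Int)
    extends SimplePagedSource[String](pageSize) {
  override def dataSize: Int = finished.size
  override def sliceData(from: Int, to: Int): Seq[String] =
    finished.slice(from, to)
}
```

In this model, `new FinishedOnlySource(Seq.empty, 100).pageData(1)` throws an `IndexOutOfBoundsException`, because a size of zero yields zero pages while the table still requests the first page.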

[Screenshot, 2018-01-18 8:50:08 pm: https://user-images.githubusercontent.com/850797/35100052-470b4cae-fc95-11e7-96a2-ad9636e732b3.png]

To reproduce this issue, run sc.parallelize(1 to 20, 20).map { i => Thread.sleep(10000); i }.collect() or sc.parallelize(1 to 20, 20).map { i => Thread.sleep((20 - i) * 1000); i }.collect.
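
To see why the second command exercises the default sort on "index", the sleep schedule can be worked out without a cluster. The sketch below only evaluates the sleep formula from that command. It assumes each of the 20 partitions holds exactly one element and that all tasks start at the same time, which depends on the available cores; `ReproTiming` is a hypothetical helper name.

```scala
// Evaluates the sleep formula from the second reproduction command.
// ReproTiming is a hypothetical name used only for this illustration.
object ReproTiming {
  // Sleep in milliseconds for the task that processes element i.
  def sleepMillis(i: Int): Int = (20 - i) * 1000

  // Elements ordered by when their tasks would finish,
  // ASSUMING all 20 tasks start at the same moment.
  def finishOrder: Seq[Int] = (1 to 20).sortBy(sleepMillis)
}
```

Under these assumptions the task for element 1 (task index 0) sleeps longest, 19 seconds, so it finishes last, while the task for element 20 does not sleep at all.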

Here I propose a solution to fix it. I am not sure whether it is the right fix, so please help review.

How was this patch tested?

Manual test.

@@ -676,7 +676,7 @@ private[ui] class TaskDataSource(

private var _tasksToShow: Seq[TaskData] = null

override def dataSize: Int = stage.numCompleteTasks + stage.numFailedTasks + stage.numKilledTasks
Contributor Author

@vanzin I think you made this change. Why not also track the running tasks? I'm not sure what your original purpose was.

Contributor

Probably just a side effect of me testing mostly with replayed logs and short-lived tasks.
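
The question in this thread can be made concrete with a small sketch comparing the size expression from the diff line with one that also counts running tasks. This is a rough illustration, not the merged change: `StageCounts` and `DataSizeSketch` are hypothetical names, and the `numActiveTasks` field is assumed here to stand for the number of running tasks.

```scala
// Hypothetical snapshot of a stage's task counters; names are illustrative.
final case class StageCounts(
    numActiveTasks: Int, // ASSUMED to represent running tasks
    numCompleteTasks: Int,
    numFailedTasks: Int,
    numKilledTasks: Int)

object DataSizeSketch {
  // Same expression as the diff line under review: finished tasks only.
  def finishedOnly(s: StageCounts): Int =
    s.numCompleteTasks + s.numFailedTasks + s.numKilledTasks

  // Assumed alternative: running tasks are counted as well.
  def includingRunning(s: StageCounts): Int =
    s.numActiveTasks + finishedOnly(s)
}
```

For a stage with 20 running tasks and nothing finished, `finishedOnly` yields 0 while `includingRunning` yields 20, which matches the situation produced by the first reproduction command.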

Contributor

vanzin commented Jan 18, 2018

Fix seems to be in line with what Spark 2.2 did, so LGTM pending tests.

SparkQA commented Jan 18, 2018

Test build #86341 has finished for PR 20315 at commit 568aa84.

  • This patch fails from timeout after a configured wait of `300m`.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor

vanzin commented Jan 18, 2018

Timed out on kafka test, no reason to run tests again.

Contributor

vanzin commented Jan 18, 2018

Merging to master / 2.3.

@asfgit asfgit closed this in cf7ee17 Jan 18, 2018
asfgit pushed a commit that referenced this pull request Jan 18, 2018

Author: jerryshao <[email protected]>

Closes #20315 from jerryshao/SPARK-23147.

(cherry picked from commit cf7ee17)
Signed-off-by: Marcelo Vanzin <[email protected]>