Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-20250][Core]Improper OOM error when a task been killed while spilling data #18090

Closed
wants to merge 1 commit into from

Conversation

ConeyLiu
Copy link
Contributor

@ConeyLiu ConeyLiu commented May 24, 2017

What changes were proposed in this pull request?

Currently, when a task is calling spill() but it receives a killing request from driver (e.g., speculative task), the TaskMemoryManager will throw an OOM exception. And we don't catch Fatal exception when a error caused by Thread.interrupt. So for ClosedByInterruptException, we should throw RuntimeException instead of OutOfMemoryError.

https://issues.apache.org/jira/browse/SPARK-20250?jql=project%20%3D%20SPARK

How was this patch tested?

Existing unit tests.

@ConeyLiu
Copy link
Contributor Author

Hi @cloud-fan, @srowen, Can you help take a look ? Thanks a lot.

@cloud-fan
Copy link
Contributor

ok to test

@@ -184,6 +185,10 @@ public long acquireExecutionMemory(long required, MemoryConsumer consumer) {
break;
}
}
} catch (ClosedByInterruptException e) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

surprisingly this is also IOException... good catch!

@cloud-fan
Copy link
Contributor

LGTM

1 similar comment
@viirya
Copy link
Member

viirya commented May 25, 2017

LGTM

@SparkQA
Copy link

SparkQA commented May 25, 2017

Test build #77329 has finished for PR 18090 at commit 1a45ff5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

asfgit pushed a commit that referenced this pull request May 25, 2017
…spilling data

## What changes were proposed in this pull request?

Currently, when a task is calling spill() but it receives a killing request from driver (e.g., speculative task), the `TaskMemoryManager` will throw an `OOM` exception.  And we don't catch `Fatal` exception when a error caused by `Thread.interrupt`. So for `ClosedByInterruptException`, we should throw `RuntimeException` instead of `OutOfMemoryError`.

https://issues.apache.org/jira/browse/SPARK-20250?jql=project%20%3D%20SPARK

## How was this patch tested?

Existing unit tests.

Author: Xianyang Liu <[email protected]>

Closes #18090 from ConeyLiu/SPARK-20250.

(cherry picked from commit 731462a)
Signed-off-by: Wenchen Fan <[email protected]>
@asfgit asfgit closed this in 731462a May 25, 2017
asfgit pushed a commit that referenced this pull request May 25, 2017
…spilling data

Currently, when a task is calling spill() but it receives a killing request from driver (e.g., speculative task), the `TaskMemoryManager` will throw an `OOM` exception.  And we don't catch `Fatal` exception when a error caused by `Thread.interrupt`. So for `ClosedByInterruptException`, we should throw `RuntimeException` instead of `OutOfMemoryError`.

https://issues.apache.org/jira/browse/SPARK-20250?jql=project%20%3D%20SPARK

Existing unit tests.

Author: Xianyang Liu <[email protected]>

Closes #18090 from ConeyLiu/SPARK-20250.

(cherry picked from commit 731462a)
Signed-off-by: Wenchen Fan <[email protected]>
asfgit pushed a commit that referenced this pull request May 25, 2017
…spilling data

Currently, when a task is calling spill() but it receives a killing request from driver (e.g., speculative task), the `TaskMemoryManager` will throw an `OOM` exception.  And we don't catch `Fatal` exception when a error caused by `Thread.interrupt`. So for `ClosedByInterruptException`, we should throw `RuntimeException` instead of `OutOfMemoryError`.

https://issues.apache.org/jira/browse/SPARK-20250?jql=project%20%3D%20SPARK

Existing unit tests.

Author: Xianyang Liu <[email protected]>

Closes #18090 from ConeyLiu/SPARK-20250.

(cherry picked from commit 731462a)
Signed-off-by: Wenchen Fan <[email protected]>
@cloud-fan
Copy link
Contributor

thanks, merging to master/2.2/2.1/2.0!

@ConeyLiu
Copy link
Contributor Author

ConeyLiu commented May 25, 2017

thanks @cloud-fan @viirya for reviewing.

@ConeyLiu ConeyLiu deleted the SPARK-20250 branch May 25, 2017 08:10
jzhuge pushed a commit to jzhuge/spark that referenced this pull request Aug 20, 2018
…spilling data

Currently, when a task is calling spill() but it receives a killing request from driver (e.g., speculative task), the `TaskMemoryManager` will throw an `OOM` exception.  And we don't catch `Fatal` exception when a error caused by `Thread.interrupt`. So for `ClosedByInterruptException`, we should throw `RuntimeException` instead of `OutOfMemoryError`.

https://issues.apache.org/jira/browse/SPARK-20250?jql=project%20%3D%20SPARK

Existing unit tests.

Author: Xianyang Liu <[email protected]>

Closes apache#18090 from ConeyLiu/SPARK-20250.

(cherry picked from commit 731462a)
Signed-off-by: Wenchen Fan <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants