[SPARK-16251][SPARK-20200][Core][Test]Flaky test: org.apache.spark.rdd.LocalCheckpointSuite.missing checkpoint block fails with informative message #18314

Closed
wants to merge 2 commits into from

Conversation

jiangxb1987
Contributor

What changes were proposed in this pull request?

Currently the test does not wait to confirm that the block has been removed from the slave's BlockManager. If the removal takes too long, the assertion in this test case fails.
The failure is easy to reproduce by sleeping for a while before removing the block in BlockManagerSlaveEndpoint.receiveAndReply().
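
The race described above can be illustrated outside Spark. The following is a minimal, self-contained sketch, not the patch itself: `FakeBlockStore`, `removeAsync`, and `waitUntil` are hypothetical names introduced for illustration, and `waitUntil` only approximates what ScalaTest's `eventually` does in the actual test.

```scala
import java.util.concurrent.ConcurrentHashMap
import scala.concurrent.{ExecutionContext, Future}
import scala.concurrent.duration._

object AsyncRemovalSketch {
  // Hypothetical stand-in for a block store whose removal completes asynchronously.
  // This is NOT Spark's BlockManager; it only models "remove returns before the block is gone".
  final class FakeBlockStore(removalDelay: FiniteDuration)(implicit ec: ExecutionContext) {
    private val blocks = ConcurrentHashMap.newKeySet[String]()

    def put(id: String): Unit = { blocks.add(id); () }

    def contains(id: String): Boolean = blocks.contains(id)

    // Returns immediately; the block disappears only after removalDelay has passed.
    def removeAsync(id: String): Future[Unit] = Future {
      Thread.sleep(removalDelay.toMillis)
      blocks.remove(id)
      ()
    }
  }

  // Polls `condition` every `interval` until it holds or `timeout` elapses.
  // Returns the last observed value of the condition.
  def waitUntil(timeout: FiniteDuration, interval: FiniteDuration)(condition: => Boolean): Boolean = {
    val deadline = timeout.fromNow
    var ok = condition
    while (!ok && deadline.hasTimeLeft()) {
      Thread.sleep(interval.toMillis)
      ok = condition
    }
    ok
  }
}
```

With a 200 ms removal delay, checking `store.contains(id)` right after `removeAsync(id)` would typically still see the block, whereas `waitUntil(2.seconds, 50.millis)(!store.contains(id))` keeps polling until the removal has landed or the timeout expires.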

How was this patch tested?

N/A

@@ -168,6 +172,10 @@ class LocalCheckpointSuite extends SparkFunSuite with LocalSparkContext {
    // Collecting the RDD should now fail with an informative exception
    val blockId = RDDBlockId(rdd.id, numPartitions - 1)
    bmm.removeBlock(blockId)
    // Wait until the block has been removed successfully.
    eventually(timeout(1 seconds), interval(100 milliseconds)) {
      assert(bmm.getBlockStatus(blockId).size == 0)

Member

Use isEmpty or ===

Member

@HyukjinKwon HyukjinKwon Jun 15, 2017

@srowen, do you happen to know why === is preferred over ==? To my knowledge, === is preferred because it gives a better error message, and I have seen several committers say so in earlier comments.

However, it seems that not everyone agrees. I raised an issue about a year ago (databricks/scala-style-guide#36), but I am still confused.

Member

I was just curious. === might still be preferred per the documentation.
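
To make the point about error messages concrete, here is a small sketch that uses only the standard library. It does not use ScalaTest: `assertEqualWithValues` and `failureMessage` are hypothetical helpers that only mimic the idea of an assertion that reports both operands, in contrast to a plain Boolean assertion such as `Predef.assert`. How ScalaTest's own `==` and `===` behave depends on the ScalaTest version in use.

```scala
object EqualityMessageSketch {
  // Hypothetical helper: fails with a message that includes both operands,
  // mimicking the kind of message a value-aware equality assertion can give.
  def assertEqualWithValues[A](actual: A, expected: A): Unit =
    if (actual != expected) {
      throw new AssertionError(s"$actual did not equal $expected")
    }

  // Runs `body` and returns the AssertionError message if it fails, None otherwise.
  def failureMessage(body: => Unit): Option[String] =
    try { body; None } catch { case e: AssertionError => Some(e.getMessage) }
}
```

A plain `Predef.assert(statuses.size == 0)` only sees a Boolean, so its failure message cannot mention the actual size, while `assertEqualWithValues(statuses.size, 0)` reports both values.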

@SparkQA

SparkQA commented Jun 15, 2017

Test build #78095 has finished for PR 18314 at commit 7515a5b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 15, 2017

Test build #78098 has finished for PR 18314 at commit 4c792b9.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

asfgit pushed a commit that referenced this pull request Jun 15, 2017
…dd.LocalCheckpointSuite.missing checkpoint block fails with informative message

## What changes were proposed in this pull request?

Currently the test does not wait to confirm that the block has been removed from the slave's BlockManager. If the removal takes too long, the assertion in this test case fails.
The failure is easy to reproduce by sleeping for a while before removing the block in BlockManagerSlaveEndpoint.receiveAndReply().

## How was this patch tested?
N/A

Author: Xingbo Jiang <[email protected]>

Closes #18314 from jiangxb1987/LocalCheckpointSuite.

(cherry picked from commit 7dc3e69)
Signed-off-by: Wenchen Fan <[email protected]>
@cloud-fan
Contributor

LGTM, merging to master/2.2/2.1/2.0

@asfgit asfgit closed this in 7dc3e69 Jun 15, 2017
dataknocker pushed a commit to dataknocker/spark that referenced this pull request Jun 16, 2017
@jiangxb1987 jiangxb1987 deleted the LocalCheckpointSuite branch June 19, 2017 15:33