[SPARK-7957] Preserve partitioning when using randomSplit
cc JoshRosen
Thanks for noticing this!

Author: Burak Yavuz <[email protected]>

Closes apache#6509 from brkyvz/sample-perf-reg and squashes the following commits:

497465d [Burak Yavuz] addressed code review
293f95f [Burak Yavuz] [SPARK-7957] Preserve partitioning when using randomSplit
brkyvz authored and nemccarthy committed Jun 19, 2015
1 parent 217c1e0 commit 3b43a08
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions core/src/main/scala/org/apache/spark/rdd/RDD.scala
@@ -434,11 +434,11 @@ abstract class RDD[T: ClassTag](
    * @return A random sub-sample of the RDD without replacement.
    */
   private[spark] def randomSampleWithRange(lb: Double, ub: Double, seed: Long): RDD[T] = {
-    this.mapPartitionsWithIndex { case (index, partition) =>
+    this.mapPartitionsWithIndex( { (index, partition) =>
       val sampler = new BernoulliCellSampler[T](lb, ub)
       sampler.setSeed(seed + index)
       sampler.sample(partition)
-    }
+    }, preservesPartitioning = true)
   }
 
   /**
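
The effect of the `preservesPartitioning = true` flag in the diff above can be checked with a small local sketch. This is not part of the commit: the object name `PreservesPartitioningSketch`, the `run` helper, and the even-key filter (a stand-in for the sampler, which likewise only drops elements and never changes keys) are assumptions made for illustration. It assumes Spark is on the classpath and runs in local mode.

```scala
import org.apache.spark.{HashPartitioner, SparkConf, SparkContext}

// Hypothetical sketch, not code from the commit.
object PreservesPartitioningSketch {

  /** Returns (partitioner kept without the flag, partitioner kept with the flag). */
  def run(): (Boolean, Boolean) = {
    val conf = new SparkConf()
      .setMaster("local[2]")
      .setAppName("preserves-partitioning-sketch")
    val sc = new SparkContext(conf)
    try {
      // A pair RDD with an explicit partitioner.
      val base = sc.parallelize(1 to 100)
        .map(i => (i, i.toString))
        .partitionBy(new HashPartitioner(4))

      // Default (preservesPartitioning = false): the result has no partitioner.
      val withoutFlag = base.mapPartitionsWithIndex { (index, partition) =>
        partition.filter { case (k, _) => k % 2 == 0 }
      }

      // Flag set: safe here because filtering never changes a key,
      // so every surviving element is still in the partition its key hashes to.
      val withFlag = base.mapPartitionsWithIndex({ (index, partition) =>
        partition.filter { case (k, _) => k % 2 == 0 }
      }, preservesPartitioning = true)

      (withoutFlag.partitioner == base.partitioner,
       withFlag.partitioner == base.partitioner)
    } finally {
      sc.stop()
    }
  }

  def main(args: Array[String]): Unit = {
    val (keptWithoutFlag, keptWithFlag) = run()
    println(s"partitioner kept without flag: $keptWithoutFlag")
    println(s"partitioner kept with flag:    $keptWithFlag")
  }
}
```

When the partitioner survives, later key-based operations on the sampled RDD can reuse it rather than repartitioning, which is presumably the regression the commit title refers to.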