[SPARK-23312][SQL][FOLLOWUP] add a config to turn off vectorized cache reader

## What changes were proposed in this pull request?

apache#20483 tried to provide a way to turn off the new columnar cache reader, to restore the behavior in 2.2. However, even when we turn off that config, the behavior is still different from 2.2.

If the output data are rows, we still enable whole stage codegen for the scan node, which differs from 2.2. We should fix this as well.
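
The rule this patch introduces can be sketched with a small standalone model. This is only an illustration, not Spark's code: `ScanModel` and its three boolean fields are hypothetical stand-ins for the config flag, the schema check, and the field-count check that feed `supportsBatch` in `InMemoryTableScanExec`. The only part taken directly from the diff is that `supportCodegen` is defined as `supportsBatch`.

```scala
// Toy model of the scan node's decision. NOT Spark's classes: every name
// here except supportsBatch/supportCodegen is a hypothetical stand-in.
final case class ScanModel(
    vectorizedReaderEnabled: Boolean, // assumed to mirror the cache reader config
    schemaQualifies: Boolean,         // assumed stand-in for the per-field type check
    tooManyFields: Boolean) {         // assumed stand-in for isTooManyFields(...)

  // Columnar batches are produced only when every condition holds.
  val supportsBatch: Boolean =
    vectorizedReaderEnabled && schemaQualifies && !tooManyFields

  // The change in this commit: the scan joins whole stage codegen
  // only when it outputs batches, never when it outputs rows.
  def supportCodegen: Boolean = supportsBatch
}

object ScanModelDemo {
  def main(args: Array[String]): Unit = {
    val readerOn = ScanModel(vectorizedReaderEnabled = true, schemaQualifies = true, tooManyFields = false)
    val readerOff = readerOn.copy(vectorizedReaderEnabled = false)
    println(s"reader on:  batch=${readerOn.supportsBatch}, codegen=${readerOn.supportCodegen}")
    println(s"reader off: batch=${readerOff.supportsBatch}, codegen=${readerOff.supportCodegen}")
  }
}
```

Under this model, turning the reader off disables both batch output and codegen for the scan, which is the 2.2 behavior the description aims to restore.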

## How was this patch tested?

existing tests.

Author: Wenchen Fan <[email protected]>

Closes apache#20513 from cloud-fan/cache.
cloud-fan authored and Robert Kruszewski committed Feb 12, 2018
1 parent 7257b91 commit 7baf853
Showing 3 changed files with 6 additions and 2 deletions.
@@ -61,6 +61,9 @@ case class InMemoryTableScanExec(
     }) && !WholeStageCodegenExec.isTooManyFields(conf, relation.schema)
   }

+  // TODO: revisit this. Shall we always turn off whole stage codegen if the output data are rows?
+  override def supportCodegen: Boolean = supportsBatch
+
   override protected def needsUnsafeRowConversion: Boolean = false

   private val columnIndices =
@@ -787,7 +787,8 @@ class CachedTableSuite extends QueryTest with SQLTestUtils with SharedSQLContext
       withSQLConf(SQLConf.CACHE_VECTORIZED_READER_ENABLED.key -> vectorized.toString) {
         val df = spark.range(10).cache()
         df.queryExecution.executedPlan.foreach {
-          case i: InMemoryTableScanExec => assert(i.supportsBatch == vectorized)
+          case i: InMemoryTableScanExec =>
+            assert(i.supportsBatch == vectorized && i.supportCodegen == vectorized)
           case _ =>
         }
       }
@@ -137,7 +137,7 @@ class WholeStageCodegenSuite extends QueryTest with SharedSQLContext {
     val dsStringFilter = dsString.filter(_ == "1")
     val planString = dsStringFilter.queryExecution.executedPlan
     assert(planString.collect {
-      case WholeStageCodegenExec(FilterExec(_, i: InMemoryTableScanExec)) if !i.supportsBatch => ()
+      case i: InMemoryTableScanExec if !i.supportsBatch => ()
     }.length == 1)
     assert(dsStringFilter.collect() === Array("1"))
   }
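
The reason the `WholeStageCodegenSuite` pattern had to be loosened can be illustrated with a toy plan tree. This is a sketch under assumptions: `Plan`, `Scan`, `Filter`, `Adapter` and `WholeStage` are hypothetical stand-ins, not Spark's operators, and the "after" plan shape (an adapter node between the filter and the row-based scan) is assumed for illustration rather than taken from the commit.

```scala
// Toy plan tree. NOT Spark's classes; all node types are hypothetical.
sealed trait Plan {
  def children: Seq[Plan]

  // Pre-order traversal keeping every node the partial function matches.
  def collect[A](pf: PartialFunction[Plan, A]): Seq[A] =
    pf.lift(this).toSeq ++ children.flatMap(_.collect(pf))
}
final case class Scan(supportsBatch: Boolean) extends Plan { val children: Seq[Plan] = Nil }
final case class Filter(child: Plan) extends Plan { val children: Seq[Plan] = Seq(child) }
final case class Adapter(child: Plan) extends Plan { val children: Seq[Plan] = Seq(child) }
final case class WholeStage(child: Plan) extends Plan { val children: Seq[Plan] = Seq(child) }

object PlanMatchDemo {
  // Shape of the old assertion: the row-based scan must sit directly
  // under a filter inside a fused stage.
  def oldPattern(p: Plan): Int =
    p.collect { case WholeStage(Filter(s: Scan)) if !s.supportsBatch => () }.length

  // Shape of the new assertion: any row-based scan anywhere in the plan.
  def newPattern(p: Plan): Int =
    p.collect { case s: Scan if !s.supportsBatch => () }.length

  // Before the patch: scan fused into the stage together with the filter.
  val before: Plan = WholeStage(Filter(Scan(supportsBatch = false)))
  // After the patch (assumed shape): scan is outside codegen, behind an adapter.
  val after: Plan = WholeStage(Filter(Adapter(Scan(supportsBatch = false))))

  def main(args: Array[String]): Unit = {
    println(s"before: old=${oldPattern(before)}, new=${newPattern(before)}")
    println(s"after:  old=${oldPattern(after)}, new=${newPattern(after)}")
  }
}
```

The looser pattern only checks that a row-based scan exists, so it does not depend on where the codegen boundary falls.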