I wish we could support StddevSamp with cast(col as double) as input.
For example:
scala> spark.sql("select stddev_samp(cast(1 as double))").collect
It will show:
!Expression <StddevSamp> stddev_samp(1.0) cannot run on GPU because input expression Literal 1.0 (DoubleType is not supported); expression StddevSamp stddev_samp(1.0) produces an unsupported type DoubleType
I just checked the expression carefully. Indeed, neither the plugin nor libcudf supports stddev in a reduction context. It only works in groupby and windowing contexts.
In the long term, we should support it. In the meantime, there is a simple workaround: append a trivial key column to the input (such as an integer column with all 0 values), then group by that key column. For example:
scala> val df = Seq((1.0, 0),(2.0, 0)).toDF
df: org.apache.spark.sql.DataFrame = [_1: double, _2: int]
scala> df.groupBy("_2").agg(stddev("_1")).show
23/03/15 19:04:51 WARN GpuOverrides:
*Exec <CollectLimitExec> will run on GPU
*Partitioning <SinglePartition$> will run on GPU
*Exec <HashAggregateExec> will run on GPU
*Expression <AggregateExpression> stddev_samp(_1#164) will run on GPU
...
+---+---------------+
| _2|stddev_samp(_1)|
+---+---------------+
|  0|    0.707106781|
+---+---------------+
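The same workaround can be applied to an existing DataFrame without rebuilding its rows. The following is only a sketch under assumptions: the helper `stddevViaConstantKey` and the column name `__dummy_key` are hypothetical names chosen for illustration, and whether the optimizer treats a literal key the same way as the materialized key column above should be confirmed in the GpuOverrides log.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, lit, stddev_samp}

object StddevWorkaround {
  // Hypothetical helper (not part of Spark or the plugin): computes
  // stddev_samp over the whole DataFrame by grouping on a constant key,
  // so the aggregate is expressed as a group-by instead of a reduction.
  def stddevViaConstantKey(df: DataFrame, valueCol: String): DataFrame = {
    // Assumed name; it must not clash with an existing column.
    val keyCol = "__dummy_key"
    df.withColumn(keyCol, lit(0))            // every row gets the same key
      .groupBy(keyCol)                       // a single group covers all rows
      .agg(stddev_samp(col(valueCol)).as("stddev_samp"))
      .drop(keyCol)                          // hide the helper column again
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("stddev-workaround")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Same two values as in the example above.
    val df = Seq(1.0, 2.0).toDF("value")
    stddevViaConstantKey(df, "value").show()

    spark.stop()
  }
}
```

One difference from a true reduction: on empty input, a group-by returns no rows, whereas `select stddev_samp(...)` without grouping returns a single row.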