[BUG] GpuExpression columnarEval can return scalars from subqueries that may be unhandled #8303
Comments
We cannot/should not force |
True, we still need the ability to get scalars in some situations. Does it make sense to have two forms of expression eval: one that resolves to a scalar because a scalar is explicitly expected, and one that forces the result to be a columnar vector (e.g., via columnarEvalToColumn)? As it is now, we're likely to keep having to fix places that aren't prepared to handle both. If we don't want two eval forms, then this issue becomes a tracker to audit all the places where subexpressions are evaluated and update them to handle scalars (and hope we don't add more places that aren't properly dealing with scalars). |
It is not just sub-expressions. GpuCoalesce can also return a scalar and I think there may be others. But in general, yes I think what we want is two APIs. One that is guaranteed to return a This is not going to be a super simple fix. If you call the one that returns a ColumnVector you just need to handle that case. |
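To make the two-API idea concrete, here is a minimal toy sketch in Python. It is not the spark-rapids API: `Scalar`, `Column`, `Expr`, `eval_any`, and `eval_column` are hypothetical names invented for illustration, and plain Python lists stand in for GPU columns. The point is only that a caller who wants a column calls the guaranteed-column entry point, while a caller that can exploit scalars (like `Add` here) calls the other one and handles both cases.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Scalar:
    value: int  # toy stand-in for a GPU scalar


@dataclass
class Column:
    values: List[int]  # toy stand-in for a GPU column vector


class Expr:
    def eval_any(self, batch: List[int]) -> Union[Scalar, Column]:
        """May return a scalar or a column; callers must handle both."""
        raise NotImplementedError

    def eval_column(self, batch: List[int]) -> Column:
        """Guaranteed to return a column with one row per batch row."""
        result = self.eval_any(batch)
        if isinstance(result, Scalar):
            # Expand the scalar to the batch size (simple, not the fastest).
            return Column([result.value] * len(batch))
        return result


class Literal(Expr):
    def __init__(self, value: int):
        self.value = value

    def eval_any(self, batch):
        return Scalar(self.value)


class Ref(Expr):
    """Reads the single input column of the toy batch."""

    def eval_any(self, batch):
        return Column(list(batch))


class Add(Expr):
    """A call site that deliberately handles scalars without expanding them."""

    def __init__(self, left: Expr, right: Expr):
        self.left, self.right = left, right

    def eval_any(self, batch):
        l = self.left.eval_any(batch)
        r = self.right.eval_any(batch)
        if isinstance(l, Scalar) and isinstance(r, Scalar):
            return Scalar(l.value + r.value)
        if isinstance(l, Scalar):
            return Column([l.value + v for v in r.values])
        if isinstance(r, Scalar):
            return Column([v + r.value for v in l.values])
        return Column([a + b for a, b in zip(l.values, r.values)])
```

In this sketch, code that is not prepared for scalars only ever calls `eval_column`, so a scalar coming out of a subquery or a coalesce-like expression cannot reach it.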
I am keeping this issue open from the PR because there is follow on work that has to happen as P0:
|
List of binary expressions that throw from a
|
GpuTernaryExpressions are all string expressions, and there are fewer of them.
|
#8293 is an instance of a case where a subquery caused a sub-expression to be evaluated as a scalar in a context that expected a column vector. There are two ways to handle it: explode the scalar out to a full column the same size as the batch and proceed as if it were a column all along (not necessarily the most performant), or update all the code to treat scalars as scalars (which may involve hacking the scalar into a 1-row column, since most cudf functions operate on columns, not scalars).
If we want to go the "always explode the scalar into a full column" route, then we should make it so GpuExpression.columnarEval is incapable of returning a scalar -- e.g.: it will always wrap the real eval with something like GpuExpressionsUtils.columnarEvalToColumn.
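As a rough illustration of the "always explode" route, the toy Python sketch below wraps an eval function so that its result is always a column. Everything here is an assumption made for the example: `always_column`, `scalar_subquery`, `double`, and `sum_column` are hypothetical names, an `int` stands in for a scalar and a `list` for a column, and none of this is the actual spark-rapids or cudf code.

```python
from typing import Callable, List, Union

Scalar = int          # toy scalar
Column = List[int]    # toy column
EvalFn = Callable[[Column], Union[Scalar, Column]]


def always_column(raw_eval: EvalFn) -> Callable[[Column], Column]:
    """Wrap an eval so callers never see a scalar."""
    def wrapped(batch: Column) -> Column:
        result = raw_eval(batch)
        if isinstance(result, list):
            return result
        # Scalar result: repeat it once per row of the batch.
        return [result] * len(batch)
    return wrapped


def scalar_subquery(batch: Column) -> Scalar:
    """Hypothetical expression that resolves to a single value."""
    return 42


def double(batch: Column) -> Column:
    """Hypothetical expression that already produces a column."""
    return [2 * v for v in batch]


def sum_column(col: Column) -> int:
    """A consumer that only knows how to handle columns."""
    return sum(col)
```

Calling `sum_column(scalar_subquery(batch))` fails because the consumer receives a bare scalar, whereas `sum_column(always_column(scalar_subquery)(batch))` works. The cost is the extra memory and time spent materializing the repeated value, which is the performance concern raised above.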