-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Kernel] Return unsupported expression in partition pruning in remaining predicate #2491
Conversation
0a18d35
to
9423b8f
Compare
try { | ||
exprHandler.getPredicateEvaluator(tableSchema, predicate); | ||
return false; | ||
} catch (UnsupportedOperationException ex) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wondering if we should throw a defined exception like UnsupportedExpressionException
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets leave it as is for now. I think its better to think through all different exceptions and take a call if you want to make custom exceptions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but this is a bad pattern, to use exceptions for control flow. I think we should consider defining the method in ExpressionHandler which validates whether an expression is supported or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, also worried about the cost of creating an evaluator in some engines. Is this API ok?
/**
* Is the given expression evaluation supported on the data with given schema?
*
* @param inputSchema Schema of input data on which the expression is evaluated.
* @param expression Expression to evaluate.
* @param outputType Expected result data type.
* @return true if supported and false otherwise.
*/
boolean isSupported(StructType inputSchema, Expression expression, DataType outputType);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lets make that a separate PR. I think we need to think about whether we want the schema as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I can make a separate PR. I think we need it because the connector may support an expression but it may only support it on a selected input data types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
new Predicate(name, children.asJava) | ||
} | ||
private def and(left: Predicate, right: Predicate): Predicate = predicate("AND", left, right) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these will be removed once #2492 is merged
kernel/kernel-api/src/main/java/io/delta/kernel/internal/ScanImpl.java
Outdated
Show resolved
Hide resolved
// Subset of the given predicate that Kernel can't guarantee that it can complete satisfy | ||
// It could be predicate on the data columns and/or unsupported predicate on partition columns |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few questions
- What does
satisfy
mean? Can we be more specific and clear? - Why can't we handle a predicate on the data columns? We have column stats and data skipping?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a comment saying how this is all in relation to what we can evaluate just using input X (scan file path or whatever) and without reading data
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a best-effort predicate. Updated the comment.
public static Predicate combineWithAndOp(Predicate left, Predicate right) { | ||
String leftName = left.getName().toUpperCase(); | ||
String rightName = right.getName().toUpperCase(); | ||
if (leftName.equals("ALWAYS_FALSE") || rightName.equals("ALWAYS_FALSE")) { | ||
return ALWAYS_FALSE; | ||
} | ||
if (leftName.equals("ALWAYS_TRUE")) { | ||
return right; | ||
} | ||
if (rightName.equals("ALWAYS_TRUE")) { | ||
return left; | ||
} | ||
return new And(left, right); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something feels off here. Why are we doing string comparisons? Isn't everything strongly typed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Our expressions name (=
, like
etc) based , we don't enforce any restrictions on the expression name to allow the connector to pass through Kernel any arbitrary expressions which its own expression handler can support
* Utility method to check whether the given predicate has any expressions that are not | ||
* supported by the given expression handler. | ||
*/ | ||
private static boolean hasUnsupportedExpr( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about this method is specific to PartitionUtils
and makes it not be a part of ExpressionUtils
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is removed as part of the PR #2492 on which this PR is based on.
2b24241
to
33c45d2
Compare
if (areFiltersSplit) { | ||
return; | ||
} | ||
filter.map(predicate -> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: filter.ifPresent
, that's in JDK 8 right?
33c45d2
to
50da34d
Compare
Description
Currently the
DefaultExpressionHandler
only supports a few expressions. If a user passes an unsupported expression, Kernel partition pruning fails with an unsupported operation exception. Instead, this PR changes it to return the unsupported part of the expression in the remaining filter ofScan
. It makes use ofExpressionHandler.getPredicateEvaluator
to decide whether an expression is supported or not.How was this patch tested?
Unit tests and integration tests.