-
Notifications
You must be signed in to change notification settings - Fork 24.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SQL: Implement CASE... WHEN... THEN... ELSE... END #41349
Conversation
Implement the ANSI SQL CASE expression which provides the if/else functionality common to most programming languages. The CASE expression can have multiple WHEN branches and becomes a powerful tool for SQL queries as it can be used in SELECT, WHERE, GROUP BY, HAVING and ORDER BY clauses. Closes: elastic#36200
Pinging @elastic/es-search |
@@ -25,6 +25,7 @@ STDDEV_POP |AGGREGATE | |||
SUM_OF_SQUARES |AGGREGATE | |||
VAR_POP |AGGREGATE | |||
HISTOGRAM |GROUPING | |||
CASE |CONDITIONAL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you agree with listing CASE as a conditional function?
Case is quite special and doesn't follow at all the traditional functional format.
private final Expression defaultElse; | ||
|
||
@SuppressWarnings("unchecked") | ||
public Case(Source source, List<Expression> expressions) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This "weird" ctor is here as public instead of a more useful (for the ExpressionBuilder
) one which would like:
public Case(Source source, List<IfConditional> conditionals, Expression defaultElse)
because of FunctionRegistry which gets confused if there are 2 public ctors, and obviously the one with IfConditional
cannot be used in the FunctionRegistry
.
super(source, expressions); | ||
this.conditions = (List<IfConditional>) (List<?>) expressions.subList(0, expressions.size() - 1); | ||
this.defaultElse = expressions.get(expressions.size() - 1); | ||
setDataType(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will apply the same logic as here: https://github.com/elastic/elasticsearch/pull/41355/files#diff-026c7791c54d1c165a2cd3f516a500beR36 once that PR is merged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work! I left some comments.
|
||
*Input*: | ||
|
||
Multiple but at least one WHEN *condition* THEN *result* clause and optional ELSE *defaultResult* clause. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sentence sounds weird. Maybe get rid of "Multiple" and rephrase?
At least one _WHEN *condition* THEN *result*_ clause is required and the expression can optionally have an _ELSE *default_result*_ clause. Every *condition* must be boolean expression.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rephrase with something like "One or multiple WHEN ... are used and the ..."
Multiple but at least one WHEN *condition* THEN *result* clause and optional ELSE *defaultResult* clause. | ||
Every *condition* should be boolean expression. | ||
|
||
*Output*: one of the *result* expressions if the corresponding WHEN *condition* evaluates to `true`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
*Output*: one of the *result* expressions if the corresponding _WHEN *condition*_ evaluates to
trueor the *default_result* if all _WHEN *condition*_ clauses evaluate to
false. If the optional _ELSE *default_result*_ clause is missing and all _WHEN *condition*_ clauses evaluate to
falsethen
null is returned.
Case c = (Case) e; | ||
|
||
// Remove or foldable conditions that fold to FALSE | ||
// Stop at the 1st foldable condition taht folds to TRUE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo: that
.
|
||
@Override | ||
protected Case mutate(Case instance) { | ||
return randomCase(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this mutate
logic will always have a changed, non-equal to original, instance of CASE. I think the mutation should have as many IfConditional
instances as constructor values that can change (2 in this case with randomStringLiteral()
). And from Case
point of view, it will also need to mutate the randomIntLiteral()
parameter as well. Something like this: https://github.com/elastic/elasticsearch/blob/master/x-pack/plugin/sql/src/test/java/org/elasticsearch/xpack/sql/expression/function/scalar/string/ConcatFunctionPipeTests.java#L96
} | ||
|
||
@Override | ||
public void testTransform() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this one empty? Same for testReplaceChildren
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added this Class as a quick fix to the failing NodeSubclassTests (because of the special IfConditional as an expression) and forgot to properly implement everything there.
Fixed.
|
||
@Override | ||
protected TypeResolution resolveType() { | ||
DataType expectedResultDataType = null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not initializing expectedResultDataType directly with defaultElse().dataType()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because we want to match the results of the THEN <result>
clauses to the 1st non-null result of them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't that happen anyway in the following loop (that checks the conditions)?
} | ||
|
||
for (IfConditional conditional : conditions) { | ||
dataType = DataTypeConversion.commonType(dataType, conditional.dataType()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case conditions.isEmpty() == false
the first commonType
check is with itself, isn't it? Maybe worth doing a for
with an index that starts either at 0
or 1
depending on the previous isEmpty()
check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, I would just start with the defaultElse() type and then iterate over the conditions - this avoid repetition.
SELECT emp_no, CASE WHEN emp_no - 10000 < 10 THEN 'First 10' ELSE 'Second 10' END as "case" FROM test_emp WHERE emp_no >= 10005 | ||
ORDER BY emp_no LIMIT 10; | ||
|
||
emp_no | case |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Following this PR - #41355 - can you change one of these tests to not have an alias?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. This feature will make a lot of folks happy.
|
||
*Input*: | ||
|
||
Multiple but at least one WHEN *condition* THEN *result* clause and optional ELSE *defaultResult* clause. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rephrase with something like "One or multiple WHEN ... are used and the ..."
@Override | ||
public boolean foldable() { | ||
return defaultElse.foldable() && (conditions.isEmpty() || | ||
(conditions.size() == 1 && conditions.get(0).condition().foldable() && conditions.get(0).result().foldable())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this method is correct - why is the defaultElse checked first - if a condition is foldable and true, defaultElse
is not relevant.
I think it should be:
return (conditions.isEmpty() && defaultElse.foldable()) || (conditions.size() > 0 && conditions.get(0).condition().foldable());
return defaultElse.fold(); | ||
} | ||
if (conditions.get(0).condition().fold() == Boolean.TRUE) { | ||
return conditions.get(0).result().fold(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The defaultElse.fold() repetition should be avoided:
if (conditions.size() > 0 && conditions.get(0).condition().fold() == Boolean.TRUE) {
...
}
return defaultElse.fold();
} | ||
|
||
for (IfConditional conditional : conditions) { | ||
dataType = DataTypeConversion.commonType(dataType, conditional.dataType()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, I would just start with the defaultElse() type and then iterate over the conditions - this avoid repetition.
import java.util.Objects; | ||
|
||
/** | ||
* Helper function (cannot be directly from a query) to model a {@code WHEN <condition> ELSE <result>} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Considering this does not extend Function
I wouldn't call it a Helper function
- instead a conditional expression mapping WHEN ELSE
backing CASE
and potentially IIF
this(source, Arrays.asList(condition, result)); | ||
} | ||
|
||
private IfConditional(Source source, List<Expression> children) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this constructor needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No reason after all, it's a leftover from a different design. changing it.
private final Expression result; | ||
|
||
public IfConditional(Source source, Expression condition, Expression result) { | ||
this(source, Arrays.asList(condition, result)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this(source, asList(condition,result));
this.condition = condition;
this.result = result;
if (newChildren.size() < 2) { | ||
throw new IllegalArgumentException("expected at least [2] children but received [" + newChildren.size() + "]"); | ||
} | ||
return new IfConditional(source(), newChildren); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IfConditional(source(), newChildren.get(0), newChildren(1));
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
*Input*: | ||
|
||
One or multiple _WHEN *condition* THEN *result_* clauses are used and the expression can optionally have | ||
an _ELSE *default_result_* clause. Every *condition* should be boolean expression. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be a boolean expression
Implement the ANSI SQL CASE expression which provides the if/else functionality common to most programming languages. The CASE expression can have multiple WHEN branches and becomes a powerful tool for SQL queries as it can be used in SELECT, WHERE, GROUP BY, HAVING and ORDER BY clauses. Closes: #36200 (cherry picked from commit 8b25774)
Add a TIP on how to use CASE to achieve custom bucketing with GROUP BY. Follows: elastic#41349
* SQL: [Docs] Add example for custom bucketing with CASE Add a TIP on how to use CASE to achieve custom bucketing with GROUP BY. Follows: #41349 * address comments * address comment
Implement the ANSI SQL CASE expression which provides the if/else functionality common to most programming languages. The CASE expression can have multiple WHEN branches and becomes a powerful tool for SQL queries as it can be used in SELECT, WHERE, GROUP BY, HAVING and ORDER BY clauses. Closes: elastic#36200
* SQL: [Docs] Add example for custom bucketing with CASE Add a TIP on how to use CASE to achieve custom bucketing with GROUP BY. Follows: elastic#41349 * address comments * address comment
Implement the ANSI SQL CASE expression which provides the if/else
functionality common to most programming languages.
The CASE expression can have multiple WHEN branches and becomes a
powerful tool for SQL queries as it can be used in SELECT, WHERE,
GROUP BY, HAVING and ORDER BY clauses.
Closes: #36200