Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-22937][SQL] SQL elt output binary for binary inputs #20135

Closed
wants to merge 3 commits into from

Conversation

maropu
Copy link
Member

@maropu maropu commented Jan 2, 2018

What changes were proposed in this pull request?

This pr modified elt to output binary for binary inputs.
elt in the current master always output data as a string. But, in some databases (e.g., MySQL), if all inputs are binary, elt also outputs binary (Also, this might be a small surprise).
This pr is related to #19977.

How was this patch tested?

Added tests in SQLQueryTestSuite and TypeCoercionSuite.

@@ -58,7 +58,7 @@ case class Concat(children: Seq[Expression]) extends Expression {
} else {
val childTypes = children.map(_.dataType)
if (childTypes.exists(tpe => !Seq(StringType, BinaryType).contains(tpe))) {
TypeCheckResult.TypeCheckFailure(
return TypeCheckResult.TypeCheckFailure(
s"input to function $prettyName should have StringType or BinaryType, but it's " +
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not related to this pr though, this might be a bug I left in #19977.
(Since branch-2.3 has already been cut out, is it better to split this from this pr and backport into branch-2.3? @gatorsmile).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Also create a test case

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

override def dataType: DataType = StringType

override def inputTypes: Seq[DataType] = IntegerType +: Seq.fill(children.size - 1)(StringType)
override def dataType: DataType = inputExprs.map(_.dataType).headOption.getOrElse(StringType)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we return null of BinaryType when eltOutputAsString is false and there's only 1 parameter for Elt?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, but I miss your point. What's the type of the only 1 parameter?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant for expression elt(1) with eltOutputAsString as false, since the index is out of range. Is it better to make the result null of BinaryType? Now I think your solution makes more sense.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We issue an exception when the input argument is 1.

case c @ Elt(children) =>
val index = children.head
val newIndex = ImplicitTypeCasts.implicitCast(index, IntegerType).getOrElse(index)
c.copy(children = newIndex +: children.tail)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is is better to merge the common parts of the last 2 cases? We can add one more nested pattern match, but I'm not sure this is a better way.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll update soon.

@maropu
Copy link
Member Author

maropu commented Jan 2, 2018

I'll update the migration guide, too.

@SparkQA
Copy link

SparkQA commented Jan 2, 2018

Test build #85597 has finished for PR 20135 at commit 7330438.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • case class EltCoercion(conf: SQLConf) extends TypeCoercionRule
  • case class Elt(children: Seq[Expression]) extends Expression

@SparkQA
Copy link

SparkQA commented Jan 2, 2018

Test build #85603 has finished for PR 20135 at commit 2aeb56d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jan 4, 2018

Test build #85661 has finished for PR 20135 at commit 1ac512b.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu
Copy link
Member Author

maropu commented Jan 4, 2018

retest this please

@SparkQA
Copy link

SparkQA commented Jan 4, 2018

Test build #85673 has finished for PR 20135 at commit 1ac512b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu
Copy link
Member Author

maropu commented Jan 6, 2018

@gatorsmile

@gatorsmile
Copy link
Member

LGTM

@gatorsmile
Copy link
Member

Thanks! Merged to master/2.3

asfgit pushed a commit that referenced this pull request Jan 6, 2018
## What changes were proposed in this pull request?
This pr modified `elt` to output binary for binary inputs.
`elt` in the current master always output data as a string. But, in some databases (e.g., MySQL), if all inputs are binary, `elt` also outputs binary (Also, this might be a small surprise).
This pr is related to #19977.

## How was this patch tested?
Added tests in `SQLQueryTestSuite` and `TypeCoercionSuite`.

Author: Takeshi Yamamuro <[email protected]>

Closes #20135 from maropu/SPARK-22937.

(cherry picked from commit e8af7e8)
Signed-off-by: gatorsmile <[email protected]>
@asfgit asfgit closed this in e8af7e8 Jan 6, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants