Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-8280][SPARK-8281][SQL]Handle NaN, null and Infinity in math #7451

Closed
wants to merge 3 commits into from

Conversation

yjshen
Copy link
Member

@yjshen yjshen commented Jul 16, 2015

@yjshen
Copy link
Member Author

yjshen commented Jul 16, 2015

Log family semantic comparision:

Expression Hive Spark SQL
ln(0.0) NULL -Infinity
ln(-1) NULL NaN

Since we disable udf7 to enable log base between (0.0, 1], we do not have log related comparsion with Hive. So, what about other semantics of Log?

@@ -387,27 +370,18 @@ case class Atan2(left: Expression, right: Expression)

protected override def nullSafeEval(input1: Any, input2: Any): Any = {
// With codegen, the values returned by -0.0 and 0.0 are different. Handled with +0.0
val result = math.atan2(input1.asInstanceOf[Double] + 0.0, input2.asInstanceOf[Double] + 0.0)
if (result.isNaN) null else result
math.atan2(input1.asInstanceOf[Double] + 0.0, input2.asInstanceOf[Double] + 0.0)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still need + 0.0 here, math.atan2 is calling java.lang.Math.atan2 inside

@yjshen
Copy link
Member Author

yjshen commented Jul 16, 2015

cc @rxin

@rxin
Copy link
Contributor

rxin commented Jul 16, 2015

Let's follow Hive for log, and make sure we add a unit test.

@yjshen
Copy link
Member Author

yjshen commented Jul 17, 2015

So we behave like Hive except that we do support log base in between (0.0, 1.0], and add a unit test like udf7

@rxin
Copy link
Contributor

rxin commented Jul 17, 2015

Yup.

@rxin
Copy link
Contributor

rxin commented Jul 17, 2015

Jenkins, add to whitelist.

@rxin
Copy link
Contributor

rxin commented Jul 17, 2015

Jenkins, ok to test.

@rxin
Copy link
Contributor

rxin commented Jul 17, 2015

Jenkins, test this please.

if ($c1 <= 0.0 || $c2 <= 0.0) {
${ev.isNull} = true;
} else {
${ev.primitive} = java.lang.Math.log($c2) / java.lang.Math.log($c1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missed a space here

@rxin
Copy link
Contributor

rxin commented Jul 17, 2015

Jenkins, retest this please.

@SparkQA
Copy link

SparkQA commented Jul 17, 2015

Test build #37596 has finished for PR 7451 at commit fb73b87.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • abstract class UnaryLogExpression(f: Double => Double, name: String)
    • case class Log(child: Expression) extends UnaryLogExpression(math.log, "LOG")
    • case class Log10(child: Expression) extends UnaryLogExpression(math.log10, "LOG10")
    • case class Log1p(child: Expression) extends UnaryLogExpression(math.log1p, "LOG1P")

@SparkQA
Copy link

SparkQA commented Jul 17, 2015

Test build #37605 has finished for PR 7451 at commit 003ab23.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class RFormula(override val uid: String)
    • abstract class LeafExpression extends Expression
    • abstract class UnaryExpression extends Expression
    • abstract class BinaryExpression extends Expression
    • abstract class UnaryLogExpression(f: Double => Double, name: String)
    • case class Log(child: Expression) extends UnaryLogExpression(math.log, "LOG")
    • case class Log10(child: Expression) extends UnaryLogExpression(math.log10, "LOG10")
    • case class Log1p(child: Expression) extends UnaryLogExpression(math.log1p, "LOG1P")
    • case class Length(child: Expression) extends UnaryExpression with ExpectsInputTypes
    • case class FormatNumber(x: Expression, d: Expression)
    • abstract class BinaryNode extends LogicalPlan

@yjshen
Copy link
Member Author

yjshen commented Jul 17, 2015

rebased the code to resolve merge conflicts.

@rxin
Copy link
Contributor

rxin commented Jul 17, 2015

Jenkins, retest this please.

@SparkQA
Copy link

SparkQA commented Jul 17, 2015

Test build #37646 has finished for PR 7451 at commit a83b47c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • abstract class UnsafeProjection extends Projection
    • case class FromUnsafeProjection(fields: Seq[DataType]) extends Projection
    • abstract class BaseProjection extends Projection
    • class SpecificProjection extends $
    • class SpecificProjection extends $
    • abstract class UnaryLogExpression(f: Double => Double, name: String)
    • case class Log(child: Expression) extends UnaryLogExpression(math.log, "LOG")
    • case class Log10(child: Expression) extends UnaryLogExpression(math.log10, "LOG10")
    • case class Log1p(child: Expression) extends UnaryLogExpression(math.log1p, "LOG1P")

@rxin
Copy link
Contributor

rxin commented Jul 17, 2015

Oops there is another conflict. I tried resolving this locally but it was a big one. Can you resolve it?

@yjshen
Copy link
Member Author

yjshen commented Jul 17, 2015

yes, I've resolved it locally, push here again in a minute.

@rxin
Copy link
Contributor

rxin commented Jul 17, 2015

Jenkins, retest this please.

1 similar comment
@rxin
Copy link
Contributor

rxin commented Jul 17, 2015

Jenkins, retest this please.

@rxin
Copy link
Contributor

rxin commented Jul 18, 2015

Thanks - I've merged this.

@asfgit asfgit closed this in 529a2c2 Jul 18, 2015
@SparkQA
Copy link

SparkQA commented Jul 18, 2015

Test build #1097 has finished for PR 7451 at commit 47a529d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • abstract class UnaryLogExpression(f: Double => Double, name: String)
    • case class Conv(numExpr: Expression, fromBaseExpr: Expression, toBaseExpr: Expression)
    • case class Log(child: Expression) extends UnaryLogExpression(math.log, "LOG")
    • case class Log10(child: Expression) extends UnaryLogExpression(math.log10, "LOG10")
    • case class Log1p(child: Expression) extends UnaryLogExpression(math.log1p, "LOG1P")

@SparkQA
Copy link

SparkQA commented Jul 18, 2015

Test build #37676 has finished for PR 7451 at commit 47a529d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • abstract class UnaryLogExpression(f: Double => Double, name: String)
    • case class Log(child: Expression) extends UnaryLogExpression(math.log, "LOG")
    • case class Log10(child: Expression) extends UnaryLogExpression(math.log10, "LOG10")
    • case class Log1p(child: Expression) extends UnaryLogExpression(math.log1p, "LOG1P")

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants