Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-18467][SQL] Extracts method for preparing arguments from StaticInvoke, Invoke and NewInstance and modify to short circuit if arguments have null when needNullCheck == true. #15901

Closed
wants to merge 29 commits into from
Closed
Changes from 3 commits
Commits
Show all changes
29 commits
Select commit Hold shift + click to select a range
9bf9aa9
Introduce `InvokeLike` to extract common logic from `StaticInvoke`, `…
ueshin Nov 15, 2016
28f6200
Refactor to remove unneeded null checking and fix nullability of `New…
ueshin Nov 15, 2016
2f30f53
Modify to short circuit if arguments have null when propageteNull == …
ueshin Nov 16, 2016
9ac3f28
Add a comment for `prepareArguments()`.
ueshin Nov 17, 2016
5ad6966
Define a variable for `propagateNull && arguments.exists(_.nullable)`.
ueshin Nov 17, 2016
a0ac177
Add a missing param comment for `propagateNull`.
ueshin Nov 17, 2016
757d33e
Rename variable from `argsHaveNull` to `containsNullInArguments`.
ueshin Nov 17, 2016
240fde4
Modify `InvokeLike` not to extend `Expression` because no need to ext…
ueshin Nov 17, 2016
6f6e0b3
Fix comments.
ueshin Nov 17, 2016
2bd6e50
Modify for readability.
ueshin Nov 17, 2016
bcb93db
Make `InvokeLike` extend `Expression` and `NonSQLExpression`.
ueshin Nov 17, 2016
d448b60
Use `propagatingNull`.
ueshin Nov 17, 2016
8894a96
Use `needNullCheck` instread of `propagatingNull`.
ueshin Nov 17, 2016
ca4558f
Modify `prepareArguments()` to return the result of argument null che…
ueshin Nov 17, 2016
4d9a037
Skip evaluate arguments if `obj.isNull == true` in `Invoke`.
ueshin Nov 17, 2016
ebb1241
Modify to prepare for further optimization.
ueshin Nov 17, 2016
99a59b2
Revert a part of last 2 commits.
ueshin Nov 17, 2016
243888a
Revert "Revert a part of last 2 commits."
ueshin Nov 17, 2016
e12a9bd
Revert "Modify to prepare for further optimization."
ueshin Nov 17, 2016
2c52d91
Use `obj.isNull` instead of `ev.isNull`.
ueshin Nov 17, 2016
43d2693
Revert and use a simple case.
ueshin Nov 17, 2016
bd9c09f
Remove unused argument.
ueshin Nov 18, 2016
501095f
Remove unneeded if.
ueshin Nov 18, 2016
1baac55
Modify generated code of `Invoke`.
ueshin Nov 18, 2016
831c521
Use resultIsNull as ev.isNull in `NewInstance`.
ueshin Nov 18, 2016
0b210f8
Remove unneeded zipWithIndex.
ueshin Nov 18, 2016
f8acda6
Use local variables for `resultIsNull`s and `argValue`s.
ueshin Nov 18, 2016
fe2871c
Revert "Use local variables for `resultIsNull`s and `argValue`s."
ueshin Nov 18, 2016
c88a1ff
Optimize a code in `StaticInvoke`.
ueshin Nov 19, 2016
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -32,6 +32,73 @@ import org.apache.spark.sql.catalyst.expressions.codegen.{CodegenContext, ExprCo
import org.apache.spark.sql.catalyst.util.{ArrayBasedMapData, GenericArrayData}
import org.apache.spark.sql.types._

/**
* Common base class for [[StaticInvoke]], [[Invoke]], and [[NewInstance]].
*/
trait InvokeLike extends Expression {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it need to extend Expression?


def arguments: Seq[Expression]

def propagateNull: Boolean

def prepareArguments(ctx: CodegenContext, ev: ExprCode): (String, String, String) = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add comment for this method? It is not simple enough to skip the comment.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ev is not used.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, removed.


val argsHaveNull =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this variable name looks like a boolean flag, can you think of a better name?

if (propagateNull && arguments.exists(_.nullable)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

propagateNull && arguments.exists(_.nullable) is used many times, define a variable for it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

val argsHaveNull = ctx.freshName("argsHaveNull")
ctx.addMutableState("boolean", argsHaveNull, "")
argsHaveNull
} else {
""
}
val argValues = arguments.zipWithIndex.map { case (e, i) =>
val argValue = ctx.freshName("argValue")
ctx.addMutableState(ctx.javaType(e.dataType), argValue, "")
argValue
}

val argCodes =
if (propagateNull && arguments.exists(_.nullable)) {
s"$argsHaveNull = false;" +:
arguments.zipWithIndex.map { case (e, i) =>
val expr = e.genCode(ctx)
s"""
if (!$argsHaveNull) {
${expr.code}
""" +
(if (e.nullable) {
s"""
$argsHaveNull = ${expr.isNull};
${argValues(i)} = ${expr.value};
"""
} else {
s"${argValues(i)} = ${expr.value};"
}) +
"""
}
"""
}
} else {
arguments.zipWithIndex.map { case (e, i) =>
val expr = e.genCode(ctx)
s"""
${expr.code}
${argValues(i)} = ${expr.value};
"""
}
}
val argCode = ctx.splitExpressions(ctx.INPUT_ROW, argCodes)

val setIsNull = if (propagateNull && arguments.exists(_.nullable)) {
s"${ev.isNull} = ${ev.isNull} || $argsHaveNull;"
} else {
""
}

(argCode, argValues.mkString(", "), setIsNull)
}
}

/**
* Invokes a static function, returning the result. By default, any of the arguments being null
* will result in returning null instead of calling the function.
Expand All @@ -50,7 +117,7 @@ case class StaticInvoke(
dataType: DataType,
functionName: String,
arguments: Seq[Expression] = Nil,
propagateNull: Boolean = true) extends Expression with NonSQLExpression {
propagateNull: Boolean = true) extends Expression with InvokeLike with NonSQLExpression {
Copy link
Contributor

@cloud-fan cloud-fan Nov 17, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not make InvokeLike extends Expression and NonSQLExpression, then we can just extends InvokeLike here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, I'll modify them.


val objectName = staticObject.getName.stripSuffix("$")

Expand All @@ -62,16 +129,10 @@ case class StaticInvoke(

override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
val javaType = ctx.javaType(dataType)
val argGen = arguments.map(_.genCode(ctx))
val argString = argGen.map(_.value).mkString(", ")

val callFunc = s"$objectName.$functionName($argString)"
val (argCode, argString, setIsNull) = prepareArguments(ctx, ev)

val setIsNull = if (propagateNull && arguments.nonEmpty) {
s"boolean ${ev.isNull} = ${argGen.map(_.isNull).mkString(" || ")};"
} else {
s"boolean ${ev.isNull} = false;"
}
val callFunc = s"$objectName.$functionName($argString)"

// If the function can return null, we do an extra check to make sure our null bit is still set
// correctly.
Expand All @@ -82,7 +143,8 @@ case class StaticInvoke(
}

val code = s"""
${argGen.map(_.code).mkString("\n")}
$argCode
boolean ${ev.isNull} = false;
$setIsNull
final $javaType ${ev.value} = ${ev.isNull} ? ${ctx.defaultValue(dataType)} : $callFunc;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ${ev.isNull} at line 153 can be $resultIsNull. So the optimization for false ? ${ctx.defaultValue(dataType)} : $callFunc; can work here too?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's right. I modified. Thanks.

$postNullCheck
Expand All @@ -109,7 +171,7 @@ case class Invoke(
functionName: String,
dataType: DataType,
arguments: Seq[Expression] = Nil,
propagateNull: Boolean = true) extends Expression with NonSQLExpression {
propagateNull: Boolean = true) extends Expression with InvokeLike with NonSQLExpression {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment of Invoke doesn't have explanation for the param propagateNull. Can you also add it too?


override def nullable: Boolean = true
override def children: Seq[Expression] = targetObject +: arguments
Expand All @@ -131,8 +193,8 @@ case class Invoke(
override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
val javaType = ctx.javaType(dataType)
val obj = targetObject.genCode(ctx)
val argGen = arguments.map(_.genCode(ctx))
val argString = argGen.map(_.value).mkString(", ")

val (argCode, argString, setIsNull) = prepareArguments(ctx, ev)

val returnPrimitive = method.isDefined && method.get.getReturnType.isPrimitive
val needTryCatch = method.isDefined && method.get.getExceptionTypes.nonEmpty
Expand Down Expand Up @@ -164,12 +226,6 @@ case class Invoke(
"""
}

val setIsNull = if (propagateNull && arguments.nonEmpty) {
s"boolean ${ev.isNull} = ${obj.isNull} || ${argGen.map(_.isNull).mkString(" || ")};"
} else {
s"boolean ${ev.isNull} = ${obj.isNull};"
}

// If the function can return null, we do an extra check to make sure our null bit is still set
// correctly.
val postNullCheck = if (ctx.defaultValue(dataType) == "null") {
Expand All @@ -179,7 +235,8 @@ case class Invoke(
}
val code = s"""
${obj.code}
${argGen.map(_.code).mkString("\n")}
$argCode
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't need to evaluate the arguments if obj.isNull == true

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cloud-fan Thanks, I updated.

boolean ${ev.isNull} = ${obj.isNull};
$setIsNull
$javaType ${ev.value} = ${ctx.defaultValue(dataType)};
if (!${ev.isNull}) {
Expand Down Expand Up @@ -223,10 +280,10 @@ case class NewInstance(
arguments: Seq[Expression],
propagateNull: Boolean,
dataType: DataType,
outerPointer: Option[() => AnyRef]) extends Expression with NonSQLExpression {
outerPointer: Option[() => AnyRef]) extends Expression with InvokeLike with NonSQLExpression {
private val className = cls.getName

override def nullable: Boolean = propagateNull
override def nullable: Boolean = propagateNull && arguments.exists(_.nullable)

override def children: Seq[Expression] = arguments

Expand All @@ -245,51 +302,36 @@ case class NewInstance(

override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
val javaType = ctx.javaType(dataType)
val argIsNulls = ctx.freshName("argIsNulls")
ctx.addMutableState("boolean[]", argIsNulls,
s"$argIsNulls = new boolean[${arguments.size}];")
val argValues = arguments.zipWithIndex.map { case (e, i) =>
val argValue = ctx.freshName("argValue")
ctx.addMutableState(ctx.javaType(e.dataType), argValue, "")
argValue
}

val argCodes = arguments.zipWithIndex.map { case (e, i) =>
val expr = e.genCode(ctx)
expr.code + s"""
$argIsNulls[$i] = ${expr.isNull};
${argValues(i)} = ${expr.value};
"""
}
val argCode = ctx.splitExpressions(ctx.INPUT_ROW, argCodes)
val (argCode, argString, setIsNull) = prepareArguments(ctx, ev)

val outer = outerPointer.map(func => Literal.fromObject(func()).genCode(ctx))

var isNull = ev.isNull
val setIsNull = if (propagateNull && arguments.nonEmpty) {
s"""
boolean $isNull = false;
for (int idx = 0; idx < ${arguments.length}; idx++) {
if ($argIsNulls[idx]) { $isNull = true; break; }
}
"""
val prepareIsNull = if (propagateNull && arguments.exists(_.nullable)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use propogatingNull?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, I missed it, fixed.

s"boolean $isNull = false;"
} else {
isNull = "false"
""
}

val constructorCall = outer.map { gen =>
s"""${gen.value}.new ${cls.getSimpleName}(${argValues.mkString(", ")})"""
s"${gen.value}.new ${cls.getSimpleName}($argString)"
}.getOrElse {
s"new $className(${argValues.mkString(", ")})"
s"new $className($argString)"
}

val code = s"""
$argCode
${outer.map(_.code).getOrElse("")}
$prepareIsNull
$setIsNull
final $javaType ${ev.value} = $isNull ? ${ctx.defaultValue(javaType)} : $constructorCall;
"""
""" +
(if (propagateNull && arguments.exists(_.nullable)) {
s"final $javaType ${ev.value} = $isNull ? ${ctx.defaultValue(javaType)} : $constructorCall;"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If needNullCheck is true, we don't need to evaluate arguments if $isNull == true.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@viirya I think we do need because if needNullCheck is true, the $isNull stores the result of evaluating arguments.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or am I misunderstanding what you mean..?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I see. This is correct.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see. Thanks.

Copy link
Contributor

@cloud-fan cloud-fan Nov 18, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it necessary? I'm pretty sure javac can optimize ... = false ? a : b to ... = b for us.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we use janino instead of javac?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry I mean "java compiler"... I think janino is smart enough about it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, updated. Thanks.

} else {
s"final $javaType ${ev.value} = $constructorCall;"
})
ev.copy(code = code, isNull = isNull)
}

Expand Down