-
Notifications
You must be signed in to change notification settings - Fork 28.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-20253][SQL] Remove unnecessary nullchecks of a return value from Spark runtime routines in generated Java code #17569
Conversation
Test build #75611 has finished for PR 17569 at commit
|
Test build #75614 has finished for PR 17569 at commit
|
@cloud-fan could you please review this? |
@@ -356,7 +361,8 @@ object ScalaReflection extends ScalaReflection { | |||
udt.userClass.getAnnotation(classOf[SQLUserDefinedType]).udt(), | |||
Nil, | |||
dataType = ObjectType(udt.userClass.getAnnotation(classOf[SQLUserDefinedType]).udt())) | |||
Invoke(obj, "deserialize", ObjectType(udt.userClass), getPath :: Nil) | |||
Invoke(obj, "deserialize", ObjectType(udt.userClass), getPath :: Nil, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The deserialize
is totally implemented by users, can we guarantee not return null?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. It is UDT. I have checked deserialized
only in Spark runtime.
@@ -365,7 +371,8 @@ object ScalaReflection extends ScalaReflection { | |||
udt.getClass, | |||
Nil, | |||
dataType = ObjectType(udt.getClass)) | |||
Invoke(obj, "deserialize", ObjectType(udt.userClass), getPath :: Nil) | |||
Invoke(obj, "deserialize", ObjectType(udt.userClass), getPath :: Nil, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here
@@ -577,7 +584,7 @@ object ScalaReflection extends ScalaReflection { | |||
udt.userClass.getAnnotation(classOf[SQLUserDefinedType]).udt(), | |||
Nil, | |||
dataType = ObjectType(udt.userClass.getAnnotation(classOf[SQLUserDefinedType]).udt())) | |||
Invoke(obj, "serialize", udt, inputObject :: Nil) | |||
Invoke(obj, "serialize", udt, inputObject :: Nil, returnNullable = false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here
@@ -586,7 +593,7 @@ object ScalaReflection extends ScalaReflection { | |||
udt.getClass, | |||
Nil, | |||
dataType = ObjectType(udt.getClass)) | |||
Invoke(obj, "serialize", udt, inputObject :: Nil) | |||
Invoke(obj, "serialize", udt, inputObject :: Nil, returnNullable = false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sam here
@@ -608,7 +616,7 @@ case class MapObjects private( | |||
$convertedArray = $arrayConstructor; | |||
""", | |||
genValue => s"$convertedArray[$loopIndex] = $genValue;", | |||
s"new ${classOf[GenericArrayData].getName}($convertedArray);" | |||
s"new ${classOf[GenericArrayData].getName}($convertedArray); /*###*/" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, sorry. It was my debug code...
if ($funcResult == null) { | ||
${ev.isNull} = true; | ||
} else { | ||
if (!returnNullable) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since we have postNullCheck
, can we always go to this branch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, yes we can.
""" | ||
} | ||
|
||
// If the function can return null, we do an extra check to make sure our null bit is still set | ||
// correctly. | ||
val postNullCheck = if (ctx.defaultValue(dataType) == "null") { | ||
val postNullCheck = if (ctx.defaultValue(dataType) == "null" && returnNullable) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually, can we embed postNullCheck
to evaluate
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, done
@@ -96,6 +96,16 @@ class DatasetPrimitiveSuite extends QueryTest with SharedSQLContext { | |||
checkDataset(dsBoolean.map(e => !e), false, true) | |||
} | |||
|
|||
test("mapPrimitiveArray") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do these tests fail before this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I have just added to confirm this check works well.
|
||
case StringType => | ||
Invoke(input, "toString", ObjectType(classOf[String])) | ||
Invoke(input, "toString", ObjectType(classOf[String]), returnNullable = false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we check how many places we set returnNullable
to true? If it's only a few, we can change the default value of returnNullable
to false.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is statistics for 59 call sites of Invoke()
.
18: dataType
is primitive type
21: returnNullable
is true (no specification at call site, as default)
19: returnNullable
is false
1: set a variable to `returnNullable
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok let's keep the default value unchanged
Test build #75619 has finished for PR 17569 at commit
|
Test build #75621 has finished for PR 17569 at commit
|
Seems there are places (i.e., |
@@ -225,25 +225,26 @@ case class Invoke( | |||
getFuncResult(ev.value, s"${obj.value}.$functionName($argString)") | |||
} else { | |||
val funcResult = ctx.freshName("funcResult") | |||
// If the function can return null, we do an extra check to make sure our null bit is still | |||
// set correctly. | |||
val postNullCheck = if (!returnNullable) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: rename postNullCheck
. It is actually not only null check but also assigning the function result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, done
Test build #75625 has finished for PR 17569 at commit
|
Test build #75627 has finished for PR 17569 at commit
|
@@ -225,25 +225,26 @@ case class Invoke( | |||
getFuncResult(ev.value, s"${obj.value}.$functionName($argString)") | |||
} else { | |||
val funcResult = ctx.freshName("funcResult") | |||
// If the function can return null, we do an extra check to make sure our null bit is still | |||
// set correctly. | |||
val postNullCheckAndAssign = if (!returnNullable) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about just assignResult
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
LGTM |
1 similar comment
LGTM |
Test build #75628 has finished for PR 17569 at commit
|
thanks, merging to master! |
What changes were proposed in this pull request?
This PR elminates unnecessary nullchecks of a return value from known Spark runtime routines. We know whether a given Spark runtime routine returns
null
or not (e.g.ArrayData.toDoubleArray()
never returnsnull
). Thus, we can eliminate a null check for the return value from the Spark runtime routine.When we run the following example program, now we get the Java code "Without this PR". In this code, since we know
ArrayData.toDoubleArray()
never returns ``null```, we can eliminate null checks at lines 90-92, and 97.Without this PR
With this PR (removed most of lines 90-97 in the above code)
How was this patch tested?
Add test suites to
DatasetPrimitiveSuite