Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-22668][SQL] Ensure no global variables in arguments of method split by CodegenContext.splitExpressions() #20021

Closed
wants to merge 5 commits into from

Conversation

cloud-fan
Copy link
Contributor

What changes were proposed in this pull request?

Passing global variables to the split method is dangerous, as any mutating to it is ignored and may lead to unexpected behavior.

To prevent this, one approach is to make sure no expression would output global variables: Localizing lifetime of mutable states in expressions.

Another approach is, when calling ctx.splitExpression, make sure we don't use children's output as parameter names.

Approach 1 is actually hard to do, as we need to check all expressions and operators that support whole-stage codegen. Approach 2 is easier as the callers of ctx.splitExpressions are not too many.

Besides, approach 2 is more flexible, as children's output may be other stuff that can't be parameter name: literal, inlined statement(a + 1), etc.

close #19865
close #19938

How was this patch tested?

existing tests

@cloud-fan
Copy link
Contributor Author

@kiszk
Copy link
Member

kiszk commented Dec 19, 2017

Does this PR just check whether the condition of approach 2 occurs or not? If approach 2 does not replace with a temporary variable, assertion may occur.

Am I wrong?

@cloud-fan
Copy link
Contributor Author

Yea, I actually manually checked all the caller side of ctx.splitExpressions and didn't find any issue, adding an assert helps me prove it.

I also reverted some changes that tried to localize global variables as they are not needed now.

@SparkQA
Copy link

SparkQA commented Dec 19, 2017

Test build #85114 has finished for PR 20021 at commit 9d22975.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@kiszk
Copy link
Member

kiszk commented Dec 19, 2017

I see. I agree that to add assert helps all of developers.

My question is what is a rule to decide whether a mutable state should be localized or not? I think that we have to ensure caller of ctx.splitExpression does not get any EvalCode.

if (Utils.isTesting) {
// Passing global variables to the split method is dangerous, as any mutating to it is
// ignored and may lead to unexpected behavior.
val mutableStateNames = mutableStates.map(_._2).toSet
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

after finally merging SPARK-18016, this should be the union of arrayCompactedMutableStates.flatMap(_.arrayNames) and inlinedMutableStates.map(_._2), I think

@cloud-fan
Copy link
Contributor Author

My question is what is a rule to decide whether a mutable state should be localized or not

There is no rule as we don't need to localize global variables anymore.

if (Utils.isTesting) {
// Passing global variables to the split method is dangerous, as any mutating to it is
// ignored and may lead to unexpected behavior.
// We don't need to check `arrayCompactedMutableStates` here, as it results to array access
Copy link
Contributor

@mgaido91 mgaido91 Dec 19, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if we declare a variable with the same name of an arrayCompactedMutableStates ? Let's say that we have:

public class Foo {
 private Object[] ourArray;
 // ....
 private void ourMethod() {
   Object[] ourArray = new Object[1];
   ourSplitFunction(ourArray);
 }
 private void ourSplitFunction(Object[] ourArray) {
  ourArray[0] = null;
 }
 // ...
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there any place we would do that? ctx.addMutableState returns an array access code, I can't image a caller would extract the array name from it and use it as parameters...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so, but currently there is also no place which creates the problem for which this assertion is being introduces. Of course this case is very very unlikely, but since we are introducing the check, I think that the effort to ensure also this very remote corner case is very low...

Copy link
Member

@kiszk kiszk Dec 19, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is no place now, but there could be a place. If smj_value16 is replaced with an array element by ctx.addMutableState, the array element was passed.

This PR fixed this issue by eliminating usage of mutable state.

@kiszk
Copy link
Member

kiszk commented Dec 19, 2017

Here is a fix that avoids to pass a global variable to ctx.splitExpressions. Before this PR, this error occurs. I understand that my fix is not localization, but one of approaches does not pass a global variable to successor steps like localization. My lesson learned is to pass a global variable accidentally happens thru deep call stack.

Would it be possible to share finding from your manual check at all the caller side of ctx.splitExpressions. Why can we ensure any global variables are not passed to ctx.splitExpressions? I think that it would be good to put the rational into the comment.

@SparkQA
Copy link

SparkQA commented Dec 19, 2017

Test build #85112 has finished for PR 20021 at commit aadb838.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Dec 19, 2017

Test build #85120 has finished for PR 20021 at commit 18ec598.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Dec 19, 2017

Test build #85121 has finished for PR 20021 at commit f1afb92.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -930,6 +930,18 @@ class CodegenContext {
// inline execution if only one block
blocks.head
} else {
if (Utils.isTesting) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we only do the assert in testing? Because passing global variables won't raise compile error, if we have any global variables passed in when not in testing, the codegen still work and may lead to wrong result.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as you said, it may lead, but likely it doesn't. Then I do think that the best option is to assert it only in testing, where this might help finding potential bugs. In production it is an overkill to throw an exception for a situation which most likely is not a problem IMHO.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is a discussion about this testing.

@cloud-fan
Copy link
Contributor Author

Why can we ensure any global variables are not passed to ctx.splitExpressions?

You can also do this check. Look at the arguments parameter, all the caller sides are using local variables, so we can be sure that no global variables would appear as parameters in the split method.

@kiszk
Copy link
Member

kiszk commented Dec 20, 2017

I checked some call sites. Here is one example that extraArguments has ev.value instead of local variable.
WDYT?

@SparkQA
Copy link

SparkQA commented Dec 20, 2017

Test build #85150 has finished for PR 20021 at commit 3d44195.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@kiszk
Copy link
Member

kiszk commented Dec 20, 2017

Here is another example. This is complicated.

result is passed to arguments of ctx.splitExpressions in genHashForStruct. The result may get ev.value in a certain call path that is started from here while result is a local variable. I show the call path in the following example.

As this example points out, while arguments in ctx.splitExpressions has local variable, a local variable may have ev.value in some calling context. It does not seem to be easy to ensure no global variable in arguments.
IMHO, it would be good to combine localization and adding assert. Then, we declare a contract no global variable in arguments in comment.

WDYT?

// hash.scala, line 277
override def doGenCode(ctx, ev): ExprCode = { 
  ...
    computeHash(childGen.value, child.dataType, ev.value, ctx)
  ...
}

// hash.scala, line 448
protected def computeHash(input, dataType, result, ctx): String = 
  computeHashWithTailRec(input, dataType, result, ctx)

// hash.scala, line 414
private def computeHashWithTailRec(input, dataType, result, ctx): String = dataType match {
  ...
    case StructType(fields) => genHashForStruct(ctx, input, result, fields)
}


//hash.scala, line 402
protected def genHashForStruct(ctx, input, result, fields) {
  ...
  ctx.splitExpressions(
    ...
    arguments = Seq("InternalRow" -> input, hashResultType -> result),
    ...
}

@kiszk
Copy link
Member

kiszk commented Dec 20, 2017

Jenkins, retest this please

@SparkQA
Copy link

SparkQA commented Dec 20, 2017

Test build #85164 has finished for PR 20021 at commit 3d44195.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Contributor Author

I checked some call sites. Here is one example that extraArguments has ev.value instead of local variable.

Hey, ev.value is not from children, it's the output of the current expression, which we can make sure it's local variable, e.g. https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/hash.scala#L296

@kiszk
Copy link
Member

kiszk commented Dec 20, 2017

Oh, you are right. I misunderstood. After our optimizations, output is also a part of arguments. Let me check others again.
In particular, why it occurred.

@mgaido91
Copy link
Contributor

Honestly, I liked very much doing the test only for testing and not throwing an exception in production. IMHO it is an overkill to throw an exception in production and in the remote case that we happen to forget one place where this check can throw the exception, but it is not an issue, as it is perfectly possible, this would also cause a regression.

Thus, honestly I am strongly against this solution.

@cloud-fan
Copy link
Contributor Author

Well, I proposed to check it only for tests at the beginning, but I don't have a strong preference now as the new approach I took can guarantee that no place would violate it, by looking at all the caller sides of ctx.splitExpressions.

Anyway only checking it in tests is safer, WDYT @viirya ?

@viirya
Copy link
Member

viirya commented Dec 20, 2017

Ok. I'm fine with only checking it in tests.

@SparkQA
Copy link

SparkQA commented Dec 21, 2017

Test build #85240 has finished for PR 20021 at commit 900f246.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@viirya
Copy link
Member

viirya commented Dec 21, 2017

retest this please.

@viirya
Copy link
Member

viirya commented Dec 21, 2017

LGTM

@mgaido91
Copy link
Contributor

LGTM too, thanks!

@SparkQA
Copy link

SparkQA commented Dec 21, 2017

Test build #85245 has finished for PR 20021 at commit 900f246.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Contributor Author

retest this please

@SparkQA
Copy link

SparkQA commented Dec 21, 2017

Test build #85262 has finished for PR 20021 at commit 900f246.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Contributor Author

thanks, merging to master!

@asfgit asfgit closed this in 8a0ed5a Dec 21, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants