
[SPARK-25044][SQL][followup] add back UserDefinedFunction.inputTypes #22319

Closed · 9 commits

Conversation

@cloud-fan (Contributor) commented Sep 3, 2018

What changes were proposed in this pull request?

This is a follow-up of #22259.

A Scala case class has a wide public surface: apply, unapply, accessors, copy, etc.

In #22259, we changed the type of UserDefinedFunction.inputTypes from Option[Seq[DataType]] to Option[Seq[ScalaReflection.Schema]]. This breaks backward compatibility.

This PR changes the type back and uses a private var to keep the new nullability info.
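As a rough sketch of the idea (not the exact Spark source; the names _nullableTypes/nullableTypes are assumptions for illustration), the case class keeps its 2.3 signature and the nullability information moves into private mutable state:

import org.apache.spark.sql.types.DataType

// Simplified sketch: the real class lives in org.apache.spark.sql.expressions and
// uses protected[sql]/private[sql] access modifiers, which are omitted here.
case class UserDefinedFunction(
    f: AnyRef,
    dataType: DataType,
    inputTypes: Option[Seq[DataType]]) {  // the constructor keeps the 2.3 type

  // The new nullability info is carried in mutable state instead of a new
  // constructor parameter, so apply/unapply/copy keep their old signatures.
  private var _nullableTypes: Option[Seq[Boolean]] = None
  def nullableTypes: Option[Seq[Boolean]] = _nullableTypes
}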

How was this patch tested?

N/A

@cloud-fan (Contributor, Author)

cc @srowen @gatorsmile

SparkQA commented Sep 3, 2018

Test build #95598 has finished for PR 22319 at commit e9c6fbc.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.


private var _nameOption: Option[String] = None
private var _nullable: Boolean = true
private var _deterministic: Boolean = true

// This is to keep backward compatibility for this case class.
// TODO: revisit this case class in Spark 3.0, and narrow down the public surface.
def inputTypes: Option[Seq[DataType]] = {
@HyukjinKwon (Member) commented Sep 3, 2018

@cloud-fan, I think this still breaks compatibility when UserDefinedFunction is used in a pattern match.
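For example (a hedged sketch, not from the PR), existing user code of this shape binds to the case-class unapply with the 2.3 signature and stops working if that signature changes:

import org.apache.spark.sql.expressions.UserDefinedFunction

// Pattern matching here resolves to UserDefinedFunction.unapply returning
// Option[(AnyRef, DataType, Option[Seq[DataType]])]; changing inputTypes to
// Option[Seq[ScalaReflection.Schema]] breaks callers like this.
def describe(u: UserDefinedFunction): String = u match {
  case UserDefinedFunction(_, dataType, inputTypes) =>
    s"returns $dataType, declares ${inputTypes.map(_.size).getOrElse(0)} input types"
}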

object UserDefinedFunction {
// This is to keep backward compatibility for this case class.
// TODO: revisit this case class in Spark 3.0, and narrow down the public surface.
def unapply(arg: UserDefinedFunction): Option[(AnyRef, DataType, Option[Seq[DataType]])] = {
Member:

Doesn't this still break binary compatibility since we bind to another signature?

Member:

I checked locally and apparently not - it looks like the explicit unapply here and the unapply from the case class are handled as the same signature, from my cursory test.

@@ -129,3 +135,11 @@ case class UserDefinedFunction protected[sql] (
}
}
}

object UserDefinedFunction {
Member:

Make it private[sql], since we don't explicitly mention that the expressions package is meant to be internal.

Member:

I am okay if this is the only way.

@cloud-fan (Contributor, Author):

Here I'm adding back the public unapply method, so this must be public.

SparkQA commented Sep 3, 2018

Test build #95599 has finished for PR 22319 at commit 5acb1df.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Sep 3, 2018

Test build #95600 has finished for PR 22319 at commit 2245395.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

object UserDefinedFunction {
// This is to keep backward compatibility for this case class.
// TODO: revisit this case class in Spark 3.0, and narrow down the public surface.
def unapply(obj: Any): Option[(AnyRef, DataType, Option[Seq[DataType]])] = obj match {
@HyukjinKwon (Member) commented Sep 3, 2018

Wenchen, can we have a small test to check whether the previous pattern matching still works, and can you double-check this by mimicking the case with javap -v manually, since this doesn't appear to be checked?
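A minimal sketch of what such a test body could assert (illustrative only, not the test that was actually added; the expected values assume udf((i: Int) => i + 1)):

import org.apache.spark.sql.expressions.UserDefinedFunction
import org.apache.spark.sql.functions.udf
import org.apache.spark.sql.types.IntegerType

// The 2.3-style pattern match must still compile and extract the same values.
val plusOne = udf((i: Int) => i + 1)
plusOne match {
  case UserDefinedFunction(_, dataType, inputTypes) =>
    assert(dataType == IntegerType)
    assert(inputTypes == Some(Seq(IntegerType)))
}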

SparkQA commented Sep 3, 2018

Test build #95611 has finished for PR 22319 at commit 01a51ca.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan (Contributor, Author)

It's too hard to work around the case class compatibility issue. I'm leaving it unchanged and adding private mutable variables to store the nullability info.

SparkQA commented Sep 3, 2018

Test build #95626 has finished for PR 22319 at commit c77db87.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon (Member)

I prefer the current way as well.

SparkQA commented Sep 3, 2018

Test build #95630 has finished for PR 22319 at commit d290fec.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -41,12 +41,16 @@ import org.apache.spark.sql.types.DataType
case class UserDefinedFunction protected[sql] (
f: AnyRef,
dataType: DataType,
-  inputTypes: Option[Seq[ScalaReflection.Schema]]) {
+  inputTypes: Option[Seq[DataType]]) {
Member:

+1. This is why we added _nameOption, _nullable, and _deterministic in the 2.3 release.

Please also remove the changes to MimaExcludes.scala made in #22259.

SparkQA commented Sep 4, 2018

Test build #95644 has finished for PR 22319 at commit 4791240.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Sep 4, 2018

Test build #95660 has finished for PR 22319 at commit 2b2c894.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen (Member) commented Sep 4, 2018

Hm, still need one MiMa exclusion:

[error]  * the type hierarchy of object org.apache.spark.sql.expressions.UserDefinedFunction is different in current version. Missing types {scala.runtime.AbstractFunction3}
[error]    filter with: ProblemFilters.exclude[MissingTypesProblem]("org.apache.spark.sql.expressions.UserDefinedFunction$")
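If this exclusion were kept, it would be registered roughly like this in project/MimaExcludes.scala (a hedged sketch; the object and list names here are illustrative):

import com.typesafe.tools.mima.core._

object MimaExcludesSketch {
  // Register the filter MiMa suggests for the reported change in the
  // UserDefinedFunction companion object's type hierarchy.
  lazy val v24excludes = Seq(
    ProblemFilters.exclude[MissingTypesProblem](
      "org.apache.spark.sql.expressions.UserDefinedFunction$")
  )
}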

@HyukjinKwon (Member)

Shall we update the migration guide about the compatibility change?

@cloud-fan (Contributor, Author)

I've fixed all the compatibility issues. Is there something else we want to let users know?

SparkQA commented Sep 5, 2018

Test build #95691 has finished for PR 22319 at commit 9e060a4.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan (Contributor, Author)

retest this please


// We have to use a name different than `UserDefinedFunction` here, to avoid breaking the binary
// compatibility of the auto-generated UserDefinedFunction object.
private[sql] object SparkUserDefinedFunction {
Member:

Oh, so we're working around it with a new object here. SGTM. I assumed the mutable state and the copy stuff would be an issue, but it's trivial or not a problem.
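To illustrate the workaround in isolation, a self-contained toy sketch (names are illustrative, not Spark's actual code): the case class keeps its auto-generated companion, which extends scala.runtime.AbstractFunction3 and is part of the binary surface MiMa checks, while a differently named helper object performs the richer construction:

package sketch

// Stand-in for UserDefinedFunction: the constructor is untouched; the extra
// nullability info is package-private mutable state.
case class ToyUdf(f: AnyRef, dataType: String, inputTypes: Option[Seq[String]]) {
  private[sketch] var nullableTypes: Option[Seq[Boolean]] = None
}

// A builder with a different name: no hand-written `object ToyUdf` is introduced,
// so the auto-generated companion keeps its old type hierarchy.
object SparkToyUdf {
  def create(f: AnyRef, dataType: String, nullable: Seq[Boolean]): ToyUdf = {
    val udf = ToyUdf(f, dataType, inputTypes = None)
    udf.nullableTypes = Some(nullable)
    udf
  }
}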

@gatorsmile (Member)

LGTM

SparkQA commented Sep 5, 2018

Test build #95696 has finished for PR 22319 at commit 9e060a4.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan (Contributor, Author)

retest this please

SparkQA commented Sep 5, 2018

Test build #95700 has finished for PR 22319 at commit 9e060a4.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen (Member) left a comment

Yeah, it is a little hacky internally, but it maximizes compatibility.

@cloud-fan (Contributor, Author)

thanks, merging to master!

asfgit closed this in 341b55a on Sep 5, 2018