Better UDF support #152

Merged: 29 commits, Jun 21, 2022

Jolanrensen
Collaborator

from issue: #143

One gotcha I noticed is that

val toNormalClass2 by udf.register { a: String, b: Int ->
    NormalClass(b, a)
}
shouldThrow<AnalysisException> { // toNormalClass2 is never accessed, so the delegate getValue function is not executed
    spark.sql("select toNormalClass2(first, second) from test2").show()
}

won't work since toNormalClass2 is never accessed, so the register function is not executed. A simple val a = toNormalClass2 already fixes this, but it's not optimal...
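
To illustrate the gotcha outside of Spark, here is a minimal plain-Kotlin sketch. `FakeRegistry` and `LazyRegistration` are hypothetical stand-ins invented for this example and are not the classes from this PR; the sketch only assumes, as described above, that registration happens inside the delegate's `getValue`.

```kotlin
import kotlin.reflect.KProperty

// Hypothetical stand-in for Spark's UDF registry (assumption, not the real API).
object FakeRegistry {
    val names = mutableSetOf<String>()
}

// Hypothetical delegate that registers lazily: the side effect lives in getValue,
// so it only runs when the delegated property is actually read.
class LazyRegistration {
    operator fun getValue(thisRef: Any?, property: KProperty<*>): String {
        FakeRegistry.names += property.name
        return property.name
    }
}

fun main() {
    // Declaring the delegated property does not call getValue.
    val toNormalClass2 by LazyRegistration()
    println("toNormalClass2" in FakeRegistry.names) // false

    // Reading the property runs getValue, which performs the registration.
    val a = toNormalClass2
    println(a in FakeRegistry.names) // true
}
```

This mirrors why the `val a = toNormalClass2` workaround helps: any read of the property is enough to trigger the registration.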

Jolanrensen added this to the 1.1.1 milestone on May 30, 2022
Jolanrensen changed the base branch from spark-3.2 to main on May 30, 2022, 11:54
Created UserDefinedFunction and NamedUserDefinedFunction and 22 instances of each.
udf() and udf.register() functions with/without name for lambda, function reference, functional value reference.
.register() extension function in KSparkSession as well
aggregatorOf() function and udaf(). Can also be done using udf.register()
Jolanrensen changed the title from "Created udf delegates and tests" to "Better UDF support" on Jun 7, 2022
@Jolanrensen
Collaborator Author

Maybe a vararg typed udf as final one?

…y udf (since only WrappedArrays are allowed in udfs anyways).

Added TypedColumn.asWrappedArray() functions for list-likes
working on vararg udfs
Vararg unwrapper for spark (wip)
fixes overall
tests, tests, tests (wip)
# Conflicts:
#	pom.xml
@Jolanrensen
Collaborator Author

To do: Finish tests, add examples, add in wiki/readme, probably merge udt first since column functions are changed here as well.

# Conflicts:
#	jupyter/src/main/kotlin/org/jetbrains/kotlinx/spark/api/jupyter/SparkIntegration.kt
#	kotlin-spark-api/3.2/src/main/kotlin/org/jetbrains/kotlinx/spark/api/SparkSession.kt

Jolanrensen requested a review from asm0dey on June 20, 2022, 13:20
@Jolanrensen
Collaborator Author

@asm0dey whenever you have time to check it :)

// }
//
///** Creates delegate of [UDFRegistration]. Use like:
// * ```kotlin
Contributor

Looks like this file should be deleted

@@ -91,6 +51,7 @@ class UDFWrapper0(private val udfName: String) {
* Registers the [func] with its [name] in [this].
*/
@OptIn(ExperimentalStdlibApi::class)
@Deprecated("Use new UDF notation", ReplaceWith("this.register(name, func)"), DeprecationLevel.HIDDEN)
Contributor

Love this!

@asm0dey
Contributor

asm0dey commented Jun 20, 2022

It's definitely 1.2.0, not 1.1.1

Jolanrensen merged commit c85eee0 into main on Jun 21, 2022
Jolanrensen deleted the udf+ branch on June 21, 2022, 19:22
This was referenced Jun 21, 2022
Jolanrensen added a commit that referenced this pull request Jun 21, 2022
Jolanrensen added a commit that referenced this pull request Jun 22, 2022
@Jolanrensen
Collaborator Author

For future reference, these files were used to generate the UDFs:

generateUDF.zip
