[SPARK-23141][SQL][PYSPARK] Support data type string as a returnType for registerJavaFunction. #20307
cc @HyukjinKwon
Sure, LGTM.
@@ -310,14 +310,22 @@ def registerJavaFunction(self, name, javaClassName, returnType=None):
... "javaStringLength", "test.org.apache.spark.sql.JavaStringLength", IntegerType())
Ah, seems we need to fix `:param returnType:` across all other related APIs, saying it takes a DDL-formatted type string.
@ueshin, mind opening a minor PR for this - `udf`, `pandas_udf`, `registerJavaFunction` and `register` - separately? If you are busy, will do it tonight. Doing this here is fine to me too, up to you.
Sure, I'll update them here soon.
"""

jdt = None
if returnType is not None:
    if not isinstance(returnType, DataType):
        returnType = _parse_datatype_string(returnType)
The param doc needs to be modified too.
Yup, that's #20307 (comment) :).
Test build #86317 has finished for PR 20307 at commit
Test build #86322 has finished for PR 20307 at commit
python/pyspark/sql/functions.py
Outdated
@@ -2108,7 +2108,8 @@ def udf(f=None, returnType=StringType()):
can fail on special rows, the workaround is to incorporate the condition into the functions.

:param f: python function if used as a standalone function
-:param returnType: a :class:`pyspark.sql.types.DataType` object
+:param returnType: the return type of the registered user-defined function. The value can be
Seems typo: `the return type of the registered user-defined function.` -> `the return type of the user-defined function.`?
Oops, I'll fix it. Thanks!
Test build #86339 has finished for PR 20307 at commit
Merged to master and branch-2.3.
…for registerJavaFunction.

## What changes were proposed in this pull request?

Currently `UDFRegistration.registerJavaFunction` doesn't support data type string as a `returnType` whereas `UDFRegistration.register`, `udf`, or `pandas_udf` does.
We can support it for `UDFRegistration.registerJavaFunction` as well.

## How was this patch tested?

Added a doctest and existing tests.

Author: Takuya UESHIN <[email protected]>

Closes #20307 from ueshin/issues/SPARK-23141.

(cherry picked from commit 5063b74)
Signed-off-by: hyukjinkwon <[email protected]>
## What changes were proposed in this pull request?

Currently `UDFRegistration.registerJavaFunction` doesn't support data type string as a `returnType` whereas `UDFRegistration.register`, `@udf`, or `@pandas_udf` does.
We can support it for `UDFRegistration.registerJavaFunction` as well.

## How was this patch tested?

Added a doctest and existing tests.
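
For readers who want to see the shape of the change in isolation, here is a minimal sketch of the "accept either a `DataType` or a type string" branch from the diff. It runs without Spark, so everything in it is a stand-in: the `DataType`, `IntegerType` and `StringType` classes, the `parse_type_string` helper and the `normalize_return_type` function are hypothetical names for illustration, not the PySpark implementation. The real code calls `_parse_datatype_string`, which presumably handles full DDL-formatted type strings rather than the tiny lookup table assumed here.

```python
# Stand-in type classes. These are hypothetical and only imitate the role of
# pyspark.sql.types.DataType and its subclasses for this sketch.
class DataType:
    def __eq__(self, other):
        return type(self) is type(other)

    def __hash__(self):
        return hash(type(self).__name__)

    def __repr__(self):
        return type(self).__name__ + "()"


class IntegerType(DataType):
    pass


class StringType(DataType):
    pass


# Hypothetical lookup table. It is an assumption of this sketch and covers
# only a few simple type names.
_SIMPLE_TYPES = {
    "int": IntegerType,
    "integer": IntegerType,
    "string": StringType,
}


def parse_type_string(type_string):
    """Stand-in for the parser used in the diff: map a name to a type object."""
    key = type_string.strip().lower()
    if key not in _SIMPLE_TYPES:
        raise ValueError("unsupported type string: %r" % type_string)
    return _SIMPLE_TYPES[key]()


def normalize_return_type(return_type):
    """Mirror the branch in the diff.

    None passes through unchanged, DataType instances are kept as they are,
    and any other value is treated as a type string and parsed.
    """
    if return_type is None:
        return None
    if not isinstance(return_type, DataType):
        return_type = parse_type_string(return_type)
    return return_type
```

With this in place, `normalize_return_type("integer")` and `normalize_return_type(IntegerType())` yield equal objects, which is the behaviour the PR adds to `registerJavaFunction`: callers may pass either form for `returnType`.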