[SPARK-23256][ML][PYTHON] Add columnSchema method to PySpark image reader #20475
@MrBago, @BryanCutler, @imatiach-msft, and @MLnick, could you take a look please?
python/pyspark/ml/image.py
Outdated
:return: a :class:`StructType` for image column,
    ``struct<origin:string, height:int, width:int, nChannels:int, mode:int, data:binary>``.

.. versionadded:: 2.3.0
I am fine with 2.4.0. Let me know.
Since this came out of the 2.3 ML QA and it's mostly an improvement to the Python API, I think maybe 2.4 is best. But it is a new API, so maybe it's OK to include in 2.3.
Yea, let me go with 2.4.0.
Test build #86932 has finished for PR 20475 at commit
@HyukjinKwon looks like a great change to me, thank you for exposing the method in PySpark.
nice method!
+1, LGTM.
LGTM, thanks @HyukjinKwon!
Test build #86996 has finished for PR 20475 at commit
Thank you @imatiach-msft, @dongjoon-hyun, @felixcheung and @BryanCutler. Merged to master only.
What changes were proposed in this pull request?

This PR proposes to add `columnSchema` on the Python side too.

How was this patch tested?

Manually tested, and a unit test was added in `python/pyspark/ml/tests.py`.
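
For readers who want to see which fields the image column carries, here is a minimal sketch that splits the schema string quoted in the docstring into name and type pairs. It deliberately uses only the Python standard library and is not the PR's implementation: `parse_flat_struct` and `IMAGE_COLUMN_SCHEMA` are hypothetical names introduced for illustration, and the parser assumes a flat struct with no nested types.

```python
from typing import List, Tuple

# Schema string copied from the docstring under review in this PR.
IMAGE_COLUMN_SCHEMA = (
    "struct<origin:string, height:int, width:int, "
    "nChannels:int, mode:int, data:binary>"
)


def parse_flat_struct(schema: str) -> List[Tuple[str, str]]:
    """Hypothetical helper (not part of PySpark): split a flat
    ``struct<name:type, ...>`` string into (name, type) pairs.

    Assumes no nested structs, arrays, or maps, since commas inside
    nested types would break the simple split used here.
    """
    schema = schema.strip()
    if not (schema.startswith("struct<") and schema.endswith(">")):
        raise ValueError("expected a string of the form struct<...>")
    body = schema[len("struct<"):-1]
    fields = []
    for part in body.split(","):
        name, _, type_name = part.strip().partition(":")
        if not name or not type_name:
            raise ValueError(f"malformed field: {part!r}")
        fields.append((name.strip(), type_name.strip()))
    return fields


image_fields = parse_flat_struct(IMAGE_COLUMN_SCHEMA)
for name, type_name in image_fields:
    print(f"{name}: {type_name}")
```

In PySpark itself the schema would come from the `columnSchema` method this PR adds rather than from parsing a string; the sketch only illustrates the six documented fields.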