Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[python] add type hints for custom objective and metric functions in scikit-learn interface #4547
[python] add type hints for custom objective and metric functions in scikit-learn interface #4547
Changes from 1 commit
a8af94e
9d00c4c
ea1aada
f469ccb
07d2170
d39c040
a2ff6ee
ccca965
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please clarify why are you distinguishing all these three types (
_ArrayLike
,_GroupType
,np.ndarray
)? They are all documented asarray-like
.LightGBM/python-package/lightgbm/sklearn.py
Lines 32 to 49 in bd28a36
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't realize that
y_true
andy_pred
could be lists, I thought they had to be a pandas Series, numpy array, or scipy matrix.For grad and hess, it seems that they cannot be scipy matrices or pandas DataFrames / Series (although I didn't realize they could be lists)
LightGBM/python-package/lightgbm/basic.py
Lines 2956 to 2957 in bd28a36
LightGBM/python-package/lightgbm/basic.py
Lines 2972 to 2977 in bd28a36
To be honest, I'm pretty unsure about the meaning of "array-like" in different parts of LightGBM's docs and I'm not always sure which combinations of these are supported when I see that:
So I took a best guess based on a quick look through the code, but I probably need to test all of those combinations and then updated this PR / the docs as appropriate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, absolutely agree with that
array-like
everywhere in the sklearn-wrapper looks confusing. I might be wrong, but it was written before scikit-learn introduced a formal definition ofarray-like
term:https://scikit-learn.org/stable/glossary.html#term-array-like
All these values in custom function signatures are supposed to have exactly 1 dimension, right? I believe it will be safe for now assign them the following types which we treat as 1-d array internally
LightGBM/python-package/lightgbm/basic.py
Lines 179 to 180 in bd28a36
For
grad
andhess
that functionlist_to_1d_numpy
is applied directly.LightGBM/python-package/lightgbm/basic.py
Lines 2984 to 2985 in bd28a36
For
weight
andgroup
onlynp.ndarray
is possible, if I'm not mistaken:LightGBM/python-package/lightgbm/sklearn.py
Lines 176 to 179 in bd28a36
LightGBM/python-package/lightgbm/basic.py
Lines 2215 to 2225 in bd28a36
LightGBM/python-package/lightgbm/basic.py
Lines 2271 to 2288 in bd28a36
LightGBM/python-package/lightgbm/basic.py
Lines 1507 to 1510 in bd28a36
LightGBM/python-package/lightgbm/basic.py
Lines 2099 to 2119 in bd28a36
LightGBM/python-package/lightgbm/basic.py
Lines 2141 to 2162 in bd28a36
For
y_true
the same logic is applicable as forweight
andgroup
.LightGBM/python-package/lightgbm/sklearn.py
Line 172 in bd28a36
For
y_pred
onlynp.ndarray
is possibleLightGBM/python-package/lightgbm/basic.py
Line 2956 in bd28a36
LightGBM/python-package/lightgbm/basic.py
Line 3732 in bd28a36
LightGBM/python-package/lightgbm/basic.py
Line 3763 in bd28a36
LightGBM/python-package/lightgbm/basic.py
Line 3750 in bd28a36
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for taking so long to get back to this one!
I just pushed ea1aada with my best understanding of your comments above, but to be honest I still am confused about exactly what is allowed.
Here is my interpretation of those comments / links:
y_true
= list, numpy array, or pandas Seriesy_pred
= numpy arraygroup
= numpy arrayweight
= numpy arrayy_true
= list, numpy array, or pandas Seriesy_pred
= numpy arraygroup
= numpy arraygrad
(output) = list, numpy array, or pandas Serieshess
(output) = list, numpy array, or pandas SeriesThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Double-checked this and I think your interpretation is fine. Thanks for the detailed investigation.