Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[dask] [python-package] include support for column array as label #3943

Merged
merged 16 commits into from
Feb 24, 2021

Conversation

jmoralez
Copy link
Collaborator

This solves #3909 by allowing to have an (n_samples, 1) array as label when data is a pandas.DataFrame.

@jmoralez
Copy link
Collaborator Author

Hi @jameslamb. Can you give me your thoughts on these changes?

@jameslamb
Copy link
Collaborator

Thanks so much for looking into this!

Can you give me your thoughts on these changes?

Can you please add tests (one in test_sklearn.py, one in test_dask.py) testing that training is successful and correct when data is a pandas DataFrame and label is a numpy column array? Once you've written tests like that proving that this PR fixes the issue described in #3909, I'd be happy to give a more thorough review on the approach.

@jmoralez jmoralez marked this pull request as ready for review February 12, 2021 04:32
Copy link
Collaborator

@jameslamb jameslamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! The fix looks good to me. I left some suggestions for making the tests a bit stronger.

tests/python_package_test/test_dask.py Show resolved Hide resolved
tests/python_package_test/test_dask.py Show resolved Hide resolved
@jameslamb jameslamb changed the title Include support for column array as label [dask] [python-package] include support for column array as label Feb 13, 2021
Copy link
Collaborator

@jameslamb jameslamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changes look good to me, thanks for picking this up!

@jmoralez
Copy link
Collaborator Author

Thank you for your help, James! I just realized I wasn't fitting to the column array in the test for dask so I fixed that.

@StrikerRUS
Copy link
Collaborator

@jmoralez
Sorry for the inconvenience. We just merged #3947. Could you please resolve conflicts?

@jameslamb
Copy link
Collaborator

@jmoralez sorry again, but can you please merge master into this? We just fixed some additional CI issues 😬

python-package/lightgbm/basic.py Outdated Show resolved Hide resolved
python-package/lightgbm/basic.py Outdated Show resolved Hide resolved
tests/python_package_test/test_sklearn.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@StrikerRUS StrikerRUS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice job! Thank you very much!

@StrikerRUS StrikerRUS merged commit 5dacd60 into microsoft:master Feb 24, 2021
@jmoralez jmoralez deleted the column-array-as-label branch February 24, 2021 18:07
@jmoralez
Copy link
Collaborator Author

Thank you for your comments!

@github-actions
Copy link

This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 23, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants