[dask] test that Dask automatically treats 'category' columns as categorical features #3932
Based on thread starting at #3930 (comment).
#3908 added some tests that the `lightgbm.dask` estimators correctly handle Dask DataFrames where some columns are pandas "category" columns. That PR used the parameter `categorical_feature` to explicitly provide a list of which features should be treated as categorical.

This is unnecessary, since `auto` is the default behavior and `auto` tells LightGBM "treat 'category' columns as categorical features" (LightGBM/python-package/lightgbm/basic.py, lines 513 to 514 in 846b512).
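
To illustrate what "treat 'category' columns as categorical features" keys on, here is a minimal pandas-only sketch. It is not LightGBM's code: the helper `category_columns` and the toy column names are hypothetical, and the sketch only assumes that the relevant signal is the pandas `category` dtype.

```python
import pandas as pd

# Hypothetical toy frame: one "category" column, one numeric, one plain string.
df = pd.DataFrame({
    "cat_col": pd.Series(["a", "b", "a", "c"], dtype="category"),
    "num_col": [0.1, 0.2, 0.3, 0.4],
    "str_col": ["x", "y", "x", "y"],
})


def category_columns(frame):
    """Return the names of columns whose dtype is pandas 'category'.

    Illustrative helper only (an assumption, not part of lightgbm).
    """
    return list(frame.select_dtypes(include=["category"]).columns)


# Only the column with dtype "category" is selected; plain string columns are not.
auto_categoricals = category_columns(df)
print(auto_categoricals)
```

With a frame like this, an explicit list of categorical columns would duplicate information the dtype already carries, which is the reason given above for dropping the parameter from the tests.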
This PR removes the unnecessary `categorical_feature` parameter in tests. As of this PR, we'll now be testing not only that `lightgbm.dask` estimators correctly train on categorical features, but also that they automatically