[R-package] Dataset construction fails if data does not have feature names and categorical_features is specified
#4374
Comments
This problem also happens when the data does have feature names. When I look at length(colnames) for my data, it returns 10, yet I get this message for the last column in the categorical feature list. This package is really difficult to use.
Are you able to provide a reproducible example of the behavior you're describing? That would help us understand exactly what you mean by "get this message for the last column".
Specific comments describing surprising or incorrect behavior you run into while using LightGBM are very helpful and welcome. Sweeping complaints like "this package is difficult to use" do not help to improve this project and are not welcome. Please keep your comments in this repo polite and focused on improving the project or getting more information about it.
I apologize for my sweeping comment. I was able to fix my problem by reading my data
…ata does not have column names (fixes #4374) (#5184)
* check for number of columns if data is matrix for categorical indices check
* check for error when using a greater index than the number of columns
* apply suggestion Co-authored-by: James Lamb <[email protected]>
* revert whitespace change
* check if is filename instead of matrix Co-authored-by: James Lamb <[email protected]>
This issue has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.
Description
If you specify categorical_features and the training data does not have feature names, Dataset construction fails.
Reproducible example
Results in the following error message:
Environment info
LightGBM version or commit hash: latest master (53ffba7)
Command(s) you used to install LightGBM
sh build-cran-package.sh
R CMD INSTALL lightgbm_*.tar.gz
Additional Comments
For anyone reading this who is not familiar with LightGBM's internals...this bug will also affect lightgbm(), lgb.cv() and lgb.train().
This bug is caused by the fact that the R package uses length(colnames) to compute "number of features".
LightGBM/R-package/R/lgb.Dataset.R
Line 148 in 53ffba7
This is not reliable, since creating a Dataset without feature names is supported.
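To illustrate why counting features with length(colnames) is unreliable, here is a minimal base-R sketch. It does not call LightGBM at all. The matrix X and the helper check_categorical_indices() are hypothetical names made up for this illustration and are not the package's actual code; the helper only shows the general idea of validating indices against ncol() instead.

```r
# A matrix created without column names, as in the failing case described above.
X <- matrix(rnorm(20), nrow = 5, ncol = 4)

colnames(X)          # NULL, because no names were ever set
length(colnames(X))  # 0, so this cannot be used as "number of features"
ncol(X)              # 4, the actual number of columns

# Hypothetical helper (NOT part of LightGBM): validate categorical column
# indices against the real column count rather than the number of names.
check_categorical_indices <- function(data, categorical_indices) {
  n_features <- ncol(data)
  too_large <- categorical_indices[categorical_indices > n_features]
  if (length(too_large) > 0L) {
    stop("categorical index out of range: ", paste(too_large, collapse = ", "))
  }
  invisible(TRUE)
}

# Indices 1 and 4 are valid for a 4-column matrix, so this passes silently.
check_categorical_indices(X, c(1L, 4L))
```

With a check based on length(colnames(X)), every index would look out of range for this matrix, since the count of names is 0 even though the matrix has 4 columns.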