
[Bug] LightGBMError: bin size 257 cannot run on GPU #3339

Closed
Tracked by #5153
rohan-gt opened this issue Aug 28, 2020 · 14 comments · Fixed by #6019
@rohan-gt

rohan-gt commented Aug 28, 2020

I'm getting the following error while running the latest LightGBM GPU build with these params:

params = {
    'device_type': 'gpu',
    'gpu_device_id': 0,
    'gpu_platform_id': 0,
    'gpu_use_dp': 'false',
    'max_bin': 255
}

on Google Colab using this Kaggle dataset: https://www.kaggle.com/c/ieee-fraud-detection. I'm dropping all the categorical variables:

LightGBMError: bin size 257 cannot run on GPU

@hengzhe-zhang

I have the same problem. Is there any way to solve it?

@guolinke
Collaborator

Sorry for missing this issue.
max_bin actually cannot limit the number of bins for categorical features.
There are two workarounds:

  1. Use categorical encodings to convert categorical features into numerical ones.
  2. Split one categorical feature into multiple categorical features, making sure the number of categories in each split feature is smaller than 256.
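The two workarounds above can be sketched in pandas (the DataFrame and column names here are hypothetical, not from the reported dataset):

```python
import pandas as pd

# Hypothetical high-cardinality categorical column (600 distinct values,
# more than the 255 bins a GPU feature group can hold).
df = pd.DataFrame({"city": [f"city_{i}" for i in range(600)]})

# Workaround 1: encode the categorical feature as a single numeric column.
df["city_code"] = df["city"].astype("category").cat.codes

# Workaround 2: split one high-cardinality categorical feature into
# several categorical columns, each with fewer than 256 distinct values.
codes = df["city"].astype("category").cat.codes
df["city_low"] = (codes % 255).astype("category")    # at most 255 categories
df["city_high"] = (codes // 255).astype("category")  # remaining "pages"
```

The original code is recoverable as `city_high * 255 + city_low`, so the split loses no information, though the model now has to combine two splits to isolate one category.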

@pseudotensor

pseudotensor commented Mar 16, 2021

I hit this randomly, for no apparent reason, with categorical_features explicitly set to empty, so it has nothing to do with that. The test that hit this had passed 1000 times before.

File "/opt/h2oai/dai/cuda-10.0/lib/python3.6/site-packages/lightgbm_gpu/sklearn.py", line 794, in fit
    categorical_feature=categorical_feature, callbacks=callbacks, init_model=init_model)
  File "/opt/h2oai/dai/cuda-10.0/lib/python3.6/site-packages/lightgbm_gpu/sklearn.py", line 637, in fit
    callbacks=callbacks, init_model=init_model)
  File "/opt/h2oai/dai/cuda-10.0/lib/python3.6/site-packages/lightgbm_gpu/engine.py", line 230, in train
    booster = Booster(params=params, train_set=train_set)
  File "/opt/h2oai/dai/cuda-10.0/lib/python3.6/site-packages/lightgbm_gpu/basic.py", line 2104, in __init__
    ctypes.byref(self.handle)))
  File "/opt/h2oai/dai/cuda-10.0/lib/python3.6/site-packages/lightgbm_gpu/basic.py", line 52, in _safe_call
    raise LightGBMError(_LIB.LGBM_GetLastError().decode('utf-8'))
lightgbm.basic.LightGBMError: bin size 257 cannot run on GPU

The number of bins was 255 and no categorical features were explicitly chosen.

@George3d6

The same issue happened for me:

  File "/usr/local/lib/python3.6/dist-packages/lightgbm/engine.py", line 228, in train
    booster = Booster(params=params, train_set=train_set)
  File "/usr/local/lib/python3.6/dist-packages/lightgbm/basic.py", line 2237, in __init__
    ctypes.byref(self.handle)))
  File "/usr/local/lib/python3.6/dist-packages/lightgbm/basic.py", line 110, in _safe_call
    raise LightGBMError(_LIB.LGBM_GetLastError().decode('utf-8'))
lightgbm.basic.LightGBMError: bin size 257 cannot run on GPU

@George3d6

Some of the values are categorical in my case, but there are nowhere near 257 distinct ones; combined with @pseudotensor's comment, I assume this is something else.

@lewis-morris

I am also getting the same error.

lightgbm.basic.LightGBMError: bin size 257 cannot run on GPU

@MAxx8371

MAxx8371 commented Feb 14, 2022

What causes this error? Is it that the bin size of a categorical feature is bigger than max_bin, or that there is not enough memory? The model works on CPU. Thank you!

@jameslamb jameslamb mentioned this issue Apr 14, 2022
@jiluojiluo

lightgbm.basic.LightGBMError: bin size 407 cannot run on GPU

@jiluojiluo

lightgbm.basic.LightGBMError: bin size 407 cannot run on GPU

This is a bug in LightGBM running on GPU; on CPU it works fine. So LightGBM on GPU needs improvement.
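Until a fix lands, one pragmatic pattern for the "works on CPU, fails on GPU" situation described above is to retry on CPU when this specific error is raised. A minimal sketch (`train_fn` is a hypothetical placeholder standing in for a call such as `lightgbm.train`, not a LightGBM API):

```python
def train_with_cpu_fallback(train_fn, params):
    """Try GPU training first; retry on CPU if the GPU bin-size limit is hit."""
    try:
        return train_fn({**params, "device_type": "gpu"})
    except Exception as err:
        # Only fall back for the specific error reported in this thread.
        if "cannot run on GPU" not in str(err):
            raise
        # Same parameters, CPU device: this configuration is reported to work.
        return train_fn({**params, "device_type": "cpu"})
```

With LightGBM this could be used as `train_with_cpu_fallback(lambda p: lgb.train(p, dtrain), base_params)`, at the cost of repeating dataset construction on the fallback path.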

@ChiHangChen

Same error encountered, any update?

@aforadi

aforadi commented Aug 4, 2022

Same here.

lightgbm.basic.LightGBMError: bin size 670 cannot run on GPU

@CVPaul
Contributor

CVPaul commented Aug 4, 2023

It seems that I've identified the cause of the error: the way num_total_bin is calculated during Exclusive Feature Bundling

(bin_mappers[fidx]->GetDefaultBin() == 0 ? -1 : 0);

doesn't align completely with the way num_total_bin is calculated during the creation of a FeatureGroup:

if (bin_mappers_[i]->GetMostFreqBin() == 0) {

As a result, the max_bin_per_group limit (=256) is enforced during bundling, but not when creating the FeatureGroup. When I replaced GetDefaultBin() at dataset.cpp#L134 with GetMostFreqBin(), the issue was resolved. I tested with the case reported here: #4082

@XQ-UT

XQ-UT commented Nov 28, 2023

Same issue here. Can we prioritize the fix?

shiyu1994 added a commit that referenced this issue Feb 20, 2024
* solve 'bin size 257 cannot run on GPU #3339'

#3339 (comment)

* fix typo LeafIndex -> leaf_index

---------

Co-authored-by: shiyu1994 <[email protected]>
Co-authored-by: James Lamb <[email protected]>
@damvantai

damvantai commented May 8, 2024

I also encountered the same error, in fold 3 of 5, when my categorical feature had many NaN values. The feature contained a mix of np.nan and integer values, and I had used LabelEncoder to convert the categorical features, which pushed the bin count past max_bin.

But after I used
df_train[cat_cols] = df_train[cat_cols].astype(str)
df_train[cat_cols] = df_train[cat_cols].astype("category")
the training worked fine.

Alternatively, you can set
"min_data_in_bin": 256, or higher
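The cast-to-category workaround above can be reproduced on a small frame (the DataFrame and cat_cols here are hypothetical; the loop is equivalent to the two DataFrame-level casts shown in the comment):

```python
import numpy as np
import pandas as pd

# Hypothetical training frame with a NaN-heavy categorical column.
df_train = pd.DataFrame({"cat_a": ["x", np.nan, "y", np.nan, "x"],
                         "num_b": [1.0, 2.0, 3.0, 4.0, 5.0]})
cat_cols = ["cat_a"]

# Cast to string first (NaN becomes the literal string "nan"), then to
# the pandas "category" dtype before handing the frame to LightGBM.
for col in cat_cols:
    df_train[col] = df_train[col].astype(str).astype("category")
```

Note that after `astype(str)` the missing values become an ordinary "nan" category rather than true missing values, which is likely why it sidesteps the mixed-type binning problem described above.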
