[gpu] [python] LightGBMError: No OpenCL device found #4497

Closed
Tracked by #5153
aidiss opened this issue Aug 1, 2021 · 10 comments

@aidiss

aidiss commented Aug 1, 2021

Description

Reproducible example

Connect to the Jupyter notebook at localhost:8888 and run:

from lightgbm import LGBMClassifier
from sklearn.datasets import make_moons
model = LGBMClassifier(boosting_type='gbdt', num_leaves=31, max_depth=- 1, learning_rate=0.1, n_estimators=300, device = "gpu")
train, label = make_moons(n_samples=300000, shuffle=True, noise=0.3, random_state=None)
model.fit(train, label)

Results in

LightGBMError                             Traceback (most recent call last)
<ipython-input-1-3cadc7bec646> in <module>
      3 model = LGBMClassifier(boosting_type='gbdt', num_leaves=31, max_depth=- 1, learning_rate=0.1, n_estimators=300, device = "gpu")
      4 train, label = make_moons(n_samples=300000, shuffle=True, noise=0.3, random_state=None)
----> 5 model.fit(train, label)

Environment info

LightGBM version or commit hash: Docker version

Command(s) you used to install LightGBM

Run everything according to https://github.com/microsoft/LightGBM/tree/master/docker/gpu

mkdir lightgbm-docker
cd lightgbm-docker
wget https://raw.githubusercontent.com/Microsoft/LightGBM/master/docker/gpu/dockerfile.gpu
docker build -f dockerfile.gpu -t lightgbm-gpu .
nvidia-docker run --rm -d --name lightgbm-gpu -p 8888:8888 -v /home:/home lightgbm-gpu

Full traceback

---------------------------------------------------------------------------
LightGBMError                             Traceback (most recent call last)
<ipython-input-1-3cadc7bec646> in <module>
      3 model = LGBMClassifier(boosting_type='gbdt', num_leaves=31, max_depth=- 1, learning_rate=0.1, n_estimators=300, device = "gpu")
      4 train, label = make_moons(n_samples=300000, shuffle=True, noise=0.3, random_state=None)
----> 5 model.fit(train, label)

/opt/conda/envs/py3/lib/python3.8/site-packages/lightgbm/sklearn.py in fit(self, X, y, sample_weight, init_score, eval_set, eval_names, eval_sample_weight, eval_class_weight, eval_init_score, eval_metric, early_stopping_rounds, verbose, feature_name, categorical_feature, callbacks, init_model)
    888                     valid_sets[i] = (valid_x, self._le.transform(valid_y))
    889 
--> 890         super().fit(X, _y, sample_weight=sample_weight, init_score=init_score, eval_set=valid_sets,
    891                     eval_names=eval_names, eval_sample_weight=eval_sample_weight,
    892                     eval_class_weight=eval_class_weight, eval_init_score=eval_init_score,

/opt/conda/envs/py3/lib/python3.8/site-packages/lightgbm/sklearn.py in fit(self, X, y, sample_weight, init_score, group, eval_set, eval_names, eval_sample_weight, eval_class_weight, eval_init_score, eval_group, eval_metric, early_stopping_rounds, verbose, feature_name, categorical_feature, callbacks, init_model)
    681             init_model = init_model.booster_
    682 
--> 683         self._Booster = train(params, train_set,
    684                               self.n_estimators, valid_sets=valid_sets, valid_names=eval_names,
    685                               early_stopping_rounds=early_stopping_rounds,

/opt/conda/envs/py3/lib/python3.8/site-packages/lightgbm/engine.py in train(params, train_set, num_boost_round, valid_sets, valid_names, fobj, feval, init_model, feature_name, categorical_feature, early_stopping_rounds, evals_result, verbose_eval, learning_rates, keep_training_booster, callbacks)
    226     # construct booster
    227     try:
--> 228         booster = Booster(params=params, train_set=train_set)
    229         if is_valid_contain_train:
    230             booster.set_train_data_name(train_data_name)

/opt/conda/envs/py3/lib/python3.8/site-packages/lightgbm/basic.py in __init__(self, params, train_set, model_file, model_str, silent)
   2232             params_str = param_dict_to_str(params)
   2233             self.handle = ctypes.c_void_p()
-> 2234             _safe_call(_LIB.LGBM_BoosterCreate(
   2235                 train_set.handle,
   2236                 c_str(params_str),

/opt/conda/envs/py3/lib/python3.8/site-packages/lightgbm/basic.py in _safe_call(ret)
    108     """
    109     if ret != 0:
--> 110         raise LightGBMError(_LIB.LGBM_GetLastError().decode('utf-8'))
    111 
    112 

LightGBMError: No OpenCL device found
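While the root cause is being investigated, one possible stopgap is to retry on the CPU when this specific error appears. The following is only a sketch, not part of LightGBM: `fit_with_device_fallback` and its `make_model` argument are hypothetical names introduced here, and the match on the message text assumes the error string stays exactly as shown in the traceback above.

```python
def fit_with_device_fallback(make_model, X, y, devices=("gpu", "cpu")):
    """Try each device in order and return (fitted_model, device_used).

    make_model is a hypothetical callable: device name -> unfitted estimator.
    """
    last_error = None
    for device in devices:
        model = make_model(device)
        try:
            model.fit(X, y)
            return model, device
        except Exception as err:  # lightgbm raises LightGBMError here
            if "No OpenCL device found" not in str(err):
                raise  # unrelated failure: do not hide it
            last_error = err
    raise RuntimeError(f"no usable device among {devices}") from last_error


# Intended use with the estimator from the report (needs lightgbm installed):
#   from lightgbm import LGBMClassifier
#   model, used = fit_with_device_fallback(
#       lambda d: LGBMClassifier(n_estimators=300, device=d), train, label)
```

This only hides the symptom; the container still has no usable OpenCL device, so training silently runs on the CPU.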

Additional Comments

I have seen similar issues that were closed as resolved. As I understand it, the fix was to add
mkdir -p /etc/OpenCL/vendors && echo "libnvidia-opencl.so.1" > /etc/OpenCL/vendors/nvidia.icd
to the image, but this command is now already included in the Dockerfile.
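Having the .icd file in place does not guarantee that the library it names exists inside the container. A rough way to check this from the notebook is sketched below. The helper name `check_icd_files` is made up for this example, and it assumes each .icd file holds a single library name or path, as the `echo` command above writes it.

```python
import ctypes
from pathlib import Path


def check_icd_files(vendors_dir="/etc/OpenCL/vendors"):
    """Return (icd_file, library_name, loadable) for each *.icd file found."""
    directory = Path(vendors_dir)
    if not directory.is_dir():
        return []
    results = []
    for icd in sorted(directory.glob("*.icd")):
        # Assumption: the file holds one line, the vendor library name or path.
        library = icd.read_text().strip()
        loadable = False
        if library:
            try:
                ctypes.CDLL(library)  # ask the dynamic loader to resolve it
                loadable = True
            except OSError:
                pass
        results.append((icd.name, library, loadable))
    return results


for name, library, loadable in check_icd_files():
    print(f"{name}: {library} -> {'loadable' if loadable else 'NOT loadable'}")
```

An entry reported as not loadable would suggest the NVIDIA OpenCL library is missing from the container, regardless of what the .icd file says.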

@valentasgruzauskas

I am seeing this error as well. I tried following these guidelines: https://docs.docker.com/docker-for-windows/wsl/
However, I still cannot run LightGBM with GPU support in Docker. LightGBM + GPU works when I install the library locally, without Docker.

I think my issue is related to drivers or to GPU pass-through on Windows (e.g. https://stackoverflow.com/questions/49589229/is-gpu-pass-through-possible-with-docker-for-windows).

When I try to run:

docker run --rm -it --gpus=all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark

I receive this error:

docker: Error response from daemon: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: Running hook #0:: error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: initialization error: driver error: failed to process request: unknown.

Can you clarify which steps I should follow to use LightGBM + GPU in Docker on Windows 10 Pro?

@StrikerRUS
Collaborator

@aidiss @valentasgruzauskas Are you both using WSL?
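For anyone unsure how to answer this, a common heuristic is to look at the kernel release string, which on WSL contains "microsoft". The sketch below relies on that heuristic only; `is_wsl` is a name chosen for this example, not a LightGBM or Docker API.

```python
import platform


def is_wsl(kernel_release=None):
    """Heuristic: WSL kernels report 'microsoft' in their release string."""
    if kernel_release is None:
        kernel_release = platform.uname().release
    return "microsoft" in kernel_release.lower()


print("Running under WSL:", is_wsl())
```

Run it on the host rather than inside the container if the goal is to describe the host setup.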

@aidiss
Author

aidiss commented Aug 3, 2021

@StrikerRUS No, I am on Linux.

@RyanVereque

@aidiss

I have seen similar issues that were closed as resolved. As I understand it, the fix was to add
mkdir -p /etc/OpenCL/vendors && echo "libnvidia-opencl.so.1" > /etc/OpenCL/vendors/nvidia.icd
to the image, but this command is now already included in the Dockerfile.

I was encountering the same issue as you and was looking at these same solutions.

But it turns out that, for me, libnvidia-opencl.so.1 is no longer present anywhere in the filesystem of the image built from the Dockerfile at https://github.com/microsoft/LightGBM/blob/master/docker/gpu/dockerfile.gpu, and I don't know why. Installing nvidia-opencl-icd-375 adds it back, yet clinfo still reports 0 devices.
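The same device count can be read from Python without clinfo by calling the OpenCL loader through ctypes. This is a minimal sketch under stated assumptions: the function name `count_opencl_gpus` is invented here, the loader is assumed to be installed as libOpenCL.so.1 when `find_library` cannot locate it, and `CL_DEVICE_TYPE_GPU` is the constant from the OpenCL headers. `clGetPlatformIDs` and `clGetDeviceIDs` are the standard OpenCL C entry points.

```python
import ctypes
import ctypes.util

CL_SUCCESS = 0
CL_DEVICE_TYPE_GPU = 1 << 2  # value from the OpenCL headers


def _load_opencl():
    """Return a handle to the OpenCL loader, or None if it cannot be loaded."""
    for name in (ctypes.util.find_library("OpenCL"), "libOpenCL.so.1"):
        if name is None:
            continue
        try:
            return ctypes.CDLL(name)
        except OSError:
            continue
    return None


def count_opencl_gpus():
    """Return {platform_index: gpu_count}, or None if no loader is installed."""
    cl = _load_opencl()
    if cl is None:
        return None
    cl.clGetPlatformIDs.argtypes = [
        ctypes.c_uint32, ctypes.POINTER(ctypes.c_void_p),
        ctypes.POINTER(ctypes.c_uint32)]
    cl.clGetPlatformIDs.restype = ctypes.c_int32
    cl.clGetDeviceIDs.argtypes = [
        ctypes.c_void_p, ctypes.c_uint64, ctypes.c_uint32,
        ctypes.POINTER(ctypes.c_void_p), ctypes.POINTER(ctypes.c_uint32)]
    cl.clGetDeviceIDs.restype = ctypes.c_int32

    # First call asks only for the number of platforms.
    n_platforms = ctypes.c_uint32(0)
    status = cl.clGetPlatformIDs(0, None, ctypes.byref(n_platforms))
    if status != CL_SUCCESS or n_platforms.value == 0:
        return {}
    platforms = (ctypes.c_void_p * n_platforms.value)()
    cl.clGetPlatformIDs(n_platforms.value, platforms, None)

    counts = {}
    for index, platform_id in enumerate(platforms):
        n_devices = ctypes.c_uint32(0)
        status = cl.clGetDeviceIDs(
            platform_id, CL_DEVICE_TYPE_GPU, 0, None, ctypes.byref(n_devices))
        # A non-success status (e.g. no device found) is counted as zero GPUs.
        counts[index] = n_devices.value if status == CL_SUCCESS else 0
    return counts


print(count_opencl_gpus())
```

A result of `None` means no OpenCL loader could be loaded at all, while an empty dict or all-zero counts corresponds to the "0 devices" situation described above.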

@jameslamb added the bug label Aug 31, 2021
@jameslamb changed the title from "LightGBMError: No OpenCL device found" to "[gpu] [python] LightGBMError: No OpenCL device found" Aug 31, 2021
@eXTure

eXTure commented Nov 15, 2021

I have the same issue, I'm using WSL.

@RyanVereque

I had not run the sample code again since my last comment until just now. With the latest commit it no longer gives this error. I haven't checked why or how, though!

@jameslamb mentioned this issue Apr 14, 2022
@aidiss
Author

aidiss commented Jul 1, 2022

It worked well for a few months.
Now I am getting the same error again.

@shiyu1994
Collaborator

@aidiss Thanks for using LightGBM. So now even with
mkdir -p /etc/OpenCL/vendors && echo "libnvidia-opencl.so.1" > /etc/OpenCL/vendors/nvidia.icd
the problem still occurs?

@github-actions

This issue has been automatically closed because it has been awaiting a response for too long. When you have time to work with the maintainers to resolve this issue, please post a new comment and it will be re-opened. If the issue has been locked for editing by the time you return to it, please open a new issue and reference this one. Thank you for taking the time to improve LightGBM!

@github-actions

This issue has been automatically locked since there has not been any recent activity since it was closed.
To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues
including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 15, 2023