
[gpu] Large dataset In LGBMRegressor Failed #4926

Closed
jiapengwen opened this issue Jan 5, 2022 · 4 comments · Fixed by #4928
@jiapengwen
Contributor

Description

Using the GPU version, training on a large dataset fails.

Reproducible example

import time

import numpy as np
import lightgbm as lgbm


if __name__ == '__main__':
    time1 = time.time()
    # X = np.random.random((4300000, 2200))  # works
    # y = np.random.random((4300000,))
    # X = np.random.random((7000000, 2200))  # works
    # y = np.random.random((7000000,))
    X = np.random.random((8000000, 2200))  # fails
    y = np.random.random((8000000,))
    time2 = time.time()
    print('construct data cost:', time2 - time1)
    print(X[2][2])
    print(y[2:10])

    time1 = time.time()
    model = lgbm.LGBMRegressor(
        device="gpu",
        n_estimators=1000,
        verbose=4,
        max_bin=16,
        tree_learner='serial',
        gpu_use_dp='false',
        n_jobs=1,
        max_depth=7,
        num_leaves=31,
        min_child_samples=17000,
    )
    model.fit(X, y, callbacks=[lgbm.log_evaluation()])
    time2 = time.time()
    print('gpu cost:', time2 - time1)


######################################
[LightGBM] [Warning] Accuracy may be bad since you didn't explicitly set num_leaves OR 2^max_depth > num_leaves. (num_leaves=31).
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 35200
[LightGBM] [Info] Number of data points in the train set: 8000000, number of used features: 2200
[LightGBM] [Info] Using GPU Device: Quadro RTX 6000, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 16 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 8
terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::compute::opencl_error> >'
  what():  Memory Object Allocation Failure

Environment info

LightGBM version or commit hash:

LightGBM version: 3.3.1

Command(s) you used to install LightGBM

pip3 install lightgbm --install=--gpu

Python version: 3.6.9

Additional Comments

My machine has 227 GB of memory.
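For rough context (my own back-of-envelope arithmetic, not taken from the thread): the raw float64 feature matrix in the failing case is about 131 GiB, which fits in 227 GB of host RAM, whereas the error above is raised from an OpenCL buffer allocation (`boost::compute::opencl_error`), i.e. on the GPU side. A minimal sketch of the size calculation:

```python
# Back-of-envelope memory estimate for the failing example above.
# Shapes are taken from the script: 8,000,000 rows x 2,200 float64 features.
rows, cols = 8_000_000, 2_200
bytes_per_float64 = 8

x_gib = rows * cols * bytes_per_float64 / 2**30  # feature matrix X
y_gib = rows * bytes_per_float64 / 2**30         # target vector y

print(f"X: {x_gib:.1f} GiB, y: {y_gib:.3f} GiB")
# X: 131.1 GiB, y: 0.060 GiB
```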

@jameslamb jameslamb changed the title Large dataset In LGBMRegressor Failed [gpu] Large dataset In LGBMRegressor Failed Jan 5, 2022
@jameslamb jameslamb added the bug label Jan 5, 2022
@jiapengwen
Contributor Author

jiapengwen commented Jan 6, 2022

I have fixed it and opened a PR; please check #4928.

@jiapengwen
Contributor Author

@jameslamb

@jameslamb
Collaborator

Closed by #4928.

@github-actions

This issue has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 23, 2023