[CUDA] Add quantile regression objective for new CUDA version #5605

Merged
merged 60 commits into from
Mar 21, 2023


shiyu1994
Collaborator

Add quantile regression objective for new CUDA version.

update log in test_register_logger
add test cases for regression objectives
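For readers unfamiliar with the objective, here is a minimal sketch of the standard quantile (pinball) loss and the per-sample gradient a boosting objective would derive from it. This is not the CUDA code from this PR. The function names are illustrative, and the constant hessian of 1.0 is an assumption (the true second derivative is zero almost everywhere, so implementations commonly substitute a constant).

```python
def pinball_loss(label: float, score: float, alpha: float) -> float:
    """Quantile loss for target quantile alpha in (0, 1)."""
    diff = label - score
    # Under-prediction is weighted by alpha, over-prediction by (1 - alpha).
    return alpha * diff if diff >= 0 else (alpha - 1.0) * diff


def pinball_grad_hess(label: float, score: float, alpha: float):
    """Gradient of the loss with respect to score, plus a surrogate hessian."""
    grad = (1.0 - alpha) if score >= label else -alpha
    hess = 1.0  # assumption: constant surrogate, not the true second derivative
    return grad, hess


print(pinball_loss(3.0, 1.0, 0.75))       # 1.5
print(pinball_grad_hess(3.0, 1.0, 0.75))  # (-0.75, 1.0)
```

With alpha = 0.75, under-predicting costs three times as much as over-predicting, which pushes the fitted score toward the 75th percentile.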
@shiyu1994 shiyu1994 requested a review from jameslamb as a code owner November 25, 2022 16:06
@shiyu1994 shiyu1994 changed the title [WIP] [CUDA] Add quantile regression objective for new CUDA version [CUDA] Add quantile regression objective for new CUDA version Dec 28, 2022
@shiyu1994
Collaborator Author

shiyu1994 commented Mar 16, 2023

It is strange that the following nvlink errors

nvlink error   : Entry function '_ZN8LightGBM33CUDAConstructHistogramDenseKernelIhfLm11120EEEvPKNS_20CUDALeafSplitsStructEPKfS5_PKT_PKjSA_PKii' uses too much shared data (0xcbfc bytes, 0xc000 max) (target: sm_60)
nvlink error   : Entry function '_ZN8LightGBM34CUDAConstructHistogramSparseKernelItmfLm11120EEEvPKNS_20CUDALeafSplitsStructEPKfS5_PKT_PKT0_SB_PKji' uses too much shared data (0xcbfc bytes, 0xc000 max) (target: sm_60)

should arise with CUDA 10.0, even though this PR changes no code related to CUDAConstructHistogramSparseKernel or CUDAConstructHistogramDenseKernel.

Note that for CUDA 10.0, the master branch already uses a shared memory size smaller than the stated maximum of 0xc000 bytes (which should allow DP_SHARED_HIST_SIZE = 6144) to avoid the error above.

#if CUDART_VERSION == 10000
#define DP_SHARED_HIST_SIZE (5560)
#else
#define DP_SHARED_HIST_SIZE (6144)
#endif
#define SP_SHARED_HIST_SIZE (DP_SHARED_HIST_SIZE * 2)
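
The link between the 0xc000-byte limit and DP_SHARED_HIST_SIZE = 6144 can be checked with a little arithmetic. The sketch below assumes each double-precision histogram entry takes 8 bytes; the helper name is illustrative and is not part of LightGBM.

```python
SHARED_LIMIT_BYTES = 0xC000  # maximum reported by nvlink for sm_60


def dp_hist_bytes(dp_shared_hist_size: int) -> int:
    """Bytes used by the shared histogram, assuming 8-byte entries."""
    return dp_shared_hist_size * 8


print(SHARED_LIMIT_BYTES)   # 49152
print(dp_hist_bytes(6144))  # 49152, exactly the limit
print(dp_hist_bytes(5560))  # 44480, the CUDA 10.0 setting
```

Under that assumption, 6144 entries fill the limit exactly, so any additional shared data in the kernel would push it over; the CUDA 10.0 value of 5560 leaves 4672 bytes of headroom.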

It is unclear to me why the code added in this PR forces DP_SHARED_HIST_SIZE for CUDA 10.0 to be even smaller than 5560.

@guolinke Do you have any idea?

@guolinke
Collaborator

By how many bytes was the limit exceeded? It looks quite large. Are there any functions that might inherit from or call other functions that use shared memory?

@shiyu1994
Collaborator Author

By how many bytes was the limit exceeded?

0xcbfc - 0xc000 = 0x0bfc = 3068 bytes
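
The same numbers can be broken down a little further. The sketch below reads the value 11120 from the `Lm11120E` fragment of the mangled kernel names and the element type from the `f` (float) fragment, and assumes the shared histogram therefore takes 11120 × 4 bytes. Whatever remains would be other shared data in the kernel; that split is an inference from the error message, not something confirmed in this thread.

```python
reported_bytes = 0xCBFC  # shared data reported by nvlink
limit_bytes = 0xC000     # maximum for sm_60

excess_bytes = reported_bytes - limit_bytes
print(excess_bytes)  # 3068

# Assumption: the histogram buffer holds 11120 float entries (4 bytes each).
hist_entries = 11120  # 2 * 5560, i.e. SP_SHARED_HIST_SIZE for CUDA 10.0
hist_bytes = hist_entries * 4
other_bytes = reported_bytes - hist_bytes
print(hist_bytes)   # 44480
print(other_bytes)  # 7740
```

If the assumption holds, roughly 7.5 KiB of shared data beyond the histogram is being attributed to these kernels in the CUDA 10.0 build.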

Are there any functions that might inherit from or call other functions that use shared memory?

No. The code compiles with CUDA 11.0 but not with CUDA 10.0.

@shiyu1994
Collaborator Author

@guolinke Let me lower #define DP_SHARED_HIST_SIZE (5560) so that compilation succeeds, and let's drop support for CUDA 10.0 in a separate PR.

@shiyu1994
Collaborator Author

It is unclear to me why the code added in this PR forces DP_SHARED_HIST_SIZE for CUDA 10.0 to be even smaller than 5560.

I just used binary search and found that the maximum allowed value is 5176.
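
A minimal sketch of such a binary search follows. In practice the predicate would be "does the CUDA 10.0 build link with this value"; here it is replaced by a hypothetical byte-budget model so the example runs on its own. The model assumes 8 bytes per DP_SHARED_HIST_SIZE entry plus 7740 bytes of other shared data, a figure inferred from the error message as 0xcbfc - 11120 * 4. Both the model and the function names are assumptions for illustration.

```python
LIMIT_BYTES = 0xC000               # sm_60 maximum reported by nvlink
OTHER_BYTES = 0xCBFC - 11120 * 4   # 7740, inferred non-histogram shared data


def fits(dp_shared_hist_size: int) -> bool:
    """Hypothetical stand-in for 'the build links with this value'."""
    return dp_shared_hist_size * 8 + OTHER_BYTES <= LIMIT_BYTES


def max_fitting_size(lo: int, hi: int, predicate):
    """Largest n in [lo, hi] with predicate(n) true.

    Assumes the predicate is true up to some threshold and false above it.
    Returns None if no value in the range satisfies it.
    """
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        if predicate(mid):
            best = mid
            lo = mid + 1   # mid works, try larger values
        else:
            hi = mid - 1   # mid fails, try smaller values
    return best


print(max_fitting_size(1, 5560, fits))  # 5176
```

The model reproduces the value found above: 5176 entries need 49148 bytes in total, while 5177 would need 49156, just over the 49152-byte limit. That agreement is suggestive but does not explain why the extra shared data appears only with CUDA 10.0.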

@shiyu1994
Collaborator Author

@guolinke This is ready for review.

@shiyu1994 shiyu1994 merged commit ce0813e into master Mar 21, 2023
@shiyu1994 shiyu1994 deleted the cuda/objective-regression-quantile branch March 21, 2023 04:21
@github-actions

This pull request has been automatically locked since there has not been any recent activity since it was closed.
To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues
including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 15, 2023