[R-package] Provide more informative error when custom objective function returns data with incorrect shape #5323

Closed
TpTheGreat opened this issue Jun 23, 2022 · 6 comments · Fixed by #5329

Comments

@TpTheGreat

Description

I am trying to train with a custom loss function, but I get this error:
booster$update(fobj = fobj) : No object function provided

Reproducible example

This is not fully reproducible because the data is not included, but the code looks like this:

MAELoss <- function(preds, dtrain) {
  labels <- attr(dtrain, 'label')
  grad <- abs(preds - labels)
  hess <- rep(0, length(preds))
  return(list(hess = hess, grad = grad))
}


model <- lgb.train(
    modelparams
    , LGBData
    , nthread = 8
    , verbose = 1
    , eval_freq = VerboseFreq
    , valids = list(OOF = LGBOOF, Train = LGBData)
    , categorical_feature = CategoricalFeats
    , early_stopping_rounds = EarlyStoppingN
    , nrounds = NumRounds
    , obj = MAELoss
)

Environment info

lightgbm 3.3.2, installed with install.packages("lightgbm")

Windows 10
R 4.2

If I change obj from MAELoss to "regression", training works fine. It only fails with my custom function.

Thanks!

@jameslamb jameslamb changed the title R Package - Custom loss not working -"booster$update(fobj = fobj) : No object function provided" [R-package] Custom loss not working -"booster$update(fobj = fobj) : No object function provided" Jun 23, 2022
@jameslamb
Collaborator

Thanks for using LightGBM and for your report!

In the future, please try to provide a minimal, reproducible example if possible. Many issues using {lightgbm} and other modeling libraries are tightly related to the size and shape of the input data, and the combination of those data characteristics and the parameters you provide. Without an example like that, anyone looking to help you is left guessing what parameters and data you used, which can extend the time it takes to resolve issues.


Happy to say, though, that I was able to create an example that reproduces this error. I ran the following code today on my Mac, using R 4.1.1 and compiling with clang.

library(lightgbm)

data(EuStockMarkets)
stockDF <- as.data.frame(EuStockMarkets)

# create a Dataset
feature_names <- c("SMI", "CAC", "FTSE")
target_name <- "DAX"

X_train <- data.matrix(stockDF[, feature_names])
y_train <- stockDF[[target_name]]

# colnames are stored in private$colnames when initializing Dataset
dtrain <- lightgbm::lgb.Dataset(
    data = X_train
    , label = y_train
    , colnames = feature_names
)

MAELoss <- function(preds, dtrain) {
  labels <- attr(dtrain, 'label')
  grad <- abs(preds - labels)
  hess <- rep(0, length(preds))
  return(list(hess = hess, grad = grad))
}

model <- lightgbm::lgb.train(
    params = list()
    , data = dtrain
    , verbose = 1
    , nrounds = 10
    , obj = MAELoss
)

On latest master (eb13f39), this produces the following logs:

[LightGBM] [Warning] Using self-defined objective function
[LightGBM] [Warning] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000500 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 765
[LightGBM] [Info] Number of data points in the train set: 1860, number of used features: 3
[LightGBM] [Warning] Using self-defined objective function
[LightGBM] [Fatal] No object function provided
Error in booster$update(fobj = fobj) : No object function provided

And on v3.3.2, it produces the following logs.

[LightGBM] [Warning] Using self-defined objective function
[LightGBM] [Warning] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000196 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 765
[LightGBM] [Info] Number of data points in the train set: 1860, number of used features: 3
[LightGBM] [Warning] Using self-defined objective function
[LightGBM] [Fatal] No object function provided
Error in booster$update(fobj = fobj) : No object function provided

It's pretty surprising, since this project has unit tests on the use of custom objective functions passed to lgb.train(), and the unit tests look very similar to the code above.

test_that("using a custom objective, custom eval, and no other metrics works", {
  set.seed(708L)
  bst <- lgb.train(
    params = list(
      num_leaves = 8L
      , learning_rate = 1.0
      , verbose = VERBOSITY
    )
    , data = dtrain
    , nrounds = 4L
    , valids = watchlist
    , obj = logregobj
    , eval = evalerror
  )

Those tests are passing for me locally, and in this project's many CI jobs.

@jmoralez
Collaborator

Could the use of attr() be the cause? I just tried your example with get_field() instead and it works for me.

@jameslamb
Collaborator

ahhhh RIGHT AFTER I posted that, I think I found the root cause.

You cannot use attr(dtrain, 'label') to access the label data. Even on a constructed Dataset, it returns NULL:

library(lightgbm)

data(EuStockMarkets)
stockDF <- as.data.frame(EuStockMarkets)

feature_names <- c("SMI", "CAC", "FTSE")
target_name <- "DAX"

X_train <- data.matrix(stockDF[, feature_names])
y_train <- stockDF[[target_name]]

dtrain <- lightgbm::lgb.Dataset(
    data = X_train
    , label = y_train
    , colnames = feature_names
    , free_raw_data = FALSE
)
dtrain$construct()
attr(dtrain, 'label')
# NULL

Instead, please use get_field().

dtrain$get_field("label")
# [1] 1628.75 1613.63 1606.51 1621.04 1618.16 1610.61 ...

When I change your custom objective function to access the label that way, training proceeds successfully.
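For reference, here is a minimal sketch of a custom MAE-style objective that reads the label through get_field(). The gradient and Hessian here (the sign of the residual and a constant 1) are assumptions of this sketch, not something discussed in this thread. `fake_dtrain` is a hypothetical stand-in that mimics only the get_field() method, so the function can be exercised without training a model.

```r
# Custom MAE-style objective that reads labels via get_field().
# Assumption (not from this thread): gradient = sign of the residual,
# Hessian = constant 1 as a stand-in, since the true second derivative is 0.
MAELoss <- function(preds, dtrain) {
  labels <- dtrain$get_field("label")
  grad <- sign(preds - labels)
  hess <- rep(1, length(preds))
  return(list(grad = grad, hess = hess))
}

# Hypothetical stand-in exposing only get_field(); during training,
# lgb.train() would pass a real lgb.Dataset instead.
fake_dtrain <- list(
  get_field = function(field_name) {
    stopifnot(field_name == "label")
    c(10, 20, 30)
  }
)

res <- MAELoss(preds = c(12, 20, 25), dtrain = fake_dtrain)
print(res$grad)                              # 1 0 -1
print(length(res$grad) == length(res$hess))  # TRUE
```

Passed as `obj = MAELoss` to lgb.train(), as in the reproducible example above, a function like this should return a gradient and Hessian with one entry per prediction.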


I'm going to leave this issue open as a bug, since there should be an opportunity for {lightgbm} to provide a more informative error in this situation.
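To illustrate why the original function returns data with the wrong shape, and what a clearer check might look like, here is a base-R sketch. Because attr(dtrain, 'label') returns NULL, `preds - labels` evaluates to a zero-length vector, so `grad` and `hess` end up with different lengths. `check_objective_output()` is a hypothetical helper written only for this illustration. It is not part of {lightgbm} and is not necessarily how an eventual fix would work.

```r
preds <- c(1.5, 2.0, 3.5)
labels <- NULL  # what attr(dtrain, 'label') returned above

grad <- abs(preds - labels)    # arithmetic with NULL yields a zero-length vector
hess <- rep(0, length(preds))  # still one entry per prediction

print(length(grad))  # 0
print(length(hess))  # 3

# Hypothetical validation helper (not a {lightgbm} function): fail early
# with a message naming the offending element and the expected length.
check_objective_output <- function(gpair, n_preds) {
  for (name in c("grad", "hess")) {
    if (length(gpair[[name]]) != n_preds) {
      stop(sprintf(
        "custom objective returned '%s' of length %d, expected %d",
        name, length(gpair[[name]]), n_preds
      ))
    }
  }
  invisible(TRUE)
}

# Capture the error message instead of aborting the script.
msg <- tryCatch(
  check_objective_output(list(grad = grad, hess = hess), length(preds)),
  error = function(e) conditionMessage(e)
)
print(msg)
```

A message of this kind points directly at the mismatched vector, which the "No object function provided" error does not.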

@jameslamb
Collaborator

ha @jmoralez you were faster than me. Yeah it's exactly that.

@jameslamb jameslamb added the bug label Jun 23, 2022
@jameslamb jameslamb changed the title [R-package] Custom loss not working -"booster$update(fobj = fobj) : No object function provided" [R-package] Provide more informative error when custom objective function returns data with incorrect shape Jun 23, 2022
@TpTheGreat
Author

Thank you guys!!!
It works now!

@github-actions

This issue has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 19, 2023