[python] Weights is ignored when customized objective function is used #5027

shiyu1994 · 2022-02-23T14:21:58Z

Description

Weights are not multiplied into the gradients and hessians returned by a customized objective function in the Python API.

Reproducible example

import numpy as np
import lightgbm as lgb

def fobj(preds, train_data):
    labels = train_data.get_label()
    return preds - labels, np.ones_like(labels)

def test():
    np.random.seed(123)
    num_data = 10000
    num_feature = 100
    train_X = np.random.randn(num_data, num_feature)
    train_y = np.mean(train_X, axis=-1)
    valid_X = np.random.randn(num_data, num_feature)
    valid_y = np.mean(valid_X, axis=-1)
    weights = np.random.rand(num_data)
    train_data = lgb.Dataset(train_X, train_y, weight=weights)  # removing weight=weights gives the same output
    valid_data = lgb.Dataset(valid_X, valid_y)
    params = {
        "verbose": 2,
        "metric": "rmse",
        "learning_rate": 0.2,
        "num_trees": 20,
    }
    booster = lgb.train(train_set=train_data, valid_sets=[valid_data], valid_names=["valid"], params=params, fobj=fobj)

if __name__ == "__main__":
    test()

LightGBM version or commit hash:
Version 3.3.2

Command(s) you used to install LightGBM
Install from source

jmoralez · 2022-02-23T21:38:03Z

Moving the conversation from #4925 (comment) here. This may be intentional, because the weights are available in the custom objective function through the training API and not through scikit-learn's, but it'd be nice to clarify this.

StrikerRUS · 2022-02-23T22:36:43Z

This may be intentional because the weights are available in the custom objective function through the training API and not through scikit-learn's but it'd be nice to clarify this.

I think this is an oversight, because one form of the custom evaluation function accepts weights in the scikit-learn API:

LightGBM/python-package/lightgbm/sklearn.py

Lines 138 to 141 in 97c8d94

    Expects a callable with following signatures:
    ``func(y_true, y_pred)``,
    ``func(y_true, y_pred, weight)``
    or ``func(y_true, y_pred, weight, group)``

I guess the same can be done for custom objective function.

jmoralez · 2022-02-23T22:41:15Z

Hmm now I'm more confused because for the objective function weights aren't allowed

LightGBM/python-package/lightgbm/sklearn.py

Lines 105 to 112 in 97c8d94

    labels = dataset.get_label()
    argc = len(signature(self.func).parameters)
    if argc == 2:
        grad, hess = self.func(labels, preds)
    elif argc == 3:
        grad, hess = self.func(labels, preds, dataset.get_group())
    else:
        raise TypeError(f"Self-defined objective function should have 2 or 3 arguments, got {argc}")

but for eval they are

LightGBM/python-package/lightgbm/sklearn.py

Lines 192 to 201 in 97c8d94

    labels = dataset.get_label()
    argc = len(signature(self.func).parameters)
    if argc == 2:
        return self.func(labels, preds)
    elif argc == 3:
        return self.func(labels, preds, dataset.get_weight())
    elif argc == 4:
        return self.func(labels, preds, dataset.get_weight(), dataset.get_group())
    else:
        raise TypeError(f"Self-defined eval function should have 2, 3 or 4 arguments, got {argc}")
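The two excerpts above dispatch on the number of parameters the user's function declares. A minimal sketch of what the same dispatch could look like if the objective followed the eval convention (weight as the third argument, group as the fourth) is shown below. This illustrates the idea under discussion, not the code merged into LightGBM; `call_objective`, `plain_l2` and `weighted_l2` are hypothetical names.

```python
from inspect import signature


def call_objective(func, labels, preds, weight=None, group=None):
    """Call func with as many arguments as its signature declares.

    Hypothetical helper: mirrors the eval-function dispatch, so a
    3-argument objective receives the weights and a 4-argument one
    receives the weights and the group.
    """
    argc = len(signature(func).parameters)
    if argc == 2:
        return func(labels, preds)
    if argc == 3:
        return func(labels, preds, weight)
    if argc == 4:
        return func(labels, preds, weight, group)
    raise TypeError(
        f"Self-defined objective function should have 2, 3 or 4 arguments, got {argc}"
    )


def plain_l2(y_true, y_pred):
    # Squared-error gradient and hessian, no weighting.
    grad = [p - y for y, p in zip(y_true, y_pred)]
    hess = [1.0 for _ in y_true]
    return grad, hess


def weighted_l2(y_true, y_pred, weight):
    # Same loss, but the user chooses to scale by the sample weights.
    grad = [w * (p - y) for y, p, w in zip(y_true, y_pred, weight)]
    hess = [float(w) for w in weight]
    return grad, hess
```

With this shape, whether weights are applied depends only on which signature the user writes: `call_objective(plain_l2, labels, preds)` ignores them, while `call_objective(weighted_l2, labels, preds, weight)` uses them.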

StrikerRUS · 2022-02-23T22:42:15Z

Hmm now I'm more confused because for the objective function weights aren't allowed

Yeah, exactly! I'm proposing to allow passing weights for the objective function.

jmoralez · 2022-02-23T22:51:40Z

I think it may be more user friendly to weigh things automatically. I think specifying sample weights either through the Dataset or a method in the sklearn API kind of implies that I want to use them to weigh my samples everywhere (grad, hess, metrics). It would be awkward that the grad and hess are weighted automatically but the metric isn't, and that if I switch to the training API I have to weigh everything myself. WDYT?

StrikerRUS · 2022-02-23T23:09:12Z

kind of implies that I want to use them to weigh my samples everywhere

Highly likely. But what if not? Then there would be no way to unweight them. I think it's better not to weight anything automatically and instead let the user choose whether to weight or not.
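As an illustration of leaving the choice to the user on the metric side, here is a minimal sketch of a custom eval function using the 3-argument `func(y_true, y_pred, weight)` form quoted earlier in the thread. The function name `weighted_rmse` is made up for this example, and the code assumes that `weight` may be `None` when no sample weights were set.

```python
import numpy as np


def weighted_rmse(y_true, y_pred, weight):
    """Weighted RMSE in the (name, value, is_higher_better) form.

    Falls back to the plain RMSE when weight is None (assumption:
    that is what is passed when no sample weights were provided).
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if weight is None:
        weight = np.ones_like(y_true)
    # np.average normalizes by the sum of the weights.
    mse = np.average((y_pred - y_true) ** 2, weights=weight)
    return "weighted_rmse", float(np.sqrt(mse)), False
```

A user who does not want a weighted metric simply writes the 2-argument form instead, and nothing is weighted behind their back.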

jmoralez · 2022-02-24T01:28:25Z

I think it's better not to weight anything automatically and instead let the user choose whether to weight or not.

I agree with you, it gives the user full control and could enable use cases like #4995, which could be achieved by weighing only the metric but not the grad and hess. So we should remove this then, right?

LightGBM/python-package/lightgbm/sklearn.py

Lines 115 to 122 in 97c8d94

    if weight is not None:
        if grad.ndim == 2:  # multi-class
            num_data = grad.shape[0]
            if weight.size != num_data:
                raise ValueError("grad and hess should be of shape [n_samples, n_classes]")
            weight = weight.reshape(num_data, 1)
        grad *= weight
        hess *= weight
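If this automatic step were removed, a user who still wants weighted gradients would need to do the equivalent inside their own objective. The sketch below restates the excerpt as a standalone function to show the broadcasting involved: reshaping the weights to `(n_samples, 1)` lets one weight scale every class column of a sample. `apply_sample_weight` is a hypothetical helper, and unlike the excerpt it returns new arrays instead of modifying its inputs in place.

```python
import numpy as np


def apply_sample_weight(grad, hess, weight):
    """Scale grad and hess by per-sample weights.

    Accepts 1-D arrays of shape (n_samples,) or 2-D multi-class
    arrays of shape (n_samples, n_classes).
    """
    grad = np.asarray(grad, dtype=float)
    hess = np.asarray(hess, dtype=float)
    weight = np.asarray(weight, dtype=float)
    if grad.ndim == 2:  # multi-class
        num_data = grad.shape[0]
        if weight.size != num_data:
            raise ValueError(
                f"expected {num_data} weights, one per row of grad, got {weight.size}"
            )
        # (n_samples, 1) broadcasts across the n_classes columns.
        weight = weight.reshape(num_data, 1)
    return grad * weight, hess * weight
```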

shiyu1994 · 2022-02-24T12:38:10Z

Highly likely. But what if not? Then there would be no way to unweight them. I think it's better not to weight anything automatically and instead let the user choose whether to weight or not.

Does that mean that, in the example above, we should let the user weight the gradients in the fobj function?

StrikerRUS · 2022-02-24T22:49:19Z

So we should remove this then, right?

Does that mean that, in the example above, we should let the user weight the gradients in the fobj function?

I guess so.

shiyu1994 · 2022-03-02T16:11:38Z

Shall we open a PR to remove the weighting in the sklearn API?

StrikerRUS · 2022-03-03T00:21:40Z

I'm for it.


github-actions · 2023-08-19T03:49:28Z

This issue has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

shiyu1994 added the bug label Feb 23, 2022

jameslamb mentioned this issue Apr 14, 2022

[RFC] 4.0.0 Release #5153

Closed

60 tasks

jmoralez mentioned this issue May 13, 2022

[python-package] allow custom weighing in fobj for scikit-learn API (closes #5027) #5211

Merged

StrikerRUS closed this as completed in #5211 Jun 27, 2022

StrikerRUS added a commit that referenced this issue Jun 27, 2022

[python-package] allow custom weighing in fobj for scikit-learn API (c…

b6deb9a

…loses #5027) (#5211) * allow custom weighing in sklearn api * add suggestions from review Co-authored-by: Nikita Titov <[email protected]>

jameslamb mentioned this issue Oct 7, 2022

[DO NOT MERGE] Release v3.3.3 #5525

Closed

40 tasks

github-actions bot locked as resolved and limited conversation to collaborators Aug 19, 2023
