[python-package] remove 'fobj' in favor of passing custom objective function in params #5052
Conversation
@TremaMiguel Thanks for working on this! Could you please check my comments above?
@shiyu1994 Thanks for taking the time to review; I've provided answers to the open threads.
Close and reopen to retrigger CI.
Signed-off-by: Miguel Trejo <[email protected]>
Thank you! The changes LGTM now. I'm not sure if other reviewers would like to take a look before this is merged, so gently pinging @StrikerRUS @jameslamb @jmoralez @guolinke.
@TremaMiguel Thanks for your work! Since the scikit-learn wrapper internally calls train() (LightGBM/python-package/lightgbm/sklearn.py, lines 753 to 764 in 9a4e706), we can remove the duplicated code from sklearn.py that now lives in engine.py, right? (LightGBM/python-package/lightgbm/sklearn.py, lines 668 to 675 in 9a4e706)
Also, please make the same changes for
These changes are quite complicated and require more work than it seems at first glance. So I suggest splitting this PR into two, with the changes related to the custom objective and the custom metric respectively.
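The signature translation the reviewers are discussing can be sketched as follows. This is an illustrative sketch, not the actual LightGBM source: `ObjectiveFunctionWrapper` mimics the role of the `_ObjectiveFunctionWrapper` mentioned below, and `FakeDataset` is a hypothetical stand-in for `lgb.Dataset` used only for demonstration. The wrapper adapts a scikit-learn-style objective, `func(y_true, y_pred)`, to the engine-style signature `func(preds, dataset)` that the training loop calls.

```python
import numpy as np

class ObjectiveFunctionWrapper:
    """Sketch of a wrapper adapting a sklearn-style objective,
    func(y_true, y_pred) -> (grad, hess), to the engine-style
    signature func(preds, dataset) -> (grad, hess)."""

    def __init__(self, func):
        self.func = func

    def __call__(self, preds, dataset):
        # The engine supplies raw predictions and the Dataset; the
        # sklearn-style function expects the labels first.
        labels = dataset.get_label()
        return self.func(labels, preds)

# Hypothetical minimal stand-in for lgb.Dataset, for demonstration only.
class FakeDataset:
    def __init__(self, label):
        self._label = label

    def get_label(self):
        return self._label

def squared_error(y_true, y_pred):
    grad = y_pred - y_true       # derivative of 0.5 * (pred - true)^2
    hess = np.ones_like(y_true)  # second derivative is constant 1
    return grad, hess

wrapped = ObjectiveFunctionWrapper(squared_error)
grad, hess = wrapped(np.array([0.5, 1.5]), FakeDataset(np.array([1.0, 1.0])))
```

With such a wrapper, the same engine-side code path can serve both the `lgb.train` API and the scikit-learn wrapper, which is why the duplicated code can be removed.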
params['objective'] = 'None'  # objective = nullptr for unknown objective
params['objective'] = _ObjectiveFunctionWrapper(self._objective)
else:
    params['objective'] = 'None'
OK, I see now.
It seems that LightGBM checks the consistency of the objective and the number of classes even during prediction.
Line 261 in 6b56a90
void Config::CheckParamConflict() {
Full logs:
==================================================================================================== FAILURES ====================================================================================================
________________________________________________________________________________________ test_multiclass_custom_objective ________________________________________________________________________________________
def test_multiclass_custom_objective():
centers = [[-4, -4], [4, 4], [-4, 4]]
X, y = make_blobs(n_samples=1_000, centers=centers, random_state=42)
params = {'n_estimators': 10, 'num_leaves': 7}
builtin_obj_model = lgb.LGBMClassifier(**params)
builtin_obj_model.fit(X, y)
builtin_obj_preds = builtin_obj_model.predict_proba(X)
custom_obj_model = lgb.LGBMClassifier(objective=sklearn_multiclass_custom_objective, **params)
custom_obj_model.fit(X, y)
> custom_obj_preds = softmax(custom_obj_model.predict(X, raw_score=True))
test_sklearn.py:1299:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
d:\miniconda3\lib\site-packages\lightgbm\sklearn.py:1050: in predict
result = self.predict_proba(X, raw_score, start_iteration, num_iteration,
d:\miniconda3\lib\site-packages\lightgbm\sklearn.py:1063: in predict_proba
result = super().predict(X, raw_score, start_iteration, num_iteration, pred_leaf, pred_contrib, **kwargs)
d:\miniconda3\lib\site-packages\lightgbm\sklearn.py:813: in predict
return self._Booster.predict(X, raw_score=raw_score, start_iteration=start_iteration, num_iteration=num_iteration,
d:\miniconda3\lib\site-packages\lightgbm\basic.py:3538: in predict
return predictor.predict(data, start_iteration, num_iteration,
d:\miniconda3\lib\site-packages\lightgbm\basic.py:813: in predict
preds, nrow = self.__pred_for_np2d(data, start_iteration, num_iteration, predict_type)
d:\miniconda3\lib\site-packages\lightgbm\basic.py:903: in __pred_for_np2d
return inner_predict(mat, start_iteration, num_iteration, predict_type)
d:\miniconda3\lib\site-packages\lightgbm\basic.py:873: in inner_predict
_safe_call(_LIB.LGBM_BoosterPredictForMat(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
ret = -1
def _safe_call(ret: int) -> None:
"""Check the return value from C API call.
Parameters
----------
ret : int
The return value from C API calls.
"""
if ret != 0:
> raise LightGBMError(_LIB.LGBM_GetLastError().decode('utf-8'))
E lightgbm.basic.LightGBMError: Number of classes must be 1 for non-multiclass training
d:\miniconda3\lib\site-packages\lightgbm\basic.py:142: LightGBMError
---------------------------------------------------------------------------------------------- Captured stdout call ----------------------------------------------------------------------------------------------
[LightGBM] [Warning] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000075 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 510
[LightGBM] [Info] Number of data points in the train set: 1000, number of used features: 2
[LightGBM] [Info] Start training from score -1.096614
[LightGBM] [Info] Start training from score -1.099613
[LightGBM] [Info] Start training from score -1.099613
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[LightGBM] [Warning] Using self-defined objective function
[LightGBM] [Warning] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000071 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 510
[LightGBM] [Info] Number of data points in the train set: 1000, number of used features: 2
[LightGBM] [Warning] Using self-defined objective function
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
---------------------------------------------------------------------------------------------- Captured stderr call ----------------------------------------------------------------------------------------------
[LightGBM] [Fatal] Number of classes must be 1 for non-multiclass training
Let's then pass 'None' for the custom objective during the predict phase.
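Because the booster is stored with `objective='None'` for a custom objective, `predict` returns raw scores, and the test above converts them to class probabilities with `softmax`. A minimal NumPy equivalent of the `softmax` used in the test (the test itself presumably uses `scipy.special.softmax`) looks like this:

```python
import numpy as np

def softmax(raw_scores, axis=1):
    """Row-wise softmax, numerically stabilized by subtracting the
    per-row maximum before exponentiating."""
    shifted = raw_scores - np.max(raw_scores, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=axis, keepdims=True)

# Raw scores for 2 samples over 3 classes -> probabilities summing to 1.
probs = softmax(np.array([[2.0, 1.0, 0.1],
                          [0.0, 0.0, 0.0]]))
```

Each output row sums to 1, so raw multiclass scores become a proper probability distribution comparable to `predict_proba` of the built-in objective.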
Some more minor suggestions for tests.
@StrikerRUS thanks for the time taken to review; I've addressed your comments. Let me know if anything else is missing.
Thank you so much for this hard work with API and test changes! LGTM!
@jameslamb @shiyu1994 @jmoralez @guolinke Please help with another pair of eyes.
Perhaps a helping hand here, @jameslamb?
I reviewed the changes and threads tonight, I have no additional comments. Thanks very much for all the hard work on this @TremaMiguel , and as always for the very thorough review @StrikerRUS .
I changed the PR title to explicitly include
@jameslamb @StrikerRUS @shiyu1994 thank you very much for reviewing this. 🙌
@TremaMiguel Let's go further with
Is there a specific way to use this with .train()? I've tried to set params['objective'] directly but get "Unknown type parameter:objective got:function", and I'm simply passing a function reference.
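For reference, the usage this PR enables is passing the callable itself in `params['objective']` (the "Unknown type parameter" error above suggests a LightGBM version from before this change). Below is a hedged sketch of the intended usage; the `lightgbm` import is guarded because the library may not be installed, and the objective here is an illustrative squared-error example, not part of LightGBM itself.

```python
import numpy as np

def squared_error_objective(preds, train_data):
    """Engine-style custom objective: takes raw predictions and the
    Dataset, returns the per-sample gradient and hessian of
    0.5 * (pred - label)^2."""
    labels = train_data.get_label()
    grad = preds - labels
    hess = np.ones_like(preds)
    return grad, hess

try:
    import lightgbm as lgb

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 5))
    y = X[:, 0] + 0.1 * rng.normal(size=100)
    train_set = lgb.Dataset(X, label=y)

    # After this PR, the callable goes into params['objective']
    # instead of the removed fobj argument of lgb.train.
    booster = lgb.train(
        {"objective": squared_error_objective, "num_leaves": 7, "verbose": -1},
        train_set,
        num_boost_round=5,
    )
except ImportError:
    # lightgbm may be unavailable in this environment; the objective
    # function itself is plain NumPy and works standalone.
    pass
```

Note the engine-style signature `(preds, train_data)`: the Dataset is passed in so the objective can fetch the labels via `get_label()`.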
This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.
Context

This PR contributes to #3244.

Changes

- Any custom evaluation function passed to `train` through `params['metric']` or `feval` is accepted.
- A custom objective function can be passed through `params['objective']`.
- In the case that `params['objective']` and `fobj` are both callable, the first one is taken into account.
- The `fobj` argument is removed.

Test