Feature Request #161
Thanks for your message! Do you know if there has been any research on whether such learning rate shrinking has shown added benefits? I am not aware that other common frameworks such as XGBoost, LightGBM or CatBoost have implemented such a feature. I think it would be fairly simple to implement, and perhaps even hack right now without any modification, given that the hyperparameters are contained in one of the EvoTypes. That being said, I'm unclear whether the LR-shrinking principle can be expected to result in improved fitting the way it does for first-order gradient methods in neural networks, since the magnitude of the error is more directly taken into account with boosted trees. As such, I'd tend to favor selecting the smallest constant learning rate that results in the largest computationally acceptable number of trees (i.e. a couple of hundred).
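Not EvoTrees-specific, but as a rough illustration of the "hack it right now without any modification" idea: since the learning rate is just a hyperparameter, it can be shrunk between training stages. The sketch below uses XGBoost's training-continuation API as a stand-in for mutating an EvoTrees config between fits; the dataset and all parameter values are made up.

```python
# Hypothetical sketch (XGBoost, not EvoTrees): continue training an existing
# booster with a smaller learning rate, i.e. shrink eta between training stages
# simply by changing the hyperparameter.
import numpy as np
import xgboost as xgb
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=2000, n_features=20, noise=10.0, random_state=0)
dtrain = xgb.DMatrix(X[:1500], label=y[:1500])
dval = xgb.DMatrix(X[1500:], label=y[1500:])

params = {"objective": "reg:squarederror", "eta": 0.1, "max_depth": 3}

# Stage 1: a constant, moderately small learning rate.
bst = xgb.train(params, dtrain, num_boost_round=200,
                evals=[(dval, "val")], verbose_eval=False)

# Stage 2: shrink the learning rate 10x and keep adding trees on top of the
# existing ensemble (earlier trees keep their original contributions).
params["eta"] = 0.01
bst = xgb.train(params, dtrain, num_boost_round=200,
                evals=[(dval, "val")], xgb_model=bst, verbose_eval=False)

rmse = float(np.sqrt(np.mean((bst.predict(dval) - y[1500:]) ** 2)))
print("val rmse after both stages:", rmse)
```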
Some experiments: https://towardsdatascience.com/betaboosting-2cd6c697eb93 But it essentially says what @jeremiedb wrote above: very likely not useful.
Given the lack of both empirical and theoretical support for such a feature, I'll close this for now, as I don't see a compelling motivation to implement it in the foreseeable future.
I was wondering if we could have a feature where the learning rate is reduced by some percentage (a user-defined parameter) once the eval metric increases by some amount.
So instead of early_stopping after 20 rounds, the learning rate might be reduced by 90%.
This should allow the model to start learning again.
The idea is to generate more trees in the low loss space of models.
Consistently reducing the learning rate should allow us to move more slowly in this space and harvest a lot more models to average over.
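For concreteness, here is a minimal sketch of the proposed behaviour in a hand-rolled boosting loop (plain Python with scikit-learn trees, not EvoTrees): the learning rate is multiplied by a user-defined factor whenever the validation metric has not improved for a number of rounds, instead of stopping. All names, thresholds, and data below are illustrative assumptions, not an actual EvoTrees API.

```python
# Toy gradient-boosting loop (squared error) that shrinks the learning rate
# when the validation metric stops improving, instead of early stopping.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.metrics import mean_squared_error
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=2000, n_features=20, noise=10.0, random_state=0)
X_tr, X_val, y_tr, y_val = X[:1500], X[1500:], y[:1500], y[1500:]

lr = 0.1        # initial learning rate
shrink = 0.1    # reduce LR to 10% of its value (i.e. "reduced by 90%"), user-defined
patience = 20   # rounds without improvement before shrinking (instead of stopping)
n_rounds = 500

pred_tr = np.full_like(y_tr, y_tr.mean(), dtype=float)
pred_val = np.full_like(y_val, y_tr.mean(), dtype=float)
best_val, bad_rounds = np.inf, 0
trees, lrs = [], []  # per-tree learning rates must be stored, since lr varies

for i in range(n_rounds):
    residual = y_tr - pred_tr  # negative gradient of squared error
    tree = DecisionTreeRegressor(max_depth=3).fit(X_tr, residual)
    pred_tr += lr * tree.predict(X_tr)
    pred_val += lr * tree.predict(X_val)
    trees.append(tree)
    lrs.append(lr)

    val_mse = mean_squared_error(y_val, pred_val)
    if val_mse < best_val:
        best_val, bad_rounds = val_mse, 0
    else:
        bad_rounds += 1
        if bad_rounds >= patience:
            lr *= shrink  # shrink the learning rate instead of stopping
            bad_rounds = 0

def predict(X_new):
    """Ensemble prediction: each tree is weighted by the LR it was trained with."""
    out = np.full(len(X_new), y_tr.mean())
    for t, a in zip(trees, lrs):
        out += a * t.predict(X_new)
    return out

print(f"best val MSE: {best_val:.3f}, final lr: {lr:g}, trees: {len(trees)}")
```

Whether averaging over the many extra low-learning-rate trees actually helps is exactly the empirical question raised above; this only shows that the mechanism is straightforward to wire up.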