Removing redundant learning_rate(s) parameter #515

Open
JackTemaki opened this issue May 21, 2021 · 8 comments
Labels
potential-new-behavior Discussions about RETURNN behaviour

Comments

@JackTemaki
Collaborator

Discussion related to #508

Right now there are both a learning_rate and a learning_rates parameter, which often causes confusion for users.
I am not sure which would be the better solution here (i.e. which one to keep), but the current situation is not good.

@JackTemaki JackTemaki added the potential-new-behavior Discussions about RETURNN behaviour label May 21, 2021
@albertz
Member

albertz commented May 21, 2021

They are not exactly redundant or the same. learning_rate defines the initial and default learning rate.

learning_rates sets predefined learning rates for specific epochs.
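
For illustration, a minimal config sketch of the two options (the values are hypothetical, not a recommendation):

learning_rate = 0.001  # initial/default learning rate
learning_rates = {5: 0.0005, 10: 0.0001}  # predefined LRs for specific epochs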

@christophmluscher
Collaborator

Can the different behaviours be merged into a single variable? Maybe something like:
learning_rates = {"initial": FLOAT, "default": FLOAT, "predefined": LIST}
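
For concreteness, with hypothetical values this proposal would look like:

learning_rates = {"initial": 0.0001, "default": 0.001, "predefined": [0.0005, 0.001]}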

@albertz
Member

albertz commented May 21, 2021

Currently learning_rates is either a list of LRs or a dict epoch -> LR (so a sparse list of LRs).
I'm not sure if extending/changing learning_rates to also cover the initial/default LR is in any way more helpful (or easier) for the user than just specifying it separately. I personally think learning_rate as a separate option is cleaner.
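
To illustrate the two accepted forms (hypothetical values):

learning_rates = [0.0001, 0.0005, 0.001]  # list: LRs for epochs 1, 2, 3
learning_rates = {1: 0.0001, 10: 0.001}  # dict: sparse, epoch -> LR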

@albertz
Member

albertz commented May 21, 2021

I think @JackTemaki is right insofar as learning_rates makes learning_rate redundant when it also specifies the LR of epoch 1. However, I don't see this as too much of a problem. And this also does not necessarily need to be the case.

Maybe there can be an extra check (but independent from new behavior) whether learning_rate has no effect and is different from max(learning_rates.values()) or so, and then just print a warning.
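
A minimal sketch of what such a check could look like (hypothetical, not actual RETURNN code; the dict-like config access is an assumption):

# Hypothetical sketch, not actual RETURNN code.
def warn_if_learning_rate_unused(config):
    lr = config.get("learning_rate")
    lrs = config.get("learning_rates")
    if lr is None or not lrs:
        return
    values = list(lrs.values()) if isinstance(lrs, dict) else list(lrs)
    if lr != max(values):
        print("WARNING: learning_rate (%s) differs from max(learning_rates) (%s) "
              "and might have no effect." % (lr, max(values)))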

I think I would still leave learning_rate as a config option.

@christophmluscher
Collaborator

But then we are still left with learning_rate and learning_rates, which is exactly what @JackTemaki wanted to change. Merging this into a single variable would not take away your options but would mainly increase readability, since the keywords add information. That was my reasoning.

@albertz
Member

albertz commented May 21, 2021

But then we are still left with learning_rate and learning_rates,

Yes, this is what I meant. I would prefer to keep both.

Merging this into a single variable would not take away your options but would mainly increase readability, since the keywords add information.

But what new information? It's exactly the same information, just expressed in a different way (which I think is worse than before).
And you still have the same ambiguity problem, e.g.:

learning_rates = {"initial": 1.0, "default": 0.5, 1: 0.3}

When does it use 'initial', 'default' or the one from epoch 1?

Actually, I would also need to read the code to understand exactly which is used in which case. Right now I'm not really sure whether learning_rate is ever used when learning_rates specifies epoch 1.


Another suggestion:

I would still keep both options, but only allow the user to set one of them. If learning_rates is used, it must define the LR of the first epoch.
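
A rough sketch of how such a restriction might be enforced (hypothetical, not actual RETURNN code):

# Hypothetical sketch of the proposed restriction, not actual RETURNN code.
def check_lr_options(config):
    lr_set = "learning_rate" in config
    lrs = config.get("learning_rates")
    if lr_set and lrs is not None:
        raise ValueError("Set either learning_rate or learning_rates, not both.")
    if lrs is not None:
        # Dict form must name epoch 1 explicitly; list form covers it with its first entry.
        defines_first = (1 in lrs) if isinstance(lrs, dict) else (len(lrs) >= 1)
        if not defines_first:
            raise ValueError("learning_rates must define the LR of the first epoch.")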

@JackTemaki
Collaborator Author

I would still keep both options, but only allow the user to set one of them. If learning_rates is used, it must define the LR of the first epoch.

This sounds like a good compromise. I was not even aware that learning_rates supports this kind of dict syntax; I guess the docs should be updated at some point as well.

@christophmluscher
Collaborator

christophmluscher commented May 21, 2021

Yes, this is what I meant. I would prefer to keep both.

OK.

When does it use 'initial', 'default' or the one from epoch 1?

Good point.

I would still keep both options, but only allow the user to set one of them. If learning_rates is used, it must define the LR of the first epoch.

I like this.
