Removing redundant learning_rate(s) parameter #515

Open
JackTemaki opened this issue May 21, 2021 · 8 comments
Labels
potential-new-behavior Discussions about RETURNN behaviour

Comments

@JackTemaki
Collaborator

Discussion related to #508

Right now there are both a learning_rate and a learning_rates parameter, which often causes confusion for users.
I am not sure which would be the better solution here (i.e. which one to keep), but the current situation is not good.

@JackTemaki JackTemaki added the potential-new-behavior Discussions about RETURNN behaviour label May 21, 2021
@albertz
Member

albertz commented May 21, 2021

They are not exactly redundant or the same. learning_rate defines the initial and default learning rate.

learning_rates sets predefined learning rates for specific epochs.
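
For illustration, a minimal config sketch of the two options (the values are hypothetical, not a recommendation):

learning_rate = 0.001  # initial/default learning rate
learning_rates = {5: 0.0005, 10: 0.0001}  # predefined LRs for specific epochs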

@christophmluscher
Collaborator

Can the different behaviours be merged into a single variable? Maybe something like:
learning_rates = {"initial": FLOAT, "default": FLOAT, "predefined": LIST}
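
For concreteness, with hypothetical values this proposal would look like:

learning_rates = {"initial": 0.0001, "default": 0.001, "predefined": [0.0005, 0.001]}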

@albertz
Member

albertz commented May 21, 2021

Currently learning_rates is either a list of LRs or a dict epoch -> LR (so a sparse list of LRs).
I'm not sure if extending/changing learning_rates to also cover the initial/default LR is in any way more helpful (or easier) for the user than just specifying it separately. I personally think learning_rate as a separate option is cleaner.
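
To illustrate the two accepted forms (hypothetical values):

learning_rates = [0.0001, 0.0005, 0.001]  # list: LRs for epochs 1, 2, 3
learning_rates = {1: 0.0001, 10: 0.001}  # dict: sparse, epoch -> LR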

@albertz
Member

albertz commented May 21, 2021

I think @JackTemaki is right insofar as learning_rates makes learning_rate redundant when it also specifies the LR of epoch 1. However, I don't see this as too much of a problem. And this also does not necessarily need to be the case.

Maybe there can be an extra check (but independent from new behavior) whether learning_rate has no effect and is different from max(learning_rates.values()) or so, and then just print a warning.
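
A minimal sketch of what such a check could look like (hypothetical, not actual RETURNN code; the dict-like config access is an assumption):

# Hypothetical sketch, not actual RETURNN code.
def warn_if_learning_rate_unused(config):
    lr = config.get("learning_rate")
    lrs = config.get("learning_rates")
    if lr is None or not lrs:
        return
    values = list(lrs.values()) if isinstance(lrs, dict) else list(lrs)
    if lr != max(values):
        print("WARNING: learning_rate (%s) differs from max(learning_rates) (%s) "
              "and might have no effect." % (lr, max(values)))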

I think I would still leave learning_rate as a config option.

@christophmluscher
Collaborator

But then we are still left with learning_rate and learning_rates, which is exactly what @JackTemaki wanted to change. Merging this into a single variable would not take away your options but would mainly increase readability, since the keywords add information. That was my reasoning.

@albertz
Member

albertz commented May 21, 2021

But then we are still left with learning_rate and learning_rates,

Yes, this is what I meant. I would prefer to keep both.

Merging this into a single variable would not take away your options but would mainly increase readability, since the keywords add information.

But what new information? It's exactly the same information, just expressed in a different way (which I think is worse than before).
And you still have the same ambiguity problem, e.g.:

learning_rates = {"initial": 1.0, "default": 0.5, 1: 0.3}

When does it use 'initial', 'default' or the one from epoch 1?

Actually, I would also need to read the code to understand exactly which is used in which case. Right now I'm not really sure whether learning_rate is ever used when learning_rates specifies epoch 1.


Another suggestion:

I would still keep both options, but only allow the user to set one of them. If learning_rates is used, it must define the LR of the first epoch.
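
A rough sketch of how such a restriction might be enforced (hypothetical, not actual RETURNN code):

# Hypothetical sketch of the proposed restriction, not actual RETURNN code.
def check_lr_options(config):
    lr_set = "learning_rate" in config
    lrs = config.get("learning_rates")
    if lr_set and lrs is not None:
        raise ValueError("Set either learning_rate or learning_rates, not both.")
    if lrs is not None:
        # Dict form must name epoch 1 explicitly; list form covers it with its first entry.
        defines_first = (1 in lrs) if isinstance(lrs, dict) else (len(lrs) >= 1)
        if not defines_first:
            raise ValueError("learning_rates must define the LR of the first epoch.")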

@JackTemaki
Collaborator Author

I would still keep both options, but only allow the user to set one of them. If learning_rates is used, it must define the LR of the first epoch.

This sounds like a good compromise. I was not even aware that learning_rates supports this kind of dict syntax; I guess the docs should be updated at some point as well.

@christophmluscher
Collaborator

christophmluscher commented May 21, 2021

Yes, this is what I meant. I would prefer to keep both.

OK.

When does it use 'initial', 'default' or the one from epoch 1?

Good point.

I would still keep both options, but only allow the user to set one of them. If learning_rates is used, it must define the LR of the first epoch.

I like this.
