
Models will always be initialized without dropout layers in self-tuning ruleset #753

Open
georgedahl opened this issue Apr 4, 2024 · 1 comment


@georgedahl
Contributor

In submission_runner.py, under the self-tuning ruleset the hyperparameters argument to train_once is always None.

Then, in this code snippet:

    dropout_rate = None
    aux_dropout_rate = None
    if hasattr(hyperparameters, 'dropout_rate'):
      dropout_rate = hyperparameters.dropout_rate
    if hasattr(hyperparameters, 'aux_dropout_rate'):
      aux_dropout_rate = hyperparameters.aux_dropout_rate
    model_params, model_state = workload.init_model_fn(
        model_init_rng, dropout_rate, aux_dropout_rate)

workload.init_model_fn will always get None for dropout_rate and aux_dropout_rate, so Dropout layers won't ever be added to the model.
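The failure mode above can be reproduced in isolation; this is a minimal stand-alone sketch of the snippet's logic with hyperparameters set to None, as it is under the self-tuning ruleset:

```python
# Self-tuning ruleset: train_once receives hyperparameters=None.
hyperparameters = None

dropout_rate = None
aux_dropout_rate = None
# hasattr(None, ...) is False for any attribute name, so neither
# branch below ever executes.
if hasattr(hyperparameters, 'dropout_rate'):
  dropout_rate = hyperparameters.dropout_rate
if hasattr(hyperparameters, 'aux_dropout_rate'):
  aux_dropout_rate = hyperparameters.aux_dropout_rate

# Both rates remain None, so init_model_fn is effectively called as
# workload.init_model_fn(model_init_rng, None, None).
```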

Although submissions could call workload.init_model_fn again themselves to exploit its side effect of setting workload._model, this is awkward, and it is also problematic for workloads near the memory limit because it superfluously reconstructs model_params on device.
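To make the cost of that workaround concrete, here is a toy sketch (the Workload class and its internals are invented for illustration; only the init_model_fn name mirrors the codebase):

```python
class Workload:
  """Toy stand-in for an AlgoPerf workload; not the real class."""

  def __init__(self):
    self._model = None

  def init_model_fn(self, rng, dropout_rate=None, aux_dropout_rate=None):
    # Side effect the issue mentions: initializing the model also sets
    # workload._model. In the real code this allocates params on device.
    self._model = {'dropout_rate': dropout_rate,
                   'aux_dropout_rate': aux_dropout_rate}
    model_params = dict(self._model)  # stands in for on-device params
    model_state = {}
    return model_params, model_state

workload = Workload()
# Runner path (self-tuning): dropout rates are always None.
params, _ = workload.init_model_fn(rng=0)
# Submission workaround: re-initialize just to get dropout layers,
# duplicating the (potentially large) model_params construction.
params, _ = workload.init_model_fn(rng=0, dropout_rate=0.1,
                                   aux_dropout_rate=0.1)
```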

@priyakasimbeg
Contributor

Our current API has two dropout-related limitations:

1. In the external tuning ruleset we read the dropout value from the hparam config and pass it to the model initialization functions, but in the self-tuning ruleset there is no convenient way to specify the dropout value at model initialization.
2. There is no way to change the dropout value during training.

Adding a workload function that submitters can call to change the dropout value would remove both of these limitations.
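One possible shape for such a function, sketched on a toy class (the update_dropout name and signature are hypothetical, not part of the current codebase):

```python
class Workload:
  """Toy stand-in for an AlgoPerf workload; not the real class."""

  def __init__(self, dropout_rate=None, aux_dropout_rate=None):
    self.dropout_rate = dropout_rate
    self.aux_dropout_rate = aux_dropout_rate

  def update_dropout(self, dropout_rate=None, aux_dropout_rate=None):
    # Mutates the dropout configuration without rebuilding model_params,
    # so it is callable both at setup and mid-training.
    if dropout_rate is not None:
      self.dropout_rate = dropout_rate
    if aux_dropout_rate is not None:
      self.aux_dropout_rate = aux_dropout_rate

workload = Workload()  # self-tuning: initialized without dropout
workload.update_dropout(dropout_rate=0.1, aux_dropout_rate=0.1)
workload.update_dropout(dropout_rate=0.2)  # changed during training
```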
