
Add NeMo-scope optimizer support and add the Novograd optimizer #793

Merged: 15 commits merged from candidate_optimizer_refactor into NVIDIA:candidate on Jul 2, 2020

Conversation

@titu1994 (Collaborator) commented on Jul 1, 2020

Salient points

Refactors optimizer setup so the optimizer can be changed from the command line. Run the codebase using the examples below.

  1. Default to Adam if no optimizer is provided via --optimizer.
  2. Create the base optimizer if no overriding --opt_args are passed.
  3. Unify optimizer support across all NeMo domains, with a standardized interface for adding optimizer, lr, and opt_args to argparse.

Usage

Use Adam and just override LR

python speech_to_text.py \
        --asr_model "bad_quartznet15x5.yaml" \
        --train_dataset "./an4/train_manifest.json" \
        --eval_dataset "./an4/test_manifest.json" \
        --gpus 4 \
        --distributed_backend "ddp" \
        --max_epochs 1 \
        --fast_dev_run \
        --lr 0.001 

Change optimizer and override LR

python speech_to_text.py \
        --asr_model "bad_quartznet15x5.yaml" \
        --train_dataset "./an4/train_manifest.json" \
        --eval_dataset "./an4/test_manifest.json" \
        --gpus 4 \
        --distributed_backend "ddp" \
        --max_epochs 1 \
        --fast_dev_run \
        --optimizer novograd \
        --lr 0.01 

Change optimizer, override LR, override optimizer args

python speech_to_text.py \
        --asr_model "bad_quartznet15x5.yaml" \
        --train_dataset "./an4/train_manifest.json" \
        --eval_dataset "./an4/test_manifest.json" \
        --gpus 4 \
        --distributed_backend "ddp" \
        --max_epochs 1 \
        --fast_dev_run \
        --optimizer novograd \
        --lr 0.01 \
        --opt_args betas=0.95,0.5 weight_decay=1e-3
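For reference, the key=value strings given to --opt_args can be turned into optimizer keyword arguments with a small helper along the lines of the sketch below. The helper name parse_opt_args and its exact behaviour are illustrative assumptions, not necessarily the implementation in this PR.

def parse_opt_args(opt_args):
    # Turn ['betas=0.95,0.5', 'weight_decay=1e-3'] into
    # {'betas': (0.95, 0.5), 'weight_decay': 0.001}.
    kwargs = {}
    for arg in opt_args or []:
        key, value = arg.split('=', 1)
        parts = [float(v) for v in value.split(',')]
        # Scalars stay scalars; comma-separated values become tuples (e.g. betas).
        kwargs[key] = parts[0] if len(parts) == 1 else tuple(parts)
    return kwargs

print(parse_opt_args(['betas=0.95,0.5', 'weight_decay=1e-3']))
# -> {'betas': (0.95, 0.5), 'weight_decay': 0.001}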

Usage - Overriding default args in the model itself

When calling add_optimizer_args(parser), arguments can be passed directly to override the argparser's default values, as shown below. With the same API, different domains can therefore use different optimizers with different arguments.

Base initialization

parser = add_optimizer_args(parser)  # Use adam and empty opt_args list by default

Override optimizer

parser = add_optimizer_args(parser, optimizer='novograd')  # Use novograd and empty opt_args list by default

Override optimizer and args

novograd_args = {'betas': (0.95, 0.5), 'weight_decay': 0.001}
parser = add_optimizer_args(parser, optimizer='novograd', default_opt_args=novograd_args)  # Use novograd and custom defaults
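For illustration only, a stand-in for add_optimizer_args could register the three flags used throughout this PR roughly as sketched below. The actual helper lives in the optimizer module touched by this PR and may differ in its defaults and validation.

from argparse import ArgumentParser

def add_optimizer_args(parser, optimizer='adam', default_lr=None, default_opt_args=None):
    # Illustrative stand-in, not the PR's real implementation.
    parser.add_argument('--optimizer', type=str, default=optimizer,
                        help='Name of the optimizer, e.g. adam or novograd')
    parser.add_argument('--lr', type=float, default=default_lr, help='Learning rate')
    # On the command line, --opt_args takes key=value strings; a programmatic
    # default (such as the novograd_args dict above) can also be supplied.
    parser.add_argument('--opt_args', nargs='*', default=default_opt_args,
                        help='Optimizer arguments, e.g. betas=0.95,0.5 weight_decay=1e-3')
    return parser

parser = add_optimizer_args(ArgumentParser(), optimizer='novograd',
                            default_opt_args={'betas': (0.95, 0.5), 'weight_decay': 0.001})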

Signed-off-by: smajumdar [email protected]

@titu1994 marked this pull request as draft on July 1, 2020 02:03
@titu1994 requested a review from okuchaiev on July 1, 2020 02:03
@titu1994 force-pushed the candidate_optimizer_refactor branch from 1d1d931 to 36f755a on July 1, 2020 05:20
@titu1994 force-pushed the candidate_optimizer_refactor branch from 1d10104 to 122fbde on July 1, 2020 16:38
@okuchaiev (Member) left a comment:

Just a few minor comments.

Resolved review threads (collapsed):
examples/asr/speech_to_text.py (outdated)
examples/asr/speech_to_text.py
nemo/collections/asr/models/ctc_models.py (outdated)
nemo/core/classes/optimizers.py (5 outdated threads)
@okuchaiev requested review from blisc and VahidooX on July 1, 2020 22:27
Resolved review thread (collapsed): nemo/collections/asr/models/ctc_models.py (outdated)
Comment on lines 52 to +54
asr_model.setup_training_data(model_config['AudioToTextDataLayer'])
asr_model.setup_validation_data(model_config['AudioToTextDataLayer_eval'])
-asr_model.setup_optimization(optim_params={'lr': 0.0003})
+asr_model.setup_optimization(optim_params={'optimizer': args.optimizer, 'lr': args.lr, 'opt_args': args.opt_args})
@blisc (Collaborator):

I suppose this is out of scope for this PR, but these three lines look wholly out of line with PyTorch Lightning code. Just do all of this in __init__(). I fail to see the reason we need to do this separately.

@okuchaiev (Member):

@blisc, are you proposing that models' __init__() take (1) model hyperparameters, (2) optimizer hyperparameters, and (3) train/test/eval data parameters, instead of having setup_* functions?

Contributor:

@blisc I came to exactly the same conclusions yesterday - thus my email.

I think the solution is to properly parametrize NeMo Models.

@titu1994 (Collaborator, Author):

We no longer need to manually extract the kwargs from the parsed args; vars(args) is concise and serves the same purpose.
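A quick, assumed illustration of the point: vars() converts the parsed Namespace into a plain dict, so forwarding vars(args) carries the same keys one would otherwise assemble by hand from args.optimizer, args.lr, and args.opt_args.

from argparse import ArgumentParser

parser = ArgumentParser()
parser.add_argument('--optimizer', default='adam')
parser.add_argument('--lr', type=float, default=0.001)
parser.add_argument('--opt_args', nargs='*', default=None)

args = parser.parse_args(['--optimizer', 'novograd', '--lr', '0.01'])
print(vars(args))  # {'optimizer': 'novograd', 'lr': 0.01, 'opt_args': None}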

Comment on lines 100 to 101
optimizer = get_optimizer(optimizer_name)
self.__optimizer = optimizer(self.parameters(), lr=lr, **optimizer_args)
Collaborator:

Why not merge these two lines into one?

@titu1994 (Collaborator, Author):

We could do that; I just thought it's better to keep them separate in case optimizer_name is not valid and get_optimizer raises an error. The traceback would point to a pretty dense line otherwise. But sure, we can merge them too.
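For context, a registry-style get_optimizer along these lines would fail fast with a readable error before any instantiation happens. This is a sketch under that assumption, with a trimmed registry; it is not claimed to be the PR's exact code.

import torch

# Trimmed, hypothetical registry; the PR's version also covers Novograd.
AVAILABLE_OPTIMIZERS = {
    'adam': torch.optim.Adam,
    'sgd': torch.optim.SGD,
}

def get_optimizer(name):
    # Return the optimizer *class* (not an instance) registered under `name`.
    if name not in AVAILABLE_OPTIMIZERS:
        raise ValueError(
            f"Optimizer '{name}' is not available. Choose one of: {list(AVAILABLE_OPTIMIZERS)}"
        )
    return AVAILABLE_OPTIMIZERS[name]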

Collaborator:

This wasn't what I had in mind, actually. I was thinking more of return get_optimizer(optimizer_name, self.parameters(), lr=lr, **optimizer_args), i.e. I would expect get_optimizer to instantiate the optimizer for me.

If you want to keep your original design, I would actually prefer the old:

optimizer = get_optimizer(optimizer_name)
self.__optimizer = optimizer(self.parameters(), lr=lr, **optimizer_args)

rather than the changed:

optimizer = get_optimizer(optimizer_name)(self.parameters(), lr=lr, **optimizer_args)

@titu1994 (Collaborator, Author):

Oh, I misunderstood. Yes, I'll revert to follow the older design. As for merging the two lines, I would prefer not to, for two reasons: 1) we may want the uninstantiated class so we can wrap it in another class (say we have an experimental optimizer), and 2) we may want to pass the class as an argument without instantiating it, to perform deferred computation or typechecks in tests.
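A minimal sketch of those two reasons, assuming a get_optimizer that returns the class; the wrapper here is purely hypothetical.

import torch

def wrap_experimental(optimizer_cls):
    # Reason 1: wrap the *class* before instantiation, e.g. to layer an
    # experimental optimizer on top of a registered one.
    class Wrapped(optimizer_cls):
        pass  # experimental behaviour would go here
    return Wrapped

# Reason 2: pass the class around uninstantiated, for deferred construction
# or simple typechecks in tests.
optimizer_cls = torch.optim.Adam  # stands in for get_optimizer('adam')
assert issubclass(optimizer_cls, torch.optim.Optimizer)

params = [torch.nn.Parameter(torch.zeros(1))]
optimizer = wrap_experimental(optimizer_cls)(params, lr=0.01)  # instantiate only at the end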

Resolved review threads (collapsed):
nemo/collections/asr/models/ctc_models.py (outdated)
nemo/core/optim/optimizers.py (3 threads, 2 outdated)
@blisc (Collaborator) commented on Jul 1, 2020

Seems mostly fine to me, just some minor comments

@@ -49,7 +51,7 @@ def main(args):
model_config['AudioToTextDataLayer_eval']['manifest_filepath'] = args.eval_dataset
asr_model.setup_training_data(model_config['AudioToTextDataLayer'])
asr_model.setup_validation_data(model_config['AudioToTextDataLayer_eval'])
-asr_model.setup_optimization(optim_params={'lr': 0.0003})
+asr_model.setup_optimization(optim_params=vars(args))
Collaborator:

Are we passing all the args here? If so, I don't think it looks nice to pass all of them. Can we pass three variables instead: lr, optimizer_kind, and opt_params? For example: asr_model.setup_optimization(lr=args.lr, optimizer_kind=args.optimizer_kind, optim_params=args.opt_params)?

That would make it easier for the user to understand. Written this way, it looks like a magic function that takes all the args and returns the optimizer.

@titu1994 (Collaborator, Author):

Yes, that is what it was before, and we can go back to that. I'm a bit worried about how many args are going to be passed for the optimizer + scheduler, but it might be better to be explicit.

@titu1994 (Collaborator, Author):

Reverted it to pass args explicitly. I agree, this looks cleaner and more understandable.

Collaborator:

That's a good point about the scheduler. How about having five inputs: (1) optimizer_kind, (2) lr, (3) opt_params, (4) lr_policy, (5) lr_policy_params?
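If that suggestion is adopted, the resulting method might look roughly like the sketch below. This only illustrates the five-argument proposal; the class name and body are placeholders, not code from this PR.

class ExampleModel:  # hypothetical stand-in for a NeMo model class
    def setup_optimization(self, optimizer_kind='adam', lr=0.001,
                           opt_params=None, lr_policy=None, lr_policy_params=None):
        # Store the five explicit inputs; a real implementation would build the
        # optimizer and (optionally) the LR schedule from them.
        self._optim_config = {
            'optimizer_kind': optimizer_kind,
            'lr': lr,
            'opt_params': opt_params or {},
            'lr_policy': lr_policy,
            'lr_policy_params': lr_policy_params or {},
        }
        return self._optim_config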

@titu1994 (Collaborator, Author):

That's a good idea. We should be able to do that cleanly. For now, though, I am hard-coding the scheduler until the team finds this approach to optimizers good enough to extend to schedulers as well.

Resolved review thread (collapsed): nemo/collections/asr/models/ctc_models.py
@titu1994 marked this pull request as ready for review on July 2, 2020 17:12
@okuchaiev merged commit fe14046 into NVIDIA:candidate on Jul 2, 2020
@titu1994 deleted the candidate_optimizer_refactor branch on July 2, 2020 18:31
@blisc mentioned this pull request on Jul 6, 2020
dcurran90 pushed a commit to dcurran90/NeMo that referenced this pull request Oct 15, 2024
* In addition to the HTML report which is generated
* The term listing will be seen in CI job output

Signed-off-by: Mark Sturdevant <[email protected]>