Currently, mixed precision training is implemented in NNTrainer, but gradient clipping that takes the loss scale into account has not been implemented yet.
PyTorch's example implements it as follows, and the same needs to be implemented in NNTrainer.
```python
import torch
from torch.amp import GradScaler, autocast

scaler = GradScaler()

for epoch in epochs:
    for input, target in data:
        optimizer.zero_grad()
        with autocast(device_type='cuda', dtype=torch.float16):
            output = model(input)
            loss = loss_fn(output, target)
        scaler.scale(loss).backward()

        # Unscales the gradients of optimizer's assigned params in-place
        scaler.unscale_(optimizer)

        # Since the gradients of optimizer's assigned params are unscaled, clips as usual:
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm)

        # optimizer's gradients are already unscaled, so scaler.step does not unscale them,
        # although it still skips optimizer.step() if the gradients contain infs or NaNs.
        scaler.step(optimizer)

        # Updates the scale for next iteration.
        scaler.update()
```
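For NNTrainer, the essential point is the ordering: gradients produced by backprop on the scaled loss must first be divided by the loss scale, then clipped by global norm, and the weight update must be skipped when inf/NaN is found. Below is a minimal, hypothetical C++ sketch of that order of operations; the names (`unscale_gradients`, `clip_grad_norm`) and the flat `std::vector<float>` gradient buffers are illustrative only and are not NNTrainer's actual API.

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

// Hypothetical sketch (not NNTrainer's actual API). It mirrors the PyTorch
// recipe above:
//   1) unscale the gradients by the current loss scale,
//   2) clip them by global L2 norm,
//   3) skip the optimizer step if any gradient is inf/NaN.

// Divides every gradient element by loss_scale. Returns false if a non-finite
// value is found, signalling that this iteration's update should be skipped
// and the loss scale reduced (the role of scaler.step()/scaler.update()).
bool unscale_gradients(std::vector<std::vector<float>> &grads, float loss_scale) {
  bool finite = true;
  for (auto &g : grads)
    for (auto &v : g) {
      v /= loss_scale;
      if (!std::isfinite(v))
        finite = false;
    }
  return finite;
}

// Standard global-norm clipping, applied to the already-unscaled gradients.
void clip_grad_norm(std::vector<std::vector<float>> &grads, float max_norm) {
  double sq_sum = 0.0;
  for (const auto &g : grads)
    for (float v : g)
      sq_sum += static_cast<double>(v) * v;
  const double total_norm = std::sqrt(sq_sum);
  if (total_norm > max_norm) {
    const float coef = static_cast<float>(max_norm / (total_norm + 1e-6));
    for (auto &g : grads)
      for (auto &v : g)
        v *= coef;
  }
}

int main() {
  // Two parameter tensors whose gradients were computed on the scaled loss.
  std::vector<std::vector<float>> grads = {{1024.0f, -2048.0f}, {512.0f}};
  const float loss_scale = 1024.0f;
  const float max_norm = 1.0f;

  if (unscale_gradients(grads, loss_scale)) {
    clip_grad_norm(grads, max_norm);
    std::printf("apply optimizer step (grad[0][0] = %f)\n", grads[0][0]);
  } else {
    std::printf("inf/NaN found: skip step and lower the loss scale\n");
  }
  return 0;
}
```

In NNTrainer's training loop this logic would sit between the backward pass and the optimizer's apply step, playing the combined role of `scaler.unscale_()`, `clip_grad_norm_()`, and the inf/NaN check performed inside `scaler.step()`.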