If a model uses dropout layers, the training loss might be underestimated if it is calculated while the model is in train mode.
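For illustration, here is a minimal sketch of the effect with a toy model and random data (purely illustrative, not my actual setup): the same batch gives a different loss depending on whether dropout is active.

```python
import torch

torch.manual_seed(0)
model = torch.nn.Sequential(
    torch.nn.Linear(10, 64),
    torch.nn.ReLU(),
    torch.nn.Dropout(p=0.5),
    torch.nn.Linear(64, 1),
)
loss_fn = torch.nn.MSELoss()
x, y = torch.randn(32, 10), torch.randn(32, 1)

model.train()                        # dropout active: activations are randomly zeroed
train_mode_loss = loss_fn(model(x), y)

model.eval()                         # dropout disabled: deterministic forward pass
with torch.no_grad():
    eval_mode_loss = loss_fn(model(x), y)

# The two values differ on the very same batch, so the loss taken from the
# training forward pass is not the loss the model would report in eval mode.
print(train_mode_loss.item(), eval_mode_loss.item())
```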
In order to properly calculate the training loss, I do the following:
```python
import torch
from typing import Callable
import pytorch_lightning as pl


class LitModule(pl.LightningModule):  # class name and imports added for context; the original excerpt starts at __init__
    def __init__(
        self, model: torch.nn.Module,
        loss: Callable, metric: Callable,
        lr: float = 1e-3,
    ):
        super().__init__()
        self.model = model
        self.lr = lr
        self.loss = loss
        self.metric = metric
        # For epoch-level operations.
        self.train_step_preds = []
        self.train_step_targets = []
        self.val_step_preds = []
        self.val_step_targets = []
        self.test_step_preds = []
        self.test_step_targets = []
        self.save_hyperparameters(ignore=['model', 'loss', 'metric'])

    def forward(self, x):
        r"""Run forward pass (forward method) of ``model``."""
        return self.model(x)

    def training_step(self, batch, batch_idx):
        r"""Compute and return training loss on a single ``batch`` from the
        train set. Also make and store predictions on this single ``batch``.
        """
        assert self.training
        assert torch.is_grad_enabled()
        x, y = batch
        y_pred = self(x)
        loss = self.loss(input=y_pred, target=y)
        # Account for BatchNorm and Dropout: switch to eval mode for the
        # predictions that will be used for epoch-level metrics.
        self.eval()
        with torch.no_grad():
            preds = self(x)
        self.train()
        # Store for epoch-level operations.
        self.train_step_preds.append(preds.detach())
        self.train_step_targets.append(y)
        return loss

    def on_train_epoch_end(self):
        r"""Log ``metric`` calculated on the whole train set."""
        preds = torch.cat(self.train_step_preds)
        targets = torch.cat(self.train_step_targets)
        metric = self.metric(preds=preds, target=targets)
        self.log('train_metric', metric, prog_bar=True)
        self.logger.experiment.add_scalars(
            'learning_curve', {'train_acc': metric},
            global_step=self.global_step,
        )
        self.train_step_preds.clear()
        self.train_step_targets.clear()
```
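For completeness, a hypothetical usage sketch of the module above. The toy backbone, data, metric, and the added `configure_optimizers` hook are placeholders (the full module presumably defines its own optimizer, which the excerpt omits):

```python
import torch
import pytorch_lightning as pl
from pytorch_lightning.loggers import TensorBoardLogger
from torch.utils.data import DataLoader, TensorDataset


class LitWithOptim(LitModule):
    def configure_optimizers(self):
        # The excerpt above omits this hook; Adam is an arbitrary choice here.
        return torch.optim.Adam(self.model.parameters(), lr=self.lr)


def mse_metric(preds, target):
    # Matches the keyword signature used in on_train_epoch_end.
    return torch.mean((preds - target) ** 2)


backbone = torch.nn.Sequential(
    torch.nn.Linear(10, 32), torch.nn.ReLU(),
    torch.nn.Dropout(p=0.5), torch.nn.Linear(32, 1),
)
lit = LitWithOptim(model=backbone, loss=torch.nn.functional.mse_loss,
                   metric=mse_metric, lr=1e-3)

data = TensorDataset(torch.randn(256, 10), torch.randn(256, 1))
trainer = pl.Trainer(
    max_epochs=2,
    logger=TensorBoardLogger(save_dir='logs'),  # add_scalars needs a TensorBoard logger
)
trainer.fit(lit, train_dataloaders=DataLoader(data, batch_size=32))
```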
As you can see, I need two forward passes inside `training_step`, which may cause extra overhead. Is there any solution that correctly calculates the training loss with a single forward pass?