If a model uses dropout layers, the training loss might be underestimated if it is calculated while the model is in train mode.
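For illustration, here is a minimal sketch of the effect with a toy model and random data (purely illustrative, not my actual setup): the same batch gives a different loss depending on whether dropout is active.

```python
import torch

torch.manual_seed(0)
model = torch.nn.Sequential(
    torch.nn.Linear(10, 64),
    torch.nn.ReLU(),
    torch.nn.Dropout(p=0.5),
    torch.nn.Linear(64, 1),
)
loss_fn = torch.nn.MSELoss()
x, y = torch.randn(32, 10), torch.randn(32, 1)

model.train()                        # dropout active: activations are randomly zeroed
train_mode_loss = loss_fn(model(x), y)

model.eval()                         # dropout disabled: deterministic forward pass
with torch.no_grad():
    eval_mode_loss = loss_fn(model(x), y)

# The two values differ on the very same batch, so the loss taken from the
# training forward pass is not the loss the model would report in eval mode.
print(train_mode_loss.item(), eval_mode_loss.item())
```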
In order to properly calculate the training loss, I do the following:
```python
import torch
from typing import Callable
import pytorch_lightning as pl


class LitModule(pl.LightningModule):  # class name and imports added for context; the original excerpt starts at __init__
    def __init__(
        self, model: torch.nn.Module,
        loss: Callable, metric: Callable,
        lr: float = 1e-3,
    ):
        super().__init__()
        self.model = model
        self.lr = lr
        self.loss = loss
        self.metric = metric
        # For epoch-level operations.
        self.train_step_preds = []
        self.train_step_targets = []
        self.val_step_preds = []
        self.val_step_targets = []
        self.test_step_preds = []
        self.test_step_targets = []
        self.save_hyperparameters(ignore=['model', 'loss', 'metric'])

    def forward(self, x):
        r"""Run forward pass (forward method) of ``model``."""
        return self.model(x)

    def training_step(self, batch, batch_idx):
        r"""Compute and return training loss on a single ``batch`` from the
        train set. Also make and store predictions on this single ``batch``.
        """
        assert self.training
        assert torch.is_grad_enabled()
        x, y = batch
        y_pred = self(x)
        loss = self.loss(input=y_pred, target=y)
        # Account for BatchNorm and Dropout: switch to eval mode for the
        # predictions that will be used for epoch-level metrics.
        self.eval()
        with torch.no_grad():
            preds = self(x)
        self.train()
        # Store for epoch-level operations.
        self.train_step_preds.append(preds.detach())
        self.train_step_targets.append(y)
        return loss

    def on_train_epoch_end(self):
        r"""Log ``metric`` calculated on the whole train set."""
        preds = torch.cat(self.train_step_preds)
        targets = torch.cat(self.train_step_targets)
        metric = self.metric(preds=preds, target=targets)
        self.log('train_metric', metric, prog_bar=True)
        self.logger.experiment.add_scalars(
            'learning_curve', {'train_acc': metric},
            global_step=self.global_step,
        )
        self.train_step_preds.clear()
        self.train_step_targets.clear()
```
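For completeness, a hypothetical usage sketch of the module above. The toy backbone, data, metric, and the added `configure_optimizers` hook are placeholders (the full module presumably defines its own optimizer, which the excerpt omits):

```python
import torch
import pytorch_lightning as pl
from pytorch_lightning.loggers import TensorBoardLogger
from torch.utils.data import DataLoader, TensorDataset


class LitWithOptim(LitModule):
    def configure_optimizers(self):
        # The excerpt above omits this hook; Adam is an arbitrary choice here.
        return torch.optim.Adam(self.model.parameters(), lr=self.lr)


def mse_metric(preds, target):
    # Matches the keyword signature used in on_train_epoch_end.
    return torch.mean((preds - target) ** 2)


backbone = torch.nn.Sequential(
    torch.nn.Linear(10, 32), torch.nn.ReLU(),
    torch.nn.Dropout(p=0.5), torch.nn.Linear(32, 1),
)
lit = LitWithOptim(model=backbone, loss=torch.nn.functional.mse_loss,
                   metric=mse_metric, lr=1e-3)

data = TensorDataset(torch.randn(256, 10), torch.randn(256, 1))
trainer = pl.Trainer(
    max_epochs=2,
    logger=TensorBoardLogger(save_dir='logs'),  # add_scalars needs a TensorBoard logger
)
trainer.fit(lit, train_dataloaders=DataLoader(data, batch_size=32))
```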
As you can see, I need two forward passes inside `training_step`, which may cause extra overhead. Is there any solution that correctly calculates the training loss with a single forward pass?