Why is in 6.3 Training ... no "model_3.train()" #1084

OneManShow0815 · 2024-09-09T18:18:59Z

Why is there no "model_3.train()" in the for-loop in 6.3 Training ...?
It is there in 3.2 Building a...
All of the loops call model_3.eval() under "### Testing" but never switch back to model_3.train() mode.
Why is that?

Regards
Sven

mrdbourke (Maintainer) · 2024-09-12T04:25:52Z

Hi @OneManShow0815,

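A model is in train() mode by default when it is created.

However, if you are calling training/test steps individually, you should explicitly call model.train() and model.eval() at the start of each step, because the mode set by one call stays in effect until the other is called.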
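To make the mode behaviour concrete, here is a minimal sketch. The toy model and the variable names are assumptions made up for illustration; they are not taken from the course notebooks.

```python
import torch
from torch import nn

# Hypothetical toy model with a Dropout layer, which behaves differently per mode
model = nn.Sequential(nn.Linear(4, 8), nn.Dropout(p=0.5), nn.Linear(8, 2))

# A freshly created nn.Module starts in training mode
default_mode = model.training

# eval() flips the flag on the model and on every submodule
model.eval()
after_eval = model.training
dropout_after_eval = model[1].training

# In eval mode dropout is disabled, so repeated forward passes give the same output
x = torch.ones(1, 4)
with torch.inference_mode():
    same_output = torch.equal(model(x), model(x))

# The model stays in eval mode until train() is called again
model.train()
after_train = model.training

print(default_mode, after_eval, dropout_after_eval, same_output, after_train)
```

So if a loop calls model.eval() for testing and then trains for another epoch without calling model.train(), layers such as dropout stay in their eval behaviour during that training.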

If this is a mistake in the course, that's my bad. I'm happy to fix it if you have a link/reference.

For example:

Train step

Calls model.train() at start.

from typing import Tuple

import torch

def train_step(model: torch.nn.Module,
               dataloader: torch.utils.data.DataLoader, 
               loss_fn: torch.nn.Module, 
               optimizer: torch.optim.Optimizer,
               device: torch.device) -> Tuple[float, float]:
  """Trains a PyTorch model for a single epoch.

  Turns a target PyTorch model to training mode and then
  runs through all of the required training steps (forward
  pass, loss calculation, optimizer step).

  Args:
    model: A PyTorch model to be trained.
    dataloader: A DataLoader instance for the model to be trained on.
    loss_fn: A PyTorch loss function to minimize.
    optimizer: A PyTorch optimizer to help minimize the loss function.
    device: A target device to compute on (e.g. "cuda" or "cpu").

  Returns:
    A tuple of training loss and training accuracy metrics.
    In the form (train_loss, train_accuracy). For example:

    (0.1112, 0.8743)
  """
  # Put model in train mode
  model.train()

  # Setup train loss and train accuracy values
  train_loss, train_acc = 0, 0

  # Loop through data loader data batches
  for batch, (X, y) in enumerate(dataloader):
      # Send data to target device
      X, y = X.to(device), y.to(device)

      # 1. Forward pass
      y_pred = model(X)

      # 2. Calculate and accumulate loss
      loss = loss_fn(y_pred, y)
      train_loss += loss.item() 

      # 3. Optimizer zero grad
      optimizer.zero_grad()

      # 4. Loss backward
      loss.backward()

      # 5. Optimizer step
      optimizer.step()

      # Calculate and accumulate accuracy metric across all batches
      y_pred_class = torch.argmax(torch.softmax(y_pred, dim=1), dim=1)
      train_acc += (y_pred_class == y).sum().item()/len(y_pred)

  # Adjust metrics to get average loss and accuracy per batch 
  train_loss = train_loss / len(dataloader)
  train_acc = train_acc / len(dataloader)
  return train_loss, train_acc

Test step

Calls model.eval() at start.

def test_step(model: torch.nn.Module, 
              dataloader: torch.utils.data.DataLoader, 
              loss_fn: torch.nn.Module,
              device: torch.device) -> Tuple[float, float]:
  """Tests a PyTorch model for a single epoch.

  Turns a target PyTorch model to "eval" mode and then performs
  a forward pass on a testing dataset.

  Args:
    model: A PyTorch model to be tested.
    dataloader: A DataLoader instance for the model to be tested on.
    loss_fn: A PyTorch loss function to calculate loss on the test data.
    device: A target device to compute on (e.g. "cuda" or "cpu").

  Returns:
    A tuple of testing loss and testing accuracy metrics.
    In the form (test_loss, test_accuracy). For example:

    (0.0223, 0.8985)
  """
  # Put model in eval mode
  model.eval() 

  # Setup test loss and test accuracy values
  test_loss, test_acc = 0, 0

  # Turn on inference context manager
  with torch.inference_mode():
      # Loop through DataLoader batches
      for batch, (X, y) in enumerate(dataloader):
          # Send data to target device
          X, y = X.to(device), y.to(device)

          # 1. Forward pass
          test_pred_logits = model(X)

          # 2. Calculate and accumulate loss
          loss = loss_fn(test_pred_logits, y)
          test_loss += loss.item()

          # Calculate and accumulate accuracy
          test_pred_labels = test_pred_logits.argmax(dim=1)
          test_acc += ((test_pred_labels == y).sum().item()/len(test_pred_labels))

  # Adjust metrics to get average loss and accuracy per batch 
  test_loss = test_loss / len(dataloader)
  test_acc = test_acc / len(dataloader)
  return test_loss, test_acc
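
When the training and testing code live in a single for-loop instead of separate functions, the same idea applies: call model.train() at the top of every epoch and model.eval() before testing. The following is a rough sketch of that pattern, not the notebook's actual code. The synthetic data, the toy model, and the `modes_seen` list are assumptions used only for illustration.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Hypothetical synthetic data: 64 samples, 4 features, 2 classes
X = torch.randn(64, 4)
y = (X.sum(dim=1) > 0).long()
train_loader = DataLoader(TensorDataset(X[:48], y[:48]), batch_size=16)
test_loader = DataLoader(TensorDataset(X[48:], y[48:]), batch_size=16)

# Hypothetical toy model (the Dropout layer is what makes the mode matter)
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Dropout(p=0.2), nn.Linear(8, 2))
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

modes_seen = []  # records model.training at the start of each phase

for epoch in range(3):
    ### Training
    model.train()  # switch back to train mode at the start of every epoch
    modes_seen.append(model.training)
    for X_batch, y_batch in train_loader:
        y_logits = model(X_batch)
        loss = loss_fn(y_logits, y_batch)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    ### Testing
    model.eval()  # disable dropout for evaluation
    modes_seen.append(model.training)
    test_loss = 0.0
    with torch.inference_mode():
        for X_batch, y_batch in test_loader:
            test_loss += loss_fn(model(X_batch), y_batch).item()
    test_loss /= len(test_loader)
    print(f"Epoch {epoch} | test loss: {test_loss:.4f}")
```

Without the model.train() line, only the first epoch would train in train mode; every later epoch would inherit eval mode from the preceding testing phase.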

Cheers,

Daniel

coopercox315 Oct 2, 2024

Hey Daniel,
I noticed the same thing in the '03_pytorch_computer_vision.ipynb' file I downloaded: the train step function was missing from the code. I have added it to my local version, but it could be worth updating the file for others on here!