Notebook 03: What is the dimension that I should pass to softmax? #465
In Chapter 3, he uses dim=0 and dim=1 for softmax (ref here and here). In both places the input is a 4-dimensional tensor, but one has a batch size of 32 and the other has a batch size of 1. I don't know how the batch size can impact that. Can someone please explain? Thanks in advance.
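For reference, here's a rough sketch of the two cases I mean (the shapes are just my guess at what the notebook produces, with 10 classes for illustration):

import torch

# Case A: batch size 32 -> the notebook calls torch.softmax(..., dim=1)
batch_images = torch.randn(32, 1, 28, 28)  # 4-dimensional input [batch, channels, height, width]
batch_logits = torch.randn(32, 10)         # model output for the whole batch
batch_probs = torch.softmax(batch_logits, dim=1)

# Case B: batch size 1 -> the notebook calls torch.softmax(..., dim=0)
single_image = torch.randn(1, 1, 28, 28)
single_logits = torch.randn(10)            # model output for one sample (batch dimension squeezed away)
single_probs = torch.softmax(single_logits, dim=0)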
Replies: 1 comment
Hi @vence-andersen,

You'll want to perform the softmax operation on the logits dimension.

For example, with batch_size=1, you can perform it on dim=0 (there is only one value). But with batch_size=32, you'll want to perform it across dim=1 (assuming your tensor shape is [batch_size, logits]).

Changing the softmax code in Notebook 03 in the make_predictions() function from dim=0 to dim=1 will error:

def make_predictions(model: torch.nn.Module, data: list, device: torch.device = device):
    pred_probs = []
    model.eval()
    with torch.inference_mode():
        for sample in data:
            # Prepare sample
            sample = torch.unsqueeze(sample, dim=0).to(device) # Add an extra dimension and send sample to device

            # Forward pass (model outputs raw logit)
            pred_logit = model(sample)

            # Get prediction probability (logit -> prediction probability)
            pred_prob = torch.softmax(pred_logit.squeeze(), dim=1) #### Changed to dim=1, will error ####

            # Get pred_prob off GPU for further calculations
            pred_probs.append(pred_prob.cpu())

    # Stack the pred_probs to turn list into a tensor
    return torch.stack(pred_probs)
import random
random.seed(42)

test_samples = []
test_labels = []
for sample, label in random.sample(list(test_data), k=9):
    test_samples.append(sample)
    test_labels.append(label)

# View the first test sample shape and label
print(f"Test sample image shape: {test_samples[0].shape}\nTest sample label: {test_labels[0]} ({class_names[test_labels[0]]})")

# Make predictions on test samples with model 2
pred_probs = make_predictions(model=model_2,
                              data=test_samples)

# View first two prediction probabilities list
print(pred_probs[:2])

Output:
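The output is the dimension error, because after pred_logit.squeeze() a single sample's logits are a 1-D tensor of shape [num_classes], so dim=1 doesn't exist. A minimal sketch of the shape logic (10 classes assumed purely for illustration):

import torch

single_logits = torch.randn(1, 10).squeeze()  # shape [10] after squeezing away the batch dimension
batch_logits = torch.randn(32, 10)            # shape [32, 10]

print(torch.softmax(single_logits, dim=0).sum())    # works: the 10 class probabilities sum to 1
print(torch.softmax(batch_logits, dim=1)[0].sum())  # works: each row of 10 probabilities sums to 1
# torch.softmax(single_logits, dim=1)               # errors: dim 1 doesn't exist on a 1-D tensor

# dim=-1 (the last dimension) targets the logits dimension in both cases
print(torch.softmax(single_logits, dim=-1).sum())
print(torch.softmax(batch_logits, dim=-1)[0].sum())

dim=-1 isn't what the notebook uses, but it's a handy way to always hit the class/logits dimension whether or not a batch dimension is present.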
I'd recommend giving it a try and seeing for yourself, notebook link: https://www.learnpytorch.io/03_pytorch_computer_vision/

I've added a note in the notebook to reflect your question too.