Notebook 03: What is the dimension that I should pass to softmax? #465
In Chapter 3, he uses dim=0 and dim=1 for softmax (ref here and here). In both places the input is a 4-dimensional tensor, but one has a batch size of 32 and the other has a batch size of 1. I don't know how the batch size can impact that. Can someone please explain? Thanks in advance.
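For reference, here's a rough sketch of the two cases I mean (the shapes are just my guess at what the notebook produces, with 10 classes for illustration):

import torch

# Case A: batch size 32 -> the notebook calls torch.softmax(..., dim=1)
batch_images = torch.randn(32, 1, 28, 28)  # 4-dimensional input [batch, channels, height, width]
batch_logits = torch.randn(32, 10)         # model output for the whole batch
batch_probs = torch.softmax(batch_logits, dim=1)

# Case B: batch size 1 -> the notebook calls torch.softmax(..., dim=0)
single_image = torch.randn(1, 1, 28, 28)
single_logits = torch.randn(10)            # model output for one sample (batch dimension squeezed away)
single_probs = torch.softmax(single_logits, dim=0)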
Replies: 1 comment
Hi @vence-andersen,

You'll want to perform the softmax operation on the logits dimension.

For example, with batch_size=1, you can perform it on dim=0 (there is only one value). But with batch_size=32, you'll want to perform it across dim=1 (assuming your tensor shape is [batch_size, logits]).

Changing the softmax code in Notebook 03 in the make_predictions() function from dim=0 to dim=1 will error:

def make_predictions(model: torch.nn.Module, data: list, device: torch.device = device):
    pred_probs = []
    model.eval()
    with torch.inference_mode():
        for sample in data:
            # Prepare sample
            sample = torch.unsqueeze(sample, dim=0).to(device) # Add an extra dimension and send sample to device

            # Forward pass (model outputs raw logit)
            pred_logit = model(sample)

            # Get prediction probability (logit -> prediction probability)
            pred_prob = torch.softmax(pred_logit.squeeze(), dim=1) #### Changed to dim=1, will error ####

            # Get pred_prob off GPU for further calculations
            pred_probs.append(pred_prob.cpu())

    # Stack the pred_probs to turn list into a tensor
    return torch.stack(pred_probs)
import random
random.seed(42)

test_samples = []
test_labels = []
for sample, label in random.sample(list(test_data), k=9):
    test_samples.append(sample)
    test_labels.append(label)

# View the first test sample shape and label
print(f"Test sample image shape: {test_samples[0].shape}\nTest sample label: {test_labels[0]} ({class_names[test_labels[0]]})")

# Make predictions on test samples with model 2
pred_probs = make_predictions(model=model_2,
                              data=test_samples)

# View first two prediction probabilities list
print(pred_probs[:2])

Output:
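The output is the dimension error, because after pred_logit.squeeze() a single sample's logits are a 1-D tensor of shape [num_classes], so dim=1 doesn't exist. A minimal sketch of the shape logic (10 classes assumed purely for illustration):

import torch

single_logits = torch.randn(1, 10).squeeze()  # shape [10] after squeezing away the batch dimension
batch_logits = torch.randn(32, 10)            # shape [32, 10]

print(torch.softmax(single_logits, dim=0).sum())    # works: the 10 class probabilities sum to 1
print(torch.softmax(batch_logits, dim=1)[0].sum())  # works: each row of 10 probabilities sums to 1
# torch.softmax(single_logits, dim=1)               # errors: dim 1 doesn't exist on a 1-D tensor

# dim=-1 (the last dimension) targets the logits dimension in both cases
print(torch.softmax(single_logits, dim=-1).sum())
print(torch.softmax(batch_logits, dim=-1)[0].sum())

dim=-1 isn't what the notebook uses, but it's a handy way to always hit the class/logits dimension whether or not a batch dimension is present.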
I'd recommend giving it a try and seeing for yourself, notebook link: https://www.learnpytorch.io/03_pytorch_computer_vision/

I've added a note in the notebook to reflect your question too.