Replies: 3 comments
-
If I could make a suggestion, I would suggest calling the optimizer
-
I had the same issue, and as @toddsp22 suggested, the problem was in the 'optimizer': I misspelled the optimizer's name in the definition for 'model_2', so training fell back to the existing 'optimizer', which was the optimizer of model_1.
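A quick way to catch this kind of mix-up is to check that the optimizer actually holds the parameters of the model being trained. The helper below is a minimal sketch: `optimizer_tracks_model` is a hypothetical name, and it relies only on `optimizer.param_groups` and `model.parameters()`, which PyTorch optimizers and modules provide. The stand-in objects are assumptions so that the example runs without PyTorch installed.

```python
from types import SimpleNamespace


def optimizer_tracks_model(optimizer, model) -> bool:
    """Return True if every parameter of `model` appears in `optimizer.param_groups`."""
    tracked = {id(p) for group in optimizer.param_groups for p in group["params"]}
    return all(id(p) in tracked for p in model.parameters())


# Stand-ins that mimic only the two attributes used above (not real PyTorch objects).
p1, p2 = object(), object()
model_1 = SimpleNamespace(parameters=lambda: iter([p1]))
model_2 = SimpleNamespace(parameters=lambda: iter([p2]))
optimizer = SimpleNamespace(param_groups=[{"params": [p1]}])  # built for model_1

print(optimizer_tracks_model(optimizer, model_1))  # True
print(optimizer_tracks_model(optimizer, model_2))  # False
```

With real objects, calling `optimizer_tracks_model(optimizer, model_2)` just before the training loop should return True; if it returns False, the optimizer was built for a different model.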
-
Hello. My functions and setup:
The training and testing code:
-
When I train model_2 I get the following accuracies:
Epoch: 0
Train loss: 2.30229 | Train acc: 10.00%
Test loss: 2.30231 | Test acc: 9.99%
Epoch: 1
Train loss: 2.30228 | Train acc: 10.00%
Test loss: 2.30231 | Test acc: 9.99%
Epoch: 2
Train loss: 2.30228 | Train acc: 10.00%
Test loss: 2.30231 | Test acc: 9.99%
Train time on cpu: 147.086 seconds
So something is wrong in my training function, my model, or my function call.
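One observation about the numbers above, offered as a hint rather than a diagnosis: with 10 classes, a model that predicts every class with equal probability has a cross-entropy loss of ln(10) ≈ 2.3026 and an accuracy of about 10%, which is what the log shows. A loss that stays at that value across epochs is consistent with weights that are not being updated at all. The arithmetic in plain Python (the class count of 10 is taken from FashionMNIST):

```python
import math

num_classes = 10  # FashionMNIST has 10 classes

# Cross-entropy of a uniform prediction: -log(1 / num_classes) = log(num_classes)
chance_loss = -math.log(1.0 / num_classes)
chance_acc = 100.0 / num_classes

print(f"chance loss: {chance_loss:.5f}")  # compare with the logged 2.30229
print(f"chance acc:  {chance_acc:.2f}%")  # compare with the logged 10.00%
```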
Here is my train_function:
def train_step(model: torch.nn.Module,
               data_loader: torch.utils.data.DataLoader,
               loss_fn: torch.nn.Module,
               optimizer: torch.optim.Optimizer,
               accuracy_fn,
               device: torch.device = device):
This code works on model_1. The to(device) call is not needed since I'm working only on a CPU; it didn't make a difference anyway.
Here is my MNIST V2:
class FashionMNISTModelV2(nn.Module):
    def __init__(self, input_shape: int, hidden_units: int, output_shape: int):
        super().__init__()
        self.conv_block_1 = nn.Sequential(
            nn.Conv2d(in_channels=input_shape,
                      out_channels=hidden_units,
                      kernel_size=3,
                      stride=1,
                      padding=1),
            nn.ReLU(),
            nn.Conv2d(in_channels=hidden_units,
                      out_channels=hidden_units,
                      kernel_size=3,
                      stride=1,
                      padding=1),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2)
        )
        self.conv_block_2 = nn.Sequential(
            nn.Conv2d(hidden_units, hidden_units, 3, padding=1),
            nn.ReLU(),
            nn.Conv2d(hidden_units, hidden_units, 3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2)
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            # 28x28 input is halved by each MaxPool2d: 28 -> 14 -> 7
            nn.Linear(in_features=hidden_units*7*7,
                      out_features=output_shape)
        )
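For anyone checking the classifier's input size, the spatial dimensions can be worked out by hand. This is a minimal sketch assuming 28x28 FashionMNIST images and the layer settings shown in the model above; `conv2d_out` is a hypothetical helper that applies the standard output-size formula for convolution and pooling layers (dilation 1, floor rounding).

```python
def conv2d_out(size: int, kernel_size: int, stride: int = 1, padding: int = 0) -> int:
    """Output height/width of a square Conv2d or MaxPool2d layer."""
    return (size + 2 * padding - kernel_size) // stride + 1


size = 28  # assumption: FashionMNIST images are 28x28
for block in range(2):
    size = conv2d_out(size, kernel_size=3, stride=1, padding=1)  # first conv keeps the size
    size = conv2d_out(size, kernel_size=3, stride=1, padding=1)  # second conv keeps the size
    size = conv2d_out(size, kernel_size=2, stride=2)  # MaxPool2d(2); stride defaults to kernel_size
    print(f"after conv_block_{block + 1}: {size}x{size}")

print(f"in_features = hidden_units * {size} * {size}")
```

The result is 7x7 per channel after both blocks, so the in_features value in the classifier looks right and is probably not the cause of the flat loss.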
Maybe someone can spot a bug that I don't see.
Here is my epoch loop:
torch.manual_seed(42)
torch.cuda.manual_seed(42)

from timeit import default_timer as timer
from tqdm.auto import tqdm

train_time_start_model_2 = timer()

# Train and test model
epochs = 3
for epoch in tqdm(range(epochs)):
    print(f"Epoch: {epoch}\n---------")
    train_step(model=model_2,
               data_loader=train_dataloader,
               loss_fn=loss_fn,
               optimizer=optimizer,
               accuracy_fn=accuracy_fn,
               device=device)
    test_step(model=model_2,
              data_loader=test_dataloader,
              loss_fn=loss_fn,
              accuracy_fn=accuracy_fn,
              device=device)

train_time_end_model_2 = timer()
total_train_time_model_2 = print_train_time(start=train_time_start_model_2,
                                            end=train_time_end_model_2,
                                            device=device)
Perhaps someone could find a problem there.
I also noticed something interesting. While checking whether my functions were the problem, I found that if you train model_1 with an optimizer built on model_2's parameters, it won't learn. Here is what I used, though:
loss_fn= nn.CrossEntropyLoss()
optimizer == torch.optim.SGD(params=model_2.parameters(),
lr=0.1)
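One thing worth checking in the snippet above: as pasted, the optimizer line uses `==` (comparison) rather than `=` (assignment). If that is how it appears in the notebook and not just a typo in this post, the line builds a new SGD optimizer, compares it with the existing `optimizer`, and throws the result away, so `optimizer` would still point at model_1's parameters. The sketch below shows the difference in plain Python; `FakeOptimizer` is a hypothetical stand-in, not a PyTorch class.

```python
class FakeOptimizer:
    """Hypothetical stand-in for an optimizer; it only remembers what it was built with."""

    def __init__(self, params):
        self.params = params


optimizer = FakeOptimizer(params="model_1 parameters")

# Comparison: builds a new object, compares it, and discards the result.
optimizer == FakeOptimizer(params="model_2 parameters")
after_comparison = optimizer.params

# Assignment: rebinds the name to the new object.
optimizer = FakeOptimizer(params="model_2 parameters")
after_assignment = optimizer.params

print(after_comparison)  # model_1 parameters
print(after_assignment)  # model_2 parameters
```

Python raises no error for the comparison as long as `optimizer` already exists from an earlier cell, which would explain why nothing looks wrong until the loss refuses to move.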
Please help: I've spent hours combing through my code and I can't find an answer.
Could it be a version problem?