run_training_epoch duration increases with more epochs #17694
Unanswered
nilsleh asked this question in Lightning Trainer API: Trainer, LightningModule, LightningDataModule
I have a LightningModule, DataModule, and Trainer that I am using on a regression problem. I have observed that as the epochs increase, the iterations/s shown on the tqdm bar decrease significantly, by a factor of about 2-5. To look into this I used the SimpleProfiler and recorded run_training_epoch at each epoch inside on_train_epoch_end(). When I plot these durations after 1000 epochs, I get the following:

[plot: run_training_epoch duration per epoch, growing roughly linearly over 1000 epochs]

I cannot share the full example that produced the plot above, but I tried to create a small toy example in a Google Colab notebook. The trend is not as severe as in the picture above, but it is still there, so I am wondering where else the source of this could be, since an individual training batch or the optimization step does not show this stark linear trend.

I have tried with lightning=2.0.2 and pytorch_lightning=1.9.5.
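For reference, below is a minimal sketch of the kind of setup described above, not the author's actual Colab notebook: the ToyRegression module, the synthetic dataset, and all hyperparameters are made up for illustration. It records the "run_training_epoch" duration at each epoch inside on_train_epoch_end(), reading SimpleProfiler's recorded_durations dict, which is an internal attribute rather than a public API and may differ between Lightning versions, so a plain wall-clock timer is kept alongside it as a cross-check.

```python
# Hedged sketch of a toy regression setup with per-epoch duration recording.
# Assumptions: lightning>=2.0, Trainer created with profiler=SimpleProfiler(),
# and the profiler action name "run_training_epoch" (as reported in the post).
import time

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

import lightning.pytorch as pl
from lightning.pytorch.profilers import SimpleProfiler


class ToyRegression(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.model = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 1))
        self.epoch_durations = []   # values read from SimpleProfiler
        self.manual_durations = []  # wall-clock cross-check

    def training_step(self, batch, batch_idx):
        x, y = batch
        return nn.functional.mse_loss(self.model(x), y)

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)

    def on_train_epoch_start(self):
        self._epoch_start = time.monotonic()

    def on_train_epoch_end(self):
        # Wall-clock duration of the epoch that just ran.
        self.manual_durations.append(time.monotonic() - self._epoch_start)

        # Durations recorded by SimpleProfiler under "run_training_epoch".
        # This hook still runs inside the current epoch's profiling context,
        # so the last entry may lag the current epoch by one.
        recorded = self.trainer.profiler.recorded_durations.get("run_training_epoch", [])
        if recorded:
            self.epoch_durations.append(recorded[-1])


def main():
    x = torch.randn(512, 10)
    y = x.sum(dim=1, keepdim=True) + 0.1 * torch.randn(512, 1)
    loader = DataLoader(TensorDataset(x, y), batch_size=64, shuffle=True)

    module = ToyRegression()
    trainer = pl.Trainer(
        max_epochs=1000,
        profiler=SimpleProfiler(),
        logger=False,
        enable_checkpointing=False,
    )
    trainer.fit(module, loader)
    # module.manual_durations and module.epoch_durations can then be plotted
    # against the epoch index to see whether the per-epoch time grows.


if __name__ == "__main__":
    main()
```

Plotting manual_durations next to the profiler's values gives a profiler-independent check of whether the per-epoch time really increases, or whether the effect is an artifact of how the durations are collected.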
Replies: 1 comment

- I am also having issues with this. I tried training a CycleGAN model on …