How does GradientAccumulationScheduler behave if the number of steps in an epoch is not evenly divisible by the accumulation steps? #19605
Unanswered
jeffwillette asked this question in Lightning Trainer API: Trainer, LightningModule, LightningDataModule
I have been trying to answer this question by looking through the Trainer implementation (https://lightning.ai/docs/pytorch/stable/_modules/lightning/pytorch/trainer/trainer.html#Trainer.__init__), but I have not found the answer yet. Specifically, say that...

I think there are two ways this could possibly be handled:

I would assume that option 1 would be the best choice, but I have not been able to verify that this is the behavior. Does anyone know where to verify this?
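Since the question is where to verify this, one option is to probe the behavior empirically instead of reading the loop source. Below is a minimal sketch, assuming Lightning 2.x (`import lightning.pytorch as pl`); `ProbeModule` and the dataset sizes are hypothetical stand-ins, chosen so that the 5 batches per epoch do not divide evenly by `accumulate_grad_batches=4`:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

import lightning.pytorch as pl


class ProbeModule(pl.LightningModule):
    """Tiny module whose only job is to report when optimizer steps fire."""

    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(4, 1)

    def training_step(self, batch, batch_idx):
        x, y = batch
        return torch.nn.functional.mse_loss(self.layer(x), y)

    def on_before_optimizer_step(self, optimizer):
        # Called once per actual optimizer step; the printed
        # epoch/global_step pairs reveal whether a leftover partial
        # accumulation window at an epoch boundary triggers a step
        # or carries over into the next epoch.
        print(f"step: epoch={self.current_epoch} global_step={self.global_step}")

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.1)


# 10 samples at batch size 2 -> 5 batches per epoch, not divisible by 4.
dataset = TensorDataset(torch.randn(10, 4), torch.randn(10, 1))
loader = DataLoader(dataset, batch_size=2)

trainer = pl.Trainer(
    max_epochs=2,
    accumulate_grad_batches=4,
    logger=False,
    enable_checkpointing=False,
    enable_progress_bar=False,
)
trainer.fit(ProbeModule(), loader)
```

If a step is printed on the last batch of every epoch, the partial accumulation window is flushed at the epoch boundary; if steps appear only every fourth batch regardless of epoch, the counter carries over.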
1 reply

My particular task only has 3-5 steps in each epoch, and I set the batch size to 2 and the accumulation steps to 4. So far, the performance seems correct and is sometimes better than with no accumulation, so I think we can assume the accumulation step counter is not reset between epochs.
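For reference, a minimal sketch of the configuration described in this reply, assuming Lightning 2.x; both forms request the same 4-batch accumulation, and the `scheduling` dict maps the epoch at which a value takes effect to the number of batches to accumulate:

```python
import lightning.pytorch as pl
from lightning.pytorch.callbacks import GradientAccumulationScheduler

# Constant accumulation over 4 batches via the Trainer flag...
trainer = pl.Trainer(accumulate_grad_batches=4)

# ...or the same thing via the callback, which also allows per-epoch
# schedules (here: accumulate 4 batches starting from epoch 0).
trainer = pl.Trainer(callbacks=[GradientAccumulationScheduler(scheduling={0: 4})])
```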