Will more dataloaders/datasets cause GPU/CPU memory usage to increase? #17677
Closed
Unanswered
24-solar-terms
asked this question in
Lightning Trainer API: Trainer, LightningModule, LightningDataModule
Replies: 0 comments
I ran into a problem while training my model.
I have 21 datasets, each with thousands of samples, and I get a CUDA out-of-memory error when I return a list of dataloaders from my LightningDataModule's train_dataloader(self) method.
So I tried using 1 dataset, returning a list with 1 dataloader; it worked fine and used 23 GB of GPU memory.
When I used 2 datasets, returning a list of 2 dataloaders, it used 42 GB of GPU memory.
I also tried using CombinedLoader from PyTorch Lightning to handle the list of dataloaders, but that did not solve the problem.
However, when I used PyTorch's ConcatDataset to combine the datasets and created a single dataloader, CUDA memory usage was stable, but CPU memory kept increasing during training until an OOM occurred.
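For reference, a minimal sketch of the ConcatDataset setup described above. The toy TensorDataset instances, the sizes, and the helper name build_combined_loader are placeholders for illustration, not the actual code from this project. As far as I understand the Lightning docs, a list of train dataloaders yields one batch per loader at every step, so per-step GPU memory would grow with the number of loaders, whereas a single loader over a ConcatDataset yields one batch per step.

```python
import torch
from torch.utils.data import ConcatDataset, DataLoader, TensorDataset


def build_combined_loader(num_datasets=3, samples_per_dataset=10, batch_size=4):
    """Concatenate several toy datasets and wrap them in a single DataLoader."""
    # Hypothetical stand-ins for the real datasets: each holds
    # `samples_per_dataset` rows filled with the dataset's index.
    datasets = [
        TensorDataset(torch.full((samples_per_dataset, 2), float(i)))
        for i in range(num_datasets)
    ]
    # ConcatDataset exposes the datasets as one index space.
    combined = ConcatDataset(datasets)
    # One loader means one batch per training step, regardless of how
    # many source datasets there are.
    loader = DataLoader(combined, batch_size=batch_size, shuffle=True, num_workers=0)
    return combined, loader


combined, loader = build_combined_loader()
# Count the samples seen in one pass over the combined loader.
total = sum(batch[0].shape[0] for batch in loader)
print(len(combined), total)
```

In a LightningDataModule, train_dataloader would then return this single loader instead of a list.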
Here is my pytorch dataset:
Here is my LightningDataModule
Here is train.py
Where is the potential for a memory leak?
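One way to narrow down CPU-side growth is to compare tracemalloc snapshots taken before and after a number of steps. The sketch below uses only the Python standard library; the loop that appends to `history` is a hypothetical stand-in for a training loop that accidentally keeps per-step data alive, and `top_growth` is a helper name made up for this example. Note that tracemalloc only sees allocations made through Python's allocators, so memory held by tensor storage allocated in native code may not show up here.

```python
import tracemalloc


def top_growth(before, after, limit=3):
    """Return the source lines whose allocations changed the most."""
    # compare_to returns StatisticDiff entries sorted by largest change first.
    stats = after.compare_to(before, "lineno")
    return stats[:limit]


tracemalloc.start()
before = tracemalloc.take_snapshot()

# Hypothetical stand-in for a training loop that keeps per-step data alive.
history = []
for step in range(1000):
    history.append([float(step)] * 100)

after = tracemalloc.take_snapshot()
growth = top_growth(before, after)
for stat in growth:
    # Each entry reports the file, line number, and size difference.
    print(stat)
tracemalloc.stop()
```

If the top entries point at a line inside the dataset's `__getitem__` or at a list that grows every step, that is a likely place to look first.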