Is there a recommended way of training multiple models in parallel on a single GPU? I tried using joblib's `Parallel` & `delayed`, but I got a CUDA OOM with two instances even though a single model uses barely a fourth of the total memory. And is a speedup compared to sequential calling expected?
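Presumably the attempt looked something like the sketch below (`train_one` and the learning rates are invented placeholders). One likely contributor to the OOM: with a process-based joblib backend, every worker initializes its own CUDA context and its own PyTorch caching-allocator pool on top of the model itself, so two workers can exhaust memory even when a single model fits several times over.

```python
# Hypothetical reconstruction of the pattern from the question;
# train_one stands in for one full training run on the shared GPU.
from joblib import Parallel, delayed

def train_one(lr):
    # In the real case: build the LightningModule + Trainer and call
    # trainer.fit(...) here (see the fuller sketch further down).
    ...

# Two workers on one GPU. With a process-based backend, each worker
# pays for its own CUDA context and caching-allocator pool, which can
# trigger the OOM described above.
Parallel(n_jobs=2)(delayed(train_one)(lr) for lr in (1e-3, 1e-2))
```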
-
Solution: https://pytorch-lightning.readthedocs.io/en/stable/advanced/multi_gpu.html
-
Solution: #2807
-
Thanks, @Programmer-RD-AI. Sadly, your solutions point to multi-GPU training, but I am looking into training multiple models on a single GPU (not the other way around!). And the issue you reference is about training many models sequentially (if I understood correctly).
-
@grudloff I don't know how this could be done, or if it is even possible, but if you don't find the answer here, perhaps you could also ask in the PyTorch forum.
-
It is possible to run multiple trainings on a single GPU using joblib. But as far as I know, you need to instantiate your model and `Trainer` inside the delayed function.
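A minimal self-contained sketch of that pattern, assuming a recent PyTorch Lightning API (`accelerator="gpu", devices=1`); the tiny model, the random data, and the hyperparameters are all invented for illustration:

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from joblib import Parallel, delayed
import pytorch_lightning as pl

class TinyModel(pl.LightningModule):
    # Deliberately small placeholder model, just to make the sketch run.
    def __init__(self, lr):
        super().__init__()
        self.layer = nn.Linear(8, 1)
        self.lr = lr

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = nn.functional.mse_loss(self.layer(x), y)
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=self.lr)

def train_one(lr):
    # Model, data, and Trainer are all created inside the delayed
    # function, per the comment above: nothing CUDA-related exists
    # in the caller before the workers start.
    data = TensorDataset(torch.randn(256, 8), torch.randn(256, 1))
    model = TinyModel(lr)
    trainer = pl.Trainer(accelerator="gpu", devices=1, max_epochs=1,
                         logger=False,  # avoid two runs racing on one log dir
                         enable_progress_bar=False)
    trainer.fit(model, DataLoader(data, batch_size=32))
    return float(trainer.callback_metrics["train_loss"])

# Two trainings share the single GPU; prefer="threads" keeps all
# workers in one process, so there is only one CUDA context.
losses = Parallel(n_jobs=2, prefer="threads")(
    delayed(train_one)(lr) for lr in (1e-3, 1e-2)
)
```

As for the speedup question: whether this beats sequential training depends on how well a single run already saturates the GPU, since the two runs merely interleave their kernels on the same device; the gain can range from nearly 2x for small models to none at all.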
-
Hi, I am doing testing in parallel on a single GPU using joblib, instantiating all modules inside the delayed function. However, I had a problem with exceptions in PyTorch (relevant issue), which was solved by specifying …
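The comment is cut off, but the fix presumably amounts to pinning joblib's backend explicitly. A minimal sketch assuming `backend="threading"`, which keeps every worker in the current process, so CUDA is initialized exactly once and fork-related CUDA exceptions cannot occur; `run_test` is a hypothetical stand-in for the real test step:

```python
from joblib import Parallel, delayed
import torch

def run_test(scale):
    # Stand-in for one test run; in the real case, build the model and
    # Trainer here and call trainer.test(...) instead.
    x = torch.randn(1024, 1024, device="cuda") * scale
    return float(x.relu().mean())

# backend="threading" (equivalently prefer="threads") runs workers as
# threads in the current process, so CUDA is initialized exactly once.
results = Parallel(n_jobs=2, backend="threading")(
    delayed(run_test)(s) for s in (0.5, 2.0)
)
print(results)
```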