
Is there any best practice for using torchcodec with the PyTorch DataLoader? #696


Open
Ash-one opened this issue May 22, 2025 · 4 comments

@Ash-one

Ash-one commented May 22, 2025

I am using torchcodec instead of decord, and it is more stable than decord on the CPU. I want to know if there is a best-practice code snippet for accelerating the data loading and processing phase of PyTorch training on the GPU. I am having trouble loading many videos with a random sampling strategy, which costs a lot of time before the forward pass. I tried DALI, but it was quite hard to define sampling strategies with it.

I would appreciate a prompt reply! Thanks!
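For reference, the kind of pattern I have in mind is roughly the following sketch (the class name and the `sample_frame_indices` helper are my own invention, and it assumes a list of local MP4 paths with CPU decoding inside DataLoader worker processes):

```python
import random


def sample_frame_indices(num_total, clip_len, rng=random):
    """Pick `clip_len` distinct frame indices uniformly at random, sorted."""
    return sorted(rng.sample(range(num_total), clip_len))


class RandomClipDataset:
    """Map-style dataset: any object with __getitem__/__len__ works with
    torch.utils.data.DataLoader, so subclassing Dataset is optional."""

    def __init__(self, video_paths, clip_len=16):
        self.video_paths = video_paths
        self.clip_len = clip_len

    def __len__(self):
        return len(self.video_paths)

    def __getitem__(self, idx):
        # Import lazily so each DataLoader worker process builds its own
        # decoder state instead of inheriting it from the parent process.
        from torchcodec.decoders import VideoDecoder

        decoder = VideoDecoder(self.video_paths[idx])  # CPU decoding
        indices = sample_frame_indices(decoder.metadata.num_frames,
                                       self.clip_len)
        # FrameBatch.data is a (T, C, H, W) uint8 tensor.
        return decoder.get_frames_at(indices=indices).data


# Usage (hypothetical paths):
# loader = torch.utils.data.DataLoader(RandomClipDataset(paths),
#                                      batch_size=4, num_workers=8)
```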

@NicolasHug
Member

Thanks for the request @Ash-one. We don't have official recommendations at the moment, but it would be good to have some. Can you share more about your specific use case and what you have tried so far?

@Ash-one
Author

Ash-one commented May 28, 2025

@NicolasHug Hi! I'd like to share my use case:

My application scenario requires extracting a fixed number of video frames plus the audio from an MP4 file, and processing them into a (T, C, H, W) video input and (mel_bins, frames) audio fbank features.

The current processing solution is two-stage: first, using ffmpeg to split the audio and video into MP4 and WAV files, then processing them separately with torchcodec and torchaudio.
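The audio half of that processing looks roughly like this (a sketch of my pipeline; the function name, the default `num_mel_bins=128`, and the transpose to (mel_bins, frames) are my own choices):

```python
def wav_to_fbank(waveform, sample_rate, num_mel_bins=128):
    """Compute Kaldi-style log-mel filterbank features.

    `waveform` is a (channels, time) float tensor, as returned by
    torchaudio.load(). kaldi.fbank returns (frames, mel_bins), so we
    transpose to the (mel_bins, frames) layout described above.
    """
    import torchaudio.compliance.kaldi as kaldi  # lazy import

    fbank = kaldi.fbank(
        waveform,
        sample_frequency=sample_rate,
        num_mel_bins=num_mel_bins,
    )
    return fbank.T
```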

Additionally, I am running on the VGGSound dataset with 200K videos. To speed up preprocessing, I first confirmed that torchcodec works well on the CPU, and then tried to accelerate torchcodec preprocessing on GPUs in a distributed setup, but I ran into initialization issues.

It seems that repeatedly creating and releasing the VideoDecoder on the GPU has caused extra overhead, but I'm sorry that I don't quite understand the underlying mechanism. By the way, do you recommend this approach at all?

@NicolasHug
Member

It seems that repeatedly creating and releasing the VideoDecoder on the GPU has caused extra overhead

Yeah, this is fairly common. We try to cache the GPU context as much as possible but maybe there are things we could improve (that's on our stack). And in any case, we should document the best way to use GPU resources with torchcodec.
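In the meantime, one way to reduce the overhead is to amortize decoder construction: decode several clips per `VideoDecoder` instance instead of creating a fresh decoder per clip. A rough sketch (the `decode_clips` name and the clip-index layout are made up for illustration):

```python
def decode_clips(path, clip_indices, device="cuda"):
    """Decode several clips from one file with a single decoder.

    `clip_indices` is a list of frame-index lists, e.g. [[0, 5, 10], ...].
    Constructing the VideoDecoder (and its CUDA context) once per file,
    rather than once per clip, is where the savings come from.
    """
    from torchcodec.decoders import VideoDecoder  # lazy import

    decoder = VideoDecoder(path, device=device)
    return [decoder.get_frames_at(indices=idx).data for idx in clip_indices]
```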

The current processing solution is two-stage: first, using ffmpeg to split the audio and video into MP4 and WAV files, then processing them separately with torchcodec and torchaudio.

Just a note: you should be able to use the VideoDecoder and the AudioDecoder on the same video file (i.e. a file containing both video and audio streams). I think this could allow you to avoid the first ffmpeg step where you split the video and audio in separate files.
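Concretely, something like the following sketch (the 0–2 s window and the function name are arbitrary choices for illustration):

```python
def load_av(path, start=0.0, stop=2.0):
    """Decode video frames and audio samples from the same container file.

    Each decoder binds to its own stream inside the MP4, so a prior
    ffmpeg demux/split step is unnecessary.
    """
    from torchcodec.decoders import AudioDecoder, VideoDecoder  # lazy

    frames = VideoDecoder(path).get_frames_played_in_range(start, stop)
    samples = AudioDecoder(path).get_samples_played_in_range(start, stop)
    # frames.data: (T, C, H, W) uint8; samples.data: (channels, num_samples)
    return frames.data, samples.data, samples.sample_rate
```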

@Ash-one
Author

Ash-one commented Jun 3, 2025

@NicolasHug Thanks for your reply!

Yeah, this is fairly common. We try to cache the GPU context as much as possible but maybe there are things we could improve (that's on our stack). And in any case, we should document the best way to use GPU resources with torchcodec.

Hope to see this soon! I'd like to try it out in the nightly builds!

Just a note: you should be able to use the VideoDecoder and the AudioDecoder on the same video file (i.e. a file containing both video and audio streams).

That is very useful; I will try this approach in my new pipeline.
