Clarification of Decoding Settings for Video Benchmarks of PLM-8B #44

Open
@cheongalc

Dear authors, thank you very much for releasing this work. I am trying to reproduce the results of PLM-8B on Video-MME and PLM-VideoBench, and would like to clarify the following points:

  • During evaluation, can I confirm that 32 frames, not 8, are uniformly sampled from each test video? Page 16, Table 12 of the PE paper states: "For our PLM-8B we use PElangG as the vision encoder with 36 tiles for images and 32 frames for video". The default number of frames in params.json on Hugging Face is also 32. Were 32 frames uniformly sampled from every video in every benchmark, regardless of video length?

  • What temperature was used, and was any top_p or top_k sampling applied, in the evaluations reported on page 16, Table 12 of the PE paper?

  • What was max_tokens set to in the evaluations reported on page 16, Table 12 of the PE paper? The current value of 11264 causes high VRAM usage.
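To make the first question concrete, here is a minimal sketch of one common reading of "uniformly sampled": split the video into equal segments and take the middle frame of each. It is not taken from the PLM codebase. The function name `uniform_frame_indices` and the midpoint convention are assumptions for illustration only, and the actual loader may instead include the first and last frames or handle short videos differently.

```python
def uniform_frame_indices(total_frames: int, num_samples: int = 32) -> list[int]:
    """Return frame indices spread evenly across a video (hypothetical helper)."""
    if total_frames <= 0 or num_samples <= 0:
        raise ValueError("total_frames and num_samples must be positive")
    # Assumption: a video shorter than the budget contributes every frame once.
    if total_frames <= num_samples:
        return list(range(total_frames))
    # Split the video into num_samples equal segments and take each midpoint.
    step = total_frames / num_samples
    return [int(step * i + step / 2) for i in range(num_samples)]


# A 30-second clip at 30 fps has 900 frames; 32 indices are returned
# whether the clip is 30 seconds or 30 minutes long.
indices = uniform_frame_indices(900, 32)
print(len(indices), indices[:3], indices[-1])
```

Under this reading the number of sampled frames is fixed, so the temporal gap between sampled frames grows with video length. That is why the answer matters for long-video benchmarks such as Video-MME.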

I ask the 2nd and 3rd questions because I am seeing a strange issue: PLM-8B tends to loop its output and fails to emit the end-of-sequence token.

The problem occurs even when I clone this repository, set up a fresh environment following the README instructions, and use the default generation config, which is deterministic (temperature=0, no top_p or top_k sampling).

An example is this 30-second video, with the prompt "Describe what is going on in this video in great detail." The generation I get is:

A black cat is lying on a rug and a ball rolls towards it. The cat rolls over and plays with the ball. The ball rolls away and the cat follows it. The cat then lies on its back and plays with the ball. The ball rolls away and the cat follows it. The cat then sits up and plays with the ball. The ball rolls away and the cat follows it. The cat then lies on its back and plays with the ball. The ball rolls away and the cat follows it. The cat then sits up and plays with the ball. The ball rolls away and the cat follows it. The cat then lies on its back and plays with the ball. The ball rolls away and the cat follows it. The cat then sits up and plays with the ball. The ball rolls away and the cat follows it. The cat then lies on its back and plays with the ball. The ball rolls away and the cat follows it. The cat then sits up and plays with the ball. The ball rolls away and the cat follows it. The cat then lies on its back and plays with the ball. The ball rolls away and the cat follows it. The cat then sits up and plays with the ball. The ball rolls away and the cat follows it. The
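The looping in an output like the one above can be flagged automatically when scanning many generations. The following is a rough sketch using only the Python standard library. The helper name `repeated_sentences` and the threshold of three occurrences are arbitrary assumptions, and this is a diagnostic for the symptom, not a fix for the decoding behaviour.

```python
import re
from collections import Counter


def repeated_sentences(text: str, min_count: int = 3) -> dict[str, int]:
    """Return sentences that occur at least min_count times (hypothetical helper)."""
    # Naive sentence split: break on whitespace that follows ., ! or ?
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    counts = Counter(sentences)
    return {s: n for s, n in counts.items() if n >= min_count}


# Shortened stand-in for a looping generation that was cut off mid-sentence.
sample = (
    "A black cat is lying on a rug. "
    "The ball rolls away and the cat follows it. "
    "The ball rolls away and the cat follows it. "
    "The ball rolls away and the cat follows it. The"
)
print(repeated_sentences(sample))
```

A non-empty result combined with an output that ends mid-sentence suggests the generation stopped at the token limit rather than at an end-of-sequence token.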

Thank you very much!
