I have a couple of questions regarding the dataset preprocessing:
For WebVid-10M, you mentioned filtering out videos with captions that do not contain dynamic content using the LLaMA-3 model. Could you please share the criteria or code used to determine whether a caption contains dynamic content?
For Panda-70M, you stated that 5.3 million videos were downloaded. Could you clarify which subset of videos was selected and how it was chosen?