VGGish feature vs I3D flow visual feature #121

@1980x

Hi. I am trying to extract visual and audio features from raw video clips. For the visual features, I run:
python main.py stack_size=24 step_size=8 extraction_fps=25 feature_type=i3d
For example, on a video converted to 25 fps, the above command gives 112x1024-dimensional RGB and flow features.

For the audio features, after converting the video to 25 fps, I run:
python main.py feature_type=vggish
This produces features whose first dimension does not match that of the visual features.
For example, it gives only a 32x128-dimensional feature.

Could you please tell me what needs to be done so that I get an audio feature of the same length, i.e. 112x128?

Thank you
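
For reference, here is a minimal sketch of one possible workaround: resampling the audio features onto the visual time axis after extraction. It is not something the repository is known to provide. It assumes that the VGGish extractor emits one 128-d embedding per non-overlapping 0.96 s audio window (the usual VGGish setting) and that each I3D feature can be timestamped at the centre of its 24-frame stack. The function name `resample_features` and the random placeholder array are hypothetical.

```python
import numpy as np

def resample_features(feats, src_period, dst_times):
    """Linearly interpolate features of shape (T_src, D) onto dst_times (seconds)."""
    feats = np.asarray(feats, dtype=np.float64)
    # Assumption: each source feature is timestamped at the centre of its window.
    src_times = (np.arange(feats.shape[0]) + 0.5) * src_period
    # np.interp is 1-D, so interpolate every feature dimension separately.
    # Target times outside the source range are clamped to the first/last feature.
    columns = [np.interp(dst_times, src_times, feats[:, d])
               for d in range(feats.shape[1])]
    return np.stack(columns, axis=1)

# Settings taken from the i3d command above.
fps = 25.0
stack_size = 24
step_size = 8
n_visual = 112

# Assumed centre time (in seconds) of each I3D stack.
visual_times = (np.arange(n_visual) * step_size + stack_size / 2) / fps

# Placeholder standing in for the real 32x128 VGGish output.
audio_feats = np.random.rand(32, 128)

aligned_audio = resample_features(audio_feats, 0.96, visual_times)
print(aligned_audio.shape)  # (112, 128)
```

Interpolation only stretches the existing 32 embeddings over 112 time steps; it does not add temporal detail. If the two feature streams cover different durations, the clamped edges will repeat the first or last audio embedding.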
