Vggish feature vs i3d flow visual feature

Hi. I am trying to extract visual and audio features from raw video clips. For visual features, I run:
python main.py stack_size=24 step_size=8 extraction_fps=25 feature_type=i3d
For example, on a video converted to 25 fps, the command above gives 112x1024-dimensional RGB and flow features.

But for audio features, after converting the video to 25 fps,
python main.py feature_type=vggish
produces features whose first dimension does not match that of the visual features.
For example, it gives only a 32x128-dimensional feature.

Could you please tell me what needs to be done so that I get a matching 112x128 audio feature?

Thank you
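
One possible workaround, sketched under assumptions rather than taken from the tool itself: the two feature types are computed on different timelines. The I3D command above emits one feature per 24-frame stack taken every 8 frames, whereas VGGish, as originally published, emits one 128-d embedding per 0.96 s of audio, so changing the video fps would not be expected to change the audio feature count. If both arrays cover the same clip, the audio features can be linearly interpolated along the time axis to the visual length. The function name `resample_time_axis` and the random stand-in arrays are hypothetical; in practice the arrays would be whatever the two runs saved.

```python
import numpy as np


def resample_time_axis(feats: np.ndarray, target_len: int) -> np.ndarray:
    """Linearly interpolate a (T, D) feature array to (target_len, D).

    Hypothetical helper, not part of the extraction tool. It assumes the
    first and last rows of the input and output cover the same time span.
    """
    src_len, dim = feats.shape
    if src_len == 1:
        # A single step cannot be interpolated, so repeat it.
        return np.repeat(feats, target_len, axis=0)
    # Place source and target steps on a shared normalized timeline [0, 1].
    src_t = np.linspace(0.0, 1.0, num=src_len)
    dst_t = np.linspace(0.0, 1.0, num=target_len)
    out = np.empty((target_len, dim), dtype=feats.dtype)
    for d in range(dim):
        # Interpolate each feature dimension independently over time.
        out[:, d] = np.interp(dst_t, src_t, feats[:, d])
    return out


# Stand-in arrays with the shapes reported above (assumed, random values).
audio = np.random.rand(32, 128).astype(np.float32)
visual = np.random.rand(112, 1024).astype(np.float32)

audio_aligned = resample_time_axis(audio, visual.shape[0])
print(audio_aligned.shape)  # (112, 128)
```

Linear interpolation is only one choice. Repeating the nearest audio embedding for each visual step, or pooling the visual features down to the audio length, are alternatives with different trade-offs.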

Vggish feature vs i3d flow visual feature #121
