Ask for help

Hello, I worked through the example of extracting features from speech with the AST model and adapted it to extract features from new speech using my own model. The features I obtain all have shape [1, 1214, 768], but I only want features of shape [1, 768]. Are the features from the final layer of AST always [1, 1214, 768], or have I made a mistake somewhere? Thank you for your assistance; I look forward to your reply.
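To make the shape question concrete, here is a minimal sketch of reducing a [1, 1214, 768] array to [1, 768]. It assumes the middle axis of size 1214 is a token (patch) axis and that pooling over it is acceptable for the task. The random array is only a stand-in for real model output, and `pool_tokens` is a hypothetical helper, not part of the AST code.

```python
import numpy as np


def pool_tokens(hidden, mode="mean"):
    """Reduce a (batch, tokens, dim) array to (batch, dim).

    Hypothetical helper for illustration; not part of any AST codebase.
    """
    if hidden.ndim != 3:
        raise ValueError(f"expected a 3-D array, got {hidden.ndim} dimensions")
    if mode == "mean":
        # Average over the token axis (axis 1).
        return hidden.mean(axis=1)
    if mode == "first":
        # Keep only the first token, assuming it is a class-style token.
        return hidden[:, 0, :]
    raise ValueError(f"unknown mode: {mode!r}")


# Stand-in for the final-layer output described above (random values).
rng = np.random.default_rng(0)
hidden = rng.standard_normal((1, 1214, 768)).astype(np.float32)

pooled = pool_tokens(hidden, mode="mean")
print(pooled.shape)  # (1, 768)
```

Whether mean pooling or taking the first token suits a given model depends on how that model was trained, so treat both modes as options to try rather than the intended method.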


Ask for help #130
