Questions about training setup and model details

Hello, 

First of all, thank you so much for sharing this model! 

I have a few questions regarding the training and usage details:

1. **Training configuration** – Could you please share some information about the training setup (GPU type, total training time, batch size, number of epochs, etc.)?
2. **Audio format** – What sampling rate was used for the training audio (e.g., 16 kHz, 24 kHz, or 44.1 kHz)?
3. **Inference performance** – What is the current inference speed on your GPU? Have you tried using frameworks such as **vLLM** or others to optimize runtime performance?
4. **Multi-speaker training** – Have you experimented with multi-speaker training yet?
5. **Dataset format** – Would it be possible to release the dataset (or a subset) in a format like **Parquet**, containing fields such as `audio | transcript | sampling_rate`?
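
To make item 5 concrete, here is a rough sketch of the row layout I have in mind. It uses only the Python standard library, so it shows the three fields but does not write Parquet itself (that would need something like `pyarrow`). The names `Row`, `make_sine_wav`, and `row_from_wav` are hypothetical, and the 24 kHz sine wave is just placeholder audio, not a guess about the actual training data.

```python
import io
import math
import struct
import wave
from dataclasses import dataclass


@dataclass
class Row:
    """One dataset row (hypothetical layout, not the repository's actual schema)."""
    audio: bytes        # encoded WAV file contents
    transcript: str     # text spoken in the clip
    sampling_rate: int  # in Hz, read from the WAV header


def make_sine_wav(sampling_rate: int, seconds: float = 0.1, freq: float = 440.0) -> bytes:
    """Synthesize a short mono 16-bit sine wave as placeholder audio."""
    n_samples = int(sampling_rate * seconds)
    frames = b"".join(
        struct.pack("<h", int(32767 * 0.5 * math.sin(2 * math.pi * freq * i / sampling_rate)))
        for i in range(n_samples)
    )
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)   # mono
        w.setsampwidth(2)   # 16-bit samples
        w.setframerate(sampling_rate)
        w.writeframes(frames)
    return buf.getvalue()


def row_from_wav(wav_bytes: bytes, transcript: str) -> Row:
    """Build a row, taking the sampling rate from the WAV header itself."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as w:
        return Row(audio=wav_bytes, transcript=transcript, sampling_rate=w.getframerate())


row = row_from_wav(make_sine_wav(24000), "hello world")
print(row.transcript, row.sampling_rate)
```

Reading `sampling_rate` from the header rather than hard-coding it would keep the column consistent with the audio bytes even if clips with different rates were mixed.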

Thanks again for your contribution! 

Best regards,


Questions about training setup and model details #3

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Questions about training setup and model details #3

Description

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions