How to call multiple voices in SSML

szhaomsft edited this page Mar 13, 2020 · 14 revisions

Customers may want to use multiple voices in one SSML document. Azure TTS supports combining multiple voices with SSML.

Multiple standard voices

To use multiple standard voices, compose the SSML so that each piece of text is wrapped in a voice element that names the voice to be used.

<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">
  <voice name="en-US-AriaNeural">
    This is the text that is spoken.
  </voice>
  <voice name="en-US-GuyNeural">
    This is the text that is spoken.
  </voice>
</speak>

After that, everything works the same as with single-voice SSML.
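When the voices and texts come from application data, the SSML can be composed in code rather than by hand. The following is a minimal sketch in Python, not an official helper: `build_ssml` and its parameters are hypothetical names chosen for illustration, and it assumes the W3C SSML namespace URI `http://www.w3.org/2001/10/synthesis`.

```python
from xml.sax.saxutils import escape, quoteattr

# W3C SSML namespace URI (assumed here; note the http scheme).
SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_ssml(segments, lang="en-US"):
    """Compose one SSML document from (voice_name, text) pairs.

    `segments` is an iterable of (voice_name, text) tuples in speaking order.
    This helper is hypothetical; it only assembles the markup shown above.
    """
    parts = [f'<speak version="1.0" xmlns="{SSML_NS}" xml:lang={quoteattr(lang)}>']
    for voice_name, text in segments:
        # quoteattr/escape keep characters such as & and < from breaking the XML.
        parts.append(f"  <voice name={quoteattr(voice_name)}>{escape(text)}</voice>")
    parts.append("</speak>")
    return "\n".join(parts)


ssml = build_ssml([
    ("en-US-AriaNeural", "This is the text that is spoken."),
    ("en-US-GuyNeural", "This is the text that is spoken."),
])
```

The resulting string can be sent in the request body exactly like SSML that contains a single voice.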

Multiple custom voices

For custom voices, the custom endpoint currently needs to include the custom voice deployment ID. Refer to: https://docs.microsoft.com/bs-latn-ba/azure/cognitive-services/speech-service/regions#custom-voices

To use multiple custom voices in one SSML document as above, the endpoint also needs to include multiple deployment IDs, one for each custom voice, for example:

https://eastasia.voice.speech.microsoft.com/cognitiveservices/v1?deploymentId=c30728d-31f9-49bc-8bc6-cc8cf5a99d11&deploymentId=c30728d-31f9-49bc-8bc6-cc8cf5a99d12&deploymentId=c30728d-31f9-49bc-8bc6-cc8cf5a99d13

If there are too many voices to put into the URL by hand, it is recommended to write some code that constructs the URL dynamically based on the SSML content.
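One way to construct the URL from the SSML content is sketched below in Python. It is an illustration under stated assumptions, not an official client: `build_endpoint` is a hypothetical helper, and `deployment_ids` is a hypothetical mapping from custom voice name to deployment ID that the caller has to maintain. The sketch collects the voice names used in the SSML and emits one `deploymentId` query parameter per distinct deployment, in the repeated-parameter form shown above.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def build_endpoint(ssml, deployment_ids, base_url):
    """Return base_url plus one deploymentId parameter per voice used in ssml.

    `deployment_ids` (hypothetical) maps custom voice name -> deployment ID.
    `base_url` is the custom voice endpoint without a query string.
    """
    root = ET.fromstring(ssml)
    ids = []
    for element in root.iter():
        # Compare the local tag name so the namespace URI does not matter.
        if element.tag.rsplit("}", 1)[-1] != "voice":
            continue
        name = element.get("name")
        if name not in deployment_ids:
            raise KeyError(f"no deployment ID known for voice {name!r}")
        deployment_id = deployment_ids[name]
        if deployment_id not in ids:  # keep first-seen order, drop repeats
            ids.append(deployment_id)
    query = urlencode([("deploymentId", d) for d in ids])
    return f"{base_url}?{query}"
```

Only the voices that actually appear in the SSML end up in the URL, so the mapping can hold every deployed voice without making each request URL longer.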
