Unexpected 1-Second Silence Added After Each Segment in HD Voice Synthesis

I’m experiencing an issue with StartSpeakingSsmlAsync while synthesizing HD voice output.
When I synthesize a single sentence, the output is as expected.
But when I synthesize multiple sentences one after another, each audio segment ends with approximately 1 second of silence that appears to be inserted automatically.
Could you please clarify why this silence is added and whether it can be removed?
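
In case it helps with reproducing this, the trailing silence in a segment can be measured roughly as shown here. This is only a minimal sketch and not SDK code: it assumes the segment is available as raw 16-bit little-endian mono PCM at 24 kHz (the actual output format may differ), and the function name, `sample_rate`, and `threshold` are illustrative placeholders.

```python
import array
import sys


def trailing_silence_seconds(pcm_bytes, sample_rate=24000, threshold=200):
    """Return how many seconds of near-silence end a raw PCM buffer.

    Assumes 16-bit signed little-endian mono samples (an assumption,
    not something the SDK guarantees for every output format).
    """
    # Drop a dangling odd byte so the buffer holds whole 16-bit samples.
    usable = len(pcm_bytes) - (len(pcm_bytes) % 2)
    samples = array.array("h")
    samples.frombytes(pcm_bytes[:usable])

    # array("h") uses native byte order; swap on big-endian machines.
    if sys.byteorder == "big":
        samples.byteswap()

    # Walk backwards until a sample louder than the threshold is found.
    silent = 0
    for value in reversed(samples):
        if abs(value) > threshold:
            break
        silent += 1
    return silent / sample_rate
```

Running this on each synthesized segment would show whether the gap is really about 1 second in every segment or varies with the sentence.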