geneing t1_j3hzfpy wrote on January 8, 2023 at 6:40 PM

Reply to comment by Intelligent_Rough_21 in [D] Looking for a dataset of Text-To-Speech audiobook-style Speech Synthesis Markup Language (SSML) files by Intelligent_Rough_21

I think what you are looking for is called "expressive TTS". There have been a ton of papers in the last couple of years on the topic. Many provide code.

I've had some success with simply preserving the hidden state of the network from one sentence to the next.
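
To illustrate the idea, here is a minimal sketch using a toy scalar RNN, not a real TTS model. The names `rnn_step`, `encode_sentence`, and `synthesize`, the weights, and the token "embedding" are all hypothetical stand-ins. The only point is that the state left over from one sentence becomes the starting state for the next one instead of being reset to zero.

```python
import math

def rnn_step(h, x, w_h=0.5, w_x=1.0):
    """One step of a toy scalar RNN: h' = tanh(w_h * h + w_x * x)."""
    return math.tanh(w_h * h + w_x * x)

def encode_sentence(tokens, h0=0.0):
    """Run the toy RNN over one sentence and return its final hidden state."""
    h = h0
    for tok in tokens:
        # Stand-in for a real token embedding (assumption for illustration).
        x = (len(tok) % 5) / 5.0
        h = rnn_step(h, x)
    return h

def synthesize(sentences, carry_state=True):
    """Return the final hidden state after each sentence.

    With carry_state=True the state from the previous sentence seeds the
    next one; with carry_state=False every sentence starts from zero.
    """
    h = 0.0
    states = []
    for sentence in sentences:
        if not carry_state:
            h = 0.0  # the usual per-sentence reset
        h = encode_sentence(sentence.split(), h)
        states.append(h)
    return states

if __name__ == "__main__":
    text = ["It was a dark night.", "Then the door opened."]
    print("carried:", synthesize(text, carry_state=True))
    print("reset:  ", synthesize(text, carry_state=False))
```

In a real recurrent model the same pattern would mean passing the previous sentence's final state in as the initial state of the next call, so context from earlier sentences can influence how the next one is rendered.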

SSML may not be expressive enough for your application.

Intelligent_Rough_21 OP t1_j3lkkbq wrote on January 9, 2023 at 12:10 PM

Thanks for the reference, I’ll look into it.