2022-04-12 21:28:36
Silero TTS V3 Finally Released
We have just released a brand new Russian speech synthesis model.
We made a number of promises, and we kept them:
- Model size reduced 2x;
- New models are 10x faster (!);
- We added flags to control stress;
- Now the models can make proper pauses;
- A high-quality voice was added (plus unlimited "random" voices);
- All speakers squeezed into the same model;
- Input length limitations lifted, now models can work with paragraphs of text;
- Pauses, speed and pitch can be controlled via SSML;
- Sampling rates of 8, 24 or 48 kHz are supported;
- Models are much more stable: they no longer omit words.
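To give a feel for the SSML control and sample rate options listed above, here is a minimal sketch in Python. It only builds an SSML string and checks a sample rate; it does not load the model. The helper names `build_ssml` and `check_sample_rate` are hypothetical, the tag and attribute names follow the W3C SSML spec, and which subset of them the models actually accept is an assumption, so check the GitHub README before relying on it.

```python
import xml.etree.ElementTree as ET

# 8, 24 or 48 kHz, as listed in the post.
SUPPORTED_SAMPLE_RATES = (8000, 24000, 48000)


def build_ssml(sentences, pause_ms=500, rate="medium", pitch="medium"):
    """Wrap sentences in <speak>/<prosody>, separated by <break> pauses.

    Hypothetical helper: the tags come from the W3C SSML spec, and the
    exact values the models accept are an assumption.
    """
    speak = ET.Element("speak")
    prosody = ET.SubElement(speak, "prosody", {"rate": rate, "pitch": pitch})
    prosody.text = sentences[0] if sentences else ""
    for sentence in sentences[1:]:
        # Each pause is an empty <break/> element; the next sentence
        # follows it as the element's tail text.
        pause = ET.SubElement(prosody, "break", {"time": f"{pause_ms}ms"})
        pause.tail = sentence
    return ET.tostring(speak, encoding="unicode")


def check_sample_rate(sample_rate):
    """Return sample_rate if it is one of the rates named in the post."""
    if sample_rate not in SUPPORTED_SAMPLE_RATES:
        raise ValueError(
            f"sample_rate must be one of {SUPPORTED_SAMPLE_RATES}, got {sample_rate}"
        )
    return sample_rate


if __name__ == "__main__":
    ssml = build_ssml(
        ["First sentence.", "Second sentence."],
        pause_ms=300,
        rate="slow",
        pitch="high",
    )
    print(ssml)
    print(check_sample_rate(48000))
    # Assumption, based on the linked repository: the resulting string
    # could then be passed to the model roughly like
    #   model.apply_tts(ssml_text=ssml, speaker=..., sample_rate=48000)
```

Building the markup with an XML library rather than string concatenation keeps special characters in the text escaped, which matters once whole paragraphs are passed in.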
Next steps:
- Release models for the CIS languages, English, some European languages and Indic languages
- A further 2-4x speedup
- Updated stress model
- Phoneme support and built-in voice transfer
Links:
- GitHub - https://github.com/snakers4/silero-models#text-to-speech
- Colab - https://colab.research.google.com/github/snakers4/silero-models/blob/master/examples_tts.ipynb
- Russian article - https://habr.com/ru/post/660565/
- English article - https://habr.com/ru/post/660571/