
ClariNet: Parallel Wave Generation in End-To-End Text-To-Speech - kainan
Baidu Silicon Valley AI Lab has reached a new milestone in speech synthesis with the release of ClariNet, the first fully end-to-end TTS model that converts text directly into an audio waveform with a single neural network. It generates all samples of a waveform in parallel.
======
kainan
Blog post: [http://research.baidu.com/Blog/index-view?id=106](http://research.baidu.com/Blog/index-view?id=106)

