
ForwardTacotron – Generating speech without attention - datitran
https://github.com/as-ideas/ForwardTacotron
======
We've just open-sourced our first text-to-speech project! It's also our first
public PyTorch project. Inspired by Microsoft's FastSpeech, we modified
Tacotron (forked from fatchord's WaveRNN repo) to generate speech in a single
forward pass without using any attention. Hence, we call the model ⏩ ForwardTacotron.

The model has several advantages:

* Robustness: No word repeats or attention failures on complex sentences

* Speed: Generating a spectrogram takes about 0.04 s on an RTX 2080

* Controllability: You can control the speed of the speech synthesis

* Efficiency: No attention is used, so memory grows linearly with text length
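The attention-free trick is length regulation in the style of FastSpeech: each encoder output is repeated according to a predicted duration, and scaling those durations changes the speaking rate. Here is a minimal sketch of that idea; the function name and tensor shapes are illustrative, not the repo's actual API:

```python
import torch

def length_regulate(encoder_out, durations, alpha=1.0):
    """Expand encoder outputs to mel-frame length by predicted durations.

    encoder_out: (T_text, D) tensor of per-token encoder outputs
    durations:   (T_text,) predicted duration (in mel frames) per token
    alpha:       speed factor; alpha < 1.0 gives faster speech,
                 alpha > 1.0 gives slower speech
    """
    # scale and round durations to integer frame counts
    scaled = torch.round(durations.float() * alpha).long()
    # repeat token i's encoding scaled[i] times along the time axis
    return torch.repeat_interleave(encoder_out, scaled, dim=0)
```

With durations `[2, 1, 3, 2, 2]` a 5-token input expands to 10 mel frames at `alpha=1.0` and to 20 at `alpha=2.0`, which is how a single forward pass yields a controllable-speed spectrogram without any attention alignment.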

We also provide a Colab notebook to try out our pre-trained model (trained for
100k steps on LJSpeech), plus some audio samples. Check it out!

* Github: [https://github.com/as-ideas/ForwardTacotron](https://github.com/as-ideas/ForwardTacotron)

* Samples: [https://as-ideas.github.io/ForwardTacotron/](https://as-ideas.github.io/ForwardTacotron/)

* Colab notebook: [https://colab.research.google.com/github/as-ideas/ForwardTac...](https://colab.research.google.com/github/as-ideas/ForwardTacotron/blob/master/notebooks/synthesize.ipynb)

