The models are mostly user-trained and user-submitted, and they vary widely in quality. Some of them are great, but many of them are not.
This one is trained on graphemes only (no Arpabet phonemes, which hurts pronunciation) and doesn't have a fine-tuned vocoder (which causes spectral distortion "noise"). There are no notes on the dataset source or quality.
It's about what I'd expect for a Jpow model.