I read recently about big advances in AI speech production. I assume that is all still proprietary?
(And note, the voice is not bad compared to what computer speech synthesis used to be. I went back and listened again and was quite surprised by some human-like intonations. But, again, I couldn't listen to an article that would be longer than a minute or two.)
Neural voices are 4x the price of standard ones though! Standard is $4 per 1 million characters, neural is $16. https://aws.amazon.com/polly/pricing/
For what it's worth, Google, Amazon, and basically everywhere else I've looked charge $4/million characters for standard voices or $16/million for the higher quality voices.
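To put those prices in per-article terms, here is a rough back-of-the-envelope sketch. The two rates are the ones quoted above; the 30,000-character article length is purely an assumption for illustration, and `tts_cost_usd` is a made-up helper name.

```python
# Prices per 1 million characters, as quoted in the comments above.
# Check the provider's pricing page for current values.
PRICE_PER_MILLION_USD = {"standard": 4.00, "neural": 16.00}


def tts_cost_usd(num_chars: int, voice_type: str) -> float:
    """Return the estimated cost in USD of synthesizing num_chars characters."""
    return num_chars / 1_000_000 * PRICE_PER_MILLION_USD[voice_type]


# Assumption: a long-form article is roughly 30,000 characters.
article_chars = 30_000
for voice_type in PRICE_PER_MILLION_USD:
    cost = tts_cost_usd(article_chars, voice_type)
    print(f"{voice_type}: ${cost:.2f} per article")
```

Under that assumption, a single long article costs cents rather than dollars either way, so the 4x gap mostly matters at volume.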
I'm currently just using Joanna since I can get some cost savings by not having to run Text to Speech on the same article multiple times, so long as everyone gets the same voice. But I'm considering offering a limited selection of voices for non-American listeners soon.
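The saving described here amounts to caching audio by article and voice. The following is only a minimal sketch of that idea, not how the service actually does it: `cache_key`, `get_audio`, and the `synthesize` callable are all hypothetical names, and `synthesize` stands in for whatever TTS call is really used.

```python
import hashlib
from pathlib import Path
from typing import Callable


def cache_key(article_text: str, voice: str) -> str:
    """Derive a stable key from the voice name and the article text."""
    digest = hashlib.sha256()
    digest.update(voice.encode("utf-8"))
    digest.update(b"\0")  # separator so voice and text cannot run together
    digest.update(article_text.encode("utf-8"))
    return digest.hexdigest()


def get_audio(
    article_text: str,
    voice: str,
    synthesize: Callable[[str, str], bytes],  # hypothetical TTS call
    cache_dir: Path,
) -> bytes:
    """Return cached audio if present; otherwise synthesize once and store it."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / f"{cache_key(article_text, voice)}.mp3"
    if path.exists():
        return path.read_bytes()
    audio = synthesize(article_text, voice)
    path.write_bytes(audio)
    return audio
```

Because the voice is part of the key, each extra voice on offer means one more paid synthesis per article, which is presumably why a single shared voice is the cheapest setup.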
I have friends and family who would easily pay $100+/year to use this (compared to the $36/year you're charging now). The British neural option above is much more palatable than the current default, imo.
All in all, nice work!
For anyone interested in this, send me an email at email@example.com saying you’d like to hear when I release the higher quality voices at a higher price.
Customer service and newscast
I like Polly from an integration perspective but I’ve been searching for something as good as Azure Jenny.
Anyone found something as good or better?
It probably pays to get a foot in the door with this idea now because if the prices do eventually collapse (or, better, high quality open source TTS makes huge advances) you'll be well positioned to take advantage of it.
Edit: Oh, this new service provides value in that it generates a custom podcast feed using the articles, which you then subscribe to.
For people who want a curated selection of high-quality articles read by best of breed narrators, check out Audm. It isn’t going to be on-demand or cover everything that one would want, but the source material is quite good (The Atlantic, New York Times, ProPublica, the New Yorker, Wired, etc.) and the narration is notable because most of the roster are well-known audiobook readers, which is different from similar services or even the stuff Apple is doing with Apple News+. For me, I’ve found that not only does having a human narrator make a huge difference, but having a good reader is a game changer. I’ve been a subscriber for a few years and it’s one of the most high-value subscriptions I have.
I read a lot — at least four or five books a month and hundreds of pages of blogs, magazines, scientific papers, etc. — but I really do like to listen to books or news when I have the chance. I can read about twice as fast as I can listen, but it isn’t always ideal.
In fact, apart from a few quirks (a pause at the "." of Mr. for instance), the iOS TTS engine sounds almost pleasant to me. It's nowhere near a good narrator, of course, but good enough if there is no poetic value in the words.
I use the Alex voice on my MacBook with a bookmarklet script to both read any web page and highlight text as it is being read. Special integrations for reddit and HN to skip reading user names and chrome. I used to have a version of this script work in PDF files as well, but recently that route has been closed. A nice UX feature is to trigger reading by clicking on the desired word, so you can easily pause and resume.
I currently use https://read2me.online/ which uses the ML TTS voices from one of the big cloud providers (Google, I think). I then have a custom RSS file hosted on an S3 bucket.
I hand crank the whole system when there's something new I want to listen to.
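For anyone wanting to take the hand cranking out of the RSS part, a podcast feed is just RSS 2.0 with an `enclosure` element per item pointing at the audio file. Here is a minimal sketch using only the Python standard library; `build_feed` and the episode dictionary keys are made-up names for illustration, and a real podcast app may want extra tags (iTunes ones, for example) that are not shown.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_feed(title: str, link: str, description: str, episodes: list) -> bytes:
    """Build a bare-bones RSS 2.0 podcast feed and return it as UTF-8 bytes.

    Each episode is a dict with the (hypothetical) keys:
    title, audio_url, size_bytes, published (a timezone-aware datetime).
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description

    for ep in episodes:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = ep["title"]
        ET.SubElement(item, "guid").text = ep["audio_url"]
        # RSS expects RFC 822 style dates, which format_datetime produces.
        ET.SubElement(item, "pubDate").text = format_datetime(ep["published"])
        # The enclosure is what podcast clients actually download.
        ET.SubElement(
            item,
            "enclosure",
            url=ep["audio_url"],
            length=str(ep["size_bytes"]),
            type="audio/mpeg",
        )

    return ET.tostring(rss, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    feed = build_feed(
        "My reading list",
        "https://example.com/feed.xml",  # placeholder URL
        "Articles read aloud by TTS",
        [
            {
                "title": "Some article",
                "audio_url": "https://example.com/audio/some-article.mp3",
                "size_bytes": 1_234_567,
                "published": datetime(2021, 1, 1, tzinfo=timezone.utc),
            }
        ],
    )
    print(feed.decode("utf-8"))
```

The returned bytes can be written to a file and uploaded to the bucket alongside the audio, and the feed URL is then what you subscribe to in a podcast app.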
Doesn't seem to be working. Putting a link returns "Bad request"; uploading a .txt file returns "Bad referrer/request type".
Here's the error from file upload:
Here's the error from using a link:
I thought that maybe listenlater.fm would record a high-quality reading from the most sent-in articles a day (do things that don't scale).
However, it apparently uses text-to-speech :/
So yeah, thanks for making this. I went for premium right away. I assume it's good, but in general I want to support audio-based products :)