
I really can’t wait until voice cloning means we get a version of audiobooks read in the author’s voice. Of course it will never be quite as good as them reading it themselves, but I think the author’s voice adds something that voice actors can’t: they come across as too generic and too affected in their pronunciation for me to connect with.



What the author adds, even if they're not also a trained/well-practiced voice actor, is that their inflection exactly matches how they meant the words in the book to be spoken/understood.

AI isn't going to be able to do that. As good as it may get, it won't be able to read the mind of the author. It's going to be even more generic than a human reader.


Exactly, the improvement will be in rerecording terrible readings into something enjoyable or at least inoffensive. That and personalization so you can choose the voice that you prefer.


Will this be the case when/if books become largely written with the help of AI? Let alone when AIs start writing the books themselves.


There are already a lot of AI-generated books on Amazon Kindle. Books for children (which require less text) are especially popular. I think that's a big problem, since LLMs are pretty good at imitating the style (good enough to fool a potential buyer) but don't seem to build the story on some sort of good message for the child.


It's an interesting question, but I don't think the models are trained to be spoken-language first like humans are. I think we all ("we" meaning "people who communicate by speaking," because I'm less sure if this would be true of sign languages) inherently think of writing as a somewhat lossy graphical representation of speech (which is itself an extremely lossy form of telepathy).

But the LLMs don't, do they? If anything, they're written-first, or even on some deep level, binary-first. I don't think what they write even has "a way they meant the words to be emphasized in speech."


Seems like a lot of guessing. I'm not convinced AI aren't "thinking" in the same manner we are. Eventually we'll have models trained on speech only, or modes of expression we can't fathom. Humans have no moat.


I wonder if AIs would even decide to do things in the same way we do. Most of what humans do has come from generations of having to operate within constraints that change over time. AI gets to leapfrog those constraints and operate under a whole different kind.

Why would we assume what comes from them will even aspire to "being like humans"?

The number of reasons AIs might not add the same things to an audio version of a book (the context we're talking about in this thread) is essentially infinite. It seems vastly more likely that they won't add what the author adds than that they will.

Humanity may not have a moat, but each individual human does, especially when it comes to art, where I'd include writing.


If humans have unique capabilities individually, we would have them collectively as well. I have yet to see a single argument that any biological process can't be replicated or synthesized. Until there is such an argument, it's special pleading.

I can't say anything about an AI's aspirations, but the fact that we're imbuing them with all of our collective data means they will be skewed to perceive the world similarly to us, at least initially.


+1000


So you just need a sample of the author saying the "odd" words.


Odd, because I actually worry about this. I don't see why you'd want your books read by the author. Trained voice actors do a much better job, and can modulate their voices based on tone.

Autobiographies? Fine, but those are usually read by their authors anyway.


If you think that a voice actor reading an audio book is too generic then I've got bad news about an AI trained on the author's voice...


I was hoping it would be voice transfer so the voice actor would give all the intonation and emotion and the AI would take that and make it sound like the author. Reading text with AI is getting better but yes it’ll be worse for a long time.


I have nearly no desire to have my book read by the author. They are good at writing, and an audiobook is not simply “reading” the words on the page. Maybe something like Descript that the author can use to tweak pronunciation after it’s narrated, but I don’t want the author’s voice.

I would like to train a model on Allyson Johnson’s voice (she narrated the Honor Harrington books) and then use that to re-narrate the 1-2 books in one of the spinoffs (I think it was in the Saganami Island series?) where they used a different narrator (who was horrible).

I also might be interested in using it to clean up the Wheel of Time series where, while it’s the same 2 narrators, they change the pronunciation of various names/words book-to-book. “Moghedien” being the one that stands out most. They pronounce it at least 3 different ways:

* Mo-gid-e-on

* Mo-ga-dean

* Mog-a-din


It's curious, to me at least, why they didn't just go back and fix those themselves later. The early ones were on CD (or tape?), so maybe that's why.


I also wonder that. I'm not an audiobook narrator, but if I were I'd need an audio "library" of names/places/etc. that I could refer back to before reading a passage with a word I can't remember how to pronounce. The source of that "library" could be the author, my previous pronunciations, or both. Without that I have no idea how I would stay consistent.


I think I'd prefer to have options for each audiobook. I have favorite narrators, and find others unlistenable. There are also thousands and thousands of books that will never otherwise be turned into audio format unless an AI is used.


Writing and voice acting are two quite different skills. My experience with author-narrated audiobooks is that there isn’t very much overlap.


Never be as good as a human? I disagree. It seems like it’ll be nailed, with no way to tell from the outside.


Audiobooks in the author’s voice... fine for nonfiction, usually terrible for fiction.



