When I read Rikki Tikki Tavi to my 8 yr old daughter, we play a game. She asks me to change one of the words in the page and she tries to listen and see if she can figure it out. It is mentally taxing at the end of a long day to do that on the fly without pausing to figure out the word to slip in. And, my daughter is very sharp and catches them.
I listened to a few of these. The voice sounds muted at times, as if the reader has a stuffy nose. H.G. Wells was read with a pause in between each period because it "thinks" that each letter boundary is a sentence change, which drove me batty. And, there is zero life in the stories. It might be a good thing to put in front of a kid to put them to sleep, maybe? But, it would not put me to sleep because it is just aggravating to listen to these stories stripped of all life by AI.
Like Louis CK said: "Everything is amazing and no one is happy." I know it is incredible that AI can take in a transcript and produce something that most people would not be able to distinguish from a real human. But, we should ask if you would want to hang out with the voice actor at a party.
> Like Louis CK said: "Everything is amazing and no one is happy."
Everything is not amazing. Sure, things are amazing from a technical perspective. But most tech advancements I think have been harmful to society in the last 30 years or so. It's awesome that computers are so powerful and we have awesome video and photos and can share things so easily. But technology should better lives, not cheapen them, which it often does. Tech is being used to try to replace essential human lived experiences, to inject advertising into them and extract money.
Technology cannot replace the human; it's impossible. No matter how good the AI is at reading the book, it will never replace sitting next to your parent and them reading it. No matter how easy it is to share a video or a photo, it will never replace sitting next to someone and them showing you photos, or better yet being there when the photo was taken.
I forget the exact quote but the thing I've seen making the rounds sums it up pretty well: Computers were supposed to do the work so people could make art and write poetry. Now the computers are making art and writing poetry and I still have to have a job.
In another life I'd love to do voice over work. (I even have a face for radio!) But, instead, technology is being used to avoid even having humans do that type of work. Sure, today it's PG, but they're definitely doing this with an eye to replacing actual voiceover actors.
Every advance in AI is "how can we replace people and save money?" and not "how can people have better lives and work less?" And it's going to continue until it's "what the fuck do we do with all these jobless people who've been replaced?"
As software developers, we know what getting workers to have better lives while working less looks like. There's some sleight of hand at play, though, in the employer/employee relationship (favoring the employer).
> Every advance in AI is "how can we replace people and save money?" and not "how can people have better lives and work less?"
It's not just AI, but technology generally. And it's because when it comes to managing people, organizations for the most part don't actually concern themselves with getting their employees to produce value—that is, whether they are, and how much, and at what cost (to the business) it comes at, and where that measure of productivity lies (objectively) when scored against some rubric. Instead what they make their most immediate concern is whether their employees are exposed to sufficient toil. Look at any example that involves someone accepting a new job with a set of work duties/expectations where they proceed to automate part of their workload and thus provide the same value (or more) in comparison to what they were doing before, or in comparison to their coworkers, or in comparison to whomever would have ended up with the job if the person who did accept and automate it had accepted an offer elsewhere instead: they end up soliciting feedback (or opining themselves) about whether what they're doing is unethical.
This is the mechanism that wealth disparity through concentration of wealth comes from, but everyone (the employer and the employee alike) walks around as if they either don't notice it or—if they do—as if it's wrong when there's a known path for the concentration to flow upward but it isn't happening.
> their most immediate concern is whether their employees are exposed to sufficient toil.
Protestant work ethic, twisted and disfigured through late capitalism
has become a sadistic and wholly disgusting human trait. To impose
moral, intellectual and physical labour on others, not of necessity,
nor to create value, but to serve a system rooted in guilt and a
craving for validation in the eyes of others is about as un-Christian
as can be.
Only until your business is targeted by big corp and goes bust/bankrupt.
Or acquired. Then the wealth flows upwards, employees are cut, to make things more profitable, and the people who originally created something great are not getting much for it. Instead, if they are not let go, they are under new lords, who take a big chunk of the profits.
Or a competitor gets VC funded and, by means of marketing and sales rather than by actually making a better product, dwarfs the adoption of your business's product.
I think there are many reasons why a business can fail, and most of them are not about the amount of created value. The free market is not a rationally acting person.
You haven't responded (clearly) here to anything I actually said. You just posted two short, dismissive comments consisting of glib non-specifics.
If you want to dispute what I'm saying, how about starting with the example I gave (an employee figures out how to automate part of their job, enabling them to either 1. deliver the same amount of value to their employer at a fraction of the effort, or 2. deliver something like 2x–10x or more value, owing to the fact that they've been able to automate it)?
If you automate part of your job resulting in a 2x improvement of your productivity, you have demonstrated a skill that you can sell for more money. That's how you realize the value you created.
> The wealth didn't "flow" to you. You created it.
Er, right. The "flow" here refers to what happens to the wealth after creation.
If after you create it you or your employer then undertake some change to the work arrangement (e.g. imposing a higher quota on productivity—thus allowing your employer to capture the additional value that you created while keeping your net wealth constant, along with the day-to-day toil you the worker are subjected to, probably—or maybe even increasing it), that would be an example of wealth flowing upward.
You created outsized value relative to the needs of the employer. Your (wealthier) employer captured it and enriched themselves. That's wealth concentration.
When buying lunch are you paying the (fixed) price asked by its creator or are you sharing (some of) the future value that hamburger is providing you in nourishment and work energy for the rest of the afternoon?
Listening to and reading books by economists. If you think about it, you'll see it in action all the time. After all, consider yourself. Does wealth "flow" to you? Or do you create value at your job, and exchange that value for your paycheck?
> Every advance in AI is "how can we replace people and save money?"
This is not true now, and also does not have to be true. Instead of giving a "look at the incentives" talk to someone having a bad comment moment, we can be reminded of Doug Engelbart, who said "computer systems can augment human intelligence and team interaction" and specifically NOT "replace humans". As I understand it, in Palo Alto, Doug found great interest among the DoD crowd, a good portion of whom would have a second meeting after his demos and then discuss how they could get back to the important work of replacing people.
Consider the incentives, consider who has an interest in this hype cycle, and sales profits. When you see a US visit to Vietnam this week, with MSFT pitching "social trust" AI services to "ordinary people", does this really sound like trust in the making? Are AI drones in combat really what we need now? Replacing striking Hollywood writers and getting name-brand actors for pennies on the dollar, is that what "we" need?
I do not agree that AI can only replace people. However, there is a lot of short-term profit and control ready for those who do use it that way. Maybe something needs to be done about that?
In my opinion we seriously need to think about providing people with new perspectives, as we replace their jobs / automate them away. We need a social system, that encourages learning at every step in people's lives. A nation should have an interest in getting people back into meaningful jobs and should act according to that interest. The coal mining industry worker, who loses their job, because we no longer want to mine coal? How can we get that person a good new job? How can we make it so that that person gets the necessary qualifications?
We are still (I think in most countries around the world, and at the very least where I live) throwing away an enormous amount of human potential.
We'll invent a third World War long before that happens - to thin the herd and remind everyone using rationing and austerity about how great consumerism is, while creating plenty of jobs rebuilding the industrialized world.
I wouldn't blame folks wanting to work on fast takeoff AI with no human alignment concerns. Heads, the world ends because you've bootstrapped something unsympathetic and more powerful than humanity. Tails, you've bootstrapped something that might be able to overpower entrenched interests, providing a chance at a better societal outcome.
I think the thing that has surprised everyone in this revolution is that the opposite has happened. Musk wasted billions trying to automate vehicle manufacturing while AI is threatening to take the jobs of novelists and graphic designers.
On the other hand, I am reminded of a quote by Christopher Hitchens (from memory), “They say that everyone has a book in them. For most of them, it would be better if it stayed there”.
Some of the films and TV programmes I've watched recently have made me wonder, as I gaze across at the writers on strike who have some legitimate concerns but who have also provided some bloody awful writing, if I wouldn't prefer AI to take over the production of art - it certainly wouldn't be able to produce a messy bed, would it? That'd be a win too.
> I gaze across at the writers on strike who have some legitimate concerns but who have also provided some bloody awful writing
Too often the writing of film and TV is dictated by the producers/studio -- people who have an interest in financial returns, not quality. Those writers would probably love to write their own show, their way, unhindered, and would probably produce something watchable.
Of course, writers subvert their instructions sometimes to great effect. On BSG I believe they were told their show was "too dark" and someone insisted someone have "a birthday party". Which they duly put in, and then had them all die in some kind of terror/bomb incident.
It hasn't been a surprise to anyone in the field. Turns out it's much easier to read digital content in the form of bits than to read real world data. Hardware is harder than software.
These kinds of responses are so easy to write after the fact. Show us a quote from ten years ago then, please, that says that creative writing and art will be among the first things to be automated at a mass scale. Since this
> hasn't been a surprise to anyone in the field
apparently, it should be pretty easy to find such a quote.
> it will never replace sitting next to someone and them showing you photos
It definitely does replace that. It sucks so much to be trapped next to someone showing you their photo album or vacation slides, when you don’t really care, that this became a stock scene in 20th-century comedy TV series and films. Nowadays when people are sharing their photos online, that gives their peers the choice of whether to look or whether to ignore, and that is immensely freeing.
The photo slide show of someone's vacation was a stock scene in comedies. But have you never sat down with family and gone through old photos? Having conversations about where that was, who this was, who this was as a baby? It's a very different and much more personal experience than flipping through Facebook.
Sure have, it’s hellacious. I don’t give a fuck about who that baby was, why would I? Relations who my parents only vaguely remember going somewhere boring that I would never go, or someone’s 12th trip to the same lake, is a great use for Facebook. Most people are crazy boring, if I care I can always ask.
> But most tech advancements I think have been harmful to society in the last 30 years or so.
I thought this was overly cynical until I read your comment. Now I'm not so sure. Has our attention span really become so shot that just being with family has become a chore, and we'd rather our parents just post their life stories into the void that is Facebook?
Sounds like I struck a nerve. Next time you’re telling someone a detailed story about what route you took to dinner, stop and ask yourself if it’s actually that interesting.
Seriously though, technology has nothing to do with this. People have been bored out of their mind by other people for centuries, the only difference is there’s no excuse for having nothing interesting to say now.
You’re talking about _being shown_ photos you don’t care about, while GP is talking about showing your own photos to someone. I agree with the example you’re discussing, though.
Ah, you’re talking about main character syndrome. Showing pictures to someone is one of the cruelest things you can do; you’re probably boring and a terrible storyteller (most people are) but they’re going to feel obligated to not tell you that.
It’s always amazing to me that people almost universally hate other people’s slideshows, and then don’t have the self awareness to realize that they do the exact same thing.
>Showing pictures to someone is one of the cruelest things you can do
So you are saying that friends, family and significant others had better not share old photos and memories with you? Because that is one of the cruelest things to do?
They can share those old photos and memories on social media, where their friends, families, and significant others can choose whether to look or ignore it.
Elevenlabs is a lot closer to compelling audiobook narration (needs a better way to deal with multiple characters in a story without manual use of multiple voices): https://pub-a24da573c61f4b2d905bdebb2d0ecf88.r2.dev/ElevenLa... (an H.G.Wells example I just asked it to read).
I was going to mention ElevenLabs, too. Their samples are very impressive in how the intonation and word stress are varied based on the text’s meaning. Their pricing is a bit high for personal use, though.
(The link you posted seems to have been truncated. Can you try posting it again?)
Yeah, sadly it'd cost about $100 to get a book per month... Not quite competitive with Audible yet, but give it a year perhaps, or a few iterations of the open source models... (fixed the link)
Now you have a pretty good idea how blind people must feel. Yes, a good audiobook should be read by a human. But if you don't have that, speech synthesis is the best or even only thing you can get. And then, many years later, you read a post like yours. And you realize that man is spoiled.
> But, we should ask if you would want to hang out with the voice actor at a party.
I think the question is really “Will I be able to enjoy great books I otherwise would not have experienced?”
For me, it’s not that these are superior or equivalent to parents reading to their kid or to voice actors; it’s whether I’ll listen for free to a book that I wouldn’t be able to afford at $10-30.
Plus lots of books don’t have audiobooks. I’ve a few sitting on my to read list for years on end just cuz there’s no audio. Being able to make one myself with AI would be awesome.
Also, I’ve bought ebooks and want to listen to the book but don’t want to pay again for the voice acting. So I’ve bought a license to the IP and would like to listen rather than read.
Audible is $7.95 a month and you can listen to whatever book you want (like Spotify). If you’re not willing to go even with that in order to listen to an actual human, then maybe yeah, you can try AI.
> Audible is $7.95 a month and you can listen to whatever book you want (like Spotify)
Not true at all. Audible Plus gives you access to a tiny subset of the full library, the rest (which includes all the best titles) need to be purchased separately.
Unless things have changed since I was a subscriber, you get a token every month which can be used to purchase any book from the full library. So it's effectively 1 book a month + a few extra bonuses for $7.95
You don't get a token without paying for the premium plan at $15 a month. Also, don't tell anyone but if you subscribe and then cancel and give the reason that it is too expensive you can often get a reduced price the next few months.
I typically buy at least two titles per month, and the best deal ended up being:
Audible Premium Plus - 1 Credit Every Other Month for $17 ($8.50/mo)
You can buy 3 more credits for $37.99 ($12.66/ea). It’s also worth checking individual titles because quite a few cost less than the credits.
Correction: I guess I actually buy slightly fewer than 2/mo because there’s a plan for that ($22.95/mo) that’s cheaper than the $27.50/mo from my numbers above. I had that one for a while but ended up feeling pressured to use them before they expire.
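A quick check of the arithmetic above, as a sketch only. The prices are the ones quoted in this comment; the variable and function names are made up for illustration, and the assumption that extra credits can be averaged out pro rata across months is mine.

```python
# Prices in USD, as quoted in the comment above.
PLAN_PRICE = 17.00        # Premium Plus, 1 credit every other month
PLAN_CREDITS_PER_MONTH = 0.5
PACK_PRICE = 37.99        # pack of 3 extra credits
PACK_SIZE = 3
TWO_CREDIT_PLAN = 22.95   # monthly plan that includes 2 credits

plan_monthly = PLAN_PRICE / 2
extra_credit_price = PACK_PRICE / PACK_SIZE


def mixed_monthly_cost(titles_per_month: float) -> float:
    """Monthly cost of the small plan plus extra credits.

    Assumes extra credits can be averaged out pro rata over time.
    """
    extra_credits = max(titles_per_month - PLAN_CREDITS_PER_MONTH, 0.0)
    return plan_monthly + extra_credits * extra_credit_price


# Titles per month at which the mix costs the same as the 2-credit plan.
break_even = PLAN_CREDITS_PER_MONTH + (TWO_CREDIT_PLAN - plan_monthly) / extra_credit_price

print(f"small plan per month:  ${plan_monthly:.2f}")
print(f"extra credit price:    ${extra_credit_price:.2f}")
print(f"two titles per month:  ${mixed_monthly_cost(2):.3f}")
print(f"break-even titles/mo:  {break_even:.2f}")
```

Under those assumptions, two titles a month on the small plan comes to about $27.50, and the mix only beats the $22.95 plan below roughly 1.6 titles a month, which matches the correction.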
You probably won't be interested since it's even more pressure to use them before they expire, but there are also annual plans, which are even cheaper if you can be happy using 12 (or 24) tokens within 12 months (you get them at the start and they expire at the end of the year):
Audible Premium Plus Annual - 12 Credits
$149.50/year (way cheaper in UK: £69.99/year)
Audible Premium Plus Annual - 24 Credits
$229.50/year (£109.99/year)
Although, as soon as I'm logged in with my account (UK) which had subscribed in the past but isn't currently, it doesn't seem to be giving me any options except to start a 1 month free trial for 1 token/month, not sure if other options aren't available or just extremely well hidden...
edit: no it is available for my account, though I'm going to remain a non-subscriber and keep using my local digital library :)
Seth Godin did a whole Akimbo podcast that was written by ChatGPT, and the audio was AI generated. The voice was spot on, the content and delivery were dead. I almost fell asleep listening to it, which is NEVER the case for any other episode of Akimbo I've listened to.
> H.G. Wells was read with a pause in between each period because it "thinks" that each letter boundary is a sentence change
This is why I'm a firm "two spaces after the period" guy. Makes it unambiguous the difference between the abbrevs. period and the sentence-end period. Otherwise you get sentences like "Let's not forget that Dr. Principal does not care about this." which can be read in two valid ways.
Of course some style guides would tell you not to put a dot after "Dr" because "r" is the last letter of "Doctor". Similarly, the abbreviation of "Saint" would be "St", while the abbreviation of "Street" would be "St.", according to those style guides.
Meanwhile the GB military style guide says never to use a dot after any abbreviation, I think.
Also, the style guides I'm familiar with prescribe "H. G. Wells", rather than "H.G. Wells", but "H.G.W." if you're abbreviating all of the words.
None of this is of much interest to anyone who isn't an editor but I thought I'd mention it anyway.
Right. That's probably the most common historical form, and is a good example of how the punctuation for sentence-ends and abbreviations is often the same - period and then single-space.
This trick doesn't work across linebreaks (unless you adopt a rule like "treat the spaces in the nouns as non-breaking and do not permit a linebreak for anything that isn't a sentence boundary").
Emacs does (or did) exactly that, perhaps by default: I think I had to disable it once because it was annoying me ... (setq sentence-end-double-space nil)?
Fair point, the sentence I invented off the top of my head isn't perfectly grammatically correct but it's close-enough that it shows the ambiguity problem. It's a lot to ask text-to-speech and typesetting programs to figure out contextually which periods are abbreviations and which periods are end-of-sentence, and so having a hard text cue like double-space would help. Then typesetters would have a hard cue of when to replace the space with a thin-space (which is supposed to happen in the case of something like "H. G. Wells").
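To make the ambiguity concrete, here is a minimal Python sketch. It is only an illustration with hypothetical function names, not a description of how any real TTS or typesetting program segments text. It compares a naive split on "period then whitespace" with a split that accepts only "period then two or more spaces" as a sentence end.

```python
import re


def split_naive(text: str) -> list[str]:
    """Treat every period followed by whitespace as a sentence end."""
    return [s for s in re.split(r"(?<=\.)\s+", text) if s]


def split_double_space(text: str) -> list[str]:
    """Treat only a period followed by two or more spaces as a sentence end."""
    return [s for s in re.split(r"(?<=\.) {2,}", text) if s]


# Two sentences, separated by a double space; abbreviations use a single space.
text = "I read H. G. Wells.  Let's not forget that Dr. Principal does not care about this."

for sentence in split_naive(text):
    print("naive :", sentence)
for sentence in split_double_space(text):
    print("double:", sentence)
```

The naive version breaks after "H.", "G." and "Dr.", which is the kind of pause complained about upthread. The double-space version only works if the author typed the cue and nothing downstream collapsed it, and as noted above it still fails when a sentence ends at a line break.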
How does it feel to have websites and books and newspapers and practically every other place silently ignore your double spaces and treat them as a single space?
The sibling poster is right, the quality varies. But the upper end of the quality range is really quite good. One of the best-read audiobooks I've ever heard was a Lit2Go edition of Pygmalion. And, for that matter, one of the worst-read audiobooks I've ever heard was an edition of an extremely famous and commercially successful book that I bought on Audible.
Are you a speciesist? Why should we place more value on an audiobook that's read by a human?
Sorry, just joking. But here's a reason: these things were not quality-checked at all. Click on Moon Voyage by Jules Verne and be greeted with a very human-like voice reading a numbered list of "other works by the author" in an extremely awkward fashion that's probably caused by how the .TXT file is organized.
Not a good idea then. The Librivox experience turned me away from professionally read audiobooks for far too long.
Amateur readers are hit & miss. A lot of professional readers are actors or have a lot of experience. There is a reason people do pay for professionally read books instead of electronic reading or Librivox only.
I was kind of hoping this was going to be human beings contributing read-aloud versions of Gutenberg content. Since it’s not, I’ll propose a cool project. Raise money to enlist high quality voice actors to create audiobooks from Gutenberg. Release these audiobooks to the world for free. Which books come first could be voted upon. As someone who has used TTS a lot in recent projects, I’m not excited about listening to AI read a book to me. It feels soulless.
>As someone who has used TTS a lot in recent projects, I’m not excited about listening to AI read a book to me. It feels soulless.
AI TTS is still uncanny valley enough to distract. I prefer even more soulless traditional TTS, which sounds "neutral" after habituation, to the point where my brain can start layering on characterization as if I were reading. AI TTS feels like listening to a mediocre voice actor, where it's hard to overwrite their creative choices, so I'm just left disappointed and annoyed.
I agree completely! I kinda like the neutral tone of a soulless robot when it knows how to stay out of the way. Far better than a bad AI _or_ a poor human reading.
I have used TTS in the past and in the last few years there has been a quantum leap in TTS quality. Another such leap in the next few years and it will dominate the audiobook scene for good or bad.
It might be worse than human narration, but at some point the economics become so lopsided that its dominance is inevitable. One good thing I can see coming out of that will be an abundance of audiobooks of copyright expired books.
Are the economics actually better, or do they look better due to a lack of quality control? Because no TTS, even the most current AI ones, is perfect. They need corrections, which involves a human's time. And it's time that dictates prices, not skill (which largely reduces time).
The key is just which time is faster. If you are able to just listen to it once, and note a few errors, and slightly adjust, it may still be much faster to use AI.
Based off Apple’s advertised times to produce AI audiobooks, the times are comparable. AI is not running quickly nor inexpensively for this task it seems.
Does anyone know of any TTS available now that doesn't completely muck up foreign words? I know you can make custom pronouncing dictionaries to use with some of the open source ones, but I wonder if any of the more modern systems are good for this. I have been listening to the English news podcast from a Japanese newspaper that is made with TTS, and it gets its one job, pronouncing Japanese names and places, completely and jarringly wrong.
Shameless plug - if you download lots of audiobooks and need help organizing them and figuring out which to listen to next, check out Audiobook Locker: https://gitlab.com/fonner/audiobook-locker. It's a desktop app (built with Tauri) that scans your audiobook folder and lets you sort, search and tag your audiobooks.
Sounded better than the handful of random corresponding Librivox recordings I listened to in order to compare them.
To be honest, a lot of people go to great lengths to make Librivox recordings without having the skills to read aloud.
I agree. I love the idea of Librivox, but the volunteers vary widely in quality.
Some are non-native English speakers, some have lisps or other articulation problems, some have other marks of fluency deficiencies, some have under- or over-dramatic intonation, etc.
And even if they're perfect voice actors, often their microphone setups are sub-par, and it comes through in the recording.
Frankly, these AI voices are now at a level where the few mistakes they make are easier to forgive than some of those issues from human readers.
That said, the final hurdle -- giving them the brains to know when to skip or resolve hiccups in the source material, such as typos, formatting issues, or text not intended to be read aloud -- is going to be very hard to overcome.
It's getting more passable. As someone who has listened to a lot of TTS at high speed for years, I eventually adapted my brain to it, and now it feels similar to physical reading with subvocalization, where I can adjust the voices of characters. It's occasionally even preferable, e.g. for the many overproduced podcasts these days, where I just TTS the transcript.
Great news. It seems there is still much room for improvement, though. E.g. in “A Short History of the World” by H. G. Wells, the AI reads Darius I and Charles V as “Darius Eye” and “Charles Vee”. Open and closed brackets in sentences are not read out. The intonation is also a bit unnatural. But it is intelligent enough to parse 1,782 as 1782 rather than two numbers.
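One way a text-normalization step might deal with the regnal numerals mentioned above, as a rough sketch rather than anything the project is known to do: expand a Roman numeral that follows a known ruler's name into an ordinal before the text reaches the TTS engine. `REGNAL_NAMES`, `ORDINALS` and both function names are hypothetical. The name list is an assumption that keeps the pronoun "I" in a sentence like "Then I went home" from being rewritten, and it would still misfire on something like "Charles I think is right".

```python
import re

# Hypothetical, caller-supplied list of names that take regnal numbers.
REGNAL_NAMES = {"Darius", "Charles", "Henry", "Louis"}

ORDINALS = {1: "First", 2: "Second", 3: "Third", 4: "Fourth", 5: "Fifth",
            6: "Sixth", 7: "Seventh", 8: "Eighth", 9: "Ninth", 10: "Tenth"}

ROMAN_VALUES = {"I": 1, "V": 5, "X": 10}


def roman_to_int(numeral: str) -> int:
    """Convert a numeral made of I, V and X to an integer (no validation)."""
    total = 0
    for ch, nxt in zip(numeral, numeral[1:] + " "):
        value = ROMAN_VALUES[ch]
        # Subtractive notation: a smaller value before a larger one is subtracted.
        if nxt in ROMAN_VALUES and ROMAN_VALUES[nxt] > value:
            total -= value
        else:
            total += value
    return total


def expand_regnal_numerals(text: str) -> str:
    """Rewrite 'Charles V' as 'Charles the Fifth' for names in REGNAL_NAMES."""
    def replace(match: re.Match) -> str:
        name, numeral = match.group(1), match.group(2)
        number = roman_to_int(numeral)
        if name not in REGNAL_NAMES or number not in ORDINALS:
            return match.group(0)  # leave anything unrecognized untouched
        return f"{name} the {ORDINALS[number]}"

    return re.sub(r"\b([A-Z][a-z]+) ([IVX]+)\b", replace, text)


print(expand_regnal_numerals("Darius I and Charles V"))
print(expand_regnal_numerals("Then I went home"))
```

A real system would need a much larger name list or some context model, which is part of why this class of error is hard to eliminate.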
Another problem is that the audio clips are not broken into sessions. There is no way to locate the beginning of a chapter, for instance.
We're ranking them using the download count, and also this prompt to chatGPT (it's primarily for language learners):
"Is this text engaging and interesting for a modern reader, someone not into fine literature? Rate the text excellent, good, ok or poor. I don't want crusty, flowery, contorted language, talking about buttons and mannerisms and the hue of the sky etc."
Then, we're rewriting the ~1000 most popular books using ChatGPT to modernise/simplify the text.
Using some markdown as an internal format, drawing from the Gutenberg plain text and HTML formats, this will go to a GitHub repo shortly.
There are translations, and then we need to look at the current best TTS voices.
They carefully curate and copy-edit their books, and go for quality over quantity. I think that is probably the right choice. We already have free access to an effectively infinite amount of mediocre content on the internet.
I did check that out. The thing is, the criteria are a little different. This is for learners who need to practice reading English. The scripts surface a lot of material in Gutenberg that otherwise would be time-consuming to find: 1950s science fiction that still reads well, or story books for children. Someone could certainly call these books 'mediocre', but I think that's a bit adjacent to what we're trying to do here. Whether this idea is worth the effort relative to other sources of material is still undecided.
Just for fun, here's what happens to Pride and Prejudice:
User: Rewrite and simplify the following text for a modern audience: "It is a truth universally acknowledged, that a single man in possession of a good fortune must be in want of a wife. However little known the feelings or views of such a man may be on his first entering a neighbourhood, this truth is so well fixed in the minds of the surrounding families, that he is considered as the rightful property of some one or other of their daughters."
ChatGPT: "People generally believe that a rich single guy must be looking for a wife. Even if we don't really know what he's thinking when he moves to a new area, everyone assumes he's up for grabs by one of the local girls."
This is a cool project. I'm learning French and am always on the lookout for sources of books -- and especially audio -- at the right level.
To others questioning the need for rewriting: language changes over the course of a century! It's no problem for me, as a native English speaker, to read English from around the turn of the 20th Century. However, it increases the degree of difficulty for me to try to read French or Spanish from the same time period. Also, to get through an old book I have to learn a bunch of useless words that have fallen into disuse; it's much more effective for me to learn the version of the language that is relevant in today's world.
It looks like the French TTS isn't working properly yet? At least on the couple of titles I tried. The selection of titles you have listed looks great.
It's for students learning English. The top 1000 will be rewritten and be available in addition to the original version, and about another 10,000 available in original only (selected using vocabulary frequency analysis and the chatGPT query).
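As a rough idea of what a vocabulary frequency filter could look like (a sketch under my own assumptions, not the project's actual scripts): score each text by the fraction of its words that appear in a list of common words, and keep the texts above a cutoff. The word list, the 0.9 threshold and all function names here are hypothetical.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lowercase the text and pull out runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def coverage(text: str, common_words: set[str]) -> float:
    """Fraction of tokens in text that appear in common_words."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    known = sum(1 for token in tokens if token in common_words)
    return known / len(tokens)


def select_readable(books: dict[str, str], common_words: set[str],
                    threshold: float = 0.9) -> list[str]:
    """Titles whose coverage meets the (arbitrary, assumed) threshold."""
    return sorted(title for title, text in books.items()
                  if coverage(text, common_words) >= threshold)


# Tiny made-up word list and texts, just to exercise the functions.
common = {"the", "man", "is", "in", "a", "house", "big"}
books = {
    "Easy": "The man is in a big house.",
    "Hard": "The dowager lingers, withering his revenue.",
}
print(coverage(books["Easy"], common))
print(select_readable(books, common))
```

In practice the common-word list would come from a frequency-ranked corpus, and the cutoff would be tuned to the learners' level.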
Midsummer Night's Dream:
THESEUS
Now, fair Hippolyta, our nuptial hour
Draws on apace; four happy days bring in
Another moon: but, O, methinks, how slow
This old moon wanes! she lingers my desires,
Like to a step-dame or a dowager
Long withering out a young man's revenue.
HIPPOLYTA
Four days will quickly steep themselves in night;
Four nights will quickly dream away the time;
And then the moon, like to a silver bow
New-bent in heaven, shall behold the night
Of our solemnities.
Rewritten:
THESEUS
Now, dear Hippolyta, our wedding day is approaching quickly. In just four more happy days, a new moon will arrive. But oh, it feels like this current moon is taking forever to disappear! It's holding back my excitement, like an old, slow stepmother or a widow who keeps a young man's money for far too long.
HIPPOLYTA
Don't worry, Theseus. Four days will pass by in no time, and then four nights will also quickly slip away. After that, the new moon will shine brightly in the sky, like a freshly bent silver bow, and it will witness the night of our special celebration.
Hmm. Shakespeare is the extreme case. It does make it more accessible. The style is lost. shrug
It's amazing that a decade ago I'd have called you an exceptionally demented individual for doing this, but these days there's so much stupid unconscionable shit going on with AI that it hardly stands out.
floren: In epochs bygone, a mere decade hence, I would have proffered thee the epithet of an egregiously unhinged denizen, were thou to embark upon such an endeavor. However, in these current times, wherein the domain of artificial intelligence doth teem with preposterous and morally indefensible machinations, thy conduct emerges as naught but a faint ember in the vast tapestry of perplexities that now envelop us.
davidzweig: Hark! Methinks 'tis the act of rewriting, not the sifting, that doth wound thy pride most deeply, and affront thee grievously. Set thine eyes upon the instance presented hence, to glean a clearer understanding of our noble mission, one that doth strive to render these splendid volumes more reachable:
Original:
The other person was a man named O’Brien, a member of the Inner Party and holder of some post so important and remote that Winston had only a dim idea of its nature. A momentary hush passed over the group of people round the chairs as they saw the black overalls of an Inner Party member approaching.
O’Brien was a large, burly man with a thick neck and a coarse, humorous, brutal face. In spite of his formidable appearance he had a certain charm of manner. He had a trick of resettling his spectacles on his nose which was curiously disarming—in some indefinable way, curiously civilized. It was a gesture which, if anyone had still thought in such terms, might have recalled an eighteenth-century nobleman offering his snuffbox.
Winston had seen O’Brien perhaps a dozen times in almost as many years. He felt deeply drawn to him, and not solely because he was intrigued by the contrast between O’Brien’s urbane manner and his prize-fighter’s physique. Much more it was because of a secretly held belief—or perhaps not even a belief, merely a hope—that O’Brien’s political orthodoxy was not perfect. Something in his face suggested it irresistibly.
Rewritten:
The other person was a man named O'Brien. He belonged to a group called the Inner Party and had a very important job that Winston didn't fully understand. When people saw O'Brien, they became quiet, because he was a high-ranking member of the Inner Party.
O'Brien was a big, strong man with a thick neck and a rough, funny, and tough-looking face. Despite his intimidating appearance, he had a certain charm in the way he acted. He had a habit of adjusting his glasses on his nose, which was oddly disarming, and in some way, it seemed polite and refined, like an old-fashioned nobleman offering his snuffbox.
Winston had seen O'Brien about a dozen times over the past few years. He felt a strong connection to him, not just because he was curious about the difference between O'Brien's polite behavior and his tough appearance. It was more because Winston secretly hoped that O'Brien wasn't completely loyal to the government's beliefs. There was something about his face that made Winston believe this, even though it was just a feeling.
---> Orwell already has a clear style that I very much admire. The somewhat minor changes here would make for easier parsing by students, without changing the message much. I can't imagine Orwell would have objected.
I think your message is misplaced and pretty rude.
It throws away the flavor of the original text. Why reword the great works of literature into YA-level blandness? Might as well just read the Cliffs Notes at that point. 1984 isn't a particularly challenging work, either, and I think readers do themselves a disservice if they don't read the original.
Edit: hey I do want to apologize for my tone in the first comment though, it was too much. Sorry.
Our software is used by Koreans, Turks, Brazilians, Hungarians etc. who are trying to learn a foreign language (English). If importing (and rewriting some of) Gutenberg was the best use of time for them, well, I'm not sure. If you don't see a difference in difficulty between the two texts, then I don't think you are able to appreciate the perspective of a language learner.
Well, the filtering prompt could have caused offense to someone who enjoys literature. It was the prompt I used after some iterations. If you go through Gutenberg (~70,000 books), much of it is pretty indigestible to a modern reader, never mind a learner. Personally I'm finding a learning curve with ChatGPT etc. where you have to relearn to directly ask for what you want.
I don't really understand the purpose here, given that literature is significantly aesthetic by nature. Can't students just read contemporary, accessible books if they want something easier?
Literature: "Anything written in print that's not Twilight." (urban dictionary) :)
Yes, they could, but we can't put Harry Potter in the public library of texts for copyright reasons, and licensing material is difficult and not fun; I prefer to write software. Users can paste it in themselves, but then there are extra steps. The center of the project is still a browser extension that runs on Netflix and YouTube, so we sidestep the copyright issue.
If you are looking for short stories, I strongly recommend audio fiction podcasts.
Escape Artists is one of my favorite production houses. The recordings are creative commons licensed, and the authors (and other artists) get paid professional rates:
"Mrs. [pause] Shelly by Lucy Maddox Brown Rosetti"
I expected a bit better than this for a launch of "the next amazing cool thing"; distinguishing between full stops and honorifics seems pretty dang basic. As xrd said, issues like this make the books unlistenable; it's too distracting and weird. didntcheck plugged LibriVox, which is nice if mixed in terms of quality. I'd also plug "libby" for anyone who doesn't have it: check out audiobooks from your library.
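To make the "full stops vs. honorifics" point concrete, here is a minimal sketch of a sentence splitter that refuses to break after a title like "Mrs." or after initials like "H.G.". It is not what the project uses; the abbreviation list and the function name are made up for illustration, and a production system would need a far larger list or a trained segmenter.

```python
import re

# Hypothetical, deliberately short list of titles that end in a period
# without ending a sentence. A real list would be much longer.
HONORIFICS = {"mr", "mrs", "ms", "dr", "prof", "st"}

# Matches initials such as "H", "H.G" or "J.R.R" (the final period is
# consumed by the boundary pattern below, so it is absent here).
INITIALS = re.compile(r"(?:[A-Z]\.)*[A-Z]")


def split_sentences(text: str) -> list[str]:
    """Split on '.', '!' or '?' followed by whitespace, except when the
    period follows an honorific or a run of initials."""
    sentences = []
    start = 0
    for match in re.finditer(r"[.!?]\s+", text):
        words = text[start:match.start()].split()
        last_word = words[-1] if words else ""
        if text[match.start()] == "." and (
            last_word.lower() in HONORIFICS or INITIALS.fullmatch(last_word)
        ):
            continue  # not a sentence boundary, keep scanning
        sentences.append(text[start:match.start() + 1].strip())
        start = match.end()
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences


print(split_sentences("Mrs. Shelley wrote it. H.G. Wells wrote others."))
```

The obvious weakness is that a sentence really ending in a single capital letter ("So did I. Then we left.") will not be split, which is why rule lists like this only go so far.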
I put together a script to read epub books using Coqui TTS and I think the results are not far off from this. It's super quick if you've got a GPU, but it's reasonable too if it's just using CPU to do the text to speech.
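For anyone wondering what a script like that involves, the first half is just getting text out of the EPUB and cutting it into pieces a TTS engine can handle. Below is a rough sketch of that half using only the Python standard library. It is not the commenter's script: the function names are invented, the files are read in archive name order rather than the proper reading order from the EPUB's OPF spine, and the actual TTS call is left out on purpose.

```python
import zipfile
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes, skipping <script>, <style> and <head> content."""

    SKIP = {"script", "style", "head"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def epub_fragments(epub_file):
    """Yield text fragments from every (X)HTML file inside an EPUB.

    Assumption: sorted archive names approximate the reading order.
    """
    with zipfile.ZipFile(epub_file) as archive:
        for name in sorted(archive.namelist()):
            if name.lower().endswith((".xhtml", ".html", ".htm")):
                parser = _TextExtractor()
                parser.feed(archive.read(name).decode("utf-8", errors="replace"))
                parser.close()
                yield from parser.parts


def chunk_text(fragments, max_chars=500):
    """Group fragments into chunks of roughly max_chars characters.

    A single fragment longer than max_chars is kept whole.
    """
    chunk = ""
    for fragment in fragments:
        if chunk and len(chunk) + 1 + len(fragment) > max_chars:
            yield chunk
            chunk = fragment
        else:
            chunk = f"{chunk} {fragment}".strip()
    if chunk:
        yield chunk
```

Each chunk would then be handed to whatever TTS engine you use, and the resulting audio files concatenated.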
This is cool. Narration of audio books is a time consuming process! I agree with some of these comments here about how AI narration can sound robotic though and may not be too pleasant to listen to.
However, for anyone who is, or knows a family member/friend with a certified disability, or is a veteran, there is a free program to listen to a vast collection of audio books (with real narration) provided by the US Government. Check out https://www.loc.gov/nls/ (Braille material too!)
>I agree with some of these comments here about how AI narration can sound robotic though and may not be too pleasant to listen to.
I have encountered readers on LibriVox with such terrible pronunciation that following the story was rather difficult. On the other hand, a robotic voice could work well on some cyberpunk material.
I tried to listen to The Call of the Wild and it was impossible to follow since accents and emphasis on words are all wrong. I could barely understand the story. I guess AI has more work to do.
Yes, the voice sounds very natural and not computer-generated. But it gets a lot of, even simple, pronunciations wrong. There's a long way to go before this is genuinely an enjoyable and useful option.
Upon first impression, this is incredible! Audiobooks have enabled me to enjoy fiction books that I, otherwise, would not have been able to due to time constraints or other commitments. Perhaps, in the near future, AI will be able to make many obscure books that are collecting dust in museums and libraries accessible to the public through audiobooks. That is a future to look forward to.
They have wonderful voice acting skills as well. They can create multiple character voices and personalities. They can detect and express emotions appropriately, without hints from the text.
They'll continue to be highly valued, and far more enjoyable to listen to, for a -long- time.
In sampling a couple I would call these narrations "serviceable" rather than "high quality". My benchmark is the voice of my mom reading Shakespeare and Grahame, with intonation and voice to each character. Perhaps AI authored narration could do that, but these haven't.
Isn't listening to people tell stories fundamental to what we are? Wouldn't you rather be a part of a culture that cares about the difference between listening to a person's voice vs. a bot?
Edit: My concern is audio files ending up in places where they aren't clearly labelled as AI-generated.
Anyone interested in free, well-narrated audiobooks should check out the Classic Tales podcast. I can't really say enough about it. The host is a fantastic narrator and the books range from Plutarch's Lives to Philip K Dick.
Exactly the same scene as I use in my daily life. It feels great to have an audiobook version of any e-book. You don’t have to wait/look for any real person to complete their audio work. I have used this tool [1] to convert and listen to many books.
A good human reader doesn't read as much as he dramatises the book. This conveys ideas and feelings more than what is just written. He takes liberties doing that which makes the narration interesting. I don't know if that's possible in the same way with an AI voice. I read to my kids often and try to dramatise the books in a similar way.
The LibriVox project, which produces contributor-driven audiobooks, is, I think, a more valuable contribution to human culture than AI-generated audio files.
When I'm reading to my kid I can respond to the situation too.
We're currently going through the Chronicles of Narnia, and are finishing up the second book (The Magician's Nephew). I'll do different voices for each character (I'm a professional voice actor). But we've noticed that close to bedtime it can have an enlivening effect so that it keeps him awake, so I'll just read the voices in my normal narrator voice.
Yeah it’s still robotic, I just tried. When enunciating a long string of modifiers in a sentence, such as “A long windy snow filled country road”, it doesn’t know that this is one set of adjectives describing a country road. It’s a dead giveaway that it’s AI. Maybe they can fix this.
I'm sure they will but even if they do and produce something that's exactly the same as a human narrator, there's something intangible that's lost. That's more or less my point. To make it even more intangible, I think the loss is greater to the narrator than to the listener as is the case with many of these AI generated creative pieces.
Thirty-five seconds into the very first link [1] and I hear the phrase "kon-fee de rate". It took me five seconds to realize the AI can't pronounce "confederate".
I'm not even sure what this project is supposed to accomplish. I tried a few books, and gave up within a few seconds. Pauses and punctuations, intonations, the spacing of words, and a whole lot more are really out of whack - it takes me a lot of effort at times to figure out what's being said. After a short while, it was really starting to stress me out so I gave up.
The audio book for Project Hail Mary is brilliantly done with amazing voice acting and even uses effects on Rocky’s voice to emphasize his musicality. Listening to a good audio book is like listening to the perfect film adaptation - it adds to the reading experience.
There’s a long, long way to go for AI to learn emotion before I’d spend 20+ hours listening to a book read by one.
Fantastic progress. Nevertheless, here is something for the internet veterans: Remember Microsoft Sam? MS came a long way to finally do good text to speech: https://youtu.be/3db_4xYahVc?si=SsXKvfHCabQ5rLef
Only professional, human narrators are acceptable to me. With few exceptions (e.g., Jimmy Carter), I can’t listen to an audiobook that was narrated by the author.
The first example I clicked on turned out to be a super racist book! Luckily the narration was repetitive (like a record skipping), atonal, and with prosody that chopped up sentences to the point of near unintelligibility.
I am an avid consumer of audiobooks and I will never pay for or listen to anything AI-generated. Maybe it's just me, I don't know, but just because they have shown that it is technically feasible doesn't mean that there is a market for it. I am skeptical. Listening to audiobooks is already a compromise over reading the book; listening to an AI-generated audiobook sounds to me like a bit too much. But let's see.
Check out audio fiction podcasts. Some do full productions of short stories. Depending on the work, reading the text is a compromise over listening to the reading. For instance, music is extremely important to these two stories by Aliya Whiteley:
Obviously there are lots of short snippets of audio that are machine generated. But, no, at the current state of the art I'm not going to listen to a machine generated audiobook much less pay for it.
I would agree but there is one big exception: books I really want to read but there's no audiobook version.
I have a particular interest in early Mormon history and the history of the western US, and there are some really great books that aren't available as audio. I ended up generating some with AWS, and while the voice annoyed me, I was willing to do it, even though the cost was much higher than a normal audiobook would cost.
I think in reality, the more popular books will get a pro reading, but as long as it's labeled, there will be a market for AI audiobooks.
For now, actual voice actors make audiobook listening way more enjoyable.
Those AI voices are convincing but lack the soul, emotions and art direction of actual voice acting. Think about listening to a book for 8 hours with a monotonous AI voice reading it...
I mostly listen to real audiobooks now so I understand their appeal, but I have also listened to dozens of books using the primitive TTS built into my old Kindle (keyboard model). TTS was rough at first but I knew it could work, because I know blind people use it, so I stuck with it and I found that after a few hours I no longer perceived the awkwardness and it became an effective and satisfying way for me to 'read' books. Brain plasticity is a marvel.
I was quickly put off by the monotone voices. Maybe it’ll improve in a couple of years, but until then I’m sticking to real voice actors. For some books like Greenlights, I reckon you miss out on most of the impact if you don’t hear it in Matthew’s voice.
It seems pretty obvious that, at least at this point, the competition is either people doing this sort of thing as a hobby or (maybe) at race-to-the-bottom wages. (Or not at all, as is largely the case with machine transcription vs. human transcription.)
If I want mediocre text to speech, I have that on my Kindle.