1219 words a minute is a record as far as I know. We measured it explicitly by getting the screen reader to read a passage and dividing the word count by the time it took. I spent months working up from 800 to do it and lost the skill once I stopped (comprehension dropped markedly past 1000, but I was able to program there; still, in the end, not worth it). When I try to describe the required mental state it comes out very much like I'm on drugs. Most of us who reach 800 or so stay there, though not always that fast for e.g. pleasure reading (I do novels at about 400). It's built up slowly over time, more or less explicitly. I did it because I was in high school doing MUDs and got tired of not being able to keep up; it took about 6-8 months of committing to turn the synth faster once a week no matter what, keeping it there and dealing with a day or two of mild headaches. Note that for most blind people these days, total synthesis time per day is around 10+ hours; this stuff replaces the pencil, the novel, etc. Others just seem to naturally do it. You have little choice, it's effectively a one-dimensional interface, so from time to time you find a reason to bump the knob. And that's enough.
Whether and how much the skill transfers to normal human speech, or even between synths, is person-specific. I can't do YouTube at much beyond 2x. Others can. It's definitely a learned skill.
To me, the fact that this does seem to be at least somewhat bidirectional is more interesting than the fact that I can listen fast.
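For anyone who wants to repeat the measurement, here is a minimal Python sketch of the "read a passage and divide" arithmetic. The function name and the sample numbers are made up for illustration, and counting words by splitting on whitespace is an assumption; the original measurement may have counted differently.

```python
def words_per_minute(passage: str, seconds: float) -> float:
    """Word count of the passage divided by the reading time in minutes."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    word_count = len(passage.split())  # assumption: whitespace-separated words
    return word_count / (seconds / 60.0)


# Hypothetical numbers: a 500-word passage that took 30 seconds to read.
passage = " ".join(["word"] * 500)
print(words_per_minute(passage, 30.0))
```

Time the screen reader with a stopwatch from the first word to the last and pass the elapsed seconds in.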
I always wanted to know whether it's true what I heard some years ago: a friend told me that older Danish people are complaining that younger Danes are harder to understand because there has been a trend in recent decades to swallow consonants even more than in the past
(that's something hard to imagine for me as a non-Danish speaker :)
Living in Malmö I also find this quite hard to believe! Besides, isn't old people complaining about the younger generation butchering their language something that happens all the time in all cultures?
It was about how Danish linguists were worried about how the language was getting more unintelligible. The example I remember was about the ever-increasing number of words that are pronounced, basically, “lää'e”: läkare, läge, läger, lager... Those are written in Swedish here; some Dane may translate. (In English: healer, situation, camp [laager], layer...) There were more that I don't remember; in all, I think they mentioned at least half a dozen words that had become basically indistinguishable from each other.
I don't think serious linguists have had the same fear about most other languages.
Edit: Hah, just saw your post on talking faster to other people who have the same audio skills.
However, most blind people I know who do this start hating audiobooks, start hating talks, and generally by far prefer the text option. Audiobooks aren't annoying, but they're below my baud, if you will. Net result: boredom/falling asleep to whatever it is and the need to actively make an effort to listen. Some things which require active listener participation--math lectures for example--are different. I guess the best way I can put it is that speed is inversely proportional to the amount of, I guess let's call it active listening, required.
I've given a lot of thought to this stuff, but we don't really have the right words for me to communicate it properly. A neuroscientist or linguist might, but I'm not either of those.
Also, how does your "baud" vary with your familiarity with the ideas? I can't imagine it's independent. As a ~35 year old programmer with a decade of professional experience and a decade of hobby experience before that, I cracked open SICP for the first time and found almost everything familiar. I had digested the ideas from other sources, so I could read at a "natural" rate. If I had read it as a teenager, it would have been a mindfuck, and I would have taken multiple slow readings to understand. When you talk about numbers like 800, are you talking about writing that challenges you and changes the way you think, or are you talking about stuff you do for a living that is just information you're already primed to accommodate?
With movies I don't bother with them unless they have descriptive audio, at which point you've got music, sound effects, and two somewhat parallel speech streams going on. That's high informational content.
I did an entire CS degree at 800 words a minute. I program in any programming language you care to name (including the initial learning) at that speed as well. For more complicated concepts I stay at that speed, but pause after every paragraph or so to chunk the content as needed. I'm doing this thread at that speed. Pretty much the only time I slow it down is pleasure reading or sometimes articles when I want to go off and do chores while I listen, but even then it's still faster than human speech.
In general I think answering these sorts of questions needs research that, to my knowledge, we don't have. Nothing in my personal experience or background really allows me to give you good definitive answers. The sample size to work with is pretty small and in all honesty there's not a lot of good research around blindness day-to-day in the first place.
That sounds like a similar description to what it's like for people with IQs significantly higher than the average.
espeak -s 800 "Things to say."
I will attempt to remember and find the time to take my demo recording of this on Rust compiler source code that's currently in Dropbox and put it up somewhere more permanent. I doubt Dropbox will care for me much if I allow HN-volume traffic to hit my account. It's espeak using an NVDA fork with an additional voice that some of us like, so vanilla espeak is in the ballpark.
What I don't remember is if vanilla non-libsonic espeak softcaps the speech rate. It might. I believe new versions of espeak integrate libsonic directly, but that old versions just silently bump the speaking rate down if it's over the max. I haven't used command line espeak directly for anything in a very long time.
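One way to check for a soft cap without trusting memory is to synthesize the same text at several rates and compare how long the audio comes out. Below is a rough Python sketch of that idea. It assumes a command-line espeak on the PATH and uses its `-s` (words per minute) and `-w` (write a WAV file) options; the helper names are made up, and I haven't verified the results against any particular espeak version.

```python
import shutil
import subprocess
import tempfile
import wave
from pathlib import Path


def wav_duration_seconds(path: str) -> float:
    """Length of a WAV file in seconds, from its frame count and sample rate."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def espeak_duration(text: str, wpm: int) -> float:
    """Synthesize `text` at `wpm` into a temporary WAV and return its length."""
    with tempfile.TemporaryDirectory() as tmp:
        out = str(Path(tmp) / "out.wav")
        # -s sets the speed in words per minute, -w writes to a file.
        subprocess.run(["espeak", "-s", str(wpm), "-w", out, text], check=True)
        return wav_duration_seconds(out)


# Only runs when espeak is actually installed.
if __name__ == "__main__" and shutil.which("espeak"):
    text = "Things to say. " * 20
    for wpm in (200, 450, 800):
        print(wpm, round(espeak_duration(text, wpm), 2))
```

If the durations at 450 and 800 come out nearly identical, that would suggest the build in question is capping the rate rather than honouring it.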
Libsonic is an optimized library specifically for the use case of screen readers that need to push synths further: https://github.com/waywardgeek/sonic
There is a range slider that maxes out at 450, which is the maximum speed according to the manual.
I tried listening to a Wikipedia article at 450, and I am amazed you can comprehend that. Perhaps that's the equivalent of me visually scanning the text instead of reading; however, when I do that, I tend to focus on interesting parts for long stretches of time. With espeak, how do you focus? Can you pause it at will?
If anyone is curious, here is the NVDA keystroke reference: https://www.nvaccess.org/files/nvdaTracAttachments/455/keyco...
As an interesting sidenote, screen readers have to co-opt capslock as a modifier key, and then there's the fun of finding keyboards that are happy to let you hold capslock+ctrl+shift+whatever.
I find that the maximum understandable rate varies a lot between speakers. For some speakers 2.5x is possible, but just 1.5x for others.
One advantage synths have is that they can more easily control the speed at which words are spoken and the pauses between words independently. When watching or listening to pre-recorded content I often find that I'd want to speed up the pauses more than the words (because speeding up everything until the pauses are sufficiently short makes the words unintelligible).
If someone knows of a program or algorithm that can play back audio/video using different rates for speech and silence, please share.
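I don't know of a ready-made program, but the pause half of this can be sketched fairly simply: split the audio into short frames, call a frame "silent" when its RMS energy is below a threshold, keep every speech frame, and keep only some of the frames in each silent run. The Python below is a minimal illustration under those assumptions. The function names, the frame length, the threshold, and the `keep_every` parameter are all made up, and it only shortens pauses; speeding up the speech itself would need a separate time-stretching step (something like the sonic library linked above).

```python
import math
from typing import List


def frame_rms(frame: List[float]) -> float:
    """Root-mean-square level of one frame of samples in the range -1..1."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def shorten_silence(samples: List[float], frame_len: int = 160,
                    threshold: float = 0.01, keep_every: int = 3) -> List[float]:
    """Keep all speech frames; keep one frame in `keep_every` of each silent run."""
    out: List[float] = []
    silent_run = 0  # number of consecutive silent frames seen so far
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        if frame_rms(frame) >= threshold:
            silent_run = 0
            out.extend(frame)
        else:
            if silent_run % keep_every == 0:
                out.extend(frame)
            silent_run += 1
    return out


# Synthetic demo at an assumed 8 kHz: tone, a long pause, then tone again.
tone = [0.5 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(320)]
pause = [0.0] * 960
shortened = shorten_silence(tone + pause + tone)
print(len(tone + pause + tone), len(shortened))
```

Dropping whole frames like this can produce clicks at the joins; a real tool would probably crossfade, and would need a smarter silence detector for noisy recordings.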
If so, have you considered using an EQ plugin to maybe turn down the harsher high frequencies a few notches? Just a thought.
But more to the point, there is nowhere to really plug that into a screen reader, so we can't try it anyway. The audio subsystems of most screen readers are much less advanced than you'd think.