Hacker News
The Missing 11th of the Month (2015) (drhagen.com)
136 points by shawndumas 10 months ago | 19 comments

This was a really fun read. The explanation for the 1860s error rate peak blew my mind. I wonder how many other characters could generate similar errors and how long it will take for us to notice the patterns.

The only reason anyone noticed the error with the 11th of the month was because we had enough similar phrases (8th, 12th, etc.) that we could make a statistical argument that something was amiss. It would be hard to even detect most other errors in the database.

A lot of manual typewriters lacked both 1 and 0, as l and O served just as well. So there's another.

Oh hello HN. I'm the author of the linked article. Someone texted me that I was on the front page of HN, though I couldn't guess what for. I guess I'll stick around a bit and answer questions.

Note that, since the article is from 2015, the promised follow-up post on the 23rd is already live at http://drhagen.com/blog/the-missing-23rd-of-the-month/ and is fascinating in a different way.

There are some other amusing typographical glitches. For instance, I was once curious about the origin of "badass" so I searched Google Books for usages. I got several hits from the 19th century and was excited at my historical discovery until I checked them and saw Google was just misreading things.

Coincidentally, this past 11th was the first "missing day" of my life, according to the local clocks: I flew out of SFO Friday night, and landed in SIN Sunday morning.

Oh yes. I grew up in the sixties, and I had been pounding on my father's portable Hermes since long before I could actually write. Transitioning to computers in the early eighties, it took me unreasonable amounts of time and effort to unlearn my muscle memory and stop typing l and o for 1 and 0.

Also, in seventh grade I took a typewriting class. Since the start of school I had turned in absolutely all the homework I could get away with in neatly typewritten form, while I got the impression that none of my classmates had ever even gone near such a writing machine. I expected great things!

Alas, I flunked. The only one to do so, and hugely. I still typed better than anyone else, but I never got the hang of touch-typing. Sad to say, I haven't to this day. I seem to be stuck with the method developed by four-year-old me.

I converted to Dvorak out of general geekery a few years ago. But since I did that on a 'normal' keyboard, it essentially forced me to learn how to touch type.

(If you want to give it a try, have a look at Colemak; I hear great things about it on HN every once in a while. Or the Neo2 layout, if you need non-English characters.)

For true geekery, compile a corpus of text that you typed and generate a good layout for your inputting needs using a genetic algorithm or something like that. There is some code on GitHub that you can steal.
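
A minimal sketch of that idea, under loud assumptions: it uses a mutation-and-selection loop (a stripped-down cousin of a genetic algorithm, with no crossover), and the cost function is a toy that only counts keystrokes off the home row. The 10/9/7 row split, the names `cost`, `mutate`, and `evolve`, and all parameters are hypothetical and are not taken from any existing GitHub project.

```python
import random
import string
from collections import Counter

# Assumed slot numbering: 0-9 top row, 10-18 home row, 19-25 bottom row.
HOME_ROW = range(10, 19)

def cost(layout, freq):
    """Toy cost: number of keystrokes that land outside the home row."""
    home = {layout[i] for i in HOME_ROW}
    return sum(n for ch, n in freq.items() if ch not in home)

def mutate(layout, rng):
    """Return a copy of the layout with two letters swapped."""
    a, b = rng.sample(range(len(layout)), 2)
    child = list(layout)
    child[a], child[b] = child[b], child[a]
    return child

def evolve(corpus, generations=200, pop_size=20, seed=0):
    """Keep the better half each generation and refill with mutated copies."""
    rng = random.Random(seed)
    freq = Counter(ch for ch in corpus.lower() if ch in string.ascii_lowercase)
    population = [rng.sample(string.ascii_lowercase, 26) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda lay: cost(lay, freq))
        survivors = population[: pop_size // 2]
        population = survivors + [mutate(rng.choice(survivors), rng) for _ in survivors]
    return min(population, key=lambda lay: cost(lay, freq))

sample_text = "the quick brown fox jumps over the lazy dog " * 5
best = evolve(sample_text)
print("".join(best[10:19]))  # letters the search placed on the home row
```

A serious attempt would swap the toy cost for one that models finger travel, hand alternation, and common letter pairs, and would feed in a much larger personal corpus.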

Yeah, you could do that. Dvorak has the advantage of being comparatively mainstream; it even comes with any random Windows computer.

Also, by now I have a keyboard that speaks Dvorak natively. The Kinesis Advantage. Of course, that keyboard is programmable enough that I could make it speak custom layouts too, or just fix it in the OS.

I converted to Dvorak over a decade ago, and am happy with the results. My fingers stay on the home row much more, and typing rolls more smoothly. (I have a long write-up at https://www.nayuki.io/page/i-type-in-dvorak )

I wonder what Unix and Linux command syntax would look like if it had been developed on Dvorak keyboards.

Commands are perfectly fine. But you can't practically use the hjkl keys for navigation in vim.

I got a lucky break when I learned to type around 1960, after my mom got a Smith Corona Galaxie Electric portable. This typewriter did have the 1 and 0 keys, and better yet, it came with a free Smith Corona 10 Day Touch Typing Course.

That was a set of two LP records and a 20-page textbook printed on heavy card stock, with a plastic ring binder at the top so you could open it tent-style and read it as you listened to the record and typed along.

The first page was ergonomics: how to sit and how to type.

Then you learned the keyboard, row by row. The first two lessons covered the home row, followed by lessons for the top and bottom rows, numeric row, and finally shifting and advanced topics.

It was great fun!

I remember typing class with typewriters and the little cardboard boxes to cover your hands, freshman year in high school. The kicker was, this was in the '90s in the heart of Silicon Valley.

Okay, this is going to sound mean, but this is like the definition of p-hacking. When you look at 30 values, you simply can't be surprised that one of them is lower than the mean with a p-value around 1/20th. Use something like a Bonferroni correction to get a significance level of 1/600. Does the result still stand up? In fact, there's an xkcd about this very topic. https://www.xkcd.com/882/
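
The arithmetic behind that 1/600 can be sketched as follows. The inputs (a 1/20 threshold and 30 comparisons) are the figures quoted in the comment; the function name `bonferroni_alpha` is made up for illustration, and the family-wise error line additionally assumes the 30 tests are independent.

```python
from fractions import Fraction

def bonferroni_alpha(alpha, n_tests):
    """Bonferroni: divide the per-test threshold by the number of tests."""
    return alpha / n_tests

alpha = Fraction(1, 20)   # conventional 5% threshold
n_tests = 30              # number of day-of-month values compared

corrected = bonferroni_alpha(alpha, n_tests)
print(corrected)  # 1/600

# Without any correction, the chance of at least one false positive
# among 30 independent tests at alpha = 1/20 (independence is assumed).
fwer = 1 - (1 - alpha) ** n_tests
print(round(float(fwer), 3))
```

So an uncorrected threshold would flag at least one of the 30 values more often than not, which is the point of the xkcd strip.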

I completely agree that it's important to take this kind of thing into account when approaching a problem like this. As I say in the post, "There are 31 days and one of them has to be smallest. Maybe the 11th isn’t an outlier; it’s just on the smaller end and our eyes are picking up on a pattern that doesn’t exist."

I'll admit that a straight p-value is not the appropriate statistic here. I don't even know what the perfect statistic for this problem is. A Bonferroni correction is not enough because not only is the 11th of the month the lowest for a particular year; it's the lowest for every year.

I was convinced that this was real when I looked at the first line graph of the post. The 11th is the lowest either every year or almost every year, being 3-5 standard deviations below the mean for the bulk of the last 200 years. That just can't happen by chance no matter how you slice it.

If anyone knows the proper way to calculate a statistic on something like this, I would love to hear about it.
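
One crude way to put a number on "lowest every year or almost every year" is a binomial tail. This is only a sketch under strong assumptions, not the proper statistic being asked for: it supposes that under the null each of the 31 days is equally likely to be the lowest in a given year, and that years are independent, which is doubtful since neighboring years share the same kinds of source material and scanning errors. The year counts below are hypothetical placeholders, not figures from the article.

```python
from fractions import Fraction
from math import comb, log10

def binom_tail(n, k, p):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

p_lowest = Fraction(1, 31)  # null: every day equally likely to be the minimum
n_years = 50                # hypothetical number of years examined
k_lowest = 45               # hypothetical number of years the 11th was lowest

tail = binom_tail(n_years, k_lowest, p_lowest)

# Report the order of magnitude; the exact fraction is unwieldy to print.
log10_tail = log10(tail.numerator) - log10(tail.denominator)
print(f"log10 of tail probability: {log10_tail:.1f}")
```

Even with these made-up counts the tail probability sits dozens of orders of magnitude below a Bonferroni-corrected 1/600, which matches the intuition that the effect is not a multiple-comparisons artifact. A more defensible analysis would have to model the correlation between years.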

The significance is absolutely robust. In the early 20th century the measured counts of "<month> 11th" are many standard deviations off.
