
1.3 million Hacker News stories tell a tale - mcrowe
http://www.dataphoric.com/how-i-hacked-hacker-news/#hn
======
minimaxir
> _I’ve committed a grave statistical faux-pas in this article. Did you spot
> it? I showed that the number of votes a Hacker News story gets is correlated
> with the time it is submitted. Then, I told you that submitting your story
> at a particular time will cause it to have a higher chance of success.
> Correlation does not imply causation, however, so this is unproven. It is
> entirely possible that stories submitted on the weekend have simply been
> better than those submitted on weekdays, or that weekend readers are “vote-
> happy”. However, my intuition suggests to me that it is most likely that
> there is true causation here._

You probably should have _led_ with that caveat. Ignoring it completely for
the sake of an argument is misleading.

Relatedly, "what if I told you I found an easy way to increase your story’s
votes by 172%?" is _flat-out wrong_. That metric implies an _average_
behavior, which is not the case here.

~~~
mcrowe
You're right. I was focused on telling a useful story with the data. I should
be more careful not to be misleading.

Your related point is a good catch. That was a brain slip on my part. Thanks
for pointing it out!

------
mtmail
Same story from 6 hours ago, this time with '#hn' added to the URL
[https://news.ycombinator.com/item?id=10068983](https://news.ycombinator.com/item?id=10068983)

If you've worked out how time of day determines whether a story gets picked
up, then why submit it again? (I'm joking.)

~~~
mcrowe
Haha. Actually, a Hacker News admin sent me an email asking me to re-submit it
with a slightly different URL.

~~~
dang
No, we invited you to repost
[https://news.ycombinator.com/item?id=10011081](https://news.ycombinator.com/item?id=10011081),
not this blog post.

A neural network simulation in a browser is a great HN submission. You were
right that it fell through the cracks undeservedly, and finding such stories
and giving them a second chance is the purpose of our experiment in reposting.

Looking through HN data to analyze posting times? Not so much. Meta posts are
a low-quality genre to begin with, plus analyzing posting times to figure out
how to time HN is something of a cliché, and even if the findings are
meaningful, it's hard to see how HN improves as a result of them. Also, we'd
not likely invite a repost when the original had a misleading, linkbait title
like "How I hacked Hacker News".

Please do repost the neural network though!

~~~
mcrowe
Thanks for clarifying, dang. My apologies.

~~~
dang
Not to worry! Of course the post from earlier today was fresher in your mind.

------
mcrowe
I looked at the data from 1.3 million Hacker News stories and found that
_when_ a story gets submitted makes a big difference (up to 172% better chance
of getting on the front page). This article shows the analysis and results.

I used the official Hacker News API to get the stories using Python, and used
R and ggplot2 to do the exploratory data analysis and plots.
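
For anyone who wants to try something similar, here is a minimal sketch of the
fetching and bucketing steps in Python. It is not the code from the article.
It assumes the public item endpoint of the official API, and the helper names
and the `min_score` cutoff are made up for illustration (the API does not
report whether a story reached the front page, so a score threshold is only a
rough stand-in).

```python
import json
from collections import defaultdict
from datetime import datetime, timezone
from urllib.request import urlopen

# Base URL of the official Hacker News API (hosted on Firebase).
API_ROOT = "https://hacker-news.firebaseio.com/v0"


def item_url(item_id):
    """Return the URL of a single item in the official HN API."""
    return f"{API_ROOT}/item/{item_id}.json"


def fetch_item(item_id, opener=urlopen):
    """Download one item as a dict; the API returns null (None) for gaps."""
    with opener(item_url(item_id)) as response:
        return json.load(response)


def success_rate_by_hour(items, min_score=50):
    """Share of stories per UTC submission hour scoring at least min_score.

    min_score is an assumed proxy for "made the front page".
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for item in items:
        # Skip missing items, comments, polls, and anything without a timestamp.
        if not item or item.get("type") != "story" or "time" not in item:
            continue
        hour = datetime.fromtimestamp(item["time"], tz=timezone.utc).hour
        totals[hour] += 1
        if item.get("score", 0) >= min_score:
            hits[hour] += 1
    return {hour: hits[hour] / totals[hour] for hour in totals}
```

Calling `fetch_item` over a range of item ids and passing the results to
`success_rate_by_hour` gives one rate per hour, which can then be compared
across hours. As noted elsewhere in this thread, a difference between hours
is a correlation and does not by itself show that timing causes success.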

------
pedalpete
Interesting. I wonder if there is a difference in discovery rate between
'Ask HN:' submissions and others.

~~~
mcrowe
Yes. "Ask HN" and "Show HN" both do almost twice as well as standard stories.

