

Ask YC: Site usage data for math modeling exercise? - nwinter

For my undergraduate Probability Models course, I want to do a small queueing theory analysis of submissions to Hacker News. I've looked through the source and found the submission ranking formula, so I can proceed, but it would be great if I could find site usage data or submission times throughout the day and week.

I'd like to determine things like the most and least effective times to post, the effect of early upvotes, the variance of hitting the front page (which would tell you how robust the system is to chance, how well the ranking system is working), etc. If I can't get numbers, I'll academicize it and make them up (does anyone have an idea for how to distribute them?), but it would be much more interesting to have them.

I realized that this kind of data has been asked for before:
http://news.ycombinator.com/item?id=179981
http://news.ycombinator.com/item?id=114862
I figure it is worth asking again. I wouldn't have to post the analysis, if thoughts of gaming the system are a stumbling block.
======
epi0Bauqu
Why don't you just use the next few weeks' worth of data?

~~~
nwinter
By daily manual polling, I can track when submissions occurred and how many
points they received, but I can't so easily track _when_ they received those
points.

I could write a program to poll more frequently, to look for changes in each
submission's score. That might be even more valuable data than hourly traffic
data. I wonder how involved that would be.
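
As a rough gauge of how involved it would be, here is a minimal sketch of the
bookkeeping side of such a poller. Everything named here is an assumption made
for illustration: `fetch_scores` is a hypothetical callable that returns the
current points for each visible submission (how it gets them from the site is
left open), and `diff_snapshots` and `poll` are made-up helper names. An item
seen for the first time is logged as a change from 0 points.

```python
import time
from typing import Callable, Dict, List, Tuple

Snapshot = Dict[str, int]            # item id -> points at poll time
Event = Tuple[float, str, int, int]  # (poll time, item id, old points, new points)


def diff_snapshots(prev: Snapshot, curr: Snapshot, polled_at: float) -> List[Event]:
    """Return one event per submission whose score differs between two polls."""
    events = []
    for item_id, points in curr.items():
        old = prev.get(item_id, 0)  # assumption: an unseen item counts from 0
        if points != old:
            events.append((polled_at, item_id, old, points))
    return events


def poll(fetch_scores: Callable[[], Snapshot], rounds: int, interval: float,
         clock: Callable[[], float] = time.time,
         sleep: Callable[[float], None] = time.sleep) -> List[Event]:
    """Poll `rounds` times, `interval` seconds apart, collecting score changes."""
    events: List[Event] = []
    prev: Snapshot = {}
    for i in range(rounds):
        if i:
            sleep(interval)  # wait between polls, but not before the first one
        curr = fetch_scores()  # hypothetical: supplied by the caller
        events.extend(diff_snapshots(prev, curr, clock()))
        prev = {**prev, **curr}  # remember items that dropped off the page
    return events
```

Each event only says that the score changed at some point since the previous
poll, so the polling interval sets the time resolution of the data.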

Is there a better way of doing it?

