Ask YC: Site usage data for math modeling exercise?
1 point by nwinter on May 4, 2008 | 2 comments
For my undergraduate Probability Models course, I want to do a small queueing theory analysis of submissions to Hacker News. I've looked through the source and found the submission ranking formula, so I can proceed, but it would be great if I could find site usage data or submission times throughout the day and week.

I'd like to determine things like the most and least effective times to post, the effect of early upvotes, and the variance in whether a given submission hits the front page (which would tell you how robust the system is to chance and how well the ranking system is working). If I can't get numbers, I'll academicize it and make them up (does anyone have an idea for how to distribute them?), but it would be much more interesting to have real ones.
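If the numbers do have to be made up, one common queueing-theory choice is to treat submissions as a Poisson process whose rate varies over the day. Below is a minimal sketch of that idea using thinning. The rate function and its constants (10 per hour on average, swinging by 6) are invented purely for illustration and are not measured from the site.

```python
import math
import random


def hourly_rate(t_hours):
    """Hypothetical submissions per hour with a 24-hour cycle (made-up numbers)."""
    return 10.0 + 6.0 * math.sin(2 * math.pi * (t_hours - 9.0) / 24.0)


def simulate_submission_times(rate_fn, max_rate, horizon_hours, rng):
    """Sample arrival times from a nonhomogeneous Poisson process by thinning.

    max_rate must be an upper bound on rate_fn over [0, horizon_hours).
    """
    times = []
    t = 0.0
    while True:
        # Candidate arrivals come from a homogeneous process at the bounding rate.
        t += rng.expovariate(max_rate)
        if t >= horizon_hours:
            return times
        # Keep each candidate with probability rate(t) / max_rate.
        if rng.random() < rate_fn(t) / max_rate:
            times.append(t)


if __name__ == "__main__":
    rng = random.Random(0)
    week = simulate_submission_times(hourly_rate, 16.0, 7 * 24.0, rng)
    print(len(week), "simulated submissions in one week")
```

Swapping in a different `hourly_rate` (say, with a weekend dip) only requires that `max_rate` still bounds it from above.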

I realize that this kind of data has been asked for before (http://news.ycombinator.com/item?id=179981 and http://news.ycombinator.com/item?id=114862), but I figure it is worth asking again. I wouldn't have to post the analysis if thoughts of gaming the system are a stumbling block.




Why don't you just use the next few weeks' worth of data?


With daily manual polling, I can track when submissions occurred and how many points they received, but I can't easily track when they received those points.

I could write a program to poll more frequently, to look for changes in each submission's score. That might be even more valuable data than hourly traffic data. I wonder how involved that would be.

Is there a better way of doing it?
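A rough sketch of what such a poller might look like, assuming some function that returns the current scores as a dict of submission id to points. That function (`fetch_scores` here) is hypothetical and left for the reader to supply, for instance by scraping the listing pages. The sketch only shows the diffing and timestamping.

```python
import time


def diff_scores(previous, current, timestamp):
    """Return (timestamp, item_id, old_score, new_score) for each changed score.

    Items not seen before are reported with old_score set to None.
    """
    changes = []
    for item_id, score in current.items():
        old = previous.get(item_id)
        if old != score:
            changes.append((timestamp, item_id, old, score))
    return changes


def poll_scores(fetch_scores, interval_seconds, rounds,
                sleep=time.sleep, clock=time.time):
    """Call fetch_scores() `rounds` times and log every score change seen.

    fetch_scores is a caller-supplied (hypothetical) function returning
    {item_id: points} for the submissions currently visible.
    """
    log = []
    previous = {}
    for i in range(rounds):
        current = fetch_scores()
        log.extend(diff_scores(previous, current, clock()))
        # Remember items that dropped out of view so they are not re-logged as new.
        previous = {**previous, **current}
        if i < rounds - 1:
            sleep(interval_seconds)
    return log
```

The resolution of the resulting timestamps is limited by the polling interval: a score that rises by three between polls shows up as a single change, so shorter intervals give better data at the cost of more requests.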



