* Pull down the top 5 pages
* For each handle:
  * Look up their profile page
  * Compute how long they've been on HN
You could also try pulling the handles from https://news.ycombinator.com/active and comparing the distributions you get, since that comparison will give you an idea of the volatility.
Then write up your results in a blog post and submit it here. I'm sure people would be interested.
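For what it's worth, one rough way to do that comparison is to put the quartiles of the two samples side by side. This is only a sketch: `quartiles` and `compare` are names made up here, the interpolation rule is just one common choice, and the numbers at the bottom are placeholders, not real HN data.

```python
def quartiles(values):
    """Return (q1, median, q3), interpolating linearly between sorted points."""
    data = sorted(values)

    def at(fraction):
        # Position of the requested quantile within the sorted data
        pos = fraction * (len(data) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(data) - 1)
        return data[lo] + (data[hi] - data[lo]) * (pos - lo)

    return at(0.25), at(0.5), at(0.75)


def compare(front_page_days, active_days):
    """Build one summary line per sample so they can be read side by side."""
    rows = []
    for label, sample in (("front page", front_page_days),
                          ("active", active_days)):
        q1, med, q3 = quartiles(sample)
        rows.append("%-10s n=%d q1=%.1f median=%.1f q3=%.1f"
                    % (label, len(sample), q1, med, q3))
    return rows


# Placeholder account ages in days, made up purely to exercise the functions.
for row in compare([100, 200, 300, 400, 500], [50, 150, 250]):
    print(row)
```

Both samples need at least one value, and with only a handful of handles per page the quartiles will be noisy, so treat any difference as a hint rather than a result.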
Or is "codegeek" just a wannabe name?
EDIT: You know, I did wonder before hitting "submit" as to whether this comment was a bit harsh. I know that the YC mods want us to be less negative, but go look at my other comments. I'm usually pretty appreciative of the work people do, encouraging of the efforts they make, and free with suggestions, advice, and positive feedback. On this occasion I just felt that the best thing to do was point out that HN is intended for hackers, and that I expect that hackers will go away and do something, or build something, will experiment and explore. So that's why I'm asking the question: why did this question get asked? If you feel it's harsh and out of place then fine, downvote me. I feel that it's actually a positive contribution to the ethos of HN, and stand by it.
from __future__ import print_function

import sys
import time
import urllib2  # Python 2 only

from bs4 import BeautifulSoup  # pip install beautifulsoup4

# https://news.ycombinator.com/robots.txt has a 30 second crawl delay
DELAY = 60  # I don't need speed.
sys.stderr.write("Using a %d second delay\n" % (DELAY,))

def get(rest):
    # Fetch a page relative to the site root and parse it
    s = urllib2.urlopen("http://news.ycombinator.com/" + rest).read()
    time.sleep(DELAY)  # Play nicely with robots.txt
    return BeautifulSoup(s, "html.parser")

soup = get("")  # get the main page

# Pull out the links to the items: <a href="item?id=9374889">
item_hrefs = [tag.attrs["href"] for tag in soup.find_all("a")
              if tag.attrs.get("href", "").startswith("item?")]

# Find the users with comments in the stories from the front page
# Users look like: <a href="user?id=dalke">
users = set()
for i, item_href in enumerate(item_hrefs, 1):
    sys.stderr.write("processing item %d of %d (%r)\n" % (i, len(item_hrefs), item_href))
    soup = get(item_href)
    users.update(tag.attrs["href"] for tag in soup.find_all("a")
                 if tag.attrs.get("href", "").startswith("user?"))
    if i == 5:
        sys.stderr.write(" ... 5 is good enough. Stopping.\n")
        break

creation = []
for i, user in enumerate(users, 1):
    sys.stderr.write("processing user %d of %d (%r)\n" % (i, len(users), user))
    soup = get(user)
    try:
        created = soup.find(text="created:").findParent().findNextSibling().text
    except AttributeError:  # find() returned None: no "created:" label on the page
        sys.stderr.write("Could not find 'created:' for %r\n" % (user,))
        sys.stderr.write("Soup: %s\n" % (soup,))
        continue
    fields = created.split()  # the first two fields are the count and the unit
    creation.append((int(fields[0]), fields[1], user.partition("?")[2]))
    if i == 10:
        sys.stderr.write(" ... 10 is good enough. Stopping.\n")
        break

creation.sort()  # newest accounts first
for delta, unit, name in creation:
    print("%s - %d %s" % (name, delta, unit))
id=VieElm - 151 days
id=cushychicken - 514 days
id=fit2rule - 550 days
id=david-given - 760 days
id=Already__Taken - 883 days
id=justin66 - 1038 days
id=revelation - 1162 days
id=cygx - 1538 days
id=whatupdave - 1664 days
id=InclinedPlane - 2030 days
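To turn a listing like the one above into a couple of summary numbers, something along these lines would do. It's a minimal sketch and not part of the script: `parse_ages` and `median` are hypothetical helpers, and it assumes every line has the `id=name - N days` shape shown above.

```python
# The ten lines printed by the script, copied verbatim.
SAMPLE = """\
id=VieElm - 151 days
id=cushychicken - 514 days
id=fit2rule - 550 days
id=david-given - 760 days
id=Already__Taken - 883 days
id=justin66 - 1038 days
id=revelation - 1162 days
id=cygx - 1538 days
id=whatupdave - 1664 days
id=InclinedPlane - 2030 days
"""


def parse_ages(text):
    """Return the sorted account ages in days from 'id=name - N days' lines."""
    ages = []
    for line in text.splitlines():
        # Split on the last " - " so hyphens inside a handle are left alone
        name, sep, rest = line.rpartition(" - ")
        if not sep:
            continue
        fields = rest.split()
        if len(fields) == 2 and fields[1] == "days":
            ages.append(int(fields[0]))
    return sorted(ages)


def median(sorted_values):
    """Median of an already sorted, non-empty list."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2.0


ages = parse_ages(SAMPLE)
print("n=%d min=%d median=%.1f max=%d"
      % (len(ages), ages[0], median(ages), ages[-1]))
```

Ten handles is far too few to say anything about HN as a whole, so this only shows the mechanics.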
Having said that, the reason I did this poll was not just to get the numbers but to let fellow HN'ers reflect on how their HN experience has evolved over the time they have been here. I should perhaps have made this clearer in the poll description.
Oh, and I am totally a "wannabe" :). So you got that right. I don't mean this sarcastically: my coding skills at best stretch to writing a few scripts or editing HTML/CSS. I would totally label myself a codegeek wannabe.