If that ever changes and HN is overrun (insert dramatic and sad Hans Zimmer score) what about starting a new HN just for programmers? I wouldn't be able to get in at the moment, but someday I'd hope to be able to.[1]
[1] I'm assuming you'd have some kind of rotating or individualized Project Euler style problems that you have to solve to get in, and that they would be pretty tough. It occurs to me that you could also have temporary student accounts for people who are just getting into the software world as well though.
Passing a programming test is no guarantee of having anything interesting to contribute to an online community; just look at your company's internal IT mailing lists.
An invitation-based system, while not perfect, would probably be the most effective method. Plus, the owner of the site could restrict or encourage growth through the number of invitations allocated, allowing for easy management of unwanted lulls or spikes in user activity.
Invitation-based communities are also very self-policing. If you invite someone and they happen to be an idiot, then there is egg on your face as well.
By "get in" do you mean to comment or to simply read the comments? I'm all for an intelligence test (I personally would like an "actually RTFA" test), but I'm pretty sure that making that test required to even read the comments is a terrible idea.
Scraping comments on HN hits the robots.txt wall rather quickly.
At a maximum 2 scrapes per minute, your cache will be 60 minutes old when you're watching 120 (single-page) comment threads. That might be tolerable. Now try 1200 or 12,000 comment threads.
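To make the arithmetic above concrete, here is a minimal sketch. It assumes a plain round-robin refresh where every thread is one page and gets fetched once per cycle; the function name and the default rate are only illustrative.

```python
def cache_staleness_minutes(thread_count: int, scrapes_per_minute: float = 2.0) -> float:
    """Worst-case age of a cached thread under round-robin refreshing.

    Assumes one request per thread per cycle (single-page threads).
    """
    if scrapes_per_minute <= 0:
        raise ValueError("scrapes_per_minute must be positive")
    return thread_count / scrapes_per_minute


if __name__ == "__main__":
    for threads in (120, 1200, 12000):
        minutes = cache_staleness_minutes(threads)
        print(f"{threads} threads: {minutes:.0f} min ({minutes / 60:.0f} h)")
```

At 2 scrapes per minute that works out to 1 hour for 120 threads, 10 hours for 1200, and 100 hours for 12,000.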
You can do smart cache refreshes, but you will miss bursty activity. You can ignore articles older than X days, but you'll miss the "dupe, comments here:" resurgences.
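One possible shape for a "smart" refresh, as a rough sketch only: give each thread a target refresh interval that grows with its age, then spend each scrape slot on whichever thread is most overdue relative to its target. The `Thread` fields, the doubling-per-day policy, and the constants are all made up for illustration. It also shows the weakness mentioned above: an old thread that suddenly gets busy again sits on a long interval and is picked up late.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Thread:
    item_id: int                    # hypothetical thread identifier
    age_hours: float                # time since the thread was submitted
    last_fetch_minutes_ago: float   # time since we last scraped it


def target_interval_minutes(age_hours: float, base: float = 5.0, cap: float = 1440.0) -> float:
    """Hypothetical policy: the interval doubles for each day of age, capped at one day."""
    return min(cap, base * 2 ** (age_hours / 24.0))


def next_to_refresh(threads: List[Thread]) -> Thread:
    """Pick the thread that is most overdue relative to its own target interval."""
    return max(
        threads,
        key=lambda t: t.last_fetch_minutes_ago / target_interval_minutes(t.age_hours),
    )


if __name__ == "__main__":
    threads = [
        Thread(item_id=1, age_hours=1, last_fetch_minutes_ago=10),
        Thread(item_id=2, age_hours=240, last_fetch_minutes_ago=60),
    ]
    print(next_to_refresh(threads).item_id)
```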
HN gets about 150 posts per day that hit the top 60, at least briefly, as a point of reference.
This is a neat thing, but if it gets popular you will have a problem. On the plus side, it's self-limiting. As the usage gets higher, the experience will degrade and usage will decline. No intervention required by you. :)
I've wished for a policy-compliant way to cache comments in a timely manner, but there's just no way to stay current under the existing rules.
You can't scrape with AJAX because of cross domain security restrictions.
One potential solution to obey robots.txt might be to spawn multiple small EC2 instances with different IPs and have them coordinate with each other to share the crawling without individually running over the limits. (This is also useful for scraping from sites that have rate limits.)
robots.txt doesn't enforce itself so there is no IP limitation; this is still a violation and no better than simply lowering the delay on a single scraper.
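To illustrate the point, here is a small sketch using Python's standard `urllib.robotparser`. The sample robots.txt is an assumption chosen to match the "2 scrapes per minute" figure mentioned earlier in the thread; it is not copied from HN. The delay applies to the crawler as a whole, so a fleet that actually honors it gains nothing from extra IPs: each worker just has to wait proportionally longer.

```python
from typing import Optional
from urllib.robotparser import RobotFileParser

# Assumed sample policy for illustration (30 s between requests = 2 per minute).
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 30
Disallow: /x?
"""


def crawl_delay_seconds(robots_txt: str, user_agent: str = "*") -> Optional[int]:
    """Return the Crawl-delay that applies to user_agent, or None if unset."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


def per_worker_interval(delay_seconds: int, workers: int) -> int:
    """Seconds each worker must wait between its own requests so that the
    fleet as a whole still makes at most one request per delay_seconds."""
    return delay_seconds * workers


if __name__ == "__main__":
    delay = crawl_delay_seconds(SAMPLE_ROBOTS_TXT)
    print(delay, per_worker_interval(delay, workers=4))
```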
Perhaps an iframe would work best. With some injected styles and javascript (some that would make the iframe expand automatically à la Disqus), I think this would work nicely.
I would see that eventuating if this allowed commenting via the embedding blog; thankfully it doesn't. I don't have the same issue with this that others do, but I remember TheNextWeb getting a bit of flak from commenters for embedding HN comments (although that was mixed in amongst other comment platforms, and is a little different).
How hard would it be to have your script look for the submission on HN by url instead of having to manually put in the id? Maybe it could use the one with the highest score if there's more than one?
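The selection part of that idea is simple enough to sketch. This assumes you already have a list of candidate submissions from somewhere (how you fetch them is left open here); the record fields `id`, `url`, and `points` are hypothetical names, and the URL normalization is deliberately crude.

```python
from typing import Iterable, Optional, Tuple
from urllib.parse import urlsplit


def normalize_url(url: str) -> Tuple[str, str, str]:
    """Compare URLs loosely: ignore scheme, fragment, host case, trailing slash."""
    parts = urlsplit(url)
    return (parts.netloc.lower(), parts.path.rstrip("/"), parts.query)


def best_submission_id(submissions: Iterable[dict], url: str) -> Optional[int]:
    """Return the id of the highest-scoring submission matching url, or None."""
    wanted = normalize_url(url)
    matches = [s for s in submissions if normalize_url(s["url"]) == wanted]
    if not matches:
        return None
    return max(matches, key=lambda s: s["points"])["id"]


if __name__ == "__main__":
    # Made-up records for illustration only.
    candidates = [
        {"id": 101, "url": "http://example.com/post/", "points": 3},
        {"id": 102, "url": "https://Example.com/post", "points": 57},
        {"id": 103, "url": "https://example.com/other", "points": 900},
    ]
    print(best_submission_id(candidates, "https://example.com/post"))
```

With more than one match, ties on score resolve to whichever record comes first in the list.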
Great contribution Nathan, looking forward to using this to embed comments after my next blog post. Can you let me know how the comment system works? Are you scraping HN?
Probably the same reason no one should use this to power their comment section: there is no revenue, and no real reason to keep it up. Things like this come and go.
Not saying this is necessarily true for Hacker News, but unfortunately a lot of sites don't have the time/resources to develop and then maintain a public API for the majority of their content, and mostly see an RSS or XML feed of their news/articles/stories as good enough.
Do you think people are going to wait for someone to submit each post to HN? I think this will encourage writers to submit every single story they write to HN, just to get a comments page.
Here's a question, is it possible to put a Hacker News upvote button on your blog post? Or is that against the HN rules, or something that's frowned upon in the community?
About a year ago I created hnlike.com, which helps you set up an upvote button for your blog. I have disabled the upvote feature as the community seemed worried. See the HN discussion on my story for details.
I hope this doesn't further impact the quality of discussion on hacker news. It seems that every day I notice more and more people posting completely non-contributing crap in the comments that you'd expect from Reddit or something (one-liners, stupid jokes, etc). I go to Reddit for that, I come here for insight (which, granted, you can find a lot on Reddit also, but HN is supposed to be a bastion of intellectual thought and discussion).
Yes, I would. Implementing the whole thing, instead of only this little jQuery snippet, shouldn't take much longer. But then I'm not dependent on someone else's server being up/fast.
If everyone could implement their own scraper they would be able to follow robots.txt guidelines/limits, but HN servers might take a big blow if it gets popular.
This idea is kind of alluring. But you can't really include discussions on your page that might be happening all over the Internet. Reddit etc.
This is where the browser should really be smart enough to do a search engine lookup, and propose smart links alongside the page you're reading. Or even summarise and merge discussions found elsewhere.
An element I really like from random blog posts I've seen off of HN is using the sidebars for additional, short commentary/details. Your comment reminded me of that, but sadly this might not work well if an article were all chopped up with tons of commentary (I'm thinking of used books where 'important' parts are highlighted and comments put beside them by a random reader, which I dislike). Well, then again, perhaps for people who like to dive into subjects and really digest an article, there could be a straight text version and a feature to see it marked up w/ smart commentary.
This also reminds me of a small 'movement' a year or two ago for citing in-article links at the bottom instead of linking to them throughout the article itself.
Does it really matter where the links are listed - in the article or at the end?
Opera has a side panel, one of which is the Links panel. That pulls links out of the page and lists them alongside. But this alone isn't that much use. The browser however could do something more interesting with them.
I think a lot of pages seem dreadfully wasteful, placing the same old accessory content on their pages. Buttonitis etc. I don't think this necessarily should live 'inside' the 'page' at all. The content should remain dumb. The inbound links and the chat around the content augment it. And this could be summarised somehow by browser tools / other services.
I understand the author's desire to embed HN comments on a page. There's a simple and elegant solution to the problem: open up another browser window/tab and point it to the relevant HN comment page. To do so, the author could include a simple link in the page, leave it to the browser to infer, or just let the user discover it for themselves (pointers do help though).
Why the obsession with stuffing our pages with content?
It also feels a little snooty doing something like this (embedding HN comments on your page that is.)
I've previously thought that comment threads on blogs tend to die down fairly quickly, whereas discussions on HN tend to last a bit longer, so this is doubly cool.
I guess the only downside is the near-impossibility of letting people comment directly from the page without having to log in again.
Nice work, this is pretty cool! I've been working on a similar approach but using Github (mainly because I host my blog there using Jekyll). I'll try and get it released tonight :)
Very, very interesting! So you already calculated that the cost of this will be insignificant for you even with millions of pageviews? Blogs have a long tail of visits... ;)
How do people opt-in to allow their comments to be shared on sites other than HN? Or is this strictly opt-out and, if so, what is the opt-out mechanism?
Probably because your random internet stranger has never heard of HN.
It's a shame, for instance, that I was ever allowed to find this place.