Probably because your random internet stranger has never heard of HN.
It's a shame, for instance, that I was ever allowed to find this place.
I'm assuming you'd have some kind of rotating or individualized Project Euler-style problems that you have to solve to get in, and that they would be pretty tough. It occurs to me that you could also have temporary student accounts for people who are just getting into the software world.
Invitation-based communities are also very self-policing. If you invite someone and they happen to be an idiot, then there is egg on your face as well.
Link - http://random.irb.hr/signup.php
At a maximum of 2 scrapes per minute, your cache will be 60 minutes old when you're watching 120 (single-page) comment threads. That might be tolerable. Now try 1200 or 12,000 comment threads.
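To put numbers on that, here is a quick back-of-the-envelope sketch. It assumes plain round-robin refreshing and one request per thread; the function name is made up for illustration.

```python
def full_cycle_minutes(threads: int, scrapes_per_minute: float = 2.0) -> float:
    """Minutes needed to refresh every watched thread once, round-robin.

    Assumes one request per thread (single-page threads) and a fixed
    request budget of `scrapes_per_minute`.
    """
    return threads / scrapes_per_minute


if __name__ == "__main__":
    for n in (120, 1200, 12_000):
        minutes = full_cycle_minutes(n)
        # Worst-case staleness is roughly one full cycle.
        print(f"{n:>6} threads -> {minutes:,.0f} min (~{minutes / 60:.0f} h)")
```

So 1200 threads means a cache up to 10 hours old, and 12,000 threads means roughly 100 hours.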
You can do smart cache refreshes, but you will miss bursty activity. You can ignore articles older than X days, but you'll miss the "dupe, comments here:" resurgences.
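One possible shape for a "smart" refresh, purely as a sketch: back off exponentially on threads that haven't changed, and reset when they do. The class name, the 1-minute base, and the 240-minute cap are all assumptions for illustration, not anything HN or the OP specifies.

```python
import heapq


def next_interval(idle_checks: int, base: float = 1.0, cap: float = 240.0) -> float:
    """Minutes to wait before the next check: doubles per idle check, up to cap."""
    return min(cap, base * 2 ** idle_checks)


class RefreshQueue:
    """Hypothetical scheduler: quiet threads get polled less and less often."""

    def __init__(self) -> None:
        self._heap: list[tuple[float, str]] = []  # (due time in minutes, thread id)
        self._idle: dict[str, int] = {}           # consecutive unchanged checks

    def add(self, thread_id: str, now: float = 0.0) -> None:
        self._idle[thread_id] = 0
        heapq.heappush(self._heap, (now, thread_id))

    def pop_due(self) -> tuple[float, str]:
        """Return the (due time, thread id) that should be scraped next."""
        return heapq.heappop(self._heap)

    def report(self, thread_id: str, now: float, changed: bool) -> float:
        """Record a scrape result and reschedule; returns the next due time."""
        self._idle[thread_id] = 0 if changed else self._idle[thread_id] + 1
        due = now + next_interval(self._idle[thread_id])
        heapq.heappush(self._heap, (due, thread_id))
        return due
```

The weakness is the one described above: a thread that has backed off to the cap will sit unchecked for hours, so a sudden burst of comments on it goes unseen until its next slot.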
HN gets about 150 posts per day that hit the top 60, at least briefly, as a point of reference.
This is a neat thing, but if it gets popular you will have a problem. On the plus side, it's self-limiting. As the usage gets higher, the experience will degrade and usage will decline. No intervention required by you. :)
I've wished for a policy-compliant way to cache comments in a timely manner, but there's just no way to stay current under the existing rules.
One potential solution to obey robots.txt might be to spawn multiple small EC2 instances with different IPs and have them coordinate with each other to share the crawling without individually running over the limits. (This is also useful for scraping from sites that have rate limits.)
And the fact that people can only comment if they have an HN account makes it even better!
Wouldn't this risk changing that?
API-wise, http://api.ihackernews.com/ , http://hndroidapi.appspot.com/ or http://www.hnsearch.com/api didn't suit?
What am I thinking, an API doesn't need an API...
Oh, and the simplicity of this seems pretty nice!
We also need a micropayment processor that can do sub-penny transactions.
EDIT: Actually, I would be fine with penny transactions, considering that the current state of things with, say, PayPal is 30+ cents per transaction.
I think if I were to implement this on my own page, I would probably have two tabs that expand if clicked.
Something like this:
Show [your site's name] comments | Show HN comments
And then for the HN comment tab, you would use the OP's awesome submission.
This is where the browser should really be smart enough to do a search engine lookup, and propose smart links alongside the page you're reading. Or even summarise and merge discussions found elsewhere.
An element I really like from random blog posts I've seen off of HN is using the sidebars for additional, short commentary/details. Your comment reminded me of that, but sadly this might not work well if an article were all chopped up with tons of commentary (I'm thinking of used books where 'important' parts are highlighted and comments put beside them by a random reader, which I dislike). Well, then again, perhaps for people who like to dive into subjects and really digest an article, there could be a straight text version and a feature to see it marked up w/ smart commentary.
This also reminds me of a small 'movement' a year or two ago for citing in-article links at the bottom instead of linking to them throughout the article itself.
Opera has a side panel, one of which is the Links panel. That pulls links out of the page and lists them alongside. But this alone isn't that much use. The browser however could do something more interesting with them.
I think a lot of pages seem dreadfully wasteful by placing the same old accessory content on their pages. Buttonitis etc. I don't think this necessarily should live 'inside' the 'page' at all. The content should remain dumb. The inbound links and the chat around the content augment it. And this could be summarised somehow by browser tools / other services.
I understand the author's desire to embed HN comments on a page. There's a simple and elegant solution to the problem. Simply open up another browser window/tab and point it to the relevant HN comment page. To do so, the author could include a simple link in the page, leave it to the browser to infer, or just let the user discover it for themselves (pointers do help, though).
Why the obsession with stuffing our pages with content?
It also feels a little snooty doing something like this (embedding HN comments on your page, that is).
I think this should be left to the domain of the browser. Create a tool for filtering and showing certain aspects.
I've previously thought that comment threads on blogs tend to die down fairly quickly, whereas discussions on HN tend to last a bit longer, so this is doubly cool.
I guess the only downside is the near-impossibility of letting people comment directly from the page without having to log in again.