If true, that seems downright unethical to me. I've actually met (and possibly drunk with) some of the Scribd founders and they seem like really nice, smart fellows, but this practice really shouldn't be encouraged (unless there is a good reason behind it other than to promote Scribd).
It's certainly there to promote Scribd. I don't believe that HN has ever claimed to be an independent, impartial site. It's on a subdomain of ycombinator.com, it is heavily frequented by people who are either involved in YC, or people who want to be. Any story that is linked to a YC company is going to get a disproportionate amount of attention here, and I doubt that YC will hesitate to integrate links/widgets from YC companies as they've done with Scribd, CO2stats, and others.
I'm pretty sure that one of the purposes of the site is to attract people to YC. If you want to know why people like myself (who are just here for the hacker news/discussion) stick around, it's because it has some of the most interesting and civil discussion of any online community, and if you know about the YC bias then it's not a big deal to filter it out in your mind when browsing/reading the site.
Maybe "unethical" is inaccurate, but I did mean to make a strong point. How about "annoying and unnecessary"? One reason I enjoy HN is, as you said, the intelligent and civil discussion. Despite it being sponsored by YC, I really appreciate the distinct lack of advertisements or blatant links to YC-funded apps. It truly feels like an inclusive environment for all hackers, not just YC-affiliated ones. Your comments are judged by their content, not (usually) by whether the author is part of the YC clique. Yes, I understand YC doesn't owe anyone anything, and if they wanted to, they have the right to turn HN into a billboard for YC companies. But that's not why I come here. That's why I think HN should nip these trends in the bud, so it stays as inclusive and impartial as possible.
Initially (or at least a couple of years ago) if you submitted a PDF, the link would only be available through Scribd. It was a topic of much discussion. See here for example: http://news.ycombinator.com/item?id=195431
Yeah, it's no different from the Green Stats thing at the bottom of the page. He's just helping people out by using their stuff. Can't blame him for that.
If the author doesn't want their content mirrored, they're free to throw up a robots.txt file, assuming Scribd respects that. The internet requires middlemen to make multiple copies of content for each request to successfully complete, so I'd call an HTTP 200 status code implied consent.
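For anyone wondering what "respecting robots.txt" would look like in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. The bot name `ExampleMirrorBot`, the example.com URLs, and the `may_fetch` helper are all made up for illustration; nothing here reflects how Scribd or HN actually behaves.

```python
from urllib.robotparser import RobotFileParser


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text allows user_agent to fetch url."""
    parser = RobotFileParser()
    # Parse the rules from a string so the example needs no network access.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt: one named bot is kept out of /papers/,
# everyone else is allowed everywhere (an empty Disallow means "allow all").
ROBOTS = """\
User-agent: ExampleMirrorBot
Disallow: /papers/

User-agent: *
Disallow:
"""

if __name__ == "__main__":
    pdf = "http://example.com/papers/a.pdf"
    print(may_fetch(ROBOTS, "ExampleMirrorBot", pdf))
    print(may_fetch(ROBOTS, "SomeOtherBot", pdf))
```

Note that this only tells a well-behaved client what the site owner asked for. Whether a mirror honors it, and whether it applies to user-submitted links at all, is a separate question.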
Every request from a different place is ultimately fulfilled by the original host, barring some extreme caching (which you can also use HTTP headers to instruct against). This is completely different from the content being taken from the original host and put up on Scribd, which could easily be a copyright violation. An HTTP 200 response is implied consent for an end user (or even a robot) to view it, not to redistribute it. AFAIK Scribd does not crawl for content (crawling and then hosting the results would be blatantly illegal in the US), so robots.txt is not really applicable.
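To make the "HTTP headers to instruct against caching" point concrete, here is a rough sketch of how an intermediary might read a response's `Cache-Control` header. It is heavily simplified: it only looks at the `no-store` and `private` directives and ignores everything else in the HTTP caching rules. The function name `shared_cache_may_store` is invented for this example.

```python
from typing import Mapping


def shared_cache_may_store(headers: Mapping[str, str]) -> bool:
    """Simplified check: may a shared (intermediary) cache keep this response?

    Only the Cache-Control directives no-store and private are considered.
    """
    raw = ""
    for name, value in headers.items():
        # Header names are case-insensitive.
        if name.lower() == "cache-control":
            raw = value
    # Split "public, max-age=3600" into bare directive names.
    directives = {
        part.strip().split("=", 1)[0].lower()
        for part in raw.split(",")
        if part.strip()
    }
    return not ({"no-store", "private"} & directives)


if __name__ == "__main__":
    print(shared_cache_may_store({"Cache-Control": "no-store"}))
    print(shared_cache_may_store({"Cache-Control": "public, max-age=3600"}))
```

A host that sends `Cache-Control: no-store` is explicitly asking intermediaries not to keep a copy. This says nothing about the legal question, only about what the protocol lets the origin express.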
This interpretation makes caching illegal and puts routing in a gray area. If not instructing against caching is implied consent to redistribute the content, then you're essentially agreeing with me.
robots.txt is indeed intended for crawling, but if it's there and you redistribute someone's content anyway, I'd consider it less defensible.
Google cache doesn't beat your main site in the rankings, and it clearly cites you as the original source of the data.
That said, it's irrelevant because they don't crawl for content. The HN admins are way out of line with their practices on this one. They're making illegal copies to help their friends at Scribd, and doing so without the requisite consent.
Search rankings have nothing to do with copyright law, which seems to be the basis of your argument. You have yet to give a decent argument that materially distinguishes Scribd from Google's cache, caching proxies, or even the basic routing required for any request on the internet, which copies content by definition.
I was just responding to Zak's comment about differences by giving a few. It wasn't meant as a complete argument of any sort.
That said, two points:
1) doing so is not just immoral, it's against Scribd's TOS.
2) your claim that re-hosting content without permission is indistinguishable from transport makes clear that your beliefs are so different than mine that I cannot possibly find a way to communicate with you.
The basic concept of copyright law is that an author is the only one who is allowed to make copies of her work, and only she can give others permission to do so as well. At a technical level, sending a response to a request involves telling another machine to pass a message along for you, so there is at least implied consent to copy it and send it to another user. However, what are the bounds of this consent? Can a router store the data it has been passed? For how long? Can it serve it to others besides the IP address the response was intended for?
There may be case law and/or actual laws that clarify these points, but I am not a lawyer. I presume you aren't either. If you think the concept of IP law has a straightforward and indisputable application to the internet that clears all questions about what routers can and can't do with the data they are passed, feel free to explain.
It seems like you're operating within the "lots of people do it, so it must be legal somehow" school of thought when it comes to routing. This isn't necessarily a problem as long as it's applied consistently. Several other commenters and I effectively made the same argument in saying that Google does almost the same thing Scribd does in terms of copying and redistributing content. It isn't possible for you to call that argument invalid then rely on that argument as proof of routing's obvious legality.
Scribd is legally indistinguishable from a caching proxy. Feel free to let Opera and all the other caching proxy operators know.
I've found clicking a PDF link on anything but Mac OS to be rather unpleasant. Many people (me included) consider the scribd link a more pleasant experience.
Contrast this with my experience: whenever I happen upon an article on Scribd I immediately skip over it. I really don't like the user interface. (I'm on a Mac, by the way.)
That doesn't run well on my platform of choice. Evince does though, and is also fast and lightweight. Still, I find PDF links a bit annoying, and am glad to have the scribd option.
I just put Evince on all my Windows machines and I am very happy about it. I was able to remove all the Adobe Air bonus arterial sclerosis.
I tried Foxit and was not very impressed. I thought it was interesting that the Evince Windows port has been made since the last time I looked for an Acrobat replacement, about a year ago.
As long as they don't change the link to the original PDF, it seems fair to me (if you accept scribd as fair in principle - they simply gobble up all PDFs they can get, I suppose). Meaning I think it is just a helper to include a viewer in a link to a PDF.