

How changing the structure of the web changes PageRank - keyist
http://www.michaelnielsen.org/ddi/how-changing-the-structure-of-the-web-changes-pagerank/

======
srean
With years of practice SEO engineers have become quite good at gaming
PageRank. So by itself PageRank is not a good signal of quality anymore. This
is by no means a revelation. But what really amused me when I tried computing
PageRank on a sample webgraph was that it had become such an awesome indicator
of porn! For some reason I had not seen that coming.

Some of the easy-to-exploit loopholes of PageRank give you a glimpse of the
nature of the internet a decade ago. Compared to other link-analysis-based
indices that were prevalent in those early search-engine days, PageRank was
quite robust against spam. But it had this one loophole: the "free rank" that
all pages on the web received.

One way to view PageRank is to think of pages as participants in an economy.
The pages pay a flat 15% tax to the powers-that-be, and the remainder they
have to pay forward via their outlinks. The powers-that-be collect all the
tax and then share the collection equally among all the pages. And that
"equal" sharing turns out to be a loophole now.
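
To make the tax analogy concrete, here is a minimal sketch of that economy as
a power iteration. It is textbook-style PageRank under stated assumptions, not
how any real search engine computes it: the function name, the toy graph, and
the rule that a page with no outlinks hands its whole balance to the pool are
all my own choices for illustration.

```python
def pagerank(links, tax=0.15, iterations=100):
    """Rank pages by repeatedly taxing and redistributing rank.

    links maps every page to the list of pages it links to; each link
    target must itself be a key of links.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal wealth
    for _ in range(iterations):
        pool = 0.0  # tax collected this round
        new_rank = {page: 0.0 for page in pages}
        for page in pages:
            outlinks = links[page]
            if not outlinks:
                # Assumption: a page with no outlinks hands everything to the pool.
                pool += rank[page]
                continue
            pool += tax * rank[page]
            share = (1.0 - tax) * rank[page] / len(outlinks)
            for target in outlinks:
                new_rank[target] += share
        for page in pages:
            new_rank[page] += pool / n  # the "free rank" every page receives
        rank = new_rank
    return rank


# Hypothetical three-page web: a and b link to c, c links back to a.
web = {"a": ["c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 4))
```

Page b has no inlinks, so everything it holds is its equal share of the tax
pool, 0.15 / 3 = 0.05. That handout is the "free rank".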

Unlike a decade ago, creating a page (and thereby getting free PageRank
tokens) comes practically for free. One can create thousands upon thousands of
pages dynamically, if not millions, at zero marginal cost. Now one can suck in
as large a share of the tax pool as one wants just by creating a large farm of
pages; the share is limited only by the size of the farm. Well, this works
only if PageRank is computed by the book; in reality, of course, it isn't.
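
A rough sketch of the farm effect, again assuming by-the-book PageRank with a
15% tax. The graph is made up: ten "honest" pages linked in a ring, one "spam"
page, and a farm of throwaway pages that each link to the spam page (which
links back to the farm so that no page is left without outlinks). All names
here are hypothetical.

```python
def pagerank(links, tax=0.15, iterations=200):
    """Textbook-style PageRank; assumes every page has at least one outlink."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Total rank stays 1, so the pool is `tax` and each page gets tax / n.
        new_rank = {page: tax / n for page in pages}
        for page in pages:
            share = (1.0 - tax) * rank[page] / len(links[page])
            for target in links[page]:
                new_rank[target] += share
        rank = new_rank
    return rank


def spam_rank(farm_size, honest_pages=10):
    """PageRank of the spam page when backed by farm_size (>= 1) farm pages."""
    links = {
        f"honest{i}": [f"honest{(i + 1) % honest_pages}"]
        for i in range(honest_pages)
    }
    farm = [f"farm{i}" for i in range(farm_size)]
    links["spam"] = farm          # spam page links back into its own farm
    for page in farm:
        links[page] = ["spam"]    # every farm page funnels rank to the spam page
    return pagerank(links)["spam"]


for size in (1, 10, 100):
    print(size, round(spam_rank(size), 4))
```

Nothing links into the farm from the honest pages, yet the spam page's rank
keeps climbing as the farm grows, because every new page brings its own equal
share of the tax pool with it.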

Given how well PageRank must have worked during the early days of the
internet, you can get an appreciation of how valuable a page on the internet
was in those times.

~~~
MBeuser
I think PageRank is nowadays a very relative performance indicator. And I
still look at it once in a while. However, it's not, and never was, a good
indicator of quality, because people have always gamed it.

I believe, though, that you've pointed out a very relevant development that
everyone has to keep in mind: creating a website (good or bad) does not cost
anything anymore, so what does it mean for me?

------
shubber
A practical outcome of the bound (that the PageRank change for a new link is
bounded by the PageRank of the source of the link) was the WordPress link
selling affair from a few years ago.

Every WP blog had a link at the bottom: "Proudly powered by WordPress" - which
meant there were lots of in-links to WordPress's main site. Links from
WordPress were therefore very influential, and the WP admins sold links to SEO
shops for a tidy sum. There was some outcry when this was discovered, as I
recall.

------
alenahemkova
One part of PageRank which I've always been confused about is how it factors
Google itself into the model. When computing PageRank, would Google add a node
for its own domain (i.e., google.com and all its search result pages) that
points to every page on the web? They must realize that they themselves are a
major driver of traffic on the web and that this would affect the assumptions
of the random surfer model (i.e., a random surfer would go back to Google to
do new searches occasionally). Doesn't this become a chicken-and-egg problem?

