
Google’s Secret and Linear Algebra (2004) [pdf] - espeed
http://verso.mat.uam.es/~pablo.fernandez/ems63-pablo-fernandez_final.pdf
======
orbifold
One nice thing is that this algorithm is very amenable to being augmented
with additional data. For example if your initial adjacency matrix gave equal
weight to each outgoing link, nothing is stopping you from measuring the
actual "transition probabilities" by pervasive tracking / custom DNS servers
such as 8.8.8.8 and so on. Moreover it is also easy to generate a personalised
model for an individual user by recording their activity over time and using
that to predict transition probabilities for websites they might never have
visited. In that way you can generate a filter bubble.
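
To make that concrete, here is a minimal sketch of the idea in plain Python,
not anything Google is known to run. The function name `pagerank`, the
damping factor of 0.85, and the click counts in `measured_clicks` are all
assumptions made up for illustration. Row-normalising a weight matrix turns
equal link weights or measured click counts into transition probabilities,
and a non-uniform `personalization` vector biases the random jump towards
one user's preferred pages.

```python
def pagerank(weights, damping=0.85, personalization=None,
             tol=1e-12, max_iter=1000):
    """Power iteration on a weighted link graph.

    weights[i][j] is the weight of the link from page i to page j
    (1 for "equal weight", or an observed click count).
    """
    n = len(weights)

    # Random-jump distribution: uniform by default, or a per-user vector.
    if personalization is None:
        teleport = [1.0 / n] * n
    else:
        total = sum(personalization)
        teleport = [p / total for p in personalization]

    # Row-normalise weights into transition probabilities.
    # A page with no outgoing links jumps according to `teleport`.
    transition = []
    for row in weights:
        s = sum(row)
        transition.append([w / s for w in row] if s > 0 else list(teleport))

    rank = list(teleport)
    for _ in range(max_iter):
        new_rank = [
            (1 - damping) * teleport[j]
            + damping * sum(rank[i] * transition[i][j] for i in range(n))
            for j in range(n)
        ]
        delta = sum(abs(a - b) for a, b in zip(new_rank, rank))
        rank = new_rank
        if delta < tol:
            break
    return rank


# Three pages that all link to each other with equal weight.
equal_links = [
    [0, 1, 1],
    [1, 0, 1],
    [1, 1, 0],
]

# Hypothetical click counts for the same links (made-up numbers).
measured_clicks = [
    [0, 9, 1],
    [5, 0, 5],
    [9, 1, 0],
]

uniform = pagerank(equal_links)
measured = pagerank(measured_clicks)
# A user whose random jumps always land on page 2.
personal = pagerank(equal_links, personalization=[0, 0, 1])
```

With equal weights the three pages tie. With the made-up click counts, page 2
drops below the others because few clicks lead to it. With the personalised
jump vector, page 2 comes out on top even though the link graph is unchanged,
which is the filter bubble effect in miniature.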

~~~
candiodari
You make it sound so ominous, and the result would in fact be a large
positive, rather than a negative.

I mean sure, this would require some metrics as to which links are actually
followed versus which ones aren't, but there's no need for "pervasive
tracking". Google could just get these from its own employees, or indeed try
to extract some of these metrics from 8.8.8.8 (i.e. from volunteers).

Given how big Google is, I'm sure the answer is "all of the above", but the
end result is: Google places the page where you eventually end up finding what
you want high on the search results page. And it does this by getting tiny
amounts of help for you from millions of others, anonymously.

So we have no reason to assume any of the methods are nefarious, and the end
result of it is definitely useful. More of this, please!

~~~
SquishyPanda23
> Google places the page where you eventually end up finding what you want
> high on the search results page.

I'm going to have to go ahead and disagree with you here. The smarter Google
tries to make their algorithm the more infuriatingly terrible it is for me to
use.

All I want is a feature to completely forget my YouTube and search history
when I am not logged in. I never want to see personal recommendations, because
for me personally they continue to get worse and worse.

~~~
deelowe
Maybe the issue is that SEO keeps getting better. I bet if you were to get
what you really wanted (a completely unpersonalized experience), the results
would be horrible, because traditional approaches to ranking are totally
useless these days.

~~~
SquishyPanda23
I have a good experience on new devices. I used to have a pretty good
experience deleting cookies every time I close a browser tab, but now even
that's degrading.

I should probably just buy a new device every few months like Steve Jobs did
with cars.

------
JacobiX
PageRank is an incredibly elegant algorithm. But in practice I suppose that
the actual Google search algorithm bears little resemblance to the
mathematically elegant PR algorithm, because it has to implement a lot of
custom tweaking, tuning, DMCA blacklists, right-to-be-forgotten lists, etc.

~~~
candiodari
It's like "machine learning" a spam filter, isn't it? It sounds so simple and
elegant, and then _somehow_ you end up with 200 rules, most of which are
entirely manual, and weekly tweaks to keep it working well.

------
jihadjihad
Another great little paper in the same vein is this one here (shameless plug
is that it's by a former professor of mine)
[https://www.rose-hulman.edu/~bryan/googleFinalVersionFixed.p...](https://www.rose-hulman.edu/~bryan/googleFinalVersionFixed.pdf)

------
SquishyPanda23
Bibliographic citation algorithms in general are pretty interesting. Jon
Kleinberg's work from this era and earlier is especially wonderful to read if
you're interested.

~~~
sytelus
Do you mean this one?
[https://www.cs.bgu.ac.il/~snean151/wiki.files/7-Authoritativ...](https://www.cs.bgu.ac.il/~snean151/wiki.files/7-AuthoritativeSourcesinaHyperlinkedEnvironment.pdf)

~~~
SquishyPanda23
Yeah, that's the paper cited in the original PageRank publication.

But a lot of his other stuff is pretty interesting. He has done a lot of work
that you can broadly classify as applications of graph theory to the social
world.

That includes the HITS paper, but also work on small world networks, social
networks, and several other papers on links and information flow related to
the web.

------
ASipos
I find it weird that this was published in 2007 but the first sentence is
"Some months ago newspapers all around the world considered Google’s plan to
go public", and that happened in 2004.

~~~
ipsun4
It could have been an internal document that was published a few years after
it was written. This could be due to indecision about whether or not to
publish the work. The document could also have originally been for internal
use to educate new hires.

