
Google's PageRank patent has expired - dfabulich
https://patents.google.com/patent/US6285999B1/en
======
bborud
Myth: PageRank was the secret to Google's success

It wasn't really that simple. For a brief while, perhaps it made a difference,
but within perhaps a span of 6 months, every decent search engine implemented
page rank in one form or another. It is a cute story for the muggles to focus
on. In reality search was already then about balancing a large number of
signals into a decent ranking formula. It was much, much harder than just
applying some magic algorithm and I think the people who built Google search
back in those days deserve a lot more credit. But that wasn't really a sexy
story, I guess.

To a much greater degree than any algorithm or formula: Google's ability to
execute, and to do so in cultural sympathy with the web, was more important.
Much has been said about Google's Not Invented Here, but this made all the
difference in the early days: you had to get things to where you could iterate
and innovate fast.

And I say that admiringly as someone who worked for one of their competitors
at the time. I used to be jealous of Google because they were managed by
people who were part of the Internet. Our management was alien to the Internet
(and our business model was to power search for portals mainly run by
horrible, stupid people in suits).

Google was the only search engine that was properly in tune with its audience:
focusing on the user.

(Disclosure: I worked for FAST, then Yahoo, then Google until I quit Google in
2009)

~~~
baccheion
Google's success was due to marketing and the opportunity to do so at massive
scale (wide reach of and frequent media coverage). Even PageRank's mention was
marketing, though the world at large had no idea what it meant.

Most latched on to Google because of their IPO and the marketing surrounding it.
Many heard and understood: money, billions, billionaire, slides and bicycles,
etc.

After, continued marketing kept it all going. Powerful illusion. Google is the
embodiment of candy coated BS. Even now, they mainly continue due to
continually marketing themselves as greatness..

..and paying for Chrome, Android, and first placement on iPhones.

They even had internal studies showing that while they are pervasive (most use
at least one of their products), there isn't any stickiness. That is, if some
other search engine had first placement on iPhones, the majority would be
using that one. It's like the site that previously appeared when searching for
a definition (dictionary.com?): while many used it and did so frequently, they
often didn't even realize where they were.

The world (and internet, even) at large is very different from the handful who
think themselves aligned with the masses (ie, source of revenue). It's even
funnier, as most thinking anyone cared about PageRank don't buy anything. No
spending = you don't exist. Anything else is a coincidental nod, stupidity, or
coincidence.

~~~
remote_phone
Yeah, this is completely wrong. I watched Google grow up having worked in the
area and they didn’t have any marketing at the beginning. They first became
the best search engine many years before they started adding ads and well
before their IPO.

~~~
baccheion
Being the best search engine was irrelevant to their rise. It's just a thing
that happened to also (supposedly) be there.

Supposedly, as its ranking algorithm was so heavily gamed by 2007 (already
risen; popularity incentivized effort to game) it was a complete joke.
Powerful illusion again, as they just covered it up and moved on. Also,
internal tests from around 2009-2010 showed Bing was seen by users as
producing better-quality results.

Not only did they not have much marketing at the beginning, they also didn't
have many users.

~~~
bborud
No, you are entirely wrong. There's no part of your "analysis" that has any
relation to reality. I don't understand why you keep insisting.

(I worked for two of Google's competitors in the years where they grew from a
student project to a huge business. I had the opportunity to take a peek of
the code of two other competitors. Many of my former colleagues helped build
Bing. I also worked at Google for a few years. I was there so I would know
something about it)

------
enriquto
Every now and then I remember that you can actually patent algorithms. Terror
ensues, and a general feeling of frustration and helplessness. Then I try to
forget this fact slowly, just to be able to continue living in this crazy
world.

As a mathematician, i see algorithms as examples of theorems, and the idea to
"patent" a theorem, a mathematical truth, is so foreign!

~~~
pron
> i see algorithms as examples of theorems, and the idea to "patent" a
> theorem, a mathematical truth, is so foreign!

I'm not defending the patenting of algorithms [1], but what is protected by
algorithm patents is _not_ their mathematical truth -- quite the opposite, in
fact, as I'll explain -- just as what is protected by mechanical patents is
not some physical truth.

You are free to publish a patented algorithm (provided you don't copy your
text verbatim from a copyrighted source), teach it, study it, etc. to spread
that truth and expand it. What you're not allowed to do without permission
from the patent owner is to _implement it and run it on a computer_ ; i.e.
what is protected is not a truth but a certain human action. This is the same
for mechanical inventions, which could be equally said to be "physical
truths": a mechanism built in this way would, according to the laws of
physics, behave in that particular way etc. Similarly, you are allowed to
publish and study that physical truth -- what you're not allowed to do is to
_build_ it.

Again, I'm not saying whether this is right or wrong, only pointing out that
it is not _truth_ that's protected by patents, but _application_. In fact, one
of the original motivations for patent protection is precisely to encourage
people to not keep discovered truths secret by promising them that profitable
_applications_ would be reserved to them for some period of time. So patents
were designed to help _spread_ truth in exchange for protecting applications.
That this is what patents are _intended_ to do is a fact.

It's fine to object to patents -- there are good arguments both in favor and
against -- but completely misunderstanding what patents are and what it is
that they protect is not one of them.

[1]: I'm not in principle against that, either, except that in practice few
patented algorithms rise to the level of inventiveness that patents are
intended to protect.

~~~
Schoolmeister
Does this imply that the PageRank algorithm I wrote for a uni assignment is
actually illegal? Or does this fall under "studying it"? Note that I'm from
Europe.

~~~
zild3d
Only if you made or used it commercially and Google had not decided that you
could use it

("What rights does a patent provide? A patent owner has the right to decide
who may – or may not – use the patented invention for the period in which the
invention is protected. In other words, patent protection means that the
invention cannot be commercially made, used, distributed, imported, or sold by
others without the patent owner's consent.")

And being out of the country, the US patent doesn't apply to you

("Is a patent valid in every country? Patents are territorial rights. In
general, the exclusive rights are only applicable in the country or region in
which a patent has been filed and granted, in accordance with the law of that
country or region.")

------
whymauri
There's a lot of discussion I see about PageRank for CS purposes, so I'd like
to give a slightly different perspective most people haven't heard about.

A colleague of mine at Caltech recently used the PageRank algorithm to model
the evolution of in-vivo neural networks [0, page 8]. The concept is pretty
good for modeling extensions of simple Hebbian learning and explaining some of
the ensemble dynamics (at least in the hippocampus). If you're interested in
further reading, there's more work on attractor states in biological learning
and associative memory (some of which is cited in the paper).

Brief overview: there's debate on whether brain network dynamics are stable or
unstable. In networks related to learning and over the timespan of weeks, this
experiment observed that ensemble-level dynamics change during learning _but_
some neurons are remarkably stable in how/when they fire. You can rank
stability with methods like PageRank, suggesting that connectivity implies
importance (and perhaps stability).

[0]
[https://www.biorxiv.org/content/biorxiv/early/2019/03/07/559...](https://www.biorxiv.org/content/biorxiv/early/2019/03/07/559104.full.pdf)

~~~
sitkack
Are you sure it is PageRank specifically? These fof maps were used and
investigated as far back as the early 1900s.

Search for "the page rank book" for a deeper analysis.

~~~
whymauri
He used a customized version of the PageRank algorithm that comes with MATLAB.
The paper is pretty dense with other interesting observations. I think one of
the novel contributions is experimental evidence that centrality is important
to learning distinct sequences of behaviors.

Consider three behaviors: moving right on a linear track, turning around, and
moving left. A reinforcement signal could be sugar water. If you place sugar
water at the right end of the track the mouse learns the sequence of moving
right, drinking, turning around, and moving left to leave. Now, in the
hippocampus different ensembles of time and place neurons are correlated with
each distinct activity. The inter-ensemble connectivity has to be learned, as
the sequence of actions becomes correlated not just in behavior but in the
brain as well. The neurons that most strongly inhabit inter-ensemble
connectivity tend to be those 'stable' and 'important' anchor neurons.

Yes, it has been studied before that some neurons are more important than
others. But the critical extension here is to long-term stability and learning
on an unprecedented time-scale and quantity of simultaneously recorded
neurons. The actual microscope itself is a significant technological
advancement in how it stays robust to long- and short-term motion artifacts
and was custom designed and built by the first-author.

There's also the experiment done on dynamics after traumatic damage to the
neural circuits encoding a learned behavior (this is one of the specialties of
the Lois group). But that starts deviating from the observation I wanted to
make relating to the thread.

The conclusion: "Overall, our findings suggest a model where the patterns of
activity of individual neurons gradually change over time while the activity
of groups of synchronously active neurons ensures the persistence of
representations."

Seems fairly innocuous, but I'm almost certain it will be controversial to
some parties.

------
Irishsteve
For extra context: back then you would manually submit your site to yahoo and
dmoz to end up in their results. They saw themselves as directories.

Google was all about crawling and building up the biggest dataset going.

Both approaches were victim to keyword stuffing (lots of keywords at the
bottom of the page and if you were lucky it was in a marquee tag).

Pagerank was a pretty decent extra value with a relevance score to promote
trust worthy sites. However there were similar techniques like hubs and
authorities from kleinberg.

On a side note his old research students / postdocs ended up leading key
initiatives at FB newsfeed and Pinterest discovery.
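
For anyone who hasn't seen hubs and authorities (HITS), here is a minimal
sketch of the idea as it is usually described: a page is a good hub if it links
to good authorities, and a good authority if good hubs link to it. The function
name `hits`, the fixed iteration count and the toy graph are all assumptions
made for illustration. Kleinberg's formulation runs on a query-specific
subgraph, which this sketch does not attempt to build.

```python
import math


def hits(out_links, iterations=50):
    """Return (hub, authority) scores for a graph given as {page: [targets]}."""
    pages = set(out_links)
    for targets in out_links.values():
        pages.update(targets)

    hub = {page: 1.0 for page in pages}
    authority = {page: 1.0 for page in pages}

    for _ in range(iterations):
        # Authority score: sum of the hub scores of pages linking in.
        authority = {page: 0.0 for page in pages}
        for source, targets in out_links.items():
            for target in targets:
                authority[target] += hub[source]

        # Hub score: sum of the authority scores of pages linked to.
        hub = {
            page: sum(authority[target] for target in out_links.get(page, []))
            for page in pages
        }

        # Rescale both vectors to unit length so the scores stay bounded.
        for scores in (authority, hub):
            norm = math.sqrt(sum(value * value for value in scores.values())) or 1.0
            for page in scores:
                scores[page] /= norm

    return hub, authority


# Hypothetical toy graph: two link pages pointing at three content pages.
toy_graph = {"portal": ["x", "y", "z"], "blog": ["x", "y"]}
hub_scores, authority_scores = hits(toy_graph)
```

Unlike PageRank, this gives every page two scores, so a directory-style page
can rank highly as a hub even if nobody links to it.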

~~~
jacquesm
Altavista and others beside were already crawling the web.

Yahoo and Dmoz were curated but Google definitely wasn't the first crawler.

As for the approaches being victim to keyword stuffing: that was because the
algorithms used were exclusively 'on page' without assigning a value to links.

~~~
calgoo
Yahoo's content was mostly US based if i remember correctly. The reason i
switched to Google from Altavista was because of the reduced ads and clean
look on the page. The results were about the same back then.

~~~
mercer
IIRC speed played a huge role too in part because of the cleaner page, but
also in part because of whatever voodoo they used to deliver results.

------
kumarvvr
To my knowledge, PR has long been succeeded by more sophisticated models. At
one time, perhaps even now, Google had a team of mathematicians constantly
tuning its ranking algorithm.

PR itself might be the foundation, but it definitely wouldn't be enough to
build another Google scale system.

~~~
luckylion
Links are still the single most important factor for ranking, though. I mean,
there's a lot of other stuff going on, and the content/information extraction
has advanced massively since the early days, but PR (or a similar concept)
seems to still be the biggest part of determining how relevant a page is.

~~~
kumarvvr
At least the future is not links. With the move towards reactive js based
clients, restful api based services, links hardly have the requisite data for
indexing. Even page content is extremely fickle to index.

Google must be having a js engine as part of its web indexing process.

~~~
luckylion
They do, and have been using it for quite some time. Not only will they
execute JS, they will also do AJAX requests and index the content that is
returned.

Since the overwhelming majority of the web is still static, I believe links
will be fine for the foreseeable future.

------
skunkworker
It's a unique milestone to see something so integral to the development of the
internet as we now interact with it become expired. It's also telling that we
might need to rethink the duration of the patent system as a whole because so
much can change within 20 years.

~~~
pishpash
It just makes me feel old, like the internet is last generation's stuff now.

~~~
phkahler
It is. Aside from looking up information and ordering things, the net is
almost completely useless to me now. Nothing is discoverable any more.

~~~
bredren
The Internet does not feel much like a wild place where you might encounter
anything anymore.

On the other hand, reality itself is much faster paced because of the
Internet.

You can go into the world see, try, learn, experience, be hurt, recover and do
it all again at a rate never before possible.

So real discovery, discovery of what it is to be alive is more accessible than
ever.

------
tanilama
But it is no longer that useful anyway. It is probably of more value to Google
as a legend for recruiting purposes than as an important technological asset
or advantage.

------
gigatexal
Cue all the whiteboard interviewers: “implement the page rank algorithm,
please.”

~~~
colmvp
Followed by “what is the time and space complexity?”

------
shoo
tangent: if anyone wants to read a good non-fiction book about the history of
steam, invention, the patent system, etc -- i can recommend William Rosen's
book "The Most Powerful Idea in the World: A Story of Steam, Industry, and
Invention"

------
ykevinator
It's too broad to say all algorithms should not be patentable. Amazon's one
click buy and page rank are obvious to "those skilled in the arts" but there
are plenty of algorithms that should be patentable. The problem is that the
uspto doesn't properly get an expert consensus on obviousness.

------
savrajsingh
I bet the purpose of this patent was to check the investor box of “yes we have
a patent,” they’ve never had to enforce it, and competitors were not really
discouraged by it. Thoughts?

~~~
utopcell
Patents can be defensive. Someone else could have patented it and used it
against them. Also (subjective opinion), they were kids: They thought
patenting it was important, like they thought asking Yahoo for a mere $1M to
buy it was a great deal.

------
puranjay
Slightly off topic, but has anyone seen a drop in traffic even with the same
rankings? Too many of the keywords I used to rank for now have featured
snippets. These snippets basically mean that any result after #2 gets little to
no traffic

It used to be that beyond page 1 was a waste, but I reckon now if you're not
#1 or the featured snippet, you don't really stand to gain much in terms of
traffic

------
Cyclone_
Would be a difficult patent to enforce against a competitor since it would be
difficult to tell if a backend is actually implementing this but I suppose
it's in their interest to still file for it.

------
umanwizard
Pretty amazing that you can patent an idea as simple as multiplying a matrix
by itself N times.

~~~
gilgoomesh
I realize that not everyone understands how patents work but this is
ridiculous. The patent claims don't mention matrices. Any implementation (like
matrices) is merely an embodiment (you can implement the patent without
matrices).

And even if that weren't true, the foundation of the patent system is applying
existing techniques to new applications. The background section of the patent
clearly details how this technique has been used in other applications.

I don't dispute that a lot of patents are trash but this is possibly the most
important patent of the last 30 years. That doesn't mean it invented
computers, mathematics and the internet, it just put some already good ideas
together. That is what invention is.

~~~
Certhas
I realize that not everyone understands how matrices work, but this is
ridiculous. The abstract of the patent:

"... the rank of a document is calculated from a constant representing the
probability that a browser through the database will randomly jump to the
document."

And the specific claim is:

"Looked at another way, the importance of a page is directly related to the
steady-state probability that a random web surfer ends up at the page after
following a large number of links."

This _explicitly_ describes a Markov Chain, which is naturally represented by
a matrix. A variety of versions of the linear equation are explicitly given in
the patent.

To claim you can implement the patent without matrices is, for all intents and
purposes, wrong. You can implement the same equations in a variety of ways,
but they are still matrix equations.

They patented the idea to apply random walks to ranking webpages. That's
arguably reasonably novel, though Wikipedia lists a number of predecessors.
But it was also an inevitable invention, because there is a large number of
people familiar with Random Walks/Markov Processes, they are routinely taught
to undergrads, and are used to model and analyse a vast number of processes
[1].

[1]
[https://en.wikipedia.org/wiki/Markov_chain#Applications](https://en.wikipedia.org/wiki/Markov_chain#Applications)
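
To make the matrix view concrete, here is a minimal sketch of the random-surfer
Markov chain solved by power iteration, i.e. repeatedly multiplying the rank
vector by the transition matrix. It is a textbook-style illustration, not the
patent's claims or Google's implementation. The name `pagerank_dense`, the
damping value of 0.85, the uniform jump from pages without out-links and the
three-page graph are all assumptions.

```python
def pagerank_dense(adjacency, damping=0.85, iterations=100):
    """Power iteration on the random-surfer Markov chain.

    adjacency[i][j] == 1 means page i links to page j.
    """
    n = len(adjacency)

    # Build the row-stochastic transition matrix of the chain.
    transition = []
    for row in adjacency:
        out_degree = sum(row)
        if out_degree == 0:
            # Assumption: from a page with no links the surfer jumps anywhere.
            transition.append([1.0 / n] * n)
        else:
            transition.append([value / out_degree for value in row])

    # Start from the uniform distribution and apply the chain repeatedly.
    rank = [1.0 / n] * n
    for _ in range(iterations):
        new_rank = []
        for j in range(n):
            inflow = sum(rank[i] * transition[i][j] for i in range(n))
            # With probability (1 - damping) the surfer jumps to a random page.
            new_rank.append((1.0 - damping) / n + damping * inflow)
        rank = new_rank
    return rank


# Hypothetical three-page web: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0.
links = [
    [0, 1, 1],
    [0, 0, 1],
    [1, 0, 0],
]
ranks = pagerank_dense(links)
```

Page 2 ends up with the highest rank because both other pages link to it, and
page 0 comes second because page 2 passes all of its rank back to it.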

~~~
gilgoomesh
What you quoted is not what the patent “claims”. The claims are the numbered
points in the section “Claims”. They are the only legally enforced section of
a patent and they are written in a very specific language. Everything else is
background or embodiment and has little to no legal value.

------
vietvu
Page Rank, Map Reduce... those infamous tools that were once game changing. But
now, I think there are many better ones, just less well known

------
gingabriska
Is there any page rank library launched on Github? Can exgooglers make such
libraries and raise themselves to open source fame?

~~~
visarga
It's been done many times, and it's easy to implement for small graphs.

Example:
[https://networkx.github.io/documentation/networkx-1.10/refer...](https://networkx.github.io/documentation/networkx-1.10/reference/generated/networkx.algorithms.link_analysis.pagerank_alg.pagerank.html)

~~~
bhl
As mentioned, the difficulty with PageRank isn’t the implementation, but
scaling it to billions of webpages. Doing so would require either extensive
knowledge of systems or approximation methods.
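
A rough sketch of why the single-machine version is the easy part: stored as an
adjacency list, one pass costs time proportional to pages plus links, and the
whole graph plus two rank tables must fit in memory. The function name
`pagerank_sparse`, the damping value, the stopping tolerance and the toy graph
are assumptions for illustration. Nothing here addresses partitioning the graph
across machines, which is where the real difficulty lies.

```python
def pagerank_sparse(out_links, damping=0.85, tolerance=1e-10, max_iterations=200):
    """PageRank over {page: [targets]}; each pass costs O(pages + links)."""
    pages = set(out_links)
    for targets in out_links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(max_iterations):
        # Rank held by pages without out-links is spread over all pages.
        dangling = sum(rank[page] for page in pages if not out_links.get(page))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {page: base for page in pages}

        # Each page splits its damped rank evenly among its targets.
        for source, targets in out_links.items():
            if not targets:
                continue
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share

        # Stop once the total change between passes is negligible.
        change = sum(abs(new_rank[page] - rank[page]) for page in pages)
        rank = new_rank
        if change < tolerance:
            break
    return rank


# Hypothetical four-page web.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank_sparse(toy_web)
```

On a graph with billions of pages the same loop needs the link structure and
the rank tables sharded across many machines, which is the systems problem
mentioned above.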

------
sinzone
A white page that would load fast when everyone had 56k modems was key

------
samcodes
They should have put some round ears on it...

------
baxtr
Maybe they should revert back to PR to yield better Search results

~~~
thrwayxyz
The Google results started going bad for me around when project Hummingbird
came along. We need a distributed open data alternative that can be tweaked
for your preferences transparently.

And we also need to start paying for the internet. The malvertising model we
have now is unsustainable and crippling the network in favour of Facebook and
Google.

~~~
bborud
What's stopping you from doing it?

------
sdan
So what now? Someone copies PageRank?

~~~
kbumsik
Nothing. Everybody has freely copied PageRank already. It has become one of
the basic teaching materials in graph theory.

For example, NetworkX, a popular graph library in Python, implements PageRank.
[1]

[1]
[https://networkx.github.io/documentation/networkx-1.10/refer...](https://networkx.github.io/documentation/networkx-1.10/reference/generated/networkx.algorithms.link_analysis.pagerank_alg.pagerank.html)

~~~
kreetx
How does that work in the real world - do google simply not care, or it's
"fair use" somehow?

~~~
rocqua
The patent is only for web search, so other applications don't infringe the
patent.

~~~
bbrb
First sentence of the abstract of the patent:

"A method assigns importance ranks to nodes in a linked database, such as any
database of documents containing citations, the world wide web or any other
hypermedia database."

------
dang
" _Please don 't post shallow dismissals, especially of other people's work. A
good critical comment teaches us something._"

[https://news.ycombinator.com/newsguidelines.html](https://news.ycombinator.com/newsguidelines.html)

Edit: bad call. Detached from
[https://news.ycombinator.com/item?id=20067782](https://news.ycombinator.com/item?id=20067782)
and marked off-topic.

~~~
umanwizard
Sorry dang, I respect your moderation a lot but I have to disagree here that
my comment is a shallow dismissal that can’t teach anyone anything.

PageRank really is a very simple idea based on elementary linear algebra, a
fact many people might not have known. Thus my comment could inspire a curious
person to go read more about how PageRank works instead of fearing that there
is a Ph.D worth of prerequisites.

Furthermore, it is a relevant comment on the US patent system.

By the way, I think PageRank is an _incredibly_ important development in the
history of technology, and took a fair bit of ingenuity to think up. I’m not
dismissive of it at all. But it is also very _simple_ , which doesn’t
contradict any of the above. And I don’t think it should be patentable, just
like I don’t think other simple and intuitive applications of matrix
multiplication (like the chain rule from calculus, for example) should be
patentable either.

~~~
dang
I'm persuaded. Sorry!

------
burtonator
Here's the raw truth that no one in the valley will admit.

It's about two things:

1\. Execution.

2\. Luck.

... but it's mostly execution.

If you have a BRILLIANT idea but can't execute you're dead.

There are tons of companies in the history books as examples.

BUT... if you can EXECUTE, you can take a shitty idea, abandon it, pivot,
measure, and focus BACK on a good idea. Then when you have a good idea you can
keep executing it forward.

The rest is luck, but this can be worked on as luck is often timing + being
prepared.

~~~
MegaButts
I strongly disagree that execution matters more than luck. I ask myself would
I rather be good or lucky, and I will always choose luck. Because no matter
how good you are, if luck isn't on your side you're going to lose. You might
be the better programmer, but you still didn't get the job. You might have the
better proposal, but you still lost the bid. You might be the better
candidate, but you still lost the election. You might have a better product,
but your company still failed. You might eat right and exercise, but you still
got cancer.

I see this regularly play out when founders who found success on their first
startups attempt to repeat it, only to fail miserably. Granted, there are a
handful who manage to duplicate their success, but I know that some of these
are simply due to who you know from their first success (you could argue
that's better execution, but the number of unicorns I know that wouldn't exist
if it weren't for a lucky break from an investor previously befriended makes
me think otherwise). I do not mean to belittle the hard work and brilliance of
many successful founders, only to emphasize that luck was absolutely critical
in nearly every success story.

Obviously it's better to have both, but pretending that you can outmaneuver
the universe is an act of hubris. I think people pontificating that execution
matters more than luck are too arrogant to realize how lucky they are, or want
to believe it matters more because it's reassuring to believe that we are in
control of our destinies.

There are things you can do to increase your exposure to luck, but it's
ultimately something beyond your control. The world isn't meritocratic.

~~~
bartimus
To get lucky is something different from being a person who "has luck". The
latter doesn't exist. Unless you believe in supernatural things.

Everyone has the same chance to get lucky. You can position yourself to
optimize your chances. The question remains if you have the necessary
abilities to take advantage of it.

~~~
MegaButts
Luck is random. Nobody has luck, but the guy who won the lottery got lucky.

------
rakibtg
Because they don't need it anymore.

------
rawoke083600
Back to keywords in meta and h1 then ? /s

~~~
quickthrower2
That's still table stakes for SEO. Although not sure being in <h1> matters,
but doesn't do any harm.

------
jacquesm
For better or worse, pagerank destroyed the value of inbound links and by
extension it killed the web. Now Google, not links, determines how your page is
found and while I'm sure links still have some value that is only one of a
very large number of inputs. I believe it is impossible to come up with some
way of ranking web results that does not in the end lead to a destruction of
the metric that one uses to rank with, if the metric initially gives great
results. Spammers will figure it out and will drown out the signal with noise.

~~~
utopcell
a bit exaggerated, but sadly there's a core of truth to it.

