It wasn't really that simple. For a brief while it perhaps made a difference, but within a span of six months or so, every decent search engine had implemented PageRank in one form or another. It's a cute story for the muggles to focus on. In reality, search was already then about balancing a large number of signals into a decent ranking formula. It was much, much harder than just applying some magic algorithm, and I think the people who built Google search back in those days deserve a lot more credit. But that wasn't really a sexy story, I guess.
To a much greater degree than any algorithm or formula, what mattered was Google's ability to execute, and to do so in cultural sympathy with the web. Much has been said about Google's Not Invented Here syndrome, but this made all the difference in the early days: you had to get things to where you could iterate and innovate fast.
And I say that admiringly as someone who worked for one of their competitors at the time. I used to be jealous of Google because they were managed by people who were part of the Internet. Our management was alien to the Internet (and our business model was to power search for portals mainly run by horrible, stupid people in suits).
Google was the only search engine that was properly in tune with its audience: focusing on the user.
(Disclosure: I worked for FAST, then Yahoo, then Google until I quit Google in 2009)
We are a 25 person tech startup and while our culture is amazing we struggle with execution. This is an ongoing struggle for us; it’s so easy for a young startup to misunderstand the importance of execution.
Is reading books enough, if so, which? Should we hire a COO from a company with a history of excellent execution (how to tell?). Are there courses to take? Or is it just about prioritizing excellent execution with continuous learning?
Some resources that have helped so far: Scaling Up (book) and the First Round Review blog.
I've seen several startups suffer from not being able to decide what they were. Engineering teams fractured because the company wanted half of them working on their bottom line and the other half working on some offshoot product.
Even gigantic companies need that: notice how people criticize Google for creating and killing way too many products, while at the same time praising their minimalist webpage (since 1998), early GMail, etc. Same with Apple when Jobs returned and streamlined their product line, etc.
Prioritisation is about deciding what not to do. Forget the BS story about putting rocks and sand in a jar, where the secret is to put the big rocks in first so that everything fits. That's not how you need to do prioritisation, because order does not appreciably change the amount of time something takes. The secret is to put only the big rocks in. Period. You go 6 times faster because you have 6 things you could do, but you only do one of them.
Now the real kicker is that the only way to determine which one thing to do is to do analysis on all 6. An HN post is not really a reasonable way to describe this. However, consider a requirements discovery to be like a tree. Requirements are discovered at a particular rate, as you work on something. Discovery of one requirement leads to discovery of new requirements. It's a feedback loop. Pruning the tree as early as you can leads to significant gains later on. So while you can't actually get the 6x development time by avoiding 5/6ths of the requirements, you can pretty easily get 2-3x gains.
BTW, for anyone interested in a more rigorous approach, consider taking something like Littlewood's model of defect discovery and assuming that requirements discovery has a similar curve. Littlewood's model is very naive, but I've found that it still has a lot of value. Again, sorry for cryptic hints here, but I don't have time to write a book on it (which it would certainly take...)
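Still, a toy simulation can make the tree/feedback idea concrete. This is a minimal sketch, not Littlewood's model; every number in it (branching probability, child counts) is made up for illustration:

    import random

    def total_requirements(n_roots, branch_p=0.4, max_children=3, seed=1):
        # Each requirement you work on reveals 1..max_children new
        # requirements with probability branch_p (the feedback loop).
        # The total discovered is a rough proxy for effort.
        rng = random.Random(seed)
        pending, total = n_roots, n_roots
        while pending:
            pending -= 1
            if rng.random() < branch_p:
                kids = rng.randint(1, max_children)
                pending += kids
                total += kids
        return total

    six = total_requirements(6)  # attempt all six candidate things
    one = total_requirements(1)  # prune early: only the big rock
    print(six, one, round(six / one, 1))

The point of the sketch is just the feedback loop: every root you prune early also removes all the requirements it would have spawned later.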
Recently we have focused on just one objective and key result. It took years to get that far; we used to have so many. But even now, with just one objective, we still picked 25 initiatives to attempt to reach our goal. In retrospect it was an obvious fail, because we only executed well on a few. We did some initial analysis, but considering your comment I think we could have done much more analysis, cut much deeper, and picked just a few or even just one. This is radical thinking!
Thank you for getting deeper into this. Do you have any other hints on where I could learn more about the approach you are describing?
Most latched on to Google due to the marketing surrounding their IPO. Many heard and understood: money, billions, billionaire, slides and bicycles, etc.
Afterward, continued marketing kept it all going. Powerful illusion. Google is the embodiment of candy-coated BS. Even now, they mainly continue due to continually marketing themselves as greatness...
...and paying for Chrome, Android, and first placement on iPhones.
They even had internal studies showing that while they are pervasive (most use at least one of their products), there isn't any stickiness. That is, if some other search engine had first placement on iPhones, the majority would be using that one. It's like the site that previously appeared when searching for a definition (dictionary.com?): while many used it and did so frequently, they often didn't even realize where they were.
The world (and the internet, even) at large is very different from the handful who think themselves aligned with the masses (i.e., the source of revenue). It's even funnier, as most of those who think anyone cared about PageRank don't buy anything. No spending = you don't exist. Anything else is a coincidental nod or stupidity.
Google started as the project any hacker would dream to be part of, even for free. What it became after all that money changed it is a different story.
Most people just use what's there, and that happened to be Google's search engine. It's especially the case after the massive growth of internet users due to mobile (i.e., there were only ~300 million internet users in 1998 versus the billions online today).
Also, their growth after 2004 was linear, not meteoric/exponential, and they are now in decline: https://trends.google.com/trends/explore?date=all&geo=US&q=%....
If you are going to use a metric you have to show that it is relevant. It also helps if one agrees on what we are measuring.
The main search engines of the time, like Altavista, gave you a busy, cluttered experience as they pursued monetisation.
Google gave you a search box: http://web.archive.org/web/20000815052943/http://www.google.... Over time they stripped even that down.
Just as significantly, the search results were also uncluttered, and "good enough".
This approach was absolutely critical when they started out, when people would most often be using 56k modems, where every byte had a real impact on the end user experience.
They got larger maybe because they also kept going (they didn't have a choice; they tried to sell early on for pennies, though they spin the story to distract from what happened). The other engines were bought or sold out.
The death of many companies during the bust also made room for them and they were likely the face of a group effort to "keep it all moving." Facebook was a similar face.
It's funny how knowing makes it seem you don't know. You sound naive and brainwashed. But all it does is once again show how powerful an illusion can really be. I'd (and did) say similar things if I didn't dig deeper.
Supposedly, its ranking algorithm was so heavily gamed by 2007 (Google had already risen; popularity incentivized the effort to game it) that it was a complete joke. Powerful illusion again, as they just covered it up and moved on. Also, internal tests from around 2009-2010 showed that users saw Bing as producing better-quality results.
Not only did they not have much marketing at the beginning, they also didn't have many users.
(I worked for two of Google's competitors in the years when they grew from a student project to a huge business. I had the opportunity to take a peek at the code of two other competitors. Many of my former colleagues helped build Bing. I also worked at Google for a few years. I was there, so I know something about it.)
I don't understand why you are raving on about Chrome and iPhones. The iPhone was almost a decade away when Google started getting traction, and building a browser wasn't even at the idea stage.
Please edit such acerbic swipes out of your comments here. They break the site guidelines and lower the signal/noise ratio.
The general internet user is different: much less concerned with the underlying tech or anything else, and more concerned with other things. The general internet user thinks Facebook is the internet, doesn't realize they are online, and only uses online services to text and take/post pictures.
Android phones have a camera icon. Ever thought about which camera app is associated with it? Not really, if you're an Android user. You just use whatever is there. If it were another app, you'd be using that one. You wouldn't notice.
Most people don't spend time digging, unless it's something they are really into. Most also don't read reviews or research, though that's been changing over the years. They just go with whatever, unless important (to them).
What gets people confused is that they tend to think this was the only mechanism in use and that it was a solution that never evolved. Things also get confused by the fact that not everybody knew everything. (For instance, hardly anything in the "official" origin story of FAST is true, and you'll get conflicting stories depending on who you ask).
I don't think the Altavista engine was ever used. I think the people from Altavista ended up on Panama, Vespa, and possibly some on the Inktomi-based engine. The FAST and Inktomi engines were both in use for a short while for web search, and then the effort was split, so the FAST engine was used in what became Vespa (where PageRank isn't as good a mechanism as it is for the web). Vespa grew out of work at FAST that started around 2002-2003 to separate the more infrastructural bits of the search engine into reusable infrastructure components.
Eventually Inktomi was used for web search at Yahoo, simply because of geography (well, politics). Since Inktomi didn't really have anywhere to go, that pretty much sealed Yahoo's fate as a maker of web search. You might be thinking of Inktomi.
In 2004 Yahoo acquired several smaller companies which were working on algorithmic search and page linking around the same time that Google was. They released their own ranking algorithm called WebRank which was substantially based on the methods of PageRank.
FAST's web search used PageRank from late 1999.
Yeah, VESPA wouldn't have scaled back then, but the search engine we used was far more scalable than Inktomi since it was the same search engine we used for web search. We did hold the record for largest index for a while (to distract people from the fact that our ranking was lagging behind Google's :-)).
But the search engine itself wasn't really the point for VESPA. Also, PageRank wasn't really as relevant for the use-cases VESPA was for. In fact, ranking in small, special corpora is very different from ranking in web search. And in the case of small document corpora it is surprisingly hard, so one depended on tools to specialize search, ranking, and result processing.
I wrote the first implementation of the VESPA QRS with a couple of other guys, which I think was the second component in VESPA (if you count the fsearch/fdispatch as the first). I think this was the first step towards making easily customizable search. The big initial barrier was to convince people Java would be fast enough for this. (I was prepared for a 30% loss of performance in exchange for ease of extension. What we got was a 200% performance boost over the C++ implementation before even starting to optimize. But it was a bit of work to make it play nice with GC in Java and I remember David Jeske at Google refusing to believe me when I outlined how we'd done it :-))
An interesting question is what would have happened if Yahoo had chosen FAST web search instead of Inktomi. According to Jeff Dean, our search engine was the only competitor he was worried about (mentioned over lunch in May 2005 after I accepted a position at Google). Possibly because he didn't understand why it performed well. We made some fundamentally different design bets than Google (they bet RAM would become cheap fast, we bet that it wouldn't; they were right).
Inktomi was a technological dead end. That was a stupid choice by Yahoo top management based solely on geography and reflecting the ineptitude of top management when it came to technology.
To be quite frank, I think Yahoo would have flubbed web search either way. The only reason VESPA managed to survive at all was because it was being developed in Trondheim, Norway - far away from Sunnyvale - where we could get away with... well, bullshitting leaders and pretending to obey them while doing our own thing. Not that we weren't in deep doo-doo initially (we were in over our heads), but we had some really great people who were able to orchestrate the mess that was VESPA into something that worked, and then something that worked well.
Without mentioning any names, Yahoo had a problem with technologically inept leaders as well as too many useless middle managers. At the time just before we were acquired by Yahoo, it was quite clear that separating out important bits of the search engine into infrastructure components was key. Google had understood this early and done a few very important things (GFS, Protobuffers, MapReduce, Borg etc).
The funny thing was: the first two versions of our search engine (in 1998 and 1999) essentially used MapReduce for crawling and processing, but we did so with shell scripts and duct tape (it was a mess). Anything that could be turned into "sort and scan", as we thought of it, could be done fast. Including PageRank and deduplication - and deduplication was a much, much, much harder problem than PageRank.
And when I say shell scripts and duct tape: we used unix sort, pipes, shell scripts and various small programs to do "mapping" and "reduction" :-) (strictly speaking, we used our own version of UNIX sort to have the same sorting order on all platforms, but essentially unix sort). Management were only focused on short term sales to portals, so we just reimplemented the same primitives over and over and over in every piece of technology we made. Wasting a ton of time and effort.
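For anyone who hasn't seen the pattern, here's roughly what "sort and scan" looks like, sketched in Python rather than our shell scripts (the data and function names are invented for illustration); counting inbound links per page is the kind of job it handled:

    import itertools

    # "Map": emit one (key, value) record per hyperlink found on a page.
    def map_links(pages):
        for url, outlinks in pages:
            for target in outlinks:
                yield (target, 1)

    # "unix sort": bring identical keys together.
    records = sorted(map_links([
        ("a.com", ["b.com", "c.com"]),
        ("b.com", ["c.com"]),
    ]))

    # "Reduce": one scan over the sorted stream, one key-group at a time.
    for target, group in itertools.groupby(records, key=lambda r: r[0]):
        print(target, sum(v for _, v in group))  # inbound-link counts

Anything you can phrase this way (PageRank contributions, duplicate-detection signatures) fits the same pattern.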
I was working on a storage system at the time that was sort of a combination of GFS, MR and Borg (the design came from before the papers about GFS etc were published). The idea was to have a distributed storage on which you could execute code in a sandboxed environment on each node. Meaning that you send the code to where the data lives and process it locally in a parallel manner and stream output to other nodes in the system. After certain executives felt a need to get involved and dictate technology choices I figured the project was doomed and abandoned it. (It was, for a while, known as "the storage system that can't store stuff").
Today I think that my approach would have been too complex to be sufficiently easy to develop. There were certain things about GFS I really didn't like (too trusting of clients), but slicing the problem into distinct domains was the right thing to do. Also, Google had Chubby and we didn't.
As a mathematician, I see algorithms as examples of theorems, and the idea of "patenting" a theorem, a mathematical truth, is so foreign!
I'm not defending the patenting of algorithms, but what is protected by algorithm patents is not their mathematical truth -- quite the opposite, in fact, as I'll explain -- just as what is protected by mechanical patents is not some physical truth.
You are free to publish a patented algorithm (provided you don't copy your text verbatim from a copyrighted source), teach it, study it, etc. to spread that truth and expand it. What you're not allowed to do without permission from the patent owner is to implement it and run it on a computer; i.e. what is protected is not a truth but a certain human action. This is the same for mechanical inventions, which could be equally said to be "physical truths": a mechanism built in this way would, according to the laws of physics, behave in that particular way etc. Similarly, you are allowed to publish and study that physical truth -- what you're not allowed to do is to build it.
Again, I'm not saying whether this is right or wrong, only pointing out that it is not truth that's protected by patents, but application. In fact, one of the original motivations for patent protection is precisely to encourage people not to keep discovered truths secret, by promising them that profitable applications would be reserved to them for some period of time. So patents were designed to help spread truth in exchange for protecting applications. That this is what patents are intended to do is a fact.
It's fine to object to patents -- there are good arguments both in favor and against -- but completely misunderstanding what patents are and what it is that they protect is not one of them.
I'm not in principle against that, either, except that in practice few patented algorithms rise to the level of inventiveness that patents are intended to protect.
And just like with software patents, people come up with weird workarounds:
> Sun-and-planet motion. The spur-gear to the right, called the planet-gear, is tied to the center of the other, or sun-gear, by an arm which preserves a constant distance between their centers. This was used as a substitute for the crank in a steam engine by James Watt, after the use of the crank had been patented by another party. Each revolution of the planet-gear, which is rigidly attached to the connecting-rod, gives two to the sun-gear, which is keyed to the fly-wheel shaft.
If you look at the animation at the link, you can see it's just a crank with two extra gears attached.
Pretty interesting stuff. Although visually similar it is actually a different principle. And we can readily tell that the arm doesn't contribute to the drive as the sun is rotating faster than the arm would drive it.
Being granted exclusive access to put a fact to productive use (or any use) seems roughly equivalent to "owning the truth" to me.
One could argue about effectiveness, but it is a common historical interpretation that the choice between different kinds of IP has had a real impact, so the distinction is very much one with a difference.
No, that would be allowed. Please, if you want to form an opinion about patents, you should first learn what they are and what they protect.
("What rights does a patent provide?
A patent owner has the right to decide who may – or may
not – use the patented invention for the period in which
the invention is protected. In other words, patent
protection means that the invention cannot be
commercially made, used, distributed, imported, or sold
by others without the patent owner's consent.")
And if you're out of the country, the US patent doesn't apply to you.
("Is a patent valid in every country?
Patents are territorial rights. In general, the
exclusive rights are only applicable in the country or
region in which a patent has been filed and granted, in
accordance with the law of that country or region.")
Heck, Siemens used to have a German patent on the Internet, except that it wasn't called Internet but something like "making available of structured textual data representations via long-range data transmission", etc.
You don't need to be a mathematician to see that the current system of IP is completely contrary to the nature of the universe. It will be over soon.
I'd say that if they steal from the Chinese, they won't talk about it much (or say it was the other way around) and continue just the way it was in case they come up with something patentable again.
Exaggerated statements or claims not meant to be taken literally.
— The Oxford Dictionary
If I invent something, a 1 year headstart in the market should be plenty of reward.
It certainly does not help that they use the ambiguous term IP that does not really mean anything.
The universe or the patent system?
Algorithms can be extremely hard to discover and extremely valuable. Together these facts are why the patent system can usefully apply, same as traditional patent classes.
And lots of traditional types of patents are "algorithms" too, really.
It's not that different from, say, patenting a drug. The algorithm to manufacture [insert expensive drug here] probably isn't that hard to follow, but it was very hard to discover, and we want to encourage more people to discover more algorithms like this.
The real issue is that the people in the court system can't understand the difference between trivial, straightforward, somewhat advanced, and truly sophisticated algorithms. Also, since you don't need to risk health and life to test them, computer algorithms are typically easier to test than drugs, so it's likely that few-to-none of them satisfy the "difficult to invent" standard that drug molecules do. It costs something like a billion dollars to develop a drug; PageRank was made by a few guys in a room in a matter of months.
As another interesting point of comparison, a lot of old weird gun designs from the late 19th/early 20th century were made to get around patents. It's dumb to make a "blow-forward" pistol but there was a time when the straightforward "blow-back" system was patented. Whole countries used quite weird and suboptimal weapons for a long time simply to avoid patent entanglements. I'm sure to them it seemed similar - how can you patent putting these 2 pieces of metal together in this shape? These systems were not very complicated.
Imagine if Euclid's algorithm were patented. Or Pythagoras' theorem? Or the definition of the exterior derivative? Or the expression of a function as a Fourier series? All of these are valuable ideas that are non-trivial to come by. The same argument for software patents applies to each of these cases, and in all of them it is evident (to me) that these mathematical constructions cannot be patented, and that society cannot grant a monopoly on their usage to their discoverers.
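For a sense of scale, here is Euclid's algorithm in its entirety (a standard textbook version):

    def gcd(a, b):
        # Euclid's algorithm: replace (a, b) with (b, a mod b) until b is 0.
        while b:
            a, b = b, a % b
        return a

    print(gcd(1071, 462))  # -> 21

A handful of lines, immensely valuable, and (to my mind) obviously not something whose use society should fence off.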
You are forgetting that patents are only granted for limited time. If we applied the current system, those things would be patented only for 20 years after their discovery. Would it still be unimaginable?
Maybe your opinion is still the same, but the situation is not as clear-cut as it seems. For example, I would guess that if patents were abolished, industrial research budgets would somewhat decrease, because there would be less advantage in figuring things out first.
The original reason for patents was to enable more information to become public. When the patent system did not exist, the only way to protect inventions was to keep them secret. A patent is a way to separate information from the economic use of that information. Without patents you must protect the information itself if you want exclusivity.
Imagine that some extremely important new algorithm is not patented but kept a secret instead. Today it's possible to provide software as a service and keep the algorithm within protected servers.
> cannot grant monopoly of their usage to their discoverers.
The current patent system is too restrictive and should be reformed. Instead of a monopoly, it would be better to have mandatory licensing of algorithms for a short period - for example, 5 years, with some fancy auctioning system to discover the correct price.
If Euclid's algorithm were patented in a system similar to our modern one, he would have... cornered the market in ancient Greek cryptographic protocols for 20 years or so before other ancient mathematicians were allowed to join in?
This is an interesting point and I'm not sure it cuts the way you want it to.
The world had to wait thousands of years for each of these algorithms to be developed. There are broad reasons for this, but one of them was probably the fact that nobody had any incentive to develop them. Mankind literally had to wait around for some random nerd to come up with these in his spare time.
Now imagine that they were patentable and exploitable for money (for 20 years). Perhaps they would have been developed centuries or millennia earlier. You would have teams of mathematicians from 1000BC onward, working to try to make money inventing theorems and machines and medicines.
20 years under patent seems like a small cost to pay to have something invented possibly thousands of years earlier (or, not kept under trade secret, as e.g. Damascus steel was).
Drugs, according to the comment, are like an algorithm, with a simple set of inputs— hard to discover these inputs, but then easy to replicate. Yet we seem to feel differently about patenting drugs?
Do you have a response to that part?
As a mathematician with a patented algorithm / open source advocate... I'm extremely squeamish about this and not super comfortable with it. On the other hand... Cardano's formula was once a trade secret. He's dead and we know the trick now.
As others have pointed out, patents expire in 20 years or less. Nothing is stopping you from building on somebody else's patent (popular among trolls: a patent on applying your patent in a novel area, or a more optimized version, a dual version, etc, of your patent). In that situation you can't use your own patent without paying them... and neither can they use yours. Implementation might be a problem, but you can still prove your theorems.
Quick sort is from a time before the USA decided that algorithms could be patented. In fact, most foundational algorithms of computer science are from that time. IMO, the development of computer science would have been hampered for a couple of decades had it been possible to patent algorithms back then; we're very lucky that this wasn't the case.
Can be, most aren't. Most are specializations of a well known thing, applied to a new area. Page rank is (maybe) a good example of this. Not quite old enough to transfer myself back to the time, but diffusion on graphs and power methods have been around for a while.
Companies often request that any code you write that's a part of their contract work must not leave that project and they own all IP rights to every line of code you deliver.
Which isn't reasonable most of the time, because technically, if you imported a library from Python's standard library, you couldn't import that library again in another project without violating the contract. Or if you pasted in a snippet of code from a library's documentation, since you did it on their time, you can't use it again on another project.
I always do my best to get things reworded to differentiate generic code from business / trade secrets code, and then for the generic code they either license the code from me as MIT or I license the code from them as MIT.
In our world, you cannot patent theorems, but people publish them anyway. Why? For the honor of the human spirit, or to impress chicks, whatever.
You'll never have enough information to know what theorems went unpublished. You can only speak to the probability based on incentives. Public institutions and universities may be incentivized to publish theorems, but that same incentive doesn't exist in corporations. A valuable money-making theorem that could give competitors an advantage would never be published, unless a patent could be formed using the technique that circumvents the limitations by including other processes.
Even with the existing patent system, trade secrets still exist: WD-40, Coca-Cola, Twinkies...etc. When it comes to money, the default is to preserve.
Hiding a truth you have discovered is OK.
"Owning" a truth that you have told everyone is absurd.
I think you seriously overestimate the incentives to publish. In several computer science domains, and certainly the ones I work in, the academically published algorithms are often a decade or two behind the state-of-the-art that is never published. Valuable algorithm advances are often explicitly treated as trade secrets. As an equally common case, the inventor(s) simply have zero incentive, either personal or financial, to spend their time publishing it -- they did it to solve a problem, not to publish, so they prioritize spending time with their family etc. This disadvantages open source software, and academia spends much of its time reinventing algorithms already known in the industry. In my experience, surprisingly little hardcore algorithm R&D happens in academia, so any model of information dissemination that makes this assumption is going to be suboptimal.
As an example that is unrelated to my current work, I developed a set of novel algorithms that massively improved the efficient parallelization of graph traversals in 2009 -- a true step function in both scalability and throughput per CPU (I was working on supercomputers at the time). Fast forward a decade to 2019 and these algorithms still haven't shown up in academic literature even though people built systems based on them and they are superior to what is currently in the literature. In this case, the algorithms are not even secret. I've also learned some brilliant and as yet unpublished algorithms of unknown origin via these same oral traditions over the years. As a social dynamic, it feels inappropriate to publish an algorithm that you learned this way.
This is a challenging problem to solve. Companies spend serious money on algorithm R&D hoping to obtain a commercial advantage. Outside of that, publishing is often an unattractive use of one's personal time if you are not an academic. This reality disservices the software community at large and I'd like to arrive at a better solution, even though the current reality benefits me greatly as an insider who sees loads of amazing, unpublished computer science.
Austin Meyer, the maker of the X-Plane flight simulator, was sued for millions of dollars because his app used an in-app purchase option made available by an existing 3rd-party SDK.
There is a whole industry of inventing bogus software patents that take utterly trivial everyday business processes, like giving someone money for a product, dress them up in fancy lawyer talk and general descriptions of non-existent "systems" or "methods" without any example or prototype, and then sue people over it. In-app purchases? Better pay up! You have a fax machine or scanner? Pay! Storing business data on some electronic device for later retrieval? Better pay an arbitrarily high patent license fee!
There are multiple alternatives.
Remember what happens when drug companies re-discover a useful molecule that turns out to have “prior art” as a molecule that people have been consuming in some every-day herb or some-such. They can’t patent that molecule (because they didn’t invent it), and there’s no profit motive to go through FDA compliance for a non-patented drug.
But, there’s nothing stopping them from finding the “theorem” behind the “algorithm”—figuring out what it is about the molecule’s structure that makes it have the effect it has—and then discovering another molecule in the same class (another “algorithm” constructively proving the same “theorem”), and then patenting that.
Same is true for actual algorithms: if PageRank is patented, I can just look at the theorem behind it—efficient eigenvalue derivation—and then come up with a different constructive proof of it. It’s easy once you know it’s possible. And, because there are so many known isomorphisms between algorithms (e.g. between algorithms on pointer machines vs. RAM word machines) there are often “obvious” transformations of the algorithm that aren’t considered the same algorithm from the patent office’s POV. (And, I mean, technically they aren’t; they might have a very slightly higher time-complexity, by a factor of the inverse Ackermann function or something. But these are things that don’t matter in practice, just like the random extra bits that the drug companies tack onto their molecules don’t matter in practice.)
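To make that concrete, a hedged toy sketch: two different "constructive proofs" of the same ranking on a made-up three-page web, one via direct eigendecomposition and one via power iteration (the graph and the damping value are invented for the example):

    import numpy as np

    # Column-stochastic link matrix: M[i, j] = P(follow a link from j to i).
    M = np.array([[0.0, 0.5, 1.0],
                  [0.5, 0.0, 0.0],
                  [0.5, 0.5, 0.0]])
    d, N = 0.85, 3
    G = d * M + (1 - d) / N  # damped "Google matrix"

    # Proof 1: take the eigenvector for the dominant eigenvalue directly.
    w, v = np.linalg.eig(G)
    r1 = np.real(v[:, np.argmax(np.real(w))])
    r1 = r1 / r1.sum()

    # Proof 2: power iteration; never compute the spectrum at all.
    r2 = np.full(N, 1.0 / N)
    for _ in range(100):
        r2 = G @ r2

    print(np.allclose(r1, r2, atol=1e-6))  # True: same ranking

Same theorem, two constructions; a patent examiner may well treat them as different "methods".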
The FDA just approved the use of esketamine, which is basically a ketamine molecule cut in half (ketamine is a racemic mixture), and J&J is selling it for HUNDREDS of dollars per treatment, while regular Ketalar-brand ketamine is damn-near as cheap as saline.
 Arketamine (R(-) - isomer) https://upload.wikimedia.org/wikipedia/commons/thumb/d/df/Ar...
Esketamine (S(+) - isomer)
(This is a deep rabbit hole. All mechanical patents have the same theoretical equivalence to computer algorithms, though less obvious. The lines that define patentable subject matter do not have a rigorous basis, it is an arbitrary convention.)
This is also why so-called business method patents (i.e. doing X, but on the Internet), which are often improperly conflated with algorithm patents under the rubric of "software patents", are generally not patentable.
A colleague of mine at Caltech recently used the PageRank algorithm to model the evolution of in-vivo neural networks [0, page 8]. The concept is pretty good for modeling extensions of simple Hebbian learning and explaining some of the ensemble dynamics (at least in the hippocampus). If you're interested in further reading, there's more work on attractor states in biological learning and associative memory (some of which is cited in the paper).
Brief overview: there's debate on whether brain network dynamics are stable or unstable. In networks related to learning, and over a timespan of weeks, this experiment observed that ensemble-level dynamics change during learning but some neurons are remarkably stable in how/when they fire. You can rank stability with methods like PageRank, suggesting that connectivity implies importance (and perhaps stability).
Search for "the page rank book" for a deeper analysis.
Consider three behaviors: moving right on a linear track, turning around, and moving left. A reinforcement signal could be sugar water. If you place sugar water at the right end of the track, the mouse learns the sequence of moving right, drinking, turning around, and moving left to leave. Now, in the hippocampus, different ensembles of time and place neurons are correlated with each distinct activity. The inter-ensemble connectivity has to be learned, as the sequence of actions becomes correlated not just in behavior but in the brain as well. The neurons that most strongly inhabit inter-ensemble connectivity tend to be those 'stable' and 'important' anchor neurons.
Yes, it has been studied before that some neurons are more important than others. But the critical extension here is to long-term stability and learning, on an unprecedented time-scale and with an unprecedented quantity of simultaneously recorded neurons. The actual microscope itself is a significant technological advancement in how it stays robust to long- and short-term motion artifacts, and was custom designed and built by the first author.
There's also the experiment on dynamics after traumatic damage to the neural circuits encoding a learned behavior (this is one of the specialties of the Lois group). But that starts deviating from the observation I wanted to make relating to the thread.
The conclusion: "Overall, our findings suggest a model where the patterns of activity of individual neurons gradually change over time while the activity of groups of synchronously active neurons ensures the persistence of representations."
Seems fairly innocuous, but I'm almost certain it will be controversial to some parties.
Google was all about crawling and building up the biggest dataset going.
Both approaches fell victim to keyword stuffing (lots of keywords at the bottom of the page, and if you were lucky it was in a marquee tag).
PageRank added pretty decent extra value: a relevance score that promoted trustworthy sites. However, there were similar techniques, like hubs and authorities from Kleinberg.
On a side note, his old research students / postdocs ended up leading key initiatives at FB newsfeed and Pinterest discovery.
Yahoo and Dmoz were curated but Google definitely wasn't the first crawler.
As for the approaches being victim to keyword stuffing: that was because the algorithms used were exclusively 'on page' without assigning a value to links.
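A toy illustration of how easy pure on-page scoring is to game (the scoring rule and pages are invented for the example):

    # Naive "on page" relevance: count query-term occurrences.
    def on_page_score(query, page_text):
        words = page_text.lower().split()
        return sum(words.count(term) for term in query.lower().split())

    honest  = "used cars for sale in boston, browse our listings"
    stuffed = "cars cars cars cars cars cars cars " * 10  # keyword stuffing

    print(on_page_score("used cars", honest))   # small
    print(on_page_score("used cars", stuffed))  # huge: the stuffed page wins

A link-derived score like PageRank sits outside the page author's direct control, which is exactly what the on-page schemes lacked.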
There was a time when Yahoo also tried to get into the indexed search space, but never seemed to be a viable competitor against other players in the market. Once Google established their dominance, all bets were completely off.
Who’s the subject of this sentence? I didn’t know PageRank had academic descendants.
What’s the current research on PageRank like? I looked at Sergey Brin’s academic page a couple of months back and was surprised to see that people were working on nearest neighbors back then and still are now.
But that's from Kleinberg - the person I was referring to :-)
After that there is personalised PageRank / SALSA, which are probably the more widely known approaches to identifying trustworthy nodes in a graph.
PR itself might be the foundation, but it definitely wouldn't be enough to build another Google scale system.
Google must have a JS engine as part of its web indexing process.
Since the overwhelming majority of the web is still static, I believe links will be fine for the foreseeable future.
Also, I do a lot of work for SEO/Affiliates. The exact same tech with pretty much the same content ranks top 2 for highly competitive keywords when hosted on a very high PR domain, and top 100-1000 when hosted on a normal, on-topic domain. This is consistent over multiple content areas, and it's why the large media companies have begun to sell/rent folders/subdomains on their site. Instant ranking (until somebody else buys access to a larger media company), no risk, as Google doesn't consider "paid publishing" as against their guidelines.
On the other hand, reality itself is much faster paced because of the Internet.
You can go into the world see, try, learn, experience, be hurt, recover and do it all again at a rate never before possible.
So real discovery, discovery of what it is to be alive is more accessible than ever.
It used to be that beyond page 1 was a waste, but I reckon now if you're not #1 or the featured snippet, you don't really stand to gain much in terms of traffic.
And even if that weren't true, the foundation of the patent system is applying existing techniques to new applications. The background section of the patent clearly details how this technique has been used in other applications.
I don't dispute that a lot of patents are trash but this is possibly the most important patent of the last 30 years. That doesn't mean it invented computers, mathematics and the internet, it just put some already good ideas together. That is what invention is.
"... the rank of a document is calculated from a constant representing the probability that a browser through the database will randomly jump to the document."
And the specific claim is:
"Looked at another way, the importance of a page is directly related to the steady-state probability that a random web surfer ends up at the page after following a large number of links."
This _explicitly_ describes a Markov Chain, which is naturally represented by a matrix. A variety of versions of the linear equation are explicitly given in the patent.
To claim you can implement the patent without matrices is, for all intents and purposes, wrong. You can implement the same equations in a variety of ways, but they are still matrix equations.
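For reference, the fixed point those passages describe can be written compactly (standard textbook notation, not quoted from the patent):

    \mathbf{r} = d\,M\,\mathbf{r} + \frac{1-d}{N}\,\mathbf{1},
    \qquad
    M_{ij} = \begin{cases} 1/L_j & \text{if page } j \text{ links to page } i \\ 0 & \text{otherwise} \end{cases}

where d is the damping factor, L_j is the number of outlinks of page j, and N is the number of pages. The solution r is exactly the steady-state distribution of the random-surfer Markov chain the patent text describes.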
They patented the idea of applying random walks to ranking webpages. That's arguably reasonably novel, though Wikipedia lists a number of predecessors. But it was also an inevitable invention, because there is a large number of people familiar with random walks/Markov processes; they are routinely taught to undergrads and are used to model and analyse a vast number of processes.
It is an interesting struggle to figure out what my objection to that point is. I think it is that we know exactly how hard it is to apply linear algebra to a problem - not everyone's cup of tea, but easily 10% of software engineers would be able to do it.
The truly groundbreaking part of Google was never the indexing - that was a problem that was going to get solved one way or another. The groundbreaking part is figuring out that search + low latency + advertising is a money printing machine and that tech favours the winner.
The mechanism to achieve search + low latency + advertising is important but to some degree unimpressive. If the other search engines at the time had realised the payoffs and how important latency was they'd have gone short text-only ads too and put more engineering time into the problem - maybe someone other than Google would be the search engine of the day.
And even if PageRank was the difference between Google and a hypothetical runner up, the difference of a better algorithm would be marginal. Decisive, but ultimately marginal.
It's almost as if the people behind Lycos/Excite/Altavista were all using the internet via a T1 connection from their unis...
As far as I can tell (and could tell the day I heard PageRank described a few years later), there was no difference between that and PageRank, although there is a huge practical difference in that scientific papers can only ever refer to those that were published before them (or at least were in preparation at the time), whereas web pages are edited and can point to any other web page.
The "reference rank" application is not a DAG because of the "in prepataion" links, although it is not very far - so the "jump to random paper" is much more important to produce a useful stationary distribution than in PageRank - but it is otherwise the same.
Page and Brin did a lot of interesting things, many of which weren't trivial, and were hugely rewarded for that by society. But PageRank was an application of an old idea to new medium, not a new idea - in a way that (on its face) should not deserve patent protection.
I remember Google's first days - the main selling point for the majority of people I knew was not "it finds what I want when other search engines don't"; people had learned to direct AltaVista properly, more or less. The selling point was "it gives an answer in milliseconds instead of tens of seconds". In fact, I remember complaints because it lacked the "and/or/not/near" and other features that AltaVista and Lycos already had.
What? That’s the entire idea of the patent: using repeated matrix multiplication to compute the relative “importance” of various nodes in a directed graph.
> you can implement the patent without matrices
If I remember correctly it is basically a way to calculate a random walk through a network without having to do the walk.
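A toy sketch of both views, on a made-up graph with arbitrary damping and step counts: actually doing the walk and counting visits, versus iterating the transition matrix to the same stationary distribution without ever walking:

    import random
    import numpy as np

    links = {0: [1, 2], 1: [2], 2: [0]}   # toy link graph, no dead ends
    d, N, steps = 0.85, 3, 200_000

    # Do the walk: simulate one random surfer and count visits.
    rng, node, visits = random.Random(0), 0, [0] * N
    for _ in range(steps):
        visits[node] += 1
        node = rng.choice(links[node]) if rng.random() < d else rng.randrange(N)
    walked = np.array(visits) / steps

    # Skip the walk: power-iterate r = d*M*r + (1-d)/N to the fixed point.
    M = np.zeros((N, N))
    for j, outs in links.items():
        for i in outs:
            M[i, j] = 1 / len(outs)
    r = np.full(N, 1 / N)
    for _ in range(100):
        r = d * (M @ r) + (1 - d) / N

    print(walked.round(3), r.round(3))  # agree to ~3 decimals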
And we also need to start paying for the internet. The malvertising model we have now is unsustainable and crippling the network in favour of Facebook and Google.
I wonder if some organization like the ACLU would take up such a project and release a paid version for power users.
For example, NetworkX, a popular graph library in Python, implements PageRank. 
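Minimal usage, on a made-up graph:

    import networkx as nx

    # Directed toy graph; nx.pagerank returns a {node: score} dict.
    G = nx.DiGraph([("a", "b"), ("b", "c"), ("c", "a"), ("a", "c")])
    ranks = nx.pagerank(G, alpha=0.85)  # alpha is the damping factor
    print(max(ranks, key=ranks.get))    # the most "important" node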
"A method assigns importance ranks to nodes in a linked database, such as any database of documents containing citations, the world wide web or any other hypermedia database."
It'll need to be a pretty sophisticated version of "content based search", or it'll just be overwhelmed by keyword stuffing and garbage auto-generated content.
Edit: bad call. Detached from https://news.ycombinator.com/item?id=20067782 and marked off-topic.
PageRank really is a very simple idea based on elementary linear algebra, a fact many people might not have known. Thus my comment could inspire a curious person to go read more about how PageRank works instead of fearing that there is a Ph.D worth of prerequisites.
Furthermore, it is a relevant comment on the US patent system.
By the way, I think PageRank is an incredibly important development in the history of technology, and took a fair bit of ingenuity to think up. I’m not dismissive of it at all. But it is also very simple, which doesn’t contradict any of the above. And I don’t think it should be patentable, just like I don’t think other simple and intuitive applications of matrix multiplication (like the chain rule from calculus, for example) should be patentable either.
It's about two things:
... but it's mostly execution.
If you have a BRILLIANT idea but can't execute you're dead.
There are tons of companies in the history books as examples.
BUT... if you can EXECUTE, you can take a shitty idea, abandon it, pivot, measure, and focus BACK on a good idea. Then when you have a good idea you can keep executing it forward.
The rest is luck, but this can be worked on, as luck is often timing + being prepared.
Admittedly, that is a critical part of being a founder/entrepreneur, otherwise knowing just how “stacked against you” the real world is would make you go running for the hills.
That said, I absolutely agree it is always about execution, with luck.
But to echo someone else’s comment: luck really is about opportunity + preparedness/ability to recognize a true opportunity.
For an analogy on the luck part: in poker, everyone is dealt the same hands in the long run. It is about being able to read both the cards and opponents effectively to routinely “win”.
On the execution part, it is all about focus: learning to know what is important and what really isn’t. Time spent on anything that doesn’t really matter slows you down. Of course, the hard part is knowing what matters and what doesn’t. My simple advice here is to ask yourself two questions: What am I not getting done by doing this? What is more important, this or what I am giving up?
One word of caution on the above: understand what is urgent, and what is important, and don’t let things that are important but not urgent slide.
I see this regularly play out when founders who found success on their first startups attempt to repeat it, only to fail miserably. Granted, there are a handful who manage to duplicate their success, but I know that some of these are simply due to who you know from their first success (you could argue that's better execution, but the number of unicorns I know that wouldn't exist if it weren't for a lucky break from an investor previously befriended makes me think otherwise). I do not mean to belittle the hard work and brilliance of many successful founders, only to emphasize that luck was absolutely critical in nearly every success story.
Obviously it's better to have both, but pretending that you can outmaneuver the universe is an act of hubris. I think people pontificating that execution matters more than luck are too arrogant to realize how lucky they are, or want to believe it matters more because it's reassuring to believe that we are in control of our destinies.
There are things you can do to increase your exposure to luck, but it's ultimately something beyond your control. The world isn't meritocratic.
Everyone has the same chance to get lucky. You can position yourself to optimize your chances. The question remains if you have the necessary abilities to take advantage of it.
Timing doesn’t have to be luck — it can come from a deep understanding of trends and knowing when the right time to strike is...
I largely consider execution to be what you can control, and luck to be what you can't. You can break those down much further, but I really feel those are the important categories.
Anyway, I guess agree to disagree. Cheers.