Knowing what the most common questions are doesn't help you find the corresponding answers. At best you can find the result that users click on most frequently, but if all of the results on the results page suck, then that's not going to help.
Google's advantage is simply billions of dollars and 20 years of R&D into NLP tech.
> Knowing what the most common questions are doesn't help you find the corresponding answers.
Not directly, but it does mean that, say, the second 50% of people that ask the same question will get a better answer than the first 50%.
Google has been able to build fairly accurate instant results based on which sites users were clicking on before. I'd say that a majority of simple general-knowledge queries are solved by quoting the first 3 sentences of the Wikipedia page that matches the search query.
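Just to illustrate the shape of that idea, here's a toy sketch (not anything Google actually runs) that quotes the opening of the Wikipedia page whose title matches the query, using the public Wikipedia REST summary endpoint; the function name and the three-sentence cutoff are mine:

    # Toy sketch only: quote the opening of the Wikipedia page whose title
    # matches the query, via the public REST summary endpoint. A real
    # instant-answer pipeline does far more ranking and entity resolution.
    import requests

    def wikipedia_instant_answer(query, n_sentences=3):
        title = query.strip().replace(" ", "_")
        url = f"https://en.wikipedia.org/api/rest_v1/page/summary/{title}"
        resp = requests.get(url, timeout=5)
        if resp.status_code != 200:
            return None  # no easy match -> fall back to normal results
        extract = resp.json().get("extract", "")
        return ". ".join(extract.split(". ")[:n_sentences])  # naive sentence split

    print(wikipedia_instant_answer("Bender (Futurama)"))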
But let's say there is no easy match to show a quick result for. Suppose that after 1000 queries, 98% of users clicked on one particular site on the first page and never went back to the search results. Google then A/B tests how many clicks result from showing that website at the top of the results page in an instant-result window. If clicks drop drastically, that's a sign most users are satisfied with that result. They were only able to do that because of the thousands of times people typed that query and interacted with the site.
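Roughly, that heuristic could be modeled like this; all thresholds, names, and the log format here are made up for illustration:

    # Toy model of the click-concentration heuristic described above.
    # All thresholds and the log format are invented for illustration.
    from collections import Counter

    MIN_QUERIES = 1000
    MIN_CLICK_SHARE = 0.98   # share of clicks going to a single result
    MAX_RETURN_RATE = 0.05   # users who bounced back to the results page

    def instant_answer_candidate(click_log):
        """click_log: list of (clicked_url, returned_to_results) per query."""
        if len(click_log) < MIN_QUERIES:
            return None
        clicks = Counter(url for url, _ in click_log)
        top_url, top_count = clicks.most_common(1)[0]
        bounced = sum(1 for url, back in click_log if url == top_url and back)
        if (top_count / len(click_log) >= MIN_CLICK_SHARE
                and bounced / top_count <= MAX_RETURN_RATE):
            return top_url  # promote: A/B test it as an instant-result box
        return None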
So I'd say that yes, having common questions asked over and over again does help you find the answer that users are looking for.
The funny/scary part here is that this may not be the correct answer. But it's the answer that satisfies the most users, and is therefore the one that will keep the most people coming back to the Google search engine.
> Knowing what the most common questions are doesn't help you find the corresponding answers. At best you can find the result that users click on most frequently
I doubt that users have a strict expectation of finding the correct (one interpretation of "corresponding") answer. Even if the most-clicked answers are still not correct, they may still be the ones that suck the least, which can still be acceptable, even desirable.
Keep in mind that from this perspective, the concept is similar to Google Translate: Google built a translator that, at least originally, doesn't understand language; instead, it applies (applied) a static statistical model to a large corpus of documents. While they certainly poured a large amount of money into it, its success can't be attributed purely to the economic factor.
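In the same spirit, here's a toy phrase-table "translator" that just picks whichever translation was seen most often in a parallel corpus (the phrases and counts below are invented). It shows how far pure frequency gets you with zero understanding:

    # Toy statistical translator in the spirit of early phrase-based MT:
    # for each source word, emit the translation seen most often in an
    # aligned corpus. No grammar, no understanding; counts are made up.
    from collections import Counter

    phrase_table = {
        "chat": Counter({"cat": 930, "chat": 70}),
        "noir": Counter({"black": 880, "dark": 120}),
    }

    def translate(words):
        return " ".join(
            phrase_table[w].most_common(1)[0][0] if w in phrase_table else w
            for w in words
        )

    print(translate(["chat", "noir"]))  # -> "cat black": word order untouched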
Well, in the given example the user asks a question, "who was the actor that...", and Google gives them the answer. This is really hard to achieve by accident, no matter how much traffic you get. It requires NLP machinery to answer a question; it's not a matching problem but an understanding problem.
There are at least three kinds of queries that a search engine has to handle: requests for websites, e.g. "Facebook"; traditional keyword searches across the web, e.g. "Twitter ban political ads"; and questions, e.g. "Who was the guy who voiced Bender?".
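A crude heuristic classifier makes the split concrete; the rules and the site list here are mine, not how any real engine does it:

    # Crude query classifier illustrating the three kinds of queries above.
    # The heuristics and the site list are invented for illustration.
    QUESTION_WORDS = {"who", "what", "when", "where", "why", "how", "which"}
    KNOWN_SITES = {"facebook", "twitter", "youtube", "wikipedia"}

    def classify_query(query):
        tokens = query.lower().rstrip("?").split()
        if len(tokens) == 1 and tokens[0] in KNOWN_SITES:
            return "navigational"   # e.g. "Facebook"
        if tokens and tokens[0] in QUESTION_WORDS:
            return "question"       # e.g. "Who was the guy who voiced Bender?"
        return "keyword"            # e.g. "Twitter ban political ads"

    for q in ["Facebook", "Twitter ban political ads",
              "Who was the guy who voiced Bender?"]:
        print(q, "->", classify_query(q))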
Google's advantage is simply billions of dollars and 20 years of R&D into NLP tech.