I'm happy to give more context on this. Some people don't put all their information needs into a single query. For example, instead of searching for [iphone wikipedia] to find the iPhone page on Wikipedia, they'll do one search for [iphone] and then their next search will be [wikipedia].
Google tries to help with those sorts of search sessions. For 0.3% of queries, if we see a search for a query A and then another search for query B, and there appear to be good results related to both A and B, then we may surface those results.
For example, I just did the search [iphone] and then the search [wikipedia]. In addition to the regular results for Wikipedia, Google also surfaces the page http://en.wikipedia.org/wiki/IPhone . A good way to see that Google is doing this is to look for a phrase like "You recently searched for iphone" under the newly-surfaced results. Go ahead and try it with the search [twilight] and then the search [wikipedia] for example.
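To make the A-then-B behavior concrete, here is a minimal sketch in Python. It is only an illustration of the idea described above, not Google's implementation: the `REFINABLE_TERMS` set, the `min_score` threshold, the toy index and its scores, and the choice to append the extra results after the regular ones are all assumptions made up for the example.

```python
from typing import Callable, List, Optional, Tuple

Result = Tuple[str, float]  # (url, relevance score in [0, 1]); scores are made up

# Hypothetical: earlier queries for which enough data exists to trigger the behavior.
REFINABLE_TERMS = {"iphone", "twilight"}


def session_results(
    previous_query: Optional[str],
    current_query: str,
    search: Callable[[str], List[Result]],
    min_score: float = 0.8,  # hypothetical cutoff for "good results"
) -> List[Result]:
    """Return regular results for B, plus good results for A and B together."""
    regular = search(current_query)
    if previous_query is None or previous_query.lower() not in REFINABLE_TERMS:
        return regular
    # Look for results related to both the previous and the current query.
    combined = search(f"{previous_query} {current_query}")
    extra = [r for r in combined if r[1] >= min_score and r not in regular]
    # Assumption: surfaced results are simply appended to the regular ones.
    return regular + extra


# Toy stand-in for a search backend.
TOY_INDEX = {
    "wikipedia": [("http://www.wikipedia.org/", 0.99)],
    "iphone wikipedia": [("http://en.wikipedia.org/wiki/IPhone", 0.95)],
}


def toy_search(query: str) -> List[Result]:
    return TOY_INDEX.get(query.lower(), [])
```

With this toy data, `session_results("iphone", "wikipedia", toy_search)` returns the Wikipedia home page plus the iPhone article, while a previous query that is not in `REFINABLE_TERMS` leaves the regular results unchanged.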
Between Gabriel's article and the WSJ article, words that are reported to provide this behavior include iphone, nexus, obama (but not romney, because there wasn't enough information for this word at the time the data was generated), tablet, twilight, computer, health, speech, iraq, sports, social security, and stock.
Just to reiterate, this algorithm affects 0.3% of searches on Google. Most Hacker News readers are savvy enough to search for [iphone wikipedia] instead of breaking that search into multiple queries. However, if you don't want Google to surface additional results that might help with your current query, Google has a support page telling how to turn off search history personalization: https://support.google.com/accounts/bin/answer.py?hl=en&...
When you say the algorithm affects 0.3% of searches on Google, do you mean that you select each query with a probability of 0.003 and pass it through this algorithm, or that the algorithm is used 100% of the time when query A is followed by query B, and these combinations account for 0.3% of all searches? If it is the latter, then the fact that Obama is a magic keyword may mean that you are biasing a very high percentage of political searches. I am sure an election involving an incumbent is an edge case that is very difficult to account for, but now that we know of this, I hope Google will try to correct the results to remove these specific accidental biases in the future.
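To spell out the two readings of "0.3%", here is a rough Python sketch. Both functions and the `eligible_pairs` set are hypothetical names invented for illustration; nothing here is known about how Google actually decides.

```python
import random


def sampled_trigger(rng: random.Random, rate: float = 0.003) -> bool:
    """Reading 1: every query is independently selected with probability `rate`."""
    return rng.random() < rate


def deterministic_trigger(previous_query: str, current_query: str,
                          eligible_pairs: set) -> bool:
    """Reading 2: the algorithm always fires for eligible (A, B) pairs,
    and those pairs happen to make up 0.3% of all searches."""
    return (previous_query, current_query) in eligible_pairs


# Hypothetical eligible pairs for the second reading.
pairs = {("obama", "wikipedia"), ("iphone", "wikipedia")}
```

Under the first reading, any given topic is affected about 0.3% of the time. Under the second, a topic whose keyword is in the eligible set is affected every time the pattern occurs, which is where the concern about political searches comes from.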
The solution for most of these sorts of things is just to refresh the data more quickly. Lots of queries, particularly head queries, are pretty stable in their characteristics over time. For a system that isn't absolutely critical to getting the query correct, it's perniciously seductive to think that you can just push out the data once, then refresh it every quarter or so.
We used to have a problem with spelling where some news event would make a person with an uncommon name famous, but Google would mistakenly correct it to a more common but incorrect name just because the spelling system hadn't ever seen this person's name before. We've fixed that issue and many other freshness-related things: http://googleblog.blogspot.com/2011/11/giving-you-fresher-mo... but this is an ongoing area of focus throughout a lot of our systems.
It's an interesting problem because, for many things, recomputing the data faster will only fix a handful of queries, so from a raw impact standpoint it hardly seems worth it. However, those queries end up being ones that are in the news and related to things that people care a lot about.
I get what you're saying, but I would absolutely love a completely plain version of Google, where I didn't have to opt out of anything ever again. Yes, sometimes it might be nice to get results that are in my neighbourhood or relevant to my interests, but sometimes I wish I could get what Google thinks is the globally most interesting response. Iterating and expanding on the query actually lets me learn about the subject.
What you're saying, though, is that Google is making an inference. DuckDuckGo does not want to infer anything and I tend to agree.
Also, why aren't all of the inserted links highlighted with the "You recently searched for" line?
Why wouldn't Romney be a magic keyword?
I appreciate your explanation but it doesn't really explain anything.
I don't work for Google or have any inside information, but I guess "magic keywords" become such after a certain (very large) threshold. Obama, having been the president of the United States for four years, probably crossed that threshold long ago, but it only became a political issue once the election process began.