
What people want (according to Google Suggest) - lkozma
http://www.lkozma.net/peoplewant.html
======
caryme
Google says they "apply a narrow set of removal policies for pornography,
violence, and hate speech."

What seems incongruous with this to me is that they also filter many (non-
explicit) gay-related searches.

For example, typing "is my son" yields:

is my son autistic? is my son smoking pot? is my son ready for kindergarten?
is my son gifted?

Doing a google trends search for these four phrases along with "is my son
gay?" shows that the gay query is dramatically more popular:
<http://bit.ly/aPNHxO>

It seems like in some of those cases, knowing that others are searching the
same thing could provide some comfort. I'm not sure why Google filters these
queries.

~~~
jonknee
I think they filter results that aren't themselves hate speech but that have
hate speech pointing at them or contained within the SERP. I could see anti-
gay sites that are considered hate speech having content about "Is my son gay"
and that contaminating the suggestion results.

~~~
pornel
If that's the case, it should be fixed by a tweak to the ranking algorithm (rank
informative sites higher than junk filled with hate), not by filtering out
query suggestions.

~~~
lukev
That's a very dangerous line to cross - ranking search results by, essentially,
ideology?

~~~
pornel
I think it would be appropriate to rank objective(ish) information about a
topic higher than extreme opinions about the topic.

~~~
lukev
But what's objective? The best way to marginalize an opposing viewpoint is to
paint them as extremists who shouldn't be taken seriously.

------
amirmc
_A second best method is to think hard of what problems other people might
have that you can solve. Let's do [this] by seeing what people are searching
on the internet_

That's an interesting idea and I found the results intriguing. However, the
search results themselves also need to be taken into account.

For example on the search term _"How can I have a blog"_. The author's opinion
is... _"These suggest that many people are still left out of technology
because they find services too complicated. Need for even simpler blogging,
site creation, etc. platforms?_ "

I disagree with this view since it presumes people already know the major
blogging platforms before they search. The search itself brings up Blogger and
Wordpress as the first two hits which may have satisfied users.

~~~
lkozma
Very good point, I agree. However, I also meant that even the way the
questions are formulated "how can I do a blog", "how can I blog on facebook"
betray a lack of basic understanding of these technologies. Perhaps not only
simpler platforms are needed, but better offline education about these topics
as well (via school, traditional media, etc.)

~~~
amirmc
I'd say that language barriers also play a role. I've met plenty of educated,
non-native English speakers who would say "How can I do a blog".

That queries display a lack of understanding shouldn't be too surprising. I
suspect people often search Google when they don't know about a topic. If my
first search were "How can I do a blog", my second may well be "What are the
differences between Blogger and Wordpress".

If people are asking about blogging on Facebook, I'd be interested to know what
they learned from the results of the search, i.e., has their understanding
increased? If so, then I'd argue we don't need further education.

Overall, I'm suggesting that perhaps Google is already solving these people's
problems by connecting them with answers (to some extent).

------
DotSauce
These terms are ranked by # of results, not search frequency! Nothing can be
gleaned except the total popularity of the individual words and phrases as
they are published on the web.

<http://www.google.com/search?q=how+can+i+be+on+made> About 25,280,000,000
results

<http://www.google.com/search?q=how+can+i+make+a+site> About 2,210,000,000
results

Both match up with the suggest data.

The words at the top of the list are generic: (will, free, help, made, etc.)

As you go down the list, the words become less generic: (dream, market,
people, speed, password, etc.)

You would be better off looking at Google AdWords data or
<http://google.com/trends>

~~~
lkozma
You are absolutely right.

What confused me was this paragraph from the Google Suggest page:

" Google Suggest returns search queries based on other users' search
activities. These searches are algorithmically determined based on a number of
purely objective factors (including popularity of search terms) without human
intervention. All of the queries shown in Suggest have been typed previously
by other Google users. "

I put a correction on the page, please let me know if you wouldn't want your
name to appear there.

However, I disagree that nothing can be learned from the data, because all
these are real queries that have been issued multiple times. At the same time,
I admit that the ordering is totally different from what I understood it to
be.

It seems that when you use Google Suggest, the 10 suggestions are ordered
differently than this number (the # of results) would indicate. Perhaps that
ordering reflects the true popularity of a query. However, those partial
orderings are insufficient to reconstruct the full ranked list of all queries.
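
To make the "partial orderings" point concrete, here is a rough sketch. It assumes each Suggest response is just a ranked list of queries; the function name and the sample lists are hypothetical, and this is not how Suggest itself works. The check asks whether a set of such lists pins down exactly one global ranking:

```python
from collections import defaultdict


def order_is_fully_determined(suggestion_lists):
    """Return True if the ranked lists allow exactly one global ordering.

    Each list is read as "earlier item outranks later item". The global
    order is unique only if, at every step of a topological sort, exactly
    one item has nothing left ranked above it.
    """
    successors = defaultdict(set)
    indegree = {}
    for ranked in suggestion_lists:
        for item in ranked:
            indegree.setdefault(item, 0)
        # Adjacent pairs are enough: the rest follows by transitivity.
        for higher, lower in zip(ranked, ranked[1:]):
            if lower not in successors[higher]:
                successors[higher].add(lower)
                indegree[lower] += 1

    ready = [item for item, degree in indegree.items() if degree == 0]
    placed = 0
    while ready:
        if len(ready) > 1:
            return False  # two candidates for the same rank: ambiguous
        current = ready.pop()
        placed += 1
        for nxt in successors[current]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    # If a cycle blocked the sort, not every item was placed.
    return placed == len(indegree)


if __name__ == "__main__":
    # Overlapping lists chain together into one ranking.
    print(order_is_fully_determined([["a", "b", "c"], ["b", "c", "d"]]))
    # Lists for different prefixes share no queries, so nothing relates them.
    print(order_is_fully_determined([
        ["how can i lose weight", "how can i lose 10 pounds"],
        ["how can i get pregnant", "how can i get some money"],
    ]))
```

Lists for different prefixes rarely overlap, so in practice most pairs of queries are never compared with each other.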

Thanks again for pointing out the (major) error in the article.

------
awolf
>People who search in "how can i ..." type of questions are probably not a
representative sample of the whole population. My guess is that they are less
tech-savvy users on average.

Hmm... I dunno. I search this way all the time. Mainly because I assume a lot
of other people do as well and, more importantly, it works. Am I the only one?

~~~
bad_user
Tech-savvy people know that Google eliminates stop-words and uses stemming
anyhow, although their algorithms for doing that aren't public, so ...

    lose weight

... is a lot more efficient to write than ...

    how can I lose weight

And while not yielding the same results, I think the first yields better
results (in this case).

Then you start adding words for refinement ...

    lose weight safe

See a pattern? ... It's a lot like adding tags, instead of formulating real
phrases.

~~~
awolf
But... the results for "lose weight" and "how to lose weight" are not the
same. Nor are the results for "earthquake" vs "what is an earthquake". In both
cases if I'm after the latter then it is the more verbose query that yields
the better results.

~~~
hboon
Stop word/phrase removal can't be done too simplistically. Words like
<how to> signify a different kind of intent. Also, if stop words are always
removed, queries like <who's who in america> wouldn't work properly.
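
A minimal sketch of why naive removal goes wrong, assuming a toy stop-word list (the list and the function name are made up for illustration; Google's actual handling isn't public):

```python
# Toy stop-word list, assumed for illustration only; not any real engine's list.
STOP_WORDS = {
    "how", "can", "i", "to", "who's", "who", "in", "what", "is", "an", "a", "the",
}


def strip_stop_words(query: str) -> str:
    """Naively drop every stop word from a query, keeping word order."""
    kept = [word for word in query.lower().split() if word not in STOP_WORDS]
    return " ".join(kept)


if __name__ == "__main__":
    for q in ["how can I lose weight", "what is an earthquake", "who's who in america"]:
        print(f"{q!r} -> {strip_stop_words(q)!r}")
```

With this list, "how can I lose weight" collapses to "lose weight" and "what is an earthquake" to "earthquake", so the question intent is gone, and "who's who in america" is reduced to just "america".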

------
what
154 how can i get more facebook friend

180 how can i get fans on facebook

Maybe someone should create lots of fake profiles to 1) sell friend requests
to losers who want to pad their friend list 2) sell fans for people's pages

EDIT: Actually, maybe you can sign up other Facebook users to sell themselves
as friends/fans and take a cut. I'm not sure how you would track it
though.

~~~
Tichy
There was a company who did that, I think months or years ago. Not sure if
they still operate.

------
jakevoytko
Wow. I have a new appreciation for disambiguation as a competitive advantage.
I know search queries are inexact, but I didn't realize that they might be
entered billions and trillions of times

    130 how can i work with google 1150000000

Do they want to be employed by Google, learn about SEO, or be a supplier of
Google's?

    166 how can i get some money 1020000000

Do they want to earn money, win money, or receive a wire transfer?

~~~
Ardit20
They just want to get some money!

------
dean
This is a little scary ...

Here are the current Google Suggest phrases on Google Canada for 'how can I':

how can i kill my baby

how can i make my breasts bigger

how can i lose weight fast

how can i lose 10 pounds in a week

how can i watch hulu in canada

how can i get my boyfriend hard

how can i get pregnant

how can i keep from singing lyrics

how can i download youtube videos

how can i tell if i am pregnant

I often use the search term 'how can i ...', and I've noticed the 'how can i
kill my baby' search phrase many times over about the past 6 months. It comes
and goes from the Suggest list. Very strange.

EDIT: formatting

~~~
euroclydon
If the Canucks are smart, that "how can I kill my baby?" search term is a
honey pot.

~~~
aasarava
Alternative explanation: The search is performed by new parents looking for
tips on what NOT to do when caring for their baby.

~~~
zaphar
Another alternative explanation: Pregnant women/teens who don't want to carry
the baby to term.

------
akadruid
Many of the questions that confused the author are down to language or
cultural issues I think.

A "C Form" is a common name for a tax form in several countries, including
India. Also short for concessional form.

CA is likely Chartered Accountant, an important accountancy qualification in
several countries.

How can I get office/windows/etc for free: If you're new on the internet, it's
going to be difficult to understand how such expensive software is ubiquitous.
Office in particular costs 6 months' salary in many parts of the world (G8
countries: can you imagine spending $20k on an office suite?) Plus of course
there are both valid and copyright-infringing methods of getting these
products for free.

How can I view a street? Google Street View - a valid question if you've
seen/heard of it but don't have the name.

how can i copyright a website: It seems almost no-one understands copyright
well, and copyright notices are everywhere on websites (despite copyright
being automatic)

how can i get the new facebook 2010: related to phased rollouts of new
features, or some scam or hoax, or both.

what version of windows 7/ office 2007 do I have? They mean which edition,
Basic/Ultimate etc. Not a stupid question at all.

how can i start share business?: How can I start investing in stocks/shares.

how can i tell how tall my son will be: It's actually easy to calculate a
fairly accurate prediction from the parents' heights, and height is pretty
important in many cultures.

------
todayiamme
I wonder if mining it regionally and seeing the unique searches will tell us
more about a culture than years of on the ground anthropological observations.
Of course, it assumes that people are kinda well off in the country and a
sizable majority uses the internet.

Further, this is something that networks would kill for. Most people google
the news they want to read. So, they could manage their editorial content
better with this.

This data can have so many awesome uses...

P.S. - This comparison between Google USA and Google India is kinda right on
the dot: <http://i.imgur.com/FPk2M.jpg> Everyone I know uses a computer for one
thing: porn and more porn. I think that this is a symptom of an extremely
sexually repressed society. [edit: I am sorry if that sounded like a
generalization, but what I intended to say is that, from first-hand experience,
an abnormally large number of people are fixated on porn. It's kinda
understandable with teenagers, but grown men and women? I know this stuff
since I am sorta an unofficial tech support, so I come upon GBs of stuff that
rattles the hell out of me. It might be personal bias, but I genuinely think
that porn is more prevalent here, due to social norms, than in a more liberal
country, kinda like Victorian England.]

~~~
GeneralMaximus
_[edit: I am sorry if that sounded like a generalization, but what I intended
to say is that, from first-hand experience, an abnormally large number of
people are fixated on porn. It's kinda understandable with teenagers, but
grown men and women? I know this stuff since I am sorta an unofficial tech
support, so I come upon GBs of stuff that rattles the hell out of me. It might
be personal bias, but I genuinely think that porn is more prevalent here, due
to social norms, than in a more liberal country, kinda like Victorian England.]_

Nah, you're completely right. Me and my friends have a joke: nobody has sex in
India; we're all children of God. I find it unbelievable that kids at my
college never discuss sex.

Edit: formatting.

~~~
todayiamme
It is so hard to explain this to people who haven't grown up down here, but I
was still wrong. We can't make generalizations at all.

Yes, India is a society where menstruation is a taboo. It is a "dirty thing"
that girls do, or something, to most men. Live-in relationships are against
god. Women have to be "pure" until marriage. Men can do whatever the fuck they
want as long as they aren't gay or effeminate, of course.

Ah yes, the joys of Indian "society", but do you know something?

I cherish people who do not live by these standards.

When you meet someone over here who treats women as human beings and considers
live-in relationships to be normal, then you have found an outlier, and they
tend to be pretty amazing folks.

------
scottyallen
This is a really clever way to look for business ideas. I've tried doing
something similar with twitter, searching for phrases like "I'm looking to
buy" and "I wish there was". There's a lot of noise though, and the publicly
searchable data set is way too small. This seems like a much better data set.
If you pair this with trends or some of the ads traffic data, you could
probably figure out what sort of search volume these searches are getting.

Seems like there's a real opportunity to take the tactics that Demand Media
uses to generate ideas for articles/videos and apply it more broadly to
products (see <http://www.wired.com/magazine/2009/10/ff_demandmedia/> for a
good article about Demand Media's approach). Instead of trying to generate
ideas on what products people might like and test if they actually do (using
an MVP or dry testing), instead figure out what they're looking for that
they're not finding, and offer that.

------
IgorPartola
16 how can i search in google 2690000000

This one is just curious.

~~~
gurtwo
Probably meaning 'how can I better search in Google'. Can the users be trained
into making better, more specific queries? At school, maybe?

~~~
IgorPartola
Believe it or not I learned some tricks I use daily with Google from an
otherwise computer illiterate English teacher in high school: using quotes and
- to modify how the search works.

------
SandB0x
I found

46 how can i find my son

55 how can i get my son back

very sad.

~~~
IgorPartola
I wonder if at least some of the time #55 refers more to getting him back
mentally, as in back to the state before puberty hit.

#46 is indeed very sad.

~~~
igravious
No, it's in all likelihood that the parent-son bond has been broken and the
parent would like to fix that. They are both very sad, and no reference to
daughters in the list? Sons may stray further or abandon their parents more
than daughters.

~~~
masklinn
> No, it's in all likelihood that the parent-son bond has been broken and the
> parent would like to fix that

I'd have thought of a divorce gone bad, and a father barred from seeing his
son.

~~~
igravious
I hadn't thought of that but that could easily be the case as well. Very sad
also.

------
mredbord
Not that this isn't a cool analysis, but seems as if there's selection bias to
these data. Lots of psychographic segments will never start a search with "how
can I...".

~~~
lkozma
I think I mentioned this in the bullet points.

------
nivertech
This post reminded me of how I always used to kid my coworkers and friends.
When we were working together on the same PC and needed to download some
utility, I would go to the Google homepage and type: "Dear Google, please help
me. I am looking for some information regarding this utility program, called
XYZ. Do you, by any chance, know where I can download it from? Thank you very
much in advance!"

------
apollo
At first I thought the numbers were number of searches, but they're not. For
example, according to AdWords, "how can i be on made" is searched less than 10
times a month globally. The numbers shown are number of results.

~~~
lkozma
You are absolutely right, I added a correction. What confused me was the
wording on the Google Suggest help page and the num_queries parameter name, as
rbrcurtis mentioned. Please see my replies to this:
<http://news.ycombinator.com/item?id=1511824>

------
ryanricard
This is like the serious version of <http://failblog.org/tag/autocomplete-me/>

My favorite recently was "Do Not I...(ron clothes on body)"

------
snorkel
What _teenage_ people want.

------
Terretta
It's nice to share the list of sorted results -- but why on Scribd in print
format?

------
mcknz
Found poetry....

------
Aetius
35. Too funny. 59 is also hilarious.

And would you _look_ at all those Facebook queries. That alone should have
Google shaking in their boots!

