
Andrew Ng predicts half of web searches will soon be speech and images - thousandx
http://venturebeat.com/2014/09/20/ai-expert-predicts-half-of-web-searches-will-soon-be-speech-and-images/
======
biot
Summary: someone who is working on a product to search by speech and images
issues a press release predicting that people will soon have half their
searches being speech and images.

~~~
kenjackson
Well, Andrew Ng can work on pretty much anything he wants to in search. It's
not as if he is a random product manager assigned to this job. He picked this.
He put his money where his mouth is. I'd think that would give him more
credibility on the claim, not less.

~~~
ewzimm
Right. It's reasonable to question a claim if someone has a vested interest in
the outcome, but Baidu's future is looking pretty secure no matter what
interface people prefer for their searches. Researchers definitely have
biases, but I would much rather hear an educated opinion from a researcher who
has chosen to dedicate time and thought to a subject than almost anybody
else.

------
Dwolb
This idea of searching by speech or images seems ... uninspired. Claiming the
interface will change and we'll search by speech or images is a linear
progression from text searching.

How about framing the problem as proactive vs reactive searches? Piece
together enough fragmented data about me to know what song I'll have stuck in
my head and don't know the name of, recognize my email contains a course
syllabus and auto-populate my calendar with assignments and study times... a
million other proactive tasks all done with me in mind.

Geez, these guys are talking about changing the interface... try and get rid of
the interface altogether!

~~~
TillE
That's what makes Google Now so cool. It's not super advanced (yet?), but in
many circumstances it's quite good at automatically giving you the information
you need.

~~~
lazylizard
No, Google Now is not cool. I already have my mom nagging me. I don't need my
phone to chime in.

------
graycat
Let's consider only "safe for work" Internet content:

Google/Bing have done well with keyword/phrase searching with results sorted
by popularity and date.

Ng seems to be thinking that the big change will be having speech and images
coming from the users as their input to the search process. My guess is that
this will not be very important. I do guess, though, that search for Internet
content that itself consists of speech and images will become more important.

Yes, likely Ng can get a lot of pictures of what he knows to be, say, Ferraris
and use some of them as the _training set_ with a neural network to identify
Ferraris and test the training with the rest of the pictures. Okay. Maybe his
neural network will be able to identify Ferraris. So, he could repeat this
training for, say, 100,000 objects -- Fords, bread, airplanes, jewelry,
Victorian houses, .... Maybe there will be some value there.

My view is that the future of Internet search is quite different.

~~~
hnriot
Words are an incredibly efficient mechanism for communicating with Google.
Google image search is great for some things: say you find a picture of a
sculpture on Tumblr and it's not credited, you can use the photo as the input
to find the source. But taking photos of a Ferrari seems like a dreadfully
inefficient way to make a search. Maybe identifying flowers would be a better
example, but most cars have the make and model written in very clear letters
on the back.

~~~
graycat
As I understand it, you have described what Google does with words and
pictures well.

Ng wants neural networks to identify things, maybe Ferraris. Then he wants a
search user to send a picture as their input for the search they want to do.
So, then the user might be able to find more pictures of Ferraris. Maybe.

For what Ng is doing, I doubt that flowers would work because there are far
too many kinds of flowers, and they differ too much from one another.

Search by keywords/phrases by Google/Bing has worked very well for a huge
collection of Internet content.

But I am guessing that in a sense Ng is correct about images and sounds --
there stands to be a lot more such content on the Internet in the future.

Generally I'm guessing that there is also a huge collection of Internet
content, searches people want to do, and results they want to find where
search via keywords/phrases such as via Google/Bing ranges from poor down to
useless. Thus, my guess is that a new means of search is needed.

What do you think?

~~~
defen
Are you suggesting that there is a distinction between "searches people want
to do" and "results they want to find" (besides the purely functional one of
search -> result)? E.g., are there search results people want, even though they
don't know they want them? Unknown unknowns that people will retrospectively
be grateful for?

~~~
VLM
I'm guessing the work flow would be something like "I want intermittent
rotation" except probably not expressed as literately. Maybe a sketch or a lot
of babble along those terms gets searched on, whatever input format. If it
works, maybe you find a Geneva mechanism.

If you have enough domain experience to search for "continuous rotation rotary
intermittent rotary" then you'll find it, but if you are mechanically
illiterate you may not know what intermittent means or rotary... maybe.

It would be a truly amazing display of AI if, given a really poor sketch of
a Geneva mechanism, it found a really nice blueprint. I'd be impressed if
this found the hypoid gear in a differential. I'd be more impressed if someone
who doesn't understand the concept or reason for a Torsen differential was
nonetheless able to search for it.

As a concrete example, there's a pretty impressive Lego Torsen(-ish) diff out
there. It's easy to find if you google for the terms. I'd be impressed if you
could give a sketch to a search engine and find this Lego diff.

~~~
graycat
I'm not trying to compete with Google/Bing where their work with
keywords/phrases, _page rank_ popularity, and, say, date works well. And for a
huge pile of Internet content, search, and results, they do work well.

For the searches you mention, I believe that maybe for one of them it could be
possible to improve on Google/Bing, but I don't believe that _real AI_ would
be needed. For describing how to build such a search engine, that might take
more than the 10,000 character limit on HN posts!

------
dthal
I get why someone would think that speech is going to be more widely used: the
computing world is moving to devices - phones and tablets - that are awkward
to type on. Wearables (if they ever become a big thing) won't usually have a
keyboard at all. I don't really see why picture searches would be that big.

~~~
cryptoz
I think it depends on how we define what a 'search' is exactly. When you wear
a future Google Glass and it's discovering that you're drinking an iced coffee
instead of a regular coffee, did it do that by searching? If so, did it do 100
'searches' to discover this fact, where a search is really an image search
of static frames from your vision?

Or is a 'search' something that a person dictates to a computer to answer a
question?

------
bertoabreu
I can see this happening in a couple of years, especially for images, since
the world uploads thousands or even millions of images online. Then there is
all the face recognition software, which is not limited to people but extends
to places and things.

------
kenjackson
The problem with speech search is that talking can be disruptive.

But if they could do accurate speech recognition with a very quiet whisper
(maybe using lip reading technology too) then I could see it completely
dominating text searches in usage.

~~~
sixQuarks
Yeah, I kinda agree. It would be cool if our smartphones could read lips.

------
liamshaw
I'd get excited about image search technology if it could identify components
of pictures (e.g. a red house, an angry cat, etc). This means that when I
search for "happy family at beach" the search would actually look at the db of
pics to find a) a family b) a beach and c) indications of happiness. That'd
drastically cut my image search time down.

Text tags on images have a limit to the accuracy of the results.

------
mark_l_watson
I think that Andrew is making an easy prediction. Of course most search
queries will be over voice or image.

My wife and I usually use speech input for web search on our phones. Also, the
current Google image search is very nice. Have you tried using it? Go to
Google image search and drag a photo to the text input field. I have used this
to identify pictures of small mechanical parts and also to identify plants.

------
wslh
A lot of skepticism here, but I can say that children under 5 (who don't yet
read or write) are the main target for this kind of search.

~~~
bbwharris
So after a less-than-5-year-old grows up, why would they switch to a more
antiquated form of search? They now have a level of expectation that voice
search works, and when it doesn't, the software sucks.

~~~
wslh
Why more antiquated?

------
sixQuarks
Speech I can understand, but how does he figure pictures will be a huge way to
search? There aren't that many times where I've said to myself "I wish I could
just take a picture to search for similar items".

~~~
jacobwcarlson
If you think about wearables there's a more compelling case. If I walk past a
car I like I can look at it and get specs, price, who has it for sale, etc. Or
if I'm at the grocery store and wow, those tomatoes look amazing, I see
recipes. While it seems weird to search by image right now, it will probably
feel more natural once it's seamless; it certainly removes friction from the
experience.

~~~
hnriot
Or you could just type or say "Honda Civic" - is that really so hard?

~~~
icebraining
A prediction is about what people will do, not what they should do. Taking
that into account, is your objection relevant?

