I'm fairly certain Google can leapfrog Apple here: it's a machine learning company with big data built into its DNA vs. a computer company with Siri grafted on. Google simply has more training data and more expertise to build this on top of their already impressive voice recognition (which, like their translation facilities, just smoothly gets better and better).
I am also certain people will scream that Apple "invented" this despite the decades of public and private research on the problem and the fact that IBM's Watson (far more impressive than Siri) was released and won Jeopardy before Siri was even launched.
I'd hope nobody screams that Apple invented this, since it's a matter of public record that Siri started out as a DARPA project and was purchased by Apple a mere 18 months ago.
Very few people say that "Apple invented X", especially on geek sites. What we DO say, however, and are most often correct, is that "Apple popularized X", "Apple finally made the first version of X that masses of people want to use and do use", "Apple finally got X right", etc.
So what? They also make worse mistakes than that harmless one (it's not like who made the original invention matters much, except for proper historical attribution. I think of it in the way we usually think about idea vs. execution: the "original invention" is closer to the "idea", and whoever put it in our households did the actual "execution". I can use Linux GUIs, OS X, and Windows, but couldn't care less about Xerox's original UI).
Also, for some profession or other, we're all laypeople, and make equally bad or even graver mistakes. But we don't care that much about those. A computer geek feels superior to other people for knowing that Xerox invented the modern GUI, but he doesn't feel stupid for believing any amount of BS about medicine, economics, literature, etc.
"it's a machine learning company with big data built into its DNA"
Is that really an overwhelming advantage? On the one hand, "data" definitely helped Google build solid speech recognition, on the other, Google has shown us time and time again that big data isn't an advantage in creating graphical user interfaces. My sense is that the problems in an AI interface involve just as much design and interface expertise as they do data expertise.
No, he did not. He mentioned GUIs in another part of his comment --and then went on to comment on the AI interface:
"""Google has shown us time and time again that big data isn't an advantage in creating graphical user interfaces. My sense is that the problems in an AI interface involve just as much design and interface expertise as they do data expertise""".
Siri very much has an interface -- it's just not the graphical part. The voice interface is still an interface, and still needs to be designed as naturally and nicely as possible.
Data? What data?
Siri is AI which tries to understand what I said and then executes; it's an intelligent command-line processor. All the words in English and their meanings amount to very little data, nothing that Google has and nobody else does.
> Data? What data? Siri is AI which tries to understand what I said
By using a vast amount of data. Words, phrases, accents... it all involves crunching a lot of data. That's why Siri isn't actually intelligent: if you ask it a question it hasn't been programmed to answer, it doesn't work out what you're asking for; it just tells you that it doesn't understand.
It doesn't have to be specific. Google has extensive experience in crunching incomprehensibly huge datasets into meaningful, end-user responses. Apple does not.
Google has had billions of people asking it billions of questions, and then has seen which answers (webpages) they find most satisfactory (i.e. the ones they navigate to from the search results).
"It’s wise of Google to quickly develop its own Siri rival"
I totally disagree. The worst thing Google could do is quickly develop a copy-cat.
Google should slowly and carefully evaluate whether they need a Siri rival and then create something that their customers will use that enhances their core services.
To be fair it's not like Google has never done anything with voice before and saw this Siri thing and panicked and rushed to put out an equivalent. Google has been working to improve voice recognition for years, and Android already has significant voice features built-in (speech-to-text text input, Voice Dialer, Voice Search). Google has been exploring this field for a long time.
It's not crazy to package things in a way that's comparable to your competitors.
That raises an interesting question. Does voice recognition need to be tailored to the type of lossy encoding used? Since perceptual encoders are designed to capture what the human ear thinks it hears, rather than what the sound wave actually looks like, does an algorithm trained on one perceptual encoder apply well to material encoded with another?
> Apple: Takes risk, makes iPhone
> Google: I want some of that. Android Go!
> Apple: Takes risk, launches iPad
> Google: I want some of that. Android is tablet optimised
> Apple: Let's innovate here and add Siri
> Google: Voice? To control a phone? Let's do that!
It probably isn't like this at all but this is how it comes across. It seems that when Apple innovates, Google copies.
Bringing out your own version of insert item here is fine but it would be nice if Google didn't come across as plainly copying someone else. Apple has Siri, 3 months later: "Introducing Siri for Android! I mean Majel!"
It would be nice if the next version of Android had its own killer feature people could shout about instead of waiting for someone else to invent it first.
> Apple: Let's innovate here and add Siri
> Google: Voice? To control a phone? Let's do that!
No, Apple acquired a company that was using DARPA-funded research from SRI. Their software was in the iOS App Store and on its way to Android/BlackBerry when Apple bought them.
> It seems that when Apple innovates, Google copies.
From the other point of view, Android is way ahead and iPhone is catching up slowly. A tray for multitasking? Notification pulldown from the top statusbar? Microphone-on-keyboard for dictation? All Android features, some from day one.
Face it: everyone takes good ideas from wherever they're found and builds on them. Treating Apple as if it's some divine font of innovation that everyone else can only copy is absurd.
If the media sends that message across they are being disingenuous. It's not that different from saying Google introduces voice control -> Apple introduces the same "but better". Why can't Google's 1-2 step be seen the same now, as introducing something like Siri "but better"?
Apple usually gets away with waiting 2 years before they release an alternative to something Android has, and they get a pass because "they waited to make it better", while Android doesn't get the same defense even if they waited only a few months. Do you really think they started working on this just 2 months ago? A Google X employee said on Reddit that they've been working for a long time on AI that can beat the Turing test, and that it already beats 93% of the Turing tests.
> It's not that different from saying Google introduces voice control -> Apple introduces the same "but better".
iOS has had voice control since iOS 3. The point of Siri is not that it's voice control; it's that it's (1) better at recognizing speech and (2) stateful/context-based, in that it can take "followup actions" within the context of the previous command.
And Mac OS had voice commands for several years before that.
I'm getting sick of this fight of always trying to find who “invented” something. Innovation is not just inventing new concepts, that just doesn't happen that much, it's also using old stuff in new ways or applying it somewhere it hadn't been used before, or twisting it in some other way.
What's conceptually different between Siri and Eliza or the Emacs Doctor? Nothing, but nobody else put it in a phone as a way to control it when everybody was using just voice commands. That's innovation – which Google dismissed, by the way http://gigaom.com/apple/googles-andy-rubin-doesnt-think-siri....
As is pointed out in another comment here, Google has been working on voice recognition for a long time now and had voice commands prior to Siri. It was not conversational like Siri, as was mentioned in the article as well, but having a conversational AI is not something new either (there are probably hundreds of these).
Secondly, voice recognition and conversational AI are not new products invented by Apple. You seem to overestimate the importance of Siri in terms of the originality it brings. It indeed seems awesome to have and interact with, though I have not done that yet. I have an Android, and I am not even waiting for something similar because I find it unnecessary to have one and it might be intrusive to use in presence of others. So probably for the few who are like me, it is not a killer feature - not even close.
I was aware that Apple didn't invent Siri. They bought the company which made it, then tidied it up. The point still is that Apple recognised its use in one of their products; they took the punt and made it happen.
I am also an Android user. This feature isn't something I need.
Like I said in my original post, it probably isn't how it seems. The fact remains though that Apple saw an opportunity with voice commands on its mobile platform. Google may have been working on voice technology beforehand, but was it for Android?
Google is doing great on other things. Chrome, Chromebooks (maybe released a few years too early but they took a risk on it) and driverless cars.
With Android, I have yet to see it really innovate by itself. I should say that I do not follow Android development. I am just an Android user who casually looks at an iPhone from time to time to compare the two.
Yeah, definitely -- Google Voice Search and all the work they put into building up the database/quality of voice recognition have been in place for years. I've tested voice search with many of the Siri-like statements and commands and they all work. The only thing is the device doesn't talk back, so it seems like they're way past 50% complete on this 'feature'.
> it might be intrusive to use in presence of others
In my experience, it essentially looks the same as voice actions, except with a much higher success rate. The least socially intrusive action is whatever lets me transition back to the conversation the soonest, and voice actions win in terms of speed for a lot of use cases. Also, if you absolutely have to turn to your phone with someone else around, it's kind of courteous to telegraph exactly what you're doing.
You're ignoring pretty much all the situations that would stop me from wanting to use it: the vast majority of the time I use my phone for something other than conversations is spent in the presence of colleagues or family, or commuting, in situations where I would not be interacting with them but where talking to my phone would potentially disturb them. Voice actions are not an alternative; I never use them for exactly this reason. The alternative is my phone's touch interface.
If I "had to" interact with my phone in the middle of a conversation, it might be different, but that's a fringe case I very rarely encounter.
I can't get my friends to shut the hell up when I'm using voice search or music match software. I say "hang on a second" /beep/ "search for polar be" "dude we should get Taco Bell" "ars in Michigan".
Sure, but for someone who follows what universities and groups like SRI have been doing in the past ten years, it doesn't take a genius to see the connection.
Apple got there first. Good for them, and they fully deserve the advantage on having a functional version while others are still working on the beta. But let's not now extrapolate that and claim nobody would have got there if it wasn't for Apple.
Google has innovated plenty. To name a few things that Apple copied: notifications, cloud sync, OTA updates, and voice actions (as a part of Siri). Also, simply buying up an existing company is a pretty weak form of "innovation."
> Apple has Siri, 3 months later: "Introducing Siri for Android! I mean Majel!"
I think it's more like: Google has been researching conversational voice control for several years; Apple releases Siri; Google says "Oh dear they've beaten us to it, we'd better rush a release now even though we have only implemented 30% of what we planned."
For those who aren't aware, and it wasn't stated clearly in the article, Majel is for Majel Barrett-Roddenberry, who among other Star Trek roles was the voice of the computer throughout the series [1].
That's what the article means by the Star Trek link.
This is very exciting. Apple may have gotten to the market first, but Google has proven themselves to be an AI company at the core. They have a pretty good head start on some more complex intelligent tasks (image search, for example, via Google Goggles). It's not hard to see them surpassing Siri in quality.
Additionally, combining that with an open source platform such as Android really opens up the possibilities for what this could do, and that excites me. Google TV could be voice controlled: "Can you record Chuck tonight?". You could be standing in the kitchen talking to your house computer: "What was the next step in this recipe?". The real power is that you're not constrained to just your phone or tablet.
I would rather Google didn't work on this. If anything they need to
a) Improve the default TTS voice so it does not sound as jarring
b) Include intent-based inter-operation with apps
Ideally Google will provide only the services they do already, with mapping and searching, but will allow people to easily hook in and provide the rest.
I also tend to think that voice activation is mostly a gimmick outside of being in the car, and Vlingo already works excellently with a 'Hey Vlingo' realtime trigger.
Vlingo doesn't understand a single word I've tried saying to it (I have a Scandinavian accent), so while I don't really see myself using this stuff much, Vlingo is a non-starter unless its recognition gets vastly better. I wonder how large a part of the potential user base it has problems handling.
Apple makes most of its money by selling hardware, Google's main income comes from advertising.
Let's assume Majel equals or surpasses Siri, this would be a clear benefit to users. But how does this benefit Google? Siri can answer questions (using partners) without using a search engine, yes it's a limited use right now but surely that functionality will grow.
The million dollar question is how does Google make money with Majel?
I'm sure Google can come up with interesting ways to advertise through Majel. They'll probably use the unique voice data to profile users and target ads even more effectively.
I would love to finally see voice web browsing. The TTS and VR quality is the main issue. I've written HN voice navigation and article reading, but the .NET TTS is far from good.
Google's "Phonetic Arts" TTS engine sounds a lot better, don't you think? It seems pretty natural to me. Go ahead and try a Google Translate from English to Spanish, or the other way around.
The article gets it all wrong. The technology Siri is based on was only a project for a few years and ended in 2005, basically because it was a failure and already far behind search engines at the time.
The Majel thing is likely far more than Siri; rumors are that it gives 93% accuracy on the Turing test, and includes emotion and object detection technology.
Majel's voice would be awesome... there is a crapton of good voices to choose from and some can be recreated. We have HAL and SAL. My personal favorite used to be the lady who did the voice overs for Mechwarrior 2.
Still, I am not sure I want to live in a world of devices blurting out everything I or others ask about. I already tire of people who use their phones walkie-talkie style.