Voice input is a terrible UX, because as a user, you have no idea what the system can and can't do. Every single Voice UX I've used for input (Siri, Alexa, etc.) is a fumbling-around experience where inevitably I can find out the weather, turn on some music, and then hear some half-funny replies from the device because it doesn't actually know what to do. It can work 99% of the time, but if it breaks down in that 1% and I have to revert to a device with a screen, it kind of ruins the point.
Maybe Amazon is addressing this with their supposed Alexa with a 7" screen in the future.
What I want is improved Voice Output UX. I think Apple is trending towards this with the AirPods, where you can leave them in your ears all day and have Siri just talking to you, which is the only practical use of any of these things unless you're stoked about annoying everyone around you with your robot assistant awkwardly talking out loud all day.
Things said in our home all the time:
"Do you know what the weather is going to be like?"
"Is it going to rain today?"
"We need to buy some milk."
"The roast needs one and a half hours."
We speak all the time with each other and there is nothing weird about that.
I bought a Google Home and it's literally transformed our lives in the little details for the better.
Before, asking for the weather required one of us to find the phone, open it, and find the weather app.
It sounds banal, but if you have to do it every day because you take your kids to school, believe me, it's a godsend.
My wife was very sceptical about this and is certainly no tech fanatic. She now loves it because, as she said, she can listen to the radio just like in the old days, when all you had to do was turn on the radio.
My kids love talking with it.
And it's basically just a very practical assistant which is revolutionary in the little details, not some sci-fi scenario.
It's not disrupting anything; it's improving the quality of tech by letting us interface with technology in a way that removes yet another layer of abstraction.
You don't need to know what the system can do as long as it can do enough. Google Home certainly can, and it's just getting started.
Except that humans have an average IQ of 100. They understand context, they get what I'm trying to say. Try getting Siri to turn off all repeating alarms, just for today. Or turn off all alarms until 12pm.
"But that's not how Siri is supposed to work!"
Sure, fine. It's just a ux to a limited api. But then you can't also compare it to how we use the same input for humans. Robots aren't humans. Not by a long shot. They have an api, a very specific and limited set of capabilities. They are not flexible. They don't get context.
Voice interfaces have their place. Because once you know the api, the possibilities, voice may well be most practical. But not because we happen to talk to humans.
That's like complaining about a Commodore 64 not being able to run the same games as a PlayStation.
The discussion is not whether Alexa, Siri and Google Home are perfect AIs but whether they are good enough for some very basic things like those I described.
And no it's not because we talk to humans that they are good, it's because we interact with other humans through speech.
Consider a voice-operated light switch. Why do I need to tell the switch I want the lights to turn on when I enter a room? The house AI should know the ambient light level, the time of day, my location in the house, and my default lighting preferences, and lighting should "just work" unless I want to change something - which is when I can ask for it.
Alexa can't do any of this yet. I can give Alexa commands to turn lights on or off, but currently there isn't even a context for the current light state. So I can't say "Alexa, lights" and have Alexa work out whether that means "turn the lights on" or "turn the lights off" for the room I'm in.
IFTTT may eventually be able to do this, but it's far from transparent and straightforward.
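For what it's worth, the "Alexa, lights" case is mostly a matter of the hub keeping state. Here is a minimal sketch, assuming the hub already knows which room the speaker is in and tracks whether each room's lights are on. The `Room` and `House` classes and `handle_lights` are made up for illustration; this is not how Alexa or IFTTT actually work.

```python
from dataclasses import dataclass, field


@dataclass
class Room:
    name: str
    lights_on: bool = False


@dataclass
class House:
    # Hypothetical in-memory state; a real hub would query its devices.
    rooms: dict = field(default_factory=dict)

    def add_room(self, name, lights_on=False):
        self.rooms[name] = Room(name, lights_on)

    def handle_lights(self, speaker_room):
        """Resolve a bare "lights" command by toggling the speaker's room."""
        room = self.rooms[speaker_room]
        room.lights_on = not room.lights_on
        state = "on" if room.lights_on else "off"
        return f"turn the lights {state} in the {room.name}"


house = House()
house.add_room("bedroom", lights_on=False)
print(house.handle_lights("bedroom"))  # lights were off, so this turns them on
print(house.handle_lights("bedroom"))  # now they are on, so this turns them off
```

The hard part is not the toggle itself but getting trustworthy values for "which room" and "current state", which is exactly the context these devices lack today.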
It's about affordances. I have a reasonable idea what a human's affordances are. The current voice UI (vUI?) equivalent is a handful of dots of implemented functionality surrounded by huge areas of not-working-yet space.
Not only is there no map, there's no way to guess what might be on the map.
It's also about cognitive load. I'm typing this in a bedroom with a couple of 433MHz switches controlled by a remote - one for a light, one for a heater.
Using the remote takes no conscious load at all. When I had the switches controlled by Alexa, formulating a command took effort.
That is, unless I'm reading a book, in which case I need a light on. But not just any light: I'd probably just want the lamp next to me to shine on my book instead of the one that lights the whole room. I don't need the whole room lit, just my book. So now the contextual system needs to have cameras to see if I'm reading or just watching TV or playing on my phone or napping.
And when I walk into the bathroom in the middle of the night, I'm fine with the nightlight above the toilet. Turning on all the lights will wake me up and ruin my night vision. But my wife likes the light on. So now we need facial recognition to tell who is walking into the room.
Right now I have to get up and turn on a light if I'm on the couch reading. If we're talking effort, that's a hell of a lot more effort than saying "Alexa, turn on my reading lamp". And don't get me started on the effort it takes to try to find the missing remote...
I don't think you appreciate the things we used to have that got lost with digitalization and can now come back.
It's not just about what the AI can do, it's how it gets expressed.
However, the voice ux projects at big tech companies are inevitably about collecting training data.
Five years down the road that limited API is going to be a lot less limited.
But that's not really an inherent issue with voice. It's just that today's systems are mostly stuck with requiring relatively strict adherence to specific syntax. This will certainly improve over time.
I don't find Alexa as transformational as some people I know do but I find it useful enough.
For example, a while back I was using Assistant to look up showtimes for a movie at a specific theater, and just to see what would happen I said "show me a map" without giving any explicit context. I was expecting Assistant to come back with a search query or maybe pictures of generic maps in response, but to my surprise it actually understood that I was talking about a map of the theater and gave me that.
Of course, that specific response in that exact situation isn't in and of itself all that impressive, and for every time Assistant succeeds in figuring out what I meant in a situation like that there's maybe one or two other times when it has no clue, but the idea of being able to speak 100% naturally to a virtual assistant and having it "just work" is unbelievably cool.
As a parent in New England, you have to choose between 3 jackets, and when you are trying to get socks on two kids and can ask "Alexa, what's the weather today" and know the answer without having to pause and get a phone and let one of the kids escape, that's really nice.
With modern heating systems (specifically thermostatic radiator valves) we no longer need to ensure that the heating is prepared at the right time, so we don't need to know the weather.
With the prevalence of tumble driers washing doesn't need to hang out, so we don't need to know the weather.
Most modern clothing is suitable for a very large range of temperatures and good waterproofs can be comparatively small and lightweight, allowing them to be brought everywhere without burden, so we don't need to know the weather.
Cars and roads have improved significantly, the impact of the weather on a small or long journey is often negligible, so we don't need to know the weather.
More jobs are performed indoors now and where they aren't special machines and tooling have been made that make the job easier alongside significantly reducing the risk and discomfort that weather can cause, so we don't need to know the weather.
There remain some situations where knowing the weather is very important, but very few of them affect 'the general populace' in a consistent enough manner to cause them to waste time following it.
Then again, I'm British, and an American I've recently befriended has shown me how different attitudes to the employer-employee relationship can be.
Pretty much the story of modern civilization. Just step outside and check the temperature, and look up at the clouds and guess if it's gonna rain later. Wear a sweat shirt or waterproof windbreaker if in doubt. Or you can just not care about getting mildly cold or wet. I say this as someone who spent 25 years in Minnesota.
But sure, you don't need an umbrella either and we could all be driving open cars :)
Don't get one if you don't want to; the discussion is whether it's useful, and it is. Not just for the weather, not just for rain. For many little things in life.
That's one example. Schedules, playing music, guitar tuner, uke tuner, metric conversion when hands are dirty from cooking: plenty of other examples abound in our house.
Nothing complicated about it, it's just a reality for many people.
Not sure what you think you are proving here.
I didn't really think of rain in my original comment because when I was growing up we literally just walked in the rain.
Again, what's your point?
Both Alexa and Google Home have been great with a kid.
But then again, we live a few hours north of the location where they filmed the Battle of Hoth. Asking anyone, let alone a machine, "what is the weather outside" gives you little information about the weather in six hours.
There is something about the tangibility of information we use a lot. Instead of it being hidden behind login, app launch etc we just access it directly.
For example, I can check the weather for today on my phone with minimal fuss, clicking 2 or 3 times and getting much richer information (a full graph with the short-term forecast, a few hours in advance, etc.) on a single screen.
Another example: ordering a pizza. I prefer a GUI over calling and talking to some human. The options are clearer, and I can easily review what I'm ordering.
Not to say that it's impractical, but voice doesn't carry a lot of info for a lot of services. If it did, we'd prefer a phone call over a specific tactile interface for everything, and that's not the case...
I gave some pretty specific examples of what it's great for.
I am not saying it's great enough yet for much more complex things; no disagreement there. But unlike with the Seamless web or mobile interface, I could potentially, with just the command "order the usual from Joe's Pizza", skip a whole suite of interactions needed to order food today.
Telling Alexa "send me some paper towels" only works if you have established a protocol in writing with Amazon (i.e. I want an 8-pack of jumbo Bounty).
Your use case "how is the weather" is useful, but is it useful enough to be "the next big thing"? It may be big, but not smartphone big.
My experience with both devices is that there's no one Big Thing, except maybe music. There's a lot of stuff that just works surprisingly well by voice. Quoting my daughter: "Hey, Google, play Puff the Magic Dragon" - "What's the weather today?" - "Turn on/off the lights" - "Set a timer for 5 minutes" - "What's 17 tablespoons in cups?" - "How long does it take light to get to Mars?"
It's probably not smartphone big. But I think that in the long term, we're going to find that some kind of voice-based interactive device like this will become very common.
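Two of those questions are plain arithmetic, which is probably why they work so reliably by voice. A quick sketch, assuming US customary measures (16 tablespoons to the cup) and rough round figures for the Earth-Mars distance at closest and farthest approach. The function names and the distance constants are my own choices for illustration; the distances are approximate.

```python
TBSP_PER_US_CUP = 16  # US customary: 1 cup = 16 tablespoons
SPEED_OF_LIGHT_KM_S = 299_792.458  # exact by definition of the metre

# Approximate Earth-Mars distances in km (assumed round figures).
MARS_CLOSEST_KM = 54.6e6
MARS_FARTHEST_KM = 401e6


def tablespoons_to_cups(tbsp):
    """Convert US tablespoons to US cups."""
    return tbsp / TBSP_PER_US_CUP


def light_travel_minutes(distance_km):
    """Minutes for light to cover the given distance in a vacuum."""
    return distance_km / SPEED_OF_LIGHT_KM_S / 60


print(tablespoons_to_cups(17))
print(round(light_travel_minutes(MARS_CLOSEST_KM), 1))
print(round(light_travel_minutes(MARS_FARTHEST_KM), 1))
```

So 17 tablespoons is a bit over one cup, and the Mars answer is somewhere between roughly 3 and 22 minutes depending on where the planets are, which is the kind of caveat a spoken answer tends to drop.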
I cook a lot, timing is important.
I have fun playing games with my kids teaching them trivia, Google Home does that while we are free to do other things.
We listen to a lot of different music. Google Home allows me to stream from Spotify simply by saying the type I want.
My wife likes to listen to radio; she just tells Google Home to start a radio station.
Listening to the news: Google does a pretty good job assembling what is most important during the day.
You can ask it to play the latest "Startups for the rest of us" podcast simply by asking it to play that.
You can ask it when your next flight is.
How long it's going to take to get somewhere.
You can add things to a shopping list, which is important when you are a family.
And these are just things that it does out of the box. I don't care about switching on lights which it also can do and turn the heat up or down.
What I think most of you realize is that, just like the touchscreen removed a layer of abstraction to access technology because it allows for things humans do naturally, so does voice. It's extremely intuitive as there is nothing to learn. As voice gets better and better it's going to change a lot of things.
The sales numbers of Alexa are pretty telling too, and this is from all sorts of people, like my parents-in-law, who now have one they use and understand.
I don't care whether it's going to be the next big thing, but to claim, as the parent did, that voice is a bad interface is simply as wrong as it can be. And as for the claim that there is no value in what it can do besides a few things: well, it turns out, at least for my family, those few things are pretty useful.
"Ok Google, is it going to rain today?" is something I can (and do) ask at any point. No need to have a new device in every room of my house and no need to learn a different interface when driving in my car etc.
Oh god please. I can't understand how complicated radio is these days. Fuck me, I'm forever retuning or rescanning or some bullshit just to listen to something.
Weather and traffic reports are still a staple of radio broadcasts
Hope it was worth the convenience.
As if knowledge of me visiting a public site is comparable to Google having an always-on microphone that probably is riddled with 0days and has constant internet activity that can easily mask a malicious connection?
It only takes approximately 10 minutes of recorded audio to interpret individual keystrokes with ~96% accuracy.
Google / NSA / foreign state hackers / your next door neighbor could potentially find or pay for an exploit to the device and have all of your passwords down within a few days. All of your children's passwords.
It can reconstruct every message you've typed in range of the microphone. Ergo unless you rely on bookmarks and not the address bar, it can in fact track which websites you visit. Furthermore, it could interact with ultrasonic sounds emitted by your computer speakers to track your ad experience, and if you live in a malevolent state, to locate you. 
If your kids are a problem to the police state, they will be identified and marked before they even know they want to be political activists.
This is the world you aren't only allowing, but are defending as well.
And then there are the ISPs you use to access the internet. Or the cellphone tracking your whereabouts, or the camera and microphone on your laptop. Or your router.
I could go on. If that's what you are afraid of that game is already lost.
You're trying to invalidate my legitimate concern over a device's security implications by telling me other devices have security implications as well. No shit.
The Amazon Dot is my favorite kitchen tool, we use it more than we do spoons at this point. The ability to set multiple timers, convert units and measures, and remind you to do things all while playing your favorite tunes is amazing.
"Google Now, Set a timer for..[slight pause]...<How long would>three<you like>minutes<a timer for?> [Pause, pause, pause] Three minutes. Three minutes! Three Fuck<Setting timer for>ing Minutes<three minutes>.
I am now manually inputting timers again.
"I'm sorry, mikestew, I didn't catch that."
Maybe if you waited more than 2.6 milliseconds for me to start flapping my pie hole, you'd have something to catch.
I don't actually care that much for it as a timer though. The other timers I can easily set in the kitchen tell me at a glance how much time is left. Alexa I have to ask. These are small things but it means I don't default to using Alexa.
However when you're preparing a meal or making cookies and you're constantly moving and not watching the clock, just responding to it, it becomes handy. It's also nice to not have to stop and wash your hands to set an egg timer.
HN isn't the best sample group for this, because for most people that's a pretty large amount of the time.
Fucking up food is a sin.
Sci-fi got it wrong with their humorous stupid robots providing comic relief while smart robots do the work. All the robots are the same smart, and getting better every day.
I'd like it if the knowledge graph were better for answering questions. Just Wikipedia queries is a bit lame. I think Google Home will win on that one.
Oh, and I'd really, really like to be able to voice call/Skype through it (presuming the array mic + software will be able to isolate my voice/remove reverberation well enough).
IM (with TTS) could be fun too
I was a huge skeptic of voice assistants before the Echo came along, but being able to turn my lights on and off with a single voice command alone has been more than worth the price of entry for the Echo Dot + TP-Link Smart Switch combo I got for $60.
Sure, there is a niche for reordering products easily, but commodity voice tech is going to rapidly hit zero margins. Like those $10 TV dongles doing what used to take a full PC.
Google makes 330 million in profit every week.
It's maddening. There's absolutely no way to discover how to use apps like this unless you already know how.
Just consult the nearest "X things you need to know/didn't know about Snapchat" article /s
It takes all of < 1 second. I get it is obfuscated but we aren't talking about a crazy level of interface hiding.
People expect to be able to swipe on most interfaces these days.
See you in a few hours.
The "File, Edit, View" etc. model is condensed into 3 horizontal lines. The lines, I suspect, were modeled on the idea of a row, with one option per row after another. Click "View" in a random app: oh look, one after another. Don't like that notion? Think of it as a Start menu in every app, and you still have to drill down. It's barely a change, since the idea has existed since Windows 95!
Small screens need less chrome to highlight content. It's about the user having the best view of their content, not peddling a catalog of "me too" features and pretty icons in people's faces. Turns out, people actually enjoy seeing more of their content on desktops too, rather than rows of default icons.
Scrollbars? People still stop what they're doing to look for the scrollbar? I stopped dealing with that once MS shipped the IntelliMouse 20 years ago. Tap/drag on mobile. You know how to tell if you're at the bottom of a document? It won't scroll further.
"It's maddening. There's absolutely no way to discover how to use apps like this unless you already know how."
But you KNOW how to use them already. They're just slightly modified. And these things have existed long enough now, there's no excuse not to just swipe either side or tap the hamburger to see what's up? I mean, how am I going to know what the new app using classic "File, Edit, View" menus does? By using the input options I have, mouse, keyboard, to dig through them. That hasn't changed on these new desktop apps. On mobile, input options are fingers, tap and swipe about.
This falls into a category of "Personally, I don't want to rethink things ever." type arguments.
However, it can't, it can't, and you can't. And when the illusion fails, it's frustrating: Neal Stephenson would call it Metaphor Shear, I (being less fancy) would call it why-didn't-they-give-me-a-better-ui-or-at-least-a-manual-dang-it.
This is ultimately why I prefer unambiguous interfaces over "friendly" ones.
1. It's slow. Apple AirPods are a good example. Siri controls the volume. Talking is slow. The input and response until the volume is changed is slow. Faster would be to use the long side of the AirPod as a touch control and sliding up and down it changes the volume. One finger slide = volume change. Two finger slide = track change. In any case, touch controls are way faster than voice.
2. It's weird in public, and faulty in loud environments. Talking aloud to yourself is going to get you strange looks. I tend to do things that don't bring me attention, so I don't use Siri. Also, good luck using Siri at a concert or on a loud train commute.
This is how I thought about search without search operators. Then Google got it right and owned the market.
So I agree with you -- but I also think someone will eventually get it right, and that company will take off.
I don't think your statistics make any sense.
99% of the time, you save time by speaking your intent, vs interfacing with your phone. That time saved is meaningful and outweighs the 1% of the time you have to fall back to your screen device.
99% of the time, you can use your hands/fingers for other tasks while speaking your intent. That allows you to multitask more efficiently.
Very true -- our Echo can control the lights, but I can never remember what each light is called, while with a GUI, I can see the list and it's obvious that "Livingroom - TV" is the light near the TV that I want to dim, while "Livingroom - TV table" is the one on the table to the right of the TV and I want to turn that one off.
I also don't like the feeling of having a dedicated microphone set up that records everything while waiting for a command and is completely opaque about what it sends back to Amazon/Google/Apple.
Also, for the $100 price it frequently goes on sale at, it's a great wireless speaker with Spotify integration built in.
Also you're being very "No wireless. Less space than a nomad. Lame."
The problem is: The electronic light control modules are expensive, break down after 1-2 years, and then go obsolete.
I'll spend more time rewiring my house than stumbling around in the dark!
When you say 'Hey Google / Siri / Alexa / whatever' a connection is set up to a real human who can control your house automation. Then the system could be self learning because the operator is translating your command into an action.
But it would not be for me. I absolutely never felt the need for home automation (don't think it is going to make my life better). And I care for my privacy.
Talking, while potentially more natural, is also simply slower than most other input types.
This is the same as the command line. Just like the command line, voice may not be completely intuitive initially but for regular use, it's probably much faster than a colorful GUI.
"Hey Siri, set a timer for fifteen minutes". Done.
That's friction, and that's where these frictionless interfaces step in.
People love to misuse that word.
I just tried it with best case scenarios, and it's still a second or so faster using voice (1.5 seconds) vs control center (3 seconds). I mean, voice is literally just yell out "hey siri set a timer for 15 minutes" and that ends the interaction. Done, just like that.
If you're changing the time that is set, it's significantly longer.
I think Voice Input should be an additional input method, not the ONLY one.
I avoid these devices for the privacy concerns and also the generally poor security of IoT devices. I hate to sound like a luddite, but I really abhor these devices and how hard the tech companies are trying to foist them upon us as "the next big thing".
Either Apple/Google/etc. granularize the permissions such that things like "boolean: user is in vehicle at speed" are addressable separately (possible but unlikely), or such knowledge comes with a cost - the app knows where you are and have been.
Same applies doubly for "noise it has heard".
Consequently my decision is: no assistant, thankyouverymuch.
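To make the granular-permission idea concrete, here is a rough sketch of what "boolean: user is in vehicle at speed" could look like as a separately grantable permission. Everything here is hypothetical: the `MotionBroker` class, the permission key, and the 20 km/h threshold are assumptions for illustration, and no current mobile OS is claimed to expose exactly this.

```python
class MotionBroker:
    """Hypothetical OS-side broker: keeps raw speed readings private and
    exposes only a derived boolean to apps that hold the permission."""

    def __init__(self, speed_threshold_kmh=20.0):
        self._threshold = speed_threshold_kmh  # assumed cut-off for "at speed"
        self._last_speed_kmh = 0.0  # raw sensor data never leaves the broker

    def update_speed(self, speed_kmh):
        # Called by the OS with a sensor-derived speed estimate.
        self._last_speed_kmh = speed_kmh

    def query(self, app_permissions, key):
        # Apps only get an answer for keys they were explicitly granted.
        if key not in app_permissions:
            raise PermissionError(f"app lacks permission: {key}")
        if key == "in_vehicle_at_speed":
            return self._last_speed_kmh >= self._threshold
        raise KeyError(key)


broker = MotionBroker()
broker.update_speed(65.0)
print(broker.query({"in_vehicle_at_speed"}, "in_vehicle_at_speed"))
```

The app learns one bit and nothing about where you are or have been. Whether platform vendors have any incentive to slice permissions this finely is the part I doubt.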
How will it learn to compare all these different environments? ML obviously; which pretty much entails leveraging the consumer data to provide a better product. The cost of usable, voice-driven assistants is giving up a vast trove of personal data and abandoning the concept of privacy completely, just so I can order some paper towels from an Internet of Shit connected device instead of walking three blocks to a store.
Your position is akin to a Louis CK joke (no offense meant): "Oh boy! I hope he doesn't do what he's going to do!"
Whether you're considering voice interfaces, connection speeds, machine learning, web development, phone battery life, or whatever, things are trending towards the better. Sometimes it's steady improvement, other times it's punctuated. Either way, the floor on what we can consider is always rising.
I think security is getting better but what we see now are the consequences of years of poor decisions. It's the post-Christmas credit card bill.
While not quite what you meant, Firefox is frequently bloody awful now and it used to be quite usable :)
One of the exhibits was of a common East Berlin living room. The exhibit explained how the state had hidden microphones here. They were listening for people to express anti-socialist ideas in their own homes. This was a real threat faced by people not more than 30 or 40 years ago. You could have your life turned upside down just for saying the wrong thing at home.
This exhibit drove home for me the inherent abusive powers enabled by some tech. One of these technologies is always on, always connected microphones. Why would I want one in my house? Given the potential downsides, what could possibly be worth it to justify having one?
Sure, the government could force Amazon to let Echo spy on you. But they could force Apple or Google to do the same thing with your phone. Speakerphone isn't as good as far field mikes, but it's pretty good.
The government could also just bug your apartment the old fashioned way.
The big fear would be that echo is storing audio of your house that a future government could use against you. But there is no indication that Amazon does that. It only sends audio after it's triggered by the codeword. Honestly, my google and bing histories are probably a lot more damning.
But if Trump goes all Hitler on us, I'd throw mine out.
Also, the recent findings show it's not just subversive actors who get surveilled. There are already massive dragnet operations. That's happening right now. So it's not some hypothetical, that's reality.
I think the better approach, rather than waiting and finding out too late, is just to avoid these altogether. The world has mostly been pretty safe, but that security now appears to be threatened and civil liberties are rapidly eroding. So just for that reason I think our position for ourselves and for friends/family should be to pass altogether on this type of device.
This means a future dictator will have a huuuge corpus of data on everyone with the perfect surveillance apparatus to leverage it. And by then it is too late.
I wonder what direction our new Russian-influenced President will take things given Putin's KGB background and savviness in shutting down free speech.
Yet, when everyone has email, that becomes a rather important data source. That draws the interest of those organizations tasked to keep us safe. They certainly wouldn't be doing their jobs very well if they left that trove untrawled, and voilà, sociograms of everybody.
When everyone has a live audio feed from their homes to Google, Amazon and Samsung that becomes an even bigger data source. It would be unthinkable not to mine it and I don't think anyone knows what the effects of that will be.
My standard counter-argument to this is "drugs". Drugs are a sufficient bug-bear to be used by the UK government as justification for invasive surveillance, but there are so many drug users in the UK that if they tried enforcing that law they would bankrupt the nation three times over from each of prosecution, building and staffing sufficient prisons, and the sheer number of people in prison who would then not earn any money or pay any taxes.
And that is just from a thing the government uses as its justification for the surveillance power it now has.
And what happens when failing to have an Alexa or equivalent devices becomes an act of subversion or at least suspicion?
> Sure, the government could force Amazon to let Echo spy on you. But they could force Apple or Google to do the same thing with your phone
I think you'll find this is an argument against Google and Apple handing over data wholesale to the government, not an argument in favour of getting another device that does that same thing too.
> The government could also just bug your apartment the old fashioned way.
You mean with judicial oversight and at some considerable amount of effort and expense that would make dragnet surveillance unpopular both within and without the government meaning its use would be reserved only for the sorts of activities proponents of surveillance say it would be used for? Yes, I'm okay with the old fashioned way.
> The big fear would be that echo is storing audio of your house that a future government could use against you. But there is no indication that Amazon does that.
The big fear is that by inviting these amoral entities into our lives they CAN (not will) perform these immoral activities at the request of other parties. It enables it.
> Honestly, my google and bing histories are probably a lot more damning.
Yet you still submit to the surveillance. Good for you. Not everyone is willing to make the same compromises.
I had hoped and frankly expected that there'd be a few more privacy oriented choices in this space by now. There was Zoe, but that got canned at last minute. I really didn't want a refund...
In short 5 months ago all seemed to be going OK, then they went a tad quiet. Three months ago there was another update including a CEO statement.
"Due to unforeseen delays in the development, we are not going to deliver the ZOE in the current form and all contributors will be refunded." They mention it's a partner having development delays in some critical component, but all very vague with no details.
Supposedly the project is alive and we'll get an update email when they're ready to try again. I infer from refunds all round it's not going to be soon. If it was just a few months I figure they'd ask for patience to work around whatever issue.
I don't know enough of IoT specs, protocols etc to know what would be needed other than the broadest terms. Probably yet another use for a Pi 3.
That is the most important distinction. Sure my laptop has a camera but if I keep good security practices I can feel reasonably assured that it is not monitoring me unless I've switched it on for some reason. The Alexas and Google Homes of the world default to "on", and everything you do/say is monitored because "cloud processing".
I will always have a problem with anything that tries to use my actions to build a more predictable model of who I am or what I am/will be doing.
Wrt your fear of always-on microphones, there is a possibility that this may also happen on your smartphone, where some rogue app might be listening to all your conversations. I don't see this as much different from that. I guess as more and more people adopt Alexa and you see the benefits, you will reach a tipping point where the pros outweigh the cons.
Resistance to technology that has unbounded risk isn't "fear". It's vigilance against an increasingly dangerous threat. Ignore that risk at your own peril.
> this may also happen on your smartphone
What smartphone? Regular phones still work, and a full size screen and keyboard has much better ergonomics.
Also, the existence of malicious software doesn't mean it's a good idea to voluntarily increase your attack surface. Lockpicks exist, but you probably still lock your house and car.
> you see the benefits
That will never happen, because voice input isn't ever going to be compatible with my apartment's old walls. Talking loudly at night is rude to the neighbors.
Google's assistant might be striving for more openness, but I don't have high hopes here either; at least until a formal development kit is released. Given that Google won't be able to use its same tried and true ad revenue strategy, we can expect them to offload voice requests to the highest bidder or prefer Google services above all else. This is another perverse link into Google's realm of anti-privacy; but worse, because they won't be able to do ads along the side. The responses will be the ads.
All of this to say that all of these locked-down digital assistants will fail until a company truly approaches it from an open and holistic perspective, and one that doesn't rely upon troves of private data. These devices may sell well this year, but so did digital picture frames. How many of those are still in use?
I'm under the impression that if it feels Amazon-only it's because of a failure of companies and devs to take advantage of the platform, not a failure of the platform itself. For example the Google services you're asking about, is that Amazon's fault or has Google not taken the time to develop that app because they're focusing on their own product?
I've used my Echo Dot most often with 3rd party services so far (daily briefing integrations, Capital One, Yahoo Fantasy Football).
1. They require you to say "Alexa, tell <app_name> to <do_something>"; This pushes all 3rd party services one level deeper.
2. Amazon competes with too many companies and services these days. Instacart devs won't want to work on a skill within a platform that is itself a competitor.
#2 I agree is an issue. I wonder if with the current environment any independent would be allowed to gain traction, or would the competitors lock them out of services that users find mandatory (Amazon shopping lists, Google search, Apple Music, etc).
Alexa is actually not a walled garden at all; skills are almost entirely separated from each other, and I think that will be one of the biggest problems though. My gut says that the reason language is convenient is that it allows for things to be implicit rather than explicit and I expect the voice assistants that will be most useful are the ones that can best understand this context, which will require deeper integration.
Except that: https://developers.google.com/actions/
OFC partner negotiations will need to take place in this; this is a new field with massive implications for customer security and caution and oversight _should_ be the watchword. Imagine the open ecosystem model for this and the disasters it could bring.
But the door to the ecosystem is plainly marked now and the rules are posted over the queue. You can do it today.
> The first and most important difference is that Google is not going to create an “Action Store” where users can select which ones they want and “install” them on Google Home. Instead, Google itself is going to approve all the keywords that developers want to use to invoke their actions and make them all available to everybody.
> That effectively means that actions will be curated by Google (like an app store), but users won’t have to install anything before using them (like the web). “It's not a direct analog to any existing ecosystem,” says Jason Douglas, director for actions on Google.
That article does specify that there will be a hand-off to 3rd party systems, rather than Google orchestrating everything, but it still sounds quite tightly constrained and I think that if the plan is to have things all work nicely together and not be silos, then tighter integration with more Google control will be the name of the game.
With Google voice assistant I can't say "look up the New York Giants score then text it to Phil"
I don't recall reading that functionality in the Alexa docs, either.
Seems natural and logical to me.
* I had to specify New York Giants because Google always brings up the SF Giants if I don't
But they have released one? https://developers.google.com/actions/
However, I get an email once a week from Amazon with "what's new" on Alexa. It's always a few useless things like holiday trivia and then some garbage you can order from Amazon. It was a groundbreaking device, but it's clearly meant to just be another gateway to Amazon's weird, limited ecosystem.
Is there a technical reason why they couldn't play an audio ad after a response based on what you asked for and your account history? It seems to me that voice UIs, like mobile UIs, have constraints on the quantity of ads but don't make them impossible. These constraints, by reducing the ad space supply, can also drive up the price for said space, especially when you consider the audience for an audio ad is A) captive and B) not restricted to a single individual.
This would make users the customers and better align incentives letting them do things like put user privacy first.
I used to work there, and I can definitely say that they are working hard and fast at becoming a non-walled-garden platform for voice.
The voice capabilities are still lightyears beyond Siri/Alexa, but they have spent more time focusing on big partnerships than the fickle user facing market.
I hope they make some moves very soon, because frankly I believe their tech is truly superior.
Then one day someone forwarded to our team this email, and it really touched my heart.
"I am quadriplegic and the Amazon Echo has transformed my life. I haven't been able to open a newspaper for 10 years, but now I can!
Please pass this email on to all of those concerned at the Guardian... They need to know the difference that they have made".
“Qualitatively, Amazon’s position is more secure than the numbers would indicate.”
I beg to differ. I believe Apple has a much more secure position than Amazon in the long run, given how Siri is already integrated in iOS mobile devices and macOS-based desktops. So far, has Apple capitalized on this advantage? Nope. Are they gonna? Most likely, given the fact that they always wait for everyone to show their projects in the high school science fair and then they roll out their better, comprehensive showcase.
If Apple does not launch a voice based home automation product in late 2017 or early 2018 then it might be too late.
Apple's true competitor in this domain can only be Google, with its staggering command of the Android ecosystem. Microsoft's effort is limited to Windows-based PCs, and even they might have a better advantage than Amazon.
Amazon has mainstreamed voice tech. They have done it well and made the point that this is the way of the future. Now the second part of the problem is to integrate already existing daily-use computational devices with this "voice-tech". Amazon literally has none, no mobile and no desktops/laptops/tablets.
Why would it be too late? This isn't a social network where the network effects make competition almost impossible. Apple already has a platform in HomeKit that leverages the much larger iOS installed base. More HomeKit compatible devices are being released all the time.
In fact many people criticized Apple for imposing requirements around hardware-based security which slowed the release of devices, but the recent DDoS attacks and the Internet of Shit Twitter account have shown that was the correct move.
I don't think there's any deadline for Apple to release an Echo competitor. If they release a product that is better/differentiated in some way, people will buy it. And they'll likely have a number of HomeKit devices already that work with it.
There are still economies of scale. More microphones being used more frequently means more data on which to train.
Most of Apple products (Mac/iPhones/iPads/AppleTV/[future Apple home product]) go through them.
Google went the other direction, recently adding routers to their product mix.
The big selling points of Apple WiFi routers were the ease of use and the integrated Time Machine backups. Now they are hard to use (since there is already an ISP-supplied router to work around) and people backup online instead of locally.
Probably a weird coincidence, but it is interesting.
The price point of Google Home is aggressive, and the same functionality is provided (or will be soon) to the Android ecosystem via Google Now / Google Assistant.
Their voice and intent recognition is getting very good. I have noticed a marked improvement over the last year, even in noisy environments. I think competitors will struggle to keep up, and this is probably the single most defining feature of a personal assistant.
Not really. Microsoft has dozens of apps on iOS and Android, including Cortana, and it has Cortana on Xbox One as well. Microsoft also has a huge cloud business, second only to Azure. The Windows PC is one of the smaller parts of Microsoft's business nowadays.
Microsoft has a big advantage over Google, because it supports online, on-premise and hybrid operations. It has a big advantage over Apple because it doesn't really care which hardware you use.
> Amazon literally has none, no mobile and no desktops/laptops/tablets.
Amazon has a substantial tablet business with a forked version of Android, and has its own app store. It also has plug in Fire TV products (which work much better than Apple TV), and an e-reader range, with content libraries to match.
Ooops. That should say "AWS"...
Amazon has Kindle Fire tablets; these have Alexa built in.
We can argue about the quality of Siri, but I think Apple have the best chance of dominating the wearables market.
I'm increasingly feeling like humans around me are remarkably less protective of their private spheres than I am, and am curious as to where the difference lies. I'm pretty dull - it is not like I'm protecting the privacy of my wild and crazy lifestyle. But the idea that the noises made in my home are being shipped to and stored... somewhere, with access controls unspecified (but to a company that has proven it rolls over on command for the government, at least when not discussing taxes) such that I don't know who may be listening to it, is the sort of thing that affects how I behave in my own home. And pardon me, but fuck that, no way.
A friend compared it to having sex in front of pets, but I'm pretty sure my cat doesn't speak any human languages, hasn't written systems to routinely disclose data to various groups of humans, and isn't likely to try to sell me $product to improve my performance. For instance.
The main place voice is useful to me today is in the car. I'm already on display in a fishtank, and talking to what amounts to a remarkably clever turnip that sometimes gets something right is useful.
 http://www.cnn.com/2010/US/12/01/wikileaks.amazon/ . Sure, just a run-of-the-mill TOS violation, nothing to see here.
 Doesn't matter that the unknown humans likely aren't. I'm not claiming rationality.
On the other hand, a very large segment of the population subscribes to 'if you have nothing to hide, don't hide it', which I attribute to ignorance of how these things are misused, and subtle propaganda telling consumers not to worry about it.
The fact that I can have roughly the same experience as Alexa on my phone at anytime is huge. I know that you can get the Alexa app for your phone, but having Google Assistant as a first class citizen is huge. I think it is a matter of time before Google catches up and surpasses Alexa here.
This degrading of UX is very common with Google through the life cycle of products. What's going to happen to Google Home next year or 2 years down the road? We saw OnHub get castrated in less than a year after launch. The Nexus Q was stillborn.
Although at this time Google Home is only conversational for search and is still very bad about being entirely keyword based in context; I suspect that this will change.
I'm not sure what it is, but talking to computers just makes me wig out. I get extremely self-conscious, wondering who is listening or might hear me. This doesn't happen with humans on the other end.
Saying "Alexa 5 minute timer" is also convenient. "Alexa, weather" is less convenient because she doesn't know that I prefer Celsius over Fahrenheit. This is unfortunate, but at least she can do a conversion.
But then you do an "Alexa what's the velocity of an unladen swallow" and she won't even perform a basic Google or WolframAlpha search and give you the first result. Just says she doesn't understand. Lame.
Before all that, though, I had a lot of essentially stage fright whenever that Alexa blue light came on.
Same thing happened when I started using Siri after I was already used to Alexa. Invoke the voice thing, and the prompt comes on, and I am awash in stage fright, unsure what to say.
There's a device-level setting in the mobile app to set Alexa's preferred system to Metric...
Or I'm growing senile and didn't actually try that when I thought I did.
Also (cliche warning) I only know one person who owns an Alexa or has even expressed an interest in buying one.
Based on the commercials, it seems to solve trivial problems like answering trivia questions. Why can't it vacuum my room like a Roomba, fix my leaky ceiling or fix my power outage? That's what I call a useful home assistant. Being able to parse and understand my voice is nice but it's relatively useless if it can't perform extremely valuable tasks.
The things I'd really like an assistant for around the house--and which I pay people to help me with in various cases--are vacuuming, cutting the lawn, dusting, doing (and putting away) laundry, doing (and putting away) dishes, etc. Activating lights with a voice command is pretty far down the list.
That said, given enough smarts and interfaces to enough online services, a purely virtual assistant would still be pretty useful for a lot of people. "Book a trip with these general parameters." It uses my preferred booking services, knows my preferences, and comes back with some choices. Get to the level of at least a competent personal admin and that's a service I'd be willing to pay for even if it can't pick up a broom.
Expedia gave an interesting demo at AWS re:Invent around a digital travel agent. It was a very simplified demo--and, of course, Expedia specific--but I think it did give you a glimpse of what's possible.
Snark aside, once you get it paired with reasonable smarthome products, there's an element of magic there. "Alexa, turn on the kitchen lights" feels downright futuristic.
I know it seems a bit overt, but I'm pretty serious... I find myself reaching for Google Now more often than I ever have before. They're a step past Alexa when it comes to the sheer data that Google has to pull from. My only complaint is sometimes I hit clear all, and I can't get back to the reminders auto-set via email invites.
I also don't like that they've effectively deprecated the hangouts UX... they made interacting via SMS outside the phone pretty painful. I know there's hard costs to the stuff that came out of Grand Central / Google Voice, but I feel like when they first launched Hangouts, the experience was better overall.
In general using Google's services works so much better than anything else I've tried. That said, it's kind of creepy at times.
Neither it nor Amazon seems to understand the concept of a family or being in public even at the very lowest level. I can't imagine tying it to my personal account unless I lived alone. Anyone can use it to do anything they want. I have been rickrolled; all someone needs to do to rickroll an Echo owner is walk by an open window and yell "Alexa (pause) play Never Gonna Give You Up by Rick Astley" then run like hell. It was funny the first time. If I were dumb enough to link my Amazon purchasing account to it, I expect I would now own a lifetime supply of dragon dildos from jokers walking by my window. Being able to respond to any human voice with no training sounds like an awesome idea, and it is, mostly, but unlike Star Trek daydreams, Alexa has no idea who's talking to her and if they should be allowed to talk to her, so she can only be given access to stuff that mostly doesn't matter.
IoT means closed non-interoperable silos, but there exists a Java project that emulates a Philips Hue lightbulb controller accurately enough to fool Alexa, and it can execute arbitrary URLs, and MisterHouse automation can most certainly control lights and appliances using arbitrary static URLs, so I have a pretty decent voice-activated home automation. This brings up the point that most people cannot handle setting that up, but we're in the equivalent of the 1980s home computer boom where everyone has a different idea what a home computer would do, but we all agree everyone needs one. Alexa can do all kinds of weird things, surely it does something useful for everyone. Supposedly she has lots of sportsball features, none of which would interest me, but someone probably wants them. Everyone probably wants something Alexa does.
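For anyone wondering what the Hue-emulation trick amounts to, here is a minimal sketch in Python. It is not the Java project's code and I haven't read its source; it only assumes the usual Hue convention that a light is switched by a request to `/api/<user>/lights/<id>/state` with a JSON body like `{"on": true}`. The `ACTIONS` table and the `automation.local` URLs are made up for illustration. The function only decides which URL would be fetched; actually fetching it is left out.

```python
import json
import re
from typing import Dict, Optional

# Hypothetical mapping from emulated light id to the static URLs that a
# home-automation server (MisterHouse or anything else) might expose.
ACTIONS: Dict[str, Dict[str, str]] = {
    "1": {
        "on": "http://automation.local/kitchen-lamp/on",
        "off": "http://automation.local/kitchen-lamp/off",
    },
}

# Hue-style path for changing a light's state.
STATE_PATH = re.compile(r"^/api/[^/]+/lights/([^/]+)/state$")


def url_for_request(path: str, body: str,
                    actions: Dict[str, Dict[str, str]] = ACTIONS) -> Optional[str]:
    """Return the URL to call for a Hue-style state request, or None."""
    match = STATE_PATH.match(path)
    if match is None:
        return None  # not a light-state request
    light = actions.get(match.group(1))
    if light is None:
        return None  # unknown light id
    try:
        state = json.loads(body)
    except json.JSONDecodeError:
        return None  # malformed body
    if not isinstance(state, dict) or "on" not in state:
        return None  # only on/off is handled in this sketch
    return light["on"] if state["on"] else light["off"]
```

A real bridge also has to answer the discovery and light-listing requests so the voice assistant finds the fake bulbs in the first place; that part is omitted here.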
Being located in the kitchen I use the alarms and timers a lot while cooking.
Somewhat predictably, the best working use cases are ones that drive additional revenue for Amazon. Playing music from your Amazon Prime Music subscription, ordering laundry detergent and dog food from Prime Now, etc.
An excellent question especially in the context of adding another always on audio device to a private space. I think for many people the value has to be more than just a novelty to justify it.
No. We built a market analytics app on top of Alexa and we have Echos all over the office. It's great.
But I don't want to be tied to stationary speakers in the office. I want my app integrated with the devices I carry with me everywhere. Google and Apple are poised to win here once they open up Assistant and Siri to third-party development.
Private spaces are essential for creative thought, exploration, experimentation and just being yourself. The price is too high - it will answer questions for me, and execute commands, but I will lose my last bastion of privacy as an admission price.
They had a few other similarly uncritical, almost breathless pieces on Okta [1] and Vote.org [2] earlier this year. Am I paranoid/overly cynical, or is Backchannel becoming the outlet PR departments grant access to in return for coverage they know will flatter them? Being hosted on Medium, they don't serve ads (AFAIK)... so this may even be a part of their funding model.
[1]: "A Company You've Never Heard Of May Have Solved The Password Mess" https://backchannel.com/a-company-youve-never-heard-of-may-h...
[2]: "This Y Combinator-Backed Company Has a Secret Weapon to Sway the Election": https://backchannel.com/the-simple-secret-weapon-that-could-...
Why isn't your desk in front of the telescreen?
Obviously this is not an easy task, and maybe it is being worked on already. Thoughts?
And how many people have a Google-powered phone with them every day? Siri on their iDevice? Cortana on their auto-upgraded Windows 10 PC and in their living room Xbox? All of them have Alexa's penetration beat.
When I was still excited about Alexa, I wrote stuff like "How to build conversations via the Amazon Echo"
But I was also aware of how difficult it could be to get Alexa to understand certain words. See: "Using a glottal stop to force the Amazon Echo to correctly pronounce “tw”"
But, you may recall, I became frustrated with the process for getting into the Amazon store, and we had a very long thread on HackerNews about that:
I'm sad to say there has been almost no improvement over the last year. The problems that I listed in "Amazon has absolutely no idea how to run an app store" are still mostly true.
And I know many, many developers who were excited about Alexa a year ago, but gave up in frustration because we felt like Amazon was not listening to our concerns.
In retrospect, I was wrong for 2 reasons:
1.) many people pointed out that voice was a terrible interface for a variety of purposes. At the time, I thought this was merely a matter of offering the right voice prompts, and getting better at understanding what people said. But I now think the criticisms had much more merit than I admitted a year ago.
2.) Amazon's behavior has been disappointing. I'd like to see them aggressively listening to developers and aggressively improving the service based on developer feedback. That is not happening.
I will never be an Alexa customer, the value offering is fundamentally not worthwhile to me.
I really hope development on this isn't at the expense of more traditional interfaces.
Google's Pixel phone does the same thing.
Godspeed for those who enjoy this, but I hope regular touchscreen or keyboard input isn't going anywhere.
Although I explicitly state that it does not work with Echo, I anticipate many returns and bad reviews after the holidays. I am pretty sure they'll keep the Echo, so it is all good for Amazon.
Ironically the Alexa Skills Kit doesn't even support Bluetooth. Does anyone know when or if it ever will?
Then there is the problem of the multiple platforms to build for. We have one Web, two desktops (even if I'm on Ubuntu), two mobile OSes. Are we going to be able to afford developing for many different platforms? Will companies pay for that? Maybe we have to wait until only a couple of platforms are left.
Text to speech
Maybe I have just seen too many movies about the Stasi (East German secret police), Big Brother, Brave New World, etc., and that ruins it for me.
All you need is a microphone and access to any of many voice-recognition APIs to build similar functionality. You can build something similar on any phone, laptop, or desktop computer without asking your potential customers to spend a dime.
I think the big question is really how big of a market exists for voice control. I don't personally see much of one. Everybody thought it was neat to open Netflix up on their Xbox/PS4 with voice commands for a day or two, but then most people primarily shifted to using the more reliable controllers. Any question I want answered, I can simply "OK Google..." and have the option of correcting with a keystroke.
I can see the potential for a market to be made, but I also see the history. Remember Dragon NaturallySpeaking? When you finally got it working great it wasn't actually an improvement in workflow for most people. How many people prefer navigating phone menus via IVR consistently?
Voice control is a solvable problem and that makes it attractive for development, but solutions haven't historically been particularly marketable beyond the novelty stage.
If I'm in the kitchen cooking, I'll just go "Ok Google, add butter to the shopping list", but if I'm on my computer, I'll open Keep and type it there.
Both methods have their advantages, and the awful experiences come out when a user tries to do all of their tasks with either or.
As for who is ahead, I strongly disagree with this article. Yes, Amazon had a 2 year start, but Google has had far more years of voice recognition and AI experience. They also have infrastructure for all sorts of things including maps/navigation, translation, search/knowledge graph and just machine learning in general.
With Apple, Samsung and Microsoft entering the digital assistant world too, it'll definitely be a huge competition in the years to come, and the 2-year head start aside, I think Amazon is actually the least well situated for the challenges ahead.
I see Microsoft succeeding in the enterprise area, but Amazon is nowhere.
The last thing I need is a device that gives me access to that subset of life that a given vendor supports (stores, searches, people, etc) or that requires some sort of "accommodation" to get to work properly. A very, very high bar that may be reached with the help of today's early adopters.... but not with my help. I spend too fucking many hours a day as it is dealing with "good enough" technology. Getting into arguments with a box in the middle of the room?! All I can say is get off my lawn, Alexa!
As for who will win? Seems that in many of the consumer device battles Amazon has a way of being so exclusive to their products that I wouldn't bet on them being the long term winner even if they do have the lead. Consumers want a lot of Amazon stuff, but that's far from making them the cultists they'd have to become to not be bothered by the Amazon ecosystem (.... and I mean consumers, AWS is a different matter in the marketplace).
For many products that might even be desirable behavior. "Lights!" versus "Alexa, turn on the lights". Plus it means no integration necessary, which won't exist for many products anyway.
Also these are very cheap algorithms to run. A trigger word implementation is a one-word vocabulary GRU RNN. Does something like 2000 multiplications every second. Easily in microcontroller reach, for power budgets measured in milliamps. Actual word recognition perhaps isn't but the microcontroller can just constantly record, and when it sees a trigger send it on. So you could have 20 projects in the house, all listening to your voice, and it wouldn't really affect the electricity bill.
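To make the "constantly record, and when it sees a trigger send it on" part concrete, here is a rough Python sketch of the buffering logic. The detector is just a callable standing in for the trigger-word network; the class name, the frame counts, and the idea of returning the clip to the caller are all assumptions for illustration, not any vendor's firmware.

```python
from collections import deque
from typing import Callable, Deque, List, Optional


class TriggerBuffer:
    """Keep a short rolling window of audio frames; once the detector fires,
    collect a fixed number of further frames and hand back the whole clip."""

    def __init__(self, detector: Callable[[str], bool],
                 pre_frames: int = 50, post_frames: int = 100) -> None:
        if pre_frames < 1 or post_frames < 1:
            raise ValueError("pre_frames and post_frames must be at least 1")
        self.detector = detector  # stand-in for the trigger-word model
        self.ring: Deque[str] = deque(maxlen=pre_frames)  # rolling history
        self.post_frames = post_frames
        self.clip: Optional[List[str]] = None  # frames captured after a trigger
        self.remaining = 0

    def push(self, frame: str) -> Optional[List[str]]:
        """Feed one frame. Returns a finished clip, or None if nothing to send."""
        if self.clip is not None:
            # A trigger was seen earlier: keep collecting until the quota is met.
            self.clip.append(frame)
            self.remaining -= 1
            if self.remaining == 0:
                done, self.clip = self.clip, None
                return done
            return None
        # Idle: only the last pre_frames frames are ever held in memory.
        self.ring.append(frame)
        if self.detector(frame):
            self.clip = list(self.ring)  # history up to and including the trigger
            self.remaining = self.post_frames
            self.ring.clear()
        return None
```

Nothing leaves the device until `push` returns a clip, and while idle the memory cost is bounded by `pre_frames`. Frames are strings here only to keep the sketch easy to read; on real hardware they would be fixed-size sample buffers.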
It looks like voice will be one of the next big things, but calling Alexa the winner is a bit premature in my opinion
My colleagues and I built our skill for the elderly to use. We found it was easier for them to interact using their voice rather than a touchscreen.
I think as computers get better at understanding language and verbal communication, we will be able to use screens less - which is great.
I think it's kind of primitive to have to constantly look at a screen to do things - at the very least it makes sense to cut out screen time if and whenever it's possible.
Alexa is like a terminal that you can talk to. Give a simple command in a particular way, and it will reliably do it.
It's an exciting development that Alexa can now be applied outside of the Echo.
I have hopes for Google Home (which I'm yet to try), but I actually think this will end up a two-horse race between Apple and Google. I would completely leave Apple out of this, but HomeKit is a pretty powerful concept (if done properly) given the number of iDevices out there.
Given this is all about one company, I conclude it's a Sub. The unicorn of technology? It is very difficult to guess this. One other thing, no mention of accent. Can you imagine how difficult it is to create a voice platform that recognises anything but a generic accent?
They've tried to get around this awkwardness by special casing home automation type devices (thermostat, etc.), but it's still only a stop gap.
Google has the best machine learning along with MS, and they have billions of data points.
Amazon has neither comparable tech nor the innovation or customer data. Putting an Echo in every home doesn't matter when it's dumb. Does Alexa understand context? Can it get even close to Cortana/Assistant? Let's not mention Siri, that's stone age.
Even if Amazon continues to put its eggs in the voice-as-primary-interface bucket, it seems like it will be unlikely for Facebook to go that route.
Doesn't feel like Amazon have much of an advantage to me.
We have an echo, bought when Amazon gave a huge discount to developers, and I bought one for family members for their home. I sort-of trust that no data leaves the house except for a ten second audio recording made after the device hears "Alexa". If Amazon collected more than this then they would ruin this business.
I usually use voice input for sending texts and short emails, assuming I am not in such a public place that I would inconvenience people near me.
I have a hierarchy of device use in the following order of preference: First voice, then input on a cellphone, then iPad, then laptop.
I have looked into writing my own mobile app for voice recognition and maintaining privacy that way, but even with my experience, it would be a ton of effort and would miss the context that Google, Amazon, and Microsoft have.
Edit: I also meant to say that my hope is that Apple will be the vendor I eventually use for voice interfaces because their business model is to sell expensive devices, not collect personal data. The new Siri-enabled headphones look effective, but I haven't bought a pair yet.
"Ok, google glass, zoom in on that hot guy!".
Voice, a great interface for SOME things. Likely not what this article is pitching.
My wife, daughter and two dogs seem to think they have lives of their own for some strange reason. I work from home and probably use it 20 times a day.
I recently discovered it can play the "white noise" channel from Spotify when my teenagers are loud late in the evening. Along with a sleep timer to turn it off in an hour and I am set.
Now I agree discovery of what it can do is challenging at times. And sometimes she is downright dense. Thank the gods for cloud services!