Voice input is what we humans use when we speak to each other.
Except that humans have an average IQ of 100. They understand context, they get what I'm trying to say. Try getting Siri to turn off all repeating alarms, just for today. Or turn off all alarms until 12pm.
"But that's not how Siri is supposed to work!"
Sure, fine. It's just a UX to a limited API. But then you also can't compare it to how we use the same input with humans. Robots aren't humans. Not by a long shot. They have an API, a very specific and limited set of capabilities. They are not flexible. They don't get context.
Voice interfaces have their place. Once you know the API and the possibilities, voice may well be the most practical input. But not because we happen to talk to humans.
Siri and Google Voice are in completely different leagues. I find Siri completely useless; I use Google's voice interface all the time; it even understands context.
That's like complaining about a Commodore 64 not being able to run the same games as a PlayStation.
The discussion is not whether Alexa, Siri, and Google Home are perfect AIs, but whether they are good enough for some very basic things like those I described.
And no, it's not because we talk to humans that they are good; it's because we interact with other humans through speech.
If you really understand context, a lot of the time you don't need voice at all.
Consider a voice-operated light switch. Why do I need to tell the switch I want the lights to turn on when I enter a room? The house AI should know the ambient light level, the time of day, my location in the house, and my default lighting preferences, and lighting should "just work" unless I want to change something - which is when I can ask for it.
Alexa can't do any of this yet. I can give Alexa commands to turn lights on or off, but currently there isn't even a context for the current light state. So I can't say "Alexa, lights" and have Alexa work out whether that means "turn the lights on" or "turn the lights off" for the room I'm in.
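The missing piece described above is small. A minimal sketch of what resolving an ambiguous "Alexa, lights" could look like, assuming a hypothetical home controller with access to the room's light state, ambient light level, and occupancy (all names and thresholds here are invented for illustration):

```python
from dataclasses import dataclass

@dataclass
class RoomState:
    lights_on: bool     # current light state (the context Alexa lacks)
    ambient_lux: float  # measured ambient light level
    occupied: bool      # is someone in the room?

def interpret_lights_command(state: RoomState, dark_threshold: float = 50.0) -> str:
    """Resolve the ambiguous command "lights" using room context."""
    if state.lights_on:
        return "turn_off"   # lights already on -> the request means off
    if state.occupied and state.ambient_lux < dark_threshold:
        return "turn_on"    # dark and occupied -> the request means on
    return "no_op"          # bright or empty room -> nothing to do

# Dark, occupied room with the lights off -> "turn_on"
print(interpret_lights_command(RoomState(lights_on=False, ambient_lux=10.0, occupied=True)))
```

The point is that the decision rule itself is trivial; what's absent from current assistants is the state it depends on.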
IFTTT may eventually be able to do this, but it's far from transparent and straightforward.
It's about affordances. I have a reasonable idea what a human's affordances are. The current voice UI (vUI?) equivalent is a handful of dots of implemented functionality surrounded by huge areas of not-working-yet space.
Not only is there no map, there's no way to guess what might be on the map.
It's also about cognitive load. I'm typing this in a bedroom with a couple of 433MHz switches controlled by a remote - one for a light, one for a heater.
Using the remote takes no conscious load at all. When I had the switches controlled by Alexa, formulating a command took effort.
I disagree. My habits would be so hard for a computer to learn, mostly because they already frustrate humans who are much smarter. When the sun goes down, I don't want all the lights to turn on. I don't even necessarily want all the lights on in the room I'm sitting in. Many times I sit in a lit-up room just because I can't turn the sun off. Once it goes down, it's nice to sit in the dark watching TV.
That is, unless I'm reading a book, in which case I need a light on. But not just any light: I'd probably want just the lamp next to me to shine on my book, instead of the one that lights the whole room. I don't need the whole room lit, just my book. So now the contextual system needs cameras to see whether I'm reading, watching TV, playing on my phone, or napping.
And when I walk into the bathroom in the middle of the night, I'm fine with the nightlight above the toilet. Turning on all the lights will wake me up and ruin my night vision. But my wife likes the light on. So now we need facial recognition to tell who is walking into the room.
Right now I have to get up and turn on a light if I'm on the couch reading. If we're talking effort, that's a hell of a lot more effort than saying "Alexa, turn on my reading lamp". And don't get me started on the effort it takes to try to find the missing remote...
All of what you're saying is true, but it misses the point I was trying to make originally: what these devices can do now is already more than enough to be very useful for many of us.
I don't think you appreciate the things we lost with digitalization that we can now get back.
It's not just about what the AI can do, it's how it gets expressed.