For this argument to make any sense, there would need to be data that showed growing usage of chatbots, followed by a plateau and then a drop off. I’m not sure what that data looks like, but I’m pretty sure there was never a high growth phase where users were actually interacting with the chatbots. It was all hype created by Facebook and a few idiotic VCs who wasted money on what they thought would be the “next App Store.”
I, and many others, were saying around the time of this hype that chatbots would never become a category-defining product, simply due to inherent usability flaws in their design that have been discussed ad nauseam.
The only people surprised that chatbots failed to become the “next big thing” are the people who mistakenly thought they ever would be. This assumption was never grounded in any real data of user desires or real problems. Chatbots were then, and are now, a solution looking for a problem. I’m not surprised at all.
On the other hand, if someone had pitched Facebook to me, I would have dismissed it too. I keep that in mind to stay humble.
For any trend/hype there will always be people dismissing it and people praising it. So whether it succeeds or not, there will be people who say "I told you so" afterwards.
It does not matter what a comment on HN says. You personally have to decide whether you want to speculate/invest in some hype or not. You have to decide which new technologies you learn and which ones you ignore. In hindsight, it would have been great to learn machine learning five years ago.
The one that I dismissed was Twitter. In fact I thought it was a joke when I first heard about it, something making fun of how long-winded and boring blog posts on services like Livejournal tended to be. I also thought Vine would be doomed to fail.
Uber was not a surprise at all for me because I'd been complaining about the bloated and horrible taxicab system for many years. It was a market that was crying out for disruption. I remember in 2008 my local taxicab company had exactly two ways to hail a cab: phone or fax. They might or might not show up at the appointed time, and they'd certainly never send you any kind of updates. It cost upwards of $5 to step into the cab, and after that there were fees out the wazoo (per 1/6th mile fee, per 45 seconds fee, fee for having a bag, a fuel surcharge, a license surcharge, a regulatory recovery surcharge, and a couple others I forget). The cabs might have credit card readers but they were always broken so it was cash only. Drivers took weirdly roundabout backroads paths and always claimed it was fastest if you asked them about it, even though I knew the highway was not busy at 4am. Every single thing about the experience sucked. When they started crying later that Uber was killing their business my response was "good".
When I moved to the US in 2003, the lack of something like Uber was a surprise.
Not in the smartphone-app sense, of course (those didn't exist yet), but in the regular-Joes-making-a-buck-giving-a-ride sense.
Back in Ukraine, there was no taxicab monopoly. You could hail a ride on the street simply by raising your hand - and someone would pick you up pretty fast. Some people did it for a living, some would use the extra income on the way to/from work if you were going in the same direction (not unlikely if you are standing on a major street going to downtown, for example).
The fares were unregulated and negotiated on the spot, but there was an equilibrium point for every route. (Unless you were a tourist, in which case you'd almost surely be ripped off - nothing any local would care about though).
And, of course, there was no rating system, which took away the incentives for drivers to go out of their way to maintain a nice appearance. Don't like it, don't take the ride; the next car will stop by a minute later for you, the next passenger will be around the corner for the driver.
Other than the lack of ratings and fixed pricing, this worked like Uber works now in other aspects.
What might be surprising is that central pricing and ratings really don't make ridesharing much different from what I described. It made the system more accepted in the US, but the end result is pretty much the same: I can hail a ride anywhere from people I don't know. The system stabilizes at a price point and expected level of service.
And there's a remarkable similarity in the downsides of this system. My father made some money giving rides back in Ukraine between jobs, and while the short-term revenue looked great (compared to average monthly salaries), so much of it went into gas and maintenance that the whole gig was hardly worth it - and it took more time than a job.
This is what Uber drivers are discovering themselves right now. The ones who took out loans for their shiny cars are especially screwed.
These existed nationwide in the United States in the early days of the 20th century, until about WWII. In many cities they were called "jitneys." Some were organized enough to operate as simple, private, anonymous bus companies with fixed (but flexible) routes.
They started disappearing after WWII when most people outside the urban cores got a car, and municipal governments started taking over public transit.
These still exist in some places, but are mostly in ethnic minority communities and spread by word-of-mouth. (I know first-hand because I got hit head-on by one full of schoolchildren in the late 90's.)
It was initially a carpooling app. The original pitch was that regular drivers would just drive to their own intended location and be able to see people en route who wanted to go the same way. Much harder to pick that idea out as a future billion dollar company.
(I was at that conference: everything went wrong, but it was mostly super cold. Cars were stuck in slippery snow, so someone suggesting the idea of carpooling sounds odd, but I suspect anyone there would have loved to wait out the traffic inside a heated car.)
I was using VRBO a decade ago for vacations. AirBnB is more distinguished by a slick UI than business model innovation.
We still don’t know how price sensitive the average Uber customer is.
What surprised me was seeing a serious journalist like Wolf Blitzer regularly reading people's tweets on his prime time show.
“No one ever went broke underestimating the intelligence of the American public.” - Mencken (paraphrased)
I still remember finding it strange to see Ashton and Demi personally writing short messages, since every form of communication from famous people at the time was a formal statement released via the press.
But while it does that, it also plays to something even more deeply human: to watch other people from afar, and to replace complex, anxiety-ridden face-to-face interactions with easier, gentler text-based communication.
Google Glass and all the other hyped techs never struck at anything deeply human. VR is nice and all as a tech, but it's not something we've always done.
VR, virtual reality, is a complete immersion in a virtual world. A 3D game, a 3D movie, a 3D Minecraft, you name it.
AR, augmented reality, is additional "augmented" information in the real world.
VR is nowhere NEAR ready for the mass consumer market. There are 2 screens in the glasses with very low resolution. Because the glasses sit so close to the eyes, the pixels are very apparent. That's ugly. Furthermore, you need a big fat gaming rig with a 1000 EUR/USD graphics card for the current low resolution. And then there is the issue of data transmission: those huge wires (it's not wireless yet, and wireless suffers from more lag). Finally, it is pricey.
AR from Google Glass flopped due to privacy concerns. It would aid users in their day to day life like a smartphone or smartwatch does. It's being used for this purpose in business settings (also due to its price) just like some other expensive tools by Microsoft such as Hololens and Surface Hub (84"). The user interface isn't quite there. For example I'd say eye tracking is useful. Some cars also get a HUD in the windscreen.
(YMMV but as a glasses wearer whose glasses also work like sunglasses in direct sunlight, I'd love to have a pair of pimped glasses with AR.)
Anyway, in short, my conclusion, is:
VR is mostly fun, though some useful sims (e.g. flight trainers, but also things like 3D meetings would make Hyperloop and air travel redundant which saves time & environment) do use it as well. It is more of a niche than AR. AR is going to be part of our every day life in useful/productive ways. AR will be vastly more rampant, and it will be in use in a massive scale earlier than VR will be.
As a side thought, imagine going on vacation in VR and it being just like the real thing because you're on a robot. Your 5 senses get stimulated while you walk through Cairo. You smell the spice whilst you walk through the market on your robot which you control. Why actually still go on vacation then? I already have enough pleasure with Google Photos or the new Windows 10 login screen as it is. Going to see the Eiffel Tower, which you already saw hundreds of times in pictures, is boring, as is receiving that piece of metal on a postcard. Oh well.
Oh I read your post, it's against HN guidelines to assume and express someone else has not read your post or the content.
If you read the post of your GP (and that GP as well) you'll see they're casually mixing up AR and VR. You do the same, and I called you all out on the difference between the two. There's no need to feel offended about that.
You say you haven't had an opportunity to try AR, and mention Google Glass. The thing is, Google Glass is just one example of AR (a very known one) and also a very multi-functional one. The usability I had in mind is more subtle, more specific.
Examples: Layar has been out for ages (it adds AR on your smartphone screen from camera input e.g. adding complementary digital info to a paper magazine or food you bought), if you have multiple smartphone cameras you use it, LIDAR uses it, Google Maps ecosystem as well (Street View for example). AI in general can use it. There's all kinds of uses of AR such as this traffic light. Not to mention the experiment where bike lanes are lit up by solar energy in the evening/night. Google Glass could've potentially done that all, but there's no reason that cannot exist in the future. It was just the wrong company to come first with the product. Google was also too early with it, just like Apple Newton was too early and GM was too early with the EV.
My first time using VR was on vacation in Spain as a child, using those red/green glasses we were watching a dinosaur movie. It was rather primitive, but immersive. I'm not anti; I just see it rather limited. I've read enough reviews about VR glasses and read reviews of VR games -including realistic screenshots- to know that the quality isn't there yet.
Long-term, VR will make more impact, but that's because it is so different whereas AR is rather complementary. In the short term, AR is going to be more widespread and useful. I mean, I'd love a HUD on my windscreen telling me I am speeding. Like, of course.
>Its against HN guidelines to assume and express someone else has not read your post or the content
Well, if you're talking about limited AR such as in HUDs or Pokemon Go or glass, that's already here and it's cool and useful, but not very exciting (at least to me). No doubt AR will be a thing in everyday life. I don't see why people (myself included) talk about both technologies (AR & VR) as if it's a zero-sum game and we'll only have one but not the other. No reason they both can't exist in the brave new future.
I'd say, that when AR is "boring" that actually means it succeeds. It blends in without being annoying.
Remember the time when desktops, mobile phones and smartphones became boring: that is when they started to gain traction among the common man; arguably, when they started to become good.
Take the traffic light example I mentioned earlier. If that's rolled out and seen as "boring", I'm happy with it. Because that means it works. Right now, it isn't yet boring, btw. It is seen as an innovation.
Same with innovations like electric vehicle and autonomous cars. Once they're "boring", their functionality is accepted. And we are talking about the "West"; it hasn't been adopted in poorer countries at all, not yet.
VR def. has the wow factor, I agree on that. Look at all the VR movies of the past 30 years. AR is much less apparent, though it's there as well.
Ultimately, for me games are just spielerei. They might be fun (not so much for me anymore as I grow older), but they have little to nothing to do with productivity. AR is going to be more useful for productivity than VR. I just don't see a lot of use for VR, especially not in the short term. You can see this in how the VR glasses are largely marketed for gaming, while AR has professional applications which -ultimately- aim to make our lives easier.
My apologies, my tone was probably a little aggressive.
Gaming is the original VR killer app intended to gin up interest and spur development but current VR offerings go beyond that. One of the more interesting applications so far is 'live events' where your digital avatar can attend live sporting events, concerts etc with other digital avatars (you have to experience this to understand what a potential game changer this is). Even watching regular (i.e 2D) movies in an immersive environment on my headset is more fun than watching it on my tv, with the drawback that as of now you can't share the experience with other people not wearing a headset. I'm just scratching the surface of content that is already available NOW (there's even porn if that's your thing, though I personally don't find this very compelling mostly because of the low resolution). Ultimately, if things go the way I think they might, your VR headset would potentially replace not just your gaming console but also your TV set and your laptop/home office (currently I can play chess in a virtual room with only primitive controllers so I don't think working while completely immersed in a virtual environment is a long way off). If you do have the chance I encourage you to take some time to catch up with the current state of the art.
I agree with you that 'boring' is good but in the world we live in unfortunately new tech has to be 'sexy' and 'exciting' to attract venture money and developer interest it needs to get to the boring stage.
The requirement of a high-end GPU, and the current price of high-end GPUs thanks to the cryptocurrency mania, meddle with the adoption.
We're far from VR being mainstream; AR costs far less resources.
From my experience, VR is a huge gamechanger (a new paradigm that will enable a lot of new innovation) whereas AR is also cool but doesn't have that new paradigm potential (like e.g. the internet, smartphones etc. did).
We should also keep in mind that chatbots are clearly a thing that people want; we're just using capitalism to test whether society, technology, etc. are "there" yet, so it's not entirely irrational for investment to flow into this area. (We might debate the scale of the investments, or we can just let the market determine that for us.)
However, I do think there's some truth to this article - the rise of digital assistants in Google Home, Amazon's Alexa, Apple's Siri, is, in a sense, the rise of the chat bot. You can text them and they'll do things. You can talk to them and they'll do things. You can even get them to schedule a haircut for you, as Google recently demonstrated. I think these bots are semantically identical to what we think of chat bots as doing. If you would've told me 3 or 4 years ago, that people would willingly let a "bot" into their home that listens to everything they say and talks back, I would've been skeptical, to say the least. Now, even my hippie roommate has one of these things. I still don't get it, but to claim that "bots are dead, long live humans" misses the mark about how fast speech to computer/computer to speech tech is evolving. Watch the video and be... amazed:
So yeah, the idiot VCs missed the mark - this isn't going to be a consumer revolution, led by a few scrappy start ups, one-man teams, and dreamers. It's a revolution in data collection, human-computer interaction, and AI that's already been taking place behind the scenes at Apple, Google, Facebook, and Amazon for years, and will continue to as long as they hold the tech world's best AI talent.
Which is to say, they're not getting a whole lot of usage of features you might consider to be chatbot functionality.
This, to me, is likely why Amazon sends me emails saying "What can Alexa do now?"
Ex: "Hi (...), since you always turn off the heater when you leave, would you like me to schedule to always do it for you?"
The Voice Assistance is a vocal chatbot. People were thinking it was going to be text.
No one missed anything except the actual product everyone was talking about. I feel like this happens all the time. We get caught up in one strict interpretation and miss the HUGE thing staring us in the face, or ears in this instance.
It's actually the people who believe that Android is Linux-based that are
mistaken. If you write for Android, you write for Android and nothing more.
You can't just run the application on a regular Linux (i.e. the one with libc
and X11), and if Google ever rips Linux kernel out of it and replaces with,
say, DOS, most of the applications won't even need to be recompiled, because
they were using Android's API/ABI.
No, not quite, not without enormous effort. Look at Cygwin or WSL or Wine:
tons of work poured in, and they're still not there yet.
On the other hand, a typical Android application only ever touches Android's
API and doesn't even leave its JVM. Replacing the whole kernel (with the obvious
single point of work in the Java interpreter) would be practically invisible.
When we can reasonably "chat" with our voice assistants then we have actually gotten there. More specifically, when I can tell my voice chatbot I just want to say "alexa" in the morning when the alarm goes off to snooze. Or when they are adaptable and pattern observant enough that I can say "alexa the usual" in the morning and get the higher volume as I walk through the house and news and my favorite podcast as I ask for every morning. When they come close to accomplishing those things, then we will have vocal chatbots. When they can "remember" just as much as the original Eliza chatbot then we will be there. Currently they are more like vocal light switches than vocal chatbots.
There will be room for startups to put the glue between the bots and everything else, but the core tech for the conversational interface is not going to come from startups.
Actually, I'm still quite bullish on this. Advocates always talk about how 'VUI' (Voice User Interfaces) will be the next big thing but from what I've observed most people - actual users, not financially invested in their success - still remain quite bullish on them, or they only use them to set timers.
I don't believe Amazon, Google, or Apple have ever actually released sales figures for their assistants. I believe it's because the numbers would look so insignificant compared to everything else.
I mean that I'm, put lightly, skeptical of the whole digital voice assistant thing.
You're one of today's lucky 10,000! :D https://xkcd.com/1053/
And the other ones the other one.
The successful bots are ones that don't attempt to understand and respond to every single question perfectly, but instead act as a supplemental tool / guide and offer concrete decision points and actions for the user to take. For example, instead of asking "How can we reach you?" and letting the user enter free text and figuring out if the user entered a phone number or email or some random text that just short circuits the bot, show two buttons "Via email", "Via phone" and clicking each one would then ask for an email, or phone number.
The successful bots also know when to fail over to a human being, and they fail over fast. I've hardly ever had a good experience dealing with sites that employ chat bots and it's not great to be frustrated by a bot when I'm already an angry customer who hasn't received my order.
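For what it's worth, the button idea is simple enough to sketch. This is only an illustration under my own assumptions: the `ContactFlow` class, the two regexes and the reply shape (a dict with `text` and `buttons`) are all made up here, not any messenger platform's real API.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Deliberately loose, hypothetical validators; real validation would be stricter.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
PHONE_RE = re.compile(r"^\+?[\d\s().-]{7,20}$")

# Maps the button label the user clicked to the kind of value we ask for next.
BUTTONS = {"via email": "email", "via phone": "phone number"}


@dataclass
class ContactFlow:
    """Two-step flow: pick a channel with a button, then enter one value."""
    channel: Optional[str] = None
    value: Optional[str] = None

    def prompt(self) -> dict:
        # The first step offers buttons instead of accepting free text.
        if self.channel is None:
            return {"text": "How can we reach you?",
                    "buttons": ["Via email", "Via phone"]}
        if self.value is None:
            return {"text": f"Please enter your {self.channel}.", "buttons": []}
        return {"text": "Thanks, got it.", "buttons": []}

    def handle(self, user_input: str) -> dict:
        text = user_input.strip()
        if self.channel is None:
            # Anything that is not one of the two buttons just re-shows them.
            self.channel = BUTTONS.get(text.lower())
            return self.prompt()
        if self.value is None:
            pattern = EMAIL_RE if self.channel == "email" else PHONE_RE
            if pattern.match(text):
                self.value = text
            else:
                return {"text": f"That does not look like a valid {self.channel}. "
                                "Please try again.", "buttons": []}
        return self.prompt()
```

Because the bot already knows which channel was picked, it only ever has to validate one kind of input, so random text cannot short circuit it.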
I had to have an internet-connected phone on me at all times because people would send messages at any hour and response time was a factor in our discoverability on facebook.
Of course all of these questions were answered on our website (AND ON OUR FACEBOOK PAGE) but simply responding with a URL resulted in attrition. So I found myself running through the same script several times, all day, all night, every day, every night.
The best solution would have been to set up some kind of text parser which would allow people to navigate the website through facebook messenger (oversimplifying) and then alert a human if it couldn't parse the input. We could even have hooked it into the comment feed because facebook was really bad at notifying us of comments. Taking it a step further, the Mero Mero's dream of offering facebook marketing to other companies as a service would actually be reasonable, because it wouldn't just be me sending out hundreds of copypastas 24-hours a day, forever. But I was never able to put a system together because of other time-consuming duties the company needed me to perform.
This is how chatbot functionality should be implemented. Like those "chat with us" links you see on a website that drop you into a window and you get put in someone's queue. It doesn't actually wake them up until the chatbot frontend can't handle the user's request using NLP/decision trees/regex (depending on your level of sophistication), and then the whole log is sent to the tech support guy when they get free of the last chat and can read through it and take control.
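A rough sketch of that kind of front end, at the regex end of the sophistication scale. Everything in it is hypothetical: the FAQ rules, the canned answers, the `MAX_MISSES` threshold and the `Session` type are just assumptions for illustration.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Hypothetical FAQ rules: (pattern, canned answer).
RULES = [
    (re.compile(r"\b(hours|open|close)\b", re.I),
     "We are open 9am-5pm, Monday to Friday."),
    (re.compile(r"\b(price|cost|how much)\b", re.I),
     "Pricing is listed on our pricing page."),
]
HUMAN_RE = re.compile(r"\b(human|agent|representative|person)\b", re.I)
MAX_MISSES = 1  # escalate on the second message the bot cannot answer

HANDOFF = "Let me connect you with a person. They will see this whole conversation."


@dataclass
class Session:
    transcript: List[Tuple[str, str]] = field(default_factory=list)
    misses: int = 0
    escalated: bool = False


def answer_from_rules(message: str) -> Optional[str]:
    for pattern, answer in RULES:
        if pattern.search(message):
            return answer
    return None


def handle(session: Session, message: str) -> str:
    """Answer from the rules if possible, otherwise fail over to a human quickly."""
    session.transcript.append(("user", message))
    if session.escalated:
        reply = "A person will be with you shortly."
    elif HUMAN_RE.search(message):
        # An explicit request for a human always wins.
        session.escalated = True
        reply = HANDOFF
    else:
        reply = answer_from_rules(message)
        if reply is None:
            session.misses += 1
            if session.misses > MAX_MISSES:
                session.escalated = True
                reply = HANDOFF
            else:
                reply = "Sorry, I did not catch that. Could you rephrase?"
    session.transcript.append(("bot", reply))
    return reply
```

Once `session.escalated` is set, `session.transcript` is the log the support person reads before taking over, so the customer never has to repeat themselves.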
20 years ago “artificial intelligence” were taboo words for anyone hoping for any funding. It would be called ‘expert system’ or ‘automated agent’ or whatever else that didn’t make people show you the door.
I think what you put as “chat bots” is having the same issue. The concept has been deployed at super large scale and people interact with bots everyday, it’s just not marketed as such.
Currently my phone company, my ISP, my health insurance company, the last airline customer support I had to deal with, they all process a crazy amount of interactions, and all the basic steps were clearly handled by a bot until I got escalated to a human. They are all real world huge scale applications, I don’t know by what other metrics they would be deemed as “failures”, and I don’t think they plateaued, I expect it’s still growing.
Usability? Sure, someone keeps paying for them, but they're not the people who get stuck interacting with them...
Human operators on big operations were already bound to a script for years now, and the “human contact” part was mostly spelling letters and numbers over the phone.
I’m more than happy it switched to chat, and both sides can deal with the exchange asynchronously, in particular with the bot handling the requirements to open the user info.
I feel it’s actually a win-win situation, as incredible as it seems regarding customer support technology.
Has it? Are these experiences people enjoy and would like to see more of?
There's no data before a breakthrough. We should be wary of hypes but at the same time we shouldn't diminish pioneers.
I was working at Radio Shack in 1995-1996 and mobile phones were definitely starting to grow in popularity. By then, you already had subsidized “free” phones and mobile plans were around $35 a month.
When we say that "Mobile is eating the world" or something, we are not talking about the mobile phone we can make a call with but the phone we can use the internet and applications with.
Your data is about the mobile device that allowed us to make a phone call.
If you say that "this is also data" then we have that kind of data for chatbot as well, e.g., the number of sales of Alexa and Google Home devices, or Messenger installations, etc.
Yes, it will take a few years. But consider that there are now tens of millions of Alexa pucks out there. The bot may suck today, but it’s getting better quickly, and consumers are receptive and ready to reward the tech giants when they finally succeed.
AWS was not great in its first 5 years. Its capabilities 5 years ago were pretty good but nothing compared to today.
Google is bad at long term development, especially with Ruth Porat at CFO. Microsoft is mixed - for every Xbox and Hololens there's a Courier.
One thing missed in chatbots is that X dot AI is a chat bot that works pretty well. There are others that are similarly domain limited where the current interaction model is an email thread or a text conversation. These tend to be doing well.
As an aside, I tend to think about what I'd be prepared to pay for something before I click on the pricing page. In the case of X dot AI, it was about $5-10 per month. I'm sure it is good, but docusign (similar price) is MUCH more valuable to me - time kills deals and docusign helps bring them in. An AI bot that books in meetings just doesn't deliver the same ROI.
My perception of Microsoft is that it's a Hydra.
The upside is that it's not all bad! The head responsible for the product you care for might have been there for a long time, and is doing a really good job in keeping it up.
The downside is that the other heads might be doing something entirely different and turn the whole beast around.
And while all this is going on, some heads just get chopped off here and there.
(sniff Lumia sniff)
At the time the tech industry was hailing speech recognition as the next big thing. There was a lot of investment in speech: BeVocal, TellMe, Nuance, SpeechWorks, AT&T, L&H, etc. Replace your call center workers with automated systems, use voiceprints to secure sensitive transactions, etc. Amazing AI would effect cost savings and make your business efficient! Sound familiar?
Very few users actually liked these systems over the previous versions that just used the phone keypad. Poor recognition accuracy really pisses users off. And most of the time the recognition errors aren't the user's fault. Ever try to recognize speech on a cellular channel using an acoustic model trained on a landline channel? Crap! This was certainly a problem during the early 2000s because of the emerging mobile phone market.
The UIs (or VUIs as they were called) were awful since they mostly replicated the touchtone versions but added "you can say 'accounts' to...". They continue to be awful because, while speech recognition had gotten a lot of funding (DARPA, etc) in the past, the UI aspects were pretty much ignored and underfunded. People interact very differently when speech is used as the medium. The interaction automatically becomes social. Social interactions with machines are decently well studied (Cliff Nass for one), but the very nuanced aspects are difficult to bake into an IVR system, especially under pressure to deliver on a deadline. The experience feels unnatural very quickly and users either run for the door or start mashing the keypad.
Ultimately the web killed off industry interest in IVR systems. Want to know your account balance? Log in to a web site and it's right there. No need to dial in your account number, verification and run a gauntlet through a phone menu.
Fast-forward to today and chatbots are the new IVR systems, but without the speech-to-text portion. The speech-to-text portion is the easiest part by far in those types of systems. There's still the need for a parser (shallow or otherwise), a way to pair up queries with responses/actions, the ability to track a dialogue and its context, and most importantly, a "natural" feel to the interaction.
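To make those pieces concrete, here is a toy sketch with a shallow parser, an intent-to-response pairing and a context object that carries over between turns. The intents, the account names and the balances are invented for illustration; this says nothing about how any real IVR or chatbot product is built, and it does not attempt the "natural feel" part at all.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical account data for the example.
BALANCES = {"checking": 1200, "savings": 5400}

INTENT_PATTERNS = {"balance": re.compile(r"\bbalance\b", re.I)}
ACCOUNT_RE = re.compile(r"\b(checking|savings)\b", re.I)


@dataclass
class Context:
    """What the dialogue has established so far."""
    intent: Optional[str] = None
    account: Optional[str] = None


def parse(utterance: str) -> Tuple[Optional[str], Optional[str]]:
    """Shallow parse: return (intent, account); either one may be None."""
    intent = next((name for name, pattern in INTENT_PATTERNS.items()
                   if pattern.search(utterance)), None)
    match = ACCOUNT_RE.search(utterance)
    return intent, (match.group(1).lower() if match else None)


def respond(ctx: Context, utterance: str) -> str:
    intent, account = parse(utterance)
    # Anything missing from this turn is filled in from earlier turns.
    ctx.intent = intent or ctx.intent
    ctx.account = account or ctx.account
    if ctx.intent == "balance":
        if ctx.account is None:
            return "Which account: checking or savings?"
        return f"Your {ctx.account} balance is ${BALANCES[ctx.account]}."
    return "Sorry, I can only help with balances."
```

With one shared `Context`, the exchange "What's my balance?", "checking", "and savings?" works because the later turns inherit the intent from the first one. Without the context object, the second and third utterances would be meaningless on their own.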
Chatting/texting and speaking are serial in nature; very slow and inefficient. A well-indexed FAQ or an intuitive, well-designed GUI is more useful than a chatbot. The former helps in shortening search time and the latter helps with streamlining transactions. Information search and transacting comprise the majority of actions that users perform when they visit a site/system/application. When you talk to a human agent for these two types of actions, guess what they're using to assist you? The same interface you'd use if you served yourself. They just know it better.
As an aside, the only noticeable/innovative use of chatbots I've seen in the past 10 years is on porn sites. I really took notice of the messages in the chat window when it started nagging me for ignoring it. I wouldn't be surprised if this round of chatbots was inspired by that.
Behind a paywall, so here's a key quote:
Facebook’s early efforts to turn Messenger into a service that people can use for shopping or getting news or weather sputtered quickly. But two years on, the Messenger effort is showing signs of life.
Several developers say they have been selling more bots and applications through Messenger in recent months. Trip-booking service SnapTravel, for instance, has seen its cumulative sales through Messenger grow from around $1 million as of May 2017 to around $10 million now, according to CEO Hussein Fazal. Msg.ai, which develops tools for customer service and marketing for messaging apps, has seen enough of a pickup in business to hire a new executive and marketing team, said Msg.ai CEO Puneet Mehta.
The little chat bot widgets never took off because they didn't solve NLP anywhere near as well as Amazon or Google did.
If a next gen chat bot can make a dinner reservation over the phone (like Google's demo), it definitely will be the next big thing.
So much of our day is communicating and if we can automate our communication effectively, it opens the doors for even greater productivity.
There was somewhere around 2003-2005, when a couple good ones started showing up on AIM. Since that was the IM of choice at the time, there was zero friction to actually interacting with one.
Anyone predicting it as a "next big thing" after around 2007 totally missed the actual hype and was trying to build on something users had already dismissed as not actually the future.
I at least thought that there would be some Alexa/Siri-level AI going on, but most chat bots are about as intelligent as texting commands to an SMS short-code number.
I think it's more about chat than chatbots. Chat has already become the next big thing in customer-to-customer interaction. What chatbots are trying to do is enable brands to start interacting with their customers on chat. Human-based support was always there, but interacting with a customer over chat during his/her complete lifecycle was not possible using human agents. A chatbot can solve that. We are moving back to chat because it is one of the two natural media of communication (voice/chat & gesture) for humans. The hype about chatbots was bad, resulting in everyone trying their own version of chatbots and failing. But for those who used it within its actual capabilities, it has done wonders. For some top financial brands of India, it has increased the no. of people interacting with a brand to about 400%, resulting in a ~200% increase in marketing leads and a ~150% increase in lead quality. This is actually the NEXT BIG THING for the marketing department. Also, chat apps are more evolved than SMS/phone apps, which gives the user/customer more control over communication, resulting in happy customers.
Chatbots are not solving any problem as they are not a solution. Instead, they should be used as a tool to solve any other problem, just like AI/Crypto.
These interfaces are almost like a dark pattern because of how bad they are.
"Hi, in a few words, what can I help you with today?"
"It sounds like you have a question about your bill. I can help you with that! If you can give a few words to describe the reason you are calling, I can help you with your bill."
"OK, let me get you to a representative who can help!"
... instead of spending 10 minutes wrangling with the vapid AI, I can actually move on with my day after speaking to a human. Was this the future we envisioned in the 90s? I think not. Some systems let you spam 0 (zero) and it transfers to a human, but more and more are requiring you to interface with the system in some way, even if disabled or impaired.
> one two three etc
"Hello this is agent a, Can I verify your X"
> It's XXXX
"Looks like you have <totally unrelated subject to what you called about>"
> Yeah but, I called for Y
> Y this, Y that, I need Y to do Z
"Let me forward you to an agent that can help you"
"Hello this is Agent Q"
> I need to do Y! damn it I've been on hold and transferring for 20 minutes
"I can help you with Y, but first, I need your account info"
> FREAKING A 1234
"Can you verify XYZW?"
> YES GOD DAMN IT YYYY
"Ok sir, I'll need you to FAX it in"
> What, that's technology from the 1920s
"I'm sorry sir, is there ANYTHING ELSE I can help you out with <condescending voice>"
> PLEASE KILL ME NOW
"Would you like to take a survey?"
a week later, 7pm you get a robo call
"You had a call with Agent X how did that call go"
Can't they see what I just typed before? Seriously, how bad can this be.
It is probably not the most complex integration technically, but it made the experience so smooth.
So if you know where the thing you're looking for is on the website, you can just launch the website and use that.
Super simple feature, but it seems like for so many companies, if you go online, it redirects you to their mobile site, which has been "phased out" and tells you to get their app, which you have and would have used if it had the feature you needed.
For such a "post computer" world it seems ridiculous how often I throw my hands up and find my laptop when trying to complete an online chore from my mobile.
If it's a charge on my credit card or checking account, that's what I do: I give them this counterparty risk. I call and do a chargeback. Basically, after 20 minutes on hold with a vendor, I declare that they are being unreasonable and are unable or unwilling to refund me, and I call the bank to do the chargeback.
Now, what you tell the bank is critical. For whatever reason, vendors have been able to un-do my chargeback. So instead, I tell the bank that I dispute the charge -- but that I am also disputing the method of payment. So if they argue that the charge is valid in the investigation, I say that is fine -- however, the bank is not authorized on my behalf to pay them, and if they think the charge is authorized they must contact me to arrange a different method of payment. And under no circumstances can they use the bank to make payment. So yeah, if they argue the charge is legit, that's fine, contact me (I wish I could put them on hold haha), and I'll pay them -- but the bank can't. The bank likes this because they are off the hook and I like it because they don't get to screw me. Worked every time.
If everyone did this, and made this a huge counterparty risk, these vendors would stop putting people on hold and ignoring or purposely inadequately addressing their grievances with their products/services.
It’s funny how they never keep their promises of calling you back until you actually stop giving them money. They also magically start paying more attention to what you say, and complaint letters that went unanswered for months and presumed “lost” suddenly become found and answered.
But I hear you. I recently wanted to change the ownership of my cellphone account. "Change ownership", "It sounds like you want to change cellphone plans, is that correct?", "Agent", "Okay, let me get you to a representative".
The only tasks these things are equipped to do, are the tasks that I can do via the company's online portal, and in a much less frustrating manner. I wonder who is actually using these things.
Old people. I'm not kidding. I used to do PCI compliance work with call centers. I've listened in on many a call to verify procedures. Call centers deal with an unending stream of older people who cannot or will not use "computers". They want to talk to a person on a phone. The people who are comfortable with the online interface only call when the online interface fails them. Old people go to the telephone first.
And there is a growing subset of old people who are functionally illiterate. I don't mean they don't know how to read, I mean their eyesight is diminished to the point that using the computer/phone/tablet is uncomfortable. Or their fingers don't move well enough for a keyboard/touchscreen. The phone is simple and reliable, not requiring either visual or physical dexterity.
Even non-tech items like opening a bag of chips or a box of crackers. I estimate that some packaging must require 40+ lbs of force to open. And I think of the "How It's Made" show and how advanced manufacturing plants are. So they use all this tech to make a box of crackers as cheaply as possible, they have cameras fast enough to find and remove a single dark grain of rice in a torrential stream, but put zero thought into the UX and how the user can open the box.
It sounds like the simplest "tech fix" for this phone menu is once their account is found,
if (age > 60) transferToHuman();
Furthermore, you could do "soft" account verification with their number. So if they call from a number that matches an elderly customer, just route it straight to a human, even though numbers can be spoofed it's not going to harm the customer to just assume it's them for that step only and fully verify their acct later.
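To make the idea concrete, here's a rough Python sketch of that routing rule. Everything in it is made up for illustration (the Account record, the route_call function, the phone numbers); the only things taken from the comment above are the age threshold and the idea of a "soft" match on caller ID, with full verification left for later.

```python
from dataclasses import dataclass
from typing import Mapping

HUMAN_FIRST_AGE = 60  # threshold from the "age > 60" idea; purely illustrative


@dataclass
class Account:
    phone: str
    age: int


def route_call(caller_number: str, accounts: Mapping[str, Account]) -> str:
    """Return "human" or "ivr" for an incoming call.

    The caller-ID lookup is only a soft match: numbers can be spoofed, so
    this decides routing and nothing else. Real verification happens later.
    """
    account = accounts.get(caller_number)
    if account is not None and account.age > HUMAN_FIRST_AGE:
        return "human"
    return "ivr"
```

Unknown numbers fall through to the normal menu, so the worst a spoofed number can do is skip the queue to a human who still has to verify the account.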
"We are sorry our website didn't work, let me get you a human..."
If they don't, is that because they don't know any better, or because they actually want to use the IVR?
My guess is either they go straight to a person, or they don't know that's an option. So if I'm correct (and I think the onus would be to prove that old people think differently than everyone else in this regard), these things exist to take advantage of the ignorance of old people to save a company some money. Greaaaat.
It's annoying when my issue isn't on the list, since I'll have to hear it 50 times. And it's incredibly aggravating when my issue is on the list, but the website doesn't actually work right. But it's not hard to understand why it happens.
Yeah, that's unhelpful if "the website" is a sprawling, vast wasteland. Finding a specific thing on an obese, convoluted site is like trying to find the bucket of ice cream in Siberia. I always check the site first, so if I am calling... it's because your website is pretty bad.
So if I do find the thing I need and it is broken, then I call and dodge the idiot AI, and the beleaguered agent offers the pathetic advice of: "It's on the website". I tell them it doesn't work, and this has led to two outcomes: either the agent tries to do it and it is so broken and fucked up that they tell me that it's down and I have to call back or try the website days or weeks later, or I am lucky and the agent has to do a tedious task that I would have preferred to do myself online.
I get an email "Hey, I need to contact you so we can proceed, what method would you prefer?" I reply "Email, please" provide the information I assume the rep needs and which the online form should have been able to ask me and offer "What else do you need".
"Is there a time I can call you and [go over the script and fill out a form on my computer, as if you'd just stayed on hold hours ago]?"
"I'm sorry, it looks like you want to talk to a representative. You need to answer a few questions first."
The people using the automated phone services are the people without internet or even computers.
Believe me, they exist ;)
Usually it'll try to say stuff, give up and then say "Thanks, I'll put you through to one of our team"
- Me: "REPRESENTATIIIVE"
- Phone: Sorry, I didn't get that. Say "representative" to talk to an agent
- Phone: Sorry, I didn't get that. Let me transfer you to a representative
(This specific one happens on United calls, every single time.)
Failing that I also resort to answering in one word answers. Bots can't handle any sort of ambiguity. Even when they manage to solve the language-parsing problem, people will still just bark one word answers at anything that looks or sounds remotely non-human because that's how the early AI's trained us. The future-concept of people chatting away to robots is incredibly unlikely because to do that you need a level of empathy for the person/object you are talking to. We've clearly shown so far that we just don't have it for bots.
 Like the 19th century British Explorer shouting English at the natives.
"Okay, let me get you to a representative."
Penguin doll: http://en.wikipedia.org/wiki/Tux
Bearded dude with swords:
which is a recursive reference to XKCD: http://xkcd.com/225/
Shibboleet: Shibboleth from Judges 12
combined with "leet"
http://en.wikipedia.org/wiki/Leet from hacker culture
The guy you were talking to eventually just routed you to someone with the same job but who has a mic, because you weren't giving him anything to go on and all he'd have been able to do is push the button again to ask you the exact same question (and having a person do it who can at least vary the inflection of the question is less infuriating than seemingly being stuck in a chatbot's infinite loop).
Source: have a friend who used to have this job. He has thankfully moved on to something less soul sucking than being a human literally pretending to be a robot.
I'd love to know more about these operations, thanks for mentioning this.
I'm just speculating here, but I suspect it means that the employee needs less training. All he needs is a quick flow-chart to route the calls. No training for how to talk to customers, no liability regarding things he might say, etc.
He was glad when he found another job.
Is this practice widespread at all? I've worked at several contact center software companies with many customers and never once heard of this. It's always been automated IVR software controlling this. This sounds to my ears like maybe an outlier of a particularly awful contact center? I'd certainly pause before claiming this is how it works for "many of these systems".
Edit: ahh I can no longer edit it :|
It seems very strange to have a human who can listen but can't speak. I guess these companies think that by removing speech as a possibility, they're able to pay the employee less or something?
In a sort of sick and very selfish way, I find myself somehow glad that blind and deaf people exist and are protected by accessibility legislation. I've often found accessibility features, particularly those in software, to be exceptionally useful even though my vision and hearing are fine.
I need a machine to take all of the measurements and make decisions entirely without any kind of subjective judgement on my part.
Please, validate my health the same way automated testing and other such diagnostics checks work.
> when one tries to be as clear, as brief, and as orderly as one can in what one says, and where one avoids obscurity and ambiguity
In a sense, this is not unlike an unskilled and nervous attendant who, faced with a request that they don't understand, starts to chatter more and more in hopes of eliciting the information they really need.
Always having to ask for permission is tiring.
With Siri or Google Assistant, you soon figure out the few things that this very low intelligence person can do (like telling the weather or setting an alarm) and stick with them.
This is also why I'm excited about iOS 12 with all these Siri shortcuts: instead of pretending that we are talking to a smart being, let's have a concrete list of things that it can do.
On the other hand I do believe that these voice interfaces have some potential, just the technology is not there yet.
When Siri first hit the iPhone and I was trying to figure out if it was useful for me (no), I named my phone Searle.
How many times have you called and the prompts go:
Press 1 to talk with sales
Press 2 to talk with marketing
Press 9 to talk to tech support, the only reason anybody dialed this number
Speak your 18 digit account number, being sure to pause between each digit to make sure the computer records it correctly.
Speak your phone number
Speak your 24 digit hexadecimal product code
Finally you get through to a person and 100% of the time they ask you for all of that information again so they can type it in (and get it wrong).
And even when you get a person on the line they make you go back and do all of the stuff you already tried before finally transferring you to someone with half a clue.
a) The backoffice is really shit, broken, and doesn't even display any info (down, badly designed, etc)
b) Backoffice and/or VPN connection from outsourced call center to ISP is really slow, so they work faster by asking info.
c) CRM/Customer database has no consistent quality information, so agents do not trust it.
I may forget something, but those are the most common ones.
Spanish law also requires a second ID verification to disclose some information (let's say your wifi password); in our case it's the last 4 digits of the bank account number.
Also to pipeline you into different call centers, because it won't be reasonable to demand agents to know all the tech and commercial info about the company, it's just too much.
The Web was legitimately the next big thing. And after that, mobile. Both of them have changed our lives in deep and lasting ways. But we as an industry are absurdly hungry for the next, next big thing.
How many dumb-ass voice and bot and AI and blockchain projects are there out there now? That basically don't work, but have been shipped anyhow? How many millions of dollars have been wasted? And really I should say billions. Theranos alone burned through $1.2 billion of hype. And there was the wave of "Uber for X" companies, busily failing to replicate the business model of a company whose success still isn't a given.
I should be clear that I'm not opposed to trying new stuff. I'm all for it! But I think if we explore technological possibilities with less flagrant waste, we'll learn more. And be able to explore more.
And as an aside, we haven't really had voice-only interfaces for 100k years. Really, they've only existed since the telephone. What existed previous to that was humans, whose in-person interactions are almost always far more than voice. People have different estimates of the amount of information conveyed in a conversation through expression, gesture, posture, glance, and the like, but it's never a small amount.
My point is that voice is a powerful and ancient channel, and it's about time that computers leverage that channel.
As I said, I'm all for trying new stuff. We should look at the extent to which computers can usefully leverage that channel. But I don't think we should presume that it will be particularly useful.
It's just not true, either. So I end up trying to figure out how to structure my query so the robot on the other end will understand what I want, instead of just saying what I want.
In the end, I just repeat "human" and mash the 0 button over and over until I get a real person to talk to.
They hide key functions behind a 3D Touch, but there's absolutely no discoverability. So you're either left trying to 3D Touch everything to see what works, or actively researching 3D Touch tricks.
As phone gestures become more popular, they'll have the same issue.
Seriously, watch someone on a new iPhone and it's just painful.
Oh, I created this contact for fun, now how do I delete it. Tap it? Nope. Tap and hold? Nope. Double tap? Nope. Is there a menu somewhere? Nope. Maybe I slide it over? Nope, that brings me back to the previous page. Maybe I pinch? Nope. Guess I'll pull up the help, oh there isn't any. Off to Google then. Oh, I have to slide it over from the middle, not from the edge.
The tipping point for me was when the AT&T small business line I used to use changed to a voice interface and included fake keyboard tapping sounds after each interaction. That just felt so damn insulting.
Lately I just say ridiculous shit with these interfaces to see what happens.
A few days ago I was using one for Delta that couldn't tell the difference between "Yeah" and "Yes". Sigh.
Alternatively, in a pinch, license some item-get sounds from videogames and do A/B testing on customer satisfaction.
Except that with text adventures I was willing to overlook an obtuse parser if I was enjoying the game. I never call my bank just for the fun of it.
The irony, of course, is that research is making strides in NLU, it's just too late for the last wave of chatbots. Here are two recent papers from DeepMind:
Learning to Follow Language Instructions with Adversarial Reward Induction
Relational inductive biases, deep learning, and graph networks
As for the bots that are marketed as more human, with lots of machine learning: in my experience, they feel no better than the original ELIZA (https://en.wikipedia.org/wiki/ELIZA) despite the leap in tech.
When I look at how my kids interact with Google Home, it becomes fairly obvious to me that to them this is completely natural. Google Home is almost like a pet to them, not just a tool.
We are finally at a stage where voice recognition is becoming powerful enough to understand nuances; the next question is what to connect it to. One thing that I really like is that it allows us to retrieve information without having to look at a screen. It feels like having a 5th person at the table.
At First Principle, we built a little voice app that allows you to ask Google Analytics or Salesforce for data (and potentially whatever you want to connect with) in meetings, so we can ask instead of having to look things up. It becomes a natural part of the conversation and everyone has access to the data.
That's where I think it will first make an impact. In meetings with relevant data.
Not perfect but good enough that it provides value.
I now use voice a lot because of the quality with Google tech.
80% success and 20% no action might be usable for an assistant, but adding in 5% random behavior makes it drastically worse than useless.
Oh and this comes along with a modern website that can execute all those use cases too.
But then you throw in the natural language, enabling users to write complex queries in English. That, and well-funded teams focusing on niches.
My experiences with bots are becoming outstandingly good.
To go back to interfacing with a program using written language seemed like an odd step. It's never been the most efficient way of doing something, and it requires very advanced technology to accurately understand what people are trying to say, in whatever slang, shorthand, or bad spelling/grammar they use.
Besides, it's not really dead, the tech just moved to "voice assistants" rather than "chatbots" - really just spoken word rather than written word. And I'm not convinced that's the "revolution" most people are expecting either. I'll stick to clicking buttons and typing things into my terminal.
I don't know what "sword" is.
By the way, what OP says may be true of a game like Dunnet. I don't know if it is no longer maintained or if it is kept that way out of nostalgia, but more recent text-based games (that is, ones that could have been programmed by your father instead of your grandfather) do way better than this. Just try some popular MUD (the MMO version of text adventures); I'm sure you'll be surprised.
Some of the current interactive fiction (as the genre is now called) would surprise you.
Anything by Andrew Plotkin (aka Zarf) is guaranteed to be interesting. Other indie authors of renown to look for are Emily Short and Adam Cadre. But also look for new authors! Mind you, modern IF focuses less on puzzles and more on narrative or exploring the boundaries of the medium.
One of my personal favorites is "Spider and Web" by Zarf, because I love its Cold War-esque setting. Mind you, it can be difficult! The best IF games also explore the console interface itself, such as in "Fail Safe".
An example of a particularly innovative game is "Rematch" : it's a single move game (i.e. you win or fail in a single input, which can be quite complicated and shows off what modern parsers can do). It's sort of a "Groundhog Day" where you must prevent a disaster in a single move, and if you lose you replay it again, and again, and again, till you get it right.
Many of these games can be played in a browser, without installing anything.
sudo apt-get install gargoyle-free;
You can play it at https://classicreload.com/zork-i.html or http://textadventures.co.uk/games/view/5zyoqrsugeopel3ffhz_v...
There is only so much input you can give and subsequently evaluate if you show a GUI with 3 buttons.
Now if you restrict the chat or voice input to 3 options, then the interface feels unnatural, annoying even.
It seems like a lot of times when people talk about chatbots, they really mean these phone trees in text form, in which case I would agree with your sentiment.
However, be cautious in conflating chatbots with CLIs. I would say a CLI is not (always) an intuitive interface, but for a lot of problems, they are quite efficient.
The CLI style chatbots tend to be much better since they are basically CLIs in an easily accessible location (e.g. in an app on your phone).
Understanding natural language is much more than just a design problem. It's a grand challenge and core subfield of Computer Science. It's the original Turing Test.
The biggest manifestation of this is the self-driving car hype for when grandpa can't legally/safely drive himself, however voice-assistants also fit that mold: Something to sell to grandpa when they don't want to learn/buy a new thing and his eyesight is bad and arthritis makes typing hurt.
At that point I just push 0 and repeat "speak to a human."
Why, oh why, would any rational person think that this kind of technology was about to suddenly take over everything?
This is a typical example of groupthink delusions.
The most annoying part of the experience to me is the menu items are always ~25 things I can trivially do on the website. No, I'm not calling to check my balance, pay a balance, update payment information, etc. I'm calling because your website specifically said that function isn't available online and I need to speak to a representative.
I think if you're from a certain era, you just pick up the phone first and want to talk to somebody, even if it might be pretty easy to find what you're looking for online.
This is also true for CLI interfaces. It illustrates the popularity of GUIs - much to the dismay of CLI enthusiasts everywhere.
Edit: Yes, a -help or "help" command can be used to list the menu. But then this command has to be known beforehand. What if a clever designer decides to use -assist or "assistance" instead?
IVR is supposed to be 'intuitive' or 'natural' or however one's preferred marketing dialect describes it. The assumption is usually flipped - that tools are supposed to assist those ignorant of their use.
It has been a long time since normal users have been expected to use a CLI. If you're looking at one, you choose to. This is not at all true of IVR/Sirlextana.
 Of course, they don't serve the caller, they serve the robot's owner, so interests are only aligned to the extent the robot efficiently helps the caller with their goal, which adds an additional layer of opportunity for frustration.
swipes to the right
> Message archived.
Guess it doesn't.
You really don't have this option with voice. On the other hand, when I visited my mother I was surprised that she figured out that Google Assistant can provide her with nutrition information. She would have never figured out by herself on a CLI or menu because it's very intimidating.
This somewhat contradicts my previous comment about how much I dislike voice interfaces but they probably have some strengths.
And you get to enjoy fumbling around trying to guess whether this particular program's magic incantation to summon help is "help", "-h", "--help", "?", "/?"...
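For what it's worth, being forgiving about this costs almost nothing on the program's side. A minimal Python sketch, assuming a hypothetical tool that just wants to know whether the user is asking for help; the alias list is my own guess at the usual suspects, not taken from any real program:

```python
from typing import Iterable

# Spellings of "help" a user might reasonably try. Illustrative, not exhaustive.
HELP_ALIASES = {"help", "-h", "-help", "--help", "?", "-?", "/?"}


def wants_help(args: Iterable[str]) -> bool:
    """Return True if any argument looks like a request for help."""
    return any(arg.lower() in HELP_ALIASES for arg in args)
```

A program would call something like wants_help(sys.argv[1:]) before doing anything else and print its usage text if it returns True, instead of making the user guess the one blessed spelling.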
On a voice interface, your best chance is to connect to an actual intelligent being as soon as possible, because getting the options is a frustration on its own.
If I'm saying 'help' it means I've tried that and failed, so it's not clear what the benefit is.
Think back to the earliest programming courses people take:
What's your first name: Jack
What's your last name: Ch
Hello Jack Ch!
"I need to find out how much space I have left on this machine."
bash: freespace: command not found
bash: diskspace: command not found
> disk info
bash: disk: command not found
bash: diskinfo: command not found
> man disk space
No manual entry for disk
No manual entry for space
> help disk space
bash: help: no help topics match `space'. Try `help help' or `man -k space' or `info space'.
> df /dev/sda1
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda1 19478160 1370768 17094912 8% /
EDIT: Maybe I should have said 'CLI Newbie' - I am in no way singling out Linux here.
$ apropos space
CORE (3perl) - Namespace for Perl's core routines
arpd (8) - userspace arp daemon.
CMSG_SPACE (3) - access ancillary data
CUDA_ERROR_INVALID_ADDRESS_SPACE (3) - (unknown subject)
cudaErrorInvalidAddressSpace (3) - (unknown subject)
df (1) - report file system disk space usage
(Seriously though, reading the manpage for man is where I first learned about apropos.)
Google "linux how much space on drive" and the first hit tells you exactly what you need.
Another question someone might ask is "how do I change my password". Google for "linux change password" and there is a special help box right at the top of the search results.
apropos password and you get this enormous list:
apg (1) - generates several random passwords
chage (1) - change user password expiry information
chgpasswd (8) - update group passwords in batch mode
chpasswd (8) - update passwords in batch mode
cpgr (8) - copy with locking the given file to the password or gr...
cppw (8) - copy with locking the given file to the password or gr...
cracklib-check (8) - Check passwords using libcrack2
create-cracklib-dict (8) - Check passwords using libcrack2
crypt (3) - password and data encryption
crypt_r (3) - password and data encryption
des_read_2passwords (3ssl) - Compatibility user interface functions
des_read_password (3ssl) - Compatibility user interface functions
doveadm-pw (1) - Dovecot's password hash generator
endpwent (3) - get password file entry
endspent (3) - get shadow password file entry
EVP_BytesToKey (3ssl) - password based encryption routine
expiry (1) - check and enforce password expiration policy
fgetpwent (3) - get password file entry
fgetspent (3) - get shadow password file entry
fgetspent_r (3) - get shadow password file entry
getpass (3) - get a password
getpw (3) - reconstruct password line entry
getpwent (3) - get password file entry
getpwnam (3) - get password file entry
getpwnam_r (3) - get password file entry
getpwuid (3) - get password file entry
getpwuid_r (3) - get password file entry
getspent (3) - get shadow password file entry
getspent_r (3) - get shadow password file entry
getspnam (3) - get shadow password file entry
getspnam_r (3) - get shadow password file entry
git-credential-cache (1) - Helper to temporarily store passwords in memory
gitcredentials (7) - providing usernames and passwords to Git
grpconv (8) - convert to and from shadow passwords and groups
grpunconv (8) - convert to and from shadow passwords and groups
grub-mkpasswd-pbkdf2 (1) - generate hashed password for GRUB
lckpwdf (3) - get shadow password file entry
login.defs (5) - shadow password suite configuration
lppasswd (1) - add, change, or delete digest passwords.
Net::LDAP::Control::PasswordPolicy (3pm) - LDAPv3 Password Policy control object
Net::LDAP::Extension::SetPassword (3pm) - LDAPv3 Modify Password extension ob...
pam_pwhistory (8) - PAM module to remember last passwords
pam_unix (8) - Module for traditional password authentication
passwd (1) - change user password
passwd (1ssl) - compute password hashes
passwd (5) - the password file
passwd2des (3) - RFS password encryption
putpwent (3) - write a password file entry
putspent (3) - get shadow password file entry
pwck (8) - verify integrity of password files
pwconv (8) - convert to and from shadow passwords and groups
pwunconv (8) - convert to and from shadow passwords and groups
seahorse (1) - Passwords and Keys
setpwent (3) - get password file entry
setspent (3) - get shadow password file entry
sgetspent (3) - get shadow password file entry
sgetspent_r (3) - get shadow password file entry
shadow (5) - shadowed password file
shadowconfig (8) - toggle shadow passwords on and off
smbpasswd (5) - The Samba encrypted password file
smbpasswd (8) - change a user's SMB password
ulckpwdf (3) - get shadow password file entry
unix_chkpwd (8) - Helper binary that verifies the password of the curren...
unix_update (8) - Helper binary that updates the password of a given user
vigr (8) - edit the password, group, shadow-password or shadow-gr...
vipw (8) - edit the password, group, shadow-password or shadow-gr...
xcrypt (3) - RFS password encryption
xdecrypt (3) - RFS password encryption
xencrypt (3) - RFS password encryption
$ apropos -a change password
chage (1) - change user password expiry information
kpasswd (1) - change a user's Kerberos password
passwd (1) - change user password
PKCS12_newpass (3ssl) - change the password of a PKCS12 structure
$ apropos -a space disk
df (1) - report file system disk space usage
df (1p) - report free disk space
man -k -S1 password
I use man and apropos all of the time, but I have no illusion that they're new user friendly. They're amazing when you just need to look up some options or get the syntax for a system call, but if you've not assimilated the Unix way of thinking they're quite obtuse.
Apropos would be a powerful part of a "help" command, but in 50 years nobody has ever written a help command that gained enough traction to be included in the standard toolchain.
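As a rough idea of what such a wrapper could look like, here is a Python sketch. It is not an existing tool: help_command, run_help and the stopword list are invented for illustration. It only strips filler words from a plain-language question and hands the rest to `apropos -a`, the same man-db invocation shown earlier in the thread.

```python
import shutil
import subprocess
from typing import List

# Filler words to drop from a question. A made-up list, tuned to nothing.
STOPWORDS = {"how", "do", "i", "my", "the", "a", "to", "on", "of",
             "much", "have", "left", "this", "find", "out", "need"}


def help_command(question: str) -> List[str]:
    """Turn a plain-language question into an `apropos -a` command line."""
    words = question.lower().replace("?", "").replace(".", "").split()
    keywords = [w for w in words if w not in STOPWORDS]
    return ["apropos", "-a", *keywords]


def run_help(question: str) -> str:
    """Run the generated command and return its output, if apropos exists."""
    if shutil.which("apropos") is None:
        return "apropos is not installed on this system"
    result = subprocess.run(help_command(question),
                            capture_output=True, text=True, check=False)
    return result.stdout or "nothing appropriate"
```

With this, "How do I change my password?" becomes `apropos -a change password`. It inherits all of apropos's limits, of course: if the man page descriptions don't contain the user's words, nothing matches.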
Apropos doesn’t do its job because it isn’t named, or integrated with, help.
The tools themselves leave a lot to be desired from a UX perspective too. Things like nonsense names, bad defaults and/or no inherent intelligence in the program mean you have to specify numerous options manually for things that should be easy to automate.
The whole experience would benefit greatly from telling people about things like man/apropos on first startup, and from other one-time tutorials.
There are some days where I feel tempted to write small helper scripts whose sole purpose would be to rename/reorient the default experience/flags of various CLI apps, so that it is not a jumbled mess full of historical accidents that can no longer be changed.
“I need to find out how much space I have left on this machine.”
"Oh there's an entry called My PC" (click)
"Aha, there's my disk and little bar showing the free space"
It's two clicks if you have your wits about you, maybe a few more if it's not that obvious.
I don't know why you singled out Windows. It's just as easy in Gnome or KDE etc - the point here is that CLIs can never be as intuitive as GUIs. As the original article said - Humans are very visual people. Seeing things in a visual space is much easier for us.
Personally I think the experience is very similar for noobs.
But then again, why wouldn't the Linux guy search Google in the first place?
Or if it is a total newbie, look for the disk icons, the information is there as well.
then it's a really bad one.
In a sense, the Google search bar is a type of chatbot, but we don't converse with it in grammatical English. It doesn't present as a human, so there's no uncanny valley effect. What gets typed into Google is a sort of lingua franca that we've all collectively learned through 20+ years of increasingly capable search engines. What we need is that level of lingua franca, but for a full, state-change-driving conversation, instead of just a one-step search.
It's not the machines that need to learn, it's us.
The uncanny valley effect, in my opinion, is due to the fact that 99% of our speech and writing is not dictated by content or necessity, but instead driven by social tendencies.
If I ask my friend to lunch, I'll text him "lunch?". But if I ask my boss out to lunch, it's going to be something like a paragraph, explaining what I'd like to cover during the lunch.
Most of what we say is a kind of dance to assure other people that we aren't stupid. We don't need this formality when dealing with non-human entities, so speaking in grammatical language when chatting with a bot feels incredibly stupid.
Particularly, of course, they no longer care whether results include your required words.
One has to learn and re-learn how to get the best from these systems.
As an aside, I'm impressed with Alexa on Fire TV, but until it enters data into the apps and searches within media libraries, it remains a novelty.
I think customer service departments find them boring and want to automate them away.
Voice-based systems currently combine all of the limitations of human conversation (e.g., low bandwidth, strongly linear, possibilities hidden) with all the limitations of computers (e.g., not very bright, highly literal, inflexible).
I think the main reason they're so popular for support is that people are bad at accounting. If I install an IVR system, I see that calls to human agents go down, saving me easily measured cash money. But I probably haven't measured the time and cognitive load burden shifted to customers, or the value lost by suboptimal use of whatever they're supporting.
They already have those online, in the form of FAQs and knowledge bases that don't tell you anything new.
It's not as though the tones are harder to recognise!
But seriously, it's the same reason so many sites won't accept a credit card number with spaces in it. Someone writing from scratch the same thing that's been written a thousand times, but only thinking about the way they would use it (or the way they were told others would use it).
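To show how little it takes to accept the number the way people actually type it, here is a rough sketch. The function name and the 12 to 19 digit length range are assumptions for illustration, and this only normalizes the input; it does not validate the card.

```python
import re


def normalize_card_number(raw: str) -> str:
    """Strip the spaces and hyphens people naturally type, keep the digits."""
    digits = re.sub(r"[\s-]", "", raw)
    # Assumed length range for illustration; real rules depend on the issuer.
    if not re.fullmatch(r"[0-9]{12,19}", digits):
        raise ValueError("card number must be 12 to 19 digits")
    return digits


# A well-known test number, typed three different ways:
for typed in ("4111 1111 1111 1111", "4111-1111-1111-1111", "4111111111111111"):
    print(normalize_card_number(typed))
```

All three inputs come out as the same sixteen digits, so there is no reason to bounce the user with an error for typing the number the way it is printed on the card.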
Because the boardroom got a huge boner when realizing they can do major cost cutting in the human department.
Spamming the # key repeatedly seems to work pretty well for this too.
Of course one can say that this is not an exclusively AI issue. Call centers, especially offshore ones, have the same problem.
Those strings are piped through a switch statement of static responses, or fall out into the default response. This was the situation in 2013 when I created a speech bot in PowerShell, and it sounds like the state of the art here hasn't progressed far.
What's needed is logic to dynamically build the Switch statements, or otherwise better parse human entry and build responses. There has been much work on a few different fronts, but I'm not aware of any which were wildly successful.
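For anyone who hasn't seen one of these, the static approach looks roughly like the sketch below. This is Python, not the PowerShell bot mentioned above, and the phrases and responses are made up for illustration.

```python
# Hypothetical table of recognized phrases and canned answers.
RESPONSES = {
    "hello": "Hi there.",
    "hours": "We are open 9 to 5.",
    "goodbye": "Bye.",
}
DEFAULT = "Sorry, I didn't understand that."


def reply(utterance: str) -> str:
    """Look the recognized string up in a fixed table, else fall to the default."""
    key = utterance.strip().lower()
    return RESPONSES.get(key, DEFAULT)


print(reply("Hours"))                  # exact keyword: canned answer
print(reply("what are your hours?"))   # any natural phrasing: default
```

The second call is the whole problem: anything that isn't an exact match for a known key lands in the default branch, no matter how obvious the intent is to a human.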
Cloud services backed by big data sets tend to be better, although I admit I haven't tried a local copy of Dragon NaturallySpeaking for a long time.
In any case, at least assuming fairly mainstream American/English accents, the voice recognition isn't really the problem any longer. Sophisticated NLP and responses are. We're a long way from virtual assistants that can do anything sophisticated.
Amazing what they are spending to keep up illusions.