

A verbal command line for the world - ThomPete
http://swombat.com/2011/10/26/siri-command-line

======
mseebach
If Siri can become a CLI for the world, then that is brilliant. But the most
powerful feature of the CLI is the pipe - " _passing the output of one program
to the next, to make magic happen_ " - and I remain skeptical that something
resembling that can be made to work using a voice interface. Basically, I want
to be able to say "Find a bus 19 that allows me to be at work at least 15
minutes before my first meeting each day, add it to my calendar and update
every day at 9 pm", the equivalent of which is pretty simple in a Unix CLI.
But, not having the command line to review and edit, I also need to be able to
say this without first carefully considering the exact structure of the
sentence.
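
To make the pipe idea concrete, here is a minimal Python sketch of the bus request as a chain of small stages, each consuming the previous stage's output. Everything in it is an assumption for illustration: the arrival times, the meeting time, and the helper names (`pipe`, `arriving_before`, `latest`, `as_calendar_entry`) are hypothetical and do not correspond to any real Siri or transit API.

```python
from datetime import datetime, timedelta
from functools import reduce


def pipe(value, *stages):
    """Feed value through each stage in order, like `a | b | c` in a shell."""
    return reduce(lambda acc, stage: stage(acc), stages, value)


# Hypothetical data: when each bus 19 reaches work, and the first meeting.
ARRIVALS = [
    datetime(2011, 10, 27, 8, 20),
    datetime(2011, 10, 27, 8, 40),
    datetime(2011, 10, 27, 8, 55),
]
FIRST_MEETING = datetime(2011, 10, 27, 9, 0)


def arriving_before(deadline):
    """Build a stage that keeps only arrivals at or before the deadline."""
    return lambda arrivals: [a for a in arrivals if a <= deadline]


def latest(arrivals):
    """Pick the latest remaining arrival (the one with the least waiting)."""
    return max(arrivals)


def as_calendar_entry(arrival):
    """Format the chosen arrival as a calendar entry title."""
    return f"Bus 19, arrive {arrival:%H:%M}"


entry = pipe(
    ARRIVALS,
    arriving_before(FIRST_MEETING - timedelta(minutes=15)),
    latest,
    as_calendar_entry,
)
print(entry)
```

The point of the sketch is that each stage is trivial and reviewable on its own; the hard part for a voice interface is expressing and editing the composition, not the individual steps.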

Being able to control and input data to an increasing number of applications
reliably is undoubtedly very good, but considering the many many many failed
attempts to replace text entry to describe anything mildly complex suggests
that it's not going anywhere.

~~~
jamesrcole
_But, not having the command line to review and edit, I also need to be able
to say this without first carefully considering the exact structure of the
sentence._

That's no doubt difficult, but you just seem to be implying it can't be done,
without giving any reasons why.

 _Being able to control and input data to an increasing number of applications
reliably is undoubtedly very good, but considering the many many many failed
attempts to replace text entry to describe anything mildly complex suggests
that it's not going anywhere._

No it doesn't. Simply that a lot of people have tried something and failed to
achieve it doesn't mean anything. Lots of people failed to make flying
machines, until someone did.

You may be able to give particular reasons why it is difficult and why you
think it not likely to be achieved, but the fact, by itself, that no one has
been able to do it so far, doesn't demonstrate anything.

~~~
jerf
"Simply that a lot of people have tried something and failed to achieve it
doesn't mean anything."

Yes it does. It doesn't _prove_ anything, but it does _mean_ something.

I'd point out that we've surrounded command lines with a lot of accoutrement
over the years, like history, conditionals, and various ways of outputting
things such that we can debug our system. The utility of the command line
would plunge if we didn't have any of those, and it is completely unclear how
to add those to Siri without turning it into simply another programming
environment, a voice-activated interface to Yahoo Pipes.

I'd submit that there is a certain irreducible complexity in the command line,
and despite how fuzzy and wonderful it may make you feel when using it,
wrapping a voice interface around this complexity makes it _worse_ , not
better. Or we'd already be doing it. If this is trying to do the
mathematically impossible task of reducing a task's complexity below the
number of bits it actually has, then yes, it _is_ impossible, too.

~~~
evgen
_I'd point out that we've surrounded command lines with a lot of accoutrement
over the years_

I would counter your claims with the simple fact that the earliest command-
line interfaces had none of these. A process change only has to offer an
incremental improvement over what came before to drive adoption. Those of us
who used the earliest shells and the job-control languages that preceded them
know that the bar for usefulness can be set quite low and the tool will still
be used and improved upon.

~~~
jerf
"The utility of the command line would plunge if we didn't have any of those"

If I'd meant "been eliminated", I would have said that. Nevertheless, I cite
the "your mom" argument. "People" are terrified of the _nice_ command line,
and we think that they're going to switch to a vastly, vastly more user
hostile one, no matter how friendly and chipper it may seem on first glance?
If "people" wanted command lines that much they'd already be using them.

------
6ren
> The voice recognition is ok. The natural language processing is passable.
> But all those are things that can be improved...

These AI fields have been notoriously slow to improve. Apple is great at
applying new technology, e.g. the integration mentioned. They may be able to
adapt somewhat with all the data gathered (pronunciation, word-usage patterns,
typical requests), but I wouldn't expect significant improvements.

~~~
swombat
I expect them to improve because there are obvious, straightforward, practical
improvements they can make already. As an obvious example, the NLP is
currently limited in terms of the variability of ways to add things to your
shopping list. Adding more variants of that would be easy.

The huge amounts of data they're gathering about what people actually use this
for is indeed a huge advantage. With that kind of data, they can build a
centralised dictionary of common patterns for asking just about anything.
Because of Siri's practical orientation, there is no need to build a "generic
NLP engine" (which is what you're, quite rightly, suggesting is slow to
improve). They merely need to improve the effectiveness of the NLP engine they
already have, by making it understand more common patterns. That's achievable
through expert-system-like functionality that's existed for 30 years.
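
A rough sketch of what such a dictionary of common patterns could look like, in Python. This is purely an assumption about the approach, not a description of how Siri actually works; the pattern table, the intent name `add_to_shopping_list`, and the `parse` function are all hypothetical.

```python
import re

# Hypothetical pattern table: each entry maps one common phrasing to an intent.
# Supporting a new phrasing means appending one more row.
PATTERNS = [
    (re.compile(r"^add (?:buy )?(?P<item>.+?) to my shopping (?:list|reminders)$", re.I),
     "add_to_shopping_list"),
    (re.compile(r"^put (?P<item>.+?) on my shopping list$", re.I),
     "add_to_shopping_list"),
    (re.compile(r"^remind me to buy (?P<item>.+)$", re.I),
     "add_to_shopping_list"),
]


def parse(utterance):
    """Return (intent, item) for the first matching pattern, or None."""
    text = utterance.strip().rstrip(".!?")
    for pattern, intent in PATTERNS:
        match = pattern.match(text)
        if match:
            return intent, match.group("item").lower()
    return None


print(parse("Add ketchup to my shopping list"))
print(parse("Add buy ketchup to my shopping reminders"))
```

With a table like this, usage data tells you which rows are missing, and adding one does not require a better general-purpose NLP engine.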

In terms of the voice recognition, that has been progressing steadily for 30
years too, and I imagine Siri will benefit from improvements to the technology
base, as others will.

~~~
fuzzix
"In terms of the voice recognition, that has been progressing steadily for 30
years too, and I imagine Siri will benefit from improvements to the technology
base, as others will"

Will it progress to a point where ambiguity is reduced to nil? The strength of
the command line abstraction is its completely unambiguous interface - do what
I say.

The do what I mean interface of Siri could be limited to actions with
insignificant consequences - the human brain incorporates the best speech
recognition available and still makes mistakes. Any speech AI performing
significant actions would need to outperform the brain.

 _edit_ : That is not to suggest AI speech recognition outperforming the brain
is not possible. There are probably metrics and methods for disambiguating
common sources of confusion which could be performed in the blink of an eye
rather than the minutes, hours or weeks later you find yourself thinking "Oh,
THAT'S what he said!"

~~~
swombat
_The do what I mean interface of Siri could be limited to actions with
insignificant consequences - the human brain incorporates the best speech
recognition available and still makes mistakes. Any speech AI performing
significant actions would need to outperform the brain._

Or actions which can be undone.

I'd be very concerned if the US army decided to use Siri to control its
nuclear missiles. Less so if someone uses Siri to change their thermostat or
query their fridge contents.
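
As a sketch of the "actions which can be undone" idea, in Python: every action records how to reverse itself, so a misheard command costs one "undo" rather than anything permanent. The `Thermostat` and `UndoableAssistant` classes are hypothetical, made up for this illustration.

```python
class Thermostat:
    """Hypothetical device with a single setting."""

    def __init__(self, celsius):
        self.celsius = celsius


class UndoableAssistant:
    """Performs actions and remembers how to reverse each one."""

    def __init__(self):
        self._undo_stack = []

    def set_temperature(self, thermostat, celsius):
        previous = thermostat.celsius
        thermostat.celsius = celsius
        # Record a closure that restores the previous setting.
        self._undo_stack.append(lambda: setattr(thermostat, "celsius", previous))

    def undo(self):
        """Reverse the most recent action; return False if there is none."""
        if not self._undo_stack:
            return False
        self._undo_stack.pop()()
        return True


thermostat = Thermostat(20)
assistant = UndoableAssistant()
assistant.set_temperature(thermostat, 12)  # misheard "twenty" as "twelve"
assistant.undo()
print(thermostat.celsius)
```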

~~~
fuzzix
"I'd be very concerned if the US army decided to use Siri to control its
nuclear missiles. Less so if someone uses Siri to change their thermostat or
query their fridge contents"

Indeed... A command line for non-critical parts of the world.

For launching missiles? Nothing less than a bash script! ;)

------
rdl
The only time I'd want a spoken command line is when I can't type; that is
basically only while driving (or while dodging being shot/bombed). Admittedly
that's a big use case (the driving part), so Siri makes some sense there.
Otherwise I'll stick to typing -- I _prefer_ textual communication even with
another human in the same room, so maybe I'm an outlier.

~~~
jamesjyu
I've been finding myself using Siri for texting even while walking or doing
activities less dangerous than driving. I find that I can "type" much faster
with my voice, and Siri is damn accurate.

------
icebraining
The problem is that Siri has the drawbacks of the CLI, without many of its
features. Don't get me wrong: as I said previously, Siri gets me excited like
no other iPhone feature does.

But the reality is that, like CLIs, Siri has an unforgivingly precise syntax,
but unlike CLIs it lacks contextual help (tab completion), easy ways to string
commands together and encapsulate them as one (pipes and scripts), job
control, etc.

Right now, Siri is a poor voice based CLI, yes. But I think in the future
-when natural language processing improves- these types of applications will
be less and less of a CLI, becoming their own UI paradigm.

------
danmaz74
If Siri is really the equivalent of a CLI for phones, then it's destined to be
a niche technology.

The main reason why most users don't use a command line isn't, as the article
says, because "most people have been satisfied enough with that [the GUI], so
they haven't ever bothered learning to use a command line". It's because
command lines are only better for very advanced users and/or people with an
exceptional memory.

Fact is, with a command line you don't "talk" to a computer. You actually
program it, with a kind of interpreted language and a plethora of libraries
(executables). This means that you actually need to be VERY exact in what you
write, you have to remember the "magic incantation" in every detail, or it
will not work - or, likely enough, disaster will follow.

Vocal commands can be useful for the masses for very simple tasks, if the
incantation is very easy to remember. To be able to actually talk to
computers, on the other hand, we need actual AI, of the Turing test kind. We
aren't anywhere near that for now.

~~~
swombat
_The main reason why most users don't use a command line isn't, as the article
says, because "most people have been satisfied enough with that [the GUI], so
they haven't ever bothered learning to use a command line"._

They've been satisfied enough with the GUI because the alternative, the
command line, was several orders of magnitude harder to learn than the GUI.
Siri is comparable, or possibly easier, in terms of learning difficulty.

 _Fact is, with a command line you don't "talk" to a computer. You actually
program it_

But with Siri, you talk to it. That's where the NLP piece comes in. "Wake me
up at 7am" works, so does "Can you set an alarm for 7am?", etc. The syntax is
deliberately loose and not programming-language-like. So you don't need to be
exact at all.

~~~
Angostura
That's precisely the reason why the CLI parallel falls over, in my opinion.

A CLI is an interface that lets the user communicate using the computer's
language.

Siri is an interface that [attempts] to let the user communicate using their
own language.

It's a fundamental difference I think. Indeed the only thing that Siri and
CLIs have in common is that they are non-graphical.

~~~
jamesjyu
_It's a fundamental difference I think. Indeed the only thing that Siri and
CLIs have in common is that they are non-graphical._

This is actually the similarity that matters. In a GUI interface, you have to
page through "commands" to find what you are looking for. In CLIs and Siri,
you simply go straight to the command by typing or speaking.

Programmers love CLIs because, after the initial cost of learning the
commands, you can do things much faster than a GUI. But, speed of executing
commands is less important on the desktop than on a phone, which is why I
suspect most normal users don't bother to learn a CLI.

But when you're on a phone trying to tell your boss you're running late
while coming down a flight of stairs, Siri is going to come in real handy.

------
bobbles
I loved this quote

>("Add buy ketchup to my shopping reminders" failed, but "Add ketchup to my
shopping list" worked)

This is exactly the way I would expect it to work. Would you say to a personal
assistant: "Add buy ketchup to my shopping reminders"? It isn't natural. The
second phrase is much more natural and was understood as one would expect.

~~~
incremental
The funny thing is that he is clearly thinking like a computer programmer in
his first sentence. I wonder if naive users would generally have better
results on first contact with Siri, compared to technical users who try to
over-structure their commands?

~~~
swombat
You're very possibly right. That said, the "Add ketchup to my shopping list"
version only sounds good for shopping lists.

If I wanted a list of work reminders, I might have: "Add send contract to john
to my work list." which would be less natural... also, in my mind, what I was
adding to my reminders was the task to buy ketchup, not "ketchup" by itself.

Anyway, I'm sure they'll add more supported syntaxes, and in parallel people
will learn and adapt to the syntaxes that are actually supported.

------
Tichy
"When is the next bus 19 coming?"

This might be where the problems will start to surface. I suspect at the
moment Siri knows that some keywords likely refer to the calendar, and acts
accordingly. If you add bus schedules, you create a big problem - "when" will
make Siri search your calendar, which will not be the right database for bus
schedules.

Then again this might be easily fixed by simply searching all connected
services for "bus 19", but this too might reach a limit of feasibility with a
growing number of databases.

Maybe Siri can learn that "bus" most likely refers to the public transport
database. It's probably not completely stupid, but it still seems unclear how
much it can really do.
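
A toy Python sketch of that kind of keyword-based routing, to show both how it could work and where it gets shaky. The service names and keyword weights are invented for the example and are not taken from Siri.

```python
# Hypothetical keyword weights per service. A generic word like "when" is a
# weak hint for several services; "bus" is a strong hint for one.
SERVICE_KEYWORDS = {
    "calendar": {"when": 1, "meeting": 3, "appointment": 3},
    "public_transport": {"when": 1, "bus": 3, "train": 3},
}


def route(query):
    """Return the best-scoring service for the query, or None if nothing matches."""
    words = query.lower().strip("?.!").split()
    scores = {
        service: sum(weights.get(word, 0) for word in words)
        for service, weights in SERVICE_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(route("When is the next bus 19 coming?"))
print(route("When is my next meeting?"))
```

Note that a bare "When is it?" scores the same for both services, and the sketch just picks the first one listed. That tie is exactly the ambiguity that gets worse as more databases are connected.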

Of course having such interfaces in the future would be cool, it is just not
clear that Siri is the first step in that direction.

------
helton
I wonder how third party apps would interact with Siri.

"What is the music that is playing now?" Shazam and SoundHound would want to
be the service provider for this kind of query. How would Siri select which
app to use?

~~~
CWIZO
Same as Android does when you trigger a task that more than one app can
handle: it asks you (and you can, of course, tell it to remember your choice
so it doesn't bother you again in the future).
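
A minimal Python sketch of that "ask once, optionally remember" behaviour. It models only the behaviour described above and is not Android's actual API; `HandlerRegistry`, `resolve`, and the `ask_user` callback are hypothetical names.

```python
class HandlerRegistry:
    """Tracks which apps can handle a task and any remembered user choice."""

    def __init__(self):
        self._handlers = {}  # task -> list of app names
        self._defaults = {}  # task -> app the user asked us to remember

    def register(self, task, app):
        self._handlers.setdefault(task, []).append(app)

    def resolve(self, task, ask_user):
        """Pick an app for the task.

        ask_user(candidates) must return (chosen_app, remember_choice).
        It is only called when several apps compete and no default is stored.
        """
        if task in self._defaults:
            return self._defaults[task]
        candidates = self._handlers.get(task, [])
        if not candidates:
            return None
        if len(candidates) == 1:
            return candidates[0]
        chosen, remember = ask_user(candidates)
        if remember:
            self._defaults[task] = chosen
        return chosen


registry = HandlerRegistry()
registry.register("identify_music", "Shazam")
registry.register("identify_music", "SoundHound")

# Simulated user: picks the first candidate and ticks "remember my choice".
print(registry.resolve("identify_music", lambda apps: (apps[0], True)))
```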

------
adambyrtek
All this excitement about Siri makes me wonder, would people really talk to
their phones in public places? Sometimes social barriers are much harder to
overcome than technological ones.

~~~
swombat
I think they will, and Apple have made a brilliant design choice in allowing
Siri to fire up when you bring the phone to your ear (and you're not in a
phone call). People are already comfortable talking on the phone in public,
and it's not immediately obvious to onlookers that the "person" being talked
to is the phone itself.

~~~
andrewflnr
I have actually seen people talking on the phone while just holding the
phone in front of them, so there's already precedent for that as well. I don't
think social barriers will last long.

------
Tycho
Siri on Mac desktops would be great. Organize your work and surf the web while
eating your lunch, hands off the keyboard.

~~~
Tichy
Talking with your mouth full?

------
stcredzero
For some reason, people have missed out on one of the biggest avenues of
potential in R&D originated environments like Self and Smalltalk -- _every
object basically has a CLI_.

Yes, I'm serious. If you poke around in the Smalltalk environments, you will
find not only that everything is an object, but also that _every single
object_, everything from complex libraries to the humble character, has
something like a CLI that you can write little scripts against.

Squeak Smalltalk not only has something like a CLI for directly manipulating
objects on a low level, there is also a framework called Morphic which lets
you directly manipulate every individual object with a little GUI. It's as if
every object also had its own lightweight IDE attached to it.
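
For readers without a Smalltalk image handy, a very loose analogy in Python (not Smalltalk, and far less complete than what Squeak offers): any object can be asked what messages it understands, and can be sent one by name. The helper names `messages` and `send` are made up for this sketch.

```python
def messages(obj):
    """Names of the public methods an object responds to."""
    return sorted(
        name
        for name in dir(obj)
        if not name.startswith("_") and callable(getattr(obj, name))
    )


def send(obj, message, *args):
    """Send a named message to any object, loosely like a Smalltalk message send."""
    return getattr(obj, message)(*args)


# Even "the humble character" can be inspected and scripted against.
print("upper" in messages("a"))
print(send("a", "upper"))
print(send([3, 1, 2], "count", 1))
```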

