In my opinion, the biggest thing preventing me from using Siri is not what it can and cannot do, but that it has been nearly impossible for me to develop a ToM for Siri. And since I simply don't know what Siri is capable of, I only use the bot for a very narrow set of tasks. Furthermore, one of my ToM priors for Siri is that she has a mind that is incapable of learning. This is a big turn-off - bigger than I think we acknowledge - since we are used to interacting with entities that can learn, even if laboriously. For instance, my Australian shepherd might not be able to bring me the newspaper, but if I really wanted her to do that task, I could slowly get her to approximate that behavior, and it would probably be satisfying to see her performance improve. With Siri, I simply assume there are things she can do and things she cannot, and that it'd be pointless to try to goad her into adding even a trivial task to her functional repertoire.
I'm not a fan of the current AI craze and happily like my devices somewhat stupid, but if they're going to try to be smart, they should at least learn my predictable behaviors and offer me shortcuts based on them.
It's unclear to me what these new shortcuts really offer, but I'll be interested to try them. However, I have a hunch that what I want is actually even simpler than what these will provide.
I have a Workflow that, when run, uses Maps to estimate the drive time between wherever I am and my house, pads the estimate by 5 minutes, adds that to the current time to get an ETA, formats it into a text message, and sends the ETA to my wife.
In Launch Center Pro, I have a geofenced shortcut that presents a notification prompting me to run my workflow whenever it sees that I have left the area around my office building.
The UX: halfway between my office building and my car, I get a notification from LCP. I tap the notification, Workflow thinks for a moment, presents me with a text message to my wife, and I tap "send".
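For the curious, the core of that workflow maps pretty directly onto MapKit's directions API. A minimal Swift sketch of the ETA step, assuming a hard-coded home coordinate and skipping error handling:

    import MapKit

    // Estimate drive time from the current location to home, pad it by
    // 5 minutes, and format an ETA message (mirroring my Workflow steps).
    func buildETAMessage(from current: CLLocationCoordinate2D,
                         completion: @escaping (String?) -> Void) {
        // Placeholder coordinate standing in for my house.
        let home = CLLocationCoordinate2D(latitude: 37.33, longitude: -122.03)

        let request = MKDirections.Request()
        request.source = MKMapItem(placemark: MKPlacemark(coordinate: current))
        request.destination = MKMapItem(placemark: MKPlacemark(coordinate: home))
        request.transportType = .automobile

        MKDirections(request: request).calculateETA { response, _ in
            guard let response = response else { return completion(nil) }
            let padded = response.expectedTravelTime + 5 * 60 // 5-minute pad
            let eta = Date().addingTimeInterval(padded)
            let formatter = DateFormatter()
            formatter.timeStyle = .short
            completion("On my way home, ETA \(formatter.string(from: eta))")
        }
    }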
As nice as this is already, I believe Siri Shortcuts may improve it in several ways:
1) Currently, apps are not allowed to send messages without explicit user interaction, so Workflow can't actually text my wife; it just presents me with an iMessage screen and an already-written message that I must tap "send" on.
2) Workflow does not support geofenced or time-based running of workflows, which is why I need LCP to launch my workflow based on a geofence. If Siri Shortcuts supports this, then I won't have to rely on LCP anymore.
3) There is no way (that I'm aware of) to trigger my workflow automatically; I must tap a notification and unlock my phone, or open an app and select the workflow within it. An improvement would be a programmable Siri Shortcut ("Hey Siri, let my wife know my ETA"), and even better would be automatic running of workflows based on defined conditions, so my workflow would run and send a message automatically when I cross the geofence, without me even needing to be aware of it (besides maybe a notification letting me know the workflow has run).
> It does look like certain annoyances with Workflow are showing in Shortcuts as well. For example, sending a message via a Shortcut still requires you to manually hit the send button
I have noticed, however, that it won't always make suggestions for similar repeated texting events. I think it has to do with limited space in the iOS auto-complete bar, because once I get past the first couple of words, it fills in the rest.
If you're using iOS, is it possible that your usual phrase is too long to fit in auto-complete?
As for the other things you suggest, it appears this is the direction Apple is headed, based on the most recent WWDC keynote.
I set the same alarms Monday through Friday to wake up for work (usually a few, since I'm a deep sleeper). While I know I could schedule these, I usually prefer to set them manually, so I wind up doing that every night.
In the morning I can say, "Siri, turn off all alarms" and Siri will disable any that are set.
But at night, I can't say "Siri, set my usual alarms for tomorrow morning."
Seems like a critical type of action that a predictive engine should be able to process. But perhaps I'm wrong.
(No affiliation -- their app just solves a need of mine.)
The child is in a room with Sally and Anne. Sally has a basket and Anne has a box. Sally puts the marble in her basket and leaves the room. While she's out, Anne steals the marble, and puts it in her box. Sally then comes back into the room.
You ask the child, "Where will Sally look for her marble?" This tests whether or not they can see the world through Sally's eyes, and understand that her view of it will be different from their own.
The experiment was formulated by Simon Baron-Cohen.
Your comment seems to be a general gripe, and it doesn't address the fact that this new change may actually solve your problem.
(I agree with you, but I'm excited by this addition because it finally feels like a real difference.)
For instance, these comments from 9to5Mac:
"While Siri and native integration with the operating system is fantastic, this is still very much the Workflow app, with a few extra bells and whistles. It works and function in a similar fashion by creating blocks that trigger one after another"
"One example of using Shortcuts with Siri is creating a Shortcut that grabs your current location, creates an ETA home, enables Do Not Disturb, and then sends a text to your roommate that you’re on your way home, and plays some music. All while you’re getting into your car, and preparing to leave. And all you have to say is “Hey Siri, I’m on my way home.” Alternatively, you could also just go into the Shortcuts app and start the request there."
All Shortcuts adds is a way to trigger all of them in one new step. It's basically a dumber IFTTT for Siri: you can't build new behaviors in it, you can just script together existing ones. Useful, but it's still hard to tell what Siri can do.
A huge area where all of the voice assistants could improve is error conditions. At home I'll say something like "Alexa, turn on the bedroom light" and it will respond that it doesn't know any device of that name. It should offer to tell you the device names it does know about.
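A strawman of the kind of fallback I mean, in Swift: a toy edit-distance match over whichever device names the assistant already knows (the names and phrasing here are mine, not any real assistant's logic):

    // Given an unrecognized device name, suggest the closest known one
    // instead of a bare "I don't know that device."
    func suggestion(for heard: String, knownDevices: [String]) -> String {
        // Plain Levenshtein distance, lowercased.
        func distance(_ s: String, _ t: String) -> Int {
            let a = Array(s.lowercased()), b = Array(t.lowercased())
            guard !a.isEmpty else { return b.count }
            guard !b.isEmpty else { return a.count }
            var row = Array(0...b.count)
            for i in 1...a.count {
                var prev = row[0]
                row[0] = i
                for j in 1...b.count {
                    let cost = a[i-1] == b[j-1]
                        ? prev
                        : min(prev, row[j], row[j-1]) + 1
                    prev = row[j]
                    row[j] = cost
                }
            }
            return row[b.count]
        }
        guard let best = knownDevices.min(by: { distance(heard, $0) < distance(heard, $1) }) else {
            return "I don't know about any devices yet."
        }
        return "I don't know \"\(heard)\". Did you mean \"\(best)\"? I know about: \(knownDevices.joined(separator: ", "))."
    }

So "bedroom light" against known devices ["bedroom lamp", "kitchen light"] would come back with "Did you mean bedroom lamp?" instead of a dead end.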
You say that as if it's a completely non-trivial thing to implement.
But, assuming that's what you meant, how does that compare to, say, speech recognition and understanding? Apple, Google, and Amazon are up to their eyebrows in machine learning and AI. They're working to solve very, very hard problems. But the next step needs to be making all that hard work organically discoverable by users.
I think that problem is much harder than plain machine learning, because you have to somehow gauge what the user is doing, or what tasks they're performing, well enough for the AI to even tell where it can insert itself. I don't know of any product or service right now that can "watch" a user's behavior and suggest places where the AI can insert itself, and I suspect the reason is that doing something like that is very, very difficult.
It's one thing to detect that you leave work at 5pm every day and usually drive home. It's another to say "I noticed that you set a cooking timer every day around 6pm; do you want me to set that for you automatically?" You're not setting a timer just to set a timer; you're setting a timer because maybe the recipe you're using requires it. Different recipes will have different timers. The AI doesn't know the intent behind the action, just the action itself.
I say that as somebody who has greatly benefited from that type of functionality in the past.
Again, what you're asking for isn't a trivial implementation.
At the very least, it would be nice if I could say "Hey Siri, what voice commands can I use with app X?"
Just looking over the list still doesn't address the OP's point. Maybe if I memorized it - in the same way that, if I memorized an exhaustive list of everything a 4-year-old child can do, I'd be able to 'intuit' what they could do - but why would I do that?
That's just it; the whole 'theory of mind' idea is that we can intuit what someone else can do, think, etc., without such a list. I'm able to limit my own mind to have the same limits as someone else's. That is, I can imagine what someone else is likely thinking given a subset (or even a theoretical superset! I.e., "They know if there is money in this account. If there is, they are likely to do X. If there isn't, they are likely to do Y") of the information that I have. I can determine what a child will be able to do based on exposure to other things they can do.
None of that applies to Siri. I can't infer what she has access to (both in terms of data and functionality). I can't use the capabilities of one thing to infer capabilities in another. She can order me an Uber; can she order me a pizza? Can she order me a highly detailed expandable oak table from a boutique vendor? I can infer what a real PA is likely to be able to do (even if I don't know the specifics of how), but I can't do that with Siri. It's a black box. Giving me an exhaustive list of all the things doesn't change that; it's now a black box with a manual. Yes, okay, maybe if I memorize the manual I can determine what she can do, but the point the OP is making is that for a real PA I can infer what capabilities they have without memorizing a manual. Until I can do the same with Siri, she's not a replacement; she feels gimmicky and has real barriers to adoption to overcome.
Funny, I'm the opposite. I like the ability to flip through and see "Oh, I didn't know it could do that!" Then I mentally file it away as a thing that exists.
I would never have learned just by experimenting that Siri (via Wolfram Alpha) can tell me what planes are overhead, because I would never have thought to ask that. But since I read a list of interesting things Siri knows, I now know it has that information.
Just trying to guess what capabilities are available is like trying to learn how a unix command works with no man page. "Just run it with every possible flag and see what happens!" It'd be great if Siri could do everything, but she can't, and the search space of possible actions with natural language is far too large to find everything I might use by guesswork. A black box with a manual is better than a black box without one.
In the new Shortcuts app, there's a pretty clear view of what Siri can do - basically any text editing you want (someone even made a C parser).
There's a fairly concise list of areas. With a 10-second skim, you can get a sense of "oh, Siri can be taught to do stuff in these areas."
In the same way, you wouldn't necessarily know what a dog could do without a little outside research (per the theory-of-mind dog example upthread).
Meanwhile, with iOS 12, all apps will be able to add prominent "Add to Siri" shortcuts in their apps (sketch below). So as you use an app, you'll see "ahh, Siri can be trained to work with this app in this way."
Then these can be chained together. So you can now get a fairly clear general idea of what may be possible.
You still need to learn details to do specific training, but this seems like a big step up, and closer to the dog example. You can at least form a theory of what Siri can likely be taught to do.
Not a PA yet, but to me this new iteration is at least graspable.
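For what it's worth, the developer side of those "Add to Siri" buttons is only a few lines in iOS 12. A hedged sketch using the INUIAddVoiceShortcutButton API; the activity type and titles are invented for illustration:

    import IntentsUI

    // Offer an in-app action to Siri via an "Add to Siri" button.
    // "com.example.brew-coffee" is a hypothetical activity type.
    func makeAddToSiriButton() -> INUIAddVoiceShortcutButton {
        let activity = NSUserActivity(activityType: "com.example.brew-coffee")
        activity.title = "Brew my usual coffee"
        activity.isEligibleForSearch = true
        activity.isEligibleForPrediction = true // lets Siri suggest it proactively

        let button = INUIAddVoiceShortcutButton(style: .blackOutline)
        button.shortcut = INShortcut(userActivity: activity)
        // In iOS 12 you also set button.delegate and present the
        // add-voice-shortcut view controller it hands you on tap.
        return button
    }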
One of the problems with Siri/Google/Alexa is that they try to be everything and do everything at once.
What if they split the expertise/domains they're capable of into different names?
For example, there could be a "mind" that is really good at playing music.
So you'd say, "Hey MusicBot, pause the song."
Or, you could say "Hey, InfoBot, what's the weather?"
Over time, you could get to know more "minds" that live in the cloud, and develop a theory of mind for each of them.
Given she's an Australian shepherd, it would probably be only a bit startling to subsequently come home and see her analyzing the finance section of the paper. :)
This is, I think, one of the failings of human ToM. We tend to underestimate entities which are different from us. The differences are not only of capacity and of information. They are differences of worldview, values, and biases that are very hard to capture, even between adult humans of different cultures. It's easy to imagine people as less intelligent and less experienced versions of yourself, but that is not an accurate theory of their mind.
The same differences, only more so, are present between humans and dogs or humans and Siri. Given that the most empathetic and social people (with, I assume, a better theory of mind) are generally not found in technical roles, I wonder if they might find it easier to make that stretch and effectively use Siri or Alexa?
I hear what you are saying, but given that we are the designers of the Siri 'mind', we are in a unique position to make it more human-centric. It will be easier in the long run to adapt Siri to humans than the other way around.
- Hey Siri, am I fat?
- No Bruce, you are just big boned
- Thanks Siri.
Skeletons may not be "big boned" per se, but there are people with wider hips (as one example) who would never be able to be as slim as some supermodels based purely on their bone structure. I agree, though, that the term "big boned" is just a nonsensical platitude to avoid saying someone is overweight.
Also, for what it's worth, LMGTFY is considered by many to be a very rude response.
This is, of course, patently untrue. (Mine told lies at 12 months.)
The classic experiment I mentioned above involves a child and two adults in a kitchen. One of the adults wants a piece of candy (or something - going from memory here), but the other adult says they cannot have it yet and puts it in a cabinet. All three witness where the candy is placed. At that point the one who wants the candy leaves, but on their way out the door says "when I get back, I want the candy." So then it is just one adult and the kid in the room. The adult asks the kid to move the candy to a different location in the kitchen. The kid does this and then sits back down. After a few minutes the adult says to the kid something like, "That other adult who wants the candy will be back in a few minutes. Where do you think they will look first when they get back?" Without fail, 3-year-olds will point to the cabinet where the candy is currently located, whereas 5-year-olds will always point to the cabinet where the adult who left last saw it. Something happens in brain development around age 4 that gives us the capacity to understand that different minds hold different information based on their own experiences. This task has been repeated dozens of different ways, including simply having an adult stand in a different location in a room from 3-, 4-, and 5-year-olds, where the adult's view of a wall (of pictures) is clearly obscured by a screen. Then they ask the kids things like: how many pictures of teddy bears will that adult say are on that wall? Again, 3-year-olds report what they themselves see, while 5-year-olds perform as adults would on this task.
While an older child only lies when they think I don't have enough information to know the truth for sure.
Long story short, I think direct vocal queries to gain one piece of rote information are a terribly inefficient way to develop a ToM of someone/something. If Apple wants to bring Siri to the next level, they need to figure out a way for us to learn about her abilities passively. The catch, however, is that I don't want Siri bugging me all the time; but maybe sparingly she could give me a popup that says "hey, I can do that for you" - particularly if she has 'seen' me do it 100 times and thinks she could save me a few minutes of effort. That, IMO, would be more game-changing than adding additional Siri APIs for developers, which will just add a bunch more Easter eggs that I will never discover.
Siri is a cartographer. It seems like she has good domain-specific mastery of maps, directions, places, etc., and she might even have a thin (but growing) representation of her own about my desires and what I do and don't know when it comes to geospatial logistics. In Maps, Siri really shines, right? I could say something like "Siri, where is the nearest Jack in the Box," and via Maps she'll say stuff like: there are two Jacks within 3.1 miles, but you might also be interested in knowing there's an In-N-Out and a Shake Shack nearby as well. In-N-Out is currently not as busy as it usually is this time of day, and examining traffic patterns, I've formulated a shortcut. Should I give you those directions?
Then a few minutes later I might be eating a Double-Double and say "Hey Siri, how much memory do I have left on my phone?" and she will reply "Sorry, I can't help you with that" (the actual response I got to that question just now). Know thyself, Siri! Anyway, it's stuff like this that makes it so difficult to pin down what she is capable of.
But I'd be satisfied if Siri were to master a few more apps in a domain-general manner. That way I can continue avoiding rote memorization of commands and simply rely on Siri being a domain-general boss at this or that app - particularly for other use cases where (as in Maps) it actually makes sense to give and receive info via voice rather than simply interacting with the screen.
Up until now it hasn’t been possible due to the restrictive API. I’ve looked into the new shortcuts and Siri API docs and although the beta documentation is sparse, I’m confident I’ll be able to develop a natural, first class Siri integration.
Unfortunately, it won’t be trivial to implement (at least to do well). It probably won’t be done before the iOS 12 release, and I’m hesitant to start working on it just yet due to the sparse beta docs.
Over the next year or two, I think there will be some great Siri integrations built. Hopefully, users discover and use them :)
Edit: heh, a quick LinkedIn stalk reveals you went to UC at the same time as me!
Irritatingly, Siri doesn’t distinguish between calendars so you can’t reel off just classes, but you can ask about the day or the next ‘event’.
Personally I found this setup along with Apple’s ‘Up Next’ widget and Siri Apple Watch face to be way better than any other class management app I tried.
I guess they weren't too happy one of the kids had to keep a database of all of their students' passwords... :P
> This user-centric approach paired with the technical aspects of how Shortcuts works gives Apple’s assistant a leg up for any consumers who find privacy important. Essentially, Apple devices are only listening for “Hey Siri”, then the available Siri domains + your own custom trigger phrases.
I don't get it. How is this different from Android?
Android Actions were announced before this.
I think Assistant also works offline (with most voice commands + in voiceless mode).
SiriKit allows you to build Siri support into your app a la Android Actions, but Siri Shortcuts is designed to allow drag-and-drop end-user "programming" of workflows that can be triggered by Siri.
"Hey Siri, I'm on my way home" could turn on you thermostat up, order you a pizza, remotely trigger your IoT enabled kettle and start playing your home-commute playlist.
For the more advanced of us, Workflow currently allows doing things like calling arbitrary REST APIs and parsing JSON. I've reverse engineered the API of a local coffee-ordering app so I can one-click order my morning coffee.
The next thing I'm planning is my "I need a coffee" button, which will get the nearest cafe, order me a flat white, and pull up the directions.
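Same idea in code, for the curious: the endpoint and payload below are entirely invented, standing in for whatever the reverse-engineered coffee API actually looks like.

    import Foundation

    // Hypothetical one-tap coffee order: POST an order, decode the reply.
    struct OrderResponse: Decodable {
        let orderId: String
        let readyAt: String
    }

    func orderFlatWhite() {
        // Placeholder URL; the real one came from sniffing the app's traffic.
        var request = URLRequest(url: URL(string: "https://api.example-cafe.test/v1/orders")!)
        request.httpMethod = "POST"
        request.setValue("application/json", forHTTPHeaderField: "Content-Type")
        request.httpBody = try? JSONSerialization.data(withJSONObject: [
            "item": "flat-white",
            "size": "regular"
        ])

        URLSession.shared.dataTask(with: request) { data, _, _ in
            guard let data = data,
                  let order = try? JSONDecoder().decode(OrderResponse.self, from: data)
            else { return print("order failed") }
            print("Order \(order.orderId) ready at \(order.readyAt)")
        }.resume()
    }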
Shortcuts enables basically any end user with enough devotion and dedication to short circuit this. It doesn’t require them to be an app developer and it doesn’t require them to learn code at all. The most basic shortcuts can be created without any if-then-else logic while enabling so much.
I’ve seen my mom ask her Google Assistant to do things with “and” a lot, and they just don’t work because command chaining hasn’t been implemented. But with Shortcuts, she could conceivably make a chain and designate a conversational phrase for it with an “and” in the middle.
Instead of having to explain to family why Siri can’t do X or Y, I can just make a Shortcut or show them how, and I’ve solved the problem rather than explaining why it can’t be done.
It is implemented. It just doesn't work often (or at all) for foreign languages.
I'm pretty sure this is the next big thing Google is updating.
> Instead of having to explain to family why Siri can’t do X or Y, I can just make a Shortcut or show them how, and I’ve solved the problem rather than explaining why it can’t be done.
At least with GA and IFTTT you can, as of late, make your own phrases (and responders). Nothing for the non-techies, but some progress there.
I found this:
That might let me write my own API to do something, but it would be good if it were built in somehow.
I'm actually not sure if this should be a feature of my Google Home assistant, though of course I see the use cases.
The dependencies show that this uses https://github.com/thibauts/node-castv2-client, so I'm guessing this only lets one use the Chromecast API of the Google Assistant device.
I'm pretty sure the Google Assistant API will not be open any time soon.
This is the important thing. I'm a fairly competent programmer and if I'm really missing out on something, I really don't worry about it. The problem is that most users aren't programmers!
> It is implemented. It just doesn't work often (or at all) for foreign languages.
Everyone has latched onto my comment because of command chaining, but this misses the point. The point was to give a concrete example of something that a user can now accomplish that they couldn't before. The Google Assistant cannot in fact build rudimentary constructions out of arbitrary system objects like Shortcuts can -- command chaining is just the example I picked.
AFAIK only for Google Home devices, not phones
Apple is moving Automator to iOS. This isn't voice UI, it's imprecise programming: https://pbs.twimg.com/media/DhbmQJBX4AMEoho.jpg:large
Is there a good doc or video that delineates what data stays on the device vs. being sent to Apple for processing? E.g. how much of this functionality will be available if you are not signed into iCloud?
Sure it was more of a novelty, and had to be trained on your voice. But that was 20 years ago.
Users actively taking control over how they use Siri (as with Siri Shortcuts in iOS 12) will almost certainly encourage them to adjust their usage patterns more conscientiously.
There is a general assumption that a voice has a human intelligence behind it, but obviously not all voices do now. This poses a tough learning curve, as evinced by criticism of Siri as ineffective or just plain bad. Yeah, Siri won't respond the way your boyfriend will when you ask him to find some good Chinese food or express some feeling. But Siri will excel at setting alarms, adding an event to the calendar, or starting a meditation with Timeless. It comes down to matching the language to the tool / following a protocol. It comes down to manually engaging with precision.
This is one of the reasons I've always been bothered by the over-humanisation of Siri. I feel there is a belief on the Siri team that if they make it feel more like a human, with jokes and overly verbose responses, then people will be more forgiving when it fails.
I've always felt the opposite: it gives Siri an "incompetent super-intelligence" sort of vibe instead, and I start to wonder, when I ask it to set a timer for the third time, whether I'd be less frustrated if it just responded with a confused beep rather than a quip.
I think Siri did pretty well when I asked her these:
Q: Find me some good Chinese food
Siri: I found four Chinese restaurants a little ways from you:
Q: How are you?
Siri: Very well, thank you!
> respond the same way your boyfriend will
Note that she ignored the "good" part of the query, has no memory of the place you went to last time that you did/didn't like, can't filter by your preferred mode of travel, and so on.
Directions - "Take me to X."
Reading text messages - “Read my text messages from my wife” or “Read my last text message”
Reminders - “Remind me to call my X/do X when I get in the car/get out the car/get home/at Y”
Music/Podcasts - Play X/Play a song by Y.
Taking notes, calendar events, (What do I have to do today?)
For example, this "get travel time to input destination" Workflow: https://workflow.is/workflows/ff987bcf0ad746d496415d7f4c75a8...
Ordinary users will just pick prepackaged workflows out of the gallery.
And some I can imagine being very popular e.g. "Post last photo to Instagram".
As a non-user, and as someone who types faster than they speak, I find it hard to come up with compelling use-cases.
I can't type that fast and when I'm driving I don't want to type at all. For me, being able to say "navigate to 123 main street" or "play the beatles" is very convenient.
So far all I do is:
"Remind me to wake up at 6am" or "Remind me to get my laundry in 30 minutes" (sets it in the todo list reminder app, since the alerts are better than the alarm app)
But things like Reminders are quicker with Siri. Especially things like Reminders based on external triggers - like locations and getting in and out of the car.
Siri Shortcuts lets you define your own local command and control phrases from presented actions, effectively solving both issues. It could (depending on per user investment) make Siri way more individually useful.
I can’t decide how much of an improvement it is over iOS 11, though (which was already performing well enough for me).
I think this year’s big winner is macOS, not iOS.
It definitely feels faster, even on an iPhone 8.
(One of the replies is by Ari Weinstein, who is currently on the Siri Shortcuts team)
Either you can use drag and drop to write your own scripts, or you can run scripts written by others.
You can run your scripts in a variety of ways, including a trigger phrase you set with Siri.
Here's a short video showing the creation of a very simple script in an older version of the program.
If you are a podcaster, you might create a more complicated script that converts a source audio file to MP3, adds MP3 tags and artwork to the resulting file and then uses FTP to upload the result to your podcasting network.
Another script idea would be to text someone a list of the blocks of free time you have open during a given workday based on your calendar data to help set up a meeting.
The possibilities are wide open. You can even tie directly into web APIs.
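The free-time idea, for instance, is only a few lines against EventKit. A sketch in Swift, assuming calendar access has already been granted:

    import EventKit

    // Compute the free blocks between events in a given workday window.
    func freeBlocks(in store: EKEventStore,
                    from workdayStart: Date,
                    to workdayEnd: Date) -> [(start: Date, end: Date)] {
        let predicate = store.predicateForEvents(withStart: workdayStart,
                                                 end: workdayEnd,
                                                 calendars: nil)
        let events = store.events(matching: predicate)
            .sorted { $0.startDate < $1.startDate }

        var gaps: [(start: Date, end: Date)] = []
        var cursor = workdayStart
        for event in events {
            if event.startDate > cursor {
                gaps.append((cursor, event.startDate))
            }
            cursor = max(cursor, event.endDate)
        }
        if cursor < workdayEnd {
            gaps.append((cursor, workdayEnd))
        }
        return gaps
    }

Feed the result into a message template and you have the meeting-scheduling script.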
I’ve been waiting for this obvious functionality for years.
I took it a step further and am working on a version that looks at the prices on Lyft and Uber, presents you with the respective options, and calls a ride to your next appointment. Shortcuts in iOS 12 are a really fun little thing to play with - the most fun I've had programming in a while.
Happy to send it to you. I looked for your contact info but couldn't find any way to contact you, but feel free to reach out in my profile if you want.
The funny bit is though, that your Google Home will still recommend that you set up routines for things, except you just...can’t.
Since I don’t have it available myself, I cannot comment much on it, but while searching for why it wasn’t appearing in the Home app, I certainly didn’t get the impression that people are impressed by it :(
There've been a lot of attempts by Apple to create richer hooks into apps, like search integration, but they don't do much for engagement. There are some good ideas out there, along with anecdotal successes, but the interaction model isn't great, and better app integration won't fix that.
Given that iOS will be prompting people with possible Shortcuts based on what they do frequently I can easily see people starting to adopt this.
I don’t think you have to use Siri to trigger these, it’s just an obvious easy/fast way. Users could still use the shortcut app or widget to do it.
As I’ve been watching some of the Apple community on Twitter since this started to go into beta they’ve already produced some fun/surprising stuff. This seems like it’s going to be a BIG deal.
What it comes down to is that the people on HN don't represent most people using iPhones.
I can tell you that the Siri suggestions in Search really annoyed me (I usually use search to find... apps, not to be fed all the crap from random apps).
I want a clean, non-intrusive experience. Apple should focus on making the hardware better and getting the software out of the way (don’t make me think about it) instead of pulling all this crap on its users in the name of innovation.
In iOS 11, the delay when searching for apps is unacceptable. Instead of displaying the apps found right away (a substring match over all apps should be instantaneous), it waits until all the crap from Siri and other random apps is fetched.
It's extremely frustrating when iOS takes forever to find an app, or worse, prioritizes all kinds of other garbage above the app's icon. If I type "waz" and I have the Waze app installed, I sure as hell expect to see the Waze icon instantly at the very top. Ideally after typing just the "W".
All of the other stuff loads very quickly. Mail results are very, very quick (over 3 inboxes containing over 100,000 mail items), and the Siri search suggestions are a bit less quick.
I'm actually shocked at how much iOS 12 improved responsiveness.
I…what? They clearly do, considering that they have the best smartphone hardware on the market and what’s generally considered to be a good mobile OS. This feature is something that power users have been requesting for years; if that’s not innovation in your eyes, I’m not sure what is.
One thing I would like to know is why Siri cannot answer basic questions about what music is on my phone. For example, “what albums by New Order do I have on my iPhone?”. That does not work but “play album Substance 1987 by New Order” does.
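What makes that gap odd is that the answer is a single query away in the MediaPlayer framework. A sketch, assuming media library permission (and that Siri were wired up to something like it):

    import MediaPlayer

    // List locally available albums by a given artist - exactly the
    // question Siri can't currently answer.
    func albums(by artist: String) -> [String] {
        let query = MPMediaQuery.albums()
        query.addFilterPredicate(MPMediaPropertyPredicate(
            value: artist,
            forProperty: MPMediaItemPropertyAlbumArtist))
        return (query.collections ?? [])
            .compactMap { $0.representativeItem?.albumTitle }
    }

    // albums(by: "New Order") -> ["Substance 1987", ...]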
English is so ubiquitous that, in my opinion, all Siri queries should be checked for both local language and english.
Alexa is quite popular because it has so many skills that people have created for it. That’s one of the reasons people claim Siri has been “behind” and Amazon’s Echo devices have been doing so well.
I don’t know if Cortana or Bixby support anything like this.
The ability to have contextual shortcuts could be pretty powerful.
The "Ok Alexa ask $someapp to …" prefix is disgusting.
Interesting to see that the author is a former PR member of http://my.workflow.is/ and has one recent article for TechCrunch.
Oh hey, looks like they were purchased by Apple: https://techcrunch.com/2017/03/22/apple-has-acquired-workflo...
I'm not saying that this isn't good technology, rather I'm more concerned that TechCrunch is taking contributors who have ties to products being reported on and may have interests that are not being communicated to readers.
(I'm not accusing you of anything, just making commentary)
It should also be noted that the post-acquisition Workflow team (where the author worked, as noted in the author blurb) was the precursor to the Shortcuts team.
But as a reader, I do want to know this stuff.
It's Apple catching up... which, if this were a journalistic article, would be mentioned.
"Apple announces same feature Google announced 6 months ago"