After a couple of weeks of this, I somehow got the idea that it was related to the devices' configured locations. And sure enough, telling the Home that I lived in the next city over fixed the problem.
So I started a binary search and eventually found that the issue was limited to my ~10x20 block Seattle neighborhood - basically the outline shown when I search for its name in Google Maps. I then also realized that it applied to weather queries on my phone as well, but since the phone uses GPS rather than a specific location setting, I could only reproduce the broken and working behaviors by crossing one of the neighborhood boundary streets.
Turns out it was some long-standing configuration issue with Knowledge Graph's entry for my neighborhood, and some recent code change in location-based weather queries began butting heads with it. Luckily I worked at Google at the time and was able to track down and pester people who could help fix the issue.
Anyway, one of our customers - representing a company in Germany I think - filed a bug report that said something like "Weather module hasn't updated since January". They'd been going to their fancy intranet home page and seeing the same weather for months at a time.
And this bug report just sat there. For a mixture of technical and political reasons, there seemed to be nobody in the European office able to pick up this report and do anything meaningful with it. We knew about it, we knew that what we were serving to paying customers was hopeless, but we somehow couldn't get our hooks into the right point in the Weather feed to figure out where it was going wrong. Or, collectively, we didn't care enough.
There were various technical and structural factors making it difficult to fix. Weather feeds were known to be problematic (still are I guess) and this code would have been surprisingly low-level C/C++ with custom serialisations and limited logging. Structurally there were problems in getting attention from a team in California to support a problem experienced by a different team in London, especially since it affected relatively few users - a tension in supporting paid products in a company that is focused on non-paying users at far greater scale.
(I am assuming this bug would have needed some actual development work - I don't recall, but I think we were familiar enough with common ops problems that it wasn't just a question of kicking one of our own feed servers.)
But I do think there was an issue about lack of concern - at heart we didn't have enough confidence in our own product to motivate the personal pain of working through these problems and getting them solved. I think that, if you had gathered us together and asked our collective opinion, we would have suggested that this customer would be better off not using our product at all - it simply wasn't ever likely to be good enough. Once you reach that way of thinking about your own product, it becomes extremely hard to countenance fixing the most difficult problems with it.
I've sent feedback and error reports about this repeatedly, and even had a friend that knows someone that works on Maps pass it on to them directly. It's never been fixed, and I've basically just given up on it at this point. It's really shown me how impossible it is to get any kind of support from Google for even an extremely obvious, straightforward issue.
I recently moved, and found out that virtually every single web site from my credit cards to my bank to the library uses Google to verify address entry on the fly. The problem is that Google's database entry for my address is wrong. So any time I try to enter the address "123 Oak Street, Apartment Q" Google unhelpfully corrects it on the fly to "Oak Street, Suite 1." No amount of keyboard jockeying can override Google's on-the-fly autocorrection.
Of course, there's no way to contact Google about its error. Maybe in Google Maps? I dunno. How do you find an address that Google Maps doesn't know to tell it that the address it has is wrong?
In the end what worked for me was registering as a google maps client/customer, reproducing the issue via the API, and then reporting it as an API issue. The underlying data was fixed within a day or two, and I got my emails answered by a google engineer within (literally) minutes.
(please do not take this as an endorsement of google maps support, merely an anecdote of what did work for me that I hope might help you)
With Apple, I've submitted perhaps 5 corrections for 5 different (usually minor) problems in 5 years. Problems like a place claims to take Apple Pay when it doesn't. Or the actual place is across the street from where Maps claims it is.
In each case, Apple sends back a notification within 2-3 weeks saying they've fixed the problem, and when I've checked, it has always been resolved. Pretty happy with the service.
"What is the weather report for today/this week?" is a more accurate question, despite an annoying amount of verbosity. But answers are still given relative to the moment. "Cloudy" could be an accurate answer for now, but it will be "Sunny" this afternoon.
Some people will prefer a one-word answer to "What's the weather?". Others will want an hourly breakdown of the day displayed on their screen. Others might prefer a week. It's hard to give an ideal response for every situation.
Ironically, if I say "Hey Siri, weather" I usually get what I need.
Now that you mention it, I find it strange that voice assistants don't give natural responses when they encounter an error. It would make them seem more real, and it would be less frustrating.
When you ask a human what the weather is and they tell you to look outside, you don't try to rephrase the question in a way that will make them give you the right answer, you just realize this is not the way to get an answer and you look for another way.
Maybe this isn't the best way for it to work in this case, since there is a different thing you can ask to get the response you want, but maybe this would make interactions better if, say, the phone can't detect your location. "Where did you want the weather for? I think I'm lost."
Of course, the other problem is that if it gives the same response every time it'll get grating. The hundredth time you hear "Look outside," it has probably lost its charm. I wonder how possible it would be to generate responses that take into account all previous conversations, so that this doesn't happen.
One of my meteorologist friends always answers "What's the weather?" with, "The state of the atmosphere."
The possibilities are endless.
Other times it will tell me the wind speed.
Other times, for the same query, it will tell me the wind speed AND direction, which is what I want.
But it's always random what I get: wind speed, wind direction, a combination of both, or (occasionally) the definition of wind.
I wasn't aware of any other occurrences.
I think that pretty much any assumption we're making about strict hierarchy is bound to be broken at some point.
I only have one smart lock, which works perfectly, and it is called "FRONT DOOR" in HomeKit.
When I ask Siri about my FRONT DOOR she responds that she cannot find it.
When I ask Siri about the status of my DOOR, she responds with "The FRONT DOOR is locked/unlocked".
I'll then say 'Alright Siri you literally just used the phrase "FRONT DOOR" five seconds ago and the text transcript on the screen says "FRONT DOOR" hey Siri is my FRONT DOOR locked'
Siri: WTF are you talking about? You don't have a FRONT DOOR.
"Hey Siri is my door locked"
Siri: Your FRONT DOOR is locked.
Google and Alexa handle things flawlessly.
Me: "Siri, turn off the bedroom lights."
Siri: "OK. Your 6am alarm is off."
For the most part Siri works for me, with the exception of the above and her insistence on adding "ginger ale" to my grocery list as two items.
/Native English speaker, specifically trained in non-regional diction because I used to work on-air in radio.
Siri: "Contacting emergency services in five seconds"
To be fair I was in a noisy environment and Siri only got the "120" part, but seriously, why would that be okay? My phone is registered in America with an American phone number and English set as its only language. Why should it think 120 is equivalent to 911?
I know some countries use 112, but that's too many edge cases colliding.
Well there's your problem (/s):
me: "Hey Google, add half and half to the shopping list."
gh: "I've added those two things."
me: "Set an alarm for 2:30 tomorrow"
gh: <generic alarm set response>
wife at 2:30AM: "hey... HEY... why's the alarm in the kitchen downstairs going off?"
And never mind the literal interpretation of the command - it's far more usual to set an alarm very early in the morning (got to catch a plane, unusual event) rather than in the afternoon.
Me: "Hey Google, play Nine Inch Nails, you know, the one in my Google Play library"
Google: "OK, playing Nine Inch Snails, a band nobody on Earth has heard of and is definitely not in your library!"
Me: (Repeat a few times, trying all kinds of accents, eventually I get tired of songs about nine inch somethings, and I pull the car over and type in by hand what I'm looking for.)
It's quite interesting to hear very small children talk to voice assistants. Probably not surprising, but it seems like in normally-learnt human communication you expect your conversational partner to remember the context of what you were saying, and carefully forming canned commands is a separate learned skill. It suggests these voice assistants have still got a way to go, and it seems more like a paradigm change, a big leap, than just incremental improvements.
In the end I had a “draft booking” and a conversation loop, where the bot would repeatedly ask to fill in missing parts (eg nr of participants) and then give you a summary and opportunity to correct things. It was hard to do, and definitely required a lot of contextual understanding of how people book meeting rooms. That approach doesn’t scale up well.
I think the basic problem is being stuck in a local optimum. The scripted bot approach doesn’t scale to complex conversations, and you need to start from scratch to do better.
Wanna have the simplest parser? Finite State Automaton to the rescue!
So people automatically assume that the simplest approach to conversation is also something like a finite state machine.
Here's the thing. The only reasonable FSA would be a clique.
You can always move between nodes.
A much more feasible approach is the "actions competing for relevance" one.
Where you have global state manipulated by actions, and all the actions generate an "applicability score" for the given user input.
The system then chooses the most appropriate action, and it does its thing.
And on the next user input the cycle repeats.
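The loop described above can be sketched in a few lines. This is a minimal illustration, not any real assistant's implementation; the actions and scoring rules are made up.

```python
# Minimal sketch of the "actions competing for relevance" dialog loop.
# Each action scores its own applicability against the user input and the
# shared global state; the highest-scoring action runs and may mutate state.

class Action:
    def score(self, text, state):
        raise NotImplementedError

    def run(self, text, state):
        raise NotImplementedError

class SetTimer(Action):
    def score(self, text, state):
        # Crude keyword match stands in for real intent classification.
        return 1.0 if "timer" in text.lower() else 0.0

    def run(self, text, state):
        state["timer_set"] = True
        return "Timer set."

class Fallback(Action):
    def score(self, text, state):
        return 0.1  # always weakly applicable, so something always answers

    def run(self, text, state):
        return "Sorry, I didn't get that."

def handle(text, state, actions):
    # Every action competes on every input; no fixed state-machine edges.
    best = max(actions, key=lambda a: a.score(text, state))
    return best.run(text, state)

state = {}
actions = [SetTimer(), Fallback()]
print(handle("set a timer for ten minutes", state, actions))  # Timer set.
print(handle("what's the weather?", state, actions))          # Sorry, I didn't get that.
```

Because scoring happens fresh on every input, the user can "move between nodes" freely, which is exactly what a literal FSA would need a clique of transitions to express.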
I've just been trained to not bother. Unless I'm setting a timer, I just don't try anymore.
Which is why I have no confidence calling it AI if it's not even intelligent. It's just voice recognition on preprogrammed operations.
That's because it really should be called Simulated Intelligence, and would be a much more accurate description. The marketing team wouldn't like this though.
When it read back super incorrectly from an alias, I said something to the effect of "could you pronounce that correctly?" and it asked me to say it
Since then, it's understood that person's name. ¯\_(ツ)_/¯
It really needs to expose the option to train those easily
This article clears everything up, https://discussions.apple.com/thread/8116586
I definitely appreciate the effort they put into understanding the semantic nature of a device from the name I assigned it. Nowhere did I ever designate one of the smart outlets to "behave like a light."
If I say "Siri, call lawn cutting corp", she'll say "I'm sorry, I can only call a single person at a time."
If I say "Siri, call lawn corp", then it immediately opens the phone and says "Dialing lawn cutting corp."
File a bug and it will probably be fixed.
I think I just made a palindrome of homonyms.
If the person asking the question lives in Ohio, they may actually be talking about London OH (or Dublin OH). Some people in neighbouring states may mean the same, though they will be more likely to mention the state. However, how close should you be to London, OH even within the state to mean the Ohio one and not the UK one? How close is close enough? Is a few hours of driving close enough? A 3 hour flight? What if I'm roughly at 6 hours from London OH and 7 hours from London UK?
Further, if the person is a British expat in Ohio, especially if they are working for a multinational business (or not), they would more likely mean London UK. German expats, though? Russians? Or an Irish person who lives in Amsterdam having some relatives in Ohio US, looking to book a flight to Dublin. Etc. etc.
There are so many contextual layers here that even human assistants can occasionally get it wrong, and without the context the task becomes insurmountable for the "AI" algorithms. That is not to say virtual assistants are useless, just that selling them as "AI" is a big lie, bigger than even those who market these algorithms as "AI" think it is.
I would seriously doubt this assumption. Why on earth should someone living in a state specifically ask for the local time in a different location within that same state?
On the contrary, this context information would make it much more likely that the person actually meant "London, England". Except if there is a timezone border going through the state, of course.
However, I obviously agree with your general point regarding the severe limitations of what we currently call "AI" and how little "intelligence" there actually is.
True, and that's another contextual layer to deal with: that e.g. the state of Ohio is in a single timezone and that - why on Earth should someone ask the time within the same timezone? - like you said. And then there may be contextual exceptions even from this rule...
the fact that americans are inclined to say the state name as part of the name of a place could also help - since they might say london ohio, london might be more likely to mean the real london.
For similar reasons, anyone who asks for the time in Boston probably means Eastern Time regardless of how far they are from Lincolnshire, though I think the more usual and canonical way of referring to that is New York time.
In this case it is totally normal for someone living in the Netherlands to ask the time for a Polish city.
Who said they were in that state when asking? People travel.
Just use your imagination a bit.
Maybe I live in state X at location Y while my parents live at location Z in the same state, about 250 miles away from me, maybe there’s a serious storm where I live and I wonder if I need to check on my elderly parents?
Because they want to know what time it is in a different location.
14 states have more than one time zone. Do you know which ones?
No and that’s irrelevant trivia. What I know is if the state I live in is on that list. Oregon and it is. Don’t care what the other 13 are if I’m asking for a time in the state I live in.
Idaho is the same way. What time is it in Riggins? I’m from Idaho and I don’t even know the answer to that question. It’s a reasonable thing to ask siri.
"14 states have more than one time zone. Do you know which ones?"
I had hoped that would also demonstrate that the solution here is to just answer what was asked instead of saying “that’s not something people would ask.”
They would ask that and if you are designing a system based on what you think users won’t do then you’re gonna have a bad time.
I think it's possible that general acceptance of these non-AI gimmicks being referred to as "AI" will end up pushing genuine progress in true AI further into the future.
To take Gruber's example, if you had an office in London Ontario, were talking about setting up a video call, then asked your assistant "what time is it in London", and they picked London England because it's the most famous, you'd question how smart they were.
The context is not where you are, or which "London" is largest or most popular or least driving distance away or where you grew up or where you lived once, the context is why you are asking about the time in London, it's all of the present brain/attention/conversation/local state bundled together.
If I'm physically in a city named London, and you ask me literally "what is the time in London" that's actually an ambiguous question. Why would you specify the name of the place you're currently in? Typically someone in that instance would just ask "What time is it?"
I don't hire people to make "educated guesses" on my behalf. If they detect an ambiguity then I expect them to initiate a dialog that resolves that, not just blithely pick the first thing that comes to their mind.
But Gruber is not in London. If he was, perhaps we'd have a discussion about what the right answer might have been, maybe we need some more clarity, that kind of stuff. If I ask a stranger outside (well, if I did before everyone isolated themselves) they would immediately give me the time in London, England, and if I actually wanted the time in Ohio I would have to clarify myself.
> I don't hire people to make "educated guesses" on my behalf.
Of course you do. Do you really want people asking you questions all day when there could be any ambiguity?
You will be surprised on how many cities are named 'San francisco' in the world : https://en.wikipedia.org/wiki/San_Francisco_(disambiguation)
To me it seems that developing AI on that level will be here sooner than developing solutions to many context problems, given the difficulty the best funded algorithms in the world have answering this question which we humans see as very simple.
That wishful thinking has turned out to be dangerous tho, as we have moved towards an ML-dominated world where we don't even know how the ML algorithms produce specific results.
Add that to their bizarre behavior (like this example with Siri) and you'll realize chances of another AI Winter are not low. If we have to develop solutions to many context problems one by one, that may reduce so much of the hype and interest in "AI". We'd be basically back to square one.
But with better tools! It might not be going from 0 to 1, but going from 0.1 to 0.2 is still progress.
Canada kept a few.
I guess Mississippi is one but most US places have white names.
Multi-hop reasoning models have started working surprisingly well. ie. reasoning over multiple levels and conditions.
Common sense reasoning is also getting a lot better. By having huge knowledge bases, the model can actually learn some degree of human like general purpose context. Such as, returning the time at the London which has the most similarity with the user's hidden representation.
The asker has a mental model of the answerer's default contexts, and if their question is likely to be ambiguous in those default contexts, they are more likely to be explicit in their ask. The converse is also true - if the asker is not explicit about which London, that's actually a signal to the answerer to lean even harder on default contexts and best guesses.
Humans do this without even thinking because through culture and conversation we are quick to arrive at shared mental models and lean on shorthands "in the other person's head". AI not only doesn't have its own context, it doesn't have an estimate of its user's context and where the two might differ.
Which is Siri?
What if you were calling an individual human personal assistant who knew you lived in London, Ontario?
What if you had previously clarified to this person that when you said London, you meant London, Ontario?
I think both of these questions ought to be relevant to the digital personal assistants that we're creating.
"No, I meant the time in London, Ontario"
"Sure. The time in London, Ontario, is 3:23pm."
You start by doing a best guess, and actually listening for a correction. For other kinds of requests, you reply with a best guess and ask for confirmation or for clarifying questions.
Hard, yes. Not impossible.
“What time is it in London?”
... gives answer for London Ontario
“No, London England”
... gives answer for England
However there’s no memory to this; the same thing happens next time you ask.
I don't see why that would be necessary, since a human does not need to know everything about you or have access to everything you've ever communicated to anyone to guess accurately that when you ask about "London" you probably mean London, England.
It depends on your circles/bubble and the context.
Sure, for me and almost certainly the majority of people the majority of the time, assuming London = London, England is almost certainly the correct disambiguation. However, maybe not someone in Ontario or Ohio asking a question about "London." And I expect that the person who sees London, England as this far away place they certainly don't have regular questions about would find that always being the default annoying.
But not on the intimate personal details and communications of the individual person, which was what the post I was responding to was about. Sure, there are going to be circumstances where London, England is not the most likely guess, but you don't need to have access to someone's entire personal history to know what those circumstances are, since a human can spot those circumstances without having that knowledge.
The problem is these "AI"s are plain stupid. The solution for it is moving on from gimmicky and hacky solutions to true AI.
Does it? AGI is very difficult, but I think this example only illustrates that Siri is kinda bad, given that DDG, Google, Alexa, and Bing all got it right.
I mean, if anything, just appreciate how amazing humans are at differentiating these ambiguities.
I think it's very worthwhile to point out these seemingly basic errors as a way to maintain appropriate skepticism about the limits of our technology.
That’s kind of weird. It’s not that Siri is especially bad. It’s that “Siri” is something different depending on how you query it. Other online search systems aren’t like that, and integration and consistency are typically Apple’s forte.
Look, I'm not arguing it's not a bug, but I'm just really surprised at how software people, who I think should know better, are surprised that such bugs exist, or more importantly that completely eliminating all types of this class of bugs is basically impossible with current technology.
The aspect that's ruffling feathers, I believe, is that it's one of those cases where someone might have reasonably assumed something was built one way 'under the hood', and was confronted with an effect which forced them to see that it was not implemented that way at all. The issue isn't the 'bug'. It's the realization that their mental model was wrong.
Specifically, something has a name ("Siri") which might lead one to believe that everything from that manufacturer using that name refers to the same thing. (Isn't that the point of a name?) Clearly, it's not.
Your hypothesis sounds plausible, so I tested it. I have a Mac laptop, which has the same 'Location Services' that iOS has (AFAIK). I asked Mac Siri what time it is in London, and got a response for the one in England (further away from me). So that doesn't explain it, or at least not all of it.
Half the time, Siri replies with "I'm sorry, you don't have an app for that. You can try searching for one on the app store."
Repeat the same exact query to the AirPods seconds later, and bam, it starts the workout on the Apple Watch.
Totally disagree, because the only way these assistants get better is with real-world usage (which is why I can definitely agree that one should wonder why Apple isn't improving as fast as the competition).
It was only a couple years ago that using Google Assistant was an extremely frustrating experience. I'd say it got about 5-10% of my words wrong, which meant it got my intent wrong about 25-35% of the time. These days I find its accuracy uncanny - it almost never makes a mistake with most of my "standard" queries. No way it could have gotten that good without real-world feedback and data.
I don't understand why anyone other than hobbyists can stand to use these things. They are so obviously years away from ready for serious use, and the novelty value wore off years ago.
Not shitposting here, these are serious comments about absurdly bad UX.
How can you stand to deal with humans?
I think the debate here is about whether anything has shipped that is actually "good enough". I don't think it's that controversial to avoid shipping stuff that's not good enough.
This is the difference between a usable product and something that is not
If my voice assistant is going to make a significant % of errors, it either needs to be very cheap for me to correct it (it's not -- usually you retry and if it keeps failing what do you do?) or I'm going to stop using it
Steve Jobs made great, not passable, products. It's a shame that Siri is so far behind
All these thresholds or ranking factors seem to come intuitively to humans (I would guess a good intuition for them is actually a sign of intelligence), but it seems to be incredibly hard to capture them in ranking.
As others have pointed out, a solution here would be to make Siri more conversational. A simple "Which London?" could've removed the ambiguity and given Siri the opportunity to learn something about that particular person (that London, England is more important to him than London in Canada).
IMO I would be very disappointed if Siri started asking clarifying questions at a significantly higher rate. Siri is already a bit too chatty, and I never feel like having an extended conversation with her.
I’d rather she just say the wrong thing (but make it clear that the answer is for a specific London, e.g. “The time in London Ontario is...”) and I can correct her. It’s the same number of conversational “turns”, but in the happy path when she actually gets it right the first time, it’s one-shot and done.
It’s a lot harder to get signal on this for learning, but I feel like there are ways around this as well. (Maybe saying “thanks” can signal she got something right, and prefixing the next utterance with “no” could signal it was wrong...)
Tell me about it. I have to unpair my bluetooth headphones every day, for stupid reasons that aren't Siri's fault. But when I say "Siri, open bluetooth preferences", it parses my command on screen VERY quickly, and then slowly enunciates "Okay! Let's take a look at your bluetooth settings." I'm just tapping my foot and waiting for her to quit talking.
Of course, then, 1 out of 3 times it takes me to the wrong settings page. Because if Settings has been opened recently, it can't deep link from Siri. /shrug
But that would make it almost as smart as an Infocom game from 1981. Something, something, doesn't scale, mumble, something...
Instead of looking at people, you can also scrape websites to get the relations. But here you may get a recursive problem because if a website speaks of "London", you might not know in advance which London they speak of.
Here comes my favourite brain freeze moment - recently my parents asked me to explain this to them. How do you construct a good search phrase? My brain blanked. I HAVE NO IDEA. It seems I have learned fluent Goonglish without noticing, and now can't explain the grammar or vocabulary of it.
Recently Google got much better in understanding full sentences and there are tons of SEO optimized pages for certain phrases. Nevertheless, using keywords is what I imagine advanced users do.
It's weird that it feels like my tried and true abilities are getting worse. Or Google's algorithm is hurting some of us who became very proficient in very specific ways.
The problem reminds me of the difficulty of programming in applescript. In applescript, articles like "the" can be inserted optionally in the code, and there are lots of equivalent ways to write things, i.e. "if x equals y" is the same as "if x is equal to y". As a result I never remember the syntax, and error messages are less helpful.
Until they have real metrics around how often Siri fails they will continue to think that their correct response rate is great.
Why would editors make user feedback any less valuable? It's hubris to think it's not.
“Sorry, could you say that again?”
Nobody asked for you to interrupt my chopping, Siri.
Sometimes you can ask a question and watch it be perfectly transcribed in real time, but then receive a nonsensical answer from it. Ask the exact same question immediately after on the same device, transcribed exactly the same way, and get the correct answer.
Where does such unpredictability come from? How can Siri transcribe the words correctly but fail to deliver the right answer?
Voice assistants generally use both the text transcription and a bunch of contextual metadata as input. That metadata could include things like what's currently visible on the screen, your location, your recent queries, etc.
So even though the underlying algorithms powering the assistant may be deterministic, the input data between two seemingly identical queries could vary quite a bit.
For instance, Siri almost certainly has context around the previous questions you've asked. It would be reasonable to assume that if an assistant received two identical questions back-to-back the initial answer was wrong.
In that scenario, the assistant might decide to use a different answer (perhaps one that had a lower ranking) in an attempt to get it right.
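That repeat-query heuristic can be sketched simply. This is purely illustrative guesswork about how such a system might behave, not Siri's actual logic; the candidate answers are hypothetical.

```python
# Sketch of the repeat-query heuristic described above: if the identical
# transcription arrives twice in a row, assume the first answer was wrong
# and fall back to the next-ranked candidate answer.

def answer(query, ranked_answers, history):
    # history holds previously served (query, answer_index) pairs
    last = history[-1] if history else None
    if last and last[0] == query:
        # Same question again: the top answer probably missed, try the next one.
        index = min(last[1] + 1, len(ranked_answers) - 1)
    else:
        index = 0
    history.append((query, index))
    return ranked_answers[index]

candidates = ["time in London, Ontario", "time in London, England"]
history = []
print(answer("what time is it in london", candidates, history))  # time in London, Ontario
print(answer("what time is it in london", candidates, history))  # time in London, England
```

Under a scheme like this, two identical transcriptions really can yield different answers, which would explain the unpredictability without any nondeterminism in the model itself.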
I tried it again right after, and the reminder said "call FRIEND_NAME".
I don't think there was any previous conversational context or anything like that. Hard to fathom how that could happen.
But that fails completely when you get to names like London (or Paris or Moscow or Cairo).
But it happens with people, too. I'm from Mississippi, though I haven't lived there since I left for college. I now live in Houston. At a family reunion many years ago, I ran into a cousin I hadn't seen since we were kids. She asked where I was living, and I told her.
"Oh, isn't it terrible about that wreck?" she asked.
Baffled, I asked for more information. "Oh, you know, that wreck over on 406!"
I did not know. "I'm sorry, Houston's really huge. I don't know what wreck you mean."
"Oh, did you mean you live in Houston, TEXAS? I thought you meant Houston, MISSISSIPPI!"
I was, at the time, about 30. I grew up in that state, and lived there until I went to college. And until that moment, I had never even HEARD of Houston, Mississippi (a metropolis, it turns out, of about 3600 people in the misbegotten northeast corner of the state).
To approximate what a human would do, one would presumably want to start by ranking places on a range of dimensions:
* biggest (or maybe size category: big city, city, town, village)
* how many times user has asked about this place before
* how recently user last asked about this place
If most/all these rankings put the same place in the top spot, go for it! Otherwise, ask the user for clarification.
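The agree-or-ask rule above can be written down directly. A minimal sketch, with made-up candidates, populations, and query counts purely for illustration:

```python
# Sketch of the disambiguation heuristic above: rank each candidate place
# on several dimensions, and only answer directly when the rankings agree.

def rank(candidates, key):
    # Return candidates ordered best-first by the given scoring key.
    return sorted(candidates, key=key, reverse=True)

def disambiguate(candidates):
    dimensions = [
        lambda c: c["population"],    # biggest place
        lambda c: c["times_asked"],   # how often the user has asked about it
        lambda c: -c["days_since"],   # how recently (fewer days is better)
    ]
    top_picks = {rank(candidates, d)[0]["name"] for d in dimensions}
    if len(top_picks) == 1:
        return top_picks.pop()        # every dimension agrees: just answer
    return None                       # rankings disagree: ask the user

londons = [
    {"name": "London, England", "population": 8_900_000, "times_asked": 0, "days_since": 999},
    {"name": "London, Ontario", "population": 404_000, "times_asked": 5, "days_since": 2},
]
print(disambiguate(londons))  # None -> rankings disagree, so ask "Which London?"
```

For a user who asks about London, Ontario regularly, size and personal history point at different places, so the sketch would fall through to a clarifying question rather than guessing.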
Reality is hard. And with machine learning (especially proprietary, remotely-hosted machine learning) there's rarely a way to pinpoint a line of code and say: "this is what happened and why you're now frustrated and firing hypothetical personal assistants".
Cities sometimes have clear legal boundaries that feel irrelevant to the question, like the City of London, but more generally have metro areas that sprawl well into an ambiguously defined countryside. There's rarely a "this block is city, the next block over is clearly not" situation, so the number of people you include ends up being pretty arbitrary.
When ranking by population it often makes most sense to use the population of the metropolitan area. That is, to ignore the administrative divisions, which vary too much, and focus on the physical reality of the urban area.
Two cities in one City; it shouldn't be allowed.
My guess is that Apple would find it difficult to provide robust references to John to explain why it happened, or how they've fixed it for him (and whether that fix is a one-off workaround for his complaint, etc.).
Remote, proprietary personal assistants tend to apply their own (generally unknowable and unaccountable, from the user's perspective) interpretation of the context.
Worse, there is an Alton, New Hampshire, which sometimes confuses even Google.
Even worse, Apple sometimes seems confused about where I am: I have twice woken up to see a tornado warning on my iOS lock screen in the morning. I live pretty far from any decent tornadoes. Unfortunately, I have been too sleepy to stop myself from unlocking the screen before I remember to screenshot it.
Me: "Hey Siri, play Radiolab podcast"
Siri: "Which Radiolab podcast, Radiolab or Radiolab: More Perfect?"
Me: "The first one"
Siri: "I don't know >the first one<"
Me: "Siri you're useless"
Siri: "That's not nice"
Me: "Could be but it's true"
90% of the time works fine, and it's essentially all I use Siri for. It's very convenient when cooking and my hands are dirty. But 10% of the time I get something along the lines of...
"I'm sorry, but you don't have the Timer app installed".
It's infuriating because I know Siri is dumb so I use the same exact simple phrases to avoid confusion. Sometimes it works, sometimes it doesn't. It always transcribes the command accurately though! I've actually lost my temper and smashed an Apple Watch before over this. This is in my house, on a very reliable network, always with my phone within a reasonable distance.
Before lockdown I even had it disabled entirely because it would get activated randomly from time to time, even if nobody in the vicinity said anything remotely close to "Hey Siri".
You can't reply with "no Siri, not that London" and have it remember. It doesn't learn your voice among the people who normally use your Siri in your household.
"Artificial intelligence" is always going to make mistakes, as do real humans. Humans can perform unsupervised learning - in fact it's one of the key skills that employers like to select on! Until AI can learn in context it's going to be very limited.
I've had to disable "Hey Siri" because my daughter's name is pronounced vaguely similarly to Siri. Worst thing is, Siri transcribes what it hears, and it transcribes my daughter's name. So it doesn't hear wrong; it just activates on a different name than Siri.
I've tried telling Siri to shut up, but it never learns not to activate when I call out my daughter's name.
At least that's what I heard about how Alexa works, IIRC.
Siri is easy enough so we never looked much into it, but “OK Google” for instance looked like a real PITA, so we did some research before buying an assistant.
It appears a ton of people just intentionally say “Ok GooGoo”, “Ok Boogle” etc., whatever is easier for them to pronounce and it works perfectly fine.
It’s a genuinely hard problem to solve, and I am willing to give the benefit of the doubt to Apple, for instance, when they have humans reviewing samples. There may be other motivations; there's a ton of people in any of these companies, and any given feature must be seen from a different angle depending on the department looking at it.
But I think a lot of what we see as privacy violating is primarily an effect of the flaws and all the hacks needed to make the feature work at all (when it works).
My latest MacBook (16") is so unstable that it is actually funny at this point.
I develop on this thing. It is running a great Unix os. I can't stand desktop Linux. The hardware quality was the best with a wide margin before the latest gen. Battery life is also great. I like them for development work when they are stable.
A lot of people are also really invested into the ecosystem. My entire photo collection is on iCloud. I use an iPhone. I can copy paste between my computer and phone. My Apple watch unlocks the computer when I'm near... List goes on.
But now I feel like Apple is a fantastic phone company that also happens to make some computers. They have been degrading pretty badly.
I think it's less that OS X is bad now, and more that it's finally degraded to a level of annoyance that people have gotten inured to with Windows. That's not to say it's a good thing, but at this point I have known bugs and annoyances with all of the computers I work with, no matter the platform.
Some of it is also that Apple has a "real" integrated ecosystem. To what you say, you can easily move things between iOS and OS X. If you're watching stuff on your Mac, you can throw it to an Apple TV or your Airpods. Windows doesn't have a version of that that "just works". The closest you get is opting into Google's ecosystem and going Chromecast/Android, but I'd rather not trust Google with even more of my info.
The MBP was the first laptop I'd used that 1) had a trackpad good enough that I didn't feel like I needed to carry a mouse around to use it for more than 10min at a time, and 2) had battery life good enough that I didn't feel like I needed to take my power supply with me if I'd be away from my desk for more than an hour. It had every port I was likely to need for most tasks. In short, it was the first time I'd used a laptop that was actually usefully portable as a self-contained device. They kinda ruined that appeal by going all-USB-C and The Great Endongling, but that's another story.
It was also very stable, and over time I came to really appreciate the built-in software. Preview's amazing (seriously, never would have thought a preview app would make a whole computing platform "sticky" for me, but here we are, it's that good), Safari's the only browser that seems to really care about power use, terminal's light and has very low input latency, it comes with a decent video editor, an office suite I prefer over anything I've used on Linux, and so on. In short it's full of good-enough productivity software that's also light enough on resources that I don't hesitate to open them, and often forget they're still open in the background.
These days I like having a base OS that's decent, includes the GUI and basic productivity tools, and that's distinctly separate from my user-managed packages (homebrew) rather than having them all mixed up together (yes, I could achieve this on Linux, if it had a core, consolidated GUI/windowing system so various apps weren't targeting totally different windowing toolkits, but it doesn't, so separating a capable and complete GUI "base OS" from the rest of one's packages gets tricky). There are quite a few little nice-to-haves littered around the settings and UI. Most of the software is generally better polished UX wise than Linux or Windows, and that doesn't just mean it's pretty—it performs well and, most importantly, consistently. There are problems and exceptions to "well and consistently" but there are so many more issues on competing platforms that even if it's gotten worse, it's still much nicer to use.
Given the premium on hardware (that's come and gone—at times there almost wasn't one if you actually compared apples to apples [haha], but right now it's large) I'd rather use Linux (or, well, probably a BSD, but that'd mean even more fiddling most likely) but the only times that's seemed to function genuinely well and stably compared to its competition was when I either kept tight control over every aspect of the system (time-consuming, meant figuring out how to do any new thing I needed to do that other systems might do automatically, which wasn't always a great use of time to put it mildly) or in the early days of Ubuntu (talking pre-Pulse Audio, so quite a while ago) which was really sensible, light, and polished for a time.
I do still run Windows only for gaming, and Linux on servers or in GUI-equipped VMs for certain purposes.
The devices are so complicated now that they can't do their most basic functions right.
I see something similar and assumed this happens because Mail / Calendar are relying on ics attachments (not sure what the behaviour is with the Gmail integration). I believe this means that if Mail is closed you don’t get Calendar updates until you open both and refresh.
Either way I find I have to refresh Mail and Calendar a lot to keep them in sync.
Calendar / Todos depends on the backend. If you’re using Exchange, check the settings to confirm that it’s not set to poll every hour or something like that.
This is supposed to be a personal assistant. And I have a whole list of what it could do for me, personally. But it doesn't.
I've been trying to figure out how to hook Google's speech recognition and voice into other apps, since they're great and it's 99% of what I need, hands-free control and feedback. Maybe they should make that easy, preferably offline and let other people create their own personal assistant modules or something.
Like they would ever do that. Then you would no longer be their "corporate bitch".
Seriously, the one thing that stands between home assistants and being useful is opening the software up and letting it be used by regular OSS devs. Alas, every one of the four big providers (Apple, Google, MS and Amazon) treat them as their moat; they want control over the ecosystem. It's the same in many other places in the industry - we're technologically way behind where we could be, because everyone wants to be the platform and commoditize everyone else, which necessitates having total control.
Maybe there's already something like that in the works, with all the talk and investment in AI, we should be seeing some real world results...
Microsoft OneDrive tags my photos. It's mostly useless. For instance, I have some pictures of squirrels on the tree outside my window. Squirrels can really do the most amazing things on trees, but they are small compared to the tree.
Microsoft with all its AI muscle will invariably tag those images as something like #Outdoor #Grass #Plant #Tree.
It's the same problem with all of those benchmark beating AIs. They have no clue what's special about the picture and what just happens to be in them as well.
Meanwhile, my iPhone has gotten much smarter about adding meetings to my calendar, or guessing the person who is calling me. These seem like the real use cases for AI going forward.
But yes, Siri is the worst.
Well, maybe some things are worse.
And maybe I'm the exception here, but I have refrained from buying an Apple Homepod specifically because of how bad Siri is. If it was on the same level as Google Assistant I would have bought one by now.
Now it randomly activates multiple times a week and really struggles to even pick up me talking to it right next to it.
Convinced they've switched to a less capable microphone system because assistants were all the rage in the 7 era but now I think people have realized it's not really that important.
For a while, it randomly decided that “call my wife” meant to “call my mom.” It clearly said call my wife on the screen and then switched to “mom”.
Set a timer for fifty-one minutes or forty-nine minutes.
Even siri can hear that
I think Apple should have been more honest about it in their privacy messaging -
"Hey guys, pretty please can we listen to your Siri recordings? We know it's not the privacy style you're used to from Apple, but if you want Siri to ever not be a piece of crap, this is really the only way."
Apple knows a ton about me, they have realtime access to my email, calendar, contacts etc. If they have guarded access, then I would accept a toggle. A lot of Siri processing happens locally on the device nowadays, which could be why the Watch, Mac, iPhone and Homepod all can give wildly different results.
Also, Apple could train it themselves, they possibly do, except we haven't gotten an update yet. A large portion of my personal training data could be stored in iCloud, I mean my passwords, mail, documents and my photos are there, right? The analysis of my voice data is sent to Apple anyway.
My Nokia 1320 with Windows Phone 8.1 came with a very basic VA that was capable of understanding Polish, but only if I dropped all grammar and talked like a robot. "Call mum" is "zadzwoń do mamy" in its proper form, but I had to say "zadzwoń do mama", which sounds unnatural; not to mention that stuff is also read back without proper Polish grammar: "calling mum" is synthesized as "dzwonię do mama", not "dzwonię do mamy". Grammatical complexity is a problem for VA technology, and that's probably why neither Siri nor Cortana supports Polish or other Slavic languages, not to mention dozens of other languages.
Every once in a while she will decide to listen through my car, but it's very rare. I don't have Apple Car Play so not sure if that's a factor.
Google has been far better and more consistent listening through my car even if it didn't get my query correct all the time, at least I could correct it without pulling my phone out.
Siri is like that nice employee who was hired by way of nepotism. She's attractive, but she's bad at most things; the organization won't fire her because of the aforementioned nepotism, and the only reason you put up with her is that she's attractive and she at least makes coffee and photocopies just well enough, but you can't trust her with more advanced tasks.
What's worse is that the organization also won't hire her more talented and equally attractive contemporary Google Assistant because of aforementioned nepotism. The boss thinks there's only room for one assistant.
In the last year or so it's gone from correctly handling "add red salsa to the shopping list" to consistently adding two items "red" and "salsa". (It also fails on "buttermilk" and others.)
And around the time this started happening, Siri went from acting after a short pause to saying "just a sec" after a short pause.
Perhaps it's time to file a feature request to Apple to allow us to plug in alternative digital assistants in place of Siri.