Hacker News
Alexa: Amazon's Operating System (stratechery.com)
278 points by misiti3780 141 days ago | 168 comments

Alexa has a fundamental shortcoming that stops it from being a robust operating system like the writer describes: the cognitive load of memorizing all possible variations of 3rd party apps on the platform. Siri might have greater difficulty with natural language processing than Alexa, but I never have to think much about the word order or phrasing. With Alexa, I'm constantly feeling like I'm playing MadLibs with my eyes closed.

That being said, the openness of the platform is a huge benefit, and the main reason I have one. I for one am very interested in seeing which improves first: Siri's natural language processing, or Alexa's contextual matching of its API.

I know Google Now wins in a lot of areas, and there is a group of people that have no problem using it, but the overall sentiment seems to be one of distrust in giving Google an always-on microphone.

I made my own app and I can't even remember how to use it. I keep saying the wrong trigger word. My one gripe with Alexa is that the app name needs to be prefixed.

What I want to say:

> Alexa, open the garage door

What I actually need to say:

> Alexa, tell Some App Name to open the garage door

> My one gripe with Alexa is that the app name needs to be prefixed.

The thing that annoys me is when this doesn't work.

> "Alexa, play <station name as it appears in Alexa App> on Pandora"

> "I couldn't find <station name as it appears in Alexa App>, here's some crap on Amazon Music"

You know you can set the default music provider in the Alexa app, right? Setting this to Spotify allowed me to stop asking Alexa to "play X on Spotify" and just say "play X" without getting bounced to Amazon Music.

Yes. It doesn't help. Alexa will understand what was spoken perfectly; it shows as much in the app. Sometimes it finds the station, most of the time it doesn't. Regardless of the default music provider or the qualified "on Pandora", Alexa will fail to find the request and default to searching Prime Music.

Submit this to Amazon and they will fix it.

I try to get it to play Kiss FM UK, and it says a strange word and then dies. I raised it with Amazon and they are looking into it.

The dream:

    > they will fix it
The reality:

    > they are looking into it

The dream:

    > they are looking into it
The reality:

    > it's written in a ticket somewhere

I submit every card in the App, it hasn't helped yet.

Huh. I have a very high degree of confidence that I used to do exactly that, though it's been a few months since I did so. I just tried now, and it doesn't work, just like you said.

I wonder if they changed something, or my memory has tricked me. (The weird thing is not only that I remember doing it that way, but I have no memories of having any troubles with it, and this is certainly what I would do on my first try)

This is also a big gripe of mine. I created an app that can be used as a fun "neutral" third-party when instructing my toddler to do something.

What I want to say:

> Alexa, tell %NAME% it's time (for a nap|to go to bed|to eat dinner)

What I have to say:

> Alexa, tell Toddler Boss to tell %NAME% it's time to go to bed.

It's not natural sounding at all.

Exactly! Funny you should say that... one of the first apps I tried to make was to just say "Alexa, tell [my wife's or daughter's name] I love her" and I gave up because unless I name the app with my wife's name as the app name it won't work the way I want.

Russ Hanneman?

This is what he did on HBO's Silicon Valley. It was quite..weird.

He's disrupting parenting!

Could you not create an app called %NAME% which then uses Toddler Boss?

Ha! I didn't even think about that. But it's not ideal for vertically scaling the family :)

Names are cheap... cheaper than kids anyway!

How about speaking directly to your child? ;)

I guess...

I tried to publish such an app on their store (just to go through the process... they were giving away sweatshirts). They rejected me on the grounds of targeting kids. I'm guessing you never tried to publish this?

Nah it was just a fun project to introduce myself to the platform. I'm surprised they did that to you though. I wonder if that policy will change in the future considering there's tons of apps that target kids in their app store.

Is that an intrinsic limitation or just an omission in the implementation?

I have a TP-Link Smart Switch and have their Kasa app integrated with my Echo, and all I need to do is name my switches and say things like "Alexa, [device name] on", "Alexa, turn on [device name]". So it's definitely possible for 3rd party apps to turn devices on and off with Alexa without requiring users to mention the app name.

Maybe you just need to change your app to add your garage door as a device to the Echo and use the provided on/off commands rather than register your own custom commands? I can imagine the device on/off paradigm could be limiting for certain use cases, but not for this one.

I hate this, what would "on" be for a garage door?

I have a google home, and through integration with IFTTT I can say "Okay Google open the garage door" and it will work. It's so much smoother than telling the garage to "turn on", or telling the front door to "turn off".

It's the major reason why I've stopped using and actually sold the Echo, and I'm getting a second Home soon.

I love all of this smart home stuff, but the software developer in me cringes at the thought of some message queue connection getting messed up somewhere and my garage door going up and down 3 hours later as I lay in bed...

I've got fixes for that, mainly that it can't open from the Home when we aren't home, and it announces when the garage door opens by literally "announcing" it through the home's speaker, as well as a notification on my phone. There's also some trickery that you can do with IFTTT's activation time variable and checking how long it's been since it was "asked" to do something vs when it got to my system.
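For anyone curious, the "how long since it was asked" check can be sketched roughly like this in Python. Everything here is an assumption for illustration: the `should_execute` helper, the 30-second cutoff, and the idea that the request arrives with its original "asked at" timestamp already parsed. It is not IFTTT's API or the actual setup described above.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical cutoff: commands older than this are treated as stale.
MAX_COMMAND_AGE = timedelta(seconds=30)


def should_execute(asked_at, received_at, someone_home, max_age=MAX_COMMAND_AGE):
    """Return True only if the command is fresh and somebody is home."""
    if not someone_home:
        return False
    age = received_at - asked_at
    # A negative age means clock skew or a bad timestamp; reject that too.
    if age < timedelta(0):
        return False
    return age <= max_age


# Arbitrary example timestamp, used only to make the demo deterministic.
now = datetime(2017, 1, 5, 22, 0, 0, tzinfo=timezone.utc)
fresh = should_execute(now - timedelta(seconds=5), now, someone_home=True)
stale = should_execute(now - timedelta(hours=3), now, someone_home=True)
away = should_execute(now - timedelta(seconds=5), now, someone_home=False)
print(fresh, stale, away)  # True False False
```

The point of rejecting on age rather than retrying is that a garage door command delivered three hours late is worse than one that never arrives.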

It's not for everyone, but I'm really enjoying the setup. Although if I could, I would remove IFTTT from the whole equation. Everything else is running off of a server in my house, so IFTTT is the only part that's really not fully in my control.

But as it's nothing but a "layer" on top of the rest as an "ease of activation" kind of thing, I'm not that worried.

My (now sidelined/back burnered) startup's main product was kind of like an advanced IFTTT that runs on your own hardware. The major difficulty was scaling up development and production without capital, and getting a basically Turing-complete system to be easy to use.

While I would love a "self hosted" IFTTT, the biggest issue to me would be getting the integrations.

IFTTT is the ONLY thing on Google Home right now that lets you create fully custom voice actions. It's got a lot of things like this.

There are a few similar services, but at the end of the day, if it's within my capability to talk directly to a service, I'm going to just write my own code to talk to it (or my FOSS HA system will more likely support it very quickly).

IFTTT's power to me is completely in its ability to integrate with otherwise "locked down" services.

> "Okay Google open the garage door"

Isn't this quite a security risk? I'm imagining a burglar standing outside with a megaphone and ordering your house to let them in.

Well considering the fact that it's pretty far away from any windows, and my condo is on the 3rd floor, couldn't they just break the glass window? Or shove the door down? Or more likely use a bump key or even easier fuck with the garage door RF "encryption" which IIRC is laughably weak.

But that aside, it's set up so that my home automation system will only listen to the voice actions if there is someone home. But that was more to prevent IFTTT bugs from opening when I'm not home.

If someone wants to go through all that trouble to break into my house, let them. I have insurance and security cameras. I'm not going to build my house like a vault to prevent someone from stealing my TV any more than I'm going to walk around in a bulletproof vest to prevent someone from shooting me.

Edit: Sorry if this comes across a little shitty sounding. I get asked this just about every single time I bring this up. I have this response pretty loaded...

I think there's special integration for home automation. I have the same switches. Remember when you first set them up and you had to go to a special "Smart Home" section of the Alexa app? There's a special section there just for that stuff.

I'm not sure if that part is open to developers or not. It looks like it is. There's a "Get More Smart Home Skills" button that shows 59 options, so it's probably not just for the big boys.

You could set it up to say "Alexa, tell my garage to open." Not perfect, but a little better.

As someone who recently got an Echo Dot for Christmas, it seems like some apps do have this capability. For example, I paired my Sensi Thermostat, and I only need to say, "Alexa, set thermostat to 70 degrees".

Smart home functions are built-in, and not a "skill".

Used Siri since 2011 and used to love it. Then this summer I used Google Now and just recently bought a Google Home. Google wins hands down and makes using Siri frustrating!

For me Google Now and Alexa are the next huge leaps in computing ... like the iPhone was. I wish Google would create an app store much like Apple's where users try an app out and buy it for a buck if they like it.

I have my Google Now hooked up to turn my TV, lights and thermostat on. It understands my commands 97% of the time vs. Siri's 85%.

Can you activate Google Home without saying Google yet? "Okay Google!" is about the most awkward wake word I have heard. It makes me want to throw my phone if I have to say it more than once. "Hey Google!" isn't much better.

They are never going to win if you have to babble awkward phrases when you use it. Alexa just rolls off the tongue.

Edit: Amusingly, when I said "OK Google!" "Change your wake word" to my phone just now... It replied with: here is information from Amazon.com. (information about how to change Alexa's wake word to Echo or Amazon).

Ok Google. You have to fix this.

You can say "Hey Google" with the Home. It's a little easier to say.

before I started I am using Siri to type that stop it Siri…

Well I agree… I don't like having to say OK Google turn the light on… Or OK Google… Or everything… It does not feel conversational. But there needs to be a command otherwise the AI speaker will be doing thing every time I hear something… Maybe we can change the name that would be good

Eventually this sort of thing will be triggered with gestures. Big at first, but gradually smaller until one can input surreptitiously.

Gestures from way across the room or from a separate room entirely? To me voice beats that.

I'm not sure why it should matter where one is when one performs a gesture. If I'm sitting on the couch I certainly don't want to have to stand up to adjust the lighting. Everything gets smaller until it eventually disappears, and the future versions of Echo etc. will just be integrated into the ceiling or a "smart" shirt or whatever. One won't have to worry about where they are, unless one wants to have a conversation without FBI or the spouse hearing it.

> With Alexa, I'm constantly feeling like I'm playing MadLibs with my eyes closed.

Well said!

Echo has an always on microphone as well. The story is that audio gets sent to a temporary buffer on a server somewhere to listen for your trigger word, and I trust them, but it would be so easy to start storing that data if they wanted, (or rather, the NSA wanted).

I'm not normally really intense about privacy (relative to other tech people I know), but all these devices creep me out.

No, the processing for the trigger is done locally, no network communication until the trigger is recognized. That's why you can only have one of a couple of trigger words.

The same is true of most modern Android phones with Google Now AFAIK. E.g., the Moto X was one of the first phones with a low power dedicated ASIC for wake-word recognition to further make it sustainable on mobile devices.

Yeah I read somewhere that it was all sent off, but everything else I've looked at says it's all local.

The data is not sent to a server all the time - only when someone is using the device.

Source: tcpdump

Do you have a rough guess as to how compressed the outbound audio is?

I was under the impression that the audio is stored locally while listening for the trigger word. Once the trigger word is recognized, Alexa begins recording and sends the result to be processed.
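A toy Python model of the behavior people are describing in this subthread (local buffering, nothing sent until the trigger). This is only a sketch of the claim, not Amazon's implementation: `gate_audio` is a made-up name, and the "frames" are already-transcribed strings, whereas a real device would run an acoustic wake-word model on raw audio.

```python
from collections import deque

WAKE_WORD = "alexa"  # stand-in for an on-device wake-word detector


def gate_audio(frames, buffer_size=3):
    """Yield only the frames that would leave the device.

    Frames are kept in a small local ring buffer and discarded until the
    wake word is seen; everything after that is passed through.
    """
    local_buffer = deque(maxlen=buffer_size)  # old frames fall off the end
    triggered = False
    for frame in frames:
        if triggered:
            yield frame
            continue
        local_buffer.append(frame)
        if WAKE_WORD in frame.lower():
            triggered = True
            local_buffer.clear()  # nothing heard before the trigger is sent


heard = ["dinner is ready", "where are my keys", "Alexa", "what's the weather"]
sent = list(gate_audio(heard))
print(sent)  # ["what's the weather"]
```

The fixed-size buffer is also why the set of trigger words would be limited under this model: the detector has to be small enough to run locally all the time.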

We got an Echo over the holidays, and watching my 7, 5 and 3 year old daughters interact with it has given me a glimpse of what bringing home a black-and-white TV in the 50s must have been like: Objectively it's pretty limited, but it (voice-driven interaction, rather than the Echo specifically) is so obviously the future it's striking.

Separately, I've told my daughters that they probably won't ever need to learn to drive--cars will probably do it for them by the time they are driving age.

They've put two and two together, and the other day I overheard them saying "Someday we'll be able to call an Alexa car and have it take us wherever we want."

I've had the same experience. My 5 year old came down stairs the other morning and put on some music for herself (Bowie, even) which I thought was wonderful.

Though it was pretty funny/depressing hearing how broken the experience was this morning.

    "Alexa, play Moana" - "Here's songs by Nirvana"
    "No, Alexa, play Moana soundtrack" - "I think you might like music by Adele"
    "What? Alexa, play the Moana original motion picture soundtrack" - "Playing Moana original motion picture soundtrack"

We're so close, but we still have a long way to go.

The other night, sleepless and excited by having found a few commands around podcasts / reading kindle books back (apparently "read from my Kindle" is an Audible command, grr) I tried:

    "Alexa, play me a podcast"
Sadly, I got:

    "Here's a station you might like: Linkin Park"
...it even got my request right in the app. I have no idea how it managed that "fulfilment".

Well, unless you ever annoy Amazon, bounce a check, charge back any purchases, or get put on any government no-drive list. Then you're walking for the rest of your life.

I watched my friends' two-year-old stand in front of their Echo excitedly and yell "Alexa Alexa Alexa Alexa!" Her "toddler accent" meant it had no idea she was trying to say the keyword.

My toddler pronounces it "Uh-yexa" which won't activate. That's a feature in my mind.

"Uhyexa Order bears!"

"I found a teddy bear for $49.99, would you like to order?"



"Uhyexa, order ehyephants!"

Funny thing is, my toddler was babbling something at dinner and the Echo Dot picked it up and tried to do a Bing search for "Eva has more fries".

My four-year-old son is the same way. He talks to my Google Home all the time, asking it weird questions (which it sometimes answers! "Hey Google, are you a robot?" "I prefer to think of myself as your friend."), getting it to tell him jokes, turning our Christmas tree on and off via IFTTT and a WeMo switch. It's all very natural for him.

I think we are still far away from "Someday we'll be able to call an Alexa car and have it take us wherever we want." It might or might not be possible. Either way, we need several leaps in AI innovation to achieve that.

True, but it's plausible enough that 7 year-olds can connect the dots from existing or close-to-existing technologies.

brb.. going to start self driving car company called "a lexa".

"Alexa, get a lexa car"

"A lexus car has been purchased with Amazon One-Speak™. You have been charged $72,040. Delivery will be between 7 to 14 days."

If they're not, make sure they're asking it things like the height of a giraffe or weight of an elephant. 3-5 year olds seem to enjoy that.

They have to remove branding from Alexa "skills" so that more natural requests are possible. The #1 thing preventing me from installing most "skills" is that invariably I would have to speak "Alexa, ask <ridiculously-named product> to X" instead of just "do X". I should also be able to use the app to write the exact command text I want to use.

This is my general complaint about all of these voice controlled services. I want to be able to set my trigger word. I will not be unpaid brand promotion for Google, Amazon, etc. I can mostly deal with "Alexa" or "Siri" since those are actual names, but I'm not going to go around saying "Ok, Google".

Humans name things. Pets, cars, computers, houses, kids(!). We need to be able to name our AI pets, too.

"Ok Google" is one of the most annoying trigger phrases. I literally have a hard time saying it and have to deliberately slow it down to get it to realize I'm trying to trigger it[1]. This has led to me pretty much abandoning that service.

[1] It's the difference between saying "ok goo-gul" and "ok googl". For whatever reason, the second "g" sound gets dropped or blurred when I say 'google' aloud, so I have to remember to slow down and say it the first way.

"Alexa" and "Siri" are simple, and have no glut of soft sounds stuck in the middle of the word.

Hah, I have the opposite problem, it triggers too easily for me. My roommate's name is Hugo, and we both have Android phones and a Google Home. If I say "Hey, Hugo" you can hear all 3 devices turn on.

I know the "OK, Google" detection is supposed to be linked only to your voice, but I find about 30% of the time other people can trigger it anyways.

That's hilarious. And a perfect example of why trigger words need to be broadly customizable.

Yep, my father tried to use "OK Google" on his new 5X and ended up triggering my phone instead.

Agreed - I listen to podcasts with gmaps running on my phone, and about once a week some random non-google phrase will trigger it.

I will never say OK Google (except when saying I will never say OK Google). Gross

How would you prevent this from being a back door into a person's house? Or, more directly, how do you prevent applications from stealing phrases from each other?

That is, you are basically saying that you want everything in $PATH. This would be like if git had decided that "log" should just do "git log". Certainly could make sense. And I agree that users should be able to allow this.

However, the applications? I'm not as sold. You are basically allowing a situation where the fundamental behavior of the system would change from installing a single skill. And it might not be clear on how or why it changed. (Certainly not to most users.)

I think it would work like the “default browser” or “default mail client” concept. While it’s possible to install conflicting apps, it’s certainly likely that I will only have one logical handler for a certain type of general request and likely that the app I most recently installed is the right choice when there’s overlap. (The Alexa phone app could be used to change that, when the default assumptions are wrong.)

Also, voice commands take significantly longer than typing and there is no real auto-complete. The cost of a wordy voice command is much higher, especially if you stumble at any point and have to say it all again (usually while trying to talk over one of Alexa’s wordy error responses).

It just doesn’t make any sense for commands to sound like marketing material. This is the “Windows 95 Start Menu” thinking where every app is under a “FooBar, Inc.” submenu instead of just getting to the point and showing you the app you want.

There is a lot of "best intentions" here. "Default" applications almost work, but really only exist because we have well agreed upon url schemes for specifying a few things. Mainly "mailto:" and "https:". The rest falls into the hell that is default application for file type. Which is the largest source of malware and other nonsense in consumer computers.

Seriously, let that settle for a minute. "Default" behavior for executable applications is the most preyed upon phishing vulnerability there is. I do not want that introduced to a home automation device. I hope we do better than that.

Perhaps they could let users set a skill alias.
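Something like the following Python sketch is what a user-set alias could look like. The `AliasRouter` class and its methods are hypothetical, not part of any Alexa API; the one design assumption is that only the user can register a phrase, which sidesteps the "skills stealing phrases from each other" concern raised above.

```python
class AliasRouter:
    """Maps user-chosen phrases to (skill, request) pairs. Hypothetical sketch."""

    def __init__(self):
        self._aliases = {}

    def set_alias(self, phrase, skill, request):
        # Only the user calls this; skills cannot register phrases themselves.
        self._aliases[self._normalize(phrase)] = (skill, request)

    def route(self, utterance):
        """Return (skill, request) for an aliased phrase, else None."""
        return self._aliases.get(self._normalize(utterance))

    @staticmethod
    def _normalize(text):
        # Lowercase and collapse whitespace so minor variations still match.
        return " ".join(text.lower().split())


router = AliasRouter()
router.set_alias("open the garage door", "Some App Name", "open the garage door")
print(router.route("Open the  garage door"))  # ('Some App Name', 'open the garage door')
print(router.route("order bears"))            # None
```

An utterance with no alias returns `None`, so it would fall through to the normal "Alexa, tell Some App Name to ..." handling.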

I fully agree that should be possible. Apologies if I didn't make that clear.

This and custom voice training for device names are on the top of my wishlist for the Echo.

For some reason my Echo has a really hard time recognizing some of the rather obvious names I give to my devices, like "LG TV". It'd be great if we could train the voice recognition engine to associate certain pronunciations with a specific device name.

The Alexa app does have a generic voice training selection...

One example of a possible answer:

"Alexa, add a task saying blablabla" "Ok Bob, the task has been created in your favorite todo app. Would you like me to transfer it to another app?"

"Oh, jeez, Alexa, just do it already without asking me stupid follow-on questions!"

One of the reasons I'm less sanguine about voice as an uber-interface. I'm not sure you can square the circle between a rich and capable interface and one that isn't interrogating you like that. Computer screens can pop up arbitrarily large amounts of data on the screen (entire EULAs) to be dismissed almost for free. The serial nature of speech is going to be a big challenge for most people. (There are, of course, those who use it all the time as their interface. Perhaps a few will even read this post. But I'd submit their usage doesn't end up looking or sounding much like the Star Trek ideal.)

I am not completely familiar with Alexa (I use Google Home) but can't you make a recipe with IFTTT to accomplish this? Though it still does not rely directly on the Alexa OS.

Alexa currently requires memorization of command sequences. Can take a while for older users to learn.

E.g. there's a skill (AskMyBuddy) that sends a preprogrammed text to a list of cellphones, typically a request for help. But the user needs to be trained to remember the exact command ("Alexa Ask My Buddy to send help"), which might be forgotten if someone is in a stressful situation.

Anecdote: tried to create a todo list with one item, "buy milk", but Alexa would not accept this item unless you set up the ability to purchase items from Amazon.

Weather does not work outside the US (needs a US address).

However, its NLP is quite good. Having my (Vietnamese-born) parents try the Echo Dot, I was struck by how awful their English actually is, at least in comparison to how my brain has become accustomed to their speech patterns. For one thing, they enunciate "Alexa" much differently than I do. And then where I would say, "What's the weather today", they say, "What weather today is?" or "What day this week will be snow?" To my surprise, Alexa actually understood their convoluted phrasing, at least for the built-in skills that Alexa has had well-honed out of the box.

Thanks for this comment, I hadn't really thought about how NLP may fail ESL speakers.

Do you think your parents would be more comfortable talking to it in Vietnamese if it was possible? I'm mostly wondering if handling good Vietnamese is easier than handling English with poor grammar.

Definitely. Their English even after 30+ years in America is at the level where our English-English conversations are probably at the elementary grade level (except with a few more proper nouns). I used to think that they were at least better in understanding my natural English conversation, but I've realized that I reflexively shift to using much simpler sentences when speaking to them. But when they're with Vietnamese friends or family, they have conversations (in Vietnamese) just as normal adults typically do in their native language.

Conversely, I've since realized that the Vietnamese that I think I understand is probably at the toddler level. For a very long time, I just assumed the Vietnamese language lacked features such as pronouns and articles. But then I realized that when my parents tell me to go wash the dishes, my brain just fixates on "wash dishes" and ignores all the other connective words. So I know a lot of verbs and nouns but very few words that are part of everyday conversational speech in Vietnamese. I imagine that's what Alexa feels like :)

Your comment made me curious how well the Google Translate app can deal with foreign languages. I was stunned to see that it could understand my attempts at Vietnamese. So other than a proportionately smaller dataset to learn from (Vietnamese usage vs English usage of Google or Amazon), seems like Alexa and Google Home could competently deal with foreign languages.

Handling poor English grammar is a necessity anyway if you want to hit mass market share, since not everyone speaks as if they read the Oxford English Dictionary every night even when they are native English speakers.

"What weather today is" Why is this considered amazing NLP? I type any order of related words in Google (or any other search engine) and the intent of my malformed sentence is sufficiently understood to bring back relevant results. Even if I formed my sentence grammatically, the result is no better.

That's because most search engines use a form of NLP in their processing of user input for queries.

No. The most trivial search strategy just tokenizes the input and looks up those tokens in the index. A poorly formed sentence is not a detriment to finding out the intent, as long as the key words are there. It doesn't require complex NLP to achieve this end result.
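To make the "tokenize and look up" point concrete, here is a minimal Python sketch of an inverted index. The two documents and the scoring (count of matching tokens) are made up for illustration; real search engines do far more than this.

```python
from collections import defaultdict

PUNCTUATION = ".,?!\"'"


def tokenize(text):
    """Lowercase, split on whitespace, and strip basic punctuation."""
    tokens = (t.strip(PUNCTUATION) for t in text.lower().split())
    return [t for t in tokens if t]


def build_index(docs):
    """Map each token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Rank documents by how many query tokens they contain; word order is ignored."""
    scores = defaultdict(int)
    for token in set(tokenize(query)):
        for doc_id in index.get(token, ()):
            scores[doc_id] += 1
    return sorted(scores, key=lambda d: (-scores[d], d))


docs = {
    "forecast": "weather forecast today snow this week",
    "recipe": "ham bread recipe",
}
index = build_index(docs)
print(search(index, "What weather today is"))  # ['forecast']
print(search(index, "this snow week when"))    # ['forecast']
```

Both scrambled queries hit the same document because only the presence of the key words matters, not their order or grammar.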

Ironically, "What day this week will be snow?" doesn't give me great results on Google, but Bing actually shows a weekly weather graph.

"snow this week" will return the weather forecast for my location on google

Yeah, but that's grammatically valid. It's not a question, but it's a valid sentence.

I don't think it's a valid sentence, unless you interpret it as commanding someone/something to snow sometime this week.

So will "this snow week when"

Then why is it that you get just as good, if not better, search results when you type words in decreasing order of information content and omit common words, some of which are essential for parsing natural languages? E.g.

   Beatles walrus -> the song "I am the walrus" by the Beatles.
   Tintern vacuum -> the song "Vacuum cleaner" by Tintern Abbey.

If you say the intent activation word (weather / snow / light etc..) the intent will trigger. Phrasing can help for certain queries though.


Interesting. The phrases I typically use for Google Search are also not considered proper English grammar, but are highly effective in getting the results that I want. I wonder if that's the case as well here.

Most commands are pretty obvious and easy, but some of them aren't. For instance, the phone locator. The command is something like, "Alexa, tell trackr to locate my phone." But my wife and I can never remember that when we need to locate a phone, and we have to figure it out all over again.

I assume that people have to call for emergency services less often than we misplace our phones, so I'm sure it's even harder to remember the exact verbiage for that.

Alexa added "Ham Bread" to my shopping list yesterday.

Not super on-topic, but I laughed when I saw it.

Weather works quite well for me here in Germany.

Thanks. Looks like some countries work, others don't.

Weather works fine in the UK.

As of last week, Alexa cannot tell me how many inches per second is an atto parsec per micro fortnight.

Google can.

Alexa can tell me how many sides a hexagon has. But cannot tell me the name of a six sided polygon.

If that's the main criticism of Alexa, then I'm rushing out to buy one immediately.

I'm not interested in an always-on voice activated device so that I can ask it rarely asked complicated questions. I'm interested in how well it understands my voice, and how many day-to-day chores it can help with.

The ability to understand you involves understanding language. You have to understand the world to understand language. Google built a 70 node knowledge graph that drives their understanding of the world, which enabled them to create an intelligent device. That makes it more revolutionary, whereas the Echo is more evolutionary.

I have both and purchased the Echo when first launched. The Google Home amazes me every day.

So with Home you say "play Madonna song Sean Penn movie" and "Live to Tell" starts playing.

Echo has a command of "goes like this". Never used it. But every day Google figures out something.

Our understanding of the world allows us to talk in a condensed form, as humans can infer a lot.

Google is doing the same, so more and more I talk to Google Home in condensed format.

It is so early with all of this but Google has an incredible foundation. The future is going to be incredible.

Seconding this, Google voice search (on my Nexus phone) is incredibly good. If you've spent the last 18 years expecting valid results from Google searches even when you dump a bunch of vaguely related words into the text box then Google's voice search gives you the same experience.

I could be wrong, but isn't the voice search merely a voice-to-text layer across the top of the normal text search? My old BlackBerry had a search feature and voice command. To this day I think it is slightly below the quality of Google Home (played with it at a family member's home; don't own one myself).

My daughter does voice searches, asking it to display pictures of various pokemon. "Show me pictures of the pokemon XXXX". These things have crazy names, and she doesn't have very clear speech, but it always gets them right. It is changing its dictionary of probable words you will say, based on what words you have already said.

If she unclearly said, "Show me pictures of Absol", it would fail miserably. But if she says "Show me pictures of the pokemon Absol", it kicks butt.

So I would say it's more than a voice to text layer on top of the normal text search.

Should have been 70 billion. :)

Sure, Google offers a better technical solution.

But do they have enough incentives? A good business model to be willing to offer it almost for free? The will to push it as hard as Amazon does? It doesn't seem so.

Yeah, Google is only selling Google Home for $50 less than the Amazon Echo, and only have 87% smartphone market share for Android where Google Now can be installed, and only over a billion people using the Google Play marketplace. They have no hope of competing with Amazon in this space.

Maybe not as heavily as Amazon (literally adding items to your Amazon cart for items you need), but indirectly, they do. Since Google Home is integrated with all Google products (calendar, keep, maps, Youtube, chromecast, Android and more to come), this keeps the user in the Google eco-system.

Right. And it is a gateway to future Google services. Car hailing, flight booking, etc.

Google has huge incentives to get this into everyone's home.

I had fun asking Google, Siri, and Alexa, "is Jesus God's son?" Siri brings back web searches. Alexa says that she's sorry but she doesn't understand the question (even though the card in the app shows she heard it correctly). Google brings back the theology of Jesus from a Muslim point of view. It then reads it out loud. The Wikipedia entry goes so far as to argue against the Trinitarian doctrine.

Had fun with that last one during the Holidays at each of my conservative Christian families. I thought it was hilarious.

Hm well when I typed it to Google: http://imgur.com/sPNB3O2

And when I said it, Google heard "how many inches per second is Sarah parcak fortnight"

Regardless, what do these anecdotes have to do with the article's argument that "Amazon is building the operating system of the home — its name is Alexa"?

If you click on that suggestion, Google then does tell you that 1 Attoparsec / Microfortnight is 1.00433 inches per second.
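For anyone who wants to check that figure, here is a quick sketch of the arithmetic. The function name is made up; the constants are the standard parsec length in metres, the exact inch definition, and a 14-day fortnight.

```python
# Sanity check of the attoparsec-per-microfortnight conversion.
METRES_PER_PARSEC = 3.0856775814913673e16  # standard (IAU) parsec length
METRES_PER_INCH = 0.0254                   # exact by definition
SECONDS_PER_FORTNIGHT = 14 * 24 * 60 * 60  # 1,209,600 seconds


def attoparsecs_per_microfortnight_to_inches_per_second(value=1.0):
    """Convert a speed in attoparsecs per microfortnight to inches per second."""
    inches = value * METRES_PER_PARSEC * 1e-18 / METRES_PER_INCH
    seconds = SECONDS_PER_FORTNIGHT * 1e-6
    return inches / seconds


print(round(attoparsecs_per_microfortnight_to_inches_per_second(), 5))  # 1.00433
```

One attoparsec is about 1.2148 inches and one microfortnight is about 1.2096 seconds, which is why the ratio lands so close to 1.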

Another problem with Alexa as an OS here is that Alexa, Siri, and OK Google will never work for some classes of disability. I'm Deaf/Mute myself, so I can't talk to any of these things or hear their responses, which makes them 100% useless to me. There is a good reason text interfaces are so ubiquitous: they can easily be adapted into assistive devices (say, spoken word for the blind).

That may be true, but really they are an interface to decision-making logic (AI). The AI looks for text to power it. That text might come from spoken words, or in your case, in a few years, I can see a camera reading your signing and having the ability to sign back to you.

Would it sign back or simply display written text? Seems to me that signing back is just an artifact of the limitations of human body movements as an output channel... or are there many people who can read sign language faster than they can read letters? (honest question, I don't know the answer)

Reading is entirely different for the deaf and is far inferior when compared to seeing something signed. Take a look at https://www.jw.org/ase for an example of how a website can be tailored for the sign language community.

Seems to me a full-body language has got to be much more expressive than text on a screen. Probably text-to-signing will be just as flat as with text-to-speech, but there's more room there for higher bandwidth than text.

Yes, many people can "read" sign faster than text. Among other reasons, the language is not a 1-for-1 translation of English.

Sometimes this goes horribly wrong: https://i.imgur.com/uxcATG6.jpg

Yes, I agree with you, but are there any applications built for the Echo that also have a text-based interface? Last I looked the answer was 0.

I haven't tried it, but I think Google's Allo lets you do similar things using a text chat interface?

Funny, I was thrilled at how smartphones could improve the lives of blind people. I forgot that some people can't hear. Although there might be an efficient gesture/visual interface they can provide?

Disabilities are varied! Deaf and Blind are just 2 of the many colored rainbow of the disability spectrum :)

> First, no company could ever build enough phones for the world, and secondly, to serve every customer would ruin the profit margins that make the business model so successful.

Both seem wrong? They could produce more models at lower price points and lower costs and maintain similar margins, like Samsung. Probably the branding effect on margin would be smaller but would still be a healthy margin.

Am I missing the point here?

I don't think the "operating system" for the internet of things is fully defined yet. Amazon is simply the first and the biggest mover. Google has also launched Home. Apple has Homekit. It will be interesting to see how it pans out.

Homekit is really interesting when stuff works with it (which is not a lot), and when you can use your voice with it (which is limited).

If Apple could really get more people onboard with Homekit and build standalone Siri devices it could be in the pole position here, but Apple doesn't seem to get what it has with either Homekit or Siri.

Homekit makes it really easy to set up different scenes with different smart devices and interactions. Setting these up requires good user interfaces beyond voice. Apple gets this.

Unfortunately, Homekit is really limited right now.

This is a very good point.

I also wonder whether we'll ever see one established and dominating the others. There are a lot of other small players with a future: Tizen (on Samsung and other devices), RTOS, Contiki, lots of proprietary stuff based on Linux (e.g. Garmin stuff), QNX, ...

I can't talk about Elixir in front of Alexa. It is always chiming in "I'm sorry, I didn't get that."

I haven't had Alexa for long, but I haven't had it trigger off of Alexa-sounding words yet. I did do the training in the app a couple of times when I first got it, though. Have you tried doing the training yet?

It belongs to a friend, so I haven't trained it. But I will probably buy one soon, and then figure out how to use it well. I could always use a unique phrase...

"27ffe8a1-4cd0-4739-bc46-9ad51a9c14ba, turn on the living room lights."

No... you can't... there are only a couple of pre-programmed wake words: "Alexa", "Echo" and something or another.

The third wake word is "Amazon".

Good to know. Thanks!

You could have it configured to respond to "Amazon", IIRC. Probably a decent mitigation for users actually named Alexa or Alex (e.g. "Alex, hon, when are we going to dinner?")

"The concept of an operating system is pretty straightforward: it is a piece of software that manages a computer, making said computer’s hardware resources"

So the author luxuriously simplified a sophisticated piece of software like an OS to simply prove his point that Alexa can be classified as an OS because it is a software that also manages hardware resources.

Someone please correct me, but I was under the impression that Alexa is merely a facade OS, and that behind the scenes it is a sophisticated intertwining of web services and data crunching. Can Alexa still be at its 100% without internet?

Edit 1: Grammar

Does an OS necessarily need to function without being connected to the Internet? Is there anything wrong with a definition of OS that includes "Internet connectivity"? How does Internet connectivity differ from hard disk connectivity, RAM connectivity, floppy disk connectivity, etc? That brings up questions of what the word "function" means. My phone will technically function without any wireless connection, but then it's not really a phone, is it? The Echo will turn on without an Internet connection, at least far enough to be able to say "I'm sorry, I don't have an Internet connection" most likely.

It's the age-old question: is Linux an OS? No, it's a kernel. Is Ubuntu an OS? No, it's a distribution. Is Gnome an OS? No, it's a desktop environment.

You could say with a straight face that Alexa is an OS that's designed to run cloud software. The OS works without a connection, but the user land might not.

Alexa runs on top of Amazon Echo, which runs on top of Amazon FireOS, which runs on top of Android, which runs on top of Linux, which runs directly on the hardware. Linux itself is a rewrite of Unix, which is a very pared-down variant of MULTICS, a system developed in the 1960s.

Something which doesn't run directly on the hardware, but instead communicates with it through several layers of intervening software, isn't an operating system by any stretch of the imagination, however impressive it might otherwise be.

I'm disappointed that large high-tech companies such as Amazon, Google, and Facebook don't develop their own operating systems from scratch like Microsoft did, but instead just take advantage of the hard (but not particularly innovative) work put in by Linus Torvalds and other open source developers. Building something better, using lessons learned in the intervening decades, should be well within their grasp.

It is hard to bootstrap a new O/S ecosystem.

Check out Google's Fuchsia for an alternative.


Yes, that's different (and, with capability-based security, a step in the right direction). Thanks for pointing it out.

I'm not sure how Tansley's concept of ecosystems applies to software (it's just a buzzword to me), but if you mean how to get software running on existing systems to run on a new system, that would take time and effort. If the operating system is intended to run on specific devices (like Magenta/Fuchsia), it might need its own "ecosystem" anyway.

Windows is an amalgamation of drivers, system services and applications.

The only real difference is that Windows runs locally, Alexa doesn't.

As long as it's connected to the internet arguably you can't tell the difference.

Unless you are going to have an academic study about what an "Operating System" is, Alexa is an OS, since it's the "platform" your applications run on, and it's a unified interface with the same rules and expected behaviour.

Alexa is about as much an OS as "Chrome" is (not talking about the Linux part of Chrome OS). Users don't know about kernels, drivers, services, daemons and all that nonsense; they can grasp what an interface and a platform are, and for them it's indistinguishable from an operating system, regardless of whether it academically counts as one under current software engineering convention.

Technically ChromeOS isn't an operating system either; it's some offshoot of Linux that runs a Chrome browser, and all your apps run as Chrome Apps. You can run native applications on it to some extent as well, but that's not the interface and expected behaviour users would experience.

I agree with you on the author's definition being weird, but I do agree with the author's conclusion that Alexa is indeed an "operating system".

I tend to define "operating system" as any software or collection of software implementing both of the following:

* Abstracted access to hardware (whether physical or virtual)

* Execution of other software / applications

In the case of Alexa:

* Alexa implements abstracted access to audio inputs and outputs by performing conversions between audio and text

* Alexa facilitates the execution of "skills", which - while very simplistic - probably technically count as software
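To illustrate those two criteria, here is a toy sketch of a "platform" that hides the audio hardware behind text conversion and dispatches to registered skills. Nothing in it comes from the real Alexa Skills Kit: `ToyVoicePlatform`, its methods, and the "tell NAME to ..." routing are hypothetical, and strings stand in for actual audio.

```python
# Hypothetical toy platform; not the Alexa Skills Kit or any real API.
from typing import Callable, Dict


class ToyVoicePlatform:
    def __init__(self) -> None:
        # Maps a lower-cased skill name to its handler function.
        self._skills: Dict[str, Callable[[str], str]] = {}

    def register_skill(self, name: str, handler: Callable[[str], str]) -> None:
        """Criterion 2: the platform executes other people's software."""
        self._skills[name.lower()] = handler

    def speech_to_text(self, audio: str) -> str:
        # Criterion 1: hardware abstraction. A plain string stands in for audio.
        return audio.strip().lower()

    def text_to_speech(self, text: str) -> str:
        # Stand-in for audio output.
        return f"<spoken>{text}</spoken>"

    def handle(self, audio: str) -> str:
        """Route 'tell NAME to REQUEST' to the matching skill handler."""
        text = self.speech_to_text(audio)
        for name, handler in self._skills.items():
            prefix = f"tell {name} to "
            if text.startswith(prefix):
                return self.text_to_speech(handler(text[len(prefix):]))
        return self.text_to_speech("sorry, I don't understand")


platform = ToyVoicePlatform()
platform.register_skill("Garage", lambda request: f"ok, {request}")
print(platform.handle("Tell Garage to open the door"))
```

The skill only ever sees and returns text; it never touches a microphone or speaker, which is the sense of "abstracted access" meant above.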

I'd sooner call Alexa an "interface" rather than an OS, but calling it an OS is not incorrect.

I agree with you. Alexa is an interface that ties APIs together. Defining OS as software that connects hardware to interfaces seems so reductive that it seems any software could be regarded as an operating system. Is there any software that doesn't access any hardware resources?

I think it's pretty clear that the author is talking about the business, not technical side of things.

Marketing seeks to destroy the meaning of useful technical jargon.

The term Operating System is a good example. Most people have no idea what an OS actually is. They think it is all of the stuff bundled with the OS.

Similarly, in the 1980s the term Relational Database was used in marketing materials to describe any microcomputer database product that was for sale.

Alexa is 0% without internet. Even timers/alarms won't go off.

Slightly tangential to the post: Does anyone know how to make diagrams (watercolor style) like the ones in OP's article and blog?

I believe he uses Paper - https://www.fiftythree.com

This comment totally sidetracked my day. I've been searching for a way to quickly do diagrams, but without the use of full-blown Visio-like flowchart software. I downloaded Paper back in the day and remember when it came out with its "Smart Shape" feature, but didn't ever think to use it for easily sketching charts.

> More fundamentally, Amazon sought to sell the phone through hardware and OS differentiation, much like Apple, but the company could not be more different organizationally and culturally from the iPhone maker; you don’t make good products because you really want to, you make good products by fostering the conditions in which great products can be made, and Amazon’s deeply rooted culture of modularity and services was completely ill-suited for building a highly differentiated physical product.

Eh, Amazon hit it out of the park with something in the Kindle line, no matter your e-book-reading taste. They have good reason to believe they can design products for the masses.

How do I make diagrams and charts in this style?

Looks like that was done with Paper [1].

1: https://itunes.apple.com/us/app/id506003812?mt=8

I believe Ben draws them in the iOS app Paper

>"The Internet made the operating system of the computer used to access it irrelevant"

Hyperbolic statements like this one make me stop reading. There's still PLENTY of important Desktop computing.

I don't think he's saying desktop computing is irrelevant. I think he's saying which OS you use to access the internet is irrelevant.

I read your comment on a mobile device, and switched to my laptop to write this comment. Two different operating systems, same internet.

Yeah, I'm not saying smartphones aren't good or significant. Just that they haven't replaced desktop-computing -- just added to it. The article says things like

"Android and iOS have replaced Windows in importance"

But Windows still seems plenty-important, to me. I don't see how one replaced the other, it's just that mobile is under more dynamic development right now because it's new.

The article is not saying desktop computing is irrelevant, they're saying the choice of operating system is.

I think web apps made desktop OSs much less important for most people; I don't think switching to OS X would have really been viable for most people if it weren't for web browsers/standards.

For professionals and hobbyists yes. For most people, not at all.

Aren't most (many?) adults professionals? Most college-educated people I know spend most of their working hours in front of a computer not a smartphone.

I'm a professional, working a technical role for one of the largest tech companies in the world. I'm using a MacBook, but I've done a (proof of concept) week of work entirely from an iPhone SE. Some of my coworkers use Red Hat, some use Windows. I know one guy working on an iPad Pro, although he's a little less technical than the rest of us so most of his day is spent in PowerPoint.

But my job mainly consists of interacting with remote machines through a web browser or SSH. Thanks to the Internet, I don't need to worry about what physical form factor my machine is. If all you need is Word, Excel, a web browser, and email, you're basically set no matter where you go.

Desktops are less relevant today than ever before. For many people, the desktop OS is just a VM host and their browser is the real "OS".

I largely agree with you, but there are still a lot of common office functions (e.g. printing) that aren't well-served by browser-based computing. Added to which, many businesses still use custom or industry-specific desktop software that hasn't migrated to the browser yet. Personally, I'm with you -- I don't care about OS and do most of my work through browser/terminal. But there is still some critical scientific software that keeps me on Windows/OSX and I wouldn't dream of doing graphic design in a browser. I also agree that "Desktops are less relevant today than ever before" but they're far from irrelevant... Maybe in a decade or two.

Was at an Alexa meetup. One of the senior Alexa developers was speaking. What was said blew my mind:

We want you to use conversational UI, on any device, Apple's, MSFT, Google, anyone you like. We think in the end, you'll have the best experience with Echo.

How often does a mega tech company encourage you to use the competition's products?

I think Amazon hit it out of the park. "Hey Google" just does not cut it for me.

Alexa and that sultry voice: there's just no comparison (IMHO).

Having worked on voice control systems, I was impressed by the consistency with which Alexa switches among tasks. If you are not attuned to what's going on, it feels perfectly natural to interrupt some tasks and return to them, while ending other tasks after briefly diverting to them. It's a voice/media/data access task management system that needs no visual indicators.

Digerati type overhypes overhyped tech... news at 11.

How is this on the front page? There is nothing of merit in the article.

Edit: HN title has been corrected.
