I'm wondering what type of technology can be used to build something like Google Home without the privacy concerns from Google or the government listening.
I'm not sure what it's going to be, but I know we're on the verge of another "Microsoft vs Linux" moment. This is going to be a major OS that almost all people will have in their homes (my Echo is older than my son; he is growing up just expecting to be able to ask a computer to do what he wants). Bots will be as natural to him as mobile phones are to Generation Z and the internet is to Millennials.
Right now it seems the proprietary systems are winning the race, which makes sense. Financial incentives are strong... but I firmly believe an open source alternative is better (from a purely capitalistic point of view, I personally compete against Amazon... and the further we move from desktops the harder that is).
So, if someone asks for funding and has a good background, I'll throw a few bucks towards them.
I think the difference is that when you compete on hardware or software, a free alternative can eventually grow up through accretion, as long as enough people are interested in the topic and prepared to work on it, and there's nothing inherent that means what they create will be worse than what a large company would create.
Most of these new services are built on data, and large, preexisting companies have huge advantages for the datasets that they can call on to train their systems. It's going to be very hard to escape the data problem for open source software.
I'm also losing hope in some of this stuff, because everyone is obsessed with building products, when in fact the internet is more about protocols and standards for talking between heterogeneous systems. Chat is a depressing example of this.
The challenge is simple integration across many disparate components, which seems like a rich company's game.
The only chance for an open protocol is if it's so cool and immediately useful that people adopt it en masse, a la Snapchat/WhatsApp. Or the market gets regulated. But either of those faces a chicken/egg situation because the current market is too small to bother.
No, I think there are going to be some tech guys who refuse to buy it for "tin foil" reasons. Generally speaking, I think "normal" people will continue to adopt it as the value increases.
My own experience, though, has been pretty transformative. My mother, father-in-law, even my wife approached the device with extreme caution. However, as they watched me naturally interact with it, they all gradually warmed up to it... and now it's a must-have. The other day I walked into my kitchen and asked Alexa what the weather was outside, but she was unplugged. It was weird... kind of like calling your grandma only to realize she's not alive any more.
To me it's the fact that Amazon / Google is already making more than the cost of these things in the data they extract from every sound in every house they're in. They should be paying people to use them, not the other way around. I'd be interested in a free alternative even if it's just to say "what time is it" when I wake up and "weather" before I leave every morning and not have to press anything. The technology itself seems very easy to integrate into life. I'm just not trying to pay someone to eavesdrop on me just so they can be there to say it's raining outside.
Don't both Siri and whatever-Google-does have this very ability already? I would think that every household has at least one of those things even today.
There's a project called Jasper [1] that can use a variety of different speech-to-text backends and allows you to write plugins to process commands.
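To make the plugin idea concrete, here's a minimal sketch of a Jasper-style command module in Python. It follows the classic Jasper convention of a WORDS list plus isValid()/handle() hooks; exact hook names can differ between versions, so treat it as illustrative rather than as Jasper's definitive API.

    # Sketch of a Jasper-style command module (classic plugin convention;
    # hook names may differ between Jasper versions).
    import datetime
    import re

    WORDS = ["TIME"]  # keywords the speech-to-text layer listens for

    def isValid(text):
        # Return True if this module should handle the transcribed text.
        return bool(re.search(r"\btime\b", text, re.IGNORECASE))

    def handle(text, mic, profile):
        # Called when isValid() matches; `mic` exposes say() for spoken output.
        now = datetime.datetime.now().strftime("%I:%M %p")
        mic.say("It is %s right now." % now)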
For me the biggest issue is actually capturing audio. Something like Google Home or Amazon Echo will have one or more very good microphones in it, and trying to source those separately ends up being surprisingly expensive.
Looking at Jasper, it uses AT&T, Google, or Facebook (wit.ai) as a backend (I forget the other two; offline speech recognition is hard because it's a data problem: training your own models is hard and the resulting performance will be prohibitively poor).
If your intent is privacy, then using the backends offered by AT&T, Google, or FB doesn't fly.
Directions can be done offline just fine using openstreetmap data and a library like graphhopper. Music could and should be locally stored. And yes, the news has to come from a server, but this could be done anonymously instead of with the full range of google tracking.
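As a rough illustration of the offline-directions idea, here's a sketch in Python that queries a self-hosted GraphHopper instance loaded with local OpenStreetMap extracts, so no routing query ever leaves the house. The localhost URL and parameter names are assumptions based on recent GraphHopper releases; older versions use slightly different parameters.

    # Ask a locally hosted GraphHopper server (fed with OSM extracts) for a route.
    # Nothing is sent to an outside service; endpoint and params are assumptions
    # based on recent GraphHopper releases.
    import requests

    GRAPHHOPPER_URL = "http://localhost:8989/route"  # assumed local install, default port

    def directions(origin, destination):
        params = {
            # repeated "point" parameters: start and end as "lat,lon"
            "point": ["%f,%f" % origin, "%f,%f" % destination],
            "profile": "car",
            "points_encoded": "false",
        }
        resp = requests.get(GRAPHHOPPER_URL, params=params, timeout=10)
        resp.raise_for_status()
        path = resp.json()["paths"][0]
        return [step["text"] for step in path["instructions"]]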
I'm starting to get really annoyed at the tendency to put everything online when it doesn't need to be. Online-integrated devices are brittle and have a built-in expiration date. We should prefer devices for which online integration, if it exists at all, is optional and easily replaced by your own server.
>We should prefer devices for which online integration, if it exists at all, is optional and easily replaced by your own server.
This is something that's only going to be important to the tech-savvy. Your typical consumer wants idiot-proof instant gratification, and cloud-powered apps and devices provide that.
Privacy is not something only cherished by the tech savvy. The problem is, there is a lack of openness about the deal being engaged in when using these services. Consumers do not know what is being done with their data or even that it is being collected. That may be obvious to us, as engineers who understand it, but it isn't to your every day user. They are not stupid. It's just that our industry is not transparent and to be honest pretty damn deceptive in how customers are engaged.
I would argue that you cannot make a cloud-powered device idiot-proof. Inevitably the network will hiccup and you will have to be tech-savvy to fix it.
I predict a consumer backlash against smart devices once people realize just how dumb they become without the cloud.
True, right up until the internet stops working, in which case they really will be idiots for not having better (read: smarter) infrastructure.
I, for one, want google as far from me as is possible.
Now you need a pretty beefy server in your home to have all of that data, and be able to return results relatively quickly. The music could be an integrated NAS device though.
All of that said, most home users aren't going to want to setup a home server with a database in order to get directions.
It's not so dire. OSMAnd discards data it doesn't use and has worldwide coverage in about 25 gigabytes. That data needs to grow a lot to have better coverage for stuff like POIs, but 100 or 200 gigabytes is going to go a long way.
On my phone with 1 gigabyte of memory, it can calculate a several hundred mile route in a few seconds. That's not as cool as the instant routes you get from real actual beefy servers, but it isn't unusable.
With just a little complexity you could do local routes on device and call out to a server for longer routes. That dramatically lowers the data requirement and tightens up the calculation times.
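As a rough sketch of that split: route short trips entirely on the device and only fall back to a server for long ones. Here route_locally() and route_via_server() are hypothetical stand-ins, and the distance threshold is just a guess to be tuned to the hardware.

    # Hybrid routing: local for short trips, server only for long ones.
    # route_locally()/route_via_server() are hypothetical placeholders.
    from math import radians, sin, cos, asin, sqrt

    LOCAL_ROUTING_LIMIT_KM = 150  # assumption: tune to what the device handles

    def haversine_km(a, b):
        # Straight-line distance between two (lat, lon) points in kilometres.
        lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
        h = sin((lat2 - lat1) / 2) ** 2 + \
            cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
        return 2 * 6371 * asin(sqrt(h))

    def route(origin, destination, route_locally, route_via_server):
        if haversine_km(origin, destination) <= LOCAL_ROUTING_LIMIT_KM:
            return route_locally(origin, destination)   # stays on the device
        return route_via_server(origin, destination)    # only long trips touch the server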
With Spotify and YouTube, I access dozens of new songs every day, songs I have never heard before. Would I have to purchase each of these songs, download them, and put them on the device in advance?
There's nothing wrong with subscribing to a music service, as long as the music doesn't stop when the network does. That's why I say music should be stored locally, even if just in a cache.
Well, you could still use Google services. I think the largest concern is over the active microphone. If you specifically request something, that is different.
Also it would be nice to have control and choice over which services you use. Maybe Google or Amazon everything isn't the right choice. There's OpenStreetMap, DuckDuckGo, downloaded music... Google is very useful but it is far from the only resource out there. Thinking so would be delusional.
> If you ask it for directions, are you going to build your own mapping engine or just use google maps or another third party solution?
(In the following, "server" means computer owned by an outside entity, such as Google or Amazon, and "local" means on a computer owned by you, even if that computer of yours is acting as a server for your other devices)
It doesn't have to be all or nothing.
The maximum privacy-leaking method would be to have every direction query handled entirely on the server end, which means you are telling the server the source and destination of every trip you ask directions for.
The maximum practical privacy-preserving method would be to download map data for a wide geographical region and then do all direction calculations locally.
A middle ground would be that when you ask for directions, download map data from the server for the city or county regions necessary for that trip, and do the directions calculations locally, and save the downloaded map data. For subsequent trips, use the saved data, occasionally checking with the server to see if there are updates. Someone snooping at the server will know the first time you ask for directions involving a given city or county, but won't find out where specifically you are going. Depending on how the update checks work, they may also subsequently get an idea of when you make other trips in that area, but again will not get specific origin and destination.
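A sketch of that middle ground in Python, assuming hypothetical download_region(), regions_for_trip() and local_route() helpers: region files are fetched once, cached, refreshed occasionally, and all actual routing happens on the device, so the server only ever learns which regions you have asked about.

    # Cache map data per region; route locally. Helper functions are hypothetical.
    import os
    import time

    CACHE_DIR = os.path.expanduser("~/.cache/maps")
    MAX_AGE = 30 * 24 * 3600  # refresh cached regions roughly monthly

    def get_region(region_id, download_region):
        path = os.path.join(CACHE_DIR, "%s.osm.pbf" % region_id)
        stale = os.path.exists(path) and time.time() - os.path.getmtime(path) > MAX_AGE
        if not os.path.exists(path) or stale:
            os.makedirs(CACHE_DIR, exist_ok=True)
            download_region(region_id, path)  # the only thing the server learns
        return path

    def directions(origin, destination, regions_for_trip, download_region, local_route):
        regions = regions_for_trip(origin, destination)
        files = [get_region(r, download_region) for r in regions]
        return local_route(origin, destination, files)  # computed on-device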
> If you ask for the latest basketball news, are you going to search google or are you going to build your own web crawler?
A couple of possibilities for doing this:
One is to pull a comprehensive news feed from the server, and then locally filter and sort it.
Another is to use Google to search for sources of basketball news, and then subsequently you get your basketball news directly from those sites.
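A minimal sketch of the first option in Python: pull a broad feed and filter locally, so the server never learns which topics you follow. The feed URL is a placeholder; feedparser is the only dependency.

    # Pull a wide sports feed and filter for basketball on the device.
    # The feed URL is a hypothetical placeholder.
    import feedparser  # pip install feedparser

    FEED_URL = "https://example.com/sports/all.rss"
    KEYWORDS = ("basketball", "nba")

    def basketball_headlines():
        feed = feedparser.parse(FEED_URL)
        return [
            (entry.title, entry.link)
            for entry in feed.entries
            if any(k in entry.title.lower() for k in KEYWORDS)
        ]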
One answer is that in each of those cases there are alternative mapping/news/music services. An open-source, privacy-first device could be an intermediary that wraps other services and, if it got significant adoption, serve as an incentive for those services to compete on quality/price/privacy-awareness.
There are even startups, like ours (Diffbot), that are building general-purpose knowledge graphs for developers to use.
"""Search is cool, but Google-style universal search might be a bit overrated.
Google still relies on institutions to provide data in indexable form, and those institutions build their reputations in ways that only partially rely on Google traffic."""
(https://news.ycombinator.com/item?id=11729467)
The same could be said about Wikipedia. If we all chip in (code, governance and money), we can make it happen. I am part of two coops and one LLC that back FOSS projects. I even make a living out of it.
There may be data issues (Google can easily buy datasets), software issues (good algorithms and fighting spam) and probably many other issues. Many people probably do not feel enough incentive to re-create Google. It works well enough and is not irritating enough.
Then again, search and maps/directions are pretty much the only services I use. For mapping projects that do not require directions, I use OSM-based solutions. For email I use a home-hosted Zimbra VM, for chat I use Mattermost and IRC, for news I use a few websites (most aggregators suck and just become echo bubbles), etc.
Well, it depends on what you want to do. Good far-field speech recognition requires on the order of 10k hours of training data to build a good acoustic model for large-vocabulary problems. And it needs to be recorded with the same mic setup as you expect at run time.
I am continually frustrated by people that think speech recognition is a software problem.
It's really not the same thing, but I've just installed Home Assistant [1], and have been playing around with it today. It automatically picked up my Hue lights and Chromecasts, and I set up presence detection with Locative.
For voice recognition, it's just not worth trying to do something yourself. I'm still going to get an Amazon Echo, or maybe the Google Home when it comes out.
We at Athom, a smart home startup based in the Netherlands, are releasing the Homey very soon. It features an open app platform based on NodeJS, has support for Z-Wave, 868MHz, 433MHz, Zigbee, IR, WiFi and Bluetooth, not to mention it has some pretty decent voice recognition as well. It isn't strictly speaking open source as the core of our software is proprietary, but we're trying to contribute back to the community on every piece of open source software we do use. We've opened pull requests on various NodeJS projects, we have someone actively working on Linux kernel development (primarily drivers), and every protocol we add support for is open sourced on our GitHub account. We maintain the node-nfc npm module, and are contributors to a handful of other projects. We're about the closest you can come right now to a functioning smart home hub incorporating lots of open source elements, while still having the advantages of corporate backing.
Here's the basic question: if Athom is purchased by Amazon a year after I buy a Homey, how long will it continue to function after Amazon turns off your servers?
(The answer needs to be along the lines of: "the product will be accessible and configurable from the built-in web service and API until the hardware breaks. Extra services provided by Athom subscription will stop working, but you could provide many of them yourself if you have your own server and are competent to read our documentation.")
That is pretty much the answer, yes. Sure, some features will break, like out-of-the-box speech to text, but most of the core functionality will continue to work properly, and it definitely won't be a Revolv 2.0 :)
I look at stuff like Google Home and wonder what problem they're trying to solve, and whether they've actually made it disappear or just hidden it behind a wall of ever more complicated marketing.
None of it particularly scares me, but the older I get, the more I think things need to get simpler, and a voice-activated, centralized home control system isn't it. Especially when I'm 80 and have enough problems figuring out where my cats hid my glasses (I expect they'll probably have hidden them on my head - sneaky bastards).
Literally, the only product of the past decade that I can think of that nailed it was Nest. They decided to replace the thermostat and so they replaced the thermostat. Sure, it ties into your phone and whatnot, but the brains are in the little box you hang on the wall and you don't need the extra connectivity.
IoT sounds great, but, yeah. Light switch is simpler.
I was actually thinking about writing one meself. My plan looked something like this:
A Raspberry Pi with a Python script running a pipeline: microphone -> speech recognition (I was going to use Google's dictation service, but you could do it however you want) -> voice API.
The voice API will look something like a regular old REST API, but for humans: it'll have a bunch of trigger words that act like API endpoints (e.g. "play" for music, "turn the lights" for light control, etc.), with the rest of the query being fed in as the parameters. Anything unrecognized would be assumed to be a question and passed to WolframAlpha or something similar.
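Something like this dispatcher is the idea; the handlers and the WolframAlpha fallback are just placeholders:

    # Rough sketch of the "voice API": trigger words act like endpoints, the
    # rest of the transcript is the parameters, and anything unmatched is
    # treated as a question. Handlers and ask_wolfram_alpha() are placeholders.

    def play_music(query):
        print("playing: %s" % query)

    def control_lights(query):
        print("lights: %s" % query)

    def ask_wolfram_alpha(query):
        print("asking WolframAlpha: %s" % query)

    ENDPOINTS = {
        "play": play_music,
        "turn the lights": control_lights,
    }

    def dispatch(transcript):
        text = transcript.lower().strip()
        for trigger, handler in ENDPOINTS.items():
            if text.startswith(trigger):
                return handler(text[len(trigger):].strip())  # remainder = params
        return ask_wolfram_alpha(text)  # unrecognized -> assume it's a question

    dispatch("play some miles davis")
    dispatch("turn the lights off in the kitchen")
    dispatch("what is the airspeed of an unladen swallow")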