
> It’s important to note that all of this runs on your own gateway in your house. Google or Amazon can’t see when you turn on the light using your voice.

Is the RPi gateway capable of local speech recognition that can compete with Siri, Alexa, Google? That seems unlikely, unless it has a dedicated processor for that purpose.

I would guess not, but Mozilla will probably tie this in with their https://voice.mozilla.org/ project at some point.

Personally, I'm not one for using Pis as computers, because I have always-on computers at home anyway. I'm quite confident my computer can handle adequate voice recognition, though most companies today have avoided developing solutions that run locally.

It's pretty nice that an RPi only consumes .35W though. :)

EDIT: Fixed typo

An RPi 3B running headless with wifi enabled consumes around 1.2 to 1.5 Watts.

You're out by an order of magnitude. 35W over a 5V USB power connection would be an insane 7 amps.

It was off by two orders of magnitude, not one.
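For anyone who wants to check the arithmetic in this sub-thread, here is a quick sketch. The function names are mine, and it only uses the figures quoted above (35 W vs. 0.35 W, a 5 V supply) plus the usual I = P / V relation.

```python
import math


def current_amps(power_watts: float, voltage_volts: float) -> float:
    """Current drawn at a given power and supply voltage (I = P / V)."""
    return power_watts / voltage_volts


def orders_of_magnitude(a: float, b: float) -> float:
    """How many powers of ten separate two positive quantities."""
    return abs(math.log10(a / b))


if __name__ == "__main__":
    # The mistyped figure: 35 W over a 5 V USB supply.
    print(current_amps(35, 5))            # 7.0 amps
    # The intended figure: 0.35 W over the same supply.
    print(current_amps(0.35, 5))
    # Gap between the mistyped and intended figures.
    print(orders_of_magnitude(35, 0.35))  # roughly 2
```

So the typo (35 W) and the intended value (0.35 W) differ by a factor of 100, which is where "two orders of magnitude" comes from.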

Lol. *.35W I missed the decimal point.

https://mycroft.ai might be a viable option - I believe they are planning to use Mozilla's speech recognition work too.

> Is the RPi gateway capable of local speech recognition that can compete with Siri, Alexa, Google? That seems unlikely, unless it has a dedicated processor for that purpose.

We were doing local speech recognition and full voice dictation back in the '90s with products like Dragon NaturallySpeaking (https://en.wikipedia.org/wiki/Dragon_NaturallySpeaking). Back then a typical computer had a 100 MHz processor and would be lucky to have 32 MB of RAM. So yes, with 20 years of progress, the much more powerful Raspberry Pi should be able to handle it just fine.

Make no mistake: Siri, Alexa and Google are spyware. They send your information to themselves because they want to, not because they have to.

There is a pretty big difference between the level of speech recognition offered by these smart devices and what was being done 10-20 years ago. We're talking about going from 80% accuracy (missing every fifth word) to 95% accuracy today (the entire sentence will likely be perfect).
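To put rough numbers on what per-word accuracy means for a whole sentence, here is a back-of-the-envelope sketch. It assumes word errors are independent, which real recognizers do not guarantee, so treat it as an illustration only. The function name and the 10-word sentence length are my own choices, not anything from the comment above.

```python
def sentence_accuracy(word_accuracy: float, num_words: int) -> float:
    """Chance that every word in a sentence is recognized correctly,
    assuming (simplistically) that word errors are independent."""
    if not 0.0 <= word_accuracy <= 1.0:
        raise ValueError("word_accuracy must be between 0 and 1")
    if num_words < 0:
        raise ValueError("num_words must be non-negative")
    return word_accuracy ** num_words


if __name__ == "__main__":
    # Compare the two per-word accuracies quoted above on a 10-word sentence.
    for acc in (0.80, 0.95):
        print(f"{acc:.0%} per word -> {sentence_accuracy(acc, 10):.1%} per sentence")
```

Under that independence assumption, a 10-word sentence comes out fully correct about 11% of the time at 80% per-word accuracy and about 60% of the time at 95%, so the jump matters a lot more at the sentence level than the raw percentages suggest.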

I have worked with several speech recognition packages, and the ones that can run in real time on a low-powered device today are not the ones reaching 95% accuracy.

My experience has been different. I found Dragon back in the day to be about 90-95% accurate, but today I find Google to be about 50% accurate. And that's 50% accurate for some limited voice commands, which is a much lower standard than the full voice dictation that Dragon offered.

Dragon was trained for me; Google is trained for some mythical "everyone".

Who am I to argue with your experience? I can only offer the current literature on the topic [0]. (They report a 5.1% word error rate.)

> These systems typically use deep convolutional neural network (CNN) architectures ... driving the word error rate on the benchmark Switchboard corpus down from its mid-2000s plateau of around 15% to well below 10%.

[0]: https://arxiv.org/pdf/1708.06073.pdf
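Since "word error rate" is the metric being quoted here, this is a minimal sketch of how it is conventionally computed: the word-level edit distance (substitutions, deletions, insertions) divided by the number of words in the reference transcript. This is the textbook definition, not the paper's evaluation pipeline, and the function name and sample phrases are made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance between hypothesis and reference,
    divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # substitution, or match if cost is 0
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of four reference words.
    print(word_error_rate("turn on the light", "turn off the light"))
```

Note that this counts every word equally, so a recognizer can score well on WER and still get the one word that matters ("on" vs. "off") wrong.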

Unless the OS image they want to use is some sort of custom distro that limits what you can run on the Raspberry Pi, it probably isn't all that hard.

If nothing else, you could probably just hook up the AIY voice kit to it. If the software for the AIY voice kit runs on that OS image, it's pretty trivial to do it. There are multiple tutorials about how to make that happen.

What's the plan for turning your lights on while you're outside your house?

You can access the IoT gateway through a tunnel with an HTTPS endpoint that you configure during setup.
