
Rhasspy is an open source, fully offline voice assistant toolkit - reedlaw
https://rhasspy.readthedocs.io/en/latest/
======
synesthesiam
Author here, happy new year HN! Glad to answer any questions (also see
[https://community.rhasspy.org](https://community.rhasspy.org))

Bit of background: Rhasspy was originally designed for Home Assistant
([https://www.home-assistant.io](https://www.home-assistant.io)), but now
works with lots of home automation projects (Hass.io, Node-RED, OpenHAB,
Jeedom). Its sister project, voice2json
([http://voice2json.org](http://voice2json.org)), is for command-line use and
has fewer options.

With Snips.ai being bought by Sonos, we're now focusing on compatibility with
its MQTT protocol
([https://docs.snips.ai/reference/hermes](https://docs.snips.ai/reference/hermes))
so existing plugins/skills will just work. Supporting Snips-like
number/duration/dateTime slots across over a dozen languages is going to be a
major challenge, so please reach out if you speak a language besides English*
:)

* Also consider donating to the Common Voice project: [https://voice.mozilla.org](https://voice.mozilla.org)
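
To make the Hermes compatibility above concrete, here is a minimal sketch of how a skill might read an intent message. It assumes the Snips convention of publishing intents on `hermes/intent/<intentName>` with a JSON payload that has `intent` and `slots` fields. The `Intent` class, the `parse_hermes_intent` helper, and the sample payload are hypothetical names made up for illustration, and only simple slots that carry a `value` key are handled.

```python
import json
from dataclasses import dataclass


@dataclass
class Intent:
    name: str
    confidence: float
    slots: dict


def parse_hermes_intent(topic: str, payload: str) -> Intent:
    """Parse one message received on hermes/intent/<intentName> (assumed layout)."""
    prefix = "hermes/intent/"
    if not topic.startswith(prefix):
        raise ValueError(f"not an intent topic: {topic}")
    message = json.loads(payload)
    intent = message.get("intent", {})
    # Each slot wraps a typed value object; keep only the resolved value.
    # Richer slot kinds (durations, dates) use other keys and yield None here.
    slots = {
        slot["slotName"]: slot.get("value", {}).get("value")
        for slot in message.get("slots", [])
    }
    return Intent(
        # Fall back to the topic suffix if the payload omits the name.
        name=intent.get("intentName", topic[len(prefix):]),
        confidence=float(intent.get("confidenceScore", 0.0)),
        slots=slots,
    )


# Hypothetical payload for a "SetTimer" intent with one number slot.
EXAMPLE_PAYLOAD = json.dumps({
    "input": "set a timer for five minutes",
    "intent": {"intentName": "SetTimer", "confidenceScore": 0.9},
    "slots": [
        {
            "slotName": "minutes",
            "rawValue": "five",
            "value": {"kind": "Number", "value": 5},
        }
    ],
})

if __name__ == "__main__":
    print(parse_hermes_intent("hermes/intent/SetTimer", EXAMPLE_PAYLOAD))
```

In a real setup the topic and payload would come from an MQTT client callback rather than a hard-coded string.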

~~~
wavefunction
Just an aside, neither the logo nor the name make me think of "spee."
Definitely going to check this out though, thanks!

------
jszymborski
This looks like a lot of fun to hook up to something like this 6 mic hat for
the Raspberry Pi

[https://respeaker.io/6_mic_array/](https://respeaker.io/6_mic_array/)

~~~
davewongillies
The ReSpeaker Core v2.0 looks like it could be fun too:

* Debian-Based Linux System

* SDK for Speech Algorithms with Full Documents

* C++ SDK and Python Wrapper

* Speech Algorithms and Features

* Keyword Spotting (Wake-Up)

* BF (Beamforming)

* DoA (Direction of Arrival)

* NS (Noise Suppression)

* AEC (Acoustic Echo Cancellation) and AGC (Automatic Gain Control)

* All-in-One Solution with High Performance SoC

* 8 Channel ADC for 6 Microphone Array and 2 Loopbacks (Hardware Loopback)

[https://respeaker.io/rk3229_core/](https://respeaker.io/rk3229_core/)

------
cmer
This is really cool!

Does anyone know if there's a way to hack an Echo Dot and use it as the
speaker/mic for Rhasspy? Rolling out our own hardware that is as effective as
a Dot would probably be very difficult?

~~~
prpl
I was looking this up today without much luck. I’d actually like to try to use
either the echo dot or google home for generic videoconferencing (zoom)

~~~
jedieaston
Idk if this is something that’d interest you, but you can use Alexa for
Business in AWS for $3 per month and link your echo to zoom.

------
_underfl0w_
Similar project: Voice2JSON

Website: [https://voice2json.org](https://voice2json.org)

GitHub:
[https://github.com/synesthesian/voice2json](https://github.com/synesthesian/voice2json)

(I am not affiliated but am using it in my own pet project)

~~~
synesthesiam
Correct link:
[https://github.com/synesthesiam/voice2json](https://github.com/synesthesiam/voice2json)

For those wondering, Rhasspy and voice2json are from the same author (me). If
you want a command-line tool for voice assistant tasks (wake word detection,
speech to text, intent recognition, etc.), check out voice2json.

See the recipes for some interesting things you can do with voice2json:
[http://voice2json.org/recipes.html](http://voice2json.org/recipes.html)
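
As a rough illustration of the command-line style described above, the sketch below consumes intent events written as one JSON object per line and hands each one to a handler. The field names (`intent.name`, `slots`) are assumptions about the output format, and `dispatch`, `HANDLERS`, and the sample lines are hypothetical names for illustration only.

```python
import json
from typing import Callable, Dict, Iterable, List

Handler = Callable[[dict], str]


def dispatch(lines: Iterable[str], handlers: Dict[str, Handler]) -> List[str]:
    """Route each JSON line to the handler registered for its intent name."""
    results = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the stream
        event = json.loads(line)
        name = event.get("intent", {}).get("name", "")
        handler = handlers.get(name)
        if handler is None:
            results.append(f"unhandled: {name or 'no intent'}")
        else:
            results.append(handler(event.get("slots", {})))
    return results


# Hypothetical events, shaped like the assumed one-object-per-line output.
SAMPLE_LINES = [
    '{"text": "turn on the light", "intent": {"name": "ChangeLightState"}, "slots": {"state": "on"}}',
    "",
    '{"text": "what time is it", "intent": {"name": "GetTime"}, "slots": {}}',
]

HANDLERS: Dict[str, Handler] = {
    "ChangeLightState": lambda slots: f"light {slots.get('state', 'unknown')}",
}

if __name__ == "__main__":
    for result in dispatch(SAMPLE_LINES, HANDLERS):
        print(result)
```

At the end of a shell pipe, the same function could be fed `sys.stdin` instead of the sample list.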

------
ocdtrekkie
I recently came across Rhasspy, and while I haven't had time to play with it,
I'm super excited. Often I like the sound of certain projects but want to plug
in my own parts. Rhasspy appears to glue all the parts of a modern voice
assistant together, but let you swap out any of the parts.

~~~
synesthesiam
Going forward, Rhasspy is being split into multiple MQTT services that should
be compatible with Snips.ai's protocol
([https://github.com/rhasspy](https://github.com/rhasspy))

This should make it much easier to swap out parts, and distribute the
computing across multiple devices.
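
One way to picture the split into services: each service subscribes to its own slice of the topic tree, so any one of them can be replaced or moved to another machine. The sketch below implements standard MQTT wildcard matching (`+` for one level, `#` for the remainder) and applies it to a few Hermes-style topics. The `ROUTES` table and service names are illustrative assumptions, not Rhasspy's actual layout, and the special handling of `$`-prefixed topics is left out.

```python
from typing import List


def topic_matches(pattern: str, topic: str) -> bool:
    """Return True if an MQTT topic filter matches a concrete topic."""
    p_levels = pattern.split("/")
    t_levels = topic.split("/")
    for i, p in enumerate(p_levels):
        if p == "#":
            return True  # multi-level wildcard: matches everything below
        if i >= len(t_levels):
            return False  # the filter is longer than the topic
        if p != "+" and p != t_levels[i]:
            return False  # literal level mismatch
    return len(p_levels) == len(t_levels)


# Hypothetical mapping of topic filters to the service that handles them.
ROUTES = {
    "hermes/hotword/+/detected": "wake word service",
    "hermes/asr/#": "speech-to-text service",
    "hermes/intent/#": "intent handling service",
}


def route(topic: str) -> List[str]:
    """List every service whose filter matches the topic."""
    return [name for pattern, name in ROUTES.items() if topic_matches(pattern, topic)]


if __name__ == "__main__":
    for t in ("hermes/hotword/default/detected", "hermes/asr/textCaptured", "hermes/tts/say"):
        print(t, "->", route(t))
```

Because routing depends only on topic names, a service on another device just needs to connect to the same broker and subscribe to the same filter.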

------
Clamydo
Have you considered Mozilla's DeepSpeech for Speech2Text? Since version 0.6 it
seems to be a viable option for a raspberry pi.

[https://hacks.mozilla.org/2019/12/deepspeech-0-6-mozillas-sp...](https://hacks.mozilla.org/2019/12/deepspeech-0-6-mozillas-speech-to-text-engine/)

~~~
synesthesiam
We have, though my tests prior to 0.6 were not as promising as I'd hoped. With
0.6, though, we're planning to add support for English and French (a German
model is apparently in the works:
[https://github.com/AASHISHAG/deepspeech-german/issues/3](https://github.com/AASHISHAG/deepspeech-german/issues/3)).

------
darepublic
This is inspiring; hopefully I'll find some time in the new year to dig into
this stuff. I feel like there will be a sort of arms race between open source
and top tech companies around AI and privacy. Projects like this are needed, imo.

------
lallysingh
How good is the recognition?

~~~
catalogia
It seems it uses pocketsphinx or kaldi. I've never tried kaldi but I've tried
pocketsphinx before and didn't find it accurate enough to be useful.

~~~
woodson
Kaldi is much much better, but when used for low latency recognition on the
device, the accuracy will be lower than it could be because one would use much
smaller models adapted to the constrained processing power of the device.

It would still be much better than pocketsphinx.

------
brink
Does anyone know if this is viable on a pi zero?

~~~
potato_penguin
No, based off of their hardware requirements. See here.
[https://rhasspy.readthedocs.io/en/latest/hardware/](https://rhasspy.readthedocs.io/en/latest/hardware/)

~~~
magicalhippo
Apparently it's work in progress:
[https://github.com/synesthesiam/rhasspy/issues/61](https://github.com/synesthesiam/rhasspy/issues/61)

Sounds like it's not too far off.

------
drKarl
Mmm if there were only IP microphones to connect to something like a Raspberry
Pi (an Odroid H2 in my case) to have multiple mics, one in each room without
the need for multiple servers...

~~~
jdboyd
AES67 microphones exist. Shure, Audio-Technica, and Audix are just 3
manufacturers that make them. Some of them are called Dante mics and you have
to turn on AES67 in a configuration menu. I don't know that there are any good
AES67 drivers in stock Debian or Raspbian, but a lot of AES67 devices can be
used with any software that supports rtsp. Whether one install of rhasspy can
handle multiple streams at once is a different question. Also, even the
cheapest AES67 equipment is expensive enough that a new raspberry pi or odroid
h2 per room would be cheaper.

------
donatj
How is that useful? Most of what I use assistants for is locating data
online while I'm not on my phone.

~~~
0xdeadb00f
By "offline" I think they mean the speech recognition and processing is
offline and requires no third-party servers. I'm fairly certain you can assign
voice commands to do stuff online.

------
olabyne
Kalliope, another open-source voice assistant
([https://github.com/kalliope-project/kalliope](https://github.com/kalliope-project/kalliope)),
also has the option to use an offline wake-word backend like snowboy.

------
iamsrp
Neat! This also looks a little related to a (very much) toy project of mine
([https://github.com/iamsrp/dexter](https://github.com/iamsrp/dexter)). I
might try to look to see if I can hook them together..!

------
GordonS
This looks great!

I'm curious about extensibility - would it be possible to integrate with a C# app
running on Windows, for example?

I'm particularly interested for accessibility reasons, looking for ways to
control tools like JetBrains Rider without shifting my hands from keyboard to
mouse.

~~~
synesthesiam
I haven't ever tried running this on Windows. You may get lucky with Docker,
but audio input might be difficult. A workaround might be to stream audio in:
[https://rhasspy.readthedocs.io/en/latest/audio-input/#gstrea...](https://rhasspy.readthedocs.io/en/latest/audio-input/#gstreamer)

------
mister_hn
Similar project (quite old): CMU Sphinx

~~~
roel_v
Seems like this project uses sphinx as the actual recognition engine. This is
just a wrapper.

------
nunofgs
I too am super interested in an offline-only voice assistant but really don't
want to bother with setting up a mic connected to a pi. Even though it's not
super hard, it'll never be as good as the commercially available options.

I think this project would really benefit from taking one of the excellent
existing voice assistant/speakers on the market (Google Home, Echo Dots, etc.)
and flashing them with some custom firmware.

~~~
Iv
> Even though it's not super hard, it'll never be as good as the commercially
> available options.

Arguably, being offline and keeping your recordings off the cloud, it is
already superior to commercially available options.

~~~
vunie
Exactly. I've never considered voice assistants because I don't want
recordings of me or my family being used for who knows what.

I've just looked over the docs. I'll probably be playing with this very soon.

~~~
pts_
Interestingly the storage on my phone was used up and Google Keyboard kept on
crashing, so I had to use voice for one whole night.

~~~
efreak
Having run into similar problems multiple times, the solution is to clear
cache on an app. If, like me, sorting your apps by storage usage never
actually completes[1], browsers are a good thing to check first[2]; Unity
games are a good second (those analytics pile up, even if the app is
firewalled). If you're rooted, check the analytics in Chrome's private data
folder; it was taking up 1.5 GB on my tablet last time (not sure if this is
cleared when you clear cache or not)

If the primary issue is the same as mine, that you download too much crap
without paying attention to available storage, just use the terminal emulator
to create a 25-100 MB file you can delete when necessary.

[1] storage → internal → apps [2] If you use PWAs to cache data and avoid high
data bills, keep in mind that clearing browser cache clears PWA cache as well
(my most-used PWAs are for hn and xkcd)
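
The placeholder-file trick mentioned above could look roughly like this in Python, for anyone running it from a terminal environment instead of a shell one-liner. The function name `create_ballast` and the default size are hypothetical. The file is filled with real zero bytes rather than created sparse, on the assumption that the point is to actually occupy the space.

```python
import os


def create_ballast(path: str, size_mb: int = 50) -> int:
    """Write size_mb megabytes of zeros to path and return the final size in bytes."""
    chunk = b"\0" * (1024 * 1024)  # 1 MiB per write keeps memory use small
    with open(path, "wb") as f:
        for _ in range(size_mb):
            f.write(chunk)
        f.flush()
        os.fsync(f.fileno())  # make sure the blocks are really allocated on disk
    return os.path.getsize(path)


if __name__ == "__main__":
    size = create_ballast("ballast.bin", size_mb=50)
    print(f"reserved {size} bytes; delete ballast.bin when storage runs out")
```

Deleting the file later frees the space immediately, which is enough headroom to open settings and clear caches.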

