Hacker News
Leon: An open-source personal assistant (github.com)
332 points by yvonnick on Feb 15, 2019 | 64 comments

I highly recommend Rasa Core (https://github.com/RasaHQ/rasa_core) if you are looking for an open-source virtual assistant. Very active and helpful community, and lots of input channels for integrating your assistant with messaging platforms (https://rasa.com/docs/core/connectors/).

I am not affiliated with Rasa, just had a really good experience developing a few projects with it.

I'll +1 Snips or Rasa, they're both really nice. It looks like the NLU part of Leon is a logistic regression classifier (https://github.com/leon-ai/leon/blob/360d1020c4bd8bf1df37646...), so it's just doing intent detection, not any slot filling. Maybe someone could add calls to Rasa's HTTP API (https://rasa.com/docs/core/server/#) to integrate it with Leon?
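Since the classifier itself isn't shown here, a toy sketch of what logistic-regression intent detection looks like. The training phrases are made up for the example and scikit-learn stands in for whatever Leon actually uses; the real implementation is in the linked file:

```python
# Toy illustration of intent detection as text classification:
# TF-IDF features fed into a logistic regression, one label per intent.
# The training phrases below are invented for this example.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

training = [
    ("hello there", "greeting"),
    ("hi how are you", "greeting"),
    ("tell me a joke", "joke"),
    ("make me laugh with a joke", "joke"),
    ("what is the weather like", "weather"),
    ("is it raining today", "weather"),
]
texts, intents = zip(*training)

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(texts, intents)

# Note this is intent detection only: the whole utterance maps to one
# label; nothing (e.g. *which* city a weather question is about) is
# extracted -- that missing piece is the slot filling mentioned above.
print(clf.predict(["hi there"])[0])
```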

How well do these compare to Mycroft[1]?

[1]: https://mycroft.ai/

I have had a Mycroft Mark I for about 6 months, and have had a Mark II on order since I ordered them as a special bundle about a year ago. My feeling is Mycroft is aiming to be more of a standalone product, closer to the base-level Echo or Google Home, as opposed to a framework for building things.

I've looked at some of the other projects, but one thing that appealed to me about Mycroft was that they offered a complete hardware device. The Mark I is just a Raspberry Pi in a 3D-printed case with custom lights, whereas the Mark II is going to be more custom hardware (I think) whenever it's finished. If you buy one of their devices you could get it running just using their web interface and buttons, without any knowledge of git or ssh.

They have some ambitious plans, but their execution leaves something to be desired. The base-level functionality is pretty good and has been reliable for me on a daily basis. It falls back to Wolfram Alpha and Wikipedia, so you can ask it general questions like unit conversions, ages of famous people, weather and such. The results are nowhere near as polished as Alexa, but it's surprisingly responsive on random subjects.

My main problem has been with the plugins. I got it working with Kodi and Home Assistant and life was good, but as they have upgraded the project at various times these plugins have randomly stopped working for months at a time, and delving into why really drags you into more complicated troubleshooting. Other times it will randomly update and hang until you log in via ssh and apt update the system to get it going again.

That said, since they got their plugin marketplace working recently, it has been pretty reliable and easier to get new plugins working.

You can build their product on your own (I believe you could even build the complete Mark I), but they have some odd "premium" options like donating $2/month to get their newer voices.

They have a goal of privacy, but I believe the default processing for voice is Google, so Big Brother isn't far away. They are working on a private backend (https://github.com/MycroftAI/personal-backend), and you can choose other voice backends, so it should be pretty decent from a privacy perspective. At least good enough for me.

I've played with it a bit over the last year, and I can't help but assume I'm doing something wrong, but I just can't get it to work reliably. I installed it on a Raspberry Pi 3B+ with a 7-mic ReSpeaker mic array, and it only recognizes the wake word around 10% of the time, and then only if I speak really slowly and enunciate as if I were talking to a small child. I tried just recording audio from the mic and playing it back to see if that was the problem, but it sounds crystal clear.

In contrast, a Google Home (which I really want to replace with something I trust more) sitting in the same location recognizes its wake word nearly 100% of the time.

Haven't looked at Mycroft before. It looks like Mycroft exposes less of the nuts and bolts of modeling? I'm not sure where I would plug in a custom entity extraction or intent detection model, but I do see that it lets you add custom 'skills'.

Yes, this can all be handled with skills and the dictionaries that go along with them. I don't even program much in Python, but I've found it pretty simple to make add-ons for Mycroft that have multiple entities for different intents.

Rasa seems to be text only though?

Or am I missing something?

Correct, Rasa is text-only. The core purpose of Rasa NLU (https://github.com/RasaHQ/rasa_nlu) is translating user text into an intent (with entity extraction); Rasa Core then maps the user intent to an assistant response. Speech-to-text is a layer above these, and Rasa doesn't handle it.

The easiest way to use it with voice is to connect it to a platform that supports voice input.

The webpage seems to be missing a list of capabilities.

If you're looking for it you need to infer it from here: https://github.com/leon-ai/leon/tree/master/packages

Yeah, the whole page focuses on how it's built, not so much on problems it solves. From the link you posted, it seems to:

* say "hello"

* tell jokes

* generate random numbers

* download videos from a given YouTube URL

* check status of a domain

I can't see why any of those need AI.

> I can't see why any of those need AI.

The AI piece in virtual assistants usually doesn't refer to the skills themselves but to the process of matching what is spoken (as audio) to a skill and passing that skill the appropriate context, without a developer having to program in all the hundreds of thousands of ways someone can ask for the same thing.

With that said, I'm not sure how well this assistant does that. None of these skills seem to use context at all. The sentence structures that map to a skill seem relatively limited, or at least not challenged by the initial skills.
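To make the intent-plus-context distinction concrete, here is a toy hand-written matcher (hypothetical patterns, not anyone's actual code). Real assistants train models precisely so nobody has to enumerate phrasings like this by hand:

```python
import re

# Toy illustration of "slot filling": map an utterance to a skill and
# pull out the context (slots) the skill needs. Hand-written regexes
# break as soon as the user phrases the request differently -- which is
# exactly the problem the trained models are there to solve.
PATTERNS = [
    (re.compile(r"(?:what's|what is|tell me) the weather in (?P<city>\w+)"),
     "weather"),
    (re.compile(r"set (?:a|an) (?P<minutes>\d+)[- ]minute timer"),
     "timer"),
]

def match_skill(utterance):
    """Return (skill_name, slots) for the first matching pattern."""
    for pattern, skill in PATTERNS:
        m = pattern.search(utterance.lower())
        if m:
            return skill, m.groupdict()
    return None, {}

print(match_skill("Tell me the weather in Paris"))
print(match_skill("set a 5 minute timer"))
```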

Here is the demo: https://www.youtube.com/watch?v=p7GRGiicO1c

Doing speech-to-text and text-to-speech in an accurate and convincing way is extremely difficult, and it's cutting-edge AI.

A big part of what the Leon dev has done is create a way to plug in to different providers for TTS and STT. In my opinion this is quite useful because it takes care of all of the hard parts. So it's a good starting point for creating your own personal assistant with the features that you think are useful.

The tough thing, though, is that the open-source stuff out there that you can install offline for free (e.g. DeepSpeech and Flite or similar) is pretty inferior to the cloud-based or proprietary stuff. However, having the provider framework is a good starting point for comparing different systems that can plug in to fill that role. And hopefully, with this packaging as a personal assistant ready for offline mode, it will bring more attention to the open-source TTS and STT engines/voices/datasets that are out there, and people will improve them.
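As a sketch of the kind of provider framework being described (hypothetical class names, not Leon's actual API): each engine sits behind one small interface, so the offline engine can be swapped for a cloud one via configuration.

```python
# Hypothetical sketch of an STT provider framework (not Leon's actual
# code): every engine implements the same interface, so the assistant
# core never depends on a particular engine.
class STTProvider:
    def transcribe(self, audio: bytes) -> str:
        raise NotImplementedError

class DummyOfflineSTT(STTProvider):
    """Stands in for an offline engine such as DeepSpeech."""
    def transcribe(self, audio: bytes) -> str:
        return "hello leon"

class DummyCloudSTT(STTProvider):
    """Stands in for a cloud speech API."""
    def transcribe(self, audio: bytes) -> str:
        return "hello leon"

PROVIDERS = {"offline": DummyOfflineSTT, "cloud": DummyCloudSTT}

def get_stt(name: str) -> STTProvider:
    # A single config value picks the engine; the rest of the
    # assistant only ever sees the STTProvider interface.
    return PROVIDERS[name]()

print(get_stt("offline").transcribe(b"\x00"))
```

This is also what makes side-by-side comparison easy: benchmark two providers by running the same audio through each implementation of the interface.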

I think the most interesting part of this is the offline capability. From looking at the source code, it looks like the text to speech is provided by http://www.festvox.org/flite/ and the speech to text uses https://github.com/mozilla/DeepSpeech.

How do those compare to the major online providers?

> I think the most interesting part of this is the offline capability.

I agree -- that's the aspect that makes this potentially appealing to me.

> He does stuff when you ask for it.

Is the name a homage to the Luc Besson movie starring Jean Reno? If so, he'd better be able to kill a process (if not a real person) when the user asks for it ;)

The video seems to allude to it.

I got excited when I saw the core was JS (as writing TypeScript/JS is something I enjoy), then I saw that all modules are written in Python... That doesn't seem to make a lot of sense to me. Allowing for any language, or a subset of languages, makes sense, but not writing the core and the plugin system in different languages.

I was disappointed that this was implemented in JS and Python. If I decide to experiment with this, the first thing I'll do is port it to a (compiled) language that I prefer.

I did not see the Python code. I do see a bunch of JavaScript and a Node.js package.json. Where is the Python code?

Unless I misread the docs, offline use would be restricted to text only. Is that limitation temporary? I would surely have some uses for one of these things, but there's no way I'm putting a closed-source or cloud-connected one in my home, and not even an open-source one if it uses any external services I can't trust.

I looked at the source and there are scripts for offline that install Mozilla DeepSpeech for speech-to-text and CMU Flite for text-to-speech. I put links in my other comment in this thread.

I've been planning to set up something like this for myself for quite some time now. Can anyone tell me what makes this different from something like Snips (https://snips.ai)?

I have a longstanding view that the best personal assistant is one you write for yourself: It can cover your needs and respond to how you'd prefer to interact with it. A lot of sophistication isn't needed because it doesn't have to try to meet a lowest common denominator user.

Absolutely. I think hackability is the best part of open source.

I really hope Snips manages to launch some hardware like the Echo at an affordable price point. Their current kit is too expensive, but the Echo sucks, and I much prefer writing my own intents on an open-source system that runs on-device.

Snips is totally cloud-based and proprietary as far as I can tell. Leon has offline capabilities and also the ability to switch between different cloud providers.

I got the impression snips was offline.

Right, I just looked again. They have an online tool for building assistants or something, but it does seem to run offline.

You might like Project Naomi.

1. How is it different from the existing open-source natural language parsers?

2. What problem does it solve? Have you tried to create an NLU platform which can be extended? Have you tried to create a personal assistant? If yes, what specific use cases does it address?

Do any of you know of speech-to-text software that runs continuously, works on Linux, accepts a PulseAudio input (i.e. a microphone) rather than files, and outputs text as a stream?

Kaldi has a couple of open-source online LVCSR models which can definitely do live decoding. I'm not 100% sure there is support for PulseAudio, though; you may need an auxiliary service to pipe it in.

The easiest interface for it is via GStreamer.
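The command the comment above was pointing at seems to have been eaten by HN's formatting. A sketch of what such a pipeline typically looks like, using the kaldinnet2onlinedecoder element from the gst-kaldi-nnet2-online plugin; the property names and model file names here are assumptions to check against that plugin's README:

```shell
# Hypothetical sketch: live decoding from a PulseAudio microphone with
# Kaldi's GStreamer plugin (gst-kaldi-nnet2-online). Property names and
# model paths are assumptions -- check the plugin's README for your model.
gst-launch-1.0 pulsesrc ! queue ! audioconvert ! audioresample \
    ! kaldinnet2onlinedecoder \
        model=final.mdl fst=HCLG.fst word-syms=words.txt \
    ! filesink location=/dev/stdout
```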


Oooh, this looks promising indeed!

So, I get that most folks aren’t tuned in to this and I will sound unreasonable, but I’m utterly tired of robot-like things being given gender. Every time I read “he”, “him” when learning about a piece of software I gag a little.

I know someone will tell me I am wrong. But I’m not asking or telling anyone to change. I just want to share what unnecessary gendering looks like to me. And for the record, all my robots are non-gendered (“Scout, Skittles, Rover”).

Perhaps this is unreasonable as well, but maybe ongoing developments in society have caused you to focus a little too much of your attention on gender.

Gender dysphoria is a recognized medical condition, for which there are suggested plans of treatment--up to and including hormone therapy and reassignment. Just go with the treatment, and be done with it, and accept that people who have done so were trying to make themselves better. I don't know why society is making it such a hot-button political issue.

(Actually I do know why... politicians need to find wedge issues to split and enthuse people, to get votes.)

I’m not motivated by politicians (I hardly ever listen to them or endorse them).

I’m motivated by the suffering I see in our society experienced by gender nonconforming individuals who feel uncomfortable in a world that believes in the gender binary.

It’s interesting that you mention treatment. I’m not sure if you know any gender nonconforming people, but “just go with the treatment and be done with it” doesn’t really reflect accurately the experience I’ve seen other people have. Hormones and surgery can treat the symptoms of dysphoria, but the cause seems to be society and the expectation that all people fit into a binary. (Please ask if you’d like me to elaborate on that.) So here I am prescribing a treatment for society: stop gendering things unnecessarily and stop supporting a hard gender binary.

There are fewer gender nonconforming individuals than amputees. Let's tackle the big-ticket items first.

51% of the population are women, who suffer from these gender roles. That's a super big-ticket group right there.

As a woman, I can tell you I don’t suffer at all from gender roles.

I am completely fine with gendered personal assistants, which was the original topic being discussed.

And I don’t buy all this political crap regarding gender.

Stop using all women to support your political agenda. Stop speaking for the 51%. Not all of us support you, don’t speak for all of us.

Believe it or not, the other 49% also suffer from these "gender roles".

That's an entirely different issue from what we're discussing.

Actually it's because gender is a hot-button topic, and not just political. Some of us think that everyone pays too much attention to gender, and those who think they don't are generally male and don't have to think about it because everyone follows the "rules" so it all looks normal and no big deal. But it's ever-present, and I think it's quite reasonable to try to ungender things, particularly things that have no biological sex and are unnecessarily gendered in the first place. Gender is insidious and oppressive for almost everyone and I appreciate that the OP balks at it.

You can choose between "he", "she" and "it", at least in regular English, leaving aside the 21st-century additions. "It" usually refers to things, so if you want to humanize your bot/robot, as many people tend to, you're stuck with picking a gender.

At least in English you don't have nouns automatically selecting gender for you, like in other languages.

> You can choose between "he", "she" and "it", at least in regular English, leaving aside the 21st-century additions.

Or "they". People seem to think that singluar they is some kind of modern invention, but most people also use it all the time without thinking when talking about people that they don't know the gender.

Not to mention it's used by Shakespeare and the King James Bible.

Is it just me, or is humanizing these things so much itself kind of creepy? I prefer not to call Siri "she" for that reason alone.

I go all-in on the weirdness and make my Siri call me "master" just for shits and giggles

I too noticed the genderedness of the description, but I don't know if it matters. I bristled at it with the thought of "Why does it have to be a 'he'?" and then thought, well, if it were another "she", would I be thinking "Oh, so only women can be 'assistants'?!" I get that non-gendered is probably best, but I wouldn't read too much into it. The author wrote this, it appears, for themselves, and they can call it whatever they want as far as I'm concerned.

For my Google Assistant, I've made it a point to select a male voice, and I always refer to it as the Google Assistant or just Assistant. The choice in voice gender was so that my wife could keep the default female voice, and we would know it's responding under her account when she talks to a Home in our house vs defaulting to my account like it does for me or guests.

Having said that, I've never heard a genderless voice (although there have been times when I'm on the phone and cannot make out the other person's sex from just their voice). I think this idea that everyone should stop considering gender as binary simply due to the apparent increase in gender dysphoria is a bit ridiculous. Creating the voice for a digital assistant is a perfect example of this, since the big companies involved have probably done a lot of research to try to determine the best way to make one that doesn't lend itself to a particular gender, but they always seem to default to a female voice/character (excluding this "Leon").

> I think this idea that everyone should stop considering gender as binary simply due to the apparent increase in gender dysphoria is a bit ridiculous.

But gender isn't inherently binary, even if many (but not all) cultures have tended to ascribe gender from a binary palette.

> Creating the voice for a digital Assistant is a perfect example of this, since the big companies involved have probably done a lot of research to try to determine the best way to make one that doesn't lend itself to a particular gender but always seem to default to a female voice / character (excluding this "Leon").

They probably haven't aimed for genderlessness, and in fact have probably avoided it, because real people have genders (whether or not aligned in the stereotypical way with external sex traits) and conforming to social expectations improves reception. (And, at least in America, there is significant research showing that, across genders of listeners, people on average respond more positively to feminine voices, which is why the default usually is a feminine voice.)

> But gender isn't inherently binary, even if many (but not all) cultures have tended to ascribe gender from a binary palette.

You say that, but based on what? It's not a construct that makes any sense, except with regard to mental disorders (gender dysphoria). The mental disorder context is particularly relevant when you look at the wildly abnormal rates of suicide attempts among gender dysphoria sufferers.

> I get that most folks aren’t tuned in to this and I will sound unreasonable

> someone will tell me I am wrong

I know "+1" style comments are generally dissuaded on HN, but just chiming in to say that not everyone thinks this sounds unreasonable. It's a topic well worth opening discussions on.

> And for the record, all my robots are non-gendered (“Scout, Skittles, Rover”).

All of my robots are named "Larry"[0]

[0] https://bugs.gentoo.org/27727

> I’m utterly tired of robot-like things being given gender.

I would actually prefer it if humans weren't given a gender either. Obviously we should recognise that people have different physical characteristics, but the abstraction of genders with different societal roles doesn't seem like a very useful one in this day and age.

I agree wholeheartedly! I make my robots non-gendered as a kind of point that assigning gender to things that can't affirm it, or haven't, is unhelpful. That includes baby humans. :)

I mean, I think it's pretty unhelpful even for adult humans. If someone tells me that they are a "man" or a "woman", that tells me... pretty much nothing about them. So why bother?

You can actually pretty much do what you want, regardless of your gender. It is a myth that your "sex" or "gender" forces you into certain roles. Your physical attributes may influence your roles, to some extent (giving birth, lifting heavy stuff,...).

There are definitely segments of society that have different expectations for people they categorise as "men" and "women" though. I certainly come across "you're a man, can you help me open this jar" (even though I'm physically weaker than a lot of the women I live with), and plenty of people thinking that because I'm a man that I won't be caring/empathetic.

The trend of the large assistants (Google Home, Alexa, Cortana) to all be female-voiced by default is pretty awful. Recently I tried to change the settings on an Alexa to try to get it to answer in a male voice, but I don't think it's possible.

I think the explanation is more practical and less nefarious than you are implying: female voices are often easier to hear and understand, especially with background noise.

Maybe it's different for other people, but that's certainly the case for me, in movies, lectures, radio, and synthesised voices.

Amusingly Amazon could have simply dropped the trailing "a" and had a gender neutral name.

Siri made some effort to be non-gendered, but I think most users still think of Siri as female.
