
Intel open sourced Stephen Hawking’s speech system - btzll
http://blogs.msdn.com/b/cdndevs/archive/2015/08/14/intel-just-open-sourced-stephen-hawking-s-speech-system-and-it-s-a-net-4-5-winforms-app.aspx
======
joefreeman
Fun fact: the latest version of this software uses SwiftKey under the hood -
[http://swiftkey.com/en/blog/swiftkey-reveals-role-
professor-...](http://swiftkey.com/en/blog/swiftkey-reveals-role-professor-
stephen-hawkings-communication-system/) (Disclaimer: I used to work for
SwiftKey)

~~~
Nexxxeh
As a loyal SwiftKey on Android user, the prediction engine for SwiftKey is
unnervingly good. Glad to see it's put to a better use than helping me write
Facebook and HN posts.

It begs the question though, why isn't there a proper SwiftKey keyboard for
Windows? The OSK on 8.1 is awful compared to SwiftKey on Android. The Windows
10 one is an improvement, but I'd still pay for a better one.

~~~
greyskull
Have they improved performance of SwiftKey? I had the paid version for years,
was even a VIP (won a t-shirt and everything), but I switched off last year in
favor of Fleksy as a faster, more lightweight alternative.

~~~
mtgx
I think the performance of SwiftKey improved about half a year ago. But I
think around the same time I started noticing a significant drop in SwiftKey's
accuracy. It "feels" less accurate to me than it was, although I use it with 3
enabled languages at once and I imagine that also brings lower accuracy by
default. Still, I think it has become quite a bit worse than before, and I
worry they did that on purpose as a compromise to improve performance.

------
nbevans
Bear in mind this project was started around the same time there was a ton of
uncertainty around the future of Silverlight and WPF. Alas, one did die, one
lives on for now. But nobody knew that at the time, including Intel, or
apparently Microsoft. WinForms has never faced any forward compatibility
uncertainty so it is a good long-term bet.

~~~
dr_zoidberg
You mean Silverlight died? I've seen that said over the years, but:

A) Microsoft still maintains and updates Silverlight.

B) It's mentioned as one of the components of Windows 10[1].

So I'm really confused here as to whether Microsoft will kill it already, or keep
it alive. They seem to be willing to kill it[2][3], but Netflix alone is
enough reason to keep it alive.

[1] [https://www.microsoft.com/en-
us/privacystatement/default.asp...](https://www.microsoft.com/en-
us/privacystatement/default.aspx)

[2] [http://www.digitaltrends.com/computing/microsoft-wont-
includ...](http://www.digitaltrends.com/computing/microsoft-wont-include-
support-for-silverlight-in-windows-10-edge-browser/)

[3] [http://www.windowscentral.com/microsoft-confirms-its-new-
edg...](http://www.windowscentral.com/microsoft-confirms-its-new-edge-browser-
wont-support-its-silverlight-player)

~~~
leetNightshade
Netflix supports HTML5, they don't "need" Silverlight.
[https://gigaom.com/2014/11/26/netflix-silverlight-
chrome/](https://gigaom.com/2014/11/26/netflix-silverlight-chrome/)
[https://help.netflix.com/en/node/23742](https://help.netflix.com/en/node/23742)

~~~
dr_zoidberg
Running Firefox 41 and it still loads the Silverlight plugin and explicitly
asks for it if disabled... Also, Netflix needing Silverlight drove the
development of Pipelight for Linux:
[http://pipelight.net/cms/about.html](http://pipelight.net/cms/about.html)

It got my attention that Chrome supports HD "up to" 720p, unlike the rest that
get up to 1080p. Why would that be?

------
tonyedgecombe
WinForms is still a great way to write desktop software if you don't need all
the features of WPF. I just started a new project with it and have been really
productive.

~~~
Pxtl
WinForms is _not_ simple. There are so many core classes that have
counterintuitive edge-cases and overcomplicated behavior, and so many things
you'd expect to work by default don't.

Databinding is a complete trainwreck, the Combo-box class is horribly
overcomplicated by its double-duty as text-entry and drop-down-list, the
DataGridView is a complete beast of leaky abstractions, and the layout engine
completely falls apart if somebody alters the DPI unless you obsessively test
DPI alterations yourself.

I don't blame Microsoft for any of this - it was 2000 and they were making a
wrapper around some terrifying legacy code.

But this thing should have been tossed in the dustbin of history a long time
ago.

~~~
duncan_bayne
Sure. But - serious question - what offering from Microsoft would you replace
it with?

~~~
Pxtl
I keep using it because I'm used to all its warts and idiosyncrasies. So I
don't know what properly-supported alternative one should use. I just get
annoyed at how many brand-new fresh-out-of-college developers I meet who use it.
They need something better.

~~~
duncan_bayne
Having used both WPF and Silverlight until 2010 (at which point I abandoned
the Microsoft stack altogether) I agree, but I don't think the answer is
either of those technologies.

Have you tried building GUI apps in Racket? That's the sort of thing I was
wishing for when using either Java or .NET to build Windows GUIs.

~~~
Pxtl
I've done some academic intro-to-FP stuff in Racket, but haven't really got my
feet wet with a non-toy application in it. So the GUI framework is good?

~~~
duncan_bayne
Yup. I haven't built anything of significance in it (yet) but it's proved
really easy to learn, and (again, in my limited experience) rock-solid stable
and fast enough:

A trivial example:

    
    
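      #lang racket
      ;; Only the GUI library is needed for this example.
      (require racket/gui/base)
    
      ;; Callback for File -> Exit: receives the menu item and the event.
      (define (menu-file-exit-click item event)
        (exit 0))
    
      ;; Top-level window.
      (define frame
        (new frame% [label "Demo"] [height 480] [width 640]))
    
      ;; Menu bar attached to the window.
      (define menu
        (new menu-bar% [parent frame]))
    
      ;; "File" menu; & marks the keyboard mnemonic.
      (define menu-file
        (new menu% [parent menu] [label "&File"]))
    
      (define menu-file-exit
        (new menu-item% [parent menu-file] [label "E&xit"] [callback menu-file-exit-click]))
    
      ;; Show the window and start handling events.
      (send frame show #t)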
    

... is all you need to create a basic GUI app with a File -> Exit menu option.
And that really is all there is - no resource compilation, no code-behind, no
separate languages for expressing the UI and the actions connected to it.

------
lorenzhs
Discussion of a previous article, focusing on the difficulties during the
development of ACAT and tailoring it towards Stephen Hawking:
[https://news.ycombinator.com/item?id=8686757](https://news.ycombinator.com/item?id=8686757)

------
btzll
Github repository:
[https://github.com/01org/acat](https://github.com/01org/acat)

------
dimman
To everyone who's interested in programming and is wondering "what should I
do/program?": I'm sure there are a lot of small things or applications you can
build to help other people in need. See it as a learning experience and something
that might have a huge impact on other people's lives. How's that for a
motivator? Kudos and respect to all the people behind this
project and to Stephen Hawking himself.

------
twotwotwo
My mom had ALS, and used single-switch input for a while, after typing and
writing on paper weren't possible. She wrote out little notes about what she
was thankful for, prayers, practical messages (she had Type 1 diabetes, and
told folks her insulin doses), and recipes this way. Eventually she had to
switch to giving messages to a human holding a letter board by looking towards
them for 'yes' and away for 'no'--cameras or Hawking's infrared-laser-based
system weren't really feasible.

We looked at some software called EZ Keys from a company called Words Plus. (I
don't think she used it specifically, at least for long--I know she used
another program, a DOS-based one called Living Better that ran in 40-col. mode
that I can't find anything about on the Internet.) EZ Keys looked more or less like
Intel's thing -- scan rows, scan items in a row, completions/predictions over
at left. It even had an option to use a frequency-sorted keyboard like the
Intel one, with the common letters pushed to the top left (since those are the
first rows/cols to be scanned). Hawking apparently used EZ Keys, so it's
possible the Intel folks intentionally gave their thing a similar interface to
make the transition easy.
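
A rough sketch of why the frequency-sorted layout helps with row/column scanning, in Python. The letter ordering, the 6-column grid, and the "one step per highlighted row or item" cost model are all assumptions for illustration, not what EZ Keys or ACAT actually use.

```python
import string

# Rough English letter order from most to least frequent (an assumption,
# not the ordering EZ Keys or ACAT actually ship with).
FREQUENCY_ORDER = "etaoinshrdlcumwfgypbvkjxqz"


def build_grid(letters, columns):
    """Lay letters out row by row in a grid with the given column count."""
    return [letters[i:i + columns] for i in range(0, len(letters), columns)]


def scan_steps(grid, letter):
    """Count highlight steps to reach a letter: scan rows first, then items in the row."""
    for row_index, row in enumerate(grid):
        col_index = row.find(letter)
        if col_index != -1:
            # +1 on each axis because the first row/item also takes one scan step.
            return (row_index + 1) + (col_index + 1)
    raise ValueError(f"{letter!r} is not on the grid")


def cost_of_text(grid, text):
    """Total scan steps to spell the letters in text (non-letters are ignored)."""
    return sum(scan_steps(grid, ch) for ch in text.lower()
               if ch in string.ascii_lowercase)


alphabetical = build_grid(string.ascii_lowercase, 6)
by_frequency = build_grid(FREQUENCY_ORDER, 6)

message = "i need tea"
print(cost_of_text(alphabetical, message))  # 41
print(cost_of_text(by_frequency, message))  # 32
```

The gain depends on the message: a text heavy in rare letters can come out worse on the frequency layout, which is part of why word prediction matters so much on top of it.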

It is worth remembering that no user cares if it's WinForms or whatever. Some
folks might like a nicer voice if they haven't gotten used to theirs like
Hawking ;), but the main concern is just getting the message across. Intel
seems to have worked on the right stuff: better prediction (Presage
[http://presage.sourceforge.net/](http://presage.sourceforge.net/), which
looks interesting) and context-sensitive controls. The infrared-laser-based
input method sounds cool, too.

This is a neat space: an optimization/prediction problem where improvements
can be a significant help to someone. (There are also practical optimizations
that don't have much to do with the general word-prediction problem: sometimes
people have to say things about their care, food, etc., or generic 'hi' and
'bye', and it's good if those are fast.) A Web page or Chrome extension can do
a lot--how close can you get to smoothly operating the Web with just the
spacebar? the arrow keys and Enter? or plain old typing, but slowed down and
using 0-9 for completions?

I've heard that nowadays, people with communication trouble and enough
movement use text-to-speech on mobile gadgets with their nifty and highly
refined predictive input and that's awesome.

------
blackbeard
Thinkpad love there as well. His "custom" computer appears to be an X220
tablet in an enclosure.

~~~
noir_lord
Would make sense, easily available commodity hardware that is reliable and in
a decently small form-factor.

------
acqq
Is Hawking's speech synthesis included at all (there was an article saying his
voice is based on some hardware device [http://www.wired.com/2015/01/intel-
gave-stephen-hawking-voic...](http://www.wired.com/2015/01/intel-gave-stephen-
hawking-voice/) )? I understand it's "just" a "navigation" system (replacing
the mouse and keyboard with the facial movement virtual key). If it's so, the
title (the " _speech_ system") is misleading.

The project also doesn't use SwiftKey but

[https://github.com/01org/acat](https://github.com/01org/acat)

"Presage, an intelligent predictive text engine created by Matteo Vescovi."

~~~
voiceclonr
@acqq: Shameless plug. This doesn't seem to have speech. However, I've tried
to build a text-to-speech synthesizer at www.voiceclonr.com. I'd appreciate it if
you could try it and leave feedback.

~~~
acqq
If I understood correctly, the open-sourced version uses Microsoft's Speech API
as an example. Searching for that, I find gems like this:

[https://connect.microsoft.com/VisualStudio/feedback/details/...](https://connect.microsoft.com/VisualStudio/feedback/details/664196/system-
speech-has-a-memory-leak)

"System.Speech has a memory leak - by eoghanoh

Status: Closed as Won't Fix"

I see your work is based on
[http://hts.sp.nitech.ac.jp/](http://hts.sp.nitech.ac.jp/) Can you tell us
what are your changes?

Edit: I see HN already commented on your work:

[https://news.ycombinator.com/item?id=9812734](https://news.ycombinator.com/item?id=9812734)

~~~
voiceclonr
So much developer rage in that Status :) On the HMM stuff, it was pretty much
the baseline code from the link. The things I recall experimenting with were more
about getting it done faster (threading some training phases, different gcc
options during synthesis etc).

------
jmpeax
If anyone is considering downloading this to use his voice to annoy your best
friend John with things like "Hello, my name is Stephen Hawking. The universe
is big, but not as big as John's mother.", let it be known that this software
doesn't sound like Hawking.

------
datawaslost
Presage is great, but to clarify some other comments - it doesn't involve any
specific dataset, like SwiftKey - it simply does nice smoothed predictions
when given a large database of n-grams (groups of words) and their
frequencies. It's fairly easy to chop up a corpus into n-grams using NLTK or
other tools, and there's a good port for Python called Pressagio.
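
To make the n-gram idea concrete, here is a minimal Python sketch using only the standard library. It is not Presage's or Pressagio's actual algorithm: the linear interpolation between bigram and unigram frequencies is a simple stand-in for "smoothed predictions", and the function names, the `weight` parameter, and the tiny corpus are made up for illustration.

```python
from collections import Counter, defaultdict


def train(sentences):
    """Count single words and word pairs (bigrams) within each sentence."""
    unigrams = Counter()
    bigrams = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        unigrams.update(tokens)
        for prev, nxt in zip(tokens, tokens[1:]):
            bigrams[prev][nxt] += 1
    return unigrams, bigrams


def predict(unigrams, bigrams, previous_word, prefix="", k=3, weight=0.7):
    """Rank words starting with prefix by an interpolated bigram/unigram score."""
    total = sum(unigrams.values())
    following = bigrams.get(previous_word, Counter())
    following_total = sum(following.values())
    scores = {}
    for word, count in unigrams.items():
        if not word.startswith(prefix):
            continue
        p_unigram = count / total
        p_bigram = following[word] / following_total if following_total else 0.0
        # Interpolation keeps unseen pairs from scoring exactly zero.
        scores[word] = weight * p_bigram + (1 - weight) * p_unigram
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]


sentences = ["i need water", "i need help", "i want tea", "please help me"]
unigrams, bigrams = train(sentences)

print(predict(unigrams, bigrams, "need", k=2))       # ['help', 'water']
print(predict(unigrams, bigrams, "i", prefix="w"))   # ['want', 'water']
```

A real engine would use longer n-grams and a much larger corpus, but the shape is the same: counts in, ranked candidates out.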

My startup Spoken - [http://spokenaac.com](http://spokenaac.com) - uses
n-gram predictions to help users with aphasia or other language disorders
speak. The user interface challenges aren't quite as intense as Stephen
Hawking's binary input, but it's an interesting field if you're into design
and big data.

------
ris
Forcing a disabled man to use Internet Explorer. Surely this is the basest
form of cruelty.

------
andersonmvd
Now security researchers will analyze the code to find vulnerabilities to
exploit Stephen Hawking's speech system. Next headline: Stephen Hawking's voice
sounds like Justin Bieber's voice, lol.

------
nimitkalra
The code in the GitHub repository [1] is pretty interesting to look around in.

[1] [https://github.com/01org/acat](https://github.com/01org/acat)

------
melling
Does it make sense to be more aggressive in predicting by giving the user a
second level from which to choose? For example, if he types 'b', then it could
offer to type 'black' or 'black hole'.

On the iPad, for example, if I type 'f', I get shown 'for', then if I accept,
I always see 'example' and 'instance'.
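
One way to picture the "second level" idea, as a hedged Python sketch: complete the prefix to a word, then also offer that word plus its most common follower. The function names and the sample phrase history are hypothetical, and nothing here reflects how ACAT or the iPad keyboard actually ranks suggestions.

```python
from collections import Counter, defaultdict


def build_counts(phrases):
    """Count words and which word follows which in the user's past phrases."""
    word_counts = Counter()
    next_counts = defaultdict(Counter)
    for phrase in phrases:
        words = phrase.lower().split()
        word_counts.update(words)
        for first, second in zip(words, words[1:]):
            next_counts[first][second] += 1
    return word_counts, next_counts


def two_level_suggestions(prefix, word_counts, next_counts, k=2):
    """For each top completion, also offer it joined with its likeliest next word."""
    matches = [w for w in word_counts if w.startswith(prefix)]
    matches.sort(key=lambda w: (-word_counts[w], w))
    suggestions = []
    for word in matches[:k]:
        suggestions.append(word)
        followers = next_counts.get(word)
        if followers:
            # Most frequent follower; ties broken alphabetically.
            best_next = min(followers, key=lambda w: (-followers[w], w))
            suggestions.append(f"{word} {best_next}")
    return suggestions


history = ["black hole", "black hole radiation", "black body",
           "big bang", "big crunch"]
word_counts, next_counts = build_counts(history)

print(two_level_suggestions("b", word_counts, next_counts))
# ['black', 'black hole', 'big', 'big bang']
```

The trade-off is screen space and scan time: every extra suggestion is one more item a single-switch user has to step past.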

~~~
rjbwork
I believe there is a predictive typing keyboard based off of Tries that is
floating around out there for one of the mobile OSes.
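
For reference, a minimal sketch of trie-based completion in Python. The class names, word list, and frequency counts are invented for illustration and are not taken from any particular keyboard.

```python
class TrieNode:
    def __init__(self):
        self.children = {}  # maps a character to the next TrieNode
        self.count = 0      # > 0 marks the end of a stored word


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word, count=1):
        """Store a word, adding to its frequency count."""
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.count += count

    def complete(self, prefix, k=3):
        """Return up to k stored words starting with prefix, most frequent first."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        # Walk the subtree under the prefix and collect every stored word.
        found = []
        stack = [(node, prefix)]
        while stack:
            current, text = stack.pop()
            if current.count:
                found.append((text, current.count))
            for ch, child in current.children.items():
                stack.append((child, text + ch))
        found.sort(key=lambda item: (-item[1], item[0]))
        return [word for word, _ in found[:k]]


trie = Trie()
for word, count in [("for", 50), ("from", 30), ("friend", 12),
                    ("food", 8), ("the", 90)]:
    trie.insert(word, count)

print(trie.complete("f"))   # ['for', 'from', 'friend']
print(trie.complete("fo"))  # ['for', 'food']
print(trie.complete("x"))   # []
```

A trie only handles completing the current word; predicting the next word still needs something like n-gram counts on top.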

------
sagivo
Amazon passwords:
[https://github.com/search?utf8=%E2%9C%93&q=filename%3Aaws.ym...](https://github.com/search?utf8=%E2%9C%93&q=filename%3Aaws.yml&type=Code&ref=searchresults)

------
almost_started
.Net WinForms? No wonder it sounds so bad.

