
Google launches voice typing in Google Docs - nandaja
http://googledocs.blogspot.com/2016/02/type-edit-and-format-with-your-voice-in.html
======
ACow_Adonis
It's pretty impressive. I haven't tried such things for quite a few years,
since I last had a quick play with Dragon Naturally Speaking.

It's getting PRETTY close, with a basic cheap laptop microphone, my Australian
accent, and no training.

But I did say pretty close. Not exactly right. And I'm only speaking a few
sentences.

I read out a passage aloud, and apart from not quite getting everything right
(but getting some remarkably difficult words/names correct), the thing that
struck me is how written word, and the art form of such, is in fact quite
different from a verbal stream of spoken words. The little sigils and breaking
up of a written text conveys a whole bunch of meaning and subtlety that a word
for word stream doesn't quite get.

For me, it's just inaccurate enough to still be irritating. Everything's
working fine, and then I hit a mis-dictated word, have to do a bit of a double
take and ask myself "What the...what on earth does it think I'm trying to say
here...oh...I said THAT!" which breaks the flow of anything but the most basic
and primitive sentence.

But then that makes me wonder, you know what... how many humans can actually
follow everything I'm actually saying? It's not like I can peer into their
heads and verify the transcription of voice-to-words inscribed on their
brains.

~~~
danieltillett
Humans don't follow what you say :)

On this topic I have never managed to get my typing above 60 women because
this is about as fast as I can think ahead while writing.

~~~
smhg
I wouldn't complain if you can type as fast as 60 women! ;)

~~~
danieltillett
Lol. Got to love the auto correct of wpm to women :)

------
brotchie
Super impressive.

My wife just spoke full-speed Vietnamese into a Google Doc, 100% correct, even
the tone marks.

~~~
pbhjpbhj
Is there something particular about Vietnamese that makes it easier than other
languages for dictations? Like, does it lack homophones, are letters always
pronounced the same way? 100% correct sounds incredible.

~~~
baldfat
> Is there something particular about Vietnamese that makes it easier than
> other languages for dictations?

phonemic orthographies (These languages should work better) -
[https://en.wikipedia.org/wiki/Phonemic_orthography](https://en.wikipedia.org/wiki/Phonemic_orthography)

English is _highly non-phonemic_. The only languages I would think are worse
are French and Greek, with silent letters and the different accents; in Modern
Greek /i/ can be written in six different ways: ι, η, υ, ει, οι and υι. My
guess is that Italian and most Eastern European languages would be the easier
European languages for voice transcription.
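
To make the Greek point concrete, here is a rough Python sketch of why one sound with many spellings is hard for dictation. Only the /i/ row comes from the comment above; the single-spelling entries for the other phonemes and the phoneme sequences themselves are made-up simplifications, not real Greek words or a full account of Greek spelling.

```python
from math import prod

# Toy grapheme inventory. Only the "i" row is taken from the thread;
# the one-spelling rows are simplifying assumptions for illustration.
SPELLINGS = {
    "i": ["ι", "η", "υ", "ει", "οι", "υι"],
    "m": ["μ"],
    "t": ["τ"],
    "a": ["α"],
}


def count_spellings(phonemes):
    """Count the candidate written forms for a sequence of phonemes."""
    return prod(len(SPELLINGS[p]) for p in phonemes)


print(count_spellings(["m", "a", "t", "a"]))  # 1
print(count_spellings(["m", "i", "t", "i"]))  # 36
```

A recogniser has to pick one of those candidates from context alone, which is the part a phonemic orthography mostly spares it.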

This seriously reminds me of my days in college and grad school
learning/teaching dead languages. With the dead languages (Latin, ancient
Hebrew, Classical/Koine Greek and Aramaic) we really don't know how they were
pronounced, so we just made them phonetic, which tells you that we don't speak
them correctly, since no language is 100% phonemic.

Vietnamese is interesting because there are several dialects that you would
expect to trip this up, but they don't appear to.
[https://en.wikipedia.org/wiki/Vietnamese_language](https://en.wikipedia.org/wiki/Vietnamese_language)

~~~
iamcreasy
What other languages are like this?

~~~
pbhjpbhj
The linked Wikipedia page on phonemic orthography uses Serbian as an exemplar.
This section looks like what you're after -
[https://en.wikipedia.org/wiki/Phonemic_orthography#Compariso...](https://en.wikipedia.org/wiki/Phonemic_orthography#Comparison_between_languages).

------
gz5
I first realized the voice to text improvement while watching my three-year
old daughter search on my Android tablet...she can't spell (correctly) but
used Google voice search quite effectively.

This should enable the very young, the very old, the disabled, etc. to
digitize (and therefore share) their words, stories and worlds. Powerful.

~~~
grahamburger
I've had the same experience watching my (now 4yo) son voice search on my
phone. It picks up what he's saying very reliably ... even when he's still
laughing uncontrollably from the results of his last search :)

In case you were unaware, you can listen to past voice searches here:
[https://history.google.com/history/audio](https://history.google.com/history/audio)
sometimes my wife and I will go back and listen to my son's searches for good
laughs all around ;)

------
sdrothrock
I'm sure Google is mining (anonymized) data from this -- I wonder if they can
use it to improve the transcriptions for Youtube automatic captioning by
seeing what kind of revisions people make to the voice transcription.

~~~
BinaryIdiot
I've often wondered this but it feels like they're very siloed. Google Now
seems to improve every year; it almost never gets anything wrong that I speak
nowadays but just a few years ago it could get rough if I'm trying to get it
to type a bunch of stuff.

But YouTube auto captions? They seem just as horrible today as they were years
ago.

~~~
eitally
There are a lot of checks and balances internally re: who's allowed access to
user data, even in cases where it seems logical & obvious. Regardless of how it
appears externally, Google does take data privacy & security very seriously
(to the point where sometimes it's painful trying to get things done).

------
brockers
The Chrome integration is what is new right? I have been using Google Docs via
phone using voice typing for a LONG time now.

In fact, when I wanted to use voice typing at my desk I would just open the
phone, open the document on both the phone and Chrome and watch the text
appear on screen. Then edit it from the keyboard.

~~~
dharma1
Cool idea. You won't get voice commands though

------
eva1984
I've used Google Now as a voice reminder keeper for a long time!

There are cases where it can still miss by a large margin, but general
usability has greatly improved over the past 2 years.

I would love to see this thing become smarter, so that it can detect when you
are saying a list of things and then format them automatically. That will be
truly AMAZING!

~~~
simplemath
I am sure that the team who owns voice recog at G is working on it.

I wonder organizationally if it's structured that way, but I digress...

This particular integration is really impressive after some cursory testing.

If I were CTO at, say, Dragon... I would be having a sleepless night.

------
gregsadetsky
This had been announced back in September 2015, albeit from a Google Docs for
Education point of view:

[https://googleblog.blogspot.co.nz/2015/09/google-docs-classr...](https://googleblog.blogspot.co.nz/2015/09/google-docs-classroom-school.html)

Under the first video, see "With Voice typing, you can record ideas or even
compose an entire essay without touching your keyboard."

---

It seems that what's new are the "Voice commands" to Select Text, Format, etc.
etc. That's really great! Take a look at the commands here:

[https://support.google.com/docs/answer/4492226](https://support.google.com/docs/answer/4492226)

------
james_pm
I can't stress how important this is for people like my daughter who at age 13
reads at a Grade 3 level thanks to a genetic disorder that affects her
learning. She's a great verbal communicator, but the reading/writing is an
immense challenge. Things like this will change her life and open up
opportunities that were previously not open to her.

------
shanselman
Voice dictation is HUGELY useful but few people use it anymore. Most of us
tried Dragon in years past, it sucked, and we gave up.

I believe so much in this that I made a Siri->Windows bridge and dictate 80%
of my text with it. (except that hash rocket)
[http://myechoapp.com/](http://myechoapp.com/)

~~~
melling
David Pogue has been using Dragon for over a decade:

[https://www.youtube.com/watch?v=x0GXX-SJuQM](https://www.youtube.com/watch?v=x0GXX-SJuQM)

John Siracusa used Dragon to write his 20,000 word Mac reviews.

Developers have even used Dragon to assist with programming:
[http://ergoemacs.org/emacs/using_voice_to_code.html](http://ergoemacs.org/emacs/using_voice_to_code.html)

I think voice accuracy is there but we need more integration with our apps and
operating systems. Consistency would help too. Google recognizes "undo" while
Dragon recognizes "scratch that"

~~~
shanselman
Point taken but I would guess that they use nice headset mics and extensively
train (and maintain and carry around) their voice databases. With Siri you can
get 99% accuracy if you dictate punctuation and there's no training or fancy
mics.

~~~
melling
I don't believe you need to extensively train the voice database. Most don't
require any training to start. I don't think you need expensive microphones
either. Maybe you should just read some of the links already posted.

~~~
throwaway049
Dragon strongly recommends training and using a good mic, but mine is good and
cost 30 BP which I don't think is expensive in this context. Initial training
takes about 15 minutes.

------
k-mcgrady
Is there a technical reason this only works in Chrome? It wouldn't work for me
in my main browser (Safari 9.1).

~~~
Klathmon
Safari doesn't support the 'getUserMedia/Stream API' so it can't work on
Safari.

~~~
k-mcgrady
Thanks. A quick search shows 'getUserMedia/Stream API' is supported in Firefox
though and people on this thread don't seem to be able to use it in Firefox (I
haven't personally tested).

~~~
Klathmon
Hmm, then maybe it's using the speech recognition API, which is really only
supported in Chrome right now (it's in Firefox but behind a flag).

Although I just assumed that Google would be running the speech recognition
themselves.

------
petervandijck
Google seriously needs to build an Amazon Echo competitor NOW. Using Nest (if
that division can muster actually creating a new product) and its learning
chops.

Google: please don't mess this up as a science experiment as you do so many
products - build it for real.

------
giancarlostoro
Amazing work, it is good to see Google Docs moving forward. I have written
plenty of essays in Google Docs in the past, hopefully this new feature is
just what some have been wanting. Thanks for the great work guys!

------
romaniv
Can't wait till Google starts data-mining background noises to "improve" your
experience. Yes, they can do it with other of their services, but this one is
the most likely to provide the most personal information. Also, it will
capture much more information, since you will be running it over longer
periods of time.

------
cableshaft
Looks like it's still not great for writing fiction. I didn't see anything for
"Quotation Marks" for dialogue in their Voice Commands, and even if it did, I
sure hope I wouldn't have to say "Quotation Mark What was that Question Mark
Quotation Mark asked Billy Period. Quotation Mark I said Poop Nuts Exclamation
Point Quotation Mark said Daniel Period."

This is much much slower than I can type. I know it'd be hard to infer
punctuation, but even just a one syllable shorthand command for each common
bit of punctuation would be nice (you can disable it by default).

I really want to use this stuff too, because it means I could write stories
while exercising at home, which is apparently how Randy Pausch wrote his last
novel (he dictated it and had someone else transcribe it... I'd like Google to
do that last bit for me).
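
For what it's worth, the "spoken punctuation" idea above can be sketched in a few lines of Python. This is only an illustration: the trigger phrases in the table are hypothetical and are not the actual commands of Google Docs or Dragon, and `apply_punctuation` is a made-up helper name. Swapping the keys for shorter triggers would give the one-syllable shorthand suggested above.

```python
import re

# Hypothetical trigger phrases; NOT Google Docs' or Dragon's real commands.
SPOKEN = {
    "quotation mark": '"',
    "question mark": "?",
    "exclamation point": "!",
    "period": ".",
    "comma": ",",
}

# Longest phrases first so multi-word triggers win over shorter ones.
_PATTERN = re.compile(
    r"\b(" + "|".join(sorted(map(re.escape, SPOKEN), key=len, reverse=True)) + r")\b",
    re.IGNORECASE,
)

_CLOSERS = {".", "?", "!", ","}


def apply_punctuation(transcript):
    """Replace spoken punctuation phrases and attach the marks to words."""
    replaced = _PATTERN.sub(lambda m: f" {SPOKEN[m.group(1).lower()]} ", transcript)
    out = []
    in_quote = False
    glue_next = False  # True right after an opening quotation mark
    for tok in replaced.split():
        if tok == '"':
            if in_quote and out:
                out[-1] += tok  # closing quote sticks to the previous word
            else:
                out.append(tok)  # opening quote sticks to the next word
                glue_next = True
            in_quote = not in_quote
        elif tok in _CLOSERS and out:
            out[-1] += tok
        elif glue_next:
            out[-1] += tok
            glue_next = False
        else:
            out.append(tok)
    return " ".join(out)


print(apply_punctuation(
    "quotation mark what was that question mark quotation mark asked Billy period"
))
```

The obvious weakness is that a literal word such as "period" can never be dictated with this table, which is one reason inferring punctuation from context would be nicer than spelling it out.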

~~~
melling
If you're willing to pay, Dragon Naturally Speaking has more features:

[http://whatsnext.nuance.com/connected-living/thursday-tip-ho...](http://whatsnext.nuance.com/connected-living/thursday-tip-how-to-speak-commands-and-punctuations-with-dragon-dictation/)

If you simply want a better free product, you'll have to wait a few more
years. In the meantime, the solution will work well for tens of millions of
people and Google can learn from them.

~~~
cableshaft
How are they nowadays? I tried their free demo app a while back and had to
correct pretty much every third spoken word using an annoying user interface
to go back to my previous text, so it would easily have taken me less time
just to type it in the first place.

~~~
melling
Siri is powered by Nuance technology. They've been pretty good for a decade
now. David Pogue and John Siracusa have been using it to do their writing.

[http://arstechnica.com/apple/2013/10/os-x-10-9/23/](http://arstechnica.com/apple/2013/10/os-x-10-9/23/)

------
cellover
They have my emails, they have my documents, they have my pictures, they have
my location, they have my browsing history, and now they can have my voice.

Only lacks DNA, fingerprints, retina footprint and non-verbal communication.

Oh wait, it's only a matter of time.

~~~
oneeyedpigeon
I was waiting for this comment. What's the alternative? No voice-recognition?
Voice recognition that only runs client-side and lacks all the advantages a
centralised service can provide? An equivalent provided by someone else that,
for whatever reason, you trust more than Google?

Edit: I didn't downvote you, though, and I've upvoted to counteract whoever
did. The point you're raising can clearly contribute to an interesting
discussion.

~~~
cellover
I understand that my comment makes people uncomfortable as in "ha this is the
typical anti-Google-paranoid guy". I respect that, and I used a specifically
skeptical tone on purpose.

I love Google, they make great products, but at the same time I can't help
being afraid of this marvelous monster.

~~~
oneeyedpigeon
Oh, I hear you. Part of me totally shares your concern, and - just like pretty
much anyone else - I wouldn't want Google recording everything I say into my
microphone. But that's pure speculation right now; let's wait until we know
whether or not they're actually doing that, or until someone's at least gone
through the Ts&Cs with a fine toothcomb.

~~~
kybernetikos
[https://history.google.com/history/audio](https://history.google.com/history/audio)

------
eternalban
Biometrics [1]. Googling for Google's biometrics privacy policy.[2]

HN likes to talk about the layperson non-techie's disregard for technology's
impact on our rights, but a grep of this thread turned up no hits on
biometrics.

[1]:
[https://en.wikipedia.org/wiki/Biometrics](https://en.wikipedia.org/wiki/Biometrics)

[2]:
[https://www.google.com/?gws_rd=ssl#q=inurl:google.com+privac...](https://www.google.com/?gws_rd=ssl#q=inurl:google.com+privacy+policy+biometrics)

~~~
thrownaway2424
This thread doesn't hit on "chemtrails" either.

~~~
eternalban
For obvious reasons.

Your voice is a biometric. Innovation is fine, but Google needs to address our
digital rights.

------
amelius
Why isn't this just built into the browser or the OS?

~~~
marak830
Windows does have a built-in dictation tool; you can even use it in .NET (I
am, for my side project).

------
archiebunker
This is totally cool. Their translations are pretty good. Regular desk
microphones (condenser) work just fine.

Thank you for posting this.

------
mbrock
I would like to use this kind of thing even if I had to type the commands...
anything to avoid mousing, and holding down arrow keys, and all that tedious
stuff. Typing sentences is totally natural for me, I can do it all day. For
years I've been dreaming of a renaissance of ed-like editors.

~~~
dcvuob
Welcome to the future (of 2013):
[https://youtu.be/8SkdfdXWYaI?t=9m5s](https://youtu.be/8SkdfdXWYaI?t=9m5s)
[https://youtu.be/8SkdfdXWYaI?t=16m22s](https://youtu.be/8SkdfdXWYaI?t=16m22s)

Don't forget to see the entire talk (28 minutes):
[https://www.youtube.com/watch?v=8SkdfdXWYaI](https://www.youtube.com/watch?v=8SkdfdXWYaI)

~~~
shpx
One important note is that your throat gets tired talking, just like your
hands do typing.

~~~
dcvuob
I think it is an issue of convenience. Anecdotal evidence, I have never had
trouble doing either of those for 8 hours straight (and more).

Voice will need some serious software support if it is going to take off.
There is zero chance that users will learn those mnemonics, just as now almost
nobody knows how to touch type. The usage will be more in the form of general
commands, than specifying every action like we do it now with the
keyboard+mouse.

------
Theodores
Seriously folks, this is awesome. Does anyone have a recommendation of a USB
microphone/headset that will work with Ubuntu so that I can start using this a
bit more comfortably?

I did not know voice typing was possible before, this has made my day. All I
need now is to invest in a decent microphone!

~~~
joshdotsmith
Get a Blue Yeti.

~~~
NegatioN
Blue Yeti is good (and not too expensive), but isn't it a bit overkill for
voice typing?

If you generally feel like having a better microphone would help in other
cases as well (Skype, vlog, what-have-you), then I'd agree with the above
post.

------
JohnDoe365
Not available for me, menu item grayed out. Austria/Europe here. Language is
set to English/USA.

~~~
janus24
I have that on Firefox, it works with Chrome.

~~~
JohnDoe365
Alright, was with Firefox

------
athletics
Am I the only one that finds speech to text cumbersome and more of an
impediment than anything else?

I applaud the effort, but spreadsheets are difficult enough to navigate at the
best of times, let alone trying to manipulate with your voice.

~~~
ergothus
Depends on the context.

Running precise commands? Keyboard wins.

No keyboard? I love the voice search of my Roku 4 over trying to search via
virtual keyboard and arrow keys, or using my Amazon Echo to find artists/songs.

Long text? While I prefer typing, I remember one author I heard (Kevin
Anderson, I think) who dictates all his books (first drafts) into a mini-
recorder and pays someone to transcribe them. I find that hard to comprehend,
but I bet he's not unique.

Spreadsheets? Keyboard wins for data entry, but I bet voice would be
convenient for analysis. "Sum column B" "What is the average of column D,
excluding 0 values"
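
As a rough sketch of how the two spreadsheet phrasings above could be turned into actions once the speech is already transcribed, here is a small Python example. The command grammar, the `run_command` name and the dict-of-columns "sheet" are all assumptions made for illustration; this is not how any real spreadsheet product handles voice.

```python
import re
from statistics import mean

# Hypothetical grammar covering only the two phrasings quoted above.
_COMMAND = re.compile(
    r"^(?:what is the )?(?P<op>sum|average)(?: of)? column (?P<col>[a-z])"
    r"(?P<skip>,? excluding 0 values)?$",
    re.IGNORECASE,
)


def run_command(utterance, sheet):
    """Evaluate a transcribed command against `sheet` (column letter -> numbers)."""
    m = _COMMAND.match(utterance.strip())
    if m is None:
        raise ValueError(f"unrecognised command: {utterance!r}")
    values = sheet[m.group("col").upper()]
    if m.group("skip"):
        values = [v for v in values if v != 0]  # drop the zero cells
    if m.group("op").lower() == "sum":
        return sum(values)
    return mean(values)  # raises StatisticsError if nothing is left


sheet = {"B": [1, 2, 3], "D": [0, 4, 0, 8]}
print(run_command("Sum column B", sheet))
print(run_command("What is the average of column D, excluding 0 values", sheet))
```

A fixed pattern like this breaks as soon as the wording drifts, so anything practical would need a far more forgiving parser than a single regular expression.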

------
JulianMorrison
This sounds like it could be a massive step up for disabled and low literacy
people worldwide. The fact that it's in so many languages, and accessible
anywhere with internet, really distinguishes it.

~~~
conceit
On the contrary, I believe that, while it will be helpful in a number of ways,
orthography skills would likely suffer from lack of exercise as a computer
takes on the task. Having to think about the spelling while writing helps a
lot.

------
nitin_flanker
Amazing, I always wanted a feature like this. Sometimes, we're too tired to
move our fingers to type something. I hope they make it as convenient as
possible, like introducing voice typing in Google Keep, and creating a
complete note just by saying OK Google.

Sometimes, things pop up in mind suddenly and we need to note them down right
away. This is definitely the coolest feature. On the other side, this is also
helpful to people who face problems with typing, or somehow can't type
(temporary injury in hand, or slow typing speed, etc.)

------
jimothyhalpert7
Is there any information about an API for this? I know that they abandoned
their old Voice to Text API, which is sad, because there aren't that many
alternatives out there. I'm using Watson right now, but it's really bad, and
the language range is very narrow. I'm also open to any suggestions for Voice
to Text online APIs or local server side solutions.

------
entwife
In response to multiple language support, an alternative use for the voice
recognition is learning to speak well in another language. I've already
improved my native English by noticing which words are mistranscribed and
assuming the problem is with my imprecise speech, not the bot.

------
funkaster
I tried it a few days ago and it was amazing, even in Spanish (latam accent)
and English with my non-native accent.

Does anyone know if this is available through an API? I would love something
like this when writing text in emacs & org mode :-)

------
dharma1
This is really nice. I think laptops will need to start having array mics for
better far-field capture.

Tested it for a quick Todo list and it was about 95% accurate. I can see
myself using this

~~~
ghaff
I suspect that array mics can be important for accurate voice recognition. My
anecdotal impression, for example, is that Amazon Echo is MUCH better than
Siri on an iPhone at recognizing what I'm saying. I can tell Echo from across
the room with music playing to add something to a shopping list and it gets
things mostly right.

------
hobarrera
The menu item "Voice Typing" in "tools" is disabled for me, with no
explanation or hint as to why.

------
kull
This actually works, finally a 'machine' which understands my Eastern European
New York accent. Bravo!

------
imaginenore
dictation.io is another voice recognition system that impresses me, and it
understands tons of languages.

~~~
blackkettle
It's the same Google ASR on the backend.

------
jacktang1980
We may be programming by voice in the near future.

------
Mohammedmeshaal
Hello

------
justsaysmthng
Tried with Romanian, got a bunch of random words .. about 50% right, the
resulting text is totally unintelligible. I was expecting a lot more
accuracy.. Apple's built-in dictation does way better than this.

------
pjmlp
I bet it doesn't support Portuguese in all its variants.

~~~
gregsadetsky
It does support both "Brasil" and "Portugal". Have you tried it out? Does it
work well in Portuguese?

~~~
pjmlp
Not in a place I can try it.

But usually such dictating software does a very bad job regarding Portuguese,
regardless of the language variant.

Apparently I offended some people, thanks for the downvotes!

~~~
NegatioN
The offensive part is that what you're stating is something you haven't even
bothered to examine before deciding to write down an opinion.

~~~
pjmlp
Because in 30 years of computing experience I am used to my mother tongue
being forgotten by American companies when they announce voice dictation
software.

I am not sorry if others cannot take criticism from those used to being
ignored by them.

~~~
oneeyedpigeon
Even if you'd just asked the question "How good is Portuguese support?",
whilst you probably wouldn't have received too many upvotes, I think you at
least wouldn't have been downvoted into oblivion. Unfortunately, what might
have been an important point will now just get buried.

------
artumi-richard
What fun:

I have recently been reading The Hobbit with my 6 year old son Benjamin stop
he has become engrossed in the book somewhat and then just hearing about the
Adventures of Bilbo Baggins and Gandalf as well of course as mine growing are
a diary Norrie tilly tilly Bailey and wailing not forgetting the king Under
the Mountain himself starring oakenshield Gollum Gollum

