
RNN-Based Handwriting Recognition in Gboard - rey12rey
https://ai.googleblog.com/2019/03/rnn-based-handwriting-recognition-in.html
======
reubenmorais
The "Making it Work, On-device" paragraph makes it seem like TensorFlow Lite
will easily get your model running fast on-device, but in reality RNNs aren't
currently supported by the TFLite Converter and the TFLiteLSTMCell example is
super slow for training, so this is actually based on proprietary code not
available to mere mortals using open source TensorFlow. If you were to
actually try reproducing this work, you'd have to use several workarounds, dig
deep into the TensorFlow source code, and possibly still end up with a
suboptimal TFLite model.

Don't get me wrong, in terms of deployability and flexibility for production
usage, TensorFlow/TFLite is _really good_, especially compared to other
frameworks, but Google tends to oversell the abilities of open-source
TensorFlow significantly in their marketing material, and you only find out
when you go and try doing it yourself.

~~~
lawrenceyan
For industry/real world work, TensorFlow is best in class. It is far superior
to any other existing framework. I agree that there are always areas for
improvement, but the way you worded your comment makes it almost sound like TF
is pretty subpar compared to other offerings.

The reality is more that TensorFlow is really the only option you have if you
don’t want to build everything from scratch again. Whether that’s a good or
bad thing, well at least it’s because TensorFlow is actually a good product
and not because Google is preventing others from building their own / pushing
others down.

~~~
ru999gol
what are you even talking about, tf is a mess. pytorch, mxnet, caffe2, etc.
are all superior frameworks

~~~
lawrenceyan
For research and/or hobbyist machine learning I agree. For real world
production use cases, you use TensorFlow.

------
modeless
Wow, it is really surprising to me that bezier curve control points produced
by an optimization process would be good inputs to a neural net model. Small
perturbations to the inputs could produce radically different bezier control
points depending on the decisions made by the curve optimizer, so this forces
the neural network to learn about the characteristics of the optimizer as well
as the input.

Neural nets usually thrive on raw high dimensional inputs, so dramatically
reducing the dimensionality of the input seems like a strange decision. I'm
sure it improves speed, but I would expect higher accuracy by processing the
raw input.
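
To make the "curve optimizer" step concrete, here is a minimal sketch of fitting a single cubic Bezier curve to a stroke by least squares. It is not the method from the blog post or paper: the function name `fit_cubic_bezier`, the pinned endpoints, and the chord-length parameterisation are all assumptions chosen to keep the example short.

```python
import math


def fit_cubic_bezier(points):
    """Least-squares fit of one cubic Bezier curve to a 2-D stroke.

    Assumptions of this sketch: the end control points are pinned to the
    first and last samples, and each sample gets its curve parameter t
    from the chord length travelled so far.
    """
    if len(points) < 4:
        raise ValueError("need at least 4 points")

    # Chord-length parameterisation: t grows with distance along the stroke.
    dist = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dist.append(dist[-1] + math.hypot(x1 - x0, y1 - y0))
    if dist[-1] == 0.0:
        raise ValueError("stroke has zero length")
    ts = [d / dist[-1] for d in dist]

    p0, p3 = points[0], points[-1]

    # Accumulate the 2x2 normal equations for the inner control points.
    a11 = a12 = a22 = 0.0
    bx1 = bx2 = by1 = by2 = 0.0
    for t, (x, y) in zip(ts, points):
        b0, b1 = (1 - t) ** 3, 3 * t * (1 - t) ** 2
        b2, b3 = 3 * t * t * (1 - t), t ** 3
        # Residual once the pinned endpoints' contribution is removed.
        rx = x - b0 * p0[0] - b3 * p3[0]
        ry = y - b0 * p0[1] - b3 * p3[1]
        a11 += b1 * b1
        a12 += b1 * b2
        a22 += b2 * b2
        bx1 += b1 * rx
        bx2 += b2 * rx
        by1 += b1 * ry
        by2 += b2 * ry

    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("degenerate stroke, cannot solve for control points")

    p1 = ((a22 * bx1 - a12 * bx2) / det, (a22 * by1 - a12 * by2) / det)
    p2 = ((a11 * bx2 - a12 * bx1) / det, (a11 * by2 - a12 * by1) / det)
    return [p0, p1, p2, p3]
```

Whatever the stroke length, the output is four control points (eight numbers), which is the dimensionality reduction in question. Nudging a single sample and refitting is a quick way to see how much the control points move for a given stroke.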

~~~
nielsbot
I don't know ML, but I was surprised too, since converting a series of points
into Bezier paths of a certain degree seems arbitrary...

------
appleflaxen
this is so awesome!

but how is it that we have RNN solutions for handwriting when we don't even
have a standard, canned RNN for OCR?

I know tesseract and related projects exist, but when I've tried them they
have been fairly brittle with lower accuracy than I was expecting. Accuracy
was especially problematic for letter combinations like "-ing" that would
consistently be recognized as "-mg".

Is there a good ML OCR library I'm missing?

~~~
pfortuny
Just a side comment: take into account that (as per the paper) there is
temporal input in Gboard (i.e. the timestamp of each stroke is important).

You do not have that for "-ing" in OCR, so the software does not know that the
dot is “independent”.
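
A toy sketch of that difference, assuming online ink is stored as lists of timestamped touch points (the names `TouchPoint`, `t_ms`, and `rasterize` are made up for illustration and are not from the paper):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TouchPoint:
    x: int
    y: int
    t_ms: int  # time since the first touch, in milliseconds


def rasterize(strokes):
    """Collapse strokes into a set of inked pixels, roughly what a scan keeps."""
    return {(p.x, p.y) for stroke in strokes for p in stroke}


# An "i" written as two strokes: the stem, then the dot after a pen lift.
stem = [TouchPoint(5, 2, 0), TouchPoint(5, 3, 20), TouchPoint(5, 4, 40)]
dot = [TouchPoint(5, 0, 300)]
online_ink = [stem, dot]

# The same points lumped together as if they were one stroke.
merged = [stem + dot]

# The two inputs differ as online ink, but produce identical pixels.
same_pixels = rasterize(online_ink) == rasterize(merged)
```

Here `same_pixels` is `True` even though `online_ink` and `merged` differ: the stroke boundary and the timing are gone once only pixels remain.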

------
AlphaWeaver
Really cool stuff! My phone isn't big enough to do handwriting on, so I'm not
really sure where this is supposed to be used? On a tablet I guess?

~~~
yorwba
I just tried tout for refirsttime, and although the keyboardspace on my phone
is barely large enough to cran five characters in there, the input scrolls
sideraysautomatically if you litt your finger long enoy gh. So longer words
can be entered as well. It doesn't seem to reevaluate previously decoded
segments baselon what follows, though, so you can end up with weird
misspellings at the beyinning of words. I dont think! Im going to use it from
now on, beaurthere cognition is balenow ghto requiresigniti cant editing and
the frictonisatittooncomfortable vihout a stylus.

Edited with the QWERTY keyboard:

I just tried it out for the first time, and although the keyboard space on my
phone is barely large enough to cram five characters in there, the input
scrolls sideways automatically if you lift your finger long enough. So longer
words can be entered as well. It doesn't seem to reevaluate previously decoded
segments based on what follows, though, so you can end up with weird
misspellings at the beginning of words. I don't think I'm going to use it from
now on, because the recognition is bad enough to require significant editing
and the friction is a bit too uncomfortable without a stylus.

------
arbie
How can I train an RNN to OCR _my_ scribbles? It would be the perfect mix of
physical paper and digital notes.

------
geophertz
I miss the time when you could use slide typing in Gboard to type multiple
words at once. This was so useful and let people like me, who are unable to
type quickly on a virtual (touch) keyboard, type very fast.

------
dvh
Isn't swiping inherently faster? With swiping you need 1 angle (corner) per
letter. A typical handwritten letter uses much more than 1 corner.

~~~
cbhl
Only if you're writing words in the dictionary, and you're using the English
alphabet.

Handwriting recognition is way more impactful for users writing in, say, Chinese.

------
dajohnson89
I just switched from Android to iPhone, and Gboard on iPhone doesn't have the
translation function. It also doesn't have multiple languages -- if I want to
switch languages I have to exit out of Gboard and use the default iOS
keyboard. Anyone know why these features for Gboard are missing on iOS?

~~~
colde
It does have multiple languages. Clicking the settings icon right between the
button to switch to numbers and the emoji button switches languages. You can
also hold it down to go to settings.

------
fxfan
I used it on Windows Phone 5 years back for Chinese. I wonder if this is new to
Android?

(on iPhone right now)

~~~
yorwba
Handwriting recognition has been available on Android for a while (especially
for Chinese). This article is "only" about incremental improvements.

