
Show HN: Neural Net Writes Fake Kanji in Your Web Browser, Stroke-By-Stroke - hardmaru
http://otoro.net/kanji/
======
tdeck
This is really interesting, and it reminds me of the Cangjie input method for
Chinese characters
[https://en.wikipedia.org/wiki/Cangjie_input_method](https://en.wikipedia.org/wiki/Cangjie_input_method).

It was designed to run within the disk+memory limitations of an Apple II where
they couldn't store a full character dictionary. Instead, it broke the
characters up into a few fundamental shapes and composition rules, and
generated the graphics on-the-fly. The encoding of a character was the set of
keystrokes that signified how to compose it, which meant you could "invent"
new characters by just typing random strings.

~~~
hardmaru
It's very interesting that you bring this up. Cangjie (倉頡) was the first
Chinese input method I learned, and I didn't know it was designed to run
within disk+memory limitations. To this day, I think it's still the best input
method, and it is the preferred choice for many professionals in the
publishing business in Taiwan.

For example, if I wanted to type the character 森, which means forest, I would
just type three trees: 木木木

The logical ordering also flows very well. For example:

door: 門 ＝ 日弓

question: 問 ＝ 門+口 ＝ 日弓口
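
A rough sketch of that composition idea in Python, using only the three codes mentioned above. The table and the helper names (`CANGJIE`, `decode`, `extensions`) are made up for illustration; a real Cangjie table is far larger and this is not how any actual input method is implemented.

```python
# Toy lookup table holding only the three codes mentioned in this thread.
CANGJIE = {
    "木木木": "森",  # forest
    "日弓": "門",    # door
    "日弓口": "問",  # question
}


def decode(keys):
    """Return the character for a key sequence, or None if the toy table lacks it."""
    return CANGJIE.get(keys)


def extensions(prefix):
    """List characters whose code starts with, but is longer than, the given prefix."""
    return [ch for code, ch in CANGJIE.items()
            if code.startswith(prefix) and code != prefix]


print(decode("木木木"))
print(extensions("日弓"))
```

The second call shows the point about ordering: the code for 問 is the code for 門 with one more key appended.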

What I find interesting about the LSTM+MDN neural network is that it has no
idea about the concept of radicals, and has to come up with this concept on
its own, and then with the even more abstract concept of combining radicals to
form a Kanji.
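
To give a feel for the MDN half, here is a minimal sketch of sampling pen movements from a mixture of Gaussians. It assumes the network outputs mixture weights, means, and standard deviations for the next (dx, dy) offset at each step; the numbers below are hypothetical stand-ins for those outputs, the components are simplified to axis-aligned Gaussians, and `sample_pen_offset` is an illustrative name, not code from the demo.

```python
import random


def sample_pen_offset(weights, means, stds, rng=random):
    """Draw one (dx, dy) offset from a mixture of axis-aligned 2D Gaussians."""
    # Pick a mixture component in proportion to its weight.
    k = rng.choices(range(len(weights)), weights=weights, k=1)[0]
    (mx, my), (sx, sy) = means[k], stds[k]
    # Sample each coordinate independently from the chosen component.
    return rng.gauss(mx, sx), rng.gauss(my, sy)


# Hypothetical mixture parameters; a trained network would emit new ones at every step.
weights = [0.7, 0.3]
means = [(1.0, 0.0), (0.0, -1.0)]
stds = [(0.1, 0.1), (0.2, 0.2)]

# Accumulate sampled offsets into a short stroke starting at the origin.
x, y = 0.0, 0.0
stroke = [(x, y)]
for _ in range(5):
    dx, dy = sample_pen_offset(weights, means, stds)
    x, y = x + dx, y + dy
    stroke.append((x, y))
```

Sampling from a mixture rather than predicting a single offset lets the same context lead to different plausible next strokes.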

I wonder if we can use some sort of sparse-encoder technique with neural nets
to make a more efficient version of Cangjie (which was designed using human
heuristics), like Dvorak vs. QWERTY.

------
hardmaru
A blog post on how this algorithm works:

[http://blog.otoro.net/2015/12/28/recurrent-net-dreams-up-fak...](http://blog.otoro.net/2015/12/28/recurrent-net-dreams-up-fake-chinese-characters-in-vector-format-with-tensorflow/)

