

The Unreasonable Effectiveness of Character-Level Language Models - woodson
http://nbviewer.ipython.org/gist/yoavg/d76121dfde2618422139

======
bcoates
This is exactly the letter-based version of the "Dissociated Press" Markov-
chain algorithm, right?

I'd suspect some of the perceived quality at higher orders (particularly for
the source-code example) is just coming from the transition graph becoming
sparse and deterministically repeating long stretches of the input text
verbatim.
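For illustration, here is a toy character-level Markov chain (my own sketch, not code from the article) that shows why this happens: at a high order on a small corpus, nearly every history has only one observed successor, so sampling just replays long stretches of the training text.

```python
import random
from collections import defaultdict

def train(text, order):
    """Map each length-`order` history to the characters observed after it."""
    model = defaultdict(list)
    for i in range(len(text) - order):
        model[text[i:i + order]].append(text[i + order])
    return model

def generate(model, order, length, seed=None):
    """Walk the chain, sampling the next character from each history's successors."""
    random.seed(seed)
    history = random.choice(list(model.keys()))
    out = history
    for _ in range(length):
        followers = model.get(history)
        if not followers:  # dead end: this history only appeared at the end of the text
            break
        out += random.choice(followers)
        history = out[-order:]
    return out

corpus = "the cat sat on the mat. the cat ate the rat. " * 3
model = train(corpus, order=10)
# At order 10, almost every 10-character history in this tiny corpus has a
# single successor, so the output is mostly verbatim runs of the input.
print(generate(model, order=10, length=80))
```

By construction, every `order + 1`-character window of the output is a substring of the training text; novelty can only appear where a history was seen with more than one successor.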

~~~
unhammer
Goldberg says parenthesis-balancing, indentation, etc. would require lots of
non-trivial human reasoning to implement, but I found
[https://news.ycombinator.com/item?id=9585080](https://news.ycombinator.com/item?id=9585080)
's example even more amazing: the NN actually learns meter and parts of speech.
I find it very unlikely that you could get a _character-based_ Markov
chain to generate completely novel, grammatically correct constructions (with
correct meter!) from such a small corpus, since you always have to weigh
correctness (high order) against the ability to generalise (low order).

~~~
SilasX
That same article (that you linked) showed how the RNN learned to balance
parentheses.

~~~
unhammer
? I tried linking to a comment …

EDIT: Oh, I misunderstood. I know the RNN can do the parens-balancing; that's
why Goldberg said the parens-balancing was impressive, since with his method
you'd need to add other hacks around it.

------
JoachimSchipper
HN discussion of the blog post this is a response to (890 points, 204
comments):
[https://news.ycombinator.com/item?id=9584325](https://news.ycombinator.com/item?id=9584325).

------
rectangletangle
I did a project that used a similar technique, but mapping only a single state
transition. For its simplicity, it was very effective.

[https://github.com/rectangletangle/atypical](https://github.com/rectangletangle/atypical)

------
flipp3r
I also did a project that used a similar technique: it scrapes text from
4chan and maps each sequence of N words to the possible next words.

[https://github.com/skphilipp/humanity](https://github.com/skphilipp/humanity)
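The core idea fits in a few lines. Here's a sketch of the general word-level technique (not the humanity repo's actual code): map each tuple of N consecutive words to the words observed after it, then walk the chain.

```python
import random
from collections import defaultdict

def build_chain(text, n=2):
    """Map each tuple of n consecutive words to the words seen after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - n):
        chain[tuple(words[i:i + n])].append(words[i + n])
    return chain

def babble(chain, n=2, max_words=30, seed=None):
    """Start from a random n-word state and follow observed transitions."""
    random.seed(seed)
    out = list(random.choice(list(chain.keys())))
    for _ in range(max_words):
        followers = chain.get(tuple(out[-n:]))
        if not followers:  # state only occurred at the very end of the text
            break
        out.append(random.choice(followers))
    return " ".join(out)

chain = build_chain("the cat sat on the mat the cat ate the rat", n=2)
print(babble(chain, n=2, max_words=12))
```

Larger N makes the output more grammatical but, as noted above for the character case, increasingly just replays the source text.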

~~~
transpy
Why 4chan?

~~~
flipp3r
I browse it myself; it has a really simple API, and the content refreshes
quite fast on active boards.

------
wodenokoto
People have been working for ages trying to generate passable text using
n-grams over whole words. It is quite surprising to see that all along we
could have done better with a simpler model!

------
transpy
Yes, I guess computers are better than us at deciding how to organize
characters in words, judging by the author's typos: 'liklihood',
'Mathematiacally', 'langauge', 'immitate', 'somehwat', 'characer', 'commma',
'Shakespear', 'characteters', 'Shakepearan'.

