
Markov Chain Text Generation - ingve
https://blog.demofox.org/2019/05/11/markov-chain-text-generation/
======
parabiii
You can use Google search results as a Markov chain (with the ranks translated
to probabilities). Then your chain is fitted to all of the visible web.

Another cool ML use is for categorical and textual data. A fitted Markov
chain can give the probability of a string occurring. Strings that are more
random (such as spammy text) get a low probability. Strings that are similar
(such as certain user agents) get similar probabilities, without having to
compare them directly.
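
A minimal sketch of that scoring idea (my code, not from the article; the
256-symbol alphabet and Laplace smoothing are assumptions I picked):

    #include <cmath>
    #include <cstdio>
    #include <map>
    #include <string>

    // Character-bigram Markov model. Score() returns the average
    // log-probability per transition, so more "random" strings score lower.
    struct CharModel {
        std::map<std::pair<char, char>, double> counts;
        std::map<char, double> totals;

        void Fit(const std::string& text) {
            for (size_t i = 0; i + 1 < text.size(); ++i) {
                counts[{text[i], text[i + 1]}] += 1.0;
                totals[text[i]] += 1.0;
            }
        }

        double Score(const std::string& text) const {
            double logp = 0.0;
            size_t n = 0;
            for (size_t i = 0; i + 1 < text.size(); ++i, ++n) {
                auto c = counts.find({text[i], text[i + 1]});
                auto t = totals.find(text[i]);
                // Laplace smoothing: unseen transitions get a small
                // probability instead of zeroing out the whole score.
                double num = 1.0 + (c != counts.end() ? c->second : 0.0);
                double den = 256.0 + (t != totals.end() ? t->second : 0.0);
                logp += std::log(num / den);
            }
            return n ? logp / n : -INFINITY;
        }
    };

    int main() {
        CharModel m;
        m.Fit("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36");
        printf("%f\n", m.Score("Mozilla/5.0 (Macintosh; Intel Mac OS X)"));
        printf("%f\n", m.Score("xK9!qZ#v@7Lm$pR2")); // spammy: lower score
    }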

------
monokai_nl
Years ago I used Twitter streams as input for Markov chain text generation.
It’s interesting to recognize your own writing style in what is essentially
gibberish:
[http://yes.thatcan.be/my/next/tweet](http://yes.thatcan.be/my/next/tweet)

~~~
globuous
Really good work!! I had a great laugh reading the generated text from Trump
and Obama ^^

------
mnagel
I did something similar at university and (amongst other things) ran it over
(simple) music scores in Lilypond format. The resulting texts could be
converted to MIDI and played by a synthesizer. It was quite interesting
to listen to those. The code is still online at
[https://github.com/mnagel/markov](https://github.com/mnagel/markov)

------
nafizh
Considering RNNs have completely taken over text generation with language
models, are there any papers that comprehensively compare the power of RNNs
and Markov models?

~~~
xapata
In a sense, they're equivalent. RNNs create models of probabilistic state
transitions. In other words, they create Markov chain models. You might argue
that a Markov chain doesn't have memory, but it's easy to incorporate "memory"
by expanding the definition of a "state" to include recent history, or any
function of the past.
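
Concretely (a toy sketch of that state-expansion trick, mine and not the
commenter's): a first-order chain whose "state" is the last two characters
behaves exactly like a second-order chain:

    #include <cstdio>
    #include <map>
    #include <random>
    #include <string>
    #include <vector>

    int main() {
        std::string corpus = "the theme of the thesis is there in the theory";

        // "Memory" folded into the state: key on the last two characters.
        std::map<std::string, std::vector<char>> next;
        for (size_t i = 0; i + 2 < corpus.size(); ++i)
            next[corpus.substr(i, 2)].push_back(corpus[i + 2]);

        std::mt19937 rng(1234);
        std::string out = "th"; // seed state
        for (int i = 0; i < 40; ++i) {
            auto it = next.find(out.substr(out.size() - 2));
            if (it == next.end()) break; // dead end: no observed successor
            std::uniform_int_distribution<size_t> pick(0, it->second.size() - 1);
            out += it->second[pick(rng)];
        }
        printf("%s\n", out.c_str());
    }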

~~~
raverbashing
No, they're not. And if it were "that easy" to incorporate memory, RNNs would
not have replaced Markov chains.

As an example, try to build a character-level Markov chain for text generation
and see how that goes.

~~~
aesthesia
The point is that an RNN or other language model describes a stochastic
process which can be seen as a Markov process with state space given by the
internal state of the RNN. This doesn't say anything about the ease of
implementing or learning such a model. Just that from a purely mathematical
perspective, they are equivalent in power.
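
Spelled out (my notation, not the commenter's): the hidden-state update is
what makes the process Markov by construction,

    h_t = f(h_{t-1}, x_t), \qquad
    \Pr(h_{t+1} \mid h_0, \dots, h_t) = \Pr(h_{t+1} \mid h_t)

only the state space is now the continuous space of hidden vectors rather
than a finite set of symbols, which is why this is equivalence in expressive
power rather than in practice.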

~~~
raverbashing
Ah I see, I agree with their equivalence in the mathematical sense.

~~~
srean
I'm curious what other sense of 'equivalence' you had in mind when you
stated with great authority that RNNs are not Markovian.

~~~
raverbashing
In the practical sense, implementing an RNN is easier than a Markov chain
(libraries, etc.).

Also, RNNs can "evolve" (or be adapted) more easily into other architectures
like LSTMs.

~~~
mratsim
If you try implementing an RNN from scratch on the same footing as the OP's
N-order Markov chain, without NumPy or a deep-learning framework, I guarantee
you will not find it so easy.

For reference, his code is 400 lines and this is everything he includes:

    #include <stdio.h>
    #include <vector>
    #include <string>
    #include <unordered_map>
    #include <random>
    #include <algorithm>
    #include <array>

We can assume a <math.h> include would also be reasonable for the comparison.

To implement RNNs for text you need to implement 3-dimensional
tensors/ndarrays with slicing and all that jazz, efficient matrix
multiplication, the RNN forward pass itself, and the RNN backprop done
properly.

RNNs stand on the shoulders of giants.
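
For a sense of scale, here is just the forward step of a vanilla RNN with no
libraries (a sketch with toy sizes I made up; training, backprop through
time, and sampling would all sit on top of this):

    #include <cmath>
    #include <cstdio>
    #include <vector>

    // One step of h' = tanh(Wxh*x + Whh*h + b), with row-major matrices.
    std::vector<float> Step(const std::vector<float>& Wxh,
                            const std::vector<float>& Whh,
                            const std::vector<float>& b,
                            const std::vector<float>& x,
                            const std::vector<float>& h) {
        size_t H = h.size(), X = x.size();
        std::vector<float> out(H);
        for (size_t i = 0; i < H; ++i) {
            float acc = b[i];
            for (size_t j = 0; j < X; ++j) acc += Wxh[i * X + j] * x[j];
            for (size_t j = 0; j < H; ++j) acc += Whh[i * H + j] * h[j];
            out[i] = std::tanh(acc);
        }
        return out;
    }

    int main() {
        // Toy sizes: 3-dim input (say, a one-hot character), 2-dim hidden.
        std::vector<float> Wxh(2 * 3, 0.1f), Whh(2 * 2, 0.1f), b(2, 0.0f);
        std::vector<float> x = {1, 0, 0}, h = {0, 0};
        h = Step(Wxh, Whh, b, x, h);
        printf("%f %f\n", h[0], h[1]);
    }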

------
herval
I have a Twitter bot that reads text from random Project Gutenberg books and
spits out book passages:
[https://twitter.com/markovian_lit](https://twitter.com/markovian_lit)

In general, it’s a bit all over the place and convoluted, but sometimes it
generates true gems. People find it via hashtags, and some folks even interact
before realizing it’s a silly bot.

------
menixator
There's /r/SubredditSimulator/ on Reddit, which uses Markov chains. Never
fails to entertain me.

------
wiggler00m
Ask HN: Markov Chain book recommendations?

~~~
tastroder
Bayesian Reasoning and Machine Learning by Barber (2012) has nice introductory
texts on most of the mechanisms at work, IIRC, if you're looking for a
lecture-oriented resource.

