
Markov Chains Explained Visually (2014) - aogl
http://setosa.io/ev/markov-chains/
======
ryeguy_24
This is a phenomenal example of how to teach math. You can go through theory,
formulas and proofs all day long (yes, sometimes rigor is needed), but this
type of teaching is sticky. It takes a lot more time to display information in
the way the author has done, but it reaps massive benefits for those trying to
learn. In my mind, math is about concepts and there is no reason why we can't
start teaching math like this.

Amazing work. The collective intelligence of the world has just gone up.

Another amazing example of visual math lectures,
[https://www.youtube.com/channel/UCYO_jab_esuFRV4b17AJtAw](https://www.youtube.com/channel/UCYO_jab_esuFRV4b17AJtAw).

~~~
tralarpa
I don't know. Is there really a massive benefit? The visual explanation of
Markov chains on that website looks elegant and simple because... it is, i.e.
it's not really a contribution of the visualization; it's because DTMCs are
actually easy to understand. In a lecture, the teacher can easily simulate a
Markov chain on a blackboard ("Now we are in this state, now we have to roll
a die to decide whether we will go to state X or state Y"). Same for self-
learners: in a book, you can tell the reader to start at some state X, roll a
die, etc.
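
(A minimal sketch of that die-roll simulation in Python; the two-state chain
and its transition probabilities are made up for illustration:)

```python
import random

# Two states X and Y; P[s][t] is the probability of moving from s to t.
P = {"X": {"X": 0.5, "Y": 0.5},   # from X: fair coin
     "Y": {"X": 0.9, "Y": 0.1}}   # from Y: strongly pulled back to X

state = "X"
for _ in range(10):
    roll = random.random()        # the "die roll"
    state = "X" if roll < P[state]["X"] else "Y"
    print(state, end=" ")
print()
```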

Honestly, I find that the Markov chain website is an example of how you should
NOT do it. You are replacing the active intellectual process in the head of
the student with trivial animations, scrolling text, and jumping tokens that
look nice and dynamic, but... what are you learning?

Of course, this doesn't mean that visualizations are useless, but they should
be carefully designed with some learning goal in mind. Not easy! For example,
the visualization could slowly gray out states that are not visited. In that
way, the student could get a feeling for concepts like recurrence etc.

~~~
activatedgeek
I can see where you are coming from. But you may be missing the point?

> The visual explanation of the Markov chains on that web site looks elegant
> and simple because... it is

I strongly resonate with this statement. However, I disagree with the claim
that "You are replacing the active intellectual process". In fact, it is the
opposite: it is stimulating.

Let me present an anecdote which will perhaps resonate with many people.
Often, when venturing into a new topic, one tends to read the reference texts.
Let us be frank: reference texts are not beginner-friendly, no matter how much
they claim to be (they are references for a reason). To get to the meat of the
story, a reader has to hold a lot of context from past knowledge. This is
hard, and by the time the reader reaches the actual results, they are often
already exhausted. Intuitive visualizations like this one instead provide
context to a new reader. The next time the reader goes back to the reference
text, this "lookahead" (from the visualization) is valuable. It is easier to
internalize where a particular topic lies in the grand scheme of things.

> Of course, this doesn't mean that visualizations are useless, but they
> should be carefully designed with some learning goal in mind. Not easy! For
> example, the visualization could slowly gray out states that are not
> visited. In that way, the student could get a feeling for concepts like
> recurrence etc.

This I believe is nitpicking.

~~~
thanatropism
Your intellect can be "activated" in the pursuit of a superficial knowledge of
Markov chains, so you can tell your boss about them or remember in the future
that they might help you with some problem. EZ-learning stuff will help you
there.

Or your intellect can be "activated" in the sense that further engaging it
leads you to deeper, more general knowledge. Is this website doing that? I
don't believe so.

------
DoritoChef
This really drives home a sentiment I've acquired during the course of my
college education: CS/Math is often considered "hard", but I feel that's just
because we've struggled with getting good visual/verbal communicators to
dedicate their lives to CS/Math education. I really feel that when explained
properly (and the definition of "properly" sometimes needs to be adapted from
person to person), topics like Gibbs Sampling or Fourier Transforms or
Backpropagation aren't topics that should take entire weeks of self-study to
grasp in 2018. Yes, they require some math background, but there's some strong
intuition behind them. Maybe I'm just slow or thick in the skull.

------
jihadjihad
Discussion from 4 years ago:
[https://news.ycombinator.com/item?id=8103240](https://news.ycombinator.com/item?id=8103240)

~~~
ddeck
and 2 years ago:
[https://news.ycombinator.com/item?id=11323122](https://news.ycombinator.com/item?id=11323122)

~~~
wlll
Weirdly today too. What a coincidence!
[https://news.ycombinator.com/item?id=17766358](https://news.ycombinator.com/item?id=17766358)

------
lewis500
keeps getting posted and going to the top, almost like HN is a memoryless
process... :)

glad y'all are still enjoying it.

~~~
lillesvin
I'm certainly glad it was posted this particular time. I've never seen it
before and now I learned something today. :)

~~~
xenihn
I'd also never seen it and find it very useful.

~~~
2020-3030
Confirms what I expected: not new in the world, but new to many/most people.

------
uoaei
And Hidden Markov Models (HMMs) are just ones where the symbols are emitted
when traversing an edge of the graph rather than at a node. The nodes are
still the "states" of the system, but in general you can't know which state
the model is in just by observing the symbols emitted by the edges, as there
may be equivalent structures in the graph associated with different states.
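
(A minimal sketch of such an edge-emitting HMM in Python; the two-state chain,
its probabilities, and the symbols are all made up for illustration:)

```python
import random

# Hidden states A and B; each transition (edge) emits a symbol.
transitions = {
    "A": [("A", 0.7), ("B", 0.3)],
    "B": [("A", 0.4), ("B", 0.6)],
}
edge_symbols = {
    ("A", "A"): "x",
    ("A", "B"): "y",
    ("B", "A"): "y",  # emits the same symbol as ("A", "B"), so the
    ("B", "B"): "x",  # observer can't recover the hidden state path
}

def simulate(steps, state="A"):
    observed = []
    for _ in range(steps):
        next_states, probs = zip(*transitions[state])
        nxt = random.choices(next_states, weights=probs)[0]
        observed.append(edge_symbols[(state, nxt)])
        state = nxt
    return "".join(observed)

print(simulate(20))
```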

------
ejlangev
Really great visualization here, kudos to the author. It really makes it clear
what's going on. It made me think that things like this could be useful in
textbooks as well if they were more digital. Maybe in the next few years there
will be a better way to integrate this kind of stuff into courses; I guess
today it could always happen in lectures.

~~~
joshuamorton
Take a look at distill.pub (which was also featured on the HN front page
today). It's essentially a machine learning journal with a focus on clarity
and visualization, and is therefore interactive and visual. Colah is around on
HN and might comment to explain it better.

------
rtkwe
Is there a standard way to build Markov chains with deeper memory than just
the current state? For example, in the rainy/sunny Markov chain, the rainy
probability should/could decrease as the number of rainy days in a row
increases. In a pure state machine this would require an infinite number of
states for SS...SSS and RR...RRR.

~~~
ColinWright
Lots of other commenters are saying that the defining property of MCs is that
they are memoryless, but in fact it's trivial to make an MC have deeper
memory: use tuples.

So if you have observations A, B, C, B, C, A, D, A, ... then you form pairs:
(A,B), (B,C), (C,B), (B,C), (C,A), (A,D), (D,A), ...

Each node in the MC is one of these pairs, and (X,Y) -> (Y,Z) if Z is the new
state that appears.

The number of states grows _very_ quickly, but if you make the system dynamic
then it works fine. This is what you use to randomly generate text using the
previous 3 words as a predictor for the next. Then (w0,w1,w2) predicts w3, and
your next "state" is (w1,w2,w3).

And so on.
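
(A minimal sketch of this tuple trick in Python; the tiny corpus and the
`build_chain`/`generate` names are placeholders, not from the comment above:)

```python
import random
from collections import defaultdict

# Map each tuple of the previous k words to the words observed after it,
# then walk the table: (w0, ..., w_{k-1}) predicts w_k, and the next
# "state" is (w1, ..., w_k).
def build_chain(words, k=2):
    chain = defaultdict(list)
    for i in range(len(words) - k):
        chain[tuple(words[i:i + k])].append(words[i + k])
    return chain

def generate(chain, seed, n=20):
    state = tuple(seed)
    out = list(state)
    for _ in range(n):
        candidates = chain.get(state)
        if not candidates:
            break                       # dead end: no observed successor
        nxt = random.choice(candidates)
        out.append(nxt)
        state = state[1:] + (nxt,)      # slide the window forward
    return " ".join(out)

words = "the cat sat on the mat and the cat ran off the mat".split()
chain = build_chain(words, k=2)
print(generate(chain, seed=words[:2]))
```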

~~~
ubershmekel
Is this equivalent to each node representing a memory state?

Seems like it would not solve rtkwe's concern. In their example, would using
tuples make a node state equal (S, COUNT_OF_DAYS) or (R, COUNT_OF_DAYS)? You
would still need an infinite number of nodes to represent an
infinite-precision integer state.

~~~
ColinWright
You don't store a count of days; each node is the sequence of the previous
_k_ states. Usually _k=1,_ but set _k=2_ and you get something where each
state is predicted based on the previous 2 states.

So no, you don't need an infinite amount of memory; it's all still finite
(although it rapidly gets large).

------
anonytrary
This is how we were taught it in our Linear 2 class; the professor used static
pictures via chalkboard, but it was still super helpful. I feel like most of
the benefit is gained in drawing the graph and talking about "hopping between
states". Animating it might help some people who think more mechanically.

------
ajeet_dhaliwal
This is truly awesome; the visualization really helped cement the idea in my
mind.

------
geoffreyhale
@lewis500 is your code for the live visuals publicly available?

~~~
vicapow
yep! see [https://github.com/vicapow/explained-visually](https://github.com/vicapow/explained-visually)

------
s-shellfish
This is a very good depiction of a Markov chain. However, it's always a coin
flip on your emotions to anthropomorphize math.

Pronouns.

------
matachuan
Expand it a little and it becomes a probabilistic graphical model.

------
0xffff2
Am I the only one who finds it absolutely impossible to read text with
constantly moving animations on the screen?

~~~
Raphmedia
You are not alone. That's why there is even a media query for it!
[https://css-tricks.com/introduction-reduced-motion-media-query/](https://css-tricks.com/introduction-reduced-motion-media-query/)

------
Giho
Isn't this affected by the fact that it's difficult to get true randomness
from computers? It's more pseudo-random, and therefore the chance is not real
chance but predictable.

~~~
Sohcahtoa82
While true, there are cryptographically secure PRNGs that are incredibly
difficult (theoretically impossible) to predict.

In any case, unless you're using Markov chains to generate data that needs to
be kept secure (like generating a password that can be pronounced), the
predictability of random numbers isn't really relevant.
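
(A minimal sketch of that distinction in Python: the standard `random` module
is a seedable Mersenne Twister PRNG, while `random.SystemRandom` draws from
the OS entropy pool. The states and weights below are made up for
illustration:)

```python
import random

# Sampling the next state of a chain with an ordinary PRNG vs. an
# OS-entropy-backed generator. For a visualization like this one,
# the plain (predictable, reproducible) PRNG is more than enough.
states, weights = ["sunny", "rainy"], [0.9, 0.1]

prng = random.Random(42)        # seeded Mersenne Twister: reproducible
csprng = random.SystemRandom()  # backed by os.urandom: not predictable

print(prng.choices(states, weights=weights)[0])
print(csprng.choices(states, weights=weights)[0])
```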

