
A tutorial on the free-energy framework for modelling perception and learning - eli_gottlieb
https://tmorville.github.io//finprojects/free-energy/
======
boltzmannbrain
For the probabilistic graphical model and belief prop perspective, check out
Friston et al. (2017) "The graphical brain: Belief propagation and active
inference":
[https://www.mitpressjournals.org/doi/pdf/10.1162/netn_a_00018](https://www.mitpressjournals.org/doi/pdf/10.1162/netn_a_00018)

For the neural corollaries of predictive coding, check out Shipp (2016)
"Neural Elements for Predictive Coding":
[https://www.frontiersin.org/articles/10.3389/fpsyg.2016.01792/full](https://www.frontiersin.org/articles/10.3389/fpsyg.2016.01792/full)

For a state-of-the-art CV framework that fits with the free energy principle,
check out the Recursive Cortical Network from George et al. (2017) "A
generative vision model that trains with high data efficiency and breaks
text-based CAPTCHAs":
[http://science.sciencemag.org/content/358/6368/eaag2612](http://science.sciencemag.org/content/358/6368/eaag2612)

~~~
rm2040
How does RCN model uncertainty?

~~~
boltzmannbrain
Max-product BP inference propagates local uncertainties in the model to arrive
at a globally coherent solution. An example of where this is particularly
useful is resolving border-ownership amongst neurons representing a common
contour.
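
Not RCN itself, but the mechanism described here can be illustrated on a toy chain model: each variable passes its locally best-scoring options to its neighbour, and backtracking recovers a globally consistent MAP assignment. A minimal sketch (the potentials and chain structure are made up for illustration):

```python
import numpy as np
from itertools import product

# Toy chain MRF x0 - x1 - x2 with binary states; potentials are arbitrary.
rng = np.random.default_rng(0)
unary = rng.random((3, 2))    # unary potentials psi_i(x_i)
pair = rng.random((2, 2, 2))  # pairwise potentials psi_{i,i+1}(x_i, x_{i+1})

def brute_force_map():
    """Enumerate all assignments and keep the highest-scoring one."""
    best, best_score = None, -1.0
    for x in product(range(2), repeat=3):
        s = np.prod([unary[i, x[i]] for i in range(3)])
        s *= pair[0, x[0], x[1]] * pair[1, x[1], x[2]]
        if s > best_score:
            best, best_score = x, s
    return best

def max_product_map():
    """Max-product message passing (Viterbi) along the chain."""
    msg = np.ones(2)   # incoming message to x0 (uniform)
    backptr = []
    for i in range(2):
        # scores[x_i, x_{i+1}]: best score of the chain up to x_i, extended
        scores = (msg * unary[i])[:, None] * pair[i]
        backptr.append(scores.argmax(axis=0))  # remember the best x_i per x_{i+1}
        msg = scores.max(axis=0)               # propagate local maxima forward
    # decode the last variable, then backtrack through the pointers
    x2 = int((msg * unary[2]).argmax())
    x1 = int(backptr[1][x2])
    x0 = int(backptr[0][x1])
    return (x0, x1, x2)
```

Each message carries only local maxima, yet the backtracked assignment matches the exhaustive search, which is the sense in which local uncertainty propagation yields a globally coherent solution.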

------
eli_gottlieb
Disclaimer: I am not the author of the page. I've read the tutorial paper they
refer to, and work with related material. In our lab, we've had a _very_ hard
time finding accessible material on predictive coding that's actually amenable
to equations and code, and so this is something of a godsend.

------
elcritch
Anybody familiar with this method? It looks very intriguing, as it seems to be
combining Bayesian posteriors with the neural activation function. I haven't
had time to dig into it myself, and it would be nice to hear an outside
perspective on it.

~~~
rm2040
A few years back I spent some time reading and following the equations in
Friston's papers, maybe understanding like 90% of it. You have to know
dynamical systems, differential equations, multivariate matrix stuff, etc.
Basically stuff physicists are good at. It seemed the theory wasn't detailed
enough to inspire the next deep NN revolution; basically, the form of the
generative model Friston used is very general. My impression was that the
Matlab examples required you to specify known quantities, like velocity or
position. But that requires a human to input them, unlike an NN, where you can
just point it at some data and it learns. Would love to be shown otherwise...

~~~
elcritch
Nice! I could finally put my physicist training to use... I skimmed the cited
papers, but they seemed very generalized, as you mentioned. I'll have to find
the Matlab code examples to understand how a complete system model would look.
The linked post seems to deal with just a single perceptron/activation
function, AFAICT. The embedded constants don't bother me too much: physics
models have embedded constants too, and I presume evolved neurobiological
systems would have been tuned over time to incorporate the appropriate
constants for dealing with useful quantities (force, mass, etc.). Could be fun
to port the Matlab code to Julia!

~~~
rm2040
OK, here's what I was referring to:

[https://www.fil.ion.ucl.ac.uk/spm/software/spm12/](https://www.fil.ion.ucl.ac.uk/spm/software/spm12/)

in this package there are (were?) some scripts for running dynamic expectation
maximization. Cheers

------
bra-ket
That’s not how the brain works

The proposed method is wasteful in terms of the energy spent to get an answer,
specifically in this step: 'sum the whole range of possible sizes', even with
approximations and clever algorithms.

Perception is much more economical as it’s done via memorized heuristics that
restrict the search space very quickly.

As a rule of thumb, if your method requires many iterations to converge on
some minimum, it's the wrong method to model perception. The brain doesn't
solve a mathematical optimization problem.

~~~
ReadEvalPost
> The proposed method is wasteful in terms of the energy spent to get an
> answer, specifically in this step: 'sum the whole range of possible sizes'

Er, the entire approach is motivated by the fact that computing p(u) is
intractable. That summation is explicitly not done in active inference...
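
To make this concrete, here is a rough sketch of the gradient-based inference scheme the tutorial describes, for its toy model with prior v ~ N(v_p, Σ_p) and likelihood u ~ N(g(v), Σ_u) with g(v) = v². The parameter values below are illustrative. No sum over all possible sizes is ever computed; a single point estimate φ just follows precision-weighted prediction errors:

```python
# Sketch of gradient-based inference for the tutorial's toy model.
# Rather than summing over all candidate sizes to normalize p(v|u),
# we ascend the (negative) free energy with respect to an estimate phi.
v_p, Sigma_p = 3.0, 1.0   # prior mean and variance (illustrative values)
u, Sigma_u = 2.0, 1.0     # observed light intensity and sensory noise

g = lambda v: v ** 2       # generative mapping from size to sensory input
g_prime = lambda v: 2 * v  # its derivative

phi, dt = v_p, 0.01        # start the estimate at the prior mean
for _ in range(5000):
    eps_p = (v_p - phi) / Sigma_p   # precision-weighted prior error
    eps_u = (u - g(phi)) / Sigma_u  # precision-weighted sensory error
    phi += dt * (eps_p + eps_u * g_prime(phi))

print(round(phi, 2))  # settles near the posterior mode, ~1.6
```

The update is purely local (current estimate, two prediction errors), which is the sense in which this family of models avoids the intractable normalization over p(u).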

------
tomjakubowski
> The non-linear function that relates size v to photosensory input u is
> assumed to be g(v)=v^2

I am having a hard time understanding this sentence - how does g(v) = v^2
relate the size v and the input u if the expression mentions only v?

Is it meant to be v = g(u) = u^2? Is it u = g(v) = v^2?

~~~
yorwba
"We assume that this signal [i.e. _u_ ] is normally distributed with mean
_g(v)_ and variance _Σ_v_."

That means that there's no deterministic function that fixes _u_ for a given
_v_ , but only a distribution of possible values. (It's closer to _u = v^2_
than the opposite, though.) The precise relationship is expressed symbolically
in the likelihood function given in the next part.
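
Put differently, for a fixed v the model draws u from a distribution centered on g(v), rather than setting u = g(v) exactly. A quick numerical sketch (the values of v and the variance here are mine, for illustration):

```python
import numpy as np

# For a fixed size v, the sensory input u is not determined by v:
# it is drawn from a normal distribution with mean g(v) = v^2.
rng = np.random.default_rng(42)
v = 2.0        # "true" size (illustrative)
Sigma = 1.0    # sensory noise variance
u_samples = rng.normal(loc=v ** 2, scale=np.sqrt(Sigma), size=100_000)

print(u_samples.mean())  # close to g(v) = 4, though individual samples scatter
```

So u ≈ v² only on average, which is why the likelihood is written as a distribution over u rather than an equation u = g(v).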

~~~
tomjakubowski
Thank you!

------
concernedstats
I don't think this post is correct. The application of Bayes' theorem mixes up
the variables v and u, in both the source code and the equations.

------
nickthemagicman
This is a pretty site. Anybody know how they did the code snippets?

~~~
kittiepryde
[https://mmistakes.github.io/minimal-mistakes/](https://mmistakes.github.io/minimal-mistakes/)

And looks like the content is in markdown.

~~~
nickthemagicman
Thank you!

------
tedmiston
Can anyone ELI5 what’s going on here for the layman?

