
Kalman Filter via a Simple and Intuitive Derivation [pdf] - tim_sw
https://www.cl.cam.ac.uk/~rmf25/papers/Understanding%20the%20Basis%20of%20the%20Kalman%20Filter.pdf
======
Tarrosion
Maybe it's just because I'm not the target audience for this paper, but I'm
finding this very tough going. I'm a PhD student in a mathematical field
(operations research) but have only the faintest idea about Kalman filters -
something about updating beliefs based on noisy measurements in a way that
feels intuitively similar to Bayes' Rule. Fair enough, I'm not a tutor
intending to teach Kalman filters to anyone in the near future, but I should
still have the mathematical background to get through this. Not sure why it feels like a
slog. Maybe it's because the measurement and prediction and update equations
appear before any intuition about what a Kalman filter is, what the stages of
the algorithm are, etc.?

Edit: all the way through now. Certainly this would be much more useful as an
intro to Kalman filters (rather than an introduction to introducing Kalman
filters) had some intuition been given.

~~~
jtolmar
Here's the intuitive explanation I usually use.

Sequential Bayesian Filtering is how you apply repeated evidence to a moving
target. There are three steps:

1. Predict: Using some Markov process, move your prior distribution forward
in time so it's compatible with your new evidence. (Intuitively, everything
becomes less certain as it's free to move around. Mathematically, doing this
with continuous probabilities tends to mean an incredibly gross integral.)

2. Update: Using Bayes' Rule, update your probabilities with the new
evidence. (Intuitively, this bunches the distribution back up. If the
predict/update don't vary in time/quality, this tends to asymptotically reach
some sort of balance. Mathematically, this tends to also be gross.)

3. Notreallyastep: Recycle your results as the priors in step 1 next time.
(Note this means your result needs to be in the same format as your old priors
if you don't want to re-solve all the math every update.)

If you get around the gross math by doing everything in finite space and brute
forcing it (integrals become summations), you get a hidden Markov model.

If you get around the gross math by doing a Monte-Carlo approximation, you get
a particle filter.

If you assume your priors are normal, your evidence is normal, and your update
function fits in a matrix multiplication, then you're in luck: all of the math
works out so your result is also normal. That's a Kalman Filter.
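
To make the three steps concrete, here's a minimal 1-D sketch where every
belief is a normal (mean, variance); the constants q, r and the measurements
are made up purely for illustration:

```python
# Minimal 1-D Kalman filter: every belief is a normal (mean, variance),
# so predict and update both stay in closed form.

def predict(mean, var, q):
    # Step 1: a random-walk motion model; the prior widens by q.
    return mean, var + q

def update(mean, var, z, r):
    # Step 2: Bayes' rule with a normal likelihood of variance r.
    k = var / (var + r)                      # Kalman gain
    return mean + k * (z - mean), (1 - k) * var

mean, var = 0.0, 1.0                         # prior
for z in [1.2, 0.9, 1.1]:                    # noisy measurements
    mean, var = predict(mean, var, q=0.1)    # step 1: widen
    mean, var = update(mean, var, z, r=0.5)  # step 2: bunch back up
    # step 3: (mean, var) is a normal again, ready to be the next prior
```

The posterior being normal again is exactly the closure property that makes
step 3 free.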

~~~
goodbyegti
Further to this, here's a very simple application of a Kalman filter which
plots a graph (in Python):

[https://github.com/dougszumski/KalmanFilter](https://github.com/dougszumski/KalmanFilter)

------
kolbe
I've worked with all the different types of Kalman Filters over the past
decade, and one lesson I've learned is that there is no way to make it simple
and intuitive. It requires extensive background knowledge in math and stats,
and even then, it's very difficult for intelligent people to keep track of all
the moving parts. Papers like this one are an exercise in futility.

~~~
akssri
KF is essentially a symbolic unrolling of an LU decomposition. There is a
graphical algorithm which makes all the minutiae trivial [1].

[1]
[https://arxiv.org/pdf/1508.00952.pdf](https://arxiv.org/pdf/1508.00952.pdf)

~~~
akssri
> "'We have a new theorem--that mathematicians can only prove trivial
> theorems, because every theorem that's proved is trivial.'" \- Richard
> Feynman

Sigh.

KF is essentially solving a QP with equality constraints (Boyd's course is a
good place for details), which can be solved exactly with a single
decomposition of the KKT system - picking an ordering is all that matters for
complexity.

This is essentially the principle on which all of sparse linear algebra and
graphical models work. There is nothing special about the structure of KF, nor
in LQR, nor in their non-linear generalizations.

One can symbolically unroll Schur complements multiple times to make block-LU
appear opaque and sophisticated, but it really is not (this of course is not
to say it is done deliberately). KF can also be derived from the Bayes'
network model, but extending this to non-linear forms like the EKF, and to
things which are not first-order, becomes rather troublesome (or impossible).

I'd have appreciated a post asking for details rather than infantile derision.

~~~
smallnamespace
I'm sorry that you can't take a joke. Personally, I found it funny that you're
blessed with the requisite mathematical knowledge and perspective to be able
to view the details as 'trivial minutiae'.

I did spend 5 minutes perusing the paper you linked, and couldn't make heads
or tails of it. For myself and all other mere mortals unfamiliar with this
mathematical machinery, I can assure you that details are far from trivial.

Thanks for taking your time to explain.

~~~
RBerenguel
Well, it's a mathematical paper. I skimmed it, and although it's not my field
(well, I wonder what my field is now that I work as a machine learning
engineer, but it used to be dynamical systems in Banach spaces), it seems
pretty much readable given a mathematical research background.

I can assure you, this is way more readable than many other papers. I can
totally understand the "minutiae" comment there.

------
kqr2
Another classic paper is _The Poor Man's Explanation of Kalman Filtering_:
[http://digi.physic.ut.ee/mw/images/d/d5/Poormankalman.pdf](http://digi.physic.ut.ee/mw/images/d/d5/Poormankalman.pdf)

------
murbard2
Understanding what the Kalman filter does and how it does it can be intuitive.
Understanding that it can all be done efficiently with matrix operations: also
intuitive. Getting a good intuition for the specific update equations? Still
beats me. I'm a quant/statistician, I focus on filtering. I've implemented
this a dozen times and I still have to look up these equations every time.
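
For what it's worth, here are the equations in question written out in NumPy,
assuming the standard textbook notation (F dynamics, H measurement, Q and R
noise covariances); a reference sketch, not production code:

```python
import numpy as np

def kf_step(x, P, z, F, H, Q, R):
    # Predict: push the state estimate and covariance through the dynamics.
    x = F @ x
    P = F @ P @ F.T + Q
    # Update: fold in the measurement z.
    S = H @ P @ H.T + R                 # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)      # Kalman gain
    x = x + K @ (z - H @ x)
    P = (np.eye(len(x)) - K @ H) @ P
    return x, P
```

Even laid out this plainly, the gain expression is the part I always have to
look up.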

~~~
soVeryTired
100% agree. One useful exercise (once you understand all the algebra) is to
show that when your measurement noise is extremely small, your estimate is
just your current observation. Then show that when your dynamics noise is
small, your estimate is just your prior prediction.

Easy enough in one dimension, surprisingly hard in several dimensions.

Even after all that, could I explain what the Kalman gain is to a ten year
old? Not a chance.
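
In one dimension the exercise really is only a few lines; a sketch with
made-up numbers, where r is the measurement-noise variance:

```python
def kalman_update(prior_mean, prior_var, z, r):
    # Scalar Bayes update; r is the measurement-noise variance.
    k = prior_var / (prior_var + r)     # Kalman gain
    return prior_mean + k * (z - prior_mean)

# Tiny measurement noise: the gain -> 1, estimate ~ the observation.
print(kalman_update(0.0, 1.0, z=3.0, r=1e-9))   # ~3.0
# Tiny prior variance (quiet dynamics): gain -> 0, estimate ~ the prior.
print(kalman_update(0.0, 1e-9, z=3.0, r=1.0))   # ~0.0
```

In several dimensions the same statement is about the gain matrix, which is
where it gets fiddly.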

~~~
murbard2
Bonus points, give an intuitive explanation for why it is the dual of an LQR
control problem.

------
abeinstein
My favorite explanation of Kalman filters is this tutorial in Python:
[https://github.com/rlabbe/Kalman-and-Bayesian-Filters-in-Python](https://github.com/rlabbe/Kalman-and-Bayesian-Filters-in-Python)

It's geared towards software engineers and doesn't assume much math
background. Highly recommend.

------
RandomOpinion
My preferred explanation for Kalman filters is sec. 1.5 of Maybeck's
_Stochastic Models, Estimation, and Control_.

[https://www.cs.unc.edu/~welch/kalman/media/pdf/maybeck_ch1.pdf](https://www.cs.unc.edu/~welch/kalman/media/pdf/maybeck_ch1.pdf)

------
justinvoss
There's another explanation of Kalman filters here:
[http://www.bzarg.com/p/how-a-kalman-filter-works-in-pictures/](http://www.bzarg.com/p/how-a-kalman-filter-works-in-pictures/)

------
chewxy
I echo the sentiments of kolbe. There really is no way to make a Kalman filter
simple or intuitive. What I have found helps, though, is to write one yourself
based on the math before using the libraries you find.

Write one, print out every intermediate value to see how the matrices change.
It will be not-quite-correct, but it will give you insights into how exactly a
Kalman filter works.
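
Along those lines, a toy 1-D version that prints its intermediates at each
step; q, r, and the measurements are arbitrary, and the thing to watch is the
gain and variance settling toward a steady state:

```python
# Toy 1-D Kalman filter that prints its intermediates at each step.
q, r = 0.1, 0.5              # process and measurement noise (arbitrary)
mean, var = 0.0, 1.0         # prior
for step, z in enumerate([1.0, 1.3, 0.8, 1.1, 0.9]):
    var += q                 # predict: the variance widens
    k = var / (var + r)      # Kalman gain
    mean += k * (z - mean)   # update: pull toward the measurement
    var *= (1 - k)           # update: the variance shrinks
    print(f"step {step}: gain={k:.3f} mean={mean:.3f} var={var:.3f}")
```

Watching the gain converge is the quickest way I know to see the
asymptotic balance mentioned upthread.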

------
RangerScience
I've encountered a few different attempts at "simple explanations" of the
Kalman filter, but what I really want is a "simple explanation of
_implementing_ a Kalman filter".

Anyone know of attempts at that? (Or applying one to a real situation?)

~~~
akssri
Perhaps this will help? [https://akssri.github.io/blog/Kalman-Filtering.html](https://akssri.github.io/blog/Kalman-Filtering.html)

