E.g., do I need to know Bayes' rule? Bayesian inference? Etc.
Not specifically iPython notebook versions of them.
>The key insight to this entire book is in the next paragraph. Read it carefully!
Okay. What is it!!!
>If we only take data from the measurement then the prediction will not affect the result. If we only take data from the prediction then the measurement will be ignored. If this is to work we need to take some kind of blend of the prediction and measurement (I've italicized the key point).
Unless I am missing something, this "key insight" is redundant and has essentially zero information content, apart from the assertion that to get a good result it needs to be a function of two variables, measurement and prediction. Not one. Not the other. But two variables, and that's what they are called, yessir.
>And the new uncertainty is predicted from the old uncertainty, with some additional uncertainty from the environment.
Crystal clear - great article, thanks!
I also recommend Ramsey Faragher's lecture notes on teaching the Kalman filter.
I don't know why people jump through hoops to start out with a multidimensional Kalman filter, and even sometimes dive deep into complex topics. The subject seems to suffer from the same terminology and confusion gap as pre-Kolmogorov statistics.
Is Kalman filtering computationally more efficient (obviously particle filtering is stochastic and so trades off accuracy for compute) or does it have some other advantage?
It's a rough heuristic that to achieve a certain accuracy for a linear/Gaussian system with a particle filter, you need a number of particles exponential in the number of dimensions of the system. I feel like this could probably be stated more formally and shown, but I don't think I've seen anything in that vein. The Kalman filter, being simply matrix operations, should scale as the number of dimensions cubed.
So yes, Kalman filtering is computationally more efficient, and (obviously) more accurate.
I also wouldn't discount the fact that the Kalman filter is, in a sense, simpler than the particle filter for a linear/Gaussian system; you don't need to worry about resampling or setting a good number of particles, and you don't need to compute estimates of the mean/covariance statistics (which are sufficient since the posterior should be a Gaussian).
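To make the "simply matrix operations" point concrete, here's a minimal sketch of one Kalman predict/update step with numpy. The model matrices F, H, Q, R are placeholders for whatever linear/Gaussian system you have, not from any particular application:

    import numpy as np

    def kalman_step(x, P, z, F, Q, H, R):
        # Predict: propagate the state and covariance through the linear model.
        x_pred = F @ x
        P_pred = F @ P @ F.T + Q
        # Update: blend prediction and measurement via the Kalman gain.
        y = z - H @ x_pred                    # innovation
        S = H @ P_pred @ H.T + R              # innovation covariance
        K = P_pred @ H.T @ np.linalg.inv(S)   # Kalman gain
        x_new = x_pred + K @ y
        P_new = (np.eye(len(x)) - K @ H) @ P_pred
        return x_new, P_new

Everything is a matrix multiply or an inverse, which is where the roughly cubic scaling in dimension comes from.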
Most models are not correct which is why particle filters perform much better than people expect.
Basically they're both implementations of a recursive Bayesian filter, but the Kalman filter requires very strong assumptions about the distribution (all Gaussian) and the particle filter requires none.
The Kalman filter is optimal for the Gaussian case (and is very efficient to calculate), whilst the particle filter can use more accurate distributions but is far less efficient to calculate.
You kinda use them in different places - a Kalman filter is useless for pedestrian dead reckoning (step made + estimate of direction), whilst a particle filter would be similarly dumb on submarines.
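For contrast, a bare-bones bootstrap particle filter step looks something like this. It's a 1D sketch with made-up Gaussian motion and measurement models just for illustration; the point is that the likelihood can be anything, not just Gaussian:

    import numpy as np

    def particle_filter_step(particles, weights, z, motion_std, meas_std):
        # Predict: push each particle through a (here trivial) motion model.
        particles = particles + np.random.normal(0.0, motion_std, size=particles.shape)
        # Update: reweight particles by the measurement likelihood (Gaussian here,
        # but any distribution works -- that's the flexibility being discussed).
        weights = weights * np.exp(-0.5 * ((z - particles) / meas_std) ** 2)
        weights = weights / weights.sum()
        # Resample when the effective sample size collapses.
        if 1.0 / np.sum(weights ** 2) < len(particles) / 2:
            idx = np.random.choice(len(particles), size=len(particles), p=weights)
            particles = particles[idx]
            weights = np.full(len(particles), 1.0 / len(particles))
        return particles, weights

Note the extra machinery (resampling, particle count) that the Kalman filter simply doesn't need.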
Also, for determining the position of your own robot the efficiency isn't a big deal. But if you're tracking a lot of objects in your environment then it becomes more important. And since you're looking at the objects and are tracking relative position the distribution is uni-modal too.
It's optimal for linear Gaussian systems and computation is polynomial in measurement dimensionality k and state dimensionality n:
O(k^2.376 + n^2)
(Yes I had to look that up.)
For nonlinear systems, we use the extended Kalman filter, which works by linearizing the prediction and measurement functions about the current mean estimate.
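As a rough illustration of that linearization, here's a scalar sketch with a made-up nonlinear model (f, h and their derivatives are arbitrary examples, not from any real system):

    import numpy as np

    # Nonlinear state transition f and measurement h, plus their derivatives,
    # evaluated at the current estimate -- the "linearization about the mean".
    f = lambda x: x + 0.1 * np.sin(x)
    F = lambda x: 1 + 0.1 * np.cos(x)   # df/dx, the Jacobian of f
    h = lambda x: x ** 2
    H = lambda x: 2 * x                 # dh/dx, the Jacobian of h

    def ekf_step(x, P, z, Q, R):
        x_pred = f(x)                     # predict through the full nonlinearity
        P_pred = F(x) * P * F(x) + Q      # but propagate uncertainty linearly
        K = P_pred * H(x_pred) / (H(x_pred) * P_pred * H(x_pred) + R)
        x_new = x_pred + K * (z - h(x_pred))
        P_new = (1 - K * H(x_pred)) * P_pred
        return x_new, P_new

The covariance only sees the first-order approximation, which is exactly why the EKF misbehaves when the nonlinearity is strong.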
I would recommend looking at the Unscented Kalman filter, which sucks a lot less.
When I was playing with different compass implementations in F-Droid I noticed that many of them use a Kalman filter to reduce the noise in the raw sensor data. Some of them (maybe only one) had problems near angle 0, where the sensor data jumps frequently between almost 2pi and slightly above 0. The problem is that the assumption that the measurement uncertainty is Gaussian breaks down badly there, since the distribution becomes half a Gaussian near 0 and the other half near 2pi. I don't know what the general approach to solving this is. I would solve it with either:
- Convert the angle into a unit vector and use that as the measurement input. Then predict the actual vector and use its orientation for the compass (a sketch of this follows below).
- Move the periodic window boundaries with a slow relaxation: if I hold my compass toward angle 0, all angle data is transformed into the [-pi, pi) range; if I hold it toward pi, the raw data is transformed into the [0, 2pi) range.
TL;DR: Be careful when applying a Kalman filter to angles (or more generally to R/Z).
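A minimal sketch of the first idea. The fixed-gain blend here is a stand-in for a full Kalman update, just to show how the unit-vector representation removes the 0/2pi seam:

    import numpy as np

    def filter_compass(raw_angles, alpha=0.1):
        # Track the heading as a unit vector so the 0 / 2*pi seam disappears.
        est = np.array([np.cos(raw_angles[0]), np.sin(raw_angles[0])])
        headings = []
        for a in raw_angles[1:]:
            z = np.array([np.cos(a), np.sin(a)])
            # Simple fixed-gain blend standing in for a full Kalman update.
            est = (1 - alpha) * est + alpha * z
            est /= np.linalg.norm(est)   # re-normalize to stay on the circle
            headings.append(np.arctan2(est[1], est[0]) % (2 * np.pi))
        return headings

Measurements at 0.01 and 6.27 radians now average to something near 0, as they should, instead of near pi.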
Kalman trained as an electrical engineer (MIT and Columbia) so that's the lingo he was used to. There's a brief note in his Wikipedia page about having trouble publishing the technique at first. I'd like to hear the story.
x(k) = h * x(k-1) + (1-h) * u(k)
To use the same language as the blog post: at time k-1 our best estimate of the state is x(k-1). Here we don't know anything about the system dynamics (though if we did, we could incorporate them as an FIR filter as well), so our projection for time k is simply x(k-1). Our measurement at time k is u(k). By the same blending criterion, we should combine these two sources of information in a ratio controlled by h. Without an adaptive algorithm, we appeal to our prior knowledge of the signal and noise statistics to fix h. The Kalman filter as presented in the blog, on the other hand, can estimate the covariance of the signal to determine the best h at any time, and therefore adapts to changes in the signal.
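To make that concrete, here's a sketch of the two scalar filters side by side. The process/measurement noise variances q and r are assumed inputs, not values from the post:

    def exp_smoother(u, h):
        # Fixed blend: x(k) = h*x(k-1) + (1-h)*u(k)
        x = u[0]
        out = []
        for z in u[1:]:
            x = h * x + (1 - h) * z
            out.append(x)
        return out

    def scalar_kalman(u, q, r):
        # Same recursion, but the blend (1 - h) becomes the adaptive gain k.
        x, p = u[0], 1.0
        out = []
        for z in u[1:]:
            p += q                  # predict: uncertainty grows
            k = p / (p + r)         # gain adapts to the current uncertainty
            x = x + k * (z - x)     # equivalent to (1-k)*x + k*z
            p *= (1 - k)            # update: uncertainty shrinks
            out.append(x)
        return out

With identity dynamics, the scalar Kalman filter is literally the exponential smoother with h replaced by a gain that the filter tunes itself.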
Nav applications are the ones you see most often; it would be interesting to see an example from a completely different domain.
I've been close to applications that estimate the state-of-health and state-of-charge of lithium-ion batteries. I myself worked on volatility estimation of financial returns using a particle filter (very similar principle, but usable with non-additive and/or non-Gaussian noise).
Here's some work on applications in Failure Prognosis: http://icsl.gatech.edu/aa/images/e/ed/Pfafdfp.pdf
Another big difference is that Viterbi works on a discrete state space (DNA base pairs, bits in a convolutional code, etc), while the Kalman filter works in a continuous space (e.g. position).
Optimal State Estimation by Dan Simon helped, too.
Little wonder that beginner physicists spend so much time mastering matrices and linear algebra.