How a Kalman filter works, in pictures (2015) | 213 points by panic on Jan 21, 2017 | 32 comments

 For anyone who read this blog post and thought "Wow, that's interesting, but I don't understand half of this. I need a whole YouTube series to explain this to me." Well, we're in luck: https://www.youtube.com/watch?v=CaCcOwJPytQ. It's a great set of 55 video lectures on Kalman filtering. (I'm only on number 5, but so far they've been great.)
 Wow, this lecturer is great! I think this is the most intuitive explanation of Kalman filters I've seen yet. I've seen other lecturers try to quickly jump into state space and lose students with big matrices. I would have paid good money for this if I wasn't already familiar with the material. The clarity is up there with The Essence of Linear Algebra: https://www.youtube.com/watch?v=kjBOesZCoqc. Trading math videos on the Internet -- definitely nerds.
 Unfortunately those lectures are not complete.
 Well there are at least 42 videos, so that's fairly complete. Nearly a semester already...
 Good work putting the colour highlighting on the formulae. It does make them easier to follow for someone who is not a complete wizard with algebra. Without this little formatting touch I would find the article to be mostly a sea of symbols that I would likely skim over and still not properly understand.
 I wish more math papers did this. It's a nightmare for someone who doesn't read equations all day for a living.
 For those familiar with least squares estimation, there's a good answer here that relates it to the Kalman filter: http://dsp.stackexchange.com/a/2398. It's a little simpler to derive the least squares smoothing function: http://stats.stackexchange.com/a/138342
 Pardon my ignorance, I'm just wondering about some context, since the Kalman filter was invented in the 60s. Are Kalman filters still highly relevant, or are they (in practice and/or in theory) obsoleted by other techniques, such as general ML?
 -- Kalman Filters vs HMM (Hidden Markov Model):

"In both models, there's an unobserved state that changes over time according to relatively simple rules, and you get indirect information about that state every so often. In Kalman filters, you assume the unobserved state is Gaussian-ish and it moves continuously according to linear-ish dynamics (depending on which flavor of Kalman filter is being used). In HMMs, you assume the hidden state is one of a few classes, and the movement among these states uses a discrete Markov chain. In my experience, the algorithms are often pretty different for these two cases, but the underlying idea is very similar." - THISISDAVE

-- HMM vs LSTM/RNN:

"Some state-of-the-art industrial speech recognition [0] is transitioning from HMM-DNN systems to "CTC" (connectionist temporal classification), i.e., basically LSTMs. Kaldi is working on "nnet3" which moves to CTC, as well. Speech was one of the places where HMMs were _huge_, so that's kind of a big deal." - PRACCU

"HMMs are only a small subset of generative models that offers quite little expressiveness in exchange for efficient learning and inference." - NEXTOS

"IMO, anything that can be done with an HMM can now be done with an RNN. The only advantage that an HMM might have is that training it might be faster using cheaper computational resources. But if you have the $$$ to get yourself a GPU or two, this computational advantage disappears for HMMs." - SHERJILOZAIR
 Kalman filters are never going to completely go away, since they are optimal[1] under certain assumptions. But these assumptions are often not met, so you have to start approximating (with the EKF or UKF). In those scenarios, once computational power allows, I think other approaches such as particle filters, which can handle arbitrary distributions (e.g., multimodal), will start taking over. But we're not there yet (?).
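To make that concrete, here is a bootstrap (sequential importance resampling) particle filter in miniature; the scalar random-walk model, the Gaussian measurement likelihood, and the noise values are all made up purely for illustration:

```python
import math
import random

def particle_filter_step(particles, z, motion_std, meas_std):
    """One predict-weight-resample cycle of a bootstrap particle filter.

    particles: list of scalar state hypotheses.
    z: the new noisy measurement of the state.
    """
    # Predict: diffuse each particle with process noise (random-walk model).
    moved = [p + random.gauss(0.0, motion_std) for p in particles]
    # Weight: likelihood of the measurement under each hypothesis
    # (Gaussian measurement model, assumed here for simplicity).
    weights = [math.exp(-0.5 * ((z - p) / meas_std) ** 2) for p in moved]
    total = sum(weights)
    if total == 0.0:
        # Every particle is absurdly far from the measurement; keep uniform.
        weights = [1.0] * len(moved)
    else:
        weights = [w / total for w in weights]
    # Resample: draw a new particle set proportional to the weights.
    return random.choices(moved, weights=weights, k=len(moved))
```

Because the belief is carried as a cloud of samples rather than a mean and covariance, the particle set can represent multimodal distributions that a single Gaussian cannot.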
 I agree; the EKF and UKF are what you'll find in real-world applications (because computing Jacobians is a pain!). I find the work of Jeffrey Uhlmann both incredibly useful and hilarious (in that the name "Unscented Kalman Filter" came from a stick of deodorant on a co-worker's desk).
 Yes, they are a staple of modern robotics and other fields. If the Kalman filter's assumptions about the system hold (linear model, Gaussian noise), the Kalman filter is an optimal filter and you can't do better. There are also more complicated variants, like the extended Kalman filter and the unscented Kalman filter, that can do better when the assumptions of the plain Kalman filter are not accurate. It's also worth noting that the Kalman filter follows the EM pattern of many ML / statistical models.
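For the simplest linear-Gaussian case, the whole predict/update cycle fits in a few lines. This is the standard scalar Kalman filter specialized to a constant-state model (F = H = 1); the function name and default values are just for this sketch:

```python
def kalman_1d(measurements, q, r, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a constant-state model (F = H = 1).

    q: process noise variance, r: measurement noise variance,
    x0/p0: initial state estimate and its variance.
    Returns the filtered estimate after each measurement.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the state is modeled as constant, so only the
        # uncertainty grows, by the process noise.
        p = p + q
        # Update: blend prediction and measurement via the Kalman gain.
        k = p / (p + r)          # gain in [0, 1]
        x = x + k * (z - x)      # move the estimate toward the measurement
        p = (1.0 - k) * p        # uncertainty shrinks after the update
        estimates.append(x)
    return estimates
```

With q small relative to r, the gain shrinks over time and the estimate converges to a smoothed average of the measurements.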
 In practical robotics, how often do the assumptions hold? Is it really usually true that the variables have only a linear relationship?
 People are generally happy with the Gaussian noise assumption. In some cases, there are acceptable linearizations of the dynamical model. One example is linearizing the quadrotor dynamics about the hover state (no roll or pitch). This approximation works well enough for basic flying, but you won't be able to pull off any flashy maneuvers, because any hard bank will move you too far away from the linearization point for it to hold. A better choice would be a more complicated KF (I've used an error-state KF in the past). The real draw of these filters, though, is that they are very fast. In my experience, most of the compute time every update cycle is spent on sensing, because your sensors dump a ton of data that you need to process as part of your CV / SLAM / whatever pipeline (the outputs of these then go into your KF). The dream is to get a 10ms update loop so your control algorithms can do a good job, but this is easier said than done.
 No, but nonlinear systems can be linearized about any particular operating point very easily, and the linearized system is often reasonably close to the whole nonlinear one.
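The linearization is just a tangent-line (first-order Taylor) approximation, the scalar analogue of the Jacobian an EKF computes for its dynamics and measurement models. A toy example, with sin chosen arbitrarily as the nonlinear function:

```python
import math

def linearize(f, x0, eps=1e-6):
    """Return the first-order (tangent-line) approximation of f at the
    operating point x0, using a central-difference numerical derivative."""
    df = (f(x0 + eps) - f(x0 - eps)) / (2.0 * eps)
    return lambda x: f(x0) + df * (x - x0)

approx = linearize(math.sin, 0.0)

# Near the operating point the linear model tracks the true function...
near_err = abs(math.sin(0.1) - approx(0.1))
# ...but far from it the approximation degrades badly, which is why a
# hard maneuver breaks a filter linearized about hover.
far_err = abs(math.sin(1.5) - approx(1.5))
```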
 KF has absolutely nothing to do with EM; no latent variables, no Jensen, no surrogate... nada. Please stop saying that.
 Can a KF be implemented in or explained as a RNN?
 No.
 They're not really applicable to the same problem areas. Kalman filters are most often applied in small/embedded system control environments where latency matters more, control is continuous, and you don't have a cloud to host your autopilot.
 > Kalman filters are most often applied in small/embedded system control environments where latency matters more

For example, missile guidance.
 Beyond what the other replies are saying, the general pattern is also highly applicable. You can keep O(1) memory, update it on every new piece of unreliable data, and continually have a better understanding of what the truth is.Consider standard deviation. You can calculate the standard deviation of a stream of numbers without storing all of them, or knowing where the stream will end. 'The standard deviation so far', in effect.
 Exactly. The UKF and EKF are very highly optimized for memory size and computation while dealing with noisy measurements and non-linear, difficult-to-compute functions. They certainly don't require normally distributed errors. That makes them applicable to a wide range of embedded applications. As generalized ML tools, I'm pretty doubtful, unless you wanted to create something in hardware as a large set of coupled noisy state spaces.
 > Consider standard deviation. You can calculate the standard deviation of a stream of numbers without storing all of them, or knowing where the stream will end. 'The standard deviation so far', in effect.

Yes, you are right, but this is both a good example and a bad example, because my old Casio calculator could do that too :)
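That "standard deviation so far" trick (likely what the calculator is doing) can be sketched with Welford's online algorithm, which keeps only three numbers no matter how long the stream gets:

```python
class RunningStd:
    """Welford's online algorithm: O(1) memory, single pass."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the running mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def std(self):
        # Population standard deviation "so far".
        return (self.m2 / self.n) ** 0.5 if self.n else 0.0
```

Feeding it the stream 2, 4, 4, 4, 5, 5, 7, 9 yields a running mean of 5.0 and a standard deviation of 2.0, matching the batch formula, without ever storing the stream.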
 Yes they are used for radar and sonar stuff. We use them in our products.
 They're used in brain-machine interfaces: http://cs.brown.edu/~black/Papers/nips02Final.pdf
 Yes, it is used. But there is a whole wealth of more general techniques based on probabilistic graphical models. A good reference for those curious is Probabilistic Robotics: http://robots.stanford.edu/probabilistic-robotics/
 I saw Kalman filters being used in comma.ai's self-driving car code.
 My ML professor once took the hidden Markov model and arrived at the equations of a Kalman filter. That just completely blew my mind; I could never have thought the concepts from estimation theory and machine learning could be related so beautifully.
 The Kalman filter algorithm is an EM algorithm.
 At that point, I did not have an idea of the family of techniques that these belonged to.
 This is really great! I kept encountering Kalman filters in my research during graduate school, but they didn't directly affect my research, so I never made the time to understand them. What a fantastic explanation! I'm also a huge fan of the use of colors to understand all the different concepts at work. Yesterday I actually asked the secretary of my department to get me an 8-pack of multicolored pens for this exact purpose (red, blue, and black aren't enough!).
 That was a really coherent explanation. The coloring was helpful, also the way in which the details were introduced one by one.
