Thanks, I'll see about Box-PF when trying to get IMU-filtered indoor-localization working (once I hopefully get that far with the UWB firmware [0]). Accounting for clock drift across the "satellites" is going to be "fun", but at least it's both useful in practice and of manageable complexity/scope.
PF is a great tool for UWB. Even without IMU data, just adding uniform diffusion to the particles between updates, tracking worked well in a 2D environment. It sounds like you're working on TDOA for UWB?
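Something like this, roughly (a minimal sketch of that diffusion step; the particle structure and jitter radius are made up):

function diffuseParticles(particles, radius) {
  // Jitter each particle uniformly within +/- radius on both axes,
  // standing in for a motion model when no IMU data is available.
  return particles.map(p => ({
    x: p.x + (Math.random() * 2 - 1) * radius,
    y: p.y + (Math.random() * 2 - 1) * radius,
    weight: p.weight, // weights get updated separately from the range measurements
  }))
}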
The problem with this idea is that deriving all the propagation and measurement functions and associated jacobians is 99% of the problem. Once that's done you can implement literally any filter from them using Wikipedia.
If Q and R are constant (as is usually the case), the gain quickly converges, such that the Kalman filter is just an exponential filter with a prediction step. For many people this is a lot easier to understand, and even matches how it is typically used, where Q and R are manually tuned until it “looks good” and never changed again. Moreover, there is just one gain to manually tune instead of multiple quantities Q and R.
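A scalar sketch of that convergence (made-up q and r, state transition = 1):

const q = 0.1, r = 1.0
let p = 1.0
for (let i = 0; i < 10; i++) {
  const pPredicted = p + q                // predict: uncertainty grows by q
  const k = pPredicted / (pPredicted + r) // Kalman gain
  p = (1 - k) * pPredicted                // update: uncertainty shrinks again
  console.log(i, k.toFixed(3))            // k settles near 0.27 within a few steps
}
// Once k is constant, x = x + k * (z - x) is exactly an exponential filter.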
> If Q and R are constant (as is usually the case), the gain quickly converges,
You also need the measurements to be equally spaced. Often they are – you get a regular alternating pattern of prediction and measurement steps – but often they're not, in which case the Kalman filter gives extra weight to a new measurement if it's been a while since the last one (because the intervening time will have allowed the uncertainty to grow).
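A scalar sketch of that effect (assuming a random-walk model where process noise accumulates linearly over the gap):

function gainAfterGap(pStateError, qPerSecond, r, dtSeconds) {
  const pPredicted = pStateError + qPerSecond * dtSeconds // uncertainty grows with the gap
  return pPredicted / (pPredicted + r)                    // gain applied to the next measurement
}
console.log(gainAfterGap(0.1, 0.1, 1, 0.1)) // short gap: gain ≈ 0.10
console.log(gainAfterGap(0.1, 0.1, 1, 10))  // long gap: gain ≈ 0.52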
The Kalman filter also allows you to take into account measurements that are more uncertain in one direction than another. Think of cameras with visual recognition, which tell you a precise angle but only a rough distance estimate. If you have a couple of those and suitable measurement error matrices then the Kalman filter will automatically do a sort of triangulation.
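A minimal sketch of that (assuming direct position measurements, H = I, and aligning each camera's precise direction with an axis to keep the 2×2 matrices simple):

function inv2(m) { // invert a 2x2 matrix [[a, b], [c, d]]
  const [a, b, c, d] = [m[0][0], m[0][1], m[1][0], m[1][1]]
  const det = a * d - b * c
  return [[d / det, -b / det], [-c / det, a / det]]
}
function add2(m, n) { return m.map((row, i) => row.map((v, j) => v + n[i][j])) }
const pPrior = [[100, 0], [0, 100]] // vague prior on (x, y)
const r1 = [[0.01, 0], [0, 25]]     // camera 1: precise in x, rough in y
const r2 = [[25, 0], [0, 0.01]]     // camera 2: precise in y, rough in x
// Information form: P_post^-1 = P_prior^-1 + R1^-1 + R2^-1
const pPost = inv2(add2(add2(inv2(pPrior), inv2(r1)), inv2(r2)))
console.log(pPost) // ~0.01 variance in both x and y: an implicit triangulation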
As a bonus, you can also use the covariance matrix of the target as information in its own right. But, as you say, the parameters are often tuned for a good-looking result rather than reality, so the target uncertainty isn't always especially meaningful.
It took me years after learning the Kalman filter as a student to actually understand the covariance update intuitively. Most learning sources (including the OP) just mechanically go through the computation of the a-posteriori covariance, but don't bother with an intuition other than "this is the result of multiplying two Gaussians", if anything at all.
Figured I can save you a click and put the main point here, as few people will be interested in the rest:
The Kalman filter adds the precision (inverse of covariance) of the measurement to the precision of the predicted state, to obtain the precision of the corrected state. To do so, the respective covariance matrices are first inverted to obtain precision matrices. To get both into the same space, the measurement precision matrix is projected into the state space using the measurement matrix H. The resulting sum is converted back to a covariance matrix by inverting it.
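A scalar sketch of that precision view (made-up numbers, H = 1 so no projection is needed):

const pPredicted = 4, rMeasurement = 2                  // predicted and measurement variances
const precisionSum = 1 / pPredicted + 1 / rMeasurement  // precisions add
const pCorrected = 1 / precisionSum                     // = 4/3, smaller than either input
// Same result as the usual form: K = 4 / (4 + 2) = 2/3, so (1 - K) * 4 = 4/3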
I've seen the Kalman filter presented from a few different angles, and the one that made the most sense to me was from a Bayesian methods class that speaks only in terms of marginal and conditional Gaussian distributions and discards a lot of the control theory terminology.
I succeeded in understanding the Kalman filter only when I found a text that took a similar approach. It was this invaluable article, which presents the Kalman filter from a Bayesian perspective:
Meinhold, Richard J., and Nozer D. Singpurwalla. 1983. "Understanding the Kalman Filter." American Statistician 37 (May): 123–27.
i don't know why the hell people are so obsessed with it. like why aren't there recurring posts about how to solve a separable PDE or how to perform gram-schmidt or whatever other ~junior math things.
Kalman filters are useful in data processing and interpretation, I used them heavily in continuous geophysical signal processing four decades past.
My guess is that many computer data engineers encounter them and find their self-taught grasp of linear algebra and undergraduate math challenged by the theory behind K-F's .. they seem to come across as a bit of a leg up over moving averages, Savitzky–Golay, FFT applications, etc.
There are many more people dealing with implementing these things than have had formal undergraduate lectures on them.
My gut feeling is that most are more likely to encounter K-F applications in drone control, dead reckoning positions when underground or with flakey GPS, cleaning real world data, etc. than to find themselves having to solve PDE's ..
I posit the existence of some form of pragmatic Maslow's Hierarchy of Applicable Math.
I do agree though that HN has odd bursts of Kalman filter posts.
> Kalman filters are useful in data processing and interpretation
vaguely - plenty of other imputation approaches that are simpler/better/more accessible.
> K-F applications in drone control, dead reckoning positions when underground or with flakey GPS
these are not things 99% of devs encounter. literally
> dead reckoning positions when underground or with flakey GPS
is the domain of probably like 100-1000 people in the entire world - i know because i actually have brushed up against it and am painfully aware of the lack of resources.
i really do think it's just a programmer l33t meme not unlike monads, category theory, etc - something that most devs think will elevate them to godhood if they can get their heads around it (when in fact it's pretty useless in practice and just taught in school as a prereq for actually useful things).
The assertion was not that these examples are common, rather that they are currently more common for generic app developers than manipulating PDE's.
As for K-filters in data processing and interpretation, that depends thoroughly on the data domain; a good number have biases and co-signals that are more easily removed with an adaptive model of some form.
Eg: the magnetic heading effect when recording nine-axis nanotesla-range ground signals. The readings returned over a specific point at a specific time of day are a function of sensor speed and heading. Repeatedly flying over the same point (hypothetically at the same time) from North to South vs East to West returns different data streams on each of the nine channels.
To get a "true ground reading" both the heading bias and the diurnal flux must be estimated and subtracted.
> plenty of other imputation approaches that are simpler/better/more accessible.
If only we had some way to predict when these bursts would appear. But, I guess it would probably depend on a lot of factors, and it might be hard to guess how they all influence each other…
In the realm of autonomous vehicles, early sensor fusion systems relied heavily on Kalman filters for perception.
The state of the art has since been supplanted by large deep learning models, primarily end-to-end trained Transformer networks.
Transformers may be familiar to you from LLMs, which have recently become popular, but they were invented by researchers at Google and put into production for autonomous driving at Waymo soon after.
As a developer I always found these maths-first approaches to Kalman filters impenetrable (I guess that betrays my lack of knowledge; I dare cast no aspersions on the quality of these explanations!). However, if, like me, it helps you to implement it first, here's a 1-dimensional version simplified from my blog:
function transpose(a) { return a } // 1x1 matrix, e.g. a single value.
function invert(a) { return 1 / a }
const rawDataArray = [1.1, 0.9, 1.2, 1.05, 0.95] // example noisy readings
const qExternalNoiseVariance = 0.1
const rMeasurementNoiseVariance = 0.1
const fStateTransition = 1
let pStateError = 1
let xCurrentState = rawDataArray[0]
for (const zMeasurement of rawDataArray) { // "of", not "in": iterate values, not indices
  const xPredicted = fStateTransition * xCurrentState // predict the next state
  const pPredicted = fStateTransition * pStateError * transpose(fStateTransition) + qExternalNoiseVariance
  const kKalmanGain = pPredicted * invert(pPredicted + rMeasurementNoiseVariance)
  pStateError = pPredicted - kKalmanGain * pPredicted // corrected uncertainty
  xCurrentState = xPredicted + kKalmanGain * (zMeasurement - xPredicted) // Output!
}
It's not your fault, these can get messy very quickly. Infer.NET was started because Tom Minka and other Bayes experts were tired of writing message passing and variational inference by hand, which is both cumbersome and error prone on non-toy problems.
It helps to take a more abstract view where you split the generative process and the inference algorithm. Some frameworks (Infer.NET, ForneyLab.jl) can generate an efficient inference algorithm from the generative model without any user input. See e.g. https://github.com/biaslab/ForneyLab.jl/blob/master/demo/kal...
For the lesser developer gods here, can someone give an example of a real life business case where (s)he has effectively used this? Explain like you're talking to a guy who has done CRUD most of his life.
https://cecas.clemson.edu/~ahoover/ece854/refs/Djuric-Partic...
https://eprints.lancs.ac.uk/id/eprint/53537/1/Introduction_t...
https://ieeeoes.org/wp-content/uploads/2021/02/BPF_SPMag_07....