It starts with the familiar formula for on-line computation of the sample mean (e.g., https://en.wikipedia.org/wiki/Algorithms_for_calculating_var...) --
mu_n <- mu_{n-1} + (1/n) * (z_n - mu_{n-1})
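A minimal sketch of that running-mean update in Python (the function name is my own):

```python
def online_mean(samples):
    """Running mean: mu_n = mu_{n-1} + (1/n) * (z_n - mu_{n-1})."""
    mu = 0.0
    for n, z in enumerate(samples, start=1):
        mu += (z - mu) / n  # the (1/n) correction toward the new sample
    return mu
```

Each new sample nudges the estimate toward itself by a shrinking factor 1/n, so no history needs to be stored.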
The paper then uses this simple expression to motivate the Kalman filter update equation, in which a new observation z_n is used to compute an innovation (or error signal) y:
y = z_n - H x_{n-1}
x_n = x_{n-1} + K y
"In both models, there's an unobserved state that changes over time according to relatively simple rules, and you get indirect information about that state every so often. In Kalman filters, you assume the unobserved state is Gaussian-ish and it moves continuously according to linear-ish dynamics (depending on which flavor of Kalman filter is being used). In HMMs, you assume the hidden state is one of a few classes, and the movement among these states uses a discrete Markov chain. In my experience, the algorithms are often pretty different for these two cases, but the underlying idea is very similar." - THISISDAVE
-- HMM vs LSTM/RNN:
"Some state-of-the-art industrial speech recognition is transitioning from HMM-DNN systems to "CTC" (connectionist temporal classification), i.e., basically LSTMs. Kaldi is working on "nnet3" which moves to CTC, as well. Speech was one of the places where HMMs were _huge_, so that's kind of a big deal." -PRACCU
"HMMs are only a small subset of generative models that offers quite little expressiveness in exchange for efficient learning and inference." - NEXTOS
"IMO, anything that can be done with an HMM can now be done with an RNN. The only advantage that an HMM might have is that training it might be faster using cheaper computational resources. But if you have the $$$ to get yourself a GPU or two, this computational advantage disappears for HMMs." - SHERJILOZAIR
The opportunity cost is smaller, and you're more likely to complete the goal, than if you just took a sabbatical.
It's not that I don't have the time; it's that I'm way too tired after work to do any substantial amount of learning. I think I could manage something that's much easier, say learning some languages or learning how to knit. But learning math, physics, etc. just requires more than I have to give after work.
That said, I managed to get a firm understanding of orbital mechanics up to a level where I could probably ace all the tests for freshman-level courses at an aerospace university and I did all of it on Saturday afternoons (but it took me 5 years to do so).
You'll often hear about famous novelists, Hollywood writers, etc. using this technique: doing their best creative work before the chaos of the day sets in.
Sooo...... you play KSP? ;)
In my study project, I learned thoroughly how to solve different two-body problems (initial value, boundary value, closest approach) and wrote a C library that's got all the orbital mechanics code you need to write your own Kerbal Space Program clone (but I don't intend to do that). During the process, I read 3-4 books and an inch-thick pile of research papers.
I fully believe this can work. Especially with a talented practitioner. However, the question is if it will lead to more maintainable (e.g., by needing fewer iterations?) and faster (e.g., by lots of people using it) output with fewer bugs (e.g., by surviving the Lindy effect).
That is, my question is entirely empirical and will hinge on repeated success. Not initial success. I am personally hopeful for it, but I have reasons not to trust my hopes.
Among the typical "scientific computing" languages, Wolfram is the only one that has "cool" functional programming tricks as the standard way to solve a problem in it.
P.S. Seriously! The amazing functional style of Mathematica is pretty much the one thing I like about it.
In fact I think bookmarks are underrated. I tried various other ways of organizing information (Zim, Org, etc.) and bookmarks are by far the simplest and most effective. For example, just tag this blogpost with 'math, concepts'. Then you just type 'math concepts' in your Firefox address bar and see all related pages. If you set up Firefox Sync, you can bookmark pages even from other apps (by sharing the URL with Firefox and using the 'add bookmark' action) and tag them later on the desktop.
A Kalman filter isn't really a filter. It keeps a state estimate of a system. The system state is updated on a regular basis using control inputs. That runs open loop, with a model of the drift gradually reducing your confidence in the open-loop estimate. As sensor inputs arrive, they are incorporated into the state to refine the state estimate. The sensor noise model is your confidence in the measurement.
The Bayesian arithmetic comes into play when incorporating measurements. The confidence in the current estimate and the confidence in the sensor are used to decide how much of each to average into the new estimate.
x_post = (1 - K H) x_prior + K z
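The predict/update cycle described above can be sketched in 1-D Python (H = 1; function and variable names are my own, not from any particular library):

```python
def predict(x, P, u, Q):
    """Open-loop propagation: control input u moves the state,
    process noise Q grows the uncertainty P (confidence drops)."""
    return x + u, P + Q

def update(x, P, z, R):
    """Measurement update: average prior and measurement z,
    weighted by prior variance P vs. sensor noise variance R."""
    K = P / (P + R)               # Kalman gain
    x_post = (1 - K) * x + K * z  # same as x + K * (z - x)
    P_post = (1 - K) * P          # incorporating a measurement shrinks P
    return x_post, P_post
```

Running `predict` alone lets `P` grow without bound, which is exactly the "open loop" loss of confidence; each `update` pulls the estimate toward the sensor in proportion to how much more it is trusted.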