If you're not shy of a few swear words, Nat's What I Reckon has some great recipes that are straightforward. Here's a link to his bolognese - https://www.youtube.com/watch?v=Sw_Ze9zIafM
I do think that there's some merit in sticking with probability on discrete spaces for a while. Once you start dealing with continuous spaces, you're soon talking measure theory, and it's easy to wade deep into the technical details while missing an understanding of what's actually going on. I go back and forth on this, as I think it's largely down to the reader to figure out what works for them, but probability is one of those fields where developing intuition early on is a must if you want to go further.
The actual requirement for measure theory is overblown. As long as you've taken single and multivariable calculus, you can study continuous probability without any problems and without even knowing what measure theory is.
Agreed, not knowing measure theory never stopped me from computing a conditional expectation. Some courses and books overemphasize rigor in probability and, while it obviously has its place, I've seen newcomers to the field become obsessed with doing everything via measure theory. Further to your point, volume two of Feller is pretty light on measure theory IIRC.
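For a concrete example of the kind of computation you can do with nothing but calculus (a standard textbook setup, not from Feller specifically): take the joint density $f(x,y) = x + y$ on the unit square. Then

$$f_X(x) = \int_0^1 (x+y)\,dy = x + \tfrac{1}{2},$$

$$E[Y \mid X = x] = \int_0^1 y \cdot \frac{x+y}{x+\tfrac{1}{2}}\,dy = \frac{\tfrac{x}{2}+\tfrac{1}{3}}{x+\tfrac{1}{2}} = \frac{3x+2}{6x+3}.$$

No sigma-algebras in sight.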
To the programmer, developer, or casual visitor wondering whether it's worth the time and effort to dig into this: it is. Most of what's covered here can be understood with undergrad calculus, and it will give you a solid basis for understanding and modelling random phenomena you may encounter in your studies, work, or hobbies.
Fun fact to get you started: Nakamoto suggested in the original Bitcoin paper that blocks would be added to the Bitcoin blockchain according to a homogeneous Poisson process (spoiler alert: it's definitely not).
> Fun fact to get you started: Nakamoto suggested in the original Bitcoin paper that blocks would be added to the Bitcoin blockchain according to a homogeneous Poisson process (spoiler alert: it's definitely not).
It's definitely not a homogeneous Poisson process, mainly because of the random changes to mining difficulty and propagation delays. There's a good paper here looking at block arrival times and fitting some different models - https://arxiv.org/pdf/1801.07447.pdf
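If you want a feel for what the homogeneous model would predict, here's a minimal simulation sketch (assuming numpy, and assuming the idealized 10-minute difficulty target as the mean inter-arrival time; neither is taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(42)

# Under a homogeneous Poisson process, inter-arrival times are i.i.d.
# exponential. The 10-minute difficulty target implies a mean of ~600 s
# (an assumption of the idealized model, not fitted from real data).
MEAN_INTERVAL_S = 600.0

n_blocks = 100_000
intervals = rng.exponential(scale=MEAN_INTERVAL_S, size=n_blocks)

print(f"mean interval:       {intervals.mean():.1f} s")   # ~600
print(f"std of interval:     {intervals.std():.1f} s")    # ~600 (for exponential, std == mean)
print(f"P(interval > 30min): {(intervals > 1800).mean():.4f}")  # exp(-3) ~ 0.0498
```

Comparing those numbers against real block timestamps is essentially what the paper does, and where the discrepancies show up.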
Is Figure 8 an unconditional empirical CDF of inter-arrival times? Apart from the heavy right tail (which covers ~0.01% of the data), it looks pretty exponential to me. If I'm understanding what I'm seeing, it sounds like the homogeneous Poisson assumption was pretty solid, especially considering its purpose. Maybe it would have been more accurate to say "there's a mixture of two Poissons: the bulk and the network disruption". But I think that possibility would occur to most people reading the paper at the time.
Also, Figure 7 seems to show very little change in mean block inter-arrival time.
In fairness the authors say, "Performing the Lilliefors test on the LR data rejects the null hypothesis that block mining intervals are exponentially distributed, at a significance level of α = 0.05." But this isn't physics. We want to know how useful the approximation is, and whether there is a similarly tractable one with better predictive power.
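For anyone who wants to run the same check on their own data, the test they mention is available in statsmodels; a rough sketch (the file name is a placeholder, substitute your own inter-arrival times in seconds):

```python
import numpy as np
from statsmodels.stats.diagnostic import lilliefors

# Placeholder input: one inter-arrival time per line, in seconds.
intervals = np.loadtxt("block_intervals.txt")

# Lilliefors test with the exponential as the null hypothesis. The rate
# parameter is estimated from the data, which is what distinguishes
# Lilliefors from a plain Kolmogorov-Smirnov test.
stat, pvalue = lilliefors(intervals, dist="exp")
print(f"Lilliefors statistic: {stat:.4f}, p-value: {pvalue:.4f}")

# Compare the empirical CDF to the fitted exponential CDF at a few points,
# which gets at "how useful is the approximation" rather than just "is it exact".
rate = 1.0 / intervals.mean()
for t in (300, 600, 1200, 1800):
    ecdf = (intervals <= t).mean()
    model = 1.0 - np.exp(-rate * t)
    print(f"t={t:>5}s  empirical={ecdf:.4f}  exponential={model:.4f}")
```

With ~0.01% of the mass in the tail, a rejection at α = 0.05 and a visually near-exponential CDF can coexist happily, which is rather the point.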
> Is Figure 8 an unconditional empirical CDF of inter-arrival times?
My understanding is that it's the inter-arrival times after some cleaning and resampling. If I've understood correctly, when they resampled the data, they did so uniformly between the neighbours of the points they omitted, which would actually make the data appear more like an exponential distribution.
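To make that concrete, here's my guess at what that cleaning step looks like (this is my reading of the paper, not the authors' code):

```python
import numpy as np

rng = np.random.default_rng(0)

def resample_omitted(timestamps, omitted_idx):
    # My reading of the cleaning step (a guess, not their code): each
    # omitted timestamp is replaced by a uniform draw between its two
    # neighbours. Omitted points are assumed to be interior and isolated.
    out = np.asarray(timestamps, dtype=float).copy()
    for i in omitted_idx:
        out[i] = rng.uniform(out[i - 1], out[i + 1])
    return out

# Toy usage: timestamps in seconds, with the outlier at index 2 resampled.
ts = resample_omitted([0.0, 550.0, 9999.0, 1300.0, 1750.0], omitted_idx=[2])
print(np.diff(ts))  # inter-arrival times after resampling
```

The point being that filling gaps uniformly between neighbours smooths away exactly the kind of extreme intervals that would otherwise sit in the tail.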
> Especially considering its purpose. Maybe it would have been more accurate to say "there's a mixture of two Poissons: the bulk and the network disruption".
Could be. Could also follow a power law or a phase-type distribution.
> But this isn't physics. We want to know how useful the approximation is, and whether there is a similarly tractable one with better predictive power.
It's worse, it's math :-) I take your point though, it all comes down to what you're trying to do. If inter-arrival times did follow an exponential distribution with parameter $\lambda$, then we'd have finite variance and I'd be pretty confident that I could build a performant predictive model. The presence of a heavy right tail makes me think otherwise.
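To spell that out: for exponential inter-arrival times with rate $\lambda$,

$$E[T] = \frac{1}{\lambda}, \qquad \operatorname{Var}(T) = \frac{1}{\lambda^{2}}, \qquad P(T > t) = e^{-\lambda t},$$

so the tail decays exponentially fast and every moment is finite. If instead the tail follows a power law, $P(T > t) \sim C t^{-\alpha}$, then $\operatorname{Var}(T) = \infty$ whenever $\alpha \le 2$, which is exactly the situation where I'd stop trusting a simple predictive model.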
It's been a while, but I just read an article arguing the case that Len Sassaman was Satoshi. It was a neat article, so I watched one of Len's Defcon talks about remailers from waaay back in the day.
In his talk, Len mentioned that most remailer security analysis assumes homogeneous Poisson email arrivals. He pointed out how bad an assumption that is for email.
I still think it was a solid assumption in the Bitcoin white paper.
Regardless of whether the collection of the data was incidental or not, this is a huge breach of public trust given the assurances made by the Australian government.
However, all that the various agencies are saying is: 'Yeah sure, it's in there, but it isn't something we cared about enough to have looked at, so we haven't' (paraphrasing here, obviously). Whether that is true is another matter entirely, of course.
To point 3 - Protect what matters most - please do check out Peergos (https://peergos.org). We provide private and secure online storage that collects no metadata and is not dependent on DNS or TLS.