> The little Enterprise thought-experiment and anecdotal impressions of what "looks" and "doesn't look" disordered are nonsense.
I think you're misreading it; the author is trying to make the point that you can't meaningfully answer that question just by looking at the dots. It's a trick question, but it's used as part of teaching how you compute entropy (which involves knowing the distribution). Yes, this is essentially information theory, but he's a physics teacher.
I can't tell you the "randomness" of a bunch of points any more than I can tell you the location of sunspots by looking at our star with the naked eye. That doesn't mean there isn't a way of calculating the entropy, though (just as there is a way of making out sunspots with the right equipment).
Is that really the idea here? You're right: maybe I'm just not getting the point.
The "equipment" needed is a model of how the points came about. You may already get the point but permit me an example:
From an information theory perspective, consider 'apple' vs 'aaaaa'. I submit that if you have nothing besides the string itself, you'd conclude that 'aaaaa' has lower entropy than 'apple', but if your model is word frequency on the web, it'd be the other way around. There isn't anything inherent in the strings that makes one or the other true.
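To make that concrete, here's a minimal sketch: under a character-frequency model fit to the string alone, 'aaaaa' has zero entropy, while under a word-level surprisal model the common word 'apple' carries less information. The `web_freq` probabilities are invented for illustration, not real corpus statistics:

```python
import math
from collections import Counter

def char_entropy(s: str) -> float:
    """Shannon entropy in bits/char, with the model fit to the string itself."""
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())

# Model 1: the string is all you have. 'aaaaa' looks perfectly ordered.
print(char_entropy("aaaaa"))  # 0.00 bits/char
print(char_entropy("apple"))  # ~1.92 bits/char

# Model 2: word frequencies from a corpus. Surprisal = -log2 P(word),
# so the common word is the *low*-information one. (Probabilities below
# are made up for illustration.)
web_freq = {"apple": 1e-5, "aaaaa": 1e-10}
for word, p in web_freq.items():
    print(f"{word}: {-math.log2(p):.1f} bits")  # apple ~16.6, aaaaa ~33.2
```

Neither number is "the" entropy of the string; each is the entropy relative to a model, which is the whole point.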
There are interesting cases in between where you could derive a model with more math; e.g. the first million digits of pi have very low entropy, because a short program generates them. In general, though, exactly how much a string can be compressed is a matter of its Kolmogorov complexity, which is uncomputable.
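One way to see why it's only semi-tractable: any real compressor gives an upper bound on Kolmogorov complexity but can miss structure it wasn't built to find (zlib won't compress pi's digits, even though a tiny program generates them). A rough sketch:

```python
import os
import zlib

# K(s) is uncomputable, but compressed size bounds it from above:
# K(s) <= len(compressed) + len(decompressor program).
structured = ("ab" * 500).encode()  # 1000 bytes following one short rule
random_ish = os.urandom(1000)       # 1000 bytes with no rule to find
for name, s in [("structured", structured), ("random-ish", random_ish)]:
    print(f"{name}: {len(s)} -> {len(zlib.compress(s, 9))} bytes")
# zlib nails 'abab...' but leaves random bytes (and pi's digits) alone;
# the gap between its bound and the true K(s) is exactly the point.
```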
With a physical system, if you know its configuration and the laws of physics it obeys, you can compute the entropy in a meaningful way, and that tells you what direction the system "wants to go" to satisfy the second law of thermodynamics.
If you know the exact state of a system, then it has zero entropy regardless of what that particular state happens to be. In this case, you also don't need thermodynamics to model the system's evolution, because you can predict its future exactly.
When you know only summary, average statistics of a system (such as its temperature), its entropy is nonzero, because your information about the system isn't enough to pin down a particular microstate. Instead you have a distribution over many possible microstates, any of which would match the measurements you have. The more microstates there are that agree with your data, the higher the entropy.
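A toy version of that counting, as a sketch: take N two-state spins, where the macrostate is just "how many spins are up" and k_B is set to 1 to keep the numbers readable.

```python
import math

# Boltzmann entropy S = ln(Omega) (with k_B = 1), where Omega counts the
# microstates consistent with what you know about the macrostate.
N = 100  # spins, each either up or down
for k in (0, 10, 50):  # macrostate: exactly k spins are up
    omega = math.comb(N, k)
    print(f"k={k:2d}  Omega={omega:.3e}  S={math.log(omega):.1f}")
# k=0 : one compatible microstate  -> S = 0 (the state is fully known)
# k=50: ~1e29 compatible microstates -> maximal S for this system
```

Knowing the exact microstate corresponds to the k=0 row: one possibility, zero entropy, exactly as described above.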