If you overlay all atomic spectra, you get a Planck distribution (wiley.com)
129 points by hakmem on May 11, 2020 | 55 comments



I've seen this blow up in several places now, but there seriously is nothing to this paper.

Essentially all they're pointing out is that the set of atomic transition energies (1) is positive, (2) has a smooth distribution, (3) goes to zero at zero and infinity, (4) has a maximum somewhere in between, (5) is skewed right. All of these things are completely mundane and well-understood, and not at all unique to the "Planck distribution". Statisticians probably know of tens of other distributions with these properties, which would fit their curve about as well.

Their claim is like saying that any function that goes from -1 to 1 smoothly must be a logistic function, or any function that goes to 0 at infinity but slowly must be a power law. That's not a paper, that's a hunch. If the researchers wanted to be serious, they could have run a statistical test to quantify how well the data fit the Planck distribution (just like tests of normality are routinely done in statistics). But they didn't, probably because the test would fail.
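For concreteness, such a test is only a few lines of SciPy. A rough sketch (the array wavelengths_nm is hypothetical and stands in for the observed NIST line wavelengths; the Planck shape is normalized numerically over 1-3000 nm):

    import numpy as np
    from scipy.stats import kstest

    # Hypothetical input: observed line wavelengths in nm, e.g. parsed from the NIST export.
    # wavelengths_nm = ...

    h, c, k = 6.626e-34, 2.998e8, 1.381e-23

    def planck_shape(wav_m, T=9000.0):
        # Unnormalized Planck spectral density as a function of wavelength in metres.
        return 1.0 / (wav_m**5 * (np.exp(h * c / (wav_m * k * T)) - 1.0))

    # Normalize numerically over the wavelength range of interest and build a CDF.
    grid_nm = np.linspace(1.0, 3000.0, 20000)
    pdf = planck_shape(grid_nm * 1e-9)
    cdf = np.cumsum(pdf)
    cdf /= cdf[-1]

    def planck_cdf(x_nm):
        return np.interp(x_nm, grid_nm, cdf)

    stat, p_value = kstest(wavelengths_nm, planck_cdf)
    print(f"KS statistic = {stat:.4f}, p-value = {p_value:.3g}")

With tens of thousands of lines in the sample, a KS test has plenty of power, so a genuinely good fit should survive it.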


Posting this verbatim from my reddit comment, but I ran a few quick fits and the (arbitrarily scaled) Planck distribution looks better than standard fits of some other distributions. Of course I ran absolutely no statistics on these or tried any sort of real analysis that should have been included in the paper in the first place, so take it with multiple grains of salt:

I fit a few distributions to the data and actually the Planck distribution at 9000K seems to be noticeably better. I did no further statistics and haven't even looked at the fitted parameters for this. Note: I am an earth scientist, not a physicist. https://imgur.com/Qg4ixF2

Edit: More distributions: https://imgur.com/IrF6lsV

Edit: Filtering out sodium and potassium to try to account for some of the low-wavelength counts doesn't seem to help fitting the distributions either: https://imgur.com/dHDUS9Z

You can get the data here: https://physics.nist.gov/PhysRefData/ASD/lines_form.html

And here's the (garbage) code to reproduce my plots:

    import numpy as np
    import pandas as pd
    import matplotlib as mpl
    import matplotlib.pyplot as plt
    import scipy.stats

    # 'seaborn-muted' was renamed 'seaborn-v0_8-muted' in newer Matplotlib releases.
    style = 'seaborn-muted' if 'seaborn-muted' in mpl.style.available else 'seaborn-v0_8-muted'
    mpl.style.use(style)
    mpl.rcParams['figure.figsize'] = (18, 12)
    mpl.rcParams['font.size'] = 16

    df = pd.read_csv('nist.csv')

    def filter_string(x):
        # Strip the NIST export's formatting characters so the wavelength columns parse as floats.
        x = str(x).replace('=', '').replace('"', '').replace('*', '').replace('+', '').replace('(', '').replace(')', '')
        if len(x) == 0:
            x = 'NaN'
        return x

    # Keep only neutral atoms (spectrum number 1), then parse the wavelength columns.
    sp_num = df['sp_num'].astype(float)
    df_neutral = df[sp_num == 1]
    asl_comp = df_neutral['ritz_wl_vac(nm)'].apply(filter_string).astype(float)  # computed (Ritz) wavelengths
    asl_ver = df_neutral['obs_wl_vac(nm)'].apply(filter_string).astype(float)    # observed wavelengths

    asl_ver = asl_ver[~asl_ver.isna()]

    def planck(wav, T=9000.0):
        # Black-body spectral radiance at temperature T, with wav in metres.
        # The overall scale doesn't matter here, since the curve is rescaled for plotting.
        h = 6.626e-34
        c = 3.0e+8
        k = 1.38e-23
        a = 2.0 * h * c**2
        b = h * c / (wav * k * T)
        return a / (wav**5 * (np.exp(b) - 1.0))

    dist_names = ['beta', 'lognorm', 'pearson3', 'gumbel_r']
    x = np.arange(5, 1500, 5)       # start above zero to avoid dividing by zero in planck()
    intensity = planck(x / 1e9)     # convert nm to m

    # Fit each candidate distribution to the observed wavelengths and overlay its PDF.
    for dist_name in dist_names:
        dist = getattr(scipy.stats, dist_name)
        param = dist.fit(asl_ver)
        pdf_fitted = dist.pdf(x, *param[:-2], loc=param[-2], scale=param[-1])
        plt.plot(x, pdf_fitted, label=dist_name, linewidth=4)

    plt.hist(asl_ver, bins=300, density=True, color='grey')
    # Arbitrary rescaling so the Planck curve sits on the same axis as the density histogram.
    plt.plot(x, intensity / 1.1e17, linewidth=4, color='red', linestyle='--', label='scaled planck (9000K)')
    plt.legend()
    plt.xlabel(r'$\lambda$ (nm)')
    plt.ylabel('Density')
    plt.show()


Neat! Still, I wouldn't say the Planck distribution is noticeably better. The reason the other 4 distributions appear to have the peak too low is that they're making a tradeoff: they do that to better fit the values on the left-hand side of the graph.

The Planck fit gets the peak height right, but mainly because it has no chance of fitting the left hand side at all (since it increases much more slowly than the other distributions), so it doesn't even try.

This is why I said it would be better to have actual statistical tests -- we shouldn't have to have this kind of qualitative discussion when it's already a solved problem.


Can you link the discussion on Reddit?



Seemingly multiple people downvoted this. Why??


I agree. To me, it sounds like random matrix spectra.

Just because it has the shape of the Planck distribution doesn't mean it comes from it.


The correspondence between the global distribution of all experimentally known atomic spectral lines and the Planckian spectral distribution associated with black-body radiation at a temperature of T ≈ 9000 K is indeed "funny". The match seems close enough to be worth investigating.

On the other hand the observation that "This value coincides with the critical temperature of equilibrium between the respective densities of radiation and matter in the early universe" seems spurious and is unsupported by anything in the paper.

I would rather expect that some quirk in the statistics of quantum-mechanical orbitals gives the frequency of occurrence of spectral lines a distribution shaped similarly to the Boltzmann distribution.

There is probably an interesting statistical story to tell, but I don't see the connection to the early universe as a supported thing here.


Part of explaining this "funny" coincidence is going to be how one motivates the selection cuts: the choice to use only one database instead of cross-matching several for better robustness [1], the choice to ignore higher ionisations, and the choice to ignore any systematic effects from the fact that the heavier elements, with their shorter wavelengths, are just downright harder to work with experimentally.

Finally, what do our current best atomic models predict that this distribution should be? These authors seem to think nobody models atomic spectra...

[1] See here for one such effort of comparing various databases: https://www.aanda.org/articles/aa/full_html/2018/04/aa31933-...


> The correspondence between the global distribution of all experimentally known atomic spectral lines and the Planckian spectral distribution associated with black-body radiation at a temperature of T ≈ 9000 K is indeed "funny". The match seems close enough to be worth investigating.

"The most exciting phrase to hear in science, the one that heralds new discoveries, is not 'Eureka' but 'That’s funny...'" —Isaac Asimov


The universe is at equilibrium in the sense that transitions from energy to matter and vice versa are more or less balanced at the scale of the universe. So the critical temperature of equilibrium between densities of radiation and matter in the universe writ large would appear to be well-approximated by the global temperature of the universe at any point in time except for those times near the Big Bang itself.


Thinking out loud:

When the universe was at 9000K, the vast majority of these elements did not exist or only existed at negligible concentration. Look up “Big Bang nucleosynthesis”. It would be interesting to see if the result is reproduced at all when looking at only light elements.

Of course the bin width makes little difference. Bigger bins would just smooth the curve.

There is probably a huge bias in that this looks at transitions that are interesting to the NIST database. As the authors allude, there are huge numbers of transitions that almost, but don’t quite, ionize an atom. Similarly, there are huge numbers of X-ray transitions in which inner electrons are kicked to very high levels or removed entirely. I don’t know to what extent the latter is well represented in the database.

For that matter, there are transitions between bound states and unbound states. Imagine that you light up hydrogen at 13.6 eV plus a little bit. I think you can still eject electrons — the excess energy can be carried away as kinetic energy. (There can be issues with simultaneously conserving energy and momentum.) The unbound states are genuinely continuous.

I didn’t look for real, but the NIST data has too many entries to represent just the spectra of cold atoms. I have a sneaking suspicion that researchers are measuring emissions from hot gases or plasmas, perhaps heated to near 9000K.


Certainly for the ion lines (hugely important for modelling stars, for instance) you are talking about trying to measure very hot gases, and I've been told the problem is that it's hard to make a gas very hot while also dense enough that you get detectable radiation out of it.


That makes sense.

Part of my point is that the authors found a temperature scale in the NIST data. One plausible source is experimental considerations: if enough of the experiments are conducted at similar temperatures, you might expect to see something related to those temperatures in the data.


The authors look at the NIST spectral line database and make the surprising finding that all atomic spectral lines taken together approximate a black-body spectrum at a temperature of 9000K very well.

The authors do not yet have an explanation for this conundrum, but they note that this temperature plays a role in theories of the formation of the universe.


Don't know why you have been down-voted - thanks for the TL;DR.


From the same paper, I'm not sure this passage helps its case. It seems off-topic and rather breathless in its speculation, although I respect their positive imagination:

An entirely different yet equally fascinating possibility would be that, in an abstract sense, the scientific community itself can be interpreted as a thermodynamic ensemble. In this line of thinking, the individual members would be subject to a Boltzmann distribution in “curiosity” associated with a “temperature” determining how likely each researcher is to carry out research more or less closely tethered to a specific area of interest. In turn, a type of entropy could be associated with the amount of information contained in this ensemble, or exchanged between sufficiently large subsets of it. If correct, the implications would be truly profound, and could reshape the future direction of science in ways never before imagined. Understanding the mechanisms with which to influence the “curiosity temperature” would allow wise policy makers to implement suitable conditions that foster scientific progress, and usher in a new era of discovery...[goes on at length]


+1 Full text of the article is available for free.

Challenge to HN community: Let's make a serious effort to understand exactly what the paper says before either throwing rocks or talking about how awesome it is.


Further challenge - if someone does understand exactly what the paper says, would you mind adding an ELI5 for those of us who don't?


A spectral line is a special key frequency that can unlock an electron in an atom and cause it to move to a higher energy level. Several thousand of these lines are known. The authors made a bar chart that showed how many spectral lines were known for each frequency. They found that the shape of the bar chart looked like the shape of the spectrum of the light that would be emitted by a glowing stove element at 9000K (admittedly this would vaporize the stove element but just imagine the red-orange-yellow-white thing extended a bit further). The "spectrum" of the light means a chart of how bright each color of the rainbow would be if you made a rainbow out of the light by diffracting it through a prism. (For example, if I made a rainbow out of an Edison lightbulb's light, the blue and purple part of that rainbow would be much dimmer than if I made a rainbow out of sunlight.)

There is no particular relationship known between these two things. The authors are curious about why the charts look the same. The authors forgot to pull the old trick where you publish your speculation separately from your experimental results[0], so HN is complaining about their speculation.

[0] The trick works because physicists are mainly interested in remembering right answers, so if your speculations are wrong they will remember only the experiment, and if your speculations are right they will remember both.


> [0] The trick works because physicists are mainly interested in remembering right answers, so if your speculations are wrong they will remember only the experiment, and if your speculations are right they will remember both.

This made me smile!


Could this be used to hypothesize about the existence or characteristics of new elements?


Whether or not an element is stable is determined by what goes on in the nucleus - the electrons are like little flies that show up and hang around something thousands of times heavier. Now, there are such things as nuclear spectral lines, which tell us important things about what goes on in the nucleus. One idea the authors might want to pursue would be checking whether or not their chart similarity also happened for nuclear lines.


It says: hey, we plotted two things on top of each other and got them to kind of fit, which (might be) evidence that the "scientific community itself can be interpreted as a [ideal gas]".


Years ago, I read a paper that looked at a database of all known organic molecules, and found that a disproportionate number of them had an even number of carbon atoms. I can't find it now, of course.

The paper was a similar "that's funny, I wonder why" sort of piece. The tentative explanation I remember is that a lot of those organic molecules are natural products, and the nature of biosynthetic pathways is that they tend to add carbons two by two. Which I don't think is even true - terpenoids are built five carbons at a time.


The authors describe an observation of similarity between two plots of completely different things. That could be an interesting paper if they had investigated this similarity with some skepticism and shown it to be robust, independent of the particular choices they made (bin width, choice of the subset of all spectral lines).

But they apparently did not do their job. Instead, they indulge in speculation about even crazier connections with a past state of the universe or with the behaviour and biases of the scientific community.

This kind of half-baked observation and speculation is an interesting lunchtime discussion topic that can potentially lead to something substantial, but it really should not be published as a scientific paper.

Also, this paper is a good example of what is wrong with physics academia (and perhaps other academic fields as well): 4 authors, 18 references to other people's work, and a statement of conflict of interest.

Sad state of physics, year 2020.


Do you think there should be more, or fewer, authors? And more, or fewer, references? I don't get your objection.


Uh, they could as a start include the theoretical predictions for all spectral lines?

Instead they take the subset of lines that have ended up in one (out of several) databases on spectral lines and, without any real motivation, declare this to be a complete sample.

Finally, putting the main result (Figure 2) before the section on data collection ("Experimental Section") is just rude.


Also, is it bad that the conflict of interest was disclosed?


It's a meaningless section.


Are you saying that conflict of interest disclosures in general are meaningless? Or this one is just poorly implemented? I'm having a hard time interpreting your meaning here.


I do not know in general, but in the case of a theoretical physics paper, I have trouble seeing how such a statement provides anything to the reader.


In the context of what this paper brings to the reader, it should not have been published. If the paper had at least done the chores and proposed some scientifically valuable hypothesis, then it could be OK to publish. But if one guy had the original idea and worked his way through the required steps while the others only discussed it with him a little, he should have published by himself, possibly with an acknowledgement of the others. There is no work for 4 people here. The reference list is ridiculously long and nobody is going to read them all in order to "get" this paper.

Clearly, this is a product of the cultural, societal and economic pressure to "get published" often, quality or value be damned. My cousin, who knows a little about science and academia, once told me scientists should publish at least once a month, otherwise they are doing too little work. Obviously, academia agrees.


What are the "two plots of completely different things"? You have two plots of intensity vs. wavelength, where one is spectral intensity and the other is a histogram binning.

On its face, there's zero reason why these plots should line up as well as they do; this goes beyond coincidence.


"two plots of completely different things" is not meant as criticism, just a description of what the paper is suggesting. It's ok to seek and investigate similarities between completely different things.

They are both plots against a wavelength axis, yes, but the "things" plotted on the y-axis are completely different concepts.


Sounds like someone in defence of citation circles.

I welcome the team's fresh view: a genuinely new observation brought to paper, instead of rehashing (albeit rigorously citing) the papers of academic brethren.


I think those references are too trivial and too numerous.


They also just take NIST as the holy source of truth, which seems highly suspicious if one was out to get some sort of fundamental truth in the wavelength distribution of the lines.


Eh, one of their proposed explanations (really, their only explanation beyond the vague "coincidence" and "unknown physical law") is specific to how the NIST database is compiled, so I disagree that they're taking it as a "holy source of truth".


I can't find any description by the authors of how NIST is compiled besides "comprehensive", so I don't understand what you refer to.


They're referring to this argument for where the distribution may come from (where the authors "model" the scientific community -- namely the contributors to the NIST database -- as a Boltzmann distribution in possibly the most hand-wavey argument I've seen in a paper):

> An entirely different yet equally fascinating possibility would be that, in an abstract sense, the scientific community itself can be interpreted as a thermodynamic ensemble. In this line of thinking, the individual members would be subject to a Boltzmann distribution in “curiosity” associated with a “temperature” determining how likely each researcher is to carry out research more or less closely tethered to a specific area of interest.

It's a cute idea, and maybe something I'd enjoy arguing over a drink. But the obvious problem with such a model is that there is no theoretical basis to argue it from -- if only because Boltzmann distributions (as with most thermodynamic effects) only start to apply when you have so many indistinguishable particles and thus so many microstates that multiplicative factors on the scale of Avogadro's number become trivial. Scientists are neither indistinguishable, nor are they this numerous.

To be fair, the effect described here is something that I wouldn't expect a-priori. I'm just disappointed the paper doesn't really offer much of a conclusion (or even hint at a decent argument).


> To be fair, the effect described here is something that I wouldn't expect a-priori.

True, but I'm not that convinced the Boltzmann really is the most natural distribution to claim fits. Especially on the blue side it looks like the residuals could be pretty atrocious. Why didn't they try some other right-skewed distributions? You could try a log-normal, for instance; that would have a much simpler interpretation.

I think this crazy thermodynamic idea is really the entire motivation for the paper, and explains why they didn't really spend any effort exploring it.


I guess my point is that I wouldn't necessarily expect there to be any clear trend to the set of all spectral lines. The Boltzmann fit looks fairly iffy to me as well -- it seems like they just picked an arbitrary distribution (which had a cute pseudo-explanation) that was unimodal with a long tail.

One possible explanation I thought of (which I'm surprised the paper doesn't consider) is whether this is just showing the distribution of wavelength ranges of spectrometers that researchers are using. I tried to find some examples online, but I guess you'd need to be involved in the field to know what exactly to search for.


> I wouldn't necessarily expect there to be any clear trend to the set of all spectral lines

Me neither, but admittedly mainly because I've never even considered the question.

Some reflection gives me the expectation that there should be fewer high frequency lines due to conservation of energy, and very many low frequency lines.

Then I imagine experimental limitations mean it's very hard to see all the low frequency transitions, but honestly I have no idea how the cross sections of the transitions go with energy so I should stop speculating.


Or it can be seen as "we need some standard, so we'll use NIST".

Then the data is at least normalized with respect to NIST.


You'll have to explain why using a single database, when there are several available that focus on being extra reliable on different elements, is a defensible choice.

I fully understand "let's just use NIST because that's easiest", but that's not a serious attitude if you want to claim something about reality rather than the NIST database itself.


I see that you only attacked the data source, NIST. You didn't attack/discuss normalization of said data, which was 1/2 of what I said.


What normalisation? There's certainly no clear talk of normalisation in the paper.


A couple of days ago, there was also a discussion of this topic on Reddit.

https://www.reddit.com/r/Physics/comments/gf1kbd/if_you_over...


Very cool finding that IMHO goes beyond coincidence. Some are saying the authors compared "two completely different plots" - this is a strange thing to say. Both plots have an x-axis of wavelength, so that's obviously the same. There's some ambiguity in the units on the y-axis, since the BB curve is in W/m^3 while the histogram is unitless.

However, it's really not a stretch to consider the BB curve as some relative intensity, so it's totally reasonable to overlay these plots.

A possible way to reconcile this would be to model some gas mixture composition and determine its aggregate spectrum, which would be in W/m^3.

E: downvote(s), do you have a rebuttal, or just think I'm wrong?


It's hard to come up with a thought-experiment where this radiation would manifest.


Worth keeping in mind that the blackbody emission curve is generated by some pretty simple equations, multiplying the higher density of states at higher energies by the lower probability of states being occupied at higher energies. Not surprising that something similar (exponential suppression of X via the Pontryagin dual of X) would show up in other contexts, so I think “coincidence” is actually pretty likely.
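For reference (this is just a sketch of the standard textbook factorization, not anything from the paper), Planck's law in frequency really is the product of a mode count that grows like nu^2 and a Bose-Einstein occupancy that dies off exponentially:

    import numpy as np

    h, c, k = 6.626e-34, 2.998e8, 1.381e-23

    def planck_energy_density(nu, T):
        # u(nu, T) = (density of states) x (mean Bose-Einstein occupancy) x (photon energy)
        density_of_states = 8.0 * np.pi * nu**2 / c**3        # modes per unit volume per unit frequency
        occupancy = 1.0 / (np.exp(h * nu / (k * T)) - 1.0)    # mean photon number per mode
        return density_of_states * occupancy * h * nu

The competition between those two factors is what gives the rise, the peak, and the exponential tail.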


Jesus, physics has gotten REALLY BAD indeed. This is clown car tier.

If you overlay all atomic spectra, you expect the result to look something like the Gaussian Unitary Ensemble (i.e. the eigenvalue statistics of a random matrix), which looks much like the Planck distribution. Nothing to do with matter in the early universe; most atoms didn't exist in the early universe. TL;DR: contemporary physicists fail at elementary statistical distributions and, like, common sense.
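For anyone who wants to eyeball that: the GUE eigenvalue density itself is a semicircle; it's the nearest-neighbour level-spacing distribution (the Wigner surmise) that has the right-skewed, Planck-ish shape. A rough numerical sketch, purely illustrative:

    import numpy as np

    rng = np.random.default_rng(0)

    def gue_spacings(n=200, trials=100):
        # Nearest-neighbour eigenvalue spacings of random GUE matrices, rescaled to unit mean.
        out = []
        for _ in range(trials):
            a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
            m = (a + a.conj().T) / 2.0                  # random Hermitian (GUE) matrix
            eig = np.linalg.eigvalsh(m)
            s = np.diff(eig)[n // 4: 3 * n // 4]        # central spacings, away from the spectrum edges
            out.append(s / s.mean())
        return np.concatenate(out)

    # Wigner surmise for GUE: p(s) = (32 / pi^2) s^2 exp(-4 s^2 / pi), a right-skewed unimodal curve.
    s = np.linspace(0.0, 4.0, 400)
    wigner = (32.0 / np.pi**2) * s**2 * np.exp(-4.0 * s**2 / np.pi)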


Thanks, I remembered GUE and was wondering if it is related.


[deleted]


Citations needed. Also, do you even math, bro?



