
The old calculation relies on older experimental results that have been verified by multiple experiments - so if the older value is wrong, it means either the calculation was done wrong (possible), or the experiments all have had a significant correlated systematic error that has never been caught (also possible). However, I’d say both of those things are relatively unlikely, when compared to the probability of some small error in a new paper that was just released that uses a new method that involves lattice calculations. This is all a balance of probabilities argument, but from my experience in the field, I’d say it’s more likely that any errors in calculation or missed systematics would be in the new paper.

However, I’m an experimentalist who has worked close to a lot of this stuff, not an actual theorist, so I’d love to get a theorist’s interpretation as well.




I'm a lattice QCD practitioner. What I'll say is that the BMW collaboration isn't named that by coincidence---they're a resource-rich, extremely knowledgeable, cutting-edge group that is the envy of many others.

They're also cut-throat competitive, which is very divisive. Grad students and postdocs are forced to sign NDAs to work on the hot stuff. That's insane.

What's worse, from my point of view (as an actual LQCD practitioner) is: they're not very open about the actual details of their computation. It's tricky, because they treat their code as their 'secret sauce'. (Most of the community co-develops at least the base-level libraries; BMW goes it alone.)

OK, so they don't want to share their source code; that's fine. But they ALSO don't want to share any of their gauge configurations (read: monte carlo samples) because they're expensive to produce and can be reused for other calculations. So it'd be frustrating to share your own resource-intensive products and have someone else scoop you with them. I disagree with that, but I get it at least.

My biggest problem, and the one that I do not understand, is their reluctance to share the individual measurements they've made on each Monte Carlo sample. Then, at least, a motivated critic could develop their own statistical analysis (even if they can't redo the whole computation from scratch).

Because of the structure and workflow of an LQCD calculation it's very difficult to blind. So, the only thing I know to do is to say "here are all the inputs, at the bit-exact level, to our analysis, here are our analysis scripts, here's the result we get; see if you agree."

This is the approach my collaborators and I took when we published a 1% determination of the nucleon axial coupling g_A [Nature 558, 91-94 (2018)]: we put the raw correlation functions as well as scripts on github https://github.com/callat-qcd/project_gA and said "look, here's literally exactly what we do; if you run this you will get the numbers in the paper." It's not great because our analysis code isn't the cleanest thing in the world (we're more interested in results than in nice software engineering). But at least the raw data is right there, we tell you what each data set is, and you're free to analyze it.
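To make the "you're free to analyze it" point concrete, here is a rough Python sketch of the kind of thing a reader could do with raw two-point correlation functions. This is not the actual project_gA analysis code; the data here is synthetic and the names and shapes are placeholders.

    # Hypothetical sketch, not the actual project_gA scripts: jackknife-resample
    # raw two-point correlators C(t) and extract an effective mass.
    import numpy as np

    def jackknife_means(samples):
        # Leave-one-out averages over Monte Carlo samples (axis 0).
        n = samples.shape[0]
        total = samples.sum(axis=0)
        return np.array([(total - samples[i]) / (n - 1) for i in range(n)])

    def effective_mass(corr):
        # m_eff(t) = log[ C(t) / C(t+1) ] for a decaying two-point function.
        return np.log(corr[:-1] / corr[1:])

    # In practice you'd load the published per-sample correlators; here we fake
    # some noisy exponential data just so the script runs end to end.
    rng = np.random.default_rng(1)
    t = np.arange(16)
    corr_data = np.exp(-0.5 * t) * (1 + 0.01 * rng.standard_normal((200, 16)))

    jk = jackknife_means(corr_data)                       # (n_samples, n_t)
    meff_jk = np.array([effective_mass(c) for c in jk])   # (n_samples, n_t - 1)
    meff = meff_jk.mean(axis=0)
    meff_err = np.sqrt((len(jk) - 1) * meff_jk.var(axis=0))
    print(meff, meff_err)

With the raw per-sample data in hand, anyone can run this sort of independent statistical analysis and compare against the numbers in the paper.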

BMW does nothing of the sort. They (meaning those with power to dictate how the collaboration operates) seem to not want to adopt principles of nothing-up-my-sleeve really-honestly-truly open science. So their results need to be treated with care. That said, they themselves are extremely rigorous, top-notch scientists. They want you to trust them. Not that you shouldn't. Trust---but verify. That's currently not possible. I bet they're vindicated. But I can't check for myself.


> OK, so they don't want to share their source code; that's fine.

No, it is not. It is exactly why their results are not trustworthy.

Publish the code, let it be checked by the peers.

Closed source code has no place in science and most journals now rightly demand open code for the publications.


I appreciate the absolutist position. However, everybody agrees on what code _must do_. If you need to solve a (massive) system of linear equations (as happens often in LQCD) you can then take your alleged solution and plug it in and check. A variety of those sorts of things prevent you from doing anything too wrong. If you screw up gauge invariance, for example, you will get 0. There are agreed-upon small examples. Plus other benchmarks---they computed the hadronic spectrum. They computed the splitting between the proton's mass and the neutron's mass.
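For instance, a plug-it-back-in check of a linear solve doesn't care whose code produced the solution. A toy version (stand-in matrix, nothing LQCD-specific) looks like:

    # Toy residual check: verify an alleged solution x of Ax = b by substitution,
    # independently of the (possibly closed-source) solver that produced it.
    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((500, 500)) + 500 * np.eye(500)  # stand-in matrix
    b = rng.standard_normal(500)

    x = np.linalg.solve(A, b)   # pretend this came from someone else's black box

    residual = np.linalg.norm(A @ x - b) / np.linalg.norm(b)
    assert residual < 1e-10, f"relative residual too large: {residual:.2e}"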

If you spend hundreds or thousands of man hours optimizing, for example, assembler for a communications-intensive highly-parallel linear solve, it's fair to be reluctant to give it away. If you do, others will get the glory (publications / funding). Some people do share anyway [e.g. this solver library for the BlueGenes https://www2.ph.ed.ac.uk/~paboyle/bagel/Bagel.html]. Most people are happy to let others do the hard work of building low-level libraries. But they COULD decide to write custom software that'd go faster. If their custom software reproduces results that community-standard libraries produce, that's not nothing.


I think this clarifies a misunderstanding I had from your original comment.

It sounds like the "secret sauce" for this collaboration includes a set of numerical libraries. They would get relatively little funding, few publications ("glory", as you say), and at best be reduced to a citation (if people remember to cite their libraries) if all they did was improve the backbone of lattice QCD with better software.

So instead they keep it internal. It's a bit sad that there's so little glory in writing better numerical libraries, but it's a common problem across the sciences (and in the open source community in general) so I can believe they'd be reluctant to share.


> It sounds like the "secret sauce" for this collaboration includes a set of numerical libraries.

Indeed. There are really only a limited set of (physics) choices when making these libraries. As long as the discretization you pick goes to QCD in the continuum limit, you can make whatever choices you want. Some choices lead to faster convergence, or easier numerics, or better symmetry, or whatever---at that point it's a cost/benefit analysis. But if your discretization ('lattice action') is in the QCD universality class ('has the right continuum limit'), you're guaranteed to get the right answer as long as you can extrapolate to the continuum.
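As a cartoon of what "extrapolate to the continuum" means in practice (the numbers below are invented, just to show the shape of the procedure):

    # Illustrative continuum extrapolation: fit an observable measured at several
    # lattice spacings to obs(a) = obs_continuum + c * a^2 and read off a -> 0.
    import numpy as np

    a   = np.array([0.12, 0.09, 0.06])       # lattice spacings in fm (made up)
    obs = np.array([1.250, 1.232, 1.221])    # observable at each spacing (made up)

    slope, intercept = np.polyfit(a**2, obs, 1)   # linear fit in a^2
    print(f"continuum-limit estimate: {intercept:.4f}")

Different actions converge at different rates, but as long as the continuum limit is right, these choices only affect how hard the extrapolation is, not where it lands.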

> It's a bit sad that there's so little glory in writing better numerical libraries.

Agreed, but physics departments (by and large) award tenure for doing physics, not for doing computer science. It's hard to get departments to say "yes, your expertise in optimizing GPU code is enough to get you on the tenure track".

> It's a common problem across the sciences. [...] I can believe they'd be reluctant to share.

The larger community does center around common codes. The biggest players are

USQCD http://usqcd-software.github.io/
quda http://lattice.github.io/quda/
grid https://github.com/paboyle/Grid/

but there are others, and there are private codes (like BMW's) too.

As part of the SciDAC program and now exascale initiative the DOE does fund a few software-focused national lab jobs. But not many.


Saying they treat their code as secret sauce is a pretty damning accusation for scientists. I've seen a few other cases where relatively closed groups of otherwise top-notch scientists claim an interesting discovery [1,2], and then turn out to be wrong. It rarely ends anyone's career and the fiasco tends to fade in a few years, but it leads to a bit of a media circus, for better or worse, and is mostly distracting for the field as a whole.

I know nothing about this collaboration, but if what you say is true this isn't good science.

[1]: https://www.math.columbia.edu/~woit/wordpress/?p=3643

[2]: https://en.wikipedia.org/wiki/DAMA/NaI


I really want to stress: it's excellent science, and that's why they hold their code tightly. You can say 'no, a true scientist publishes everything' but---says you.

As someone in the field let me assure you: everything, of course, is more complicated than you make it out to be. I understand the absolutist position. But in a world of finite and ever-shrinking resources (grants, positions, etc.) it's fair to try to push your advantage. If funding were plentiful, adopting standards of publish-every-line-or-it-doesn't-count would be fair. People would have plenty of time and resources to get that done. As it stands there are basically no incentives to behave that way and being strapped for human resources puts the issue at the bottom of the list compared to actually getting results.


I'm not an absolutist, and don't want to come off as one. I'm just not in lattice QCD :)

What degree of data sharing is considered normal there? Across experimental physics it varies a lot: astronomers are often required by the funding agencies to make the data public, whereas particle physics experiments have traditionally shared very little (although pressure from funding agencies has started to change this too).

Given the ways you described this collaboration, my questions are:

- As an experimental physicist, when will I be able to believe them? Do we wait around for someone else to cook up a batch of similar secret sauce to confirm the result? Will they release their gauge configurations after some embargo period? Or should we believe them just because they are top-notch? I've seen top-notch groups like this fall before, so it seems quite reasonable if experiments aren't citing them now.

- Should funding agencies be attaching more importance to openness in science? From what you describe (and sorry if I'm misinterpreting you) there is very little incentive to share things that would make their results far more useful. Of course nothing is simple, but I've seen collaborations reverse their stance on open data overnight in response to a bit of pressure from the people writing the pay checks.


> Do we wait around for someone else to cook up a batch of similar secret sauce to confirm the result?

It took you folks 20 years to redo the experiment. Independent lattice calculations have already been underway for some time; I would expect (but I won't promise, not working on the topic myself and not having any particular insider information) results on the year-or-two timescale.

> Will they release their gauge configurations after some embargo period?

BMW probably will not do this. In their recent Nature paper they do say that upon request they'll give you a CPU code, but what they provide is a nerfed version that produces the same numbers, not their performant production code. ... annoying.

> Or should we believe them just because they are top-notch?

Well, maybe? Why do you believe the theory initiative's determination of the vacuum polarization or the hadronic light-by-light? Somehow it's more sensible to back out those things by fitting experimental data than by doing a direct QCD calculation? There are no free parameters in a QCD calculation, but fitting... well, give me a fifth parameter and I can wiggle the elephant's trunk.

> I've seen top-notch groups like this fall before, so it seems quite reasonable if experiments aren't citing them now.

I think it's wrong not to hedge the experimental results and it's wrong not to cite them, but I understand why experimentalists wouldn't take their result as final either.



