Totally agree. I'd go even further and make free licenses on scientific source and datasets mandatory. Research that is funded by public money should lead to public code and data.

So, $1 of public funding triggers public code release? Or is there a threshold?

No one has been able to tell me why the need for reproducibility requires software freedom.

Consider the program 'nauty'. It is available in source code form for anyone to review, but it cannot be used for military purposes. That's not free, certainly. But isn't that enough to call it good science?

Similarly, consider the clause "only for use in verifying the result of paper X". That's also not free. But it serves the goal of letting others be able to verify X.

Also, you haven't gone far enough. It's not only the license that matters, but access. You have to mandate that either anyone can get access to the code for no/low cost for some years (since I can sell my GPL'ed software for $30,000 or take down the download link once published), or link publication with a required submission to some repository with the mission of keeping all that source and data around, available to anyone, at no cost.

Yes, $1, no threshold necessary.

For projects where the cost of publication of data is not worth the dollar, people will just not accept the dollar. (That's where a 'natural threshold' comes in.)

So, if I have a summer intern who is paid by the government, and helps on my otherwise non-government-funded project, then is that enough to trigger the requirement?

Or, if I use a government facility, like a supercomputing center, does that also trigger the release requirement? Even if no money changes hands? What about a government network?

It sounds very tricky. If I get a government grant to work on a project, and that grant buys equipment X, which I also use for another project, then must both projects be subject to this form of release?

Even if the second project is 10 years later?

If, in the process of working on project X, I find some new knowledge Y and publish it, even though Y has nothing to do with the original grant for X, does that count? (Think AT&T's observation of the cosmic microwave background, when their goal was to reduce noise in terrestrial microwave communications.)

It seems like a very complicated scheme.

No, it's very simple. If you want "the people" to pay for you to do things at all, you owe them all of the things or you don't get their money. Your objections make it seem complicated, but this feels like an effort to muddy fairly clear waters.

I'm pointing out that it's impossible to work this way. People don't like it. The accounting systems aren't set up for this. The entire post-war research system isn't structured this way.

The entire post-war research system is set up to create endless streams of non-replicable studies based on incorrectly-applied statistics all the while convincing itself that it's doing something useful while wasting billions upon billions of dollars.

I'm not too impressed by "well, that's not how it works right now". The whole problem is "how it works right now". That's what we're discussing, the need for it to not work that way.

You want to change the system. I understand that.

We have many systems to go on over the last few hundred years of science. We have the pre-war system, primarily funded by private philanthropy. We have the communist system.

None of them seem to create the stream of highly replicable studies you want.

That may indicate something deep about how people work and how science is really done, and suggest that your admirable goals are not tenable.

I tend to agree, actually. In some sense the real solution is to take science off its pedestal, as it does not generally deserve to be up there. The 17th-19th centuries were in some sense a fluke of low-hanging fruit, and the science of most of the 20th and the 21st centuries does not deserve to be regarded with the same worshipful gaze, a word I choose carefully. By taking it off its pedestal and subjecting it to a lot more scrutiny, we'd all win, including science in general.

This is not necessarily because we're worse people than them, but because the problem is now much much harder. It's always better to acknowledge that hard problems are hard, rather than trying to solve them by pretending they're easy.

As for the model I would propose, I believe all funding models are fundamentally flawed, and the best model is all of them at once, so hopefully the flaws at least sort of cancel out. At the moment, that generally means seeking a decrease of the current government funding strategy and breaking the peer review monopolies, not because either of them is necessarily especially bad, but because they are too powerful and their flaws are coming to define the flaws of science in general.

Some of this would just be a mindset change, to recognize that "research" isn't isomorphic to "producing peer-reviewed papers" and that there's nothing wrong with setting up some equivalents of Xerox PARC in other disciplines. Potentially with government money, since my point is more about multiple models than the literal funding sources. If "science" as it is practiced today were less pedestalized, this would be a much less horrifying suggestion.

"Science" is no longer on a pedestal. The PR campaigns against conclusions for leaded gas, smoking, acid rain, global warming, and vaccine safety, and the scientific development of leaded gas, ozone-depleting CFCs, Agent Orange/dioxin, etc., plus concerns like GMOs and Monsanto, mobile phone safety, plasticizers/hormone disruptors, and more make for a decidedly mixed view of science by the general public.

As you can see from http://www.pewforum.org/2013/07/11/public-esteem-for-militar... , the military, teachers, and medical doctors are on higher pedestals than scientists.

That said, I'm all for the mixed development model.

I think it's on a pedestal where it matters, where the funding decisions are being made. The public's opinion only matters in the long term. Though, admittedly, the long term is probably coming up pretty quickly. There are a lot of things whose public opinion has been trending negative for so long that people have become blasé about those negative trends suddenly coming due this year.

On reconsideration, I'm wavering on my belief. NIH and NSF get a lot more support than, say, the NEA. HST, fine as it is, was extremely expensive.

I still maintain that the military is on a higher pedestal than science, in terms of funding and prestige. You hear stories of people buying military people in uniform their meal, to honor their service. That's much less common for scientists.

No, it isn't.

You calculate the fair market value of the public resources you used, and subtract what you paid the public for them. If it is positive, you have a publicly-supported project.

So if you use that government-paid intern for several hours, you ought to pay their agency or department $7.25 for each. You pay for what you use, and there's no problem.
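To make the rule concrete, here is a minimal sketch of that arithmetic in Python. The function names and the 3-hour figure are made up for illustration; only the "value used minus amount paid" test and the $7.25 hourly rate come from the comment above.

```python
from decimal import Decimal


def net_public_support(fair_market_value: Decimal, amount_paid: Decimal) -> Decimal:
    """Value of public resources used, minus what was paid back to the public."""
    return fair_market_value - amount_paid


def is_publicly_supported(fair_market_value: Decimal, amount_paid: Decimal) -> bool:
    """A project counts as publicly supported if the net transfer is positive."""
    return net_public_support(fair_market_value, amount_paid) > 0


# Hypothetical case: 3 hours of a government-paid intern at $7.25 per hour.
HOURLY_RATE = Decimal("7.25")
hours_used = 3
intern_value = HOURLY_RATE * hours_used

# Paying nothing leaves a positive net transfer, so release is triggered.
print(is_publicly_supported(intern_value, Decimal("0")))
# Paying the full value leaves no net transfer, so it is not.
print(is_publicly_supported(intern_value, intern_value))
```

The hard part, of course, is agreeing on the `fair_market_value` input; the comparison itself is trivial.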

If you work in a government-built facility, and you pay rent for your space, there's no problem. It doesn't matter that the public is your landlord. The space has a market value, and you pay it. There is no net transfer of value to your project at the expense of anyone else. If someone else could make better use of your space, they could have paid higher rent to get it.

If you're accepting a grant, that makes it a bit more difficult for you. If you get $50,000, you would have to pay back $50,000, plus the interest and the administrative overhead for processing your grant request. And then there's the value of the risk premium and moral hazard. You would have to find some other source of funding to "close" the research, and you would have to do so before starting work. Otherwise, potentially profitable projects could get privatized just before triggering the public release requirement, and the money sinks would be left as public.

If you use public funds to buy equipment for a publicly supported project, and then later want to use it for a private project, you have three options: lease it from the public project, or pay the depreciated value of the equipment to buy it outright, or make your private project publicly-supported.

It isn't any more complicated than the GPL copyleft. If you use GPL'ed code, you have to make public everything you do with it. If you don't want to do that, don't use GPL'ed code.

> "You calculate the fair market value of the public resources you used"

Which is very hard for things with no market.

I use government libraries. They are free to me. What is the fair market value of that? There are private and subscription libraries, so it's not like no market exists.

What is the fair market value of time on Hubble?

> "if you use that government-paid intern for several hours ... You pay for what you use"

I think you mean $0, not the $7.25 you estimated. Under the Fair Labor Standards Act, an internship is "for the benefit of the intern", not the company. An internship is not supposed to improve the bottom line of a company. An intern may even get in the way, and cause negative value.

And that's my point. The public gains more than can easily be counted by simple, direct market valuation. What is the worth of having students with industrial training? What is the worth of having broad public access to the literature?

Or, for a more real-world case, companies might not be interested in tropical disease research because the revenue won't justify the development costs. But the US military would like to be able to send troops to places with an endemic tropical disease, so they want some way to be able to prevent or treat the disease. The US foreign diplomatic policy would also like the good-will of those countries. The US could, by subsidizing tropical disease research, tilt the "fair market value" so that it is weighted more towards its military and diplomatic policy goals.

That assumes that part of the corporate revenue comes from subsidy, and part comes from being able to sell the drug on the market. But now, if part of the revenue comes from the government, the company cannot seek patent protection. This reduces the profit expectation, which means the government will need to subsidize the project even more to get a company to be interested in the effort.

The public does not want its funds diverted into private profits, period. Allowing any exceptions or leeway is a wide-open door to a trough filled with corruption and deceit.

You receive no special additional benefit from a library by being a researcher. Everyone can read the same materials as you do. Time on the Hubble costs more than any individual astronomer could pay. If you are keen on closed, private astronomy, you would need to check the NASA budget figures.

If you derive useful benefit from work done at your request, you need to pay the person doing it. If the intern is working for the government for no pay, how would they not just laugh in your face when you ask them to do work for you? You invented the hypothetical; I won't fix it for you.

If the military or state department could derive some benefit from subsidizing private research, they can bloody well do the research on their own. "US Army cures Dengue" would be great for both operations and PR, and would be a much better use of funds than a smart bomb that can stalk you on Facebook and blow up all your friends at the same time as you. If you as a private company want to sell a cure for Dengue on your own, then don't go begging the government for money. Fund it yourself!

> "The public does not want its funds diverted into private profits, period"

Sure. But no grants allow the diversion of funds into private profits, so I don't know what you're referring to.

Take the SBIR grants. It's a way for the government to help small, for-profit companies do the R&D that might lead to results that will benefit the overall US economy and policy. The hope is for the companies to commercialize the results and do well.

It's not money that the SBIR recipients can use to party on Maui. The SBIR system has accounting and oversight in place to help prevent that.

Or, take the (infamous) Bayh–Dole Act. Quoting from https://en.wikipedia.org/wiki/Bayh%E2%80%93Dole_Act :

> The key change made by Bayh–Dole was in ownership of inventions made with federal funding. Before the Bayh–Dole Act, federal research funding contracts and grants obligated inventors (where ever they worked) to assign inventions they made using federal funding to the federal government. Bayh–Dole permits a university, small business, or non-profit institution to elect to pursue ownership of an invention in preference to the government.

That sounds very much like the public, through its elected officials, doesn't actually want what you say it wants, because what we had was more like what you say we should have, and they decided to change it.

Just set up some very clear and simple rules. Don't worry about small thresholds when setting up the rules; simplicity trumps.

(Corner cases where small amounts of funding would trigger the requirements can be worked around by just not using that funding, as long as the circumstances triggering the requirements are easy to predict.)

What about defense research?

Well, there's the GPL-styled approach: anyone with access to the results must also have access to the associated data. This doesn't mean it is mandatory to make it public, though you'd have to restrict the redistribution freedom.

I recently used a large dataset of tweets in a research project. As far as I know, I do not have the rights to distribute these.

I also used a dataset consisting of newspaper articles. It cost me $1,000 to get access to, and I definitely do not have the rights to redistribute it.

As long as you provide a detailed enough description of the source of your dataset that I can reproduce it myself, that is fine. So in your first case, tell me what criteria you used to select your tweets, and in the second, tell me where to send my $1,000 and what to ask for.

Unfortunately not everyone reports this information. Here is a study that we did of over 500 papers using online social network data: http://tnhh.org/research/pubs/tetc2015.pdf While most authors would report high-level characteristics (e.g., which social network they measured), fewer authors reported how they sampled the network or collected data, and very few people reported on how they handled ethics, privacy and so forth.

In that vein, what about {industrial|scientific|...} espionage?

