Why Is NumPy Only Now Getting Funded? (numfocus.org)
358 points by numfocusfnd on June 27, 2017 | 108 comments

We have this problem in the .NET world. Accord.NET is written by a brilliant academic and programmer. It's well written, and has a good API, but it is largely the effort of this one dude, with minor contributions from a smattering of other fellows.

Again, it is great in general, but it has bugs and rough edges here or there, and a lot of people don't trust it for production. I wish there was a way for people to be properly compensated for building and maintaining such vital scientific and mathematical computing software.

That's because people don't actually value free things, until they come to depend on them, and then there is no way for that to continue unless they pay. And even then they will pay the bare minimum to keep it going. That's the whole psychology behind open source. Companies could pay for it and those using it can more than afford to do so, but it's free, and apparently it's going to stay that way so why not ride that gravy train?

.Net had always had a problem with open source because Microsoft implemented APIs and kept them closed source. Users ended up paying for it via licensing, MSDN, and Visual Studio.

I often see a lot of really nice Java frameworks with a .Net clone which gets much less attention (Hibernate, some of the Spring stuff, etc).

While I understand Microsoft is doing their best to open up .Net, I'm hesitant to believe they'll ever get to the point where original implementations are done in it, and then ported to other langs - it just has a deservedly bad rep.

I started programming C# professionally in 2013...the ecosystem has grown and been improved by leaps and bounds since then. I'm hopeful, personally.

I think one thing that keeps C# things from being implemented elsewhere is that there are just very different styles between even C# and Java. Some fancy libraries will do lots of reflectiony things, build syntax trees, and compile on the fly for performance, allowing complex configuration to be simplified and get fast code at runtime. There's also a big culture of LINQ/functional style, and fluent interfaces in libraries, etc. which doesn't seem to be present in something like the Java community (at least until streams), whereas more dynamic/weakly typed languages don't have, for obvious reasons, a culture of strong typing and letting the compiler do the work for you.

Then when you go to the hardcore functional languages like Haskell, ML, etc. you don't have a culture of OO much, not to mention they just aren't used much in production environments.

So, you see a lot of things from elsewhere ported and brought to C#/.NET, but not a lot of things moving from C# to other ecosystems.

Also a problem for Octave. Remember this?


It also stings a little when people say that it's completely obsolete because Matlab itself is "legacy" and we should all be abandoning the language, Octave included... and yet, even though I like numpy and Python and matplotlib and Julia and R, I still find myself reaching for Octave whenever I need a quick visualisation of some data.

Octave’s “business model” is “give away for free an inferior clone of a commercial product”. Why is anyone surprised that that isn’t lucrative? Anyone who has the money to spend on Octave is likely to just pay for Matlab. The defining feature of Octave users is that they want to use Matlab’s library ecosystem but they’re too cheap or broke to pay for a copy of Matlab.

Because Matlab is expensive. Very expensive. Especially if you want to run it on servers or clusters, which is something people want to do. We're talking a single company paying millions of dollars per year for Matlab licenses, versus paying thousands for Octave (this does happen, and some companies do pay for Octave, but very few). My last job consisted of fixing Octave enough to be able to run some classifiers on servers, which would have been prohibitively expensive in Matlab, and it would also have been much too expensive to rewrite in a different language. It was cheaper to pay me to fix Octave just enough for this code to run.

As to the "inferior" part, it really depends. Some people really like the Octave-exclusive features, but most are unaware they even exist.

What exclusive features are there? I'm interested.

It's really nothing major, but it's the small little cherries on top:


It's a bit of a problem to innovate too much with Octave, because Matlab may decide to implement a feature we did first, but they'll do it slightly differently, forcing us to redo our work to match theirs.

You're crazy if you're spending millions on matlab licenses. Just buy a copy of the matlab "compiler" mcc (not much of a compiler - more of a packaging system after compilation to byte code, as I understand it). It's quite expensive, but you only need one license for your whole cluster.

I was unclear, sorry. The companies that I know that are paying millions are more concerned about copies of Matlab that they can give to their engineers (actual engineers that work on hardware, not Silicon Valley "engineers" that work on ads), of which they need hundreds to thousands, and they keep running out of copies to give to their employees.

On the other hand, one of the other major data analysis platforms went the other way: R is a free implementation of the S language. TIBCO still sells S-PLUS, but my impression is that R is much, much more popular.

My impression is that Octave always lags a little behind Matlab, both in terms of features and performance, since the matlab language is essentially whatever The Mathworks says it is. R and S-PLUS both have a somewhat formal definition in the form of books and articles, which might help keep them in sync.

R's and Octave's stories are very different. Essentially one of the key originators of S, John Chambers, started to work on R. That would be like Cleve Moler being a core Octave contributor.

At what point did Chambers become involved in R?

The very beginning. Chambers was involved with the initial design of R.

How much money does R make? Are the main R contributors working as professional academics, do they have corporate jobs that pay them to work part-time on R, or are they funded full time by donations/grants, or is R just a hobby side project for them?

I’m not saying there’s anything wrong with Octave, or R, or GIMP, or ..., only that it’s difficult to make money on it.

This is especially true for Octave though, because someone who wants to do exploratory numerical computing and doesn’t have a specific need for Matlab’s library ecosystem is likely to use Python or Julia or R or ...

I'm not sure either, but I would guess it's a mix. It seems like it's most of Hadley Wickham's career, for example, and there's clearly some work being done by the RStudio and Revolution Analytics folks, but a lot of the packages are clearly the product of people working on their academic or corporate jobs. My impression is that this is also true for most matlab toolboxes (though obviously not the core language), which doesn't quite explain why people use Matlab (but not Octave), versus R (and not S-PLUS).

Why should you need a business model? You should get a grant and then a tenured position at a publicly funded institute for providing a massive public good to the science community. Unfortunately the science community as it currently is, isn't designed to do that. Luckily there is hope for change in the future.

Really? Why is that? In my mind, it is tough to beat ggplot

Partly it's more familiarity with Octave than ggplot, but partly also that the Octave syntax is just about the easiest there is out there. Type something, immediately see some graphs. Type some more, gradually modify that graph. The simplicity is a big selling point. Julia's syntax comes pretty close, but I'm not as familiar with it yet.

Remember that Matlab's and Octave's biggest audience consists of not-a-programmer programmers, that is, people who consider using Matlab or Octave as being something different than using a "real" programming language. I call them not-a-programmer because that's what I've often heard them call themselves, even as they write Matlab or Octave code.

Matlab/octave also include everything but the kitchen sink, especially if you're on an academic license. You can certainly replicate this with a python stack, but it involves a fair number of external libraries.

You might like matplotlib for visualization. It's a Python clone of Matlab's plotting tools. And it's compatible with numpy if you need to do linear algebra or operate on big datasets.

He mentions matplotlib by name.

"The problem of sustainability for open source scientific software projects is significant."

Yeah, William Stein can tell these stories too: http://sagemath.blogspot.cz/2015/09/funding-open-source-math...

Yeah, every open source scientific software seems to have an uphill task before it gets recognized. Sagemath more so than others because of its scale and having a huge code base. Having to integrate multiple open source projects isn't helping either.

But it's a fine project and one that I use. For a collection of other projects it works better than most other individual projects.

"Every successful science library has the ashes of an academic career in the making" - Same guy

Is the idea that "foundational work" (in any field) can be done without "huge sacrifices" widely accepted?

It sounds a tad unrealistic to me, unsupported by history.

It's as if people want to have it both ways: Create innovative SW, but also don't take risks or make sacrifices.

Offer software "for free" (and belligerently oppose even something like GPL), but also get paid (preferably by the government, so the people actually footing the bill have no say in it) and be long-term sustainable.

What's next: get paid, but also don't pay income taxes? Give away project control, but also keep it? :)

All understandable desires, but a little schizophrenic.

Disclaimer: I am a big fan of open source and NumPy in particular. I mentor students and OSS newcomers, I even pay one full-time dev to work only on OSS. It's just that I try not to kid myself about where the time&money comes from and where it goes, and I try not to have random people pay for my hobbies.

Extremely relevant previous HN conversation on this topic:


I think you're missing the point.

The authors of NumPy (and other scientific software) have made it possible for many, many people to do better, faster research, and their career trajectory ought to reflect that contribution.

Right now, it does not.

The NSF will certainly give you money to develop asymptotically faster matrix multiplication algorithms. They are much less interested in funding efforts that save an equal amount of researcher time by writing clearer documentation. Tenure committees would rather see 10 papers or 100 citations than a thousand pull requests, even if the last has a much bigger impact on the state of the entire field. If the authors of NumPy were totally rational, they would have written just enough code and documentation to publish something like "NumPy: A Python library for linear algebra" in the Journal of Statistical Software (or something), then moved on to something else entirely. All the work beyond that (assuming it's not supporting a future paper) comes at the expense of their academic careers. People certainly do it anyway--sometimes out of pride, or a sense of helpfulness--but they're certainly not rewarded for it.

They could be though. Funding agencies could give grants for the on-going development and maintenance of software that helps their grantees. Universities could consider contributions to the broader community as part of their hiring and promotion process, and so on.

The odd part is that this would probably be more cost-effective too. Even a $10M/5 year grant is just a drop in the bucket compared to what all NIH/NSF/etc grantees pay for Matlab or Prism licenses.

With all due respect, the career trajectory of Travis Oliphant (&co) has been nothing but spectacular! He's extremely accomplished, respected and (I hope) compensated.

You're kinda undermining your own point there :-)

In any case: "force other people to finance my hobbies" is not my favourite ideology. I find the idea of some "tenure committee" deciding what I should do with my money mildly disgusting.

You're very cavalier throwing millions around, but you do realize this is money other people had to earn and tax first, right? And that these people might have preferred to contribute and support a different cause instead, perhaps something closer to their own heart and interests?

Travis Oliphant has done very well with Enthought and Continuum Analytics. Before that though, he was an assistant prof at BYU. He's written several times about how he essentially gave up his academic career to continue working on NumPy. For example, he alludes to it in this post he wrote http://technicaldiscovery.blogspot.ca/2012/01/transition-to-... that says "[my wife] watched me sacrifice my tenure-track position by writing NumPy instead of more papers."

You agree that NumPy has been a huge boon to the scientific community and that BME (his previous field) has reaped the benefits too, right? Let's say NumPy has made minor (~5%) contributions to, say, 500 papers; this is probably a huge underestimate. If he had instead published (say) 25 papers, he would have a thriving academic career and yet his actual impact on the world would be a lot lower. Academia is really bad at rewarding work that has a broad, diffuse impact. In a better world, administrators and funders would recognize that his work on NumPy was sufficiently valuable that he wouldn't have had to choose.
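The arithmetic in the comment above can be written out as a rough sketch. The figures are the comment's own illustrative guesses; "paper-equivalents" is a made-up unit for this example, not an established metric, and `paper_equivalents` is a hypothetical helper.

```python
def paper_equivalents(papers_helped, percent_per_paper):
    """Total credit, in whole papers, from small contributions to many papers."""
    return papers_helped * percent_per_paper / 100


# Figures from the comment: ~5% contribution to 500 papers vs. 25 own papers.
conventional_record = 25
low_estimate = paper_equivalents(500, 5)

print(low_estimate)                         # 25.0
print(low_estimate >= conventional_record)  # True
```

Even the deliberately low guess of 500 papers already matches the 25-paper record; any larger count of papers helped puts the diffuse contribution ahead.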

I'm not sure this is the right article to get into "research funding==financing hobbies" debate, but...given that some research is probably going to be funded anyway, wouldn't you rather it be done in an efficient way? Right now, most academic software is a bit of a waste: it's written so that the authors can write a paper about having written it. There's no incentive to produce code that others can use or reuse, no reason to maintain it or update it, and so on. Individual labs can usually figure this out, but that also costs time and money which is coming out of the same budget one way or another.

I like this format. Now that the post slid off the main page, it's more quiet here and we can discuss at leisure :-) Sustainability of open source is a topic close to my interests and line of business.

First, I'd question that a "sacrifice" that leads to such spectacular success is really a sacrifice. Perhaps "investment" is a better word?

I'll concede there are risks involved, outcome uncertain, stressful days... but that's in line with calling it an investment. These go hand in hand. You simply cannot create anything of value without a struggle, and to think otherwise is to ignore human nature and the history of breaking new ground.

In any case, my original point above was that success cannot come without some sort of personal investment. I believe removing risk from potentially high-reward activities is an oxymoron, economic nonsense. Risk and reward are two sides of the same coin.

And if the activity is low-reward, it's just a hobby.

Masking this risk-reward connection by forcing other people to cover the risk side for you just creates perverse incentives and social aberrations. Think bank bailouts, for an extreme but inevitable example of where this line of reasoning leads to.

Or let me phrase this differently.

You seem to have this image of an honest academic, toiling away on OSS at nights, standing up to committees like Galileo, the proverbial Travis. You'd like to help him lessen his burden by a better redistribution scheme of grant money -- a commendable goal.

What I say is if the toil is worthwhile, Travis will be rewarded anyway.

In addition, there are throngs of less honest people you don't seem to consider. These people exist too and always jump out of the woodwork, given the right incentive of less-risk-more-reward. Simple economic arbitrage, a law of nature.

What I'm saying is, such a system cannot work, no matter how lofty its ideals: it doesn't help the good, it promotes the bad and unscrupulous, and morally corrupts those on the fence.

Maybe not quite that extreme, but yes, I want academia to reward good software at least as well as it rewards mediocre papers.

NumPy is unusual because it's so broadly useful: physicists and neuroscientists can use it (like the original authors), but so can quants and ad analytics folks and other people with serious budgets. Enthought and Continuum exist because they can tap into those markets, and, as a result, Travis (&co) ended up doing very well--and justifiably so. However, most scientific software is more specialised or lacks this industrial backstop, and I don't see how they could follow a similar path. There's no serious industrial application for spike sorting, for example, but it's very important to neuroscience research.

I'm not sure I understand your concern about throngs of dishonest people, or at least how it is any different from the status quo.

I am not imagining a system where money and jobs are just thrown around. Instead, I just want minor changes in hiring and funding. Academic promotion/hiring already weighs a bunch of factors: publication record, funding, peer evaluations, teaching, and various forms of service. I think writing and maintaining a widely used software package is currently under-valued there. Maintaining very popular packages might be treated as editing a journal, another form of academic service that does seem to count for something.

Similarly, it is (comparatively) easy to get money to develop new methods and write some code implementing them. There ought to be a complementary funding stream for maintaining successful packages once they are written. You could imagine a BAA-like process where applicants say, "My toolbox was downloaded 2,182 times and cited in 378 papers last year. However we've got a bunch of open issues including integration with X, Y, and Z, missing documentation for A, B, and C, and so on. I want...20% salary support to work on this over the next year and $10,000 for a freelance technical writer." As with other grants, these would be evaluated (potential impact, proposer's track record, interactions with other funded programs, etc).

I realize there are alternatives. Stephen Wolfram essentially took Symbolic Manipulation Program "private" and funded Mathematica development by selling subscriptions. It worked for him, though I think that's asking people to assume a huge career risk or change that they might not want.

I wonder if the SageMath guy ever considered something like the NIH's SBIR program (a scheme for setting up small businesses based on research).

Who said people want to do "foundational work"?

They just want to write some specific software they enjoy writing together.

The question rather is: after said software has been proven valuable and widely used by companies, and when there remain some hard and/or non-enjoyable parts to be done to complete it/enhance it (specific drivers, documentation, advanced features, etc), why don't any of the users donate some money to help?

That's an easy one -- because they don't have to :-)

In some cases, "paying extra" without being asked to might even be compromising your fiduciary duties.

I feel the open source movement has spoilt its "user base" a little. There's very little expectation of reciprocity. Sometimes even outright hostility toward OSS modes that try to work with reciprocity (check out the linked HN thread above).

This mindset might be hard to reverse at this point. As with all "communal sharing" experiments, the true crunch time comes as the economic reality sets in.

Donating company money is really hard, especially if you're just some manager of some department, and doubly so if it's not to a registered charity. Much easier to actually buy something like a support contract or a license where you get a purchase order and a receipt from a registered organization. People at companies simply cannot just give company money to random people with paypal accounts even if they think it's the right thing to do.

>Donating company money is really hard, especially if you're just some manager of some department, and doubly so if it's not to a registered charity.

Maybe the open source communities should register as charities then? They do charitable work after all...

This is the function of the fiscal sponsorship program at NumFOCUS. All fiscally sponsored projects basically come under the umbrella of NumFOCUS' nonprofit status, so that the projects can receive tax-deductible contributions, etc.

The "Every successful science library has the ashes of an academic career in the making" quote has been mentioned several times in the comments, so I thought I would give a plea to everybody who works in academia to help the people who build the foundational tools of your research by citing them in your papers:


Sad to say that I haven't cited these in papers that I've published so far. Much software is taken for granted, especially libraries. I don't often see papers mention SciPy specifically let alone add a citation. The author of GNU Parallel resorted to a message on stderr asking academics to cite it.
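For anyone wanting to try the same nudge in their own tool, here is a minimal Python sketch of a citation request written to stderr. The tool name, the wording, and the `quiet` switch are all hypothetical; this is not how GNU Parallel implements its notice.

```python
import sys

# Hypothetical tool name and wording, for illustration only.
CITATION_NOTICE = (
    "exampletool: if you use this program in academic work, "
    "please cite it (see the project's CITATION file).\n"
)


def print_citation_notice(stream=None, quiet=False):
    """Write the citation request to stderr so stdout stays clean for pipes.

    Returns True if the notice was written, False if it was suppressed.
    """
    if quiet:
        return False
    stream = sys.stderr if stream is None else stream
    stream.write(CITATION_NOTICE)
    return True


if __name__ == "__main__":
    print_citation_notice()
    print("normal program output goes to stdout")
```

Writing to stderr keeps the request visible in a terminal without polluting output that gets piped into another program.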

But this isn't even the whole story. As we all know, designing and implementing a program or library is often just the first step. Someone then has to maintain that code. There is absolutely no incentive for anyone to maintain these. Institutions will fund new software because they expect a paper to be published about it. But nobody is paying anyone to maintain this stuff.

Because developers generally don't know (or don't like) the outreach necessary to fundraise.

For example, in NumPy's case: https://github.com/numpy/numpy.org/issues/9 That's a request from March 2017 to add a donation button to the website. I'm not sure whether, 6 months back, NumPy was legally structured to receive larger funding. I posted a similar comment (with many more replies) in the context of Octave and its funding: https://news.ycombinator.com/item?id=13604564

TL;DR: Don't ask for donations - instead sell "gratitude-ware"

There are tons of people who WANT to support these projects, but you have to make it easy and accountable to do that. The best example that I usually give is Sidekiq.

@mperham is awesome that way: "This is exactly why I disclosed my revenue: people won't know there's a successful path forward unless it's disclosed. I want more OSS developers to follow my lead and build a bright future for themselves based on their awesome software."

In fact, I believe there's a start-up to be done here. "Stripe Atlas for Open Source software"

NumPy joined NumFOCUS as a fiscally sponsored project in 2015, so it's been eligible to receive donations as a non-profit (because of NumFOCUS' nonprofit status) since then.

>the entire scientific Python stack was essentially relying upon on the “free time” work of only about 30 people—and no one had funding!

30 people? I remember a time when a certain fruit company would enter a field, literally hire all 30 of those guys, and put them behind closed doors. Then in 2 years they'd dominate the field for the next decade.

Are these guys turning down offers? Or is the fruit company that poorly managed now?

The fruit company (and really all of AFGAM and much of the rest of the industry) is biased toward hiring recent grads or folks who have a clear record of steadily increasing responsibility, which disadvantages those who have followed a more... eclectic path, as well as those who have had major dislocations of some sort derail their career.

And almost by definition, "those 30 guys" building and maintaining a library with their volunteer labor in some "field" (that didn't exist as such until relatively recently) mostly won't have standard-looking career paths.

Sometimes there is a critical mass of those folks concentrated in a company, prompting an acquihire.

They seem more likely than others -- see the recent Homebrew author grousing about getting rejected by Google, only to land at Apple, for instance.

He left Apple. He didn't fit in. There's an interview about it on a podcast.

link please!

The podcast is called The Changelog; the episode is about the guy who runs the Homebrew project.

There isn't much money in writing libraries like NumPy. Why would a company hire a bunch of expensive devs to write some software to compete with Matlab? The market just isn't big enough to justify the cost.

Google hires devs to develop similar libraries... to power their machine learning efforts. There is tons of numerical work in finance, too. I don't think the money is to be made by selling copies of a Matlab-like piece of software, but in the application of the tools.

This is spot on. There are plenty of jobs applying open source technologies, but far fewer building those libraries themselves.

(I'm a NumPy dev who works at Google on machine learning.)

Really? This is somewhat surprising to me. How has Grumpy impacted your work / numerical computing in Python at Google?

I haven't used Grumpy at all, and unless it starts supporting C extension modules like NumPy I doubt I ever will. Google's numerical computing / machine learning stack (e.g., TensorFlow) is based on Python/C++.

But you have to run through their interview gauntlet.

Even if you get head hunted?

Ken Thompson didn't (still doesn't?) have commit access because he hasn't been vetted by Google as a competent C programmer. I don't think Google is going to relax its hiring policies no matter who you are.

And their decimation.

It's not about money. Apple has $200B just sitting in a bank doing nothing. It's about supply and owning the entire market. What happens when the entire market stagnates because those devs are writing great stuff for Apple exclusively?

Apple also used to buy the entire world supply of sapphire for phones and the entire world supply of flash memory for music players. Literally, nobody else could compete, because Apple contracted it in bulk, first.

Those days seem long gone now. Today Apple takes 5 years to update their Pro desktop, while at the same time, making fun of people for using 5 year old computers.

I wonder too -- Apple should be hiring _somebody_ to build a modern nd-array for swift -- and evolve playgrounds into the ultimate scientific computing environment.

The tools for scientific computing are amazingly scattered - seems like a big company with a modern platform strategy could make a huge impact ...

I don't know who you're talking about, but Orange Micro went out of business in 2004. ( https://en.wikipedia.org/wiki/Orange_Micro )

Not sure if you're being sarcastic, but he means Apple.

It was more an attempt at light humor than sarcasm. :-)

(Not to mention I thought it was hard to imagine anybody around here being unaware of the most capitalized company in the world. Aside from maybe Saudi Aramco.)

I think he knows. Mike has a 5 digit /. ID :)

Heh... I knew my Slashdot ID was pretty low (~32K), but had no idea I'd been around here that long. Now I feel old. :-)

Speaking of old, I used Reddit as my main news source for a while, and I remember /r/programming having a post celebrating 65,536 users. Now it's over 10 times that, and there are more than three subreddits.

Q: Are we conflating two issues?

Is there a difference between the "sole developer problem" and the "lack of funding" problem?

I mean, even if a project finds funding, does it follow that it will attract more talented developers?

One way to distinguish the two issues is to look at for-profit software. In the cases where there is one primary developer, do they find it easy to keep the software going when the person retires?

I ask this because, I think, beyond the very real monetary issue, there is a question of how development works. Do we need one very talented individual who does the lion's share of the lifting?

> even if a project finds funding, does it follow that it will attract more talented developers?

It depends on what the funding is used for. Outreach in various forms (better documentation, presenting at conferences, a nicer website, moderating community venues from mailing lists to chat channels, triaging and grooming issues and PRs, etc.) takes time and effort (and therefore needs to be funded) as well.

For SageMath, what funding we’ve got has been mostly used for organizing many “Sage Days” workshop coding sprints – we’ve had nearly 100 now over the last decade! They have made an absolutely tremendous difference in attracting the over 600 people who have contributed. But key design decisions, e.g., 100% test coverage, helped too. I clearly remember the moment – in a big discussion at a Sage Days (funded by the Clay Math Institute) – that Craig Citro argued for 100% doctest coverage, and we all decided to do it. Without funding, those Sage Days wouldn’t have happened, and the community would not have developed. I also remember a Sage Days that we did jointly with Numpy at Enthought, where discussions and work we did together grew into many important things later. Funding is ridiculously, absolutely critical to growing certain types of open source projects. In another direction, a lot of our infrastructure (e.g., build, testing, etc.) is hosted on a big computer bought with a grant.
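For readers who haven't met doctests: in Python, the examples in a docstring double as tests. Below is a minimal sketch using the standard `doctest` module. The `mean` function and `run_doctests` helper are made up for illustration (this is not Sage code), and it assumes "100% doctest coverage" means every function carries at least one such example.

```python
import doctest


def mean(values):
    """Return the arithmetic mean of a non-empty sequence.

    >>> mean([1, 2, 3, 4])
    2.5
    >>> mean([10])
    10.0
    """
    return sum(values) / len(values)


def run_doctests():
    """Run the examples in mean's docstring and return the doctest results."""
    parser = doctest.DocTestParser()
    # Build a DocTest from the docstring; the globs dict makes `mean` visible
    # to the examples when they execute.
    test = parser.get_doctest(mean.__doc__, {"mean": mean}, "mean", None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test)


if __name__ == "__main__":
    results = run_doctests()
    print(results.failed, results.attempted)
```

The appeal for a volunteer project is that the documentation examples cannot silently go stale: if the behaviour changes, the docstring example fails.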

Original NumPy author here. I have a lot to say on this topic, given that it has literally consumed my life over the past 20 years. You can go here for some thoughts about some of this: http://technicaldiscovery.blogspot.com/ There are several articles there that relate but in particular http://technicaldiscovery.blogspot.com/2012/10/continuum-and... and http://technicaldiscovery.blogspot.com/2017/02/numfocus-past...

I knew what I was getting into when I wrote NumPy. I knew there was not a clear way to support my family by releasing open source software, and I knew I was risking my academic career.

I did it because I believed in the wider benefit of ideas that can be infinitely shared once created and the need for software infrastructure to be open-source --- especially to empower the brightest minds to create. I did it because others had done it before me and I loved using the tools they created. I hoped I would inspire others to share what they could.

There have been a lot of people who have helped over the years. From employers willing to allow a few hours here and there to go to the project, to community members willing to spend nights and weekends with you making things work, to investors (at Continuum) willing to help you build a business centered on Open Source.

There are many people who are helping to fix the problem. In 2012, I had two ideas as to how to help. Those who know me will not be surprised to learn that I pursued both of them. One was the creation of NumFOCUS that is working as a non-profit to improve things. The second was the creation of Continuum (http://www.continuum.io) to be a company that would work to find a way to pay people to work on Open Source full-time.

We have explored several business models and actually found three that work pretty well for us. One we are growing with investors, a second we are continuing with, and another we are actually in the process of helping others get started with and ramping down on ourselves.

Along the way, I've learned that open source is best described in the business world as "shared R&D". To really take advantage of that R&D you need to participate in it.

We call our group that does that our "Community Innovation" group. We have about 35 people in that group now all building open-source software funded via several mechanisms.

We are looking for people to help us continue this journey of growing a company that contributes significantly to Open Source as part of its mission. If you are interested, contact me --- I am easy to track down via email.

Hey, Travis, one question: have you managed to avoid selling non-free software? Everyone seems to think that eventually the way to sell free software is to sell some secret sauce on the side. Is this something you have completely eschewed?

I ask because for Octave this is a non-negotiable requirement. Partly because of the GPL, but mostly because we really do believe that the whole point of Octave is to get away from non-free software, as a matter of principle -- if you want non-free software, there's already a Matlab. Is there any way to generate enough money without a EULA?

I believe it is true that you can make more money selling non-free software (some of the profits from which should always be used to make free software).

You can generate money through free software alone, however. Here are three approaches I have found: 1) Consult on projects that use the free software and use some of the profits to support that free software, 2) Sell enterprise-grade support on the free software to big companies. This is much more than just help-desk and answer the phone. All commercial software comes with a big contract. You provide the same kind of contract, just with no license restrictions. Others can do the same and so you have to distinguish yourself by either having all or most of the experts on the software or just really good marketing. 3) Dual license using GPL3/AGPL for the free version and a commercial license and then aggressively go after people for GPL violations if they don't get the commercial license.

I don't like the relationships created by the third model --- your sales processes become aggressive and counter-service minded by definition. I don't see it scaling and really providing value.

The other two are really hard to impossible to get investors excited about and therefore you struggle to get the capital together you need to prime the customer pump.

There is another general model with many corollaries where you basically "do something else" that uses the software as a critical part of the business and let the profits of that activity fund open-source development.

A lot of open-source these days is actually funded by this kind of activity (or from VC's hoping to profit from a promise of great wealth from this kind of activity).

In 2006 you sold documentation for NumPy [0] - how did that work out?

0. http://csc.ucdavis.edu/~chaos/courses/nlp/Software/NumPyBook...

There is a long-standing problem in open source software, which is that there is no "business model" associated with funneling resources to people putting significant effort into it. Setting up a consulting business to monetize software creates the perverse incentive to make software harder to use, but there seem to be some examples where this model has worked out reasonably.

Open source projects are typically started by people working in the field, who have a strong urge to scratch some itch. Even if we find a way to find money for them to work full-time, they often don't have the desire to "productize" software, or to create/nurture/govern an organization around bringing together different stakeholders who might be able to use, or contribute to the software. (We got really lucky with Linus+Linux)

I believe I’m one of (or near) the “top 30” contributors (I’ve made substantial contributions to all of the aforementioned packages), and I’m funded to write scientific software. I’m extremely fortunate. Unfortunately, like so many things, I suspect it has everything to do with pedigree (e.g. my lab, my institution, my peers, etc.) rather than my (or my coworkers’) exact contributions. In fact, I don’t know if any of my lab’s grants has ever explicitly mentioned our contributions to one of the discussed packages. However, this could change. I’m extremely encouraged, for example, by the comments from new institutions like OpenAI or the Chan Zuckerberg Initiative about the necessity of funding software.

Another project which is easy to overlook: Think about how many scientists use Emacs for most of their development and writing. But there is (to my knowledge) not even a single paid developer working on it.

( http://gnu.org/s/emacs )

It's surprising to me that people are surprised by this.

Even setting aside the fact that the people that can do this work are few in number, the vast majority of people need a way to support themselves and their family. If the number of people that have these skills is low, the subset that is both altruistic enough to donate them for a sufficient period of time and personally able to do so must be vanishingly small. (And the negative feedback a lot of OSS maintainers receive doesn't help.)

Companies have the same issue... there has to be a fairly direct connection between an expenditure (paying developers) and a return on that investment. That can be a very difficult argument to make.

If your open source project needs funding there is https://opencollective.com/opensource (currently waiting, I'm not affiliated)

Look at the numbers before assuming this is a solution to funding Open Source. I'm not saying it's a bad thing, just that the amounts it's bringing in for most projects are effectively like buying the developer a cup of coffee. It's not a sustainable sum for someone to live on, in almost all cases.

Right now, Webpack has $99k, which is respectable. That's a decent annual salary for a full-time developer that lives anywhere other than Silicon Valley or NYC. But, it falls off a cliff after that. Next is MochaJS at $16k; that's two months' salary, maybe three or four in cheap places. After the top twenty projects, you're closer to nothing than you are to something. The time it takes to set it up would be worth more than it'll return (there are lots of $0 projects in the list).

Maybe it'll improve with time. I dunno how long this project has been going on, and it looks really well done. It's just that the numbers are abysmal. I hope it'll improve. I wish we had a way to fund our software without having to sell a commercial version...but, it's so far from realistic, we can't really even consider it.

I've been making my living from OSS for ~20 years now, and it's the great curse and tragedy of the thing that because the software is free, there's an incredibly pervasive belief that tiny sums are sufficient to keep a project afloat.

We tried, very early on (~14 years ago now), to crowd-fund some major enhancements to our software...we raised a total of about $15,000 (this was long before crowd-funding was a thing), which to a lot of people seemed like a lot of money, and it caused a lot of folks to feel extraordinarily entitled to specific results (e.g. because they contributed $200, they assumed it would mean we'd develop a big feature they were the only person asking for, and that would take days or even weeks to build and test). But, for two developers working full-time on something, $15,000 is practically nothing. Most developers just don't have the freedom to trade in a full-time job paying market rates for a sub-poverty annual wage. Realistically, we couldn't commit full-time to the project until we started a business based on it that brought in predictable, recurring revenue. And, we couldn't effectively do that without providing things that the OSS version didn't have. OSS is a really tough way to make a living, is what I'm trying to say, and I don't know a lot of people who do it successfully without being employed by someone that does a lot of non-OSS stuff, too.

It blows my mind that NumPy is just getting funding. How did the Eigen (used in TensorFlow, among other things) folks keep it going?

A few of the Eigen core devs (Benoit Jacob, Benoit Steiner, ...) work at Google.

Ok, so the problem seems to be 'lack of maintainer' or could be stretched to contributor. The article later linked to https://www.slideshare.net/NadiaEghbal/consider-the-maintain... which kind of reminded me of a problem which I'm facing.

After getting through the basics of "Introduction to Computer Science Using Python" and with a forever-pending goal to become a "Python Developer", is anyone here who is experienced in Python willing to be my mentor? In return, free Python labor. :)

What sort of things ("Python development" is a bit nebulous) are you most interested in?

The sort of things that let people call themselves a "Python Full Stack Developer", or people whose primary job is to write Python code at some workplace. I just want to work on something where Python is used and in great demand, to get me a full-time job. My deepest dive into Python, apart from learning the surface, has been with Flask. I developed a CRUD app using a tutorial and there is absolutely so much to learn. But if I learn a lot about Flask, I won't be playing with Django, which seems to have more demand. I'd also like to have Python projects on my Github but so far, I have no clue what to make and publish. I just want to be able to develop anything by knowing the art!

Ask me questions and I'll do my best to respond

I want to develop a Github profile which could be very attractive for a Python-related job. How do I get started on it? People say "do something new and solve an existing problem with Python". But is EVERYONE solving a new problem which hasn't come up before to get a Python job? I just want to become a potential candidate for a Python junior dev job.

I think the basic idea is to show through your profile that you're a competent developer. The project you pick doesn't necessarily need to do anything new and amazing, it just needs to be your work. You could make a task-list app (which has been done a million times) but if you do it well then you've achieved your goal.

Of course, the best outcome would be to make something that other people want to use. This would get attention and further validation. So if you find a project you're passionate about---the "I built it because nobody had yet" thing---then that would be best.

Good luck!

I tried at work to get some money to the maintainer of iRAP, an RNA sequencing analysis pipeline we depend on heavily at the moment. But business sees this as wasted money: it's there for free, why not take it? Reading this, I think I'm going to double down on my efforts again. We get so much value out of a huge pile of FOSS software, we should be donating. Meanwhile we have spent piles of money on Matlab for years and we aren't even allowed to run Linux on our laptops if we wanted to.

I think someone on here recently suggested to a project that they provide an "enterprise subscription" or something like that with minor benefits to make it easier for businesses to justify (or maybe it made administration easier?) to "donate". You could suggest that to the maintainer, and then use that to convince business that you need that.

Because it's an important tool for Machine Learning, which makes money from that (of which there's plenty going around right now) flow into it.

This is still an important problem for numerical compute in other languages. It's a struggle to do data analysis and write machine learning applications in Scala, Java, C++, etc. due to a lack of Numpy / Pandas style ease of use and functionality.

Each country has its own taxes to sustain fundamentals; the same thing must apply in the software industry. Software engineers might be paid $100k+ while developing on top of open source languages, frameworks, or libraries. That's not fair; it is like riding a starving horse.

A French point of view (after all, France invented VAT...) would suggest introducing a tax on software engineers' salaries (1%?) and redistributing this fund to the most used languages/frameworks/libraries, with a part used to sustain new projects.

How about instead of taxing the workers we tax the companies that profit from open source efforts.

Let me explain better: no software engineer (SE) would be paid above $30k if he always started from scratch (walking). He is paid his economic value for being fast. And he is fast because he rides a horse (building on top of an open source language/framework/library).

So while some SEs are maintaining the stables, other SEs don't want to pay for the horse. I think this latter group of SEs is killing their own jobs.

The software engineer is paid because they produce value for their employer. It's up to the employer, not the employee, to pay for the technological infrastructure.

That's right. But when your employer asks you to switch infrastructure because the old one is no longer maintained, and to be as productive as before, you'll be very slow at development and get fired. If the employee ends up paying for the infrastructure, it is because SEs show no solidarity with open source projects.

How would you measure how much of the profits a company makes come from open source efforts?

You don't, you just use a fixed percentage of whatever business tax they pay to fund open source development.

How are you going to separate out companies that profit from open source software, from those that don't? Would the fixed percentage be same for a company that is built 100% on open source software as it would for one where a handful of people in one department occasionally use VS Code to write some python scripts?

If you want to fund open source using tax money, then fund open source using tax money. Trying to implement a special Open Source Tax and apply it only to those that "profit from" open source seems both counterproductive and virtually impossible.

I think everybody profits from open source, since a large part of the Internet runs on open source, so there is no big need to discriminate.

So why make a special separate tax that has to be calculated and administered? Just use the tax money you already collect to fund open source projects.

If you want to spend more money you have to increase taxes. I don't mind whether we just increase an existing tax or create a new one.

I find it stupid to offer something for free and then cry when someone does not pay you for it.

Developers who take free time to create projects are not crying for money, they are crying for not having enough time to do more!

Not everybody is made to only work for money.

How do you decide which projects receive more funding than others?

This is exactly the problem Open Collective[0] exists to solve. Often times, there are people who want to financially support a given open source project, but there is no channel by which to do so. Creating the financial channel is the first step toward a much needed culture change where the assumption is you will support the open source you rely on, especially if you're making money off it.

[0] http://www.opencollective.com

I thought that Enthought sponsored a lot of NumPy development, kind of like as a corporate caretaker or something, is that not the case?

I'd also like to read about what Enthought actually did. Does anybody know the details and is willing to tell?

I appreciate this article being posted, and have the utmost respect for NumPy developers. The urgency and discrepancy between use of certain important open-source libraries, and their support, is bewildering sometimes.

As I was thinking about it, though, I'm not surprised NumPy hasn't been funded before. The reasons why say a lot about biases in memory.

It wasn't that long ago that the sorts of things NumPy does were seen as fairly niche, and in the domain of statistics or engineering. It's only with relatively recent interest in AI and DL that this has been seen as within the purview of Silicon Valley-comp sci-type business, as opposed to EE or something different. I still am kind of a little disoriented--the other day, looking through our university's course catalog, I realized that certain topics that would have been taught in the stats or psychology departments are now being seen as the territory of comp sci. Statisticians have written excoriations about being treated as if they don't exist, as comp sci blithely barrels forward, reinventing the wheel.

I'm not meaning to take sides with these issues, only pointing out that I think the world we live in was very different not so long ago. It might seem puzzling that NumPy hasn't had more funding, but I think that's in part because what it's most profitably used for now wasn't really seen as much more than academic science fiction not too long ago.

The other part of it too, is that until relatively recently, if you were to do numerical heavy lifting, you'd almost certainly be expected to do that in C/C++ or maybe Fortran. There's a tension in numerical computing, between the performance and expressiveness that's needed, and Python is on one end of that continuum, far from the end that is traditionally associated with complex numerical computing. Sure, you had things like MATLAB with Python in the same functional role, but those were largely seen as teaching tools, or something that engineers did for one-off projects, having learned to do that in school (I still think the use of python in ML derives from the use of Python as a teaching tool in uni).

I'm not trying to knock Python or NumPy or anything, just kind of trying to convey a different perspective, which is that I can remember a time not too long ago when the use of Python in numerics was seen as primarily didactic in nature, or for limited circumscribed applications.

FWIW, it seems to me Python is kind of on a path similar to what happened with javascript, which was treated as kind of an ancillary helper language on the web, until Google started pushing its limits. Then there were browser wars 2.0, and huge efforts put into javascript, and it became a main player in network computing. To me, there's a similar trend with Python: it really kind of existed as a language for prototyping and scripting tasks, and now finds itself in a different role than it has been used for traditionally, and projects in that area are getting an influx of money accordingly. What I see happening is (1) a blossoming diversity of numerical computing communities (Haskell, Python, Julia, Kotlin, Scala, Rust, Go, etc.), due to competition and variation in application scenarios and preferences, (2) a huge influx of resources being put into Python to make it more performant, or (3) people jumping ship from Python into one of those other platforms to get more bang for the buck [or (4) some combination of all of these.]

Because capitalism is an inherently exploitative economic paradigm?


> "And if you’d like to take action to contribute to project sustainability, consider becoming a NumFOCUS member today."

