Empirical Analysis of Programming Language Adoption [pdf] (2013) (princeton.edu)
26 points by Multics on Dec 6, 2014 | 20 comments



The observations in this paper mostly seem to back up reasonable, yet rather depressing, assumptions we might have made.

The choice of programming language for a project seems to be driven primarily by the size of the surrounding ecosystem and developer familiarity, rather than by any sort of technical merit in the language itself.

Also, at least among the studied population of developers interested in creating SaaS, programmers appear to be far more interested in the ease of bundling together other people's work to get something built quickly and cheaply than they are in features that would promote attributes like robustness, security, performance and scalability.

This may be due in part to a generally low standard of education and understanding. Many developers indicated a desire for characteristics like good performance and expressive power, yet then placed little emphasis on language features that could actually achieve those things, and a lot of them didn't even understand the most basic properties of and relationships between some of the language features they were commenting on.

The most positive observation to me is that there remains a long and heavy tail in most of the results, meaning objectively better languages can become successful within a niche even if they aren't widely appreciated and may never make it into the very small group of languages that dominate mainstream software development.


Who cares about the "technical merit" (which is a completely arbitrary measure) of a language? There IS no "objectively better" language. Not everything is on a scale ranging from 0 to LISP. There are different tools for different jobs, and each has performance characteristics along different dimensions.

There is only one question that matters to developers: Is this productive?

By that measure, the language itself is a small part of the equation which includes tooling, libraries, community, documentation, etc etc etc. This is completely normal and not depressing. It's sane.


> Who cares about the "technical merit" (which is a completely arbitrary measure) of a language?

Anyone who wants to produce good software efficiently, one would hope.

> There IS no "objectively better" language.

Of course there is.

Language A and language B let you write identical code to solve a problem correctly, but language B does not admit a certain class of programmer error where language A does.

Language A and language B let you implement identical designs, but in language A you can do so concisely with direct semantic support, while language B requires 5x as much code and a load of boilerplate idioms.
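
A rough sketch of that second contrast, shown inside a single language rather than across two (Python is an arbitrary choice here, and `PointConcise` and `PointVerbose` are made-up names for illustration). Both classes implement the same immutable value type, one using direct support from the standard library's `dataclasses` module and one spelling out roughly the same behaviour by hand:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PointConcise:
    """Immutable value type; equality, hashing and repr are generated."""
    x: int
    y: int


class PointVerbose:
    """Roughly the same value type with the boilerplate written out."""

    def __init__(self, x: int, y: int) -> None:
        self._x = x
        self._y = y

    @property
    def x(self) -> int:
        # Read-only property: assigning to .x raises AttributeError.
        return self._x

    @property
    def y(self) -> int:
        return self._y

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, PointVerbose):
            return NotImplemented
        return (self._x, self._y) == (other._x, other._y)

    def __hash__(self) -> int:
        return hash((self._x, self._y))

    def __repr__(self) -> str:
        return f"PointVerbose(x={self._x!r}, y={self._y!r})"
```

The design is identical in both cases; the difference is how much of it the programmer has to write, read and keep consistent by hand.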

There are different tools for different jobs, but sometimes there are also better tools for doing the same job.

> There is only one question that matters to developers: Is this productive?

I'm not sure I agree with that, but in any case, it is purely the developer's perspective. There are many other questions that might also matter to someone who uses the software. Does it give the right answer? Does it run efficiently? Is it safe? Will those things still be true next year? These things all affect the value of the software, and if nothing else, that should make any commercial developer consider whether a better product would be worth more in the market.


> There IS no "objectively better" language.

> Of course there is.

I agree that there are objectively better and worse languages. (brainfuck is a kind of proof of concept for this, right?)

I think, however, that comparing languages is much more complicated and difficult than most people allow for. Most comparisons are made on the basis of a few issues that the comparator happens to care about, ignoring many other possible methods of comparison.

We've all experienced this when an advocate of language X completely misses the point in his critique of our own favorite languages. These reductionist comparisons naturally lead to distrust of all comparisons of languages.

Knowing that there are better and worse languages does not necessarily give us the ability to determine which languages are better and which are worse.


I mostly agree with you, however:

> the size of the surrounding ecosystem

I think you underestimate the importance of this. Particularly if you have a complex and multi-faceted problem domain at hand, ecosystem and library support can be a dealbreaker, for which many people would sacrifice expressiveness and language power. Particularly so if the language is relatively malleable and/or dynamically introspectable, so they can sort of emulate certain higher language features just enough to be useful, while having abundant libraries, toolchains and documentation. It's definitely a valid trade-off.

In addition:

> programmers appear to be far more interested in the ease of bundling together other people's work to get something built quickly and cheaply

This is primarily a side effect of software complexity. Starting off from first principles can be quite impractical these days, so stitching together others' work is becoming a common paradigm. I'd almost say it's the natural end result of abstraction itself.


> Particularly if you have a complex and multi-faceted problem domain at hand, ecosystem and library support can be a dealbreaker, for which many people would sacrifice expressiveness and language power.

Just to reiterate: I think the current situation is perfectly understandable. Obviously the ecosystem does make a huge difference to how productive someone can be in any given programming language, and the choices of language that projects are making are perfectly rational given their constraints and objectives.

It's just unfortunate that this creates momentum behind languages like C, PHP and JavaScript, which then concentrates the effort of library development within their ecosystems, creating a vicious cycle. In terms of robustness and productivity, the industry would be better served by moving almost all development work away from that sort of underpowered, poorly specified, error-prone language as soon as possible, but you can't do that without the libraries. Developing that ecosystem for any new language takes huge amounts of time and effort, so to have any realistic chance of catching on, you first need an excellent bridge to existing libraries. Sadly, the FFI infrastructure even in relatively good languages tends to be clunky at best, and that is a huge barrier to adoption for all the reasons you mentioned.

> Starting off from first principles can be quite impractical these days, so stitching together others' work is becoming a common paradigm.

On the other hand, the trade-off for that reduction in NIH early development effort is that you now have a discovery problem up-front and a dependency management problem indefinitely. I don't think that as an industry we've really found a good balance yet between not reinventing the wheel and not spending forever choosing which of 57 pre-existing wheels to use.

What I'd love to see is a new generation of languages that accept the modern reality of combining packages from diverse sources and so place much greater emphasis on supporting that kind of software architecture. Interoperability between packages, each with their own related but probably slightly different concepts and conventions, needs to be much more of a priority IMHO. When we can compose data structures and algorithms across packages from different sources, and have all the marshalling and type safety and error handling mechanisms just work with minimal programmer intervention, then we'll be getting somewhere. But of course that's just one more substantial issue that any new language has to address on top of everything else, which is no small thing to ask.


It's not just clunky FFIs. There are simply some APIs and standards (like POSIX) that are more-or-less designed with a specific language in mind, in the aforementioned case that being C. Using them from a different language can simply feel awkward, though I suppose there has been a shortage of effort in bridging the gap.

I agree that the ridiculous dependency chains and brittle build tooling/bootstrapping are getting out of hand in modern development. Go actually gets this thing right, but is bare bones in every other regard (which I also appreciate, I enjoy Rob Pike's work).

I can't really think of any solution to this dilemma, though. Fundamentally, the limitations are not just in languages but in operating systems as well. Simply having a file server-centric OS design instead of a daemon-centric one for providing services would simplify a whole lot of things that we're used to abstracting in libraries.


From an economic perspective, the size and the quality of the ecosystem are of the utmost interest. (You forgot quality, but since Java is the target here, the quality of its libraries matters.)

Why would one reimplement something if it has already been done (correctly), is released under the Apache 2 license and is maintained?

For myself, that's why I stick to Java for productive work. That doesn't mean I don't use any other programming language, but to get something into production quickly, Java is hard to beat.


> Why would one reimplement something if it has already been done (correctly), is released under the Apache 2 license and is maintained?

You don't necessarily have to reimplement every library, as long as you can use them easily from another language. Why would you want to do that, anyway? Because the other language is much better than what you already have.

Personally, I only work on one project right now that still uses Java, and it uses it as an applet on a web page, not in a server/enterprise role. The effort required just to keep basic functionality working in that environment is absurd these days, since it seems every major browser developer and Oracle themselves are doing their best to make the platform unviable as quickly as possible. But even if they weren't, this is a project that uses several very different languages for different components in the overall system, and Java is so underpowered compared to every other language we use that I won't be at all sorry to see it go.

The more I think about it, the more Java seems like a perfect example of what I was talking about before: a weak language that is now chosen primarily for its established supporting ecosystem and not because of any great technical merit.


Instead of expecting people to change their taste and demanding re-education, find a way to give them what they want while also giving them what you think they should have.

But honestly you may not be in such a privileged position relative to the rabble, and there may be situational reasons for others not to want the features you want.


> Instead of expecting people to change their taste and demanding re-education, find a way to give them what they want while also giving them what you think they should have.

I don't really expect them to change their taste. As I said, the actual situation is something we might reasonably have guessed: combining building blocks quickly and cheaply is perceived as more important than quality. Given that it's clear a lot of people will pay for cheap junk, that is a commercially sensible thing to produce.

I just find it sad that this is the situation we seem to have drifted into. As more and more things we use every day depend fundamentally on software, and an ever more connected society makes issues like security and scalability more important as well, I think the world would be a much better place if consumers were less forgiving of problems caused by badly written software and demanded better. However, the industry seems to have succeeded in convincing most people that mediocrity is inevitable and software that only works for a brief period or requires constant updates for maintenance is the norm.

What is perhaps more sad is that so many professional software developers themselves don't seem to know any better.

For example, part of this study examined attitudes to static typing and unit testing. We might reasonably suspect that a population interested in SaaS has a bias towards dynamic languages like Ruby and Python, so it's not too surprising that 62% said they saw the value of unit testing compared to only 36% for static typing. The disturbing thing is that both of these figures are so low, given that these aren't exactly controversial techniques and there is plenty of evidence that using at least one of them can substantially improve quality.

Another telling contrast was that when asked about what they enjoyed using, programmers heavily favoured expressive languages, yet when asked about which language features were important, features that offer more expressive power like templates, generators, higher order functions and macros were all rated much lower than OOP staples like classes/interfaces and inheritance. A majority saw little if any connection between higher order functions and objects, despite their mathematically provable relationship; a third didn't even understand the question.
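
As a rough illustration of that relationship (not taken from the paper; Python is used only for brevity, and `make_counter` and `Counter` are made-up names), a closure and an object can encode the same stateful behaviour. The closure keeps its state in an enclosing scope, while the object keeps it in an attribute:

```python
from typing import Callable


def make_counter(start: int = 0) -> Callable[[], int]:
    """Closure version: the state lives in the enclosing function's scope."""
    count = start

    def increment() -> int:
        nonlocal count  # rebind the captured variable, not a new local
        count += 1
        return count

    return increment


class Counter:
    """Object version: the same state lives in an instance attribute."""

    def __init__(self, start: int = 0) -> None:
        self.count = start

    def __call__(self) -> int:
        self.count += 1
        return self.count


closure_counter = make_counter(10)
object_counter = Counter(10)

# Both produce the same sequence of values when called repeatedly.
closure_results = [closure_counter() for _ in range(3)]
object_results = [object_counter() for _ in range(3)]
```

From the caller's point of view the two are interchangeable: each is a callable carrying private state, which is the sense in which a closure can stand in for a simple object and vice versa.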

When you have that level of ignorance of programming theory that really ought to be general knowledge by professional standards, it's not really about letting developers have what they want while also giving them what anyone else thinks they should have. They could already have both today, and they already say they want the benefits that would bring. They just don't know enough about the tools that are already available to them to get those results, and isn't that just a straightforward education problem?


I think the colloquial meaning of "expressive power" is the ability of the language to express something the way the programmer would like. In other words, how close it is to pseudocode. An expressive language, in this sense, would be Python.
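
As a small, hypothetical example of what "close to pseudocode" might mean (the function name `most_common_words` is made up for illustration), the task "lowercase the text, split it into words, and return the n most frequent ones" reads almost the same in Python as in that description:

```python
from collections import Counter


def most_common_words(text: str, n: int) -> list[str]:
    """Return the n most frequent whitespace-separated words in text."""
    words = text.lower().split()
    return [word for word, _count in Counter(words).most_common(n)]
```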


> I think the colloquial meaning of "expressive power" is the ability of the language to express something the way the programmer would like.

OK, that seems like a reasonable definition.

> An expressive language, in this sense, would be Python.

Expressive relative to a language like C? Yes, definitely, Python is much more powerful in this respect.

But expressive relative to the state of the art, the best that other languages have to offer? I don't think so, not really. There are plenty of concepts I would like to express in my code that I can't incorporate neatly or at all when I'm working in Python, but at least for now, the best implementations of many of those concepts aren't found in mainstream languages.


I'm not sure their sources were that much better than what I had with langpop.com although they certainly go more in depth.


This is an analysis of adoption from 2000-2010.


I don't buy the claim that older developers "forget" languages. Instead, our standard for what it is to "know" a language goes up and the number of languages we claim to know remains constant.

I've known some engineers to argue that the number of languages on a CV is negatively correlated to the quality of the candidate.


I've been reviewing CVs recently, and there is a striking negative correlation between a candidate's amount of experience and the number of specific languages or tools they choose to highlight.

It's also quite striking how the younger and less experienced candidates often have a first page that is pure keyword stuffing now, while older and more experienced ones tend not to. My suspicion is that the younger candidates expect to have to get past a computer and then HR before encountering anyone who knows what they're talking about technically, while stronger candidates tend to start from the technically competent end via a contact in their network and only expect to deal with HR right at the end of the process to dot i's and cross t's.


> My suspicion is that the younger candidates expect to have to get past a computer and then HR before encountering anyone who knows what they're talking about technically

As someone who is currently reworking their resume, I can say your suspicion is correct. If I don't know someone in or around a company I am interested in, or I don't have a target company, I expect to be treated like a random person off the street. That means submitting my resume to some email / web form; then having it parsed and stored in their applicant tracking system until HR does a search using the job description keywords. Only after a sufficiently high percentage keyword match would I expect someone from HR to begin their process with some kind of form letter email.

It's like what happened with automated phone systems and ATMs. I no longer expect a person to answer.


> I've known some engineers to argue that the number of languages on a CV is negatively correlated to the quality of the candidate.

That's the song sung by old engineers who refuse to learn new technology, so they try to make knowing few languages a positive (it isn't). The classic "It's not a bug, it's a feature!"

I don't think the core of a good developer has changed much since the early 90s. You still want people whose skill-set is T-shaped... You want them to go deep on at least a few domains, but you want them to have broad experience in lots of them. The broad experience makes for developers who are better than the sum of their parts, can adopt solutions from other domains and can communicate better.

I generally won't consider too seriously a "senior" engineer who has only a single "class" of knowledge. I realize that sometimes they have limited control over this (the jobs they happen to get, yada), but I just want the best developers... and in my experience they find a way to follow Norvig's advice and at least toy with some of the major flavors: DBC, functional, class, syntax, declarative, concurrent, parallel...


> I don't think the core of a good developer has changed much since the early 90s. You still want people whose skill-set is T-shaped... You want them to go deep on at least a few domains, but you want them to have broad experience in lots of them.

I think Michael's point was that a senior developer might have a very different idea of what constitutes "deep" to a junior developer.

By the standards of a 20 year industry veteran, a six month project they worked on three years ago using language X might not even make X worth mentioning unless it was specifically asked for, next to the half-dozen languages they already show where they have 5+ years of experience including shipping multiple projects in each one.

By the standards of a junior 2 years out of school, a six month project working in language X last year might make it one of their top two languages and get presented as "expert-level" understanding. These are the people who tend to list fifteen programming languages, because they used each of those languages for about five minutes on a toy project during their CS course.



