Music (even “classical” music) is a pop culture, just like programming. Music is also a science with a body of knowledge and well-formed theories. But it’s a pop culture. Just like programming languages.
I think the current state of programming languages clearly reflects this pattern. I'm happy that the mainstream is becoming interested in something other than the strange variants of Smalltalk OOP we've been using for 30 some years.
In fact there seems to be a real grassroots PL renaissance happening these days :)
Given that, what is the place for PL research in industry? There are two main ways that we can see for PL research to make a practical difference:
(2) Create new languages designed for a specific, concrete high-value project. This is what we're doing with Rust -- Mozilla is investing in Rust specifically to be the language that we write a new parallel layout engine in. This constraint helps us focus on keeping the language usable, practical, and feature-rich. Features like the uniqueness typing and (in the future) bounded task lifetimes that allow us to avoid concurrent GC are driven by the pain points that we struggle with in Gecko.
Work on new type systems is a big part of the field, however these type systems aren't always intended to be deployed as part of languages. Rather, they can often be viewed as bug-finding program analyses. Indeed, program analysis (including type inference/checking) is a much more common research topic than "a new language for X".
I think many in the field agree that more work should be done on language usability, although it is a hard topic to research. The PLATEAU workshops (on the Evaluation and Usability of Programming Languages and Tools) were efforts in this direction. The Software Engineering field is much more focused on usability and would be the most likely publication target for PL usability research. (The top PL conferences focus on theoretical concerns (POPL, ICFP) and implementation/systems (PLDI, ASPLOS)). Although usability studies are rare, the PL field has gotten tougher about requiring strong arguments about the usefulness of research: the evaluation and motivation sections of papers really need to be convincing.
When Kiczales and others realized Common Lisp as a community wasn't going anywhere they took their ideas and created AspectJ.
Anyways, her blog seems like a lively read in general - I found this blog post pretty entertaining and I agree wholeheartedly - http://tagide.com/blog/2011/05/programming-is-math-apparentl...
Especially at big companies, we're very risk-averse. We know you can build high-scale applications in Java and C++ because it's been done. You can probably build a high-scale Haskell application too, but would you bet millions of dollars on it?
A much harder problem is that compilers are shockingly time-consuming to fully implement and test, much less to integrate with the full set of libraries and tools that you need to use them in practice. Even if you have a new language design, implementing all of the "basic" optimizations to bring your compiler from hopeless to merely embarrassing requires person-years of heroic effort, none of which results in publications or other recognition required to keep your NSF funding coming (they're "basic").
For example, it has taken us since 2007 to get Manticore (http://manticore.cs.uchicago.edu/ ) to a point where our sequential performance is within a factor of 2-4 of C (depending on the benchmark), and closing that last bit is probably going to take another couple of years unless we have some magical windfall of stunning undergraduate and graduate student candidates. Further, over the _entire_ lifetime of the project, I doubt that we will be able to put in even half the people-hours that I saw committed to improving the template error messages in Visual C++ during my first two years working at MSFT.
That said, we've been able to do some truly great things with scheduling computations on multicores, building a GC that works amazingly on NUMA computers, actually getting real speedups on > 36 core machines, etc. But I wouldn't expect to see any of the language features or implementation tricks in mainstream languages for many years. That time lag is just sort of the nature of things in PL research.
That said, we're looking to move over to it because our old code generation library (MLRISC) is beginning to show its age, and porting to LLVM is probably going to be about as much work as fixing the spill code bug we recently hit and exposing more SSE instructions. Most of the work will probably be in porting our calling convention, which does not resemble C's in any way. Like many frameworks, it's a modular compiler framework for building C compilers, not really a modular compiler framework for all types of compilers.
That's not necessarily a bad thing; many people have tried and failed to make more general ones. I'd rather have to do work to shoehorn our compiler into something that works and is widely used in industry than rely on something that's an easier fit but only likely to be around as long as the group is still publishing papers on it.
The GHC folks succeeded in getting the GHC calling convention included in LLVM, so there is hope.
That said, we're still hopeful that we can make things work, though we're not optimistic that it will come about without some significant tweaking.
Programming languages as a subset of computer science are a purely mathematical thing, where we can use Turing's ideas from the 1930s today to inform type systems. But they're used by humans and humans are fuzzy; they choose "objectively bad" languages like PHP. That doesn't mean to say that science no longer applies -- it just means whatever metric we're using to judge PHP as bad is not the metric that causes a language to succeed. That is still an area worthy of research.
(Not a direct response to the article, sorry. Got a bit carried away.)
You can see a division like this in many other fields, particularly the arts and literature--most popular literature isn't "great" and most "great" literature isn't (as) popular. So the natural parallel is that PHP, Python...etc are like the thrillers at the top of the best-sellers list and Scheme, Haskell are like what you would read in an English class.
And really, this makes sense--whenever anybody talks about the quality of a programming language, they are talking about whether they would use it themselves rather than whether the public at large would use it. So it is completely reasonable to have a "lower quality" language be more popular than a "higher quality" one.
Incidentally, while I talk about high quality and low quality, I do not mean to denigrate popular languages. After all, sometimes a thriller is all I want on a plane ride! And it could be a perfectly fine book indeed. But that does not mean it's a better book than Ulysses.
Given that, we can probably look to progress in the "science" of management to get a feel for what progress in the "science" of language design is going to look like. That is to say, we probably can't expect anything at all in the way of progress. It's funny that there's a parallel between the conclusions in William Whyte's "The Organization Man" and "progress" in language design. Whyte concludes, one, that management in the abstract doesn't actually exist and, two, that "management" taken as organizational oppressiveness and intrusion is actually a parasitic load on people trying to get work done, and ought to be minimized. Look at the success of weakly-typed scripting languages like perl, js, ruby, python, and so on: the fewer strictures they impose on the data, the more work you can get done!
Researchers are just going to have to get over math and physics envy. The "truths" they discover are very unlikely to be anything like nearly as universal as physical truths. Structured programming, OOP, AOP, functional programming, or whatever else aren't ever going to fit into a proposition like "If we adopt ____, we find that blah," where blah is any kind of contingent claim relating to bugs or productivity. All we'll ever get are notions that whatever paradigm worked well in one context and poorly in another. Again, this is parallel to management. You can't manage programmers like auto workers like farm workers like service workers. Outside of algorithmic analysis, computer "science" has as little to say about programmer efficiency as management science has to say about how many weeks of parental leave you should give your employees.
That being said, the ironic part is the relevant research was not type theory, but rather "How can we teach programming to children?"
And of course, there are still languages like Haskell that have carved out a nice niche, despite their recent vintage and academic roots.
Related (long but good) article: http://unqualified-reservations.blogspot.com/2007/08/whats-w...
"No field has been more infested by irrelevant formalisms than that of programming languages - also known as "PL research." So when I say Guy Steele isn't a PL researcher, what I mean is that he's not a bureaucrat."
I read basically all of the Lua papers and I think they're a great model for how to do programming language research. They use a real language for a testbed of their ideas. For example, there was a great paper that went over the history of coroutines and different options for designing and implementing them. There was also a great one about PEGs and a formal treatment of the implementation of the lpeg parsing VM. Somehow I feel that this kind of research wouldn't be respected at big name American CS universities, which is a shame, because it's exceedingly valuable.
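For readers unfamiliar with the design space that coroutine paper surveys: Lua settled on first-class, asymmetric coroutines, where a coroutine suspends itself and is resumed from outside. A rough sketch of the same control flow using Python generators (my own illustration, not code from the papers):

```python
def producer():
    # Asymmetric coroutine: suspends itself with `yield` and is
    # resumed from the outside, like Lua's coroutine.yield/resume.
    for chunk in ["one ", "two ", "three"]:
        yield chunk

def consumer(co):
    # The resumer drives the coroutine until it is exhausted.
    return "".join(co)

print(consumer(producer()))  # one two three
```

Python generators are only a partial stand-in (they can't yield across nested function calls the way Lua coroutines can), which is exactly the kind of design trade-off the paper examines.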
- For example, Google Native Client uses a formal verifier to prove the safety of binaries (http://sos.cse.lehigh.edu/gonative/index.html )
- Microsoft has long used a formal driver verifier to prove liveness and protocol properties associated with device drivers
One amazing piece of work going on right now in compilers is by Xavier Leroy, who cares a lot about formally proving that your compiler and its optimizations respect the semantics of the language (i.e. that its execution on hardware is within the range of possible executions specified by the original input language). Without the decades of work on formalization, semantics work, theorem provers, etc. the community wouldn't have a chance of tackling those problems today.
Certainly, if you read the proceedings of POPL or even some ICFP papers, you might wonder where it's going. And even the authors might admit to the same. But until you've fully explored the issues that come up when you try to merge (shameless example from my own work) effect types, region types, concurrency, parallelism, and transactions, it's hard to know what sets of language features can be safely combined in a way that programmers can modularly reason about and a toolchain can implement efficiently and correctly.
My understanding is that Native Client works with a combination of a special compiler toolchain (on the sending side) and runtime checks on the receiving side.
Happy to see any corrections. Your link seems to show a project that is related to NativeClient, but is not the core technology behind NativeClient.
EDIT: Also, I would be interested in details about the Microsoft driver thing, but that doesn't seem related to proof carrying code either.
As the other commenter pointed out, the SLAM tools are part of the Windows Device Driver Development Kit. The last time I talked to the kit's dev manager (~2003), they were talking about making it mandatory that you pass the formal verification in order to have your driver signed by Microsoft. Since those signatures are then verified at driver installation time, that feels very close to it!
I have to confess I'm only familiar with the publications on Native Client and not the actual product. From what I'd read, I understood that the verifier did some basic static analysis to prove that no possible execution violated certain properties. In that case, no proof object is required, as the source code itself is the proof object. Assuming, of course, that they're actually doing the stuff talked about in the papers and in practice don't just "grep for dangerous instructions."
Now, this obviously involves understanding how compilers and general purpose languages work, but it also involves some other skills and ideas both in design and implementation. The biggest difference is, of course, in scope--rather than thinking about languages good for anything, we think about very narrow languages heavily optimized to do one thing. These languages may stand alone or they may be embedded in bigger languages, but each language itself is distinct from other languages (including the "host" language).
These languages are also not always aimed at programmers--one of the examples (a past final project) was a language aimed at tailors, of all people, to help them work with patterns. Another language we looked at was designed for musicians to combine different inputs into one output.
These sort of languages are programming languages and have the same ideas, but they serve a different purpose. In a lot of ways, languages like this replace programs and GUIs, letting people work with text rather than pointing and clicking. I think there are very many domains where using text is preferable to a GUI--there is a reason I still use a CLI, after all--and this is exactly the sort of thing we're looking into, except not necessarily for programmers.
I've wandered a bit off topic, but I think these ideas are interesting. It's another direction for PL research--focusing on very narrow fields and potentially non-programmers. Just something to think about.
<tongue in cheek> Perhaps developing useful metrics of productivity, rather than strong AI, should have been the real Holy Grail of Computer Science.
A few software-engineering researchers have told me that that's one major reason that recent "tools" type SE-research happens outside academia: if someone in academia had invented git, it's not clear how they would design a user study to evaluate it, especially within the constraints of, say, a PhD thesis timescale/budget. The typical/simple study design is you recruit N participants, randomly assign N/2 to your tool and N/2 to the control tool, have them perform a task, and then try to show with p<0.05 that the group using your tool did better than control. But in this case, the "perform a task" step has to be non-trivial, and it tends not to be feasible to recruit people to participate in a random study that involves them developing serious software over several years, which would be the equivalent of the kinds of randomized studies that are done with medical devices.
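The "typical/simple study design" above can be sketched in a few lines. This is a toy illustration (names like `run_study` and `task_score` are mine, not from any real study): the hard part in practice is that `task_score` stands for months of real software development, not a cheap function call.

```python
import random
import statistics

def run_study(participants, task_score, seed=0):
    """Randomly assign half the participants to the new tool and
    half to the control tool, have each 'perform a task', and
    compare mean scores. task_score(participant, tool) stands in
    for the expensive task-performance step."""
    rng = random.Random(seed)
    pool = list(participants)
    rng.shuffle(pool)
    half = len(pool) // 2
    treatment, control = pool[:half], pool[half:]
    t = [task_score(p, "new_tool") for p in treatment]
    c = [task_score(p, "control") for p in control]
    # A real study would now run a significance test (e.g. a
    # two-sample t-test) hoping for p < 0.05; here we just
    # report the difference in group means.
    return statistics.mean(t) - statistics.mean(c)

# Toy run: the new tool adds one point to everyone's score.
score = lambda p, tool: 5.0 + (1.0 if tool == "new_tool" else 0.0)
print(run_study(range(20), score))  # 1.0
```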
I don't actually find the case-study-based approach particularly bad. Start from cases that are awkward or error-prone to handle in a language (either constructed or derived from data about real-world errors), and propose a solution that captures the underlying computation more directly, or in a more checkable way, etc. There are other areas that make progress in that manner; for example, symbolic logic develops with a case-study and counter-example-driven methodology, where someone will propose a case that either can't be represented in Logic X, or at least can't easily be represented, or maybe produces incorrect inferences when encoded in the obvious way, and this will drive development of a Logic X'.
> It appears that deep thoughts, consistency, rigor
> and all other things we value as scientists aren’t
> that important for mass adoption of programming languages.
> Fortunately, academia doesn't have a monopoly on "deep thoughts"
I've used quite a few popular programming languages and they certainly feel "useful" because of syntax, libraries, being easy to run on UNIX, etc. Most of the "deep thoughts" in these programming languages can be easily found in prior academic literature.
Glad to hear of some examples if you have any! :)
There's plenty of evidence that good engineering can be accomplished with popular programming languages. But I think we're still a long ways off from beautiful engineering. Any way forward needs to elegantly intertwingle Theory & Praxis.
PHP's "deep thought" was that in the context of building dynamic web pages, perhaps it made sense to embed a scripting language into HTML itself -- instead of calling out to an external program. There's an example of the initial idea being responsible for much of PHP's enduring success.
And here's an anecdote that explains how it was (lack of) readability in significant indentation that led to Python getting its colon:
> In 1978, in a design session in a mansion in Jabłonna (Poland),
> Robert Dewar, Peter King, Jack Schwartz and Lambert were
> comparing various alternative proposed syntaxes for B, by
> comparing (buggy) bubble sort implementations written down in
> each alternative. Since they couldn't agree, Robert Dewar's wife
> was called from her room and asked for her opinion, like a
> modern-day Paris asked to compare the beauty of Hera, Athena,
> and Aphrodite. But after the first version was explained to her,
> she remarked: "You mean, in the line where it says: 'FOR i ... ',
> that it has to be done for the lines that follow; not just for
> that line?!" And here the scientists realized that the
> misunderstanding would have been avoided if there had been a colon
> at the end of that line.
Off the top of my head, you might consider that the following count:
* Design by contract
* C3 linearization (i.e. multiple inheritance)
* Traits (or mixins)
* Prototype-oriented programming
* Aspect-oriented programming
* Compilers ;)
* described by Bertrand Meyer while at UCSB
* published in OOPSLA for Dylan
* Lisp machine
I was hoping to draw a line between the idea of a language feature designed to solve a real-world problem, and one designed to advance the theoretical state of the art--all of my examples are ones I'd regard as being the former, and I have less interest in looking at where they were developed or by whom. Lisp definitely qualifies as the second (and with apologies to Whitehead, all programming language design consists of a series of footnotes to Lisp), as it wasn't even intended to be a programming language, but an AST. I don't think that necessarily means that AOP is an obvious implementation of Lisp macros, especially as it took 40 years for them to appear.
I don't think it's particularly worth running through the minutiae of the claims for each point being an academic language or not when it's just a semantic point. I do think it's pretty disingenuous for the article to claim that Python is not an academic language (when it was developed at CWI) but that C is on the opposite side (when it was developed as a skunkworks project at Bell Labs), and that this is one of the many major flaws in perspective and confirmation bias that the article suffers from--unfortunately, much like the claims about programming language theory that it itself points out.
I would say that both are the two faces of the same sheet of paper. Take the law of motion (f = ma, where f and a are vectors): is there anything more "pure" (or, in other words, "simple") from a mathematical perspective? And is there any possible way to make it more readable and comprehensible to (educated) minds? I think I have read that before vectors were properly "invented", a simple three-term equation like this one was a complex mix of vague and partial statements. So, at least in this case, readability and mathematical purity came together, with the development of knowledge and abstract tools.
I have heard some debate about this question: is there a possibility that a future Newton will model some of today's complex problems in a way that makes them easy to write, understand, and more "pure" (i.e. simple, atomic)?
My bet is there will be (don't ask for proofs other than "it happened like that before").
So, more on topic: I think there are ways to make programs pure and readable, I think these ways are not divergent: purity is simplicity, and simplicity is readability.
I do still find that I write far too much code to handle what isn't the normal flow: dealing with errors, unusual values, rare interactions, etc. Exceptions cover some of that (except in Java with the annoying checked exceptions model). But there is still a lot of code that falls between exceptions and what gets executed most of the time. If only I could leave that rarely executed stuff out.
There is also a massive sore spot with functionality split across multiple processes/machines. Writing synchronous code is clear but has issues, while writing asynchronous code reflects what is really happening but involves a lot of babysitting. I do think the functional world has immutable values right as a good approach (less locking to worry about, can be transferred between processes, easier to debug, etc.).
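A minimal sketch of the "immutable values" point, in Python (the `Order`/`tax` names are hypothetical; the point is that frozen values need no locking when shared across workers):

```python
from dataclasses import dataclass
from concurrent.futures import ThreadPoolExecutor

@dataclass(frozen=True)
class Order:
    id: int
    total: float

def tax(order: Order) -> float:
    # Pure function over an immutable value: no locks needed,
    # safe to run concurrently, and easy to debug or replay.
    return round(order.total * 0.08, 2)

orders = [Order(1, 10.0), Order(2, 25.0)]
with ThreadPoolExecutor() as pool:
    print(list(pool.map(tax, orders)))  # [0.8, 2.0]
```

Because `Order` instances can never change after construction, the same values can be handed to threads, pickled to another process, or logged for later inspection without any "babysitting."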
I look forward to the day when my programs look like this, all of one line long:
Using Python I can say a list should be sorted and have no say over the algorithm. If I use the STL I have to make a few decisions. Java appears to have 7 different list implementations that I have to pick from.
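To make the contrast concrete (my own toy example): in Python you declare *what* you want and the runtime owns *how* it happens; CPython's `sorted` always uses Timsort, and the only knob you normally touch is the key.

```python
data = [3, 1, 2]

# Declarative: no say over the algorithm (CPython uses Timsort).
print(sorted(data))                    # [1, 2, 3]

# The one decision you usually do make is the sort key.
print(sorted(data, key=lambda x: -x))  # [3, 2, 1]
```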
Working code that was quick to write is far more important than being very specific about every operation. In most cases it is sufficient. With profiling and real world usage you can get an idea of where more specificity is needed, but in many cases even that can be dealt with automatically (JIT, type inference, calling out to a different language). Even if you write new more specific code you can still use the existing code to test against.
Perhaps languages are no longer at the right granularity to be "solved". This again echoes the market process; too many variables, and too many different preferences leads would-be problem solvers to rely on the pretense of knowledge.
Languages combine a host of disparate features, and sometimes simply combining previously known features in novel ways is sufficient iterative improvement. Clojure is one good example of this, where immutability, functional programming, and STM were all known and used to varying degrees, but combining them in a clever way allowed for something greater to emerge, particularly how the first two allowed for a new form of the third. From this in turn emerged a (new?) well-thought-out model of time, state, and identity.
In the end, perhaps languages are now really engineering problems, not the science/math problems they used to be. Perhaps the academics should embrace this. To continue the economic parallel, if one wishes to examine the qualitative aspects of PLs, perhaps research should approach from the historical perspective, asking why certain languages emerged to solve certain problems.
There's obviously lots of room to improve in programming languages. It seems the only question is getting people to do it.
As much as the ideas of Science and Engineering are popular sacred cows in "Computer Science", I think that the essence of the software world is much closer to some sort of mathematical craftsman.
In my opinion the languages of the present improve by a) skillfully combining the existing ideas and b) implementing the 95% other things that are not big ideas but are still essential to a language being good: consistent and broad standard lib, streamlined syntax, core concepts playing well with each other and so on.
Both of the above are engineering rather than science, so the author seems right in his views. Maybe when we reach the local optimum with the current set of fundamental ideas, we will see the need for new ones more clearly and the necessity of fundamental research will reappear.
One possible kind of answer could be that, given formal systems are powerful tools to model some interesting corner of reality, the urge to create a DSL could be related to an urge to understand a domain by creating an automaton that can be mapped to domain aspects.
List Comprehension Syntax:
Python's list comprehension syntax is taken (with trivial keyword/symbol modifications) directly from Haskell. The idea was just too good to pass up.
So the point of programming language research is to come up with ideas for the practical programming languages of the next 50 years to steal.
More generally, one point of research is to come up with ideas for practical technologists of the next 50 years to steal.
(1) I saw a talk by Jonathan Edwards that was very much along the lines of what you wrote here: http://alarmingdevelopment.org/?p=5
(2) Second, Christopher Alexander’s early work on patterns in architecture and urban design have been referenced quite a bit in computer science, but seldom is his ‘magnum opus’, a four-book series on the ‘nature of order’, referenced. These texts move far beyond the early work. You would do well to have a look at the first book, which tries to establish an objective theory of design not based on scientific principles: http://www.amazon.com/s/ref=nb_sb_noss_1?url=search-alias%3D...
(3) You might be interested to read some discussion on the history of music programming languages. Max/MSP and Pd, both dataflow-oriented, offer what I would estimate to be an order of magnitude of productivity gain for certain tasks in building one-off multi-media systems. They’re a bit like a UNIX for real-time multi-media + control signals. This essay reminded me a bit of the anti-academic and organic approach that Miller Puckette took in building them despite being trained as a mathematician and developing them in an academic setting. This serves as a good lesson that successful software isn’t necessarily designed by having good principles, but rather the proper environment, namely, one with energy and a need.
Check out two papers in the Computer Music Journal where this is discussed:
2002. Miller Puckette, “Max at Seventeen”. Computer Music Journal, 26(4)
2002. Eric Lyon, “Dartmouth Symposium on the Future of Computer Music Software: A Panel Discussion”. Computer Music Journal, 26(4)
Oh, and I would add that if you are not familiar with Bill Buxton’s career, it may prove interesting reading for you. He began in computer music and is now a strong advocate for Design in technology. One insight that he often emphasizes, which I don’t claim is his originally, is that new technologies take 20-30 years to be adopted. According to this view, new ideas in software design should expect to lie dormant for at least 20 years, echoing what @Ben wrote above.