One big difference (which the article seems to mostly ignore) is that the shortest project was written by a single developer, while all others were a team effort.
Having the whole project - design, architecture, all implementation details - in one head is not a trivial advantage. Even ignoring communication overhead, there might be subtle duplication of code simply because different people choose to do similar things in slightly different ways.
While the ratio of Rust to Python code kind of matches my expectations, I wonder how much of it might be due to the difference in team structure vs the difference in chosen language.
These were different teams of students taking a class. Variance in student effort/quality is generally high.
There were two Rust teams. One had 3x the code and passed fewer tests than the other. This is our only reference for how much noise is owed to the team rather than the language; no other language was used by more than one team.
Python did best (the least code, yet also the most features) less because of Python and more because the best programmer was using it.
Would the other students have been able to take advantage of the duck typing and metaprogramming in the same way, or ended up following different designs? I'd have my doubts about the second Rust team.
Granted, the fact that Python allowed these is itself a feature, but I think we're really just looking at noise.
Then again, the “best programmer” sacrificed code quality in pursuit of quickly producing features. The article doesn't define any code-quality measures, so I'd either not take the lines-of-code metric as a measure of value in itself, or take it with a massive grain of salt.
That was the right tradeoff for the situation. I think any attempt to use this case study to draw conclusions about maintainability would be a mistake; that's just not what this is useful for.
Hi. I wrote the python implementation. I've had prior experience with compilers and I'm obviously biased but I would say it's quite readable. The main compiler part (without optional features or the LR(1) or lexer generator) is 2040 lines and contains very little magic. It would be easy to translate 1-1 to idiomatic C++ for instance - it would just be much more verbose. eval() is only used by the lexer generator. The metaprogramming could have been avoided by using dataclasses (but the marker used python 3.5).
(To be clear, the optional portions, in which I implement SSA, various optimizations, a Hack-style allocator and some ad-hoc codegen, are much less readable.)
Sounds really cool! As someone who programs a bit for work (to help automate processes) and for fun, but is not a computer scientist, I'm a little jealous of never having taken a true database, compilers, PLT, or theory of computation course.
It sounds like you and your classmates are top notch and will go on to some pretty freaking cool careers (ex: I'd work at Jane Street if I was not a parent and a lot smarter :)). Out of curiosity, what are the typical places your classmates go upon graduation?
As I have aged, I have learned that duplication of code can become a feature in terms of building redundancy inside of teams. This is one of the reasons that "enterprise code" really sucks, but the turnover of people and the need to transfer knowledge across generations is more important than minimalism and the power of individuals (sadly).
It comes down to the idea that perfection is the enemy of good; and besides, how do you change perfection?
In a growth company with massive scale, a system always has some risky red-line, and so it is better to spin up some competing efforts with different perspectives to tackle different needs in different ways.
The key is to have redundancy of people over time and to create multiple thought leaders within a company; this makes it interesting. Then, at a future date, a "re-org" will happen to condense efforts and the real product is having a number of people familiar with the ideas spread across the company.
This makes zero sense for start-ups, but when you are a risk taking company with massive budgets... strategies are interesting. You see this with VCs having stakes in similar investments as well, and the idea is the same... diversify over people rather than perfect minimal code.
> Having the whole project - design, architecture, all implementation details - in one head is not a trivial advantage.
It's also a big disadvantage if that one person ever wants to move on. I write this from personal experience; don't be a solo developer on a large project if you can help it.
Haskell without lens, text, vector, etc. is a bit like Rust with only core, not std.
The haskell standard library is tiny. Libraries like lens are not optional. In practice you won't understand any open source Haskell without rudimentary understanding of lens. I get why parser libraries were banned, but excluding lens, vector, and text?
I like Rust a lot, but Haskell minus its more advanced type system is just Rust plus GC. Let's not pretend this is a fair comparison of languages when it's primarily a comparison of standard libraries.
This is why I gave up on Haskell. Lens works as advertised, but it is a pain to learn and to use in practice: the abstraction is tough to grasp and it is hard to form an intuition about it. The compilation errors are laughably esoteric. The number of ad-hoc squiggly operators is ridiculous. You also need to understand a lot of language extensions to get how the type checking works.
To me it looks like an impressive proof of concept for a future programming language based around it.
If I were to start a project with Haskell the use of lens would be explicitly forbidden.
It's about as esoteric as somebody learning C++ for the first time. And from that perspective, it's totally normal for errors or syntax to be weird looking for a long time.
Most of us, including myself, are biased towards languages like Java, C, C++, Javascript, because those are what we learn first - and so our expectations of what errors (or syntax) look like are shaped by our early experiences.
So I don't think it's fair to say that Haskell's compiler errors or quirks are fundamentally less intuitive than something that GCC/G++ spits out even on a sunny day. Just odd when we expect errors to look a particular way, but Haskell is playing a totally different (not exactly harder) game.
I didn't say Haskell's error messages are bad. If you stick with explicit types on your functions and no language extension they are absolutely great. I wanted to point out that type checking errors with lens are hard unless you really know how all the different type aliases relate to each other. It was a few years ago so maybe things are better.
C++ also had this problem with the standard containers. However, it is much easier to understand what a dictionary is than some random "optic".
> However, it is much easier to understand what a dictionary is than some random "optic".
This is exactly what I disagree with. We come from a prior understanding of mutable/imperative dictionary/shared_ptr/std::pair, because that's what we started out with.
Had we initially been trained on monads, functors, and lenses, those would be the familiar tools, and we'd go "Huh, that's an... interesting way to write code" when faced with C++ for the first time.
Yes, but not from programming, but from general life experience. Everyone knows what an actual dictionary is, and even non-programmers can easily grasp how a one-way 'map' works.
Mutation is also how the real world works. If you want to record something, you write it down—you've just mutated the world, not encapsulated your operation in the WorldState monad.
You need to build a pile of mathematical abstractions in your head before you can really get off the ground with lenses. Not everyone has that aptitude or interest.
> Mutation is also how the real world works. If you want to record something, you write it down—you've just mutated the world, not encapsulated your operation in the WorldState monad.
But is it, though? Perhaps you just appended something to the world-log in a purely functional way. Time seems to always go in one direction (at least in my experience, YMMV), kind of like a DB primary key that is monotonically increasing. It really depends on how you look at this.
You 100% do not need to build a "pile of mathematical abstractions in your head" to use lenses. It's a handful of types and functions. Do you need to build a pile of abstractions in your head to use `std::unordered_map` or getters/setters in C++?
Yes to both? C++ is not a simple language by any means.
For context, I was introduced to FP (SML in this case) around the same time I learned Java, and I still think for the vast majority of coders, an imperative map is much easier to grok than lenses.
The former only requires understanding how values are manipulated and mutated. You're going to need to understand this anyway to write software, since your machine is mutating values in memory.
Lenses however require complex type-level reasoning, so now you must learn both the value language and the type-level metalanguage at once, and your language also deliberately obscures the machine model. That might be powerful, but it is still an additional mental model to learn.
Lenses don't really give you anything that you can't get from (a) a sensible syntax for record updates and (b) intrusive pointers. Lenses only exist because of Haskell 98's uniquely bad support for record types. Record access and update in most other languages just is simpler.
Lenses are more than reified record labels though. There is a hierarchy of concepts that can be freely composed based on which features your data structure actually supports. In particular, lenses can be composed with traversals ("lenses" pointing at multiple fields) yielding "LINQ" like features without introducing new syntax or concepts.
The main problem with lenses is that common lens libraries look extremely complicated at first glance and seem to be solving a very simple problem. That rightfully puts most people off of learning what all the fuss is about.
If you use lens as just a way to access records like you do in other languages, then there is absolutely nothing hard about it. Literally all you need to know is:
Name your records like "data Prefix = Prefix { prefixFieldName :: ... }", call "makeFields ''Prefix" once right after the declaration, and use "obj ^. fieldName" to access and "obj & fieldName .~ value" to set.
That's it. You now have 100% of the capabilities of record update in any other language. This doesn't get any simpler in any other language. It even pretty much looks like what you would do in other languages.
I'll grant you, Haskell and lens do a terrible job of explaining subsets of functionality that are simple and let you get the job done before jumping in the deep end.
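For concreteness, that workflow looks roughly like this (the Person record and its fields are just something I made up for illustration):

    {-# LANGUAGE TemplateHaskell, FunctionalDependencies, MultiParamTypeClasses, FlexibleInstances #-}

    import Control.Lens

    -- makeFields strips the "person" prefix and generates lenses
    -- named `name` and `age` for the fields below.
    data Person = Person { personName :: String, personAge :: Int } deriving Show
    makeFields ''Person

    main :: IO ()
    main = do
      let p = Person "Ada" 36
      print (p ^. name)      -- get a field
      print (p & age .~ 37)  -- "set" a field, i.e. get back an updated copy

Past the pragmas and the prefix convention, it really is just those two operators.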
Yeah, so it's a less good way of accessing record fields than the one present in 99% of other programming languages. Your own description makes this plain. Let's compare to Javascript:
* I don't need to import a module to make available the syntax for getting and setting fields of an object.
* I can use the same syntax for any object, and don't have to worry about doing a bunch of code generation via a badly-designed metaprogramming hack.
* I don't have to worry about adding prefixes to all my field names.
* The syntax uses familiar operators that I won't have to look up again on hackage if I stop writing Javascript for a few months.
* No-one modifying my code can get "clever" and use one of ~50 obscure and unnecessary operators to save a couple of lines of code.
What bugs me is when Haskell advocates try to use all the additional esoteric features of the lens library as an excuse for this fundamental baseline crappiness.
Haskell really just needs proper support for record types. Then people could use lenses when they actually need lenses (never?). At the moment, they're using lenses because they want something that looks almost like a sane syntax for record updates.
Record types are not a solution to the problem lens solves. Lens is a good library and a good concept. If we spent some time on it in programming class, most people would get it. When moving to non-Haskell languages, the lack of proper lenses is something I notice almost immediately.
The other features of lenses don't strike me as particularly useful. YMMV. I'd also question the quality of the library. It's full of junk like e.g. http://hackage.haskell.org/package/lens-4.17.1/docs/src/Cont..., which is just an invitation to write unreadable code.
My biggest use case for lenses that I miss in other languages is the ability to interact with all elements of a collection, or elements in deeply nested collections.
For example, if I had a list of records with a field named 'categories' holding a list of objects with a field named 'tags', and I wanted to get all of those tags in one list without nested loops, lens makes it easy: record ^.. categories . each . tags . each. Or I could update them all, etc. It's just so easy to do this kind of data munging with lens that writing fors, whiles, etc. in other languages is painful.
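To make that concrete, something along these lines (the Post/Category records and field names here are invented):

    {-# LANGUAGE TemplateHaskell, FunctionalDependencies, MultiParamTypeClasses, FlexibleInstances #-}

    import Control.Lens
    import Data.Char (toUpper)

    data Category = Category { categoryTags :: [String] }
    data Post     = Post     { postCategories :: [Category] }
    makeFields ''Category
    makeFields ''Post

    -- Every tag of every category, flattened into one list.
    allTags :: Post -> [String]
    allTags post = post ^.. categories . each . tags . each

    -- Or rewrite them all, getting back an updated copy of the whole structure.
    upcaseTags :: Post -> Post
    upcaseTags = over (categories . each . tags . each) (map toUpper)

The same path expression works for getting, setting and mapping, which is the composability I miss in other languages.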
> And from that perspective, it's totally normal for errors or syntax to be weird looking for a long time.
This isn't normal. This is just using a tool that sucks. Those who consider this normal are just masochists.
Rust, Elm, etc. have great error messages. That took a lot of time and effort to achieve. The fact that it is impossible to implement a C++ compiler that produces good error messages is just proof of how broken the language is. The fact that some people find this normal is just Stockholm syndrome at work.
Not at all. Several languages, Rust included, take understandable errors seriously. I am a Rust newbie but the errors are extremely easy to grasp and make it easy to fix my code.
You say "not at all", but only cite Rust (which I didn't mention). C++ has horiffic error messages, certainly at the level of a bad Haskell error message.
So your defense for Haskell's error messages is that they're slightly better than what you get from a massively entrenched language with famously user hostile error messages?
I don't think C++ errors are bad any more. 2019 compilers generally produce very good error messages. The situations where you get into pages of template nonsense in an error are becoming fewer and further between all the time.
C++ has bad error messages because of language design. Contemporary C++ compilers are very good at reporting clear error messages about common mistakes, but template heavy code still yields arcane error messages. Templates are untyped, so there is no way to give sensible error messages when defining or instantiating a template. Instead you have to typecheck after template expansion, at which point you are left with an error message about compiler generated code.
There are some proposals which address this (e.g., concepts), but none of them are part of the language standard yet. Concepts in particular made it into the C++20 draft, but they also made it into a draft of the C++17 standard and were ultimately rejected. Somewhat vexingly C++ concepts actually come with the same problems only at the level of concepts instead of at the level of templates.
Some C++ has horrific messages, but newer compilers do a much better job of complaining about most errors; some even suggest fixes. I don't remember seeing Haskell doing that.
Same for me, except also the incredibly obtuse set of ~20 compiler pragmas you need in Haskell. If you ask for help to do some simple programming concept, like multiple dispatch based on type at runtime, then from the Haskell community you first get a bunch of tone deaf “you shouldn’t want to ever do that” responses, followed by a huge tome of all the language extensions (fundamentally changing or adding syntax) that you need.
With the exception of very few extensions that I've never seen used in practice, Haskell language extensions are mutually compatible and create a language that is a strict superset of the old language. In this sense, I'm not sure how they're much different than the --c++=14 flag in GCC.
If you need to know and understand syntax implications on highly generic type pattern constructs coming from a dozen external pragmas, just to be able to read the code then it’s a severe language design problem.
That's a total stretch; lens is not used in GHC, for example, or in lots of other smaller compilers written in Haskell. It is used in Ermine, but that has been stuck in a semi-complete state for a while now and Ekmett has moved on.
I’ve written tens of thousands of lines of Haskell, and I’ve never used lens. Also, putting it in the same category as text and vector doesn’t make sense — these are indeed unavoidable, and practically all my projects use them.
A file named TAGS, generated by hasktags (in the case of Haskell), gives you an easy way to "jump to definition" from Emacs or other editors. It's a good way to navigate codebases even if you don't know how to build them.
Presumably etags/gtags/hasktags etc., i.e. having built a TAGS database with such a helper program, you can use it in an editor to jump from a field name to its definition. That wouldn't be the case with a lens accessor.
Excellently written, great topic, and done without flaming or too much bias. I'm amazed, as many older folks in the industry would not be able to write an informative article with this level of content and maturity.
It's difficult to write a fair comparison without being a fairly competent programmer in each of the languages. The trouble is, if a person is an expert C programmer and then translates it to Python that he's only modestly familiar with, the Python program will look like C. It won't be idiomatic Python.
For example, my early Fortran programs looked like Basic. My early C programs looked like Fortran. My early C++ programs looked like C. And my early D code looked like C++.
It takes much more than being able to write a program in X to be able to write one that makes proper use of the language.
That is all true. At the same time here the groups were allowed to use the language of their choice. Presumably they chose languages they felt they were competent in.
Of course an expert in a given programming language can write much better in it than a novice. But a comparison like this is not necessarily about comparing top programmers in every language, but average programmers, because we want to know results that are true "on average".
The author does note he "knew (they) were highly competent". So they were not exactly novices in their language of choice. Writing a compiler is not a task for novices in general.
There were people with 2k to 10k loc of experience in some language. That seems extremely low for any meaningful comparison and I would really hope that “average” programmers have way more experience than that...
I think I was pretty junior after writing north of 100k loc and working on 1M loc projects. And for sure I don’t consider myself highly competent in F# after writing some thousands lines.
I agree with the conclusions when they say that the design decisions are much more important than the language of choice. But I still firmly believe that the language makes a very big difference in real world projects. In toy throw away projects obviously metaprogramming cuts a lot of locs.
Reading about APL recently, I saw that arcfide - the chap working on a GPU compiler - likes APL's terse code because you can throw it away and rewrite it without too much trouble. His compiler is around 750 lines of code after 6 years of development, but[1]:
"If you look at the GitHub contributions that I've made, I've made 2967 of about 3000 commits to the compiler source over that time frame. In that time I've added roughly 4,062,847 lines of code to the code base, and deleted roughly 3,753,677 line of code. [..] It means that for every one of those 750 lines, I've had to examine, rework, and reject around 5400 lines of code."
Yet a 750 loc codebase sounds like someone with very little experience and a few days or a long weekend.
I'm not just making the trivial point that some languages are denser than others. It would also be interesting to compare projects by the total lines of code written including all commit history, and to compare people's experiences and project designs in terms of "how many times something got implemented in several ways before settling on a final version", or "written once and the design set in concrete because it was too big to bother changing", how many prototypes that work differently were explored before deciding - and what that does to people's skills and to project designs.
I'm guessing thousands of lines of F# would make you more skilled than the same thousands of lines of C#, even moreso if the F# was higher-abstraction than the C#, would you agree?
> Yet a 750 loc codebase sounds like someone with very little experience and a few days or a long weekend.
For the purpose of evaluating experience you count the total amount of code written, not the final codebase size. Four million lines of code is a fair bit of experience in any language, even if most of those lines were later deleted or replaced.
The code I initially get to work is often pretty large and complex. Then going over it again, I can see how to shrink it and make it elegant. And then again, then again.
It would be unfair to leave it without the link to the HN discussion where he explains his reasons for writing it that way in detail and defends against a lot of criticism, and a video stream walkthrough of the codebase as of a couple of years ago.
> There were people with 2k to 10k loc of experience in some language. That seems extremely low for any meaningful comparison ...
It's low if you're comparing languages based on the skills of experienced programmers, but most programmers are not terribly experienced. A comparison of languages by programmers that are novices to the language is still meaningful. A language that is easier for a novice to pick up and write good code is at the very least one good measure for the quality of the language.
> So they were not exactly novices in their language of choice.
The Haskell team had "maybe a couple thousand lines of Haskell each" at the start. This one project ended up being 9.7k lines, so it constitutes half of their collective experience with the language. I'd say that counts as "novice" in terms of prior Haskell experience. Under the circumstances I think they did remarkably well to produce a thoroughly tested and maintainable end product in just twice the lines of code of the quick-and-dirty Python implementation.
I thought it very amusing that each team's focus would reflect the language: the Python implementation was, as you say, quick-and-dirty, while the Haskell one "caught a few edge cases that our team did not" and the C++ team did "optimizations on their IR".
The article says that the instructor for the course cautioned against using Haskell because some people overestimated their competency with it.
I think it is actually very likely that people would choose a programming language or system for reasons other than how competent they are with it. E.g. to seem "smart" because you wrote your compiler in Haskell, even though you actually have much more experience with Java or Python.
FTA:
> Another interesting thing to note is that at the start of every offering of the course the professor says that students can use any language that can run on the school servers, but issues a warning that teams using Haskell have the highest variance in mark of any language, with many teams using Haskell overestimating their ability and crashing and burning then getting a terrible mark, more than any other language, while some Haskell teams do quite well and get perfect like my friends.
> would choose a programming language or system for reasons other than how competent they are with it
Good point; these were students, so they were eager to learn new things. Can't blame them.
At the same time much of programming is learning new things continually. Some things are harder to learn and master than others. Seems like Haskell might be one such thing, based on what the professor says.
Yeah, I think the problem comes when you try to learn too many new things at once. If you're still learning about compilers, then trying to learn about functional programming on top of that is just asking for trouble.
It's my understanding that UW offers some undergrad courses in Racket, but apparently they're optional so if you hadn't taken that course you might not know much about functional programming.
There is a mandatory compilers-lite course at UW and he teaches the advanced version of it, where the project is to fill in pieces of a Scala-lite compiler written in Scala.
I'd say that "on average" results may be very misleading, in formal statistics and in results like this. Variance and median are very important: see comments about Haskell downthread. (Ideally you have a histogram.)
In particular, some languages with a good average may contain pitfalls that could lead to abysmal outlier results (e.g. C++ or Scala, in entirely different ways), some would keep you from doing really dangerous things but also would not let you achieve spectacular things (e.g. Go).
> So they were not exactly novices in their language of choice. Writing a compiler is not a task for novices in general.
Only having a few thousand lines of code written in Haskell very much makes you a novice. With that said, writing a compiler in Haskell is actually pretty trivial. It's at the very least considerably easier than most other languages.
Fun fact: the person you're replying to is the creator of D, so I'm pretty sure he likes it. I've personally played around with D a bit and written maybe 1kloc in it, and I like it.
It doesn't have good pattern matching or the borrow checker, but it does have much better metaprogramming than Rust. I think a lot of the metaprogramming used in the Python project could also be done basically as easily in D, which is a big accomplishment.
D has many characteristics of lifetime checking in that it can track lifetimes across function calls and issue errors for incorrect usage. I've been looking into increasing its reach, and it looks promising.
Walter Bright is (along with Andrei Alexandrescu - which should be a good enough reason if you like c++) the BDFL of D.
D won't get you hired (probably), but D is designed with hindsight from a C++ compiler writer and a C++ template wizard. It shows: D is objectively better than C++ in many ways. It's worth checking out, at the very least (it's also not hard to learn, so I say go for it).
An example of the power of D: The Pegged library for D can generate a parser, D code which gets compiled, directly from a grammar specification in a text file [inside a D program, e.g. mixin(Grammar("Your Grammar Here"))]
On the contrary. Many members of the D community have managed to leverage their D expertise into well-paying jobs. Many industrial D users recruit from the D community.
Such a good guy: creates a language and finds people job postings for that language. Jokes aside, I didn't know you were behind D. Thanks for your contribution. I will give it a shot.
I was a former C++ full-time programmer (4 different jobs in high performance teams, high maintenance etc) and now I'm a D full-time programmer for 4 years.
It's easy to underestimate the difference, but to me _as a user_ those languages are night and day as an experience.
D is learnable, in the sense that your learning will have some sort of ending at one point. D is surprisingly stable too: the front-end doesn't change too much and all 3 compilers share the front-end. And it's somehow forgiving. So the whole experience is tilted towards what you do with it: you feel like "programming", not "programming in C++".
C++ has a few things going for it: it has some sort of mathematical completeness, you can find jobs maintaining C++ codebases for your retirement, and it has an aura of legitimacy. But overall I fear you would live in a much more complicated world, for reduced productivity.
> I think the smaller differences are also large enough to rule out extraordinary claims, like the ones I’ve read that say writing a compiler in Haskell takes less than half the code of C++ by virtue of the language
Specifically the "by virtue of the language" part:
Seems to me like it's unreasonable to claim the languages are on equal footing because fancy parser libraries aren't allowed to be used for the project. The fancy parser libraries exist for certain languages specifically because those languages enable them to be written. (For example in Haskell: monadic libraries, libraries that take advantage of GADTs, etc.)
I don't think monadic parser libraries have a real claim to be that difference. All the languages listed have excellent parsing libraries that make things similarly easy, if not by language power then by a grammar DSL with embeddable code snippets.
I think if any library could make a real difference for Haskell it's most likely to be http://hackage.haskell.org/package/lens, which a Haskeller friend of mine claims could likely make a lot of the AST traversal and rewriting much terser.
While I found your article informative and interesting, I think it only works in the very specific context of this assignment. Disallowing powerful language features/libraries means it's not a level playing field and thus not a fair comparison. Some languages' standard libraries are tiny, some are huge. Some languages have lots of advanced features; e.g. GP mentioned GADTs, with which one can write type-safe, correct-by-construction ASTs. In other words, the fact that programs pass specific tests in a specific context does not imply they are comparable in terms of general correctness/robustness/maintainability (as you noted regarding caught edge cases).
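For readers who haven't seen it, a toy sketch of what "correct by construction" means with a GADT (this Expr type is purely illustrative, not anything from the assignment):

    {-# LANGUAGE GADTs #-}

    -- Ill-typed terms such as `Add (BoolLit True) (IntLit 1)` cannot even be
    -- constructed, so eval needs no runtime type checks.
    data Expr a where
      IntLit  :: Int  -> Expr Int
      BoolLit :: Bool -> Expr Bool
      Add     :: Expr Int  -> Expr Int -> Expr Int
      If      :: Expr Bool -> Expr a   -> Expr a -> Expr a

    eval :: Expr a -> a
    eval (IntLit n)  = n
    eval (BoolLit b) = b
    eval (Add x y)   = eval x + eval y
    eval (If c t e)  = if eval c then eval t else eval e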
Hoopl (data flow analysis) would also make a difference. I did a very similar project at my university in Haskell and Hoopl definitely saved us from writing quite a bit of code. We also used parser combinators in the frontend, which I think saved us time too.
I've found PEGs (Parsing Expression Grammars) to make things extremely easy and terse. E.g. OMeta, Parsley, etc.
My experience with using both PEGs and parser combinators is that there isn't a huge difference in the total number of lines of code. On the other hand though, the syntax of PEGs would be easier to understand for someone who is familiar with BNF style notation.
Recoding a viable subset of lens would have taken 50 LOC in Haskell.
Likewise, rewriting parser combinators would not have taken long for experienced devs. The problem here is that requiring people to recode the libs on top of the compiler is disingenuous. And if you ban idiomatic libs, you also ban most online help, tutorials, etc.
(A suitable subset of) Parsec is about 100 lines of OCaml. Implementing a PEG syntax on top of it is about 150 lines of Haskell (or less, I'm a Haskell noob).
Building up the knowledge to get to this point however… nope, those students were better off going with hand-written recursive descent (or Lex/Yacc, since an equivalent was allowed).
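For a rough sense of what those line counts contain, a toy combinator core (just a sketch for illustration, not the actual code being referenced) looks something like this:

    import Data.Char (isDigit)

    -- Minimal Parsec-style combinators; real libraries add error reporting,
    -- backtracking control and an abstract input stream on top of this.
    newtype Parser a = Parser { runParser :: String -> Maybe (a, String) }

    instance Functor Parser where
      fmap f p = Parser $ \s -> fmap (\(a, rest) -> (f a, rest)) (runParser p s)

    instance Applicative Parser where
      pure a    = Parser $ \s -> Just (a, s)
      pf <*> pa = Parser $ \s -> do
        (f, s1) <- runParser pf s
        (a, s2) <- runParser pa s1
        pure (f a, s2)

    instance Monad Parser where
      p >>= f = Parser $ \s -> do
        (a, s1) <- runParser p s
        runParser (f a) s1

    satisfy :: (Char -> Bool) -> Parser Char
    satisfy ok = Parser $ \s -> case s of
      (c:rest) | ok c -> Just (c, rest)
      _               -> Nothing

    orElse :: Parser a -> Parser a -> Parser a
    orElse p q = Parser $ \s -> maybe (runParser q s) Just (runParser p s)

    many1 :: Parser a -> Parser [a]
    many1 p = (:) <$> p <*> orElse (many1 p) (pure [])

    -- runParser digits "42abc" == Just ("42", "abc")
    digits :: Parser String
    digits = many1 (satisfy isDigit)

Most of what real libraries add on top of a core like this is error reporting and performance work.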
Right, to prove or disprove the norm one would need much more information. Two counterexamples do, however, disprove that "Parsing libraries ... just aren't used for big projects." GHC and the OCaml compiler are both big projects, and they use parsing libraries.
Describing two well-engineered compilers of two relatively used languages as not impactful is quite a statement. In particular, given the good performance results they achieve, for languages that are quite far away from the normal execution model of the machine they produce code for.
Relative to what? Haskell and OCaml are important languages for PLT but not in the context of "production", or as I understood "production" to mean: shipping products with features. To call them anything but niche players in this context is not accurate in my opinion.
As in, there are a handful of large projects the languages are used for. For Haskell, Facebook's spam detection system, Sigma[1], comes to mind, along with some use by banks (Standard Chartered); then there's a bunch of smaller places using Haskell (Wire, a secure messaging app, like Signal; Galois, a US defense contractor doing software verification and related things) plus some open-source tools like pandoc.
I know less about OCaml, but at least Jane Street is using it.
It's somewhat niche, but it's not like it's only used for hobby or toy applications.
And for mysterious reasons, OCaml is now kinda popular for.. web frontends. Bloomberg created an OCaml-to-JS compiler https://bucklescript.github.io and Facebook (again!) created an alternative syntax https://reasonml.github.io and this combination is apparently a new hipster way of writing web apps.
I don't think language use industry-wide is a good metric for how relevant compiler software is. It is entirely imaginable that a very serious compiler that proves the original point exists for a language with very rare industry use.
Another point is that Haskell and OCaml popularized features and styles that are making their way into mainstream languages (e.g. option types instead of null, immutability by default), and the compilers showed that they can be implemented efficiently.
Excluding the lens library (as per the article) is unusual, it provides natural getter/setter and row polymorphism type functionality.
More anecdotally, I'd argue parsing libraries are common; just look at the prevalence of attoparsec and others. But most parsing libraries in the ecosystem are parser combinator libraries, which don't provide the performance and nice error messages that compilers need.
That was where I stopped reading. If a library like lens—used by nearly every haskeller in every project—was disallowed, I don’t know what the purpose of this exercise was.
Restrict the students from using a parser library. I get that. But allowing nothing except that standard library? That’s stupid.
It also makes the language comparison useless. Python has a standard library that is continuously improved, and people reach for it when writing programs. Haskell, like C, ossified its standard library when it was created, and people use external packages for the equivalent up-to-date libraries.
It depends entirely on whether the big project still has an elegant and complete formal grammar. Hand-rolled parsers are only common in industrial languages because many have grown to be far too complex and ad-hoc, requiring e.g. additional analysis and disambiguation during parsing. It is not a situation to aspire to.
Building Chromium atm, and to be honest I'd be happy if it were written in a trillion lines of BASIC if that would somehow achieve even a 10x build time speedup.
D is known for having an extremely fast compiler. In fact, Walter Bright wrote one of the fastest C++ compilers (the Digital Mars C++ compiler) before he wrote D.
Using a fancy parser-library would mean we should also count the lines of code in it. It would basically mean adapting an existing solution. In practice that would make a lot of sense, but if the purpose is to compare the productivity of different languages then not so much.
Isn't it a facet of productivity of the language that it's easier to write certain types of libraries in one language than another? If you're writing a compiler, the fact that lots of people write parser libraries in Haskell is a point in favor of Haskell, whether if you intend to use those libraries (because they're available and production-tested) or you intend to write your own (because it's demonstrably a productive language for that sort of work).
I think the big big result from this study is: "Python: half the size !".
A dynamic language like Python is better here, 2x better. I assume similar results would apply to other dynamic languages like JavaScript, Lisp, Smalltalk, Groovy etc.
This does not say that static typing should not be used but I think it shows unequivocally that there is a considerable extra development cost associated with static typing.
You might say that surely that additional cost would be compensated in reducing the cost of maintenance later. Maybe but I'm not sure.
Any development effort of significant size (like writing a compiler here) is a combination of writing new code and adapting and modifying code already written. "Maintenance" is part of development.
This is quite a quantitative study which gives credence to the claims of advocates of dynamic languages.
I enjoy hacking in dynamic languages as much as the next programmer. But, the big take-away is that "the initial implementation was done in 1/2 the code" not that the resulting code was more extensible or maintainable (by other programmers!).
> You might say that surely that additional cost would be compensated in reducing the cost of maintenance later. Maybe but I'm not sure.
I am sure. 100%. From many years of experience.
Yes, static types come at an initial cost at initial development time. But they pay that time back in spades. This is exponentially true the larger the code base is (more to keep in one's head), the longer the project lives, and the more people are on it.
Having worked on very large C/C++, Scala and Python projects, when it comes to add a major feature or perform a serious refactor, I always want the static typing and the compiler to inform me when I've missed something. Far too many times has code been checked into a large Python code base that breaks something (completely unbeknownst to the programmer), because there's a code path that's rarely executed or arguments to a function flipped, etc.
That all said. There are major benefits to being able to prototype very quickly in a dynamically typed language, too.
Lately I've been doing a lot of greenfield development in Python, developing libs for in house use at my employer. We're using 3.7 currently, and I've fully embraced type hints. With proper type hints and use of pylint, you get the static checking that you'd otherwise miss. Bonus, if you're using an IDE like PyCharm, VS code or visual studio, you usually get the linting for free either as you type or on save.
There are not too many dynamically typed languages that truly allow you not to miss static typing. One of them is Clojure. I can't explain exactly how, but somehow I think Clojure is fine without them. I don't think I can say the same thing about JS, Lua, Python or Ruby.
I spent a couple of years working in a Clojure shop with people who actually like Clojure, and the experience for me was not so practically different than if everything had been written in Ruby (and indeed, half the codebase was a legacy RoR system).
You either have a type-checker and compiler, or you don't.
They used Spec. Indeed wise use of a tool like this does make a difference, but then the onus is on the developer to be disciplined enough to apply it appropriately. Humans do not by default have this discipline.
When thinking about extending and refactoring, you also need to keep in mind that static types add a whole layer which is pretty hard to change. There are advantages and disadvantages, having at static types isn't such a silver bullet.
Hm... well that might be true for extending, I'm not sure you've fully thought through that comment.
Duck types are great for writing new code, but they're very troublesome for refactoring; automated tools have a much harder time automating that process.
Refactoring data structures and implementations in a static type system is both considerably easier than in dynamic languages and, as a result, more robust.
Certainly the refactoring tooling these days is pretty sophisticated with type inference, but... well, I've refactored large Python and JavaScript code bases, and my experience has been that a static type system absolutely makes that process easier, even if you have a comprehensive test suite.
I think it's worth acknowledging that there is a place for static type systems; certainly, it's not a silver bullet, and it results in (usually) a higher lines-of-code count, which is significant; but it's naive and wrong to suggest that it has no value at all.
Specifically, for refactoring, it has a lot of value.
So do all the implicit assumptions in duck-typed code, but following them relies fully on the programmer's own caution.
You can of course add explicit checks and tests, eventually paying the same amount (or more) in LOC as the typed implementation's initial cost, but then you're also tasked to keep those up to date, without the compiler's aid.
The solutions to these problems in "extending and refactoring" are the same in dynamically and statically typed languages because in both you have to make sure functions/methods get the correctly typed input and return some expected type as output. In both worlds you will either refactor everything, abstract the problem away with interfaces or union types or you'll do type conversions at the boundaries of the code you touch.
That layer is there anyway, except that the interpreter of a dynamic language only finds out about problems at run-time, while a compiler that checks a static type system will tell you at compile-time.
I agree it's definitely an interesting result and a point in favour of dynamic languages.
A caveat is that I'm pretty sure my friend intentionally sacrificed code quality to do it, I don't think you'd find that project as readable and understandable as the others. Another caveat is that you have to be okay with your codebase being extremely magical and metaprogramming-heavy, which many industrial users of dynamic languages avoid.
As I mention, I'm personally into statically typed languages mostly for the performance and correctness benefits. I think it's plausible that on larger projects with teams the correctness benefits save enough debugging time that overall implementation time is lower, but I'm less confident in this now than I was before this comparison.
I've read my share of cryptic JavaScript written by others and in that sense I agree that in multi-person long-term projects static typing no doubt will have its advantages.
My hunch is however that what is often overlooked in development with statically typed languages is that it takes considerable time and effort to come up with the right set of types. Many examples are written showing how types almost magically make programs easier to understand. But when you read such an example what is not stated is how much effort it took to come up with just those types.
One way of thinking about it is that type-definitions are really a "second program" you must write. They check upon the primary program and validate it. But that means you must write that second program as well. It's like building an unsinkable ship with two hulls one inside the other. The quality will be great but it does cost more.
No matter what, you need a rigorous schema for your data. If you write a complex JS/Python program without doing the equivalent of "come up with the right set of types" then you will have a bad time. I'm sure in the OP here the skilled Python programmer did think carefully about the shapes of her data, she just didn't write it down.
To be sure, having to write down those data structure invariants in a rigorous way that fits into the type system of your programming language has a cost. But the hard part really is coming up with the invariants, and it's dangerous to think that dynamic languages obviate the need for that.
It's also hard to massage your invariants into a form that a type checker will accept, since you're now restricted to a weird, (usually) non-Turing-complete language.
A good example of this is matrix operations - there are plenty of invariants and contracts to check (e.g. multiplication must be between m x n and n x p matrices), but I don't believe there's yet a particularly convincing Haskell matrix library, in part because the range of relevant mathematical invariants doesn't cleanly fit into Haskell's type system.
For those cases, checking the invariants at runtime is your escape hatch to utilize the full expressive power of the language.
This particular example can be encoded into the Haskell type system though. For example, there's a tensor library where all operations are (according to the description) checked for the correct dimensions by the type system. It seems to require a lot of type-level magic though, and that may disqualify it for "cleanly".
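A minimal sketch of the idea (the Matrix type here is invented, not that library's API):

    {-# LANGUAGE DataKinds, KindSignatures #-}

    import Data.List (transpose)
    import GHC.TypeLits (Nat)

    -- The dimensions are phantom type-level naturals; a real library would
    -- also check them when constructing a Matrix from runtime data.
    newtype Matrix (m :: Nat) (n :: Nat) = Matrix [[Double]]

    -- Only type-checks when the inner dimensions agree: (m x n) * (n x p).
    matMul :: Matrix m n -> Matrix n p -> Matrix m p
    matMul (Matrix a) (Matrix b) =
      Matrix [ [ sum (zipWith (*) row col) | col <- transpose b ] | row <- a ]

The awkward part is usually constructing values of such types from runtime-sized data, which is where the type-level magic mentioned above comes in.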
> But the hard part really is coming up with the invariants,
Surely. But if you have to write them down it becomes hard to change them because then you will have to rewrite them, and you may need to do that many times if your initial invariants are not the final correct ones.
The initial ones are likely not to be the final correct ones because as you say coming up with the invariants is ... the hard part.
What I'm trying to think about is that in a language that requires you to write the types down they have to be always written down correctly. So if you have to change the types you use or something about them you may have a lot of work to do because not only do you have to rewrite the types you will also have to rewrite all code that uses those types.
That does allow you to catch many errors but it can also mean a lot of extra work. The limitation is that types and executable code must always agree.
Whereas in a dynamic language you might have some parts of your program that would not even compile as such, if you used a compiler, but you don't care because you are currently focusing on another part of your program.
You want to test it fast to get fast feedback without having to make sure all parts of your program comply with the current version of your types.
A metaphor here could be something like trying to furnish a house by trying out different color curtains in one room. In a statically typed language you could not see how they look and feel until all rooms have curtains of the same new color, until they all follow the same type constraints.
"that it takes considerable time and effort to come up with the right set of types. "
I've written once here before, this is one of the 'accidental advantages' of TypeScript: you set the compiler 'loose' when you're hacking away, writing quickly, and then 'make it more strict' as you start to consolidate your classes.
I almost don't bother to type something until I have to. Once I see it sitting there for a while, and I know it's not going to change much ... I make it a type.
It's an oddly liberating thing that I don't think was ever part of the objectives of the language, moreover, I can't think of any similar situation in other (at least mainstream) languages.
You can do that in Haskell also. Just turn on the -fdefer-type-errors GHC option and leave out most of the type signatures. Any expression with a type error will be reported when/if the expression is evaluated at runtime. You'll probably still need a few type hints, to avoid ambiguity, but otherwise it's not that different from programming in a dynamically-typed language.
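Roughly, it looks like this (toy module):

    {-# OPTIONS_GHC -fdefer-type-errors #-}
    module Main where

    -- A type error, but with -fdefer-type-errors it is reported as a warning
    -- at compile time and only crashes if 'broken' is actually evaluated.
    broken :: Int
    broken = "not an int"

    main :: IO ()
    main = putStrLn "the rest of the program runs fine"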
The takeaway for me was: "The next comparison was my friend who did a compiler on her own in Python and used less than half the code we did because of the power of metaprogramming and dynamic types."
So it's the output of ONE Python programmer vs teams of other languages programmers?
It is well-known that teams cause overhead. Think of the Mythical Man-Month.
But in the end the other teams had 3 people that were able to maintain and develop their project further, if needed. The single-person team had only one such person.
"Smaller code" does not mean easier to understand. It just means less characters to read. Maybe those characters are heavily loaded with meaning, as in the case with meta programing. You might need twice as much time to _understand_ the Python Code vs. the Rust code. The Rust code might be easier to extend, etc. So this is all comes to trade-offs at the end.
All this being said, I'm still a huge Python enthusiast.
It was also stated, though, that the reason she worked on her own was that she was a very good programmer; presumably, better than a lot of the people who worked in groups. And, as the sibling mentions, teams introduce overhead.
It's pretty clear that (like in every programming teams) the size of the output (in LoC) was linear with the number of programmers, which is something our profession should be worried about.
In my personal notion of code quality, being correct counts for a lot, and this programmer aced the tests, while apparently delivering more within the deadline than any of the multi-person teams.
While writing the previous paragraph, I wrote 'programmer' where I had previously put 'implementation', because I would guess that this study's outcome is better explained by human factors than language differences.
I share the attitudes you state in your last paragraph, but I would add that we should be skeptical of concepts of quality that seem plausible, but which lack empirical evidence for their effectiveness.
Tangentially - do you know how fast her compiler was compared to the compiled-language implementations? I have a vague sense that for many applications, interpreted languages are totally fine and reaching for a compiled language is premature optimization, but I'm curious how it actually holds up for a compiler, which is more computationally heavy than your average web app or CLI.
No I don't know how fast her compiler was. This also isn't a good setup for comparing language performance, since all the test programs were small and there was no incentive to make your compiler fast. Many groups probably used O(n^2) algorithms in their compiler and differences in algorithm choice would probably add too much noise to get a good performance signal.
That aside, speaking off the cuff totally separately from my post, I'm extremely dissatisfied with the performance of current compilers. The fastest compilers written by performance-oriented programmers can be way faster than ones you generally encounter. See luajit and Jonathan Blow's 50k+ loc/second compilers and the kind of things they do for performance. One example of a compiler task that's really difficult to do quickly in a language like Python/Ruby is non-regex parsing. I've written recursive descent parsers in Ruby and in compiled languages, and the Ruby ones were 1-2 orders of magnitude slower; non-JIT dynamic languages are really bad at tight loops like that.
> I'm extremely dissatisfied with the performance of current compilers. The fastest compilers written by performance-oriented programmers can be way faster than ones you generally encounter. See luajit and Jonathan Blow's 50k+ loc/second compilers and the kind of things they do for performance.
Lua and Jai are a lot less complex than, say, C++: sure, LLVM isn't necessarily built to be the fastest compiler in existence, but I don't think it's fair to compare it to compilers for much simpler languages and bemoan its relative slowness.
I wrote the python compiler. It's very slow. With the optimizations it's probably something on the order of O(n^3) or O(n^4). One of the test cases took seconds to compile. I made no effort to optimize the compiler speed.
Yep, the takeaway for me was that the Python project required far less code, but we're not sure how fast it ran. Further below, the author states the inputs were so small it didn't matter. What if it did? Would the Python solution still be viable?
0.5x with metaprogramming vs 0.7x without in Scala, isn't "far less". This also matches my experience - Scala is pretty on par with Python in terms of code length. It gets a tiny hit from being statically typed, but then makes up for it by having richer stdlib and powerful type-level abstractions.
Less code doesn't imply lower quality code. A more expressive language + less boilerplate allow you to write high quality, readable code with fewer lines.
My experience is that you always use a type system. If the language doesn't include one, you'll end up making a bunch of extra tests and control code that in practice is constraining your data to resemble a type. It's a worse system: more verbose, harder to refactor and it's way easier to miss corner cases.
It's either that or assuming that your data will contain exactly what you want to, ignoring every possibility outside the happy path, which is a recipe for disaster.
That said, I have to say that modern JS produces the cleanest-looking and least verbose code I've ever worked with so far. I wish there was a way to work with types without adding verbosity and harming readability.
Seems like a bit of a bold claim when the author themselves directly contradicts that in their own conclusion:
> I think my overall takeaway is that design decisions make a much larger difference than the language
After all, Scala was 0.7x the size and is one of the most strongly statically typed languages. So you could almost invert your conclusion and say the big result is
"Python only saved 20% code lines over fully statically typed language"
No, I would not reach quite that conclusion. I would say:
"Python saved 20% code lines over Scala!".
The results say something about the greatness of Scala, not of statically typed languages in general. The other ones did not do quite as good as Scala.
From what little Scala I've read it looks very terse indeed. That can be a benefit but at the same time makes it harder to understand code written in it, in my opinion.
I think most people would consider 20% basically within the margin of error induced by stylistic and other non-meaningful differences.
For example, it's not completely clear from the post but it seems like the 0.5x figure is from wc -l, which means Python wins a line every time there is a conditional or loop just because it doesn't need a closing brace. That alone might eat up a lot of the 20%, but you would be hard pressed to say that is a meaningful difference.
My surprise from this study was simply that dynamic languages are clearly not much worse than the best-of-breed statically typed languages. Maybe 20% is within the margin or error, but you definitely can't take that as any evidence that Scala is "better" than Python.
The reason I think this is "big big news" is I thought the general consensus had already been reached in academia if not the programming community that "statically typed functional languages are much better". There's little or no evidence of that in the results of this study.
It is actually nearly a 30% reduction with respect to the Scala line count.
More to the point: while the numbers nominally show differences in languages, everyone with programming experience recognizes that unmeasured and uncontrolled human factors probably had a big part in the outcome.
I know. It's strange to say that. But Scala's library is very very rich.
Another example is that Scala offers a lot of ways to process a list, like foldLeft, foldRight, unzip, headOption, lastOption, flatMap, groupBy, and many methods on Map, Set, etc. Python probably doesn't offer many of these methods.
Of course, this comes with the cost of higher learning curve.
Yes, itertools is great. But its stdlib is still much lighter than Scala's and Ruby's.
Actually, that seems to be the direction/principle of Python, where it is less inclined to add a helper function.
"It has been discussed ad nauseam on comp.lang.python. People seem to enjoy writing their own versions of flatten more than finding legitimate use cases that don't already have trivial solutions." from https://softwareengineering.stackexchange.com/questions/2542...
Not that this is better or worse. It's just that, on the brevity aspect, Python code is gonna be longer.
While what you say is true, my small example shows that the code is longer in Python for solving the same problem, i.e. `my_list[0] if my_list else None` is longer than `my_list.headOption` or `my_list.first`.
And we are talking about the brevity of a language here.
I'm not sure what your example is illustrating. The SO question asks for the idiomatic one-liner to get the first non-null element of a list. The accepted answer does that.
Maybe I should have given the comparable example in Ruby, which is `array.first`. Even Scala offers `array.headOption`. Both are more succinct than Python's.
The degree of richness and/or the height of abstraction seem lower in Python. (Not that this is a bad thing. It depends on people's taste, of course.)
Python is indeed very frustrating that way. So many of its limitations and flaws are justified on the basis of clarity, and then there is this array of very simple things that can only be expressed in unclear ways, for which every other language offers a solution.
A comparable example from JavaScript: there is no library method for getting the last element of an array, so it gets clumsy: myArray[myArray.length - 1]. A better standard library would provide a method for that, such as myArray.last(). Or maybe myArray[-1].
Sounds like you just don't like Python, but you don't have great reasons for not liking it. The standard library is fantastic in Python. Find better reasons to back up your unfounded dislike.
Saying Ruby/Scala have richer standard lib than Python isn't really a stab at Python.
Like dragonwriter says, it shows that "Python expresses the concept fairly compactly in the core language without even resorting to stdlib".
It depends on your taste whether you like richer stdlib, and I do. But some don't.
We are talking about how short the code can be in this post, and `my_list[0] if my_list else None` (Python) is longer than `my_list.headOption` (Scala) or `my_list.first` (Ruby).
I'd appreciate more if you elaborate why my example isn't a good illustration on the brevity aspect.
I think a lot of people here are getting touchy and defensive because others aren't automatically leaping to false conclusions from this article's data. They would rather imagine that the article validates their personal choice of favourite programming language than engage in proper scrutiny of the quality of the conclusions made within it.
I would agree that the sample size was small so programmer competency may be a big factor.
The interesting takeaway from this study, I think, is that it does not show that statically typed (even "pure") functional languages are obviously better than plain old Python.
The interesting thing is not what this study proves, but what it does not prove.
> The interesting takeaway from this study, I think, is that it does not show that statically typed (even "pure") functional languages are obviously better than plain old Python.
Could you elaborate more on this statement? Not a native english speaker here, so I don't quite understand the sentence. Thank you.
> I think the big big result from this study is: "Python: half the size !".
> A dynamic language like Python is better here, 2x better.
I was surprised by how small the LOC benefit was from using the dynamic languages. As someone who typically reaches for Python, I'd use a statically typed language (Go or Java, most likely) much more often if I expected only twice as many lines of code. In practice I feel the same project takes 3-10 times as many LOC and that pushes it to where it is more difficult to maintain and understand.
I have the same experience with Java. But I would have expected languages that have type-inference like Haskell, Scala and OCaml to do much better. But maybe their advanced features in fact make programs written in them harder to understand, which slows down development. Don't know.
Note my half the size estimate was after estimating the amount of code dedicated to extra features. The raw line count is only 0.7x the size of our non-test line count. Although my estimate of how many lines it would have taken without the extra features has very wide error bars.
"This is quite a quantitative study which gives credence to the claims of advocates of dynamic languages"
You have to be careful with that.
In a dynamic language your test system is your compiler. It is essentially a domain specific compiler. With a statically typed language, half your tests are already written, and you just have to activate them with a type signature.
For my money the trade off is down to whether forcing structures into a type system and getting the 'free advanced tests' is more advantageous than constructing a domain specific test harness 'compiler' and getting the flexibility of duck typing.
And you only learn that when you run up against the limits of the type system and have to start complicating structures to work around it.
In terms of a rough metric I'd suggest you have to include the test code in both cases to get a fair comparison.
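For what it's worth, here's a minimal Python sketch of the trade-off being described; mypy and the test function below are my own illustrative assumptions, not anything from the article:

```python
from typing import List, Optional

def head(items: List[int]) -> Optional[int]:
    # With the signature in place, a checker like mypy flags callers that
    # pass the wrong type or ignore the possible None -- the "already
    # written" tests you get by activating a type signature.
    return items[0] if items else None

# Without static types, the same guarantees get spelled out as runtime tests:
def test_head_contract():
    assert head([1, 2, 3]) == 1
    assert head([]) is None
```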
> I think the big big result from this study is: "Python: half the size !".
> This does not say that static typing should not be used but I think it shows unequivocally that there is a considerable extra development cost associated with static typing.
Are you saying that the bottleneck in software development is typing in text on a keyboard?
If so, I thoroughly disagree. In my experience, inputting text is a small factor in development cost, with the main cost being research (figuring out what to type in), and debugging (figuring out why something you typed in doesn’t work as you thought it would).
> I think the big big result from this study is: "Python: half the size!".
I would agree, if and only if I thought a representative sample of Python programmers would all produce something of a similar size and just as correct, but I suspect, in this case, it's the result of one especially talented person.
Raw talent and prior experience no doubt had something to do with it, but there's also the fact that the Python programmer was working alone and not particularly concerned about code quality or maintainability.
I agree (with the caveat that, as I wrote in another comment, I think being correct is an important aspect of quality), but this observation also argues against claims that attribute the outcome to the qualities of Python.
Indeed, design choices accounted for the biggest difference by far.
I like that the title frames it as a language shootout to pull people in to see if their favorite language wins (and I'm partial to Python having rewritten tens of thousands of lines of Java into numpy). Still, it would be foolish for people to come away from this brilliant analysis by ignoring the more important conclusion.
This comparison tells us absolutely nothing about the strengths and weaknesses of dynamically typed languages vs statically typed languages in real-world projects.
Obviously dynamic languages are better for toy projects.
If for you 4-5k loc is a project of “significant size” then we must have pretty different points of view.
How do we know the big result is "Python!" rather than "work alone!" or "women code better!" ? Because that was 3 features of that one sample. It's hard to make any conclusions from that article.
I also was surprised at the magnitude of the difference and am not sure the extra lines are worth it given the strong association between number of lines and number of defects.
Having said that, the largest point I took away was that the difference between languages was smaller than the difference between programmers and approaches.
I don't think the reason these languages are terser is that they are dynamic. Something that always frustrates me to no end is how many older statically typed languages barely have a literal syntax.
Literal syntax and stdlib size and APIs. The latter is especially important on iterator and container interfaces. They must fit together well and be composable.
This can be exemplified using Crystal, which is close to Ruby in both terseness and APIs, but statically typed (and with static dispatch).
The work was done under a deadline, and some teams did not complete all of the assignment in the allotted time. The Python programmer not only completed all the required and optional features, passing 100% of all tests (both the revealed and secret ones), but "also implemented more extra features (for fun) than any other team, including an SSA intermediate representation with register allocation and other optimizations."
That was a really interesting article. I completely agree with
>>> Abstractions may make things easier to extend in the future, or guard against certain types of errors, but they need to be considered against the fact that you may end up with 3 times the amount of code to understand and refactor, 3 times the amount of possible locations for bugs and less time left to spend on testing and further development.
Choosing when and how to abstract is key. Abstraction in a fashion extends the language so you are trading off a burden on the reader (who have to learn and trust the abstraction) for increased expressiveness. Don't overdo it. (And don't abstract idioms).
Writing the idiom out inline is idiomatic and immediately readable. With the macro I have to go look up the definition of P2ALIGN (and make sure I got the one actually used, as there may be multiple!), check that it does what I expect, and memorize it.
The value of abstracting the idiom is debatable. If it _is_ used sufficiently often, then it might be worthwhile, but often it isn't.
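To see the same trade-off outside of assembler macros, here's a hedged Python analogue; `p2align_up` is a made-up helper name, not something from the code being discussed:

```python
def p2align_up(x, n):
    """Round x up to the next multiple of 2**n."""
    return (x + (1 << n) - 1) & -(1 << n)

offset = 13
a = p2align_up(offset, 3)   # the abstraction: reader must look up the helper
b = (offset + 7) & ~7       # the idiom written inline: visible at a glance
assert a == b == 16
```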
I think thinking carefully about the abstractions is also key, ideally not while at the computer. Good abstractions should be obvious and shouldn't really require refactoring per se.
I really wish that when counting code people would use one of the more modern code counters, or at least cloc. Tokei, loc, scc, polyglot, loccount or gocloc give a much better idea than wc because they account for comments and blank lines, and in the case of scc and tokei, strings.
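For anyone who hasn't used these tools, here's a deliberately naive Python sketch of the kind of classification they do beyond `wc -l`; real counters also handle block comments, strings, nested comments and per-language rules, so treat this purely as an illustration:

```python
def sloc(path, comment_prefix="#"):
    """Split a file's lines into code, comment-only and blank counts."""
    code = comments = blank = 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            stripped = line.strip()
            if not stripped:
                blank += 1
            elif stripped.startswith(comment_prefix):
                comments += 1
            else:
                code += 1
    return code, comments, blank
```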
I had a similar experience in university. The class had to implement a modified Turing machine, and we could use whatever language we wanted. One person did it in C++ and it was several hundred lines. Another did it in Java, which was slightly smaller. I implemented mine in Python and it was small enough to print on a single piece of paper; I think it was something like 30 lines or so. I exploited break statements to make it terse but readable. It did mean I was marked down a bit for them, but it was by far the shortest program in the class.
A few groups did give me output from a modern line counter, often Tokei or loc. I mention in the post that the ratios between source lines, word count lines and bytes were pretty similar across all projects. If people want to get a sense of how that compares to their intuitions in source lines they can use the ratio from our Rust project to convert.
I ended up using raw lines for comparison because for relative measurement it didn't matter for the above reason, and I knew everyone had the same `wc` and I could get them to send me the output of the `wc {glob}` version that lists lines and bytes of all files so that I could drill down into which parts were different.
> I really wish that when counting code people would use one of the more modern code counters, or at least cloc. Tokei, loc, scc, polyglot, loccount or gocloc give a much better idea than wc because they account for comments and blank lines, and in the case of scc and tokei, strings.
And even that only tells some of the story e.g. do code counters count separators ({ or } alone on a line) as blank or as code? Are multiline strings (e.g. python docstrings) counted as code or comments?
Usually they count a { on a line by itself as code. Multi-line strings are code in all but scc and Tokei. Tokei counts them as comments or strings depending on user settings. Scc will count them as comments in the next few months.
For anyone interested in this sort of thing, there's also Unified Code Count (UCC) [1]. It has a lot of interesting design goals like being open and explicit about the counting rules, which is really useful if you want to predict things like cost and reliability.
I'd actually put UCC in the not-so-great category. Counters like scc and Tokei are getting close to having the same accuracy as a compiler when it comes to recognizing code, while being much faster.
They also support more languages and are updated far more often. They're very much second-generation tools that learnt from the first.
Your use cases seem to prioritize language support, update frequency, and speed (what do you mean by the accuracy part?). For this, Scc and Tokei would of course be better than UCC.
The (admittedly niche) use cases I described require understanding the counting rules very well, and keeping those rules stable. For this, scc and Tokei are as useless as anything else, while UCC does exactly what's needed.
Have a look at the tests in scc and tokei. They support nested multiline comments, string escapes and other odd language features. As such, both get very close to counting lines the way a full tokeniser used by a compiler or interpreter would, which makes them very accurate.
I see your point. I’d argue however the rules for counting should be language rules not some higher level generic set.
Hey trishume! It’s been a while (this is the Reason person).
Great post; it echoes all the experiences I/we’ve personally had, especially the alternative Rust section: on the spectrum of how much intelligence one needs to understand fancy abstractions, all of us programmers, good and bad, are pretty much lumped to the lower end. I’ve seen plenty of skilled folks “decompensate” their expertise by reaching for overly complex abstractions. The definition of a “strong” programmer, as per the post and usually in real world, should probably be rectified to include not only “the person knows how to use this abstraction”, but also “the person knows when to stop”.
In the same vein, it’d be interesting to know how Go would fare in this project. Go’s philosophy is pretty much the opposite of a language that’s usually used for compiler research, but I’ve seen indicators that using it could be surprisingly effective (and in light of this post, maybe less surprising).
More importantly, it’d be nice to know the perf characteristics of each project =)
> In the same vein, it’d be interesting to know how Go would fare in this project.
I would say a bit worse than the mentioned alternatives, which all have better type systems and thereby, e.g., make it easier to represent and manipulate the trees that are everywhere in compilers. But most likely it's still fine, and the line count metric would be influenced more by how experienced the author is than by the language.
Things will get worse in Go if one wants not to write a lexer/parser manually but to use tooling for that. Parser combinators and other such tools benefit a lot from generics.
On the other hand, writing e.g. a network server with decent performance will be a lot easier and more straightforward in Go than in any of the other mentioned languages. While all of those languages are general-purpose programming languages, they definitely have their strengths in different areas. Some are better for some tasks (e.g. compilers), others are better for other tasks.
I’m familiar with these positions (static types, ADTs, parsing, etc). But my comment about “less surprising” was meant for Python; it turned out that you can get pretty far with it, with no static types or ADTs. So it’s very possible that Go would fare not too badly. The reason I wondered about Go is that I’ve seen several HN posts about using Go for writing parsers/compilers with surprisingly OK results.
(I don’t advocate dropping static types or ADTs; just that, in the spirit of this blog post, it might be worthwhile to examine our assumptions.)
I've written a Pratt (TDOP) parser in Go and I thought it was a great experience. Only thing I wished was done for me by the language was that it would tell me the byte position of the cursor in a file stream.
I expect Go would end up like C++ without header files or OCaml without sum types and pattern matching, falling somewhere in between them. Although if errors are handled properly instead of by panic then it might really hurt in the line count from all the `if err != nil { return; }`
Also, I commented somewhere else in this thread re a perf comparison. Short answer: since there's no incentive towards performance, the signal would be swamped by differences in how much people avoid O(n^2) algorithms even when they don't need to.
I haven't made it through the whole thing yet, but I do want to register a vote in favor of using Lines Of Code count as a rough measure of program complexity. I think it's a perfectly valid thing to do, provided that it isn't used as an evaluation metric, nobody is gaming it, and everyone is a reasonably good programmer, not doing crazy things like trying to stuff a massive algorithm onto one line to be clever, or copy-pasting the same code in 20 places because they don't understand functions and classes.
I'd like to hear others' opinions: there's a guy at my work who loves to use doubly, triply, quadruply nested ternary operators. I always find them super hard to read. Am I just a dunce, or do I have a point in thinking they're unnecessarily terse?
Ternary operators can be fine. Nested ternary operators are hardly ever a good idea.
One thing that people writing this sort of code seem to miss is that it's not just about expressing the code as concisely as possible - other people including oneself in future need to be able to read it easily.
What I would recommend in a code review for anyone using nested ternary operators is just to break them up using meaningful variable names so that you have one ternary operator per statement. It'll be easier to read and the names will help understand what's going on more easily.
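A small Python sketch of that advice; the fee logic here is invented purely for illustration:

```python
is_member, is_promo, total = False, True, 250

# Nested conditional expressions, hard to scan:
fee = 0 if is_member else (5 if total < 100 else (2 if is_promo else 8))

# One conditional per statement, with the names doing the explaining:
small_order_fee = 5
promo_fee = 2 if is_promo else 8
base_fee = small_order_fee if total < 100 else promo_fee
fee = 0 if is_member else base_fee
```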
I'd encourage your coworker to use a language where if/else is an expression and not a statement. :) Of the languages mentioned in the article, Rust, Haskell, Scala, and OCaml all have this as the only form of if/else: you can write something like x = if foo then bar else baz, so you can nest them with some parentheses for readability. Python's if/else is a statement, but it has a slightly different syntax for the if/else as an expression - bar if foo else baz - which is a little less readable but still workable.
It's only the Algol-syntax-family languages (C, C++, Java, JavaScript, etc.) that have the inscrutable ternary operator and if/else as a statement.
Algol family syntax is not the same as C syntax. Algol 60 has if-then-else as an expression like "x := if foo then bar else baz". In Algol 68, an ENCLOSED-clause like if-then-else-fi is a PRIMARY, so it can be used anywhere other primaries like identifiers and denotations (literal constants) can be used. "Begin" and "end" are synonyms for ( and ). Both languages are defined on the level of symbols instead of bytes or characters. These symbols can have multiple representations including punctuation and text in non-English languages like Russian. A lot of people today don't know what Algol family syntax looks like which is a real shame because they're very clean and elegant languages.
I prefer less terse code for readability too. LOC is a decent approximation for complexity but it breaks down when people optimize for it.
It's a tricky problem to decide which idioms are most expressive/readable/maintainable, as it has a group dynamic. My rule of thumb: if I feel I've written clever code, it's time to rethink the approach.
Usually it's awful. I wish JavaScript had if/else expressions. Sometimes it can help to break the expression up across indented lines if you really need a single expression.
Personally I'm not a fan. A lot of people love using conditional operators (?, &&, ||) in React, but having overly nested chains of these operators is a code smell for me. Probably means they can refactor the logic into separate components, or into intermediate boolean variables. Or just into a plain old if statement.
Nope, even code with doubly, triply, or quadruply nested if statements is typically hard to read. I usually refactor to avoid that level of nesting.
I don't think there's ever a good excuse for that much nesting of ternary operators.
I love them. I also just managed to mess up a function because of one and not realize it until a late stage of testing
I don't know that I've doubly nested them.
The traditional method of measuring code complexity is Cyclomatic Complexity [1] which, roughly speaking, measures the number of branch points in the code.
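As a rough illustration of how that counting works (a toy function, not anything from the article):

```python
def classify(n):
    if n < 0:           # decision point 1
        return "negative"
    elif n == 0:        # decision point 2
        return "zero"
    for _ in range(n):  # decision point 3
        pass
    return "positive"

# 3 decision points -> cyclomatic complexity of 4 (decisions + 1), and that
# number stays the same however tersely or verbosely the code is written.
```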
When comparing different languages it becomes more of a measure of the relative abstraction levels of languages. The exact same program will have hundreds (or thousands) of times more LOC in Assembly than in Python.
Not only that, the LOC heuristic depends on the verbosity of the language. Java, for example, is more verbose, than say, Python, even though you’re close in abstraction level.
Assuming x and y are one line each but long enough expressions that you don't want to use a ternary operator (available in both), the Python is two-thirds the length of the C because of bracketing. Obviously cherry-picked, but I bet these differences add up.
Sure, and you can put the else on the same line as the end of the if block even without doing that. Or you could make the difference bigger by putting the opening brace on a newline.
I wrote the style of C that I actually write, but it's probably not fair for me to label it as a property of just the language.
I assume OP is in grad school. I took a few grad classes during my undergrad and the difference was night and day (teaching and cohort). So maybe your school was fine and you were just in the wrong classes. :)
Nope undergrad. UWaterloo just has a special co-op program that extends the degree and fits in six four-month internships in alternating school and work, for a total of two years of work experience by graduation.
The Scala result confirms my bias :P I love Scala because it strikes a great balance between being succinct and offering type safety.
I wish you would use Ruby instead of Python. Python is strangely inconsistent. For example, Python doesn't really offer a rich standard library; people have to resort to ugly solutions for a simple problem (here's an example: https://stackoverflow.com/questions/363944/python-idiom-to-r...)
Ruby would be a better example of how a dynamic language can be more succinct.
Your linked stackoverflow question - how do you return the first item of the list or None.
Proposed solution - `a[0] if a else None`
Your reaction - Python doesn't offer a rich standard library.
I disagree, and I suspect most programmers would too. We'd much rather have ergonomic libraries for making HTTP requests, parsing command-line arguments, datetimes, itertools, data structures like heaps, filesystem access, data archiving, data serialization and deserialization, and a million other nice-to-haves. It's actually at the point where I've heard criticism of Python's stdlib doing too many things, rather than too few.
If you don't want to write 19 characters to find the first item in a list, pick another language. But don't mischaracterise python's standard library.
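To give a flavour of the batteries being referred to, all of the following ship with the interpreter; the snippet is just a quick sampler I've put together, not something from the thread:

```python
import heapq, itertools, json, pathlib
from collections import deque

heap = [5, 1, 4]; heapq.heapify(heap)            # priority queue
pairs = list(itertools.product("ab", [1, 2]))    # iterator tooling
q = deque([1, 2, 3]); q.popleft()                # double-ended queue
data = json.loads('{"ok": true}')                # (de)serialization
files = list(pathlib.Path(".").glob("*.py"))     # filesystem access
```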
> But don't mischaracterise python's standard library.
My main point is Ruby's stdlib is richer than Python's.
(And that's not either better or worse. Some people prefer lighter stdlib, and that's fine.)
> If you don't want to write 19 characters to find the first item in a list, pick another language.
Yes, if Python supported `.first`, I wouldn't want to write the longer version of it.
Since the article focuses on brevity, I did propose another language here, which was Ruby. It would serve better at how succinct a dynamic-typed language can be because of its richer stdlib, especially when we compare a dynamic-typed lang with Scala, which has an extremely rich stdlib. Using Ruby would be a fairer comparison.
I gave one small example (`.first` vs `a[0] if a else None`) to illustrate my claim. Two more examples (from https://ruby-doc.org/core-2.4.1/Array.html) are `.rotate` and `.transpose`. I'm sure there are more examples around Hash and other data structures.
Python doesn't have these methods, so we have to write them ourselves. That makes the code even longer, since we also need to maintain them and write unit tests for them.
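To be concrete about what "make them ourselves" means, here are rough Python approximations of the three methods mentioned; edge-case behaviour won't match Ruby exactly, this is just to show the extra code involved:

```python
def first(xs, default=None):   # roughly Ruby's .first / Scala's .headOption
    return xs[0] if xs else default

def rotate(xs, n=1):           # roughly Ruby's Array#rotate
    n %= len(xs) or 1
    return xs[n:] + xs[:n]

def transpose(rows):           # roughly Ruby's Array#transpose
    return [list(col) for col in zip(*rows)]

assert first([]) is None
assert rotate([1, 2, 3]) == [2, 3, 1]
assert transpose([[1, 2], [3, 4]]) == [[1, 3], [2, 4]]
```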
I have also found Python’s stdlib to be laughably bad. Just compare the data structures available to Scala. I write Python everyday and honestly, it’s a chore. The language is primitive and inexpressive.
The sad part is that it totally didn’t have to be this way. But instead of evolving, Python is just stuck in the past.
In Python, it's `import itertools` and `list(itertools.chain.from_iterable(list2d))` or `[item for sublist in list2d for item in sublist]`. In Scala, it is `list2d.flatten`.
In fact, Python is against making a richer stdlib in general.
"It has been discussed ad nauseam on comp.lang.python. People seem to enjoy writing their own versions of flatten more than finding legitimate use cases that don't already have trivial solutions." from https://softwareengineering.stackexchange.com/questions/2542...
And, intuitively, when you don't want to provide a helper function, the user code gets longer.
Not that this is better or worse. It's just that, specifically on the brevity aspect, Python code would generally become longer. Because, as you can see in the quote, "People seem to enjoy writing their own versions of flatten".
Python has lists (arrays), dicts, sets, and tuples, and that's pretty much it for built-in collections.
There are no linked lists, no sorted maps or sorted sets, and no bitsets in the standard library; queues and priority queues only come as modules like collections.deque and heapq rather than as rich collection types, and the immutable options are basically just tuples, frozensets, and strings.
Moreover, scala has synchronized collections that can work over multiple threads. I guess all collections work that way in Python, but that's because Python doesn't even support true thread parallelism in the first place!
Also, if we're talking about the number of methods on the collections, Scala has way way more. map/foreach/filter/foldl/foldr/option/drop/take/first/last etc etc.
Python is usually considered to have one of the richest standard libraries out there.
Yes, there is no standard way to get this one edge case, but every language has warts and things like that. It sounds like you are just looking for an excuse to hate on Python.
> It sounds like you are just looking for an excuse to hate on Python.
That's a big claim to make based on a couple of assertions that are neither about the quality of Python as a language nor the quality of its standard library, merely observations and comparisons.
Are you sure it's not you who is upset to see Python 'attacked' (for lack of a better term) in any way?
> but every language has warts and things like that
Sure, but we are talking about the brevity of a language. The richness of the standard library directly impacts brevity.
So, Ruby would have been a better representative (for brevity) of a high-level dynamic-typed language.
> there is no standard way to get this one edge case
I don't think `.first` is an edge case. It is used fairly often. One example that I can think of right now is when you want to fetch the first row from MySQL. MySQL returns an array, and you would need `.first` to get the first row or null.
> It sounds like you are just looking for an excuse to hate on Python
Not at all. While I don't prefer Python, I recognize there's a downside to a richer standard library. The language becomes more complex; harder to learn.
Richest is relative. If we consider Python against every other programming language, then, yes, it's ONE of the richest.
If we consider Python vs. Ruby vs. Scala, I doubt Python would be considered as richer or richest.
For example, Scala offers a lot of ways to process a list, like foldLeft, foldRight, unzip, headOption, lastOption, flatMap, groupBy, and many methods around Map, Set, etc. There are many examples where the Ruby version would result in shorter code, like `array.delete_if`. Python probably doesn't offer many of these methods.
If you're into this sort of thing, definitely check out the ICFP Programming Contest -- take next Friday off and join in on the fun! https://icfpcontest2019.github.io/
Has anybody made a study of the corpus of ICFP contest entries? Seems like an interesting dataset with many languages, many programmers competing multiple times, etc.
I mean, if you design an esolang for the competition task and implement an optimizing compiler that's faster than any general purpose language out there, that absolutely feels like something that should be rewarded by the spirit of the competition! So it's okay that it's allowed by the letter of the competition, too.
My professor was definitely a good professor and no way they would have done that. It clearly would have been done as an added challenge for themselves for fun and there's no reason to penalize that.
It's definitely fun: you can also take part of the challenge to be figuring out what parts of the full language are worth adding to make the overall job easier (so you're not taking the language subset as set in stone). When I did this for Python I started out in unrestricted Python, gradually both increasing what's implemented and reducing what's used. (Result at https://codewords.recurse.com/issues/seven/dragon-taming-wit...)
I doubt I will ever be a professor, but I would be happy to pass any student who did this provided they did the task adequately. Maybe extra points for cheekiness.
Maybe no one thought of the immense satisfaction they would get at the end when they compile their compiler to x86 code after compiling it with javac to bytecode to run it the first time.
I conducted a similar exercise, albeit on a much smaller scale, when I was "language shopping." I took a project that I had written in Visual Basic (my go-to language at the time), and re-wrote it in a variety of programming environments.
The project was just complex enough to require maybe 100-200 lines of code at the end of the day, thanks to liberal use of libraries. Not huge, but not atypical for a "scientific" programmer who uses code as a way to solve problems rather than to create software for others to use. It's pretty representative of my life as a programmer.
The exercise forced me to learn enough of each language to get a feel for it, and then I could look at the programs alongside one another to assess their strengths. I also imposed some rules, such as that the language had to run on multiple platforms, and be FOSS. In addition to comparing the languages, I was also implicitly comparing libraries and even access to online help. This was also my first real exposure to StackOverflow.
I tested Javascript, Python, GNU Octave, and wxMaxima. Ultimately Python won out and is my language of choice today. This was just my own little exercise, and not worth publishing, but has made me a believer in learning multiple languages.
I like the idea of using this project to compare languages.
However, to me it seems the comparison is more of a comparison between specific implementations rather than languages. The high variance in LOC between the two rust implementations makes me wonder if there's a similar variance for other languages and the samples we have lie somewhere between e.g. 0.5 and 2 times the average size for a given language.
Therefore, the python implementation could just be a really compact one, even when compared to other python implementations. It could be interesting to ask the Professor if he'd be willing to collect some stats over the years.
These comparisons are actively harmful. What the OP is doing is the same as trying to determine the winner of a race by looking at the competitors at the start: you don't know their speed, their strengths, etc. You can't determine how maintainable the code you produce this way is, how it performs under real-world usage, and so on. Please stop doing these.
I feel like what you're actually testing with this is "amongst the top percentile of programmers, what are the proclivities of people by choice of favourite language"
Because the implementations vary so much, that variation is the source of a lot of the LOC difference. They effectively delivered more or less (if you count more stages as "more", which I would, since it makes the compiler easier to reason about and debug, and if you count a type system as "more", since it gives you more guarantees).
python - solves the problem brutally fast, some ugly shortcuts
haskell - solves the problem quite slowly/delivered the most
rust/c++ - intermediate
scala - solved fast, took shortcuts
ocaml - this is the one that surprised me; I'd have expected it to be the shortest along with python
I find it curious that both the team that used C++, and the author, both appear to believe that sum types and related facilities are not usable or conveniently expressible in C++.
I got that Boost Spirit, a powerful parsing library, was forbidden. Were all the Boost libraries similarly embargoed?
I regularly use Boost Spirit as an example of "sounds good but actually a nightmare", and am always surprised to see it mentioned in the wild. It is truly a modern horror.
Boost is like Apache -- a collection of libraries of various quality and stage of development, not all alike. A lot of stuff in there is designed to prove out experimental language features ahead of the next standard revision. A lot of that stuff is convenient, but a complexity nightmare under the hood.
For your own sanity, and of those you care about, don't accept all of Boost equally. Be suspicious, and treat each package as if it were some random library you found on the internet.
(There are of course tons of great things in Boost, but tread carefully)
> I regularly use Boost Spirit as an example of "sounds good but actually a nightmare", and am always surprised to see it mentioned in the wild. It is truly a modern horror.
huh, I have written a few parsers (half a dozen maybe?) with spirit and absolutely don't regret it - why would you say that?
Yeah, the issue is whether the language makes it easy to use them at all levels, from syntax and features (simple constructors, pattern matching) to use in the stdlib and the wider community.
There was a general prohibition on libraries that don't ship with the compiler, but you could ask the professor for an exception. Someone asked about Boost and a blanket exception for the non-parsing and scanning parts of Boost was granted. However the C++ team I talked to didn't use it.
Also note that it's not only the sum types that are important, powerful pattern matching facilities are part of what makes them so valuable.
C++ has had sum types for years, so I'm not sure why that was mentioned. If you aren't familiar/experienced with the language, then the comparison is not fair overall.
It is correct to note that C++ lacks some core language features that many newer languages have, tailored to more convenient and, often, safer use of slippery sum types. Sum types are a step away from lexical typing, and without that support add risk that strong typing was invented to contain.
There is an active project to have a powerful pattern-matching primitive ready in time to adopt into C++23. In the meantime, C++17 supports "structured bindings" that are often helpful.
Yeah it's had everything for years. I'd rather shoot myself than use those features outside of a modern language where they aren't just bolted on along a billion other things.
I mean, they aren't. The ergonomics of sum types in C++ are terrible, and they end up being useful in far fewer situations than in nearly any other language that supports them.
I think the right metric to measure here is not quantitative, but more qualitative.
In other words, how easy it is to adapt the language to the problem you are trying to solve.
So, in order to do that you need to have a good understanding of the principles each language is based on. Once you've got that, then you look at the resulting code and "measure" how easy it is to understand.
As was already said, this requires having someone with a very good understanding of the underpinnings of each language, which is not really going to be reasonable for most people.
The problem is that if you simply tried to port an idiomatic solution from e.g. C++ or Python into Haskell or Scala, or whatever, then you would probably end up with something very ugly, because you didn't adapt the language to your problem. You tried to force it to do something it wasn't necessarily designed to do.
Love reading your posts, Tristan. A question about the choice of Rust: was it specifically based on the previous experience you and your groupmates drew on, or were languages assigned to students? Was there any point in time when you and your team were trying to optimize your compiler?
We got to choose. Everyone on our team had some experience in Rust and OCaml, the two of us more experienced in Rust were pretty indifferent between OCaml and Rust, and our third teammate wanted to use the project to learn more Rust so that's what we went with.
Another big takeaway is the reusability of integration tests.
Write once, verify implementations in any language.
Static or dynamic typing doesn't matter that much if you can verify your software, and there are many different reasons to choose a particular programming language.
You need to look at the whole picture: time taken from day one to release, performance and correctness, then maintenance and onboarding, how easy it is to refactor or add features, etc. It's virtually impossible to definitively conclude in absolute terms that one language is superior to another.
One of my best experiences was the dovetailing that happened on a project that had to be C or C++, where I prototyped a fully working system in Ruby first, then rewrote it in C. Everything went almost too smoothly.
This is an interesting writeup, but I'd say the groups settling on different algorithms matters much more than the language used. It is obviously a pretty big difference whether you use a recursive descent or an LR parser.
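Agreed. For anyone who hasn't seen both approaches, here's a toy recursive-descent sketch in Python, over a three-token expression grammar rather than anything resembling the course's language, just to show how directly a grammar maps onto hand-written code:

```python
def parse_expr(tokens, pos=0):
    # expr := term ("+" term)*
    node, pos = parse_term(tokens, pos)
    while pos < len(tokens) and tokens[pos] == "+":
        rhs, pos = parse_term(tokens, pos + 1)
        node = ("+", node, rhs)
    return node, pos

def parse_term(tokens, pos):
    # term := number | "(" expr ")"
    tok = tokens[pos]
    if tok == "(":
        node, pos = parse_expr(tokens, pos + 1)
        return node, pos + 1          # skip the closing ")"
    return ("num", int(tok)), pos + 1

tree, _ = parse_expr("1 + ( 2 + 3 )".split())
assert tree == ("+", ("num", 1), ("+", ("num", 2), ("num", 3)))
```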
It would be interesting to see experts in different technologies compete, in front of a live audience, to implement or change a variety of projects. It would be useful to see what a master can do with each tool or platform.
It would be interesting and educational to see how they would do it, definitely. But when comparing languages it makes sense to compare how well average people perform with them. We can't always hire an expert; they are not universally available.
P.S.: I'm building a relational language and my ideas closely mimic "writing-a-compiler-in-rust", i.e. using Pratt parsers and similar, but I haven't seen much on how to do that in Rust...
None of it is open source because it was done for a school course that uses the same project every term. The only use for having access to the code would be cheating on that project. The school gets mad when people publish the source for their school projects, especially when it isn't otherwise useful, that's also why my ray tracer isn't open source.
I would not agree that that is the only use of that code. I’ve found a lot of school projects on GitHub when googling how to do stuff as a professional programmer and have found them useful quite often.
Being a teacher myself, I understand this policy. Too bad, though; I'd have dearly loved to read the Python implementation in order to rewrite it in Clojure. I think the language could allow for something at least as concise and maybe cleaner.
The bit about "no parsing helpers even if they’re in the standard library" makes me wonder about using DCGs, but if disallowed re-implementation would be straightforward and add only a small constant to the code volume.
I would like to see metrics on speed of compilation and size of resulting asm.
I've condensed the relative source code sizes and notes for quick ref:
Rust_1 1.0 using recursive descent, visitor
Haskell 1.0-1.6x depending on how you count for interesting reasons
C++ 1.4x for mundane reasons
Python 0.5x fancy metaprogramming, dynamic typing, single author
Rust_2 3x different design decisions
Scala 0.7x LR table generator, online Java grammar
OCaml 1.0-1.6x depending on how you count, similar to Haskell
> AST visitors and recursive descent [...] weren’t taught in the course
Similar implementations in Rust(1) and C++ are roughly the same
Similar implementations in Haskell and OCaml are roughly the same
A similar implementation in Rust(2) to one in Haskell/OCaml is roughly 2x-3x
Don't underestimate dynamic typing, metaprogramming, and individual effectiveness
But for real world, long lived projects I still promote static typing with the exception of startup development (MVP/product validation, early iterations) of new product ideas.
Interesting, but nothing to make any conclusions from. Too many variables to be of any use. Almost everything was different about these projects (different parsers, error handling, features, developer skill, team size).
The actual effort of writing one line of code in a given language isn't mentioned in the article.
Writing one line of Rust is far more difficult than writing one line of Python, or even one of C++.
I'm not clear whether the SLOC counts are for the application itself or including their test harnesses. It would be good to have a table of results with this data.
Luckily, they don't bring with them a lot of cognitive overhead for the developer, so their presence masks the expressiveness of OCaml in LOC stats IMHO. I use interface files to make public signatures explicit, abstract away some types and write thorough doc-comments. I'd be tempted to exclude them from such a comparison to better relate code size with programmer efficiency, especially when comparing to a language like Python.
On the other hand, in languages like C we couldn't exclude header files because they include macros. (Although I suppose a fancier comparison could count macros and exclude function signatures.)
Rust seems so ugly ... makes Perl line noise look like paradise. I guess it's that time of the year ... when I have to learn an esoteric language to keep my sanity.
I wonder how Racket would compare. I’ve written some DSLs at an old job for Racket and found the libraries for language design almost unreasonably powerful (and maybe unreasonably complicated as well). The dynamic typing would also help in reducing the amount of code.
Yah I figure it would be like Python but better if you were allowed to use the language design libraries. However the course specifically forbids standard library components that are designed for implementing languages so that you have to write the parser and things yourself.
But you could use Racket if you wrote your own parser generator macro as part of the project's code?
And would you be allowed to use Racket's existing facilities for representing and implementing macros?
(You might want to use the macro stuff for your AST, IR, and transformations. Though you could still win with normal data types, but the macro stuff can help you win even more.)
> Since my team had all interned at Jane Street the other language we considered using was OCaml, we decided on Rust but I was curious about how OCaml might have turned out so I talked to someone else I knew had interned at Jane Street and they indeed did their compiler in OCaml with two other former Jane Street interns.
Relatedly this is why I get annoyed at people that don’t believe 10x or Nx developers exist - they absolutely do!
In the 90's, the Swedish Ericsson and German Siemens companies participated in a joint (crash) project. Each sent 250 engineers. After six months, they delivered. A friend counted up lines of code written by each engineer, at the end. Fully half the code in the final product was written by one person: N lines by him, N lines by the other 499 people.
He was a lead programmer, who issued two-week assignments to other engineers; if one wasn't done come Friday, he would do it himself on Saturday.
He doesn't consider himself especially fast, because he knows someone else who codes ten times as fast, and wears out two keyboards per year.
This project was also interesting because they took blood samples, by which they got objective measurements of stress. Everyone's stress level increased right up to the deadline -- except his. And their stress levels did not start falling until many months after.
Price's Law - that the square root of the number of people in an organization do half the work - predicts that about 22 people should have done half of it, regardless of anything about 10x engineers.
It also predicts that the N lines by the remaining people would be half done by about 22 of them and half done by 4xx of them; I wonder if that pattern was seen?
Even if the original engineer was 10x as productive, that would leave 12 people doing half of it, not 1. It makes me think that if the average team worked twenty days a month for six months the project should have been 60,000 units of goodness, and if he was as productive as all of them, 120,000 units of goodness. If, instead, he wrote rushed low-quality code, they spent a week trying to untangle and integrate with it, then he ignored their work and rewrote it himself, demoralizing the team and refusing to play nice with them, it might have reduced the overall units of goodness way below what it could have been. And at the same time he gets the boost of writing it all himself his way and not having to care about making it stable for others to work on, so he looks disproportionately better for that as well. What did the other team members say about it? What happened to the project after?
> He was a lead programmer, who issued two-week assignments to other engineers
You ought to expect a lead programmer to be better, otherwise why would they get and deserve that position? You'd also expect a leader to be way better than assigning work people can't complete, not working with them, not stopping doing that, then taking over from them. What would a 10x exceptional leader who wasn't a worker, get out of a team of 500 average workers?
Jane Street only interviews and hires really exceptional engineers. A team of exceptional engineers in a single CS class, even if it is at a school like Waterloo, would imply that the 10x phenomenon isn't a myth but a somewhat mundane reality in certain circles.
"Their project was 17,211 raw lines, 15k source lines, and 637kb"
So a couple of students cranked out 15K lines of code, in basically a subset of their study time, for a single class?
I don't doubt they did it, what I doubt is the quality of what they wrote. Because it's 'just a project' it doesn't have to be fully debugged ...
But that is a vast amount of code to hand-write over a short period of time.
At that level, I think it's basically just 'code like the wind' with not much consideration for whether it works, architecture etc. etc. - which in a way might put the conclusions of the 'mini study' at odds.
At 15K LOC dashed out very quickly ... these are not products, they're comparing 'rapidly written out lines of code', which is something else entirely from working code.
Also, given that everyone is in Uni, and likely may not have been exposed to proper idiomatic code for the language ... this presents another issue. Ideally we'd want to compare reasonably idiomatic code for one language, to another.
Kudos to the author for this, but we should be aware of some caveats.
Note that that project is the outlier for how much code it contained. That team did have the most trouble getting things done on time and fixing all the bugs, so they passed the fewest tests, as one would expect given how much code their design choices required them to write. I think the causality goes from their design to the line count, not from the line count to worse design; they had to write lots of code to implement their more abstract design choices. I looked at their code and didn't see any obvious deficiencies in anything other than the design.
Also note that UWaterloo's CS and SE programs are different from those at other universities. Everyone on all the teams had at least two years of full-time work experience across six internships. The people I talked to were also programming enthusiasts who read a lot online. The only teams that may not have written the most idiomatic code are the person who used Python alone, and the Haskell team, because I've heard idiomatic high-end Haskell involves an insane amount of knowledge of abstractions like lens, way more than all the other languages.
This compilers class is also well known for requiring lots of work, so students often arrange their schedules to have a low load from other classes while taking it, which is possible since the CS program requirements have a lot of flexibility.
Two years of work experience isn't going to make them experts in a language, especially when it's internship experience. Languages like C++ take a minimum of five years of professional programming to become competent in, I've found.
My office was across the street from UW for years, and I've hired handfuls of you as interns and full-timers. You're great.
But - '6 internships' is not quite enough to do this comprehensively, as exhibited by someone blasting out 15K LOC (I'm still reeling at that).
I've worked on a number of projects in C++ and I'm pretty sure that I'm not very good at it, even with many years of experience in other languages, for example.
The authors deserve a lot of credit, but I don't think much can be concluded from this ... it's just not the right situation.
FYI - this kind of comparison is very difficult to do even for the most experienced.
I can hear all the Haskell fanboys screaming that they didn't do Haskell right because a kleisli arrow didn't appear and ascend them into monadic heaven.
To be fair, I don’t think any Haskeller would actually write this project without attoparsec and maybe lens by choice. They’re basically base libraries for this sort of thing.
Agreed. The rule about not using any library that doesn't ship with the compiler creates a heavy bias in favor of "batteries included" languages like Python. I understand not allowing parsing libraries, given the nature of the assignment, but if even widely used libraries like lens are off-limits the result is not going to resemble idiomatic Haskell. As a matter of basic fairness, if some library or built-in language feature is to be permitted for one implementation then analogous libraries should be permitted for all the other implementations regardless of whether they happen to ship with the compiler.
The eval function leveraged heavily by the Python version should probably also have been off-limits, for the same reason that parsing libraries were prohibited. Using eval amounts to embedding an existing fully-developed parser and runtime environment into the project as a library.
I think a much more interesting comparison would be to ask an expert in each one of these languages to implement a "production" ready version of the project - so allowing popular, stable, community-accepted libraries to be used.
Then we'd see what truly idiomatic solutions to this look like.
Prohibition was an extremist political movement. Widespread public rejection of it, in the form of stills, speakeasies and other means, proved it to be a law without public support, resulting in its repeal within 10 years - an absolute public rejection of a limit unsought by the majority. The OP's POV has zero basis in fact.