Whether enforced compile-time strong type checking is a benefit seems to depend on the programmer. It apparently helps some people; it certainly does not help me.
For what it's worth, I've been building "production" (is "production" something you make money on? I find this word increasingly vague) systems with both Common Lisp and Clojure for quite some time now. I prefer Clojure. But both languages are lisps. My thoughts so far: you can build spaghetti code in any language. You can either use a language that lets you run your spaghetti quickly, or one that whacks you on the head repeatedly thus making your spaghetti stiffer and straighter. But you end up with spaghetti anyway.
I agree that it is difficult to write good lisp code. In my code, I spend a lot of time thinking about contracts and data structures. If I am not careful, I end up with problems later on. But using a language like Java doesn't solve that: you just get the illusion of "better code", because your spaghetti design is now codified into neat factories and patterns.
The advantage of using a language from the lisp family is that reworking your spaghetti into something better is much easier, if only because there is so much less code.
Have you ever worked on a 1M line code base? Or with dozens (or hundreds!) of developers?
Once the team gets big, or the code gets so big you can't hold it all in your head, any machine-enforced QC can have a major impact.
(I'm working on a very large Haskell code base now -- the parts that tend to cause trouble are the dynamically typed bits -- they can just go wrong in so many unanticipated ways, and the complexity means it's almost unavoidable that devs miss things.)
I'm increasingly convinced that to wrangle massive code bases, you need both heavy abstraction abilities (higher order functions, DSLs, algebraic data types, polymorphism) and very strong typing to keep the complexity under control.
A 1M line codebase can mean many things. It can mean code that belongs to many systems mashed together as if it were a single thing. It can mean that code that should belong to different systems is tightly coupled into one monolithic entity that should have been several little entities. It can mean lots of boilerplate code too. It can indicate a lack of timely refactoring on an aging codebase so interconnected that nobody has the courage to separate it into more manageable pieces. None of those can be solved by clever language choices alone.
You also mention things going wrong in unanticipated ways. This may signal that the code where errors bubble up is not the problem at all: it is called by code written by people who don't really understand what the functions do and who probably didn't write adequate tests for that, because the tests should catch the unanticipated parts. The problematic code is the code calling the parts where errors bubble up. The canary is not responsible for the gases in the mine.
While you may be right that, in order to deal with multi-million-line codebases, you need static typing, I'd much rather split that codebase into smaller units that could be more easily managed.
Wearing a straitjacket is often a consequence of an underlying condition that can, sometimes, be corrected.
I would expect that dons is talking about a 1M codebase which already consists of manageable pieces. It's only that the pieces have to work together, talk to each other, know about each other (but not too much).
Sometimes software solves problems that provoke incidental complexity through sheer size, and my experience (albeit not above 500k) tells me that every bit helps, including compiler-enforced type checks. I would never bet my life on tests. As you write: "because the tests should catch the unanticipated parts". That's the point: tests never catch unanticipated parts, by their very nature. Sometimes, by sheer luck, yes.
A compiler can only go so far. It'll happily compile:
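#include <stdio.h>
int do_something(FILE *f) {
    return fprintf(f, "This should work if you know what you're doing");
}
int main(int argc, char *argv[]) {
    FILE *fp = fopen("outfile", "w"); /* may be NULL; nothing forces a check */
    return do_something(fp) < 0;
}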
The compiler will happily compile that because C's standard library thinks it's a fantastic idea to just throw type-safety out the window. Even C libraries can protect themselves from this via strong typing instead of overloading what FILE* effectively means.
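A minimal sketch of what that kind of protection could look like in C. The wrapper types out_file and in_file and the out_file_* functions are hypothetical names made up for illustration; nothing like them exists in the standard library. Because the two structs are distinct types, handing a read handle to a write function is rejected by the compiler, and a failed open is reported through a return value rather than a NULL stream.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical wrapper types: distinct structs, so they cannot be mixed up. */
typedef struct { FILE *fp; } out_file;   /* a stream opened for writing */
typedef struct { FILE *fp; } in_file;    /* a stream opened for reading */

/* Opens path for writing. Returns 0 and fills *out on success, -1 on failure. */
int out_file_open(const char *path, out_file *out)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    out->fp = fp;
    return 0;
}

/* Accepts only an out_file; passing an in_file does not compile. */
int out_file_puts(out_file f, const char *text)
{
    /* C cannot stop a caller from skipping out_file_open, so check anyway. */
    assert(f.fp != NULL);
    return fputs(text, f.fp) == EOF ? -1 : 0;
}

/* Closes the underlying stream. Returns 0 on success, -1 on failure. */
int out_file_close(out_file f)
{
    return fclose(f.fp) == 0 ? 0 : -1;
}
```

This is only an illustration of the idea: a caller can still ignore the return value of out_file_open, which is why the assert stays in.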
I think you mean contracts or defensive coding. If someone calls my code wrong they will have to think to test that particular case themselves, which is hard. Unless I've written contracts; then when it blows up they'll know what they did wrong.
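As a rough illustration of the contract idea, here is a sketch in C using the standard assert macro. The function mean and its preconditions are invented for the example. When a caller breaks the contract, the failed assertion names the violated condition instead of letting the program crash somewhere deeper.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example function. Contract: values is non-NULL and n > 0. */
double mean(const double *values, size_t n)
{
    /* The string literal is always true; it only makes the failure readable. */
    assert(values != NULL && "mean: values must not be NULL");
    assert(n > 0 && "mean: n must be greater than zero");

    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += values[i];
    return sum / (double)n;
}
```

Note that assert is compiled out when NDEBUG is defined, so this is a development-time check rather than a guarantee in release builds.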
The question we have to answer to properly understand what went wrong is where the argument originated. If it's being generated inside function A that then calls function B with it, a test of A should fail when it calls B with the wrong argument. In any case, I would imagine the test coverage in the A-B system is lower than it should be.
The biggest programs I've worked on are way smaller than 1MLOC, but I've dug around a few large code bases written in dynamic languages (Emacs most notably: 1M lines of Lisp, 1/4M lines of C), and my unproven theory of large systems is that it might be a good idea to make mistakes have a smaller impact rather than try to avoid them (they are inevitable), by focusing on better abstractions and thus keeping problems more local, and by focusing on the protocols rather than on the implementation. Static typing might help here, but designing a protocol/interface/API/DSL is hard and I need to experiment a lot with it, so I'm willing to sacrifice some safety for better overall design and flexibility (which will hopefully make debugging easier). IOW I agree you need DSLs, polymorphism and all that good jazz. Where my disagreement comes in is that it's a trade-off, not necessarily an absolute need to have a lot of static typing. In my personal work I'm willing to make the trade-off and work with clay rather than marble. Other people work and think differently and I accept that; absolute truths are rare in computer science and don't exist in software engineering. It's all trade-offs here :)
P.S. My hobby vaporware CL project will probably be called mudball; I'm trying to learn how to design systems with hackability and extreme flexibility in mind :)
> Whether enforced compile-time strong type checking is a benefit seems to depend on the programmer.
And, due to lisp's nature, you can have static type checking as a library, e.g. there's some work on it for Clojure https://github.com/frenchy64/typed-clojure, and AFAIK it's based on work already done by Typed Racket/Scheme.
Also I believe that because Clojure is a lisp, has very few special forms, is immutable by default, has nice ns/var semantics and overall focuses on simplicity, you could build quality code analysis tools with relative ease: something that searches code for common error patterns, maybe even does verification outside the type system with a custom DSL. It's something that I would like to explore in a few months, after I finish my current project.
> But using a language like Java doesn't solve that
The article isn't talking about languages like Java; it's talking about languages with good static type systems.
I don't know if it's possible in Clojure without giving up on Java interop entirely, but good static type systems in other languages can make null pointer exceptions impossible, among other things. If you tell me null pointers aren't a problem for you personally, I'll find that difficult to believe.
If by null pointers you mean that some data is nil where it should not be, then certainly — they are a problem. In fact this is probably the biggest class of problems I encounter on a daily basis.
I still think you don't need a type system to address this. You could even argue that "nullness" is not a type, it's a data value (an out-of-band data value, shall we say). The way I deal with it is tests for corner cases (I always try to have tests that demonstrate what a piece of code does with null parameters), preconditions and assertions.
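A small sketch of that approach in C, with a hypothetical function name_length invented for the example. NULL is treated as an out-of-band value with a defined result, and the corner-case test documents what the code does with it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical example: length of a name, treating NULL as "no name". */
size_t name_length(const char *name)
{
    if (name == NULL)   /* the out-of-band value gets a defined result */
        return 0;
    return strlen(name);
}

/* Corner-case tests that demonstrate the behaviour for NULL and "". */
void test_name_length(void)
{
    assert(name_length(NULL) == 0);
    assert(name_length("") == 0);
    assert(name_length("lisp") == 4);
}
```

Whether NULL should map to a defined result or trip an assertion is a design choice per function; the point is that a test pins the choice down.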
I agree that good static type systems can be of help; that is what I meant when I wrote that they seem to help some programmers. I don't dispute that. I dispute the other claim that I understood from the article: that languages that enforce a rigid structure are strictly necessary for "production software".