Diminishing returns of static typing (merovius.de)
434 points by robgering 9 months ago | 617 comments



There are 3 main areas of interest in the discussion of benefits of static vs dynamic typing.

- Quality (How many bugs)

- Dev time (How fast to develop)

- Maintainability (how easy it is to maintain and adapt for years, by people other than the authors)

The argument is often that there is no formal evidence for static typing one way or the other. Proponents of dynamic typing often argue that Quality is not demonstrably worse, while dev time is shorter. Few of these formal studies, however, look at software over the longer term (10-20 years); they look at simple defect rates and development hours.

So too much focus is spent on the first two (which might not even be two separate items, as quality is certainly related to development speed and time to ship). But in my experience those two factors aren't even important compared to the third. For any code base that isn't a throwaway like a one-off script or similar, say 10 or 20 years of maintenance, the ability to maintain/change/refactor/adapt the code far outweighs the other factors. My own experience says it's much (much) easier to make quick and large-scale refactorings in static code bases than dynamic ones. I doubt there will ever be any formal evidence of this, because you can't make good experiments with those time frames.


> For any code base that isn't a throwaway like a one-off script or similar, say 10 or 20 years maintenance

I think one of our problems is that people have downgraded the importance of this. Much code nowadays (rightly or wrongly) is considered "disposable" - people think that the likelihood of any given piece of code they are writing surviving more than a few years is negligible. It is a natural assumption when you see the deluge of new technologies, hype cycles, etc. It is further reinforced by the fact that people's empirical experience is that a huge amount of their software work is abandoned, rewritten, outdated, obsoleted, etc.

I think these views are horribly mistaken, because at a deeper level even if 90% of code gets abandoned, the quality of the 10% that survives is still going to determine your maintenance cost. And half the reason we keep throwing code away is because it was created without consciousness of maintainability - it is so easy to say that the last person's code was garbage, so we are going to rewrite it because that is faster than understanding and then fixing the bugs in what they wrote.

I observe this in myself: my favorite language to code in is Groovy - a dynamic, scripting language with all kinds of fancy tricks. But my favorite language to decode is Java. Because it is so simple, boring, there is almost nothing clever it can do. Every type declared, exception thrown, etc. is completely visible in front of me.


> people think that the likelihood of any given piece of code they are writing surviving more than a few years is negligible

Well, most code is written by relatively inexperienced developers, who have not had to retire a system or support a legacy one, and don't know what should be sought out & what should be avoided when designing a system. Thus, they make decisions with limited information to solve the problem at hand, and only later find out the implications of those decisions when someone wants to (say) deploy it as a dockerized service on k8s.

It's one thing to read The Mythical Man Month, and another to write a replacement system that stops providing business value after 30 months and needs to be rewritten to support the current needs.

> it is so easy to say that the last person's code was garbage, so we are going to rewrite it because that is faster than understanding and then fixing the bugs in what they wrote

There's no black and white answer here: sometimes the code is so convoluted (or in the wrong language) that it has to be rewritten; sometimes the design of the system strongly resists changes in behaviour & so much of it needs to be made more flexible that an incremental improvement would cost about the same as a full rewrite.


> And half the reason we keep throwing code away is because it was created without consciousness of maintainability.

Well, this has nothing to do with static vs dynamic typing. You can write unmaintainable code in static languages very easily. I completely agree that in startups developers often overlook maintainability, but that's because everyone knows the code you are writing today might not be needed 2 years down the line; you are mostly iterating to find PMF (product-market fit).


It's not totally divorced from the static vs dynamic argument. One of the main arguments people deploy for dynamic typing is that most type declarations are boilerplate: they take time to write but don't add any value. But the reality is that they do add value, because they enhance the readability and maintainability of the code (and this is one reason I'm not even a great fan of type inference in many situations). So it comes back to the value of maintainability vs getting your first iteration of the code to work.
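
To make the readability point concrete, a minimal Python sketch (the function and types are invented for illustration):

    from decimal import Decimal

    # Unannotated: the reader must guess what `items` holds and what
    # kind of number `discount` is.
    def total(items, discount):
        return sum(price for _, price in items) * (1 - discount)

    # Annotated (Python 3.9+ syntax): the signature itself documents the
    # contract, and a type checker keeps it honest as the code evolves.
    def total_typed(items: list[tuple[str, Decimal]],
                    discount: Decimal) -> Decimal:
        return sum((price for _, price in items), Decimal(0)) * (1 - discount)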


It's also caused by the fact that nowadays a lot (most?) of the code that gets written is for the web or mobile, where technology changes incredibly fast.


In my experience with growing companies, even company-critical code bases get rewritten within 3-4 years to account for flexibility that the previous strongly-typed system just can't handle. A well designed system uses strong types for the "knowns" but allows changes via dynamic types for the "unknowns". Those are the systems that last.


> I observe this in myself: my favorite language to code in is Groovy - a dynamic, scripting language with all kinds of fancy tricks. But my favorite language to decode is Java. Because it is so simple, boring, there is almost nothing clever it can do. Every type declared, exception thrown, etc. is completely visible in front of me.

one of my favorite things about groovy is that it's easy to start strongly typing things as your code shapes up, because it allows for totally dynamic types, but it also allows for strong static typing. haven't really had the chance to use groovy since 2012, though.


Static typing was grafted onto Apache Groovy in 2012, but no-one really uses it. I'm not sure about its reliability -- its use never took off on the Android platform, and none of the Groovy codebase itself has ever been rewritten in static Groovy.

Groovy's still great for scripting on the JVM though, for stuff like those 10-liner build scripts for Gradle, glue code, and mock testing. Just don't use Groovy for building systems -- use a language based on static typing from the ground up, like Java, Scala, or Kotlin.


> Static typing ... no-one really uses it. I'm not sure about its reliability -- its use never took off on the Android platform, and none of the Groovy codebase itself has ever been rewritten in static Groovy

You keep saying this repeatedly but it just isn't true:

https://github.com/grails/grails-core/blob/master/grails-cor...

https://github.com/groovy/groovy-core/blob/master/src/main/g...


Both your examples use very simple logic. The Apache Groovy codebase example is of some peripheral functionality, i.e. a builder. All the methods in your Grails codebase example are, at most, 1 line long. I can't be bothered re-investigating what proportion of the core Groovy codebase really uses static compilation -- it certainly wasn't much only 2 years ago. As for Grails, virtually no-one has upgraded from v.2 to Grails 3 since it was released 2.5 yrs ago, or started many new projects with it.


i was just saying i personally found type declarations useful as the couple of small groovy codebases i worked on progressed over the short period (maybe a year?) i worked on them. thinking about how i might declare types made me decompose things a bit differently, which made the logic simpler in some places, which allowed me to do things like get rid of tests where i checked the behavior in a case where a method was missing on a function parameter, because now i knew the parameter was of a certain type (and thus would have that method).


>I think one of our problems is that people have downgraded the importance of this. Much code nowadays (rightly or wrongly) is considered "disposable" - people think that the likelihood of any given piece of code they are writing surviving more than a few years is negligible.

I think younger devs think this. Once you get a decade or more experience, you grow wiser and realise that code never dies, and especially the code you wish would die is particularly tenacious. And this is pure speculation, but I would wager that the number of lines of legacy code that is kept alive with maintenance is much greater than the number of lines of code that gets abandoned/rewritten/obsoleted.


I agree that the third point is important, but it's not clear that it's static typing that is important, and not type annotations. One reason why I can still fairly easily read and understand Eiffel code that I wrote decades ago is Design by Contract. And there's normally nothing static about DbC, it's about assertions that are checked at runtime and that by convention are part of a class's interface.

What both type annotations and DbC are is self-enforcing documentation (of an interface) that doesn't go out of sync with the actual code. But for that, you don't necessarily need static type checks. Now, type checking of type annotations that happens exclusively at runtime is an option that hasn't been explored much (after all, if you already have type annotations, why not let the compiler make use of them?), but an option that has sometimes been used successfully is having a mixture of static and dynamic type checks. You can often greatly simplify a type system by delaying (some) type checking until runtime (examples: for covariance or to have simpler generics).
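
As a minimal Python sketch of that middle ground (the `enforce_types` decorator here is hypothetical, written for illustration rather than taken from any library), checking annotations at call time rather than at compile time:

    import inspect
    from functools import wraps

    def enforce_types(fn):
        """Check annotated parameters against their declared types at call time."""
        sig = inspect.signature(fn)

        @wraps(fn)
        def wrapper(*args, **kwargs):
            bound = sig.bind(*args, **kwargs)
            for name, value in bound.arguments.items():
                ann = fn.__annotations__.get(name)
                # Only plain classes are handled; real generics need more work.
                if isinstance(ann, type) and not isinstance(value, ann):
                    raise TypeError(f"{fn.__name__}: {name} must be "
                                    f"{ann.__name__}, got {type(value).__name__}")
            return fn(*args, **kwargs)
        return wrapper

    @enforce_types
    def scale(vector: list, factor: float) -> list:
        return [x * factor for x in vector]

    scale([1.0, 2.0], 2.0)  # fine
    scale([1.0, 2.0], "2")  # TypeError at the boundary, like a DbC precondition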


I think one disadvantage of runtime type checking and DbC is that the compiler can't aid you in refactoring.

For example, if you add a case to a variant or sum type, or change the parameter or return type of some function, in a static type system, the compiler can tell you all the locations you need to change. In a runtime system, you have to find them yourself, or wait till you see an error at runtime.

Now, this is still better than the alternative of having the error propagate until it crashes 10 functions down, but the compiler finding all the places that need to be changed is something I've found to be really useful, especially in early development when there's a lot of refactoring happening. Presumably this is useful in later stages as well, when the system is large enough that you can't expect to find all the uses of a function or type manually.
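
For what it's worth, gradual checkers for dynamic languages can approximate this. A minimal Python/mypy sketch, with a made-up `Shape` union: after adding a new case to the union, `assert_never` makes mypy flag every match that doesn't handle it.

    from dataclasses import dataclass
    from typing import Union
    from typing_extensions import assert_never  # typing.assert_never in 3.11+

    @dataclass
    class Circle:
        radius: float

    @dataclass
    class Square:
        side: float

    # Adding a new case here (say, Triangle) makes mypy report every
    # function whose branches no longer cover the whole union.
    Shape = Union[Circle, Square]

    def area(shape: Shape) -> float:
        if isinstance(shape, Circle):
            return 3.14159 * shape.radius ** 2
        if isinstance(shape, Square):
            return shape.side ** 2
        assert_never(shape)  # mypy errors here if some case is unhandled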


IDEs can still substantially aid in refactoring with runtime type checking and return type annotations - see WebStorm and PHPStorm for two good examples of this. It isn't perfect - but certainly for things like changing the return type of functions, it will usually get you at least 90% of the way there.

Now, whether you consider that's actually helping solve the refactoring, or actually introducing new bugs, well - that's another issue :)


Completely agree. I've had a team member able to quickly contribute a change to a project he didn't work on due to static typing keeping his code within the guard rails. It's very valuable.


4. Performance. There is software that can't be slow.


Right - I was trying to avoid runtime considerations and keep it on language, but it's true there are some concerns that extend into language. The border is becoming fuzzier when you consider that many (most?) languages these days can run in a browser after some transformation. Some even have 3 or more runtimes including js, a managed runtime, or native.


Incredible that people would downvote this.


Seriously with the downvotes? Dynamic languages are obviously slower than static languages in general. Some special cases aside.

Performance is a concern for some projects - you wouldn't write an OS, database kernel, or mainstream game engine in a dynamic language.

How is that not a valid concern in the dynamic vs static typing argument? The parent comment has a legitimate point.


Technically, this is a strong/weak type distinction. Dynamic but strongly typed languages like Julia and Erlang can be quite performant when given strong fences around the types their functions are passed.


No. Not at all. It has more to do with implementations.

Python is strongly typed, but dynamic. But slow. JavaScript and PHP are weakly typed and dynamic as they will coerce types in strange ways during operations and comparisons.

Lua is dynamically and strongly typed like Python, but LuaJIT can sometimes produce code on par with or even slightly faster than native code - because it's really JIT compiling the hot path to native code with some guards and offramps to interpreted code for special/unexpected cases.

But there are limits to those techniques and it's doubtful that dynamic languages will ever perform at the same level as static languages because the compiler simply has more information and doesn't have to be as pessimistic or insert as many runtime guards.


> it's doubtful that dynamic languages will ever perform at the same level as static languages

I think these timing benchmarks come after the jit warmup procedure, so the presumption is that the compilation cost is amortized over lots of runs in an HPC-type setting:

https://julialang.org/#high-performance-jit-compiler


I’m in the static camp myself, but there are dynamic language JIT runtimes that meet or exceed the performance of many static languages.


In special cases. It's not nearly as reliable. There are a lot of ways to get poor performance out of such JIT compilers, and writing performant code for them is a bit of a black art that varies from version to version. Just read a little about the "fun" people have had with V8 over the years.


Including the Java runtime - the JVM is basically dynamic.


Java is a static language. While there are dynamic languages on the JVM like Jython, JRuby, Groovy, they don't get anywhere near the performance of the static languages.


Oh, sure, it's a statically typed language with all things being constantly cast to and from Object. (Have you ever used collections? Have you ever heard of the term "type erasure"?)


There is code that doesn't `constantly cast to and from Object`; most code won't need to, anyway. Generic information is kept during type-checking and then is discarded; this can be considered an optimization. There are some warts that appear because of erasure, but I believe you're wrong in implying that most developers care (most Java developers, anyway -- maybe Scala people have more issues because of erasure).


That it doesn't have reified generics like C# does not make it a dynamic language. Go doesn't have generics at all, but it's still a static language.


These are good points, but what about considering replaceability as an alternative to maintainability?

I personally find dynamic languages allow for easy replaceability, as there are fewer explicit references to types. However this is highly dependent on the system being somewhat modular, I suppose.


This is basically why erlang's hot code reloading would be impossible as a general solution in a statically typed language



> 15 commits 4 years ago, 121 commits 8 months ago

If these things are so good why does no one use them? EVERYONE using Erlang is using the same hotswapping facility. This sort of dynamism is just fighting against the language in an environment like Haskell.


Because Haskell isn't used in the same domains as Erlang, and hot patching full Haskell semantics either isn't needed, or the program is likely already using a DSL for the parts needing well defined dynamism, like the Yi editor, and so hot patching the runtime isn't needed.


Why would you need to? GHCi supports dynamic code reloading, which most people use during development. At runtime there’s not so much of a use case for most people.


Replying to myself. There are some use cases though, and in fact my comment points to one of them. Xmonad configurations are themselves Haskell code, and Xmonad does dynamic code reloading to make configuration changes without having to logout. Xmonad rolled their own solution, but "dyre" is a reusable generalization inspired by what Xmonad does. Using Haskell DSLs for configuration of Haskell programs has some significant advantages, and dynamic code reloading is essential if the program itself is long-lived or can tolerate no downtime. I don't mean to imply there are no uses ... but outside of configuration or development environments (or real-time routing algorithms in Erlang's case, something I wouldn't advise doing in Haskell) I really don't know what general use cases there are for it. If someone does though, I'd love to learn something new.


Hot code reloading isn't used in practice to much extent in Erlang codebases either. In my experience it exists mostly for dev work or emergency patches, not for normal release upgrades. Dynamic typing with tagged values and pattern matching definitely makes Erlang the easiest to hot reload.


As long as we are here, with experienced programmers on both "sides", here's a question I've been wondering about for some time: is it possible to create a Haskell-like type system on top of an Erlang-like language?

Erlang has Dialyzer, which is great, but it's based on success typing (optimistic: it only reports errors it can prove). Hot-reloading aside, what would be the issue with creating such a language? Maybe some issue with the process pids, which are quite dynamic?


There is no issue. Plenty of typed process calculi exist, and Concurrent ML existed back in the '90s and features channels and processes.


There is Cloud Haskell[1] which is reasonably close to a Haskell version of Erlang. It forces you to run the same code everywhere while in Erlang you could just hope that the code everywhere is compatible, though.

[1] http://haskell-distributed.github.io/


My first big love in languages was Turbo Pascal, because it was the first one I learned. So many people do this: fall in love with the first language they understand.

My second big love in languages was Python. It's also the language in which I wrote my first major software product. It was this product that taught me to hate Python. Not because it was hard to create, or because quality was low. In fact, I was VERY FAST to produce 1.0. Took about a week. But after that, I had to work with other developers. That's where everything went to hell.

Then I got a new job a few months later where almost 100% of my time was spent doing maintenance on aging codebases written in Java, a language I had never worked with before then. I won't say I fell in love with Java, but I did fall in love with the ease of inspecting "the world" in each project. As soon as I had it set up in my IDE properly, it was so ridiculously easy to explore how everything related, and then to make refactoring changes? So much easier than it ever was in Python.

Now, at that point I didn't directly make the connection with the type system, but, in retrospect, I know that all of the value I derived from working in Java vs. Python came from having a descriptive, static type system. And frankly, I never once felt slowed down by the need to specify my types up front. In fact, the opposite is true. It taught me to put more thought into my data structures and vastly improved the quality of my software design before I even started writing logic.

Sadly now I'm moving into the data science/data engineering field and everything is Python and I don't know. I don't want to go back to this nightmare. It's like I spent the last decade in first class establishments with the best tools and now I'm going to have to work in the mud with sticks and shovels. I am interested in the field in terms of the capabilities it enables, and I have no problems working in Scala or whatever decently typed language is around, but, the reality is the lion's share of people in this field are doing everything in Python or R and I hate them both.

I figure I have two choices: help advance the capabilities of "better" platforms, or pursue some other direction in my career. It's too hard to know how much better life can be, then go back.


> Sadly now I'm moving into the data science/data engineering field and everything is Python and I don't know. I don't want to go back to this nightmare.

Now there are (optional) type annotations and mypy [0]. I've been using them in my latest projects and I found them useful/helpful.

[0] http://mypy-lang.org/
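
For those who haven't tried them, a minimal sketch of what the annotations buy (the module and function are invented for illustration); mypy checks this statically, without running anything:

    # stats.py -- run `mypy stats.py` to check it without executing it
    from typing import Sequence

    def mean(samples: Sequence[float]) -> float:
        if not samples:
            raise ValueError("mean of empty sequence")
        return sum(samples) / len(samples)

    mean([1.0, 2.0, 3.0])  # ok
    mean("oops")           # mypy: incompatible type "str"; expected "Sequence[float]"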


I started out with that in mind when I was initially working on some PySpark code. It fell completely apart the moment I had to include boto3 libraries. In fact, I'd hold boto3 as the pinnacle example of searingly awful garbage that languages like Python promote. Completely impossible to do even the slightest static analysis of a library that's 100% dynamically generated at runtime.

The worst part is that versions of the API for other languages are fine. It's just the Python one they decided to go all "clever junior developer" on.


> you can't make good experiments with those time frames

Multi-decadal longitudinal studies are not too uncommon in medicine, epidemiology and psychology. Why there is no will to conduct, or fund, this kind of research in computer science, I am not sure.

https://en.wikipedia.org/wiki/Longitudinal_study


It's hard to find N projects that are comparable and live for that long. Not least because whatever effect you are seeking will be much less noticeable than, e.g., differences in developer skill and experience.


Maybe because of "lifespans"?

People live for ~80 years... doing a 2-4 decade study isn't out of the realm of possibility.

Computers, on the other hand... While there are a few mainframes that live to be 10 years old, the vast majority of the internet, programming languages, apps, etc... Hell, even the iPhone just hit 10 years old.

How can you have a 20 year study when the majority of "code" is less than 10 years old?


10-20 years?! Holy Cow! Other than huge software projects (like Word or Mac OS - and even then...) is there really software that still has that kind of maintenance window? I've worked for a Fortune 150 company for nearly 2 decades. There is not a single piece of software at the company that has not been rewritten from scratch (usually due to business changes) at least once every 10 years. I can't even imagine something that would still be useful after 10 years (honestly, even 5 years seems like a stretch). Just think - software written 20 years ago would have been written when the WWW was still soiling its diapers.


I'm working for a healthcare insurer and I don't believe anything has been rewritten since they went from mainframe to .Net.

The previous application I worked on is over a decade old (and it shows). The current application I'm working on is about 8 years old.

Neither application shows any sign of being replaced. Which would be insane, as they both have roughly a decade of laws & regulations and business lessons embedded in them. Despite the state of especially the older application, I don't see how rewriting the entire application would fix anything.

At best parts would be rewritten. And the parts I'm thinking about wouldn't be rewritten because of technical reasons, but because of the way they work. The prime example is a part that only 1 person, a business user, understands.


I work in medical robotics and I can tell you that a lot of our code is quite old, and we have a culture that code you write will stay around. Some things don't change, for example optimal control algorithms, while other things are very difficult to change, such as network routing. So, while the applications get re-written on a 4-10 year time frame, parts of the OS are 15+ years old.


> I can't even imagine something that would still be useful after 10 years

Ah the HN perception bubble.

Good code lasts longer than that. Bad code gets replaced.


Good code is replaceable. Bad code is hard to get rid of.

Some code sticks around because it's great at what it does. Some code sticks around because it works if you don't touch it and is impossible to delete due to various kinds of dependencies.


All code is replaceable. It's bad APIs that are hard to get rid of.

Most POSIX APIs, for instance, are confusing, obtuse and unnecessarily imperative, but still good enough in spite of being 40-odd years old. There's way too much code that implements or calls them to justify making significant changes at this point.


We are talking years and decades, where "replacing" is applied to entire applications, not chunks of code.

Especially in SOA, it can be cheaper to replace a poorly written service than trying to rewrite all of it over time.


If it ain't broken, don't fix it.


> If it's broken but in familiar, known ways, with runbooks, and the cost of rewriting it is way higher than supporting it for 5 years & hopefully we'll move away from the business model which requires it, don't fix it.


I started companies 15+ years ago, which I sold, and they still use (a lot of) the same code. You would think (I thought) that would never happen, but this idea that every company rewrites everything just isn't borne out: banks don't, and the small startups I work with don't either. Frontends get redone, there is refactoring and library updates, but most systems (unless trivial tiny ones) just stay the same. You need to remember that they run a business, and that business is usually not software development. So if there is no pressing reason to replace things, why would they allocate money for that?


I also consult for Fortune 500 companies on a regular basis, and most of them still have core business processes running on mainframe code bases well older than 10 years. No one is doing major greenfield development on mainframes, but they still exist all over the place.


I think IBM and their Z division would disagree. IBM Z users: the top 10 insurers, 44 of the top 50 banks, 18 of the top 25 retailers, 90% of the largest airlines.


That seems to agree with me? Mainframes are still in use everywhere. However, that statistic doesn't imply that those customers build their brand new greenfield capabilities on those mainframes.


I'd love to read more about what it's like doing mainframe consulting. Do you own or know of any blogs in this area?


I don't know of any. I don't consult for the mainframe systems themselves. Usually I get pulled in when the client realizes that their last mainframe developers are years away from retirement, and they cannot find any new mainframe developers to hire. That starts a mad dash to migrate/replace the mainframe solution without disrupting the entire business. Despite the existing codebase, these projects are very difficult because no one knows how they work anymore.

At one of my clients they had one mainframe developer left that knew their systems. She had already tried to retire, but they got her to agree to stay on for 5 years in return for bags filled with money. That meant they had 5 years to rewrite on a platform they could actually hire people for. 5 years to replace a system with decades of history.


IME most software will last that long, if it's remotely successful then it will at least make it to the 10 year mark. Business rarely changes drastically enough for a rewrite to make financial sense.

About the best you can hope for is a new "epoch" that forces a rewrite. In the MS world we went from classic VB and VC++ to .NET; a lot of companies went through rewrites to keep up with that, and some of that software is now nearing 20 years old. There have been a few other epoch-like changes - terminal -> GUI, C++ -> Java, desktop -> web - and except for maybe the last one, it's been quite a while since a new epoch began.


Parasolid [1] is 30 years old and is the dominant B-rep solid modeling kernel powering Solidworks, Siemens NX and Solid Edge. It's very difficult to see it being replaced as it is so entrenched.

Parasolid (written in a C dialect) was a rewrite of Romulus (written in Fortran) and that goes back to 1974. And that was a rewrite of Build that originated from Ian Braid's PhD thesis. [2]

I know people who are still working on the same Parasolid code after 30 years. Some of them

Disclaimer: Parasolid dev 1989-1995

[1] https://en.wikipedia.org/wiki/Parasolid

[2] http://solidmodeling.org/awards/bezier-award/i-braid-a-graye...


I recently finished making some mods to a PHP CMS to make sure it works fine with PHP7.1. The base of this code is 17 years old and is still used every day.

The CSS/JS on the frontend rarely lasts more than a few years, usually changed due to design trends (flat, responsive, mobile-first etc).


Everything that controls hardware has tremendous maintenance windows. Trains, planes, industrial machines.

Most business-critical software, SAP for example, is also based on decades-old codebases.


Wordpress is 14 years old.

It's probably a good reference project, when talking about maintenance (nightmares).


My business, https://www.filterforge.com/, is almost 12 years on the market since the release of v1.0 back in 2006. And if we also count the 6 years of initial development, that would be 18 years in total.


I also work at a similarly large company.

Some of our internal infrastructure systems are 10-20 years old - some could definitely do with a complete rewrite, but in the meantime, they're mission critical systems.

As for our products - some of them have even longer timeframes than 20 years.


I know of multiple finance companies that have Mainframe Assembler from the 70s.


I wonder how much older you are than the average JavaScript crowd :-)

Another little thing (from my own point of view) is that many applications don't live in a vacuum: they use JSON Schema or WSDL, and databases with types and constraints. So what the language does not "type", the context does.


Performance at runtime is usually a factor in type systems as well.


Rather than conduct experiments, I believe that existing data still holds an answer. There's one metric that hasn't been looked at: many projects, over a long period of time, tend to get rewritten in a different pattern or a new language/framework. I would say dynamic languages tend to have this problem in greater proportion than, say, a typed language like Java. This is a direct long-term marker for the maintainability of a language.


Counterpoint: Java projects tend to be maintained rather than rewritten because the verbosity of the language makes it difficult to tell boilerplate from productive code. It's less dramatic to rewrite in a dynamic language because understanding the full system before and after is easier.


I would add fun.

I know "fun" is highly subjective but still important none the less.


I think what's often missing from these arguments is that statically checking (or inferring) homogeneous lists is probably one of the most superficial uses of the type system in Haskell (and indeed not the interesting feature most power-users of Haskell are interested in, as far as I can tell).

What is interesting is using the type system to specify invariants about data structures and functions at the type level before they are implemented. This has two effects:

The developer is encouraged to think of the invariants before trying to prove that their implementation satisfies them. This approach to software development asks the programmer to consider side-effects, error cases, and data transformations before committing to writing an implementation. Writing the implementation proves the invariant if the program type checks.

(Of course Haskell's type system in its lowest-common denominator form is simply typed but with extensions it can be made to be dependently typed).

The second interesting property is that, given a sufficiently expressive type system (which means Haskell with a plethora of extensions... or just Idris/Lean/Agda), it is possible to encode invariants about complex data structures at the type level. I'm not talking about enforcing homogeneous lists of record types. I'm talking about ensuring that Red-Black Trees are properly balanced. This gets much more interesting when embedding DSLs into such a programming language that compile down to more "unsafe" languages.


List typing isn't as superficial as it seems. The following has happened to me multiple times, perhaps in the last month:

I have a large code base. I want to replace a fundamental data structure to support more operations/invariants/performance guarantees. I change the type at the roots of the code base. My instance of ghcid notifies me of the first type error. I fix it. This repeats until the program compiles again. I run the tests. All the tests pass.

This is insane in Python/C/Ruby. I've had to do it in C and Python. In Haskell I do it with impunity.

The type system doesn't just check what my program does, it is the compass, map, and hiking gear that gets me through the wilderness.


Yup, love that about statically typed languages. This happens all the time for me in C#. I have several libraries I like that do a lot of code generation. When the project is young, directly handling the generated classes works well but as the project grows, I inevitably want to wrap the handling of the generated classes. It's awesome to be like, welp... it's time to handle this one type differently. Change the return type of an interface or the type of a container and just have the compiler tell you everything you broke.


I’d argue that code generation is an anti-pattern that is only necessary because of static typing. A dynamic language would let you change the implementation of all generated objects simultaneously.


Code generation is, in essence, the ultimate static construct since it allows you to compile any feature you would achieve through late binding and reflection, but with a static code path. Which in turn lets you leverage the compiler toolchain more and bring more errors towards compile time (where they're cheap to resolve).

Dynamic languages can always lean on runtime features, but that's also their peril. Late binding deprives you of leveraging the tools in favor of "trust me".

In both cases you can get a maintenance nightmare, of course. The point as I see it is to move things toward the runtime when the error case is not troublesome, and towards the compiler when automating in more safeguards would help.


Code generation by macros is an important feature of Lisps, which tend to be dynamically typed. Maybe it feels less like code generation when you don't also have to produce the correct type annotations, but a good inference engine can eliminate most of those from statically typed languages as well.


You can actually do a huge amount of metaprogramming in dependent type systems and still have the power of static checking. We're still figuring out how to improve the ergonomics to the level that it matches macros and other code-gen methods, but it's super exciting stuff.


That may be entirely true and everything you can achieve with Roslyn and T4 templates may be achievable through some more elegant construct in another language.

But I've never had the luxury of choosing a tech stack for its purity of design. So the feature is wonderful in my day-to-day regardless.

I'd love to build 3D experiences in a language like lisp or scheme. It'd be great fun to learn but I don't currently have the luxury of the time it would take to ramp.

And I certainly don't have the political capital to convince my entire dev team to change.


In principle, this should be possible with static typing with a robust enough type system, too.


A lot of generated code could be done with reflection or meta-object protocols.

There's a good reason people generate code instead (in statically and dynamically typed languages!): performance.


I find that people making these claims have seldom really worked in large Python codebases...

Personally, I find it pretty workable in Python with a big codebase (but you have to respect the rules, like having a good test suite -- you change the time from compiling to running your test suite -- which you should have anyway)...

I find the current Python codebase I'm working on (which is 15 years old and has around 15k modules -- started back in Python 2.4, currently in Python 2.7 and going to 3.5) pretty good to work in -- there's a whole class of refactoring that you do on typed languages to satisfy the compiler that you don't even need to care about in the Python world (I also work on a reasonably big Java codebase, and refactorings are needed much more because of that).

I must say I'm usually happier on the Python realm, although, we do use some patterns for types such as the adapter protocol quite a bit (so, you ask for a type and expect to get it regardless of needing a compiler to tell you it's correct and I remember very few cases of errors related to having a wrong type -- I definitely had much more null pointer exceptions on java than typing errors on Python) and we do have some utilities to specify an interface and check that another class implements all the needed methods (so, you can do light type checking on Python without a full static checking language and I don't really miss a compiler for doing that type checking...).

I do think about things such as immutability, etc., but feel like the 'we're all responsible grown-ups' stance of Python is easier to work with... i.e.: if I prefix something with '_' you should not access it; the compiler doesn't have to scream that it's private and I don't need to clutter my code -- you do need good/responsible developers though (I see that as an advantage).


> I definitely had much more null pointer exceptions on java than typing errors on Python

Null pointers are a type error. The fact that several nominally "statically" typed languages don't differentiate between nullable and non-nullable types is a significant source of failure in their type systems. Using a modern language that properly identifies nullable values as a distinct type from non-nullable ones goes a long way towards eliminating a whole host of problems. It will be interesting to see what things look like in 10 years or so once Rust has had time to really displace a significant portion of the C and C++ code in the wild, and hopefully Kotlin has killed off Java (and if we're really lucky TypeScript has done the same, more or less, with JavaScript).
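
Even in Python, an Optional-aware checker like mypy makes that distinction concrete. A small sketch (functions invented for illustration):

    from typing import Optional

    def find_user(user_id: int) -> Optional[str]:
        # None when absent; the type forces callers to acknowledge that.
        return {1: "alice"}.get(user_id)

    def greet(name: str) -> str:
        return "hello " + name

    greet(find_user(1))   # mypy: incompatible type "Optional[str]"; expected "str"

    name = find_user(1)
    if name is not None:  # narrowing: inside this branch the type is plain str
        greet(name)       # ok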


I worked on OpenStack for 4 years and with Python in general for about 10 years.

I can say from my experience it is definitely possible to maintain large codebases in Python. The type errors of the superficial variety that the OP refers to were usually caught before they made it to production (and were rare besides if you were experienced enough to avoid them). It requires discipline to maintain tests and write code in a way that avoids errors.

I've been learning OCaml and Haskell for a couple of years, along with formal methods using TLA+ and Lean. I used to think type theory was the accounting of maths. I still think that's at least partly true, but what it brings you as a programmer is quite powerful.

I find working with Haskell or OCaml to be much more productive. Instead of stepping through a debugger or following tracebacks (a descriptive error) I get prescriptive errors as I make changes to a Haskell codebase. The propositions in the type system form a much better specification than unit tests alone.

I still like Python and C for many reasons and will continue using them where appropriate. However I think Haskell/OCaml offer quite enough power that everyone should at least consider what they bring to the table.


I do the exact same thing on large codebases of Ruby, but instead of the compiler type-checking errors, it's the test suite errors.

It's true that static type-checking proves the absence of an entire class of errors. But it doesn't prove that the code does the correct thing; it could be well-typed but completely wrong. On the other hand, tests prove that the code does the correct thing in certain cases. ...Of course, it's up to the developers to actually write a good test suite.

The faster we can all accept that there are pros and cons to both, the faster we can come up with a solution that takes advantage of the best of both worlds. That's the whole point of this OP.

I, personally, have always wondered about ways to dial in to the sweet-spot over time as a project matures. At the start of a project, shipping new features faster is often more important. But if the project survives, maintenance (by new developers) and backward compatibility become more and more of a priority.


> It's true that static type-checking proves the absence of an entire class of errors. But it doesn't prove that the code does the correct thing;

So prove it yourself. Proving things about programs that rely on dynamism is invariably much harder than proving things about programs that don't.


> This is insane in Python/C/Ruby.

It's only insane if you don't have test cases with good coverage, in which case you are very-very-very screwed, statically typed or not.


Testing is not the answer to the problem of knowing if I refactored everywhere necessary. In a statically typed language I don't even have to worry about what do I do if my function is passed a var of the wrong type, it won't compile.


I don't really like the lumping of C in with Python and Ruby here. The C compiler picks up on that, too. All over this comment section people are calling C weakly typed, and I don't get it. Is it because void* exists? Every language has something like that.


It's not just that void * exists, but that it's basically mandatory.

C's built-in arrays are super weak, so you need some library to do proper resizable arrays. Since C doesn't have generics, such a library will use void * as the type for putting values into the array and getting them back out again. You'll be casting at every point of use, and nothing will check to make sure you got the cast right, other than running the code and crashing.


> C's built-in arrays are super weak, so you need some library to do proper resizable arrays. Since C doesn't have generics, such a library will use void * as the type for putting values into the array and getting them back out again. You'll be casting at every point of use, and nothing will check to make sure you got the cast right, other than running the code and crashing.

There are other options though like macros and code generation. Code gen in particular can give you more options than generics without sacrificing any type safety.


I don't know why macro-based containers aren't more popular. You can have a quite usable interface, and even the implementation isn't that ugly. Example:

http://attractivechaos.github.io/klib/#Khash%3A%20generic%20...

This approach isn't just more strongly typed than using void * for everything, it's also typically more efficient. For an array, you can put larger structs directly in the array rather than being forced to use a pointer. For a hash table, the same applies, plus you can avoid expensive indirect calls to compute hashes. (There are alternative non-macro approaches that trade off that overhead for other types of overhead, but you can't do as well as with a specialized container.)

I guess you could ask, at that point why not just use C++? And a lot of people do, and the people left writing new C programs are often traditionalists who don't want to switch to new approaches. And to be fair, there are disadvantages to macro-based containers, like increasing build time. But I still think there's room for them to see more adoption.


I use a code generator. It has a great type system, it's composable, mature and it can even compile pretty fast with the right tooling. It's called C++!


Templates are good for some things, but they only do a fraction of what code generators can do. With code generation you can generate types from database tables, web APIs, etc. You can do things like declaratively defining database views and generating huge chunks of an application. It can handle all sorts of boilerplate code that you can't produce with templates alone.


In this thread, a lot of people are conflating strong/weak typing with static/dynamic typing.

Static typing versus dynamic typing is fairly binary: if your types are checked at compile time they're probably static, while if they're checked at run time they are probably dynamic. Haskell and C are statically typed; Python and JavaScript are dynamically typed.

Strong/weak typing is more of a spectrum. A strong type system can check many properties of programs and accommodate many patterns as types. A weak type system, on the other hand, can't check many properties of programs, and has to be bypassed to accommodate common patterns. JavaScript has probably the weakest type system, because it checks almost nothing ("hi" + 42 returns "hi42" even though this is nonsensical, {}.foo returns undefined rather than throwing a type error). C is fairly weakly typed because you can add disparate types (int* + int returns int* even if you intended to add two integers) and the type system has to be bypassed with void* to do anything sizeable. Python, ironically, is slightly stronger, in that applying operators to objects of types with no defined relationship throws exceptions ("hi" + 42 errors). A spectrum from weakest to strongest might look something like: JavaScript, C, Go, Ruby, Python, Java, C#, OCaml, Haskell.

My personal experience is that the difference between static and dynamic types isn't very important to my development process or code quality. I have to run and unit test my code to verify it, so the checks happen regardless of whether they happen at compile time or run time. But the difference between strong and weak typing is huge. Strong types catch more bugs, but perhaps more importantly, they catch those bugs where they occur. A type error when adding "hi" + 42 is far more useful for debugging than a mysterious unit test failure on a completely different function where it's returning "Hi42username" instead of "Hi Username" because you added the wrong variable. A segfault 30 lines later is harder to debug than an error when trying to add an int to the value at an int*.
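
The "errors where they occur" point is easy to demonstrate in Python; here's a tiny sketch of the example above:

    # Python (strongly typed): the nonsensical operation fails immediately,
    # on the exact line where the mistake was made.
    try:
        greeting = "hi" + 42
    except TypeError as e:
        print(e)  # can only concatenate str (not "int") to str

    # JavaScript (weakly typed) would instead produce "hi42" here, and the
    # bug would surface far away, as described above.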


> The C compiler picks up on that, too.

Except when it doesn't. Just five days ago I was debugging an error in my Erlang port driver that was caused by me passing the receiver (ErlDrvTerm, an int in disguise) in the place where I wanted the number of iterations. The funnier thing was that the declaration of the function had the arguments in the correct order (and that's what guided me), but the definition had them swapped. The compiler did not catch that bug because, well, both are ints, so apparently the declaration and definition match, don't they?


That’s why people should use one-element structs instead of typedefs for that kind of use case. A struct is a distinct type that can’t be accidentally mixed up with random integers, but its memory representation and efficiency will generally be identical; and you can add one-liner conversion functions to minimize syntactic overhead when you do need to convert to/from raw integers. Same idea as ‘newtype’ in Haskell; it works pretty much just as well in C, at least for ‘ID number’ sorts of types where you’re usually just shuttling values from place to place rather than doing any arithmetic. (For types where you do need arithmetic, it gets pretty ugly in languages without operator overloading. Except Go, which doesn’t have operator overloading but does have builtin support for defining distinct versions of integer types.)
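
For what it's worth, gradually typed Python has an analogous tool in `typing.NewType`; a minimal sketch of the 'ID number' case (names invented), which a checker like mypy enforces while the values stay plain ints at runtime:

    from typing import NewType

    ReceiverId = NewType("ReceiverId", int)
    Iterations = NewType("Iterations", int)

    def run(receiver: ReceiverId, iterations: Iterations) -> None:
        ...

    r = ReceiverId(7)
    n = Iterations(1000)

    run(r, n)  # ok
    run(n, r)  # flagged by mypy: the swapped-argument bug described above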


In fact, make the structs opaque so that you retain full control over the data and its invariants.


void * is part of it, but you can also implicitly cast between integer types, and also between integers and enums. Think of passing an enum or an int into a function which takes a long as an argument.


I mean... I don't really think that's a strong case for calling C's type system weak.


I suspect most people who state that C has a weak type system are really talking about the fact that C has a weakly _enforced_ type system. You can break the type system's rules rather easily (or perhaps it's more accurate to say the type system is too permissive); either way, it doesn't provide you the same guarantees a stronger type system provides. At least that's my take on it.


This.


I would say it's mostly a matter of use: in C you deal with void* or typecasts all the time, whereas in higher-level languages it's much less common, either because the type system is smarter, or because the constraints that it does have are more strictly enforced. For example: you can happily compare a char* and an int in C, but other languages like Python might error at the thought.


C is incredibly permissive with regard to its types, which are themselves very anemic. With the exception of the numeric primitives, C really only has a single type, the pointer; everything else is just syntactic sugar for various forms of pointer arithmetic. For instance, arrays in C are just a shortcut for some pointer plus an offset multiplied by a constant determined at compile time, based on what you've claimed is the underlying struct or primitive of the array. Importantly, C is perfectly happy to take any random pointer into arbitrary memory and allow you to map any set of offsets onto it.

It's worth looking at, for instance, Rust, which at least in theory allows the same thing to be done, but only by explicitly opting out of static checks via unsafe declarations. In normal safe code, Rust will statically verify that a given reference (pointer, more or less) is in fact referring to the type your code is expecting, rather than the C approach of simply assuming the program is correct.

Looked at another way, as far as the C compiler is concerned nearly everything is a pointer, and one kind of pointer is entirely exchangeable with another kind of pointer (with at most a cast being required, but probably not even that in the entirely too common case of a void pointer). This is in contrast to nearly every other statically typed language, which will verify, either at compile time or at runtime, that any given reference is the appropriate type before dereferencing it. C++ nominally at least has a more powerful type system, but since it was designed (in theory at least) as a superset of C, C's permissiveness blows a giant gaping hole in its type system.


    /t/tmp.1q8r9dZAtX > cat test.c
    int main() {
    	char *test = "test";
    	int i = 10;
    	return test == i;
    }
    /t/tmp.1q8r9dZAtX > cc test.c
    test.c: In function ‘main’:
    test.c:4:14: warning: comparison between pointer and integer
      return test == i;
                  ^~


    $ python -c '"test" == 10'
    $


> The developer is encouraged to think of the invariants before trying to prove that their implementation satisfies them. This approach to software development asks the programmer to consider side-effects, error cases, and data transformations before committing to writing an implementation. Writing the implementation proves the invariant if the program type checks.

I really wish more languages took this to the logical conclusion and implemented first-class contract support. It seems work on contracts stopped with Eiffel (although I've heard that clojure spec is _kinda_ getting there).
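
In the meantime, preconditions and postconditions are easy to approximate in a dynamic language; a minimal Python sketch (this `contract` decorator is hypothetical, not a library API):

    from functools import wraps

    def contract(pre=None, post=None):
        """Attach runtime-checked pre/postconditions to a function."""
        def decorate(fn):
            @wraps(fn)
            def wrapper(*args, **kwargs):
                if pre is not None:
                    assert pre(*args, **kwargs), f"precondition of {fn.__name__}"
                result = fn(*args, **kwargs)
                if post is not None:
                    assert post(result), f"postcondition of {fn.__name__}"
                return result
            return wrapper
        return decorate

    @contract(pre=lambda xs: len(xs) > 0, post=lambda r: r >= 0)
    def smallest_gap(xs):
        """Smallest difference between consecutive elements, once sorted."""
        s = sorted(xs)
        return min((b - a for a, b in zip(s, s[1:])), default=0)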



Check out Dafny, Whiley, and Liquid Haskell.


Ada also supports contracts, as does .NET with code contracts.


Contracts are useless. They merely describe what you want your code to be, not what your code actually is. Just grab a pen and a piece of paper, and start proving things about your programs.


> Contracts are useless.

No, they are not.

1. If the code doesn't conform to the contract, it will fail on the contract boundary, with a well-defined error. If this is useless, then `assert` is also useless, which it is, of course, not.

2. With a sufficiently well-designed language and sufficiently smart compiler, you can move some contract checks to compile-time. See Racket.

3. If your language supports both static and dynamic typing, the contracts are a dual of static types, which lets you interface the static and dynamic parts of the code seamlessly and automatically (in both directions). Again, see Racket.

Meta: I wonder why it's mostly static-typing proponents who aggressively evangelize, insult the other side, are 110% sure they're right even though there is no scientific evidence, and so on. Could it be that the bondage & discipline approach of static typing just appeals to people with a certain mindset, who are statistically more likely to engage in such behaviors, no matter the subject?


0. Trapping the error doesn't make your program any less wrong. I agree that `assert` is equally useless.

1. With pencil and paper, you don't need to wait for a smart compiler - you can get started proving things about your programs today!

2. I can totally see what's coming next: “Being wrong is dual to being right, so being wrong is another possibility worth exploring”. Right?

> Meta: (slander)

No comment.


Static typing prevents bugs in code to the degree that the programmer can correctly encode the desired behavior of the program into the type system. Relatively little behavior can be encoded in inexpressive type systems, so there's a lot of room for bugs that have nothing to do with types. A lot more behavior (e.g. the sorts of invariants mentioned in agentultra's top level comment) can be encoded in a more expressive type system, but you then have the challenge of encoding it /correctly/. A lot of that kind of thinking is the same as the kind of thinking you'd have to do writing in a dynamic language, but you get more assurances when your type system gives you feedback about whether you're thinking about the problem right.

For my money, I work in a primarily dynamic language and I already have a set of practices that usually prevent relatively simple type mismatches so I very rarely see bugs slip into production that involve type mismatches that would be caught by a Go-level type system, and just that level of type information would add a lot of overhead to my code.

But if I were already using types, a more expressive system could probably catch a lot of invariant issues. So I feel like the sweet-spot graph is more bimodal for me: the initial cost of switching to a basic static type system wouldn't buy me a lot in terms of effort-to-caught-bugs ratio, but there's a kind of longer-term payout that might make it worth it as the type system becomes more expressive.


> Static typing prevents bugs in code to the degree that the programmer can correctly encode the desired behavior of the program into the type system.

Exactly. The author of the article implicitly equates "statically verified code" with "bug-free code". But that's not correct. It's quite possible (and even, dare I say it, fairly common) to have code that expresses, in perfectly type-correct fashion, an algorithm that doesn't do what the user actually wants it to do. Static typing doesn't catch that.


It depends on your type system. In newer languages like Idris or F* you can encode the correctness of an algorithm in its type, and it will not compile if the compiler cannot prove that correctness.

For example, I can prove my string reverse works in Idris (https://www.stackbuilders.com/news/reverse-reverse-theorem-p...). Or I could prove that my function squares all elements in a list. Etc.

Now a big part of the problem is expressing with sufficient accuracy what the properties of the algorithm you want to prove are. For example, for string reverse I may want to show more than `reverse (reverse s) = s`, since, after all, if reverse did nothing at all that would still be true. I would probably want to express that the first and last chars swap when I just call `reverse xs`.


> For example I can prove my my string reverse works in Idris (https://www.stackbuilders.com/news/reverse-reverse-theorem-p...).

This article basically demonstrates the GP's point, though. It proves that `reverse` is self-inverse, but there are lots and lots of functions that are self-inverse (for example, `x -> x` is self-inverse, as would be the function that swaps any odd-indexed element with the one following it).

The claim was about the difficulty of encoding actual correctness into your type system. That this article doesn't actually encode the correctness of reverse seems like pretty good evidence of that difficulty.


This is precisely what I said in the next paragraph, where I even mentioned a way to correctly encode that "correctness": a _correct_ implementation of reverse has the property `strHead' s = strTail' (reverse s)`, recursively.

It's important to understand that type systems can encode correctness to the level you can specify it. The program is therefore bug-free to the accuracy of your requirements on it.

Most people do not work with type systems which can do this and are unfamiliar with formal verification. The author directly presents (and argues throughout) that there is a correlation, not that bug-freedom and static verification are the same thing.


Sorry, you are right, of course. I fell victim to one of the internet's classic blunders: skimming a long comment thread and not carefully reading what I was replying to :)


There's no such thing as "proving correctness". You can have bugs in the type definitions. You can have bugs in the english (or whatever your native language is) description of what you think the algorithm should be doing. You can prove a program does what the types say it should do but that is not what "correctness" means.

>Now a big part of the problem is expressing with sufficient accuracy what the properties of the algorithm you want to prove are. For example for string reverse I may want to show more than that `reverse (reverse s) = s`. Since after all if reverse does nothing that would still be true. I would probably want to express that the first and last chars swap when I just call reverse xs.

This is no different from writing tests in a dynamic language.


> This is no different from writing tests in a dynamic language.

Types and tests are not equivalent. This is a prevalent myth among dynamic typing enthusiasts. There is no unit test that can ensure, e.g., race and deadlock freedom, but there are type systems that can do so. There are many such properties, and tests can't help you there.

Types verify stronger properties than tests will ever be able to, full stop. You don't always need the stronger properties of types, except when you do.


To add some color here on the difference between a type proof and a test: consider that you can never test all possible strings for reverse.

However, a type proof can show that reverse reverses all possible strings.

It is possible to test that a function on 16 bit integers returns the correct value for all inputs. Doing so would be a proof by exhaustion.

Type-based proofs let us prove things using methods other than exhaustion, which is the only possible way to prove things with tests. That is an important property.
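To illustrate the exhaustion case, a minimal Haskell sketch (the function name is made up):

    import Data.Int (Int16)

    -- "Proof by exhaustion": check a property for every one of the
    -- 65536 possible Int16 inputs. Feasible for a domain this small,
    -- hopeless for strings or 64-bit integers.
    holdsForAllInt16 :: (Int16 -> Bool) -> Bool
    holdsForAllInt16 p = all p [minBound .. maxBound]

    -- e.g. holdsForAllInt16 (\x -> x + 0 == x) evaluates to True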


Indeed, or to summarize as a soundbite: tests can only prove existential properties but types can prove universal properties.


> There's no such thing as "proving correctness". You can have bugs in the type definitions.

You can have bugs in type definitions and you can have bugs in tests as well, and that'll be a problem until computers can read our minds. Type definitions are superior at checking what you specified, though, because the checking is exhaustive. Perfect is the enemy of good and all that.

> This is no different from writing tests in a dynamic language.

It is different. In that example, the type system will verify the stated property is true for all values of "s". Tests only check some specific examples, whereas using types in this way covers all possible input and output pairs. It's like comparing a maths proof to checking that an equation holds for a few examples you tried.


No, I’m pretty sure there’s a pretty large body of academic and industrial research on proving program correctness that you can’t just hand wave away with sophistic “but what if your type signature is wrong” nonsense. And there’s a huge difference between a test and a proof - a test can only tell you a program doesn’t do what you think it should for a particular case, a proof tells you that your program does exactly what it is supposed to.


Defining "correctness" in terms of types is the CS equivalent of defining "risk" in terms of volatility - it replaces a real and fundamentally unsolvable problem with a problem that, while it has the advantage of being tractable, isn't actually all that important to solve. Great for publishing papers, dangerous when people start confusing the fake problem and the real problem.


And you’re just doubling down on sophistry. Why should anyone take you seriously in this conversation?


Saying "you're doubling down on sophistry" instead of just "you're wrong" doesn't actually make you more convincing.


No, I’m saying sophistry contributes nothing to the conversation. I don’t care about convincing you about the subject at hand: you have the opinion of someone who’s invested too much into justifying their ignorance to actually pick up a textbook and learn the relevant material.


> There's no such thing as "proving correctness".

Given the caveats you mention, is there such thing as proving anything?


There are two meanings of prove. The one that type theorists are using is roughly "to derive your statement from axioms with pure logic." This sense of the term can never apply to things in the real world, like programs (of course, a program also exists as an abstraction about which things can be proved, but when you're talking about programs which are actually doing things in the real world, you can't treat them as pure logic).

The second sense is the scientific "gather enough supporting evidence that you are reasonably sure". In this sense, you can prove a lot of things.


> There are two meanings of prove. The one that type theorists are using is roughly "to derive your statement from axioms with pure logic." This sense of the term can never apply to things in the real world...

I am not sure it's a meaningful distinction. There are no triangles in the "real" world -- if you look close under a microscope, there will be more than three sides -- but geometry proves to be of great practical value just the same.


> You can have bugs in the english (or whatever your native language is) description of what you think the algorithm should be doing.

According to the principle of separation of concerns, this isn't the programmer's problem.

> You can prove a program does what the types say it should do but that is not what "correctness" means.

Of course, the ultimate arbiter of what “correctness” means is the program specification.


> The author of the article implicitly equates "statically verified code" with "bug-free code".

Not at all. First, the statements as put here are discrete (boolean even) while I present both "statically verified code" and "bug-freedom" as living on a continuum. Secondly, I don't equate them. If anything, I assume a monotonic, positive relationship between them (strictly speaking not even that. I make pretty clear that the curves could also have whatever shape. But I yield that I am very suggestive in this because I do strongly believe it to be the case). In fact, one of the main points of the argument is that the two are not equal - otherwise, the blue curves I drew would all be straight lines from (0,0) to (1,1). And lastly, none of this is done implicitly. I mention all of this pretty explicitly :)


> It's quite possible (and even, dare I say it, fairly common) to have code that expresses, in perfectly type-correct fashion, an algorithm that doesn't do what the user actually wants it to do. Static typing doesn't catch that.

It's also possible to grab a knife with your hand on the blade edge and cut yourself, but that doesn't diminish the safety value of knife handles.


Or as Knuth pithily put it, "Beware of bugs in the above code; I have only proved it correct, not tried it."


The article was not trying to discuss how to make programmers smarter. No language is going to help with that so there is no point in talking about it. As far as the scope of the article is concerned, it's fair to say that statically verified code equals bug-free code.


I had a related reaction, which is that the problems you're mentioning can become more complex when using libraries outside the standard library with static languages.

E.g., my experience is that poor library design can sometimes be exacerbated in statically typed languages if the type logic is poor and doesn't match the problem domain. Dynamic languages sometimes inadvertently "correct" for this by smoothing over these sorts of issues.

I prefer static languages (or at least optionally typed ones) but there can be big downsides of the sort you're mentioning, that are exacerbated by third-party libraries.


> I already have a set of practices that usually prevent relatively simple type mismatches

Care to share those practices? I also primarily work (this year, at least) in dynamically typed languages.


I think it depends on language and context a lot, so I can only give some vague advice. A lot of the idea is to write code that only does the one thing you know you need it to do, does it very simply, and fails very obviously when it's not used correctly. In some cases your instinct is to make things as generic as possible at the first pass. Preemptive abstraction pays off more in statically typed languages, in my experience, because it saves you the time of refactoring a type system in place if you turn out to need flexibility later. But in dynamic languages it can introduce complexity that hides bugs.

Simple type-level errors come up the most when you have types that are easily conflated. That tends to happen when you have functions that accept more than one type of thing or output more than one type of thing - avoid that. Avoid polymorphism and OOP patterns that set up a complicated type hierarchy or override methods - you don't want any instance where you end up with something that looks a lot like one type of thing but isn't. Type hierarchies can often be factored out into behaviors provided by modules that supply functions that operate on plain data structures. For variables and parameters, stick to really simple types to whatever degree it's possible, e.g. language primitives, plain old data-container objects and "maybe" types (e.g. things that could be a primitive or null) when absolutely necessary (and check them whenever you might have one). Use union types extremely sparingly. Assignment/creation bottlenecks are useful: try to have only one source for objects of a certain type that always constructs them the same way (so you don't end up with missing fields).

A lot of programmers coming from a language with a stronger type system (especially when transitioning from OOP languages to functional or hybrid languages) tend to be nervous about writing functions without a guarantee about what kind of inputs they'll see, so they try to compensate for the lack of type safety by building functions that can cope with whatever is thrown at them. The idea is that this makes the function more robust but ironically, this tends to make bugs a lot harder to track down. In my experience it's better to write functions with specific expectations about their input that fast-fail if those aren't met, instead of trying to recover in some way - garbage-in-exceptions-out is better than garbage-in-garbage-out. If you send the wrong kind of thing to a function, you want it to throw an error then and there, and you'll likely catch it the first time you test that code.

A lot of the idea of this kind of advice is to shift the work that would be done by the compiler's type system to the very first pass of testing - if your program is basically a bunch of functions that only take one sort of thing in each argument slot, only emit one sort of thing as a result and fail fast when those expectations are violated, you'll typically see runtime type errors the first time those functions get executed, which is a lot like seeing them at compile time.


> but you get more assurances when your type system gives you feedback about whether you're thinking about the problem right.

It really isn't as much about languages as it is about the people who use them. The key ability is to prove things about programs. Powerful type systems, especially those that have type inference, merely relieve the programmer from some of the most boring parts of the job. Sometimes.


> ... and just that level of type information would add a lot of overhead to my code.

Can you give an example of this overhead?


The biggest issue with claims like "there are only diminishing returns when using a type system better than the one provided in my blub language" is that it assumes people keep writing the same style of code, regardless of the assurances a better type system gives you.

"I don't see the benefit of typed languages if I keep writing code as if it was PHP/JavaScript/Go" ... OF COURSE YOU DON'T!

This is missing most of the benefits, because the main benefits of a better type system aren't realized by writing the same code; the benefits are realized by writing code that leverages the new possibilities.

Another benefit of static typing is that it applies to other peoples' code and libraries, not only your own.

Being able to look at the signatures and be certain about what some function _can't_ do is a benefit that untyped languages lack.
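Parametric polymorphism is the cleanest example of this. A sketch in Haskell (modulo the usual caveats about non-termination):

    -- A caller knows, from the signature alone, that this function
    -- cannot inspect, invent, or transform elements: everything in the
    -- output must come from the input list. It can only reorder, drop,
    -- or duplicate; it can't log your data or zero it out.
    mystery :: [a] -> [a]
    mystery = reverse   -- one of the few behaviours the type permits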

I think the failure of "optional" typing in Clojure is a very educational example in this regard.

The failure of newer languages to retrofit nullability information onto Java is another one.


The article makes two main points: a) static typing has a cost and b) thus, any benefit it brings should be examined against that cost.

I am sorry, but I don't really see how you stating more benefits of static typing really counters either of them.

I recommend reading the article again. But this time, try to read it less as defending a specific language (I only mentioned my blub language so that there's a more specific and extensive reference in the cases where I use it - if you are not using my blub language, you should really just ignore everything I write about it specifically) and more as trying to talk on a meta-level about how we discuss these things. Because your comment is an excellent example of how not to do it, and of the kind of argument that prompted me to this writeup in the first place.


Those are not really 'points', though; they are far too trivial. Obviously, nothing counters them, because they are tautologies that could just as well apply to any subject.

The point is to explore a comparative difference in value, and that is realized through mastery of the tool, not merely living in a world where it exists.


> they are far too trivial.

You'd have thunk I didn't have to make them, then. But I did, judging from literally every argument I've had about this.


> This is missing most of the benefits, because the main benefits of a better type system isn't realized by writing the same code, the benefits are realized by writing code that leverages the new possibilities.

The inverse is also true; you don't really get the benefits of dynamic typing until you start doing things differently to take advantage of that difference. If you still code like you're in a static language, you'll miss the benefits of a dynamic one.


Optional typing has not failed in Clojure; it's growing with clojure.spec.


What amuses me in all "static typing versus..." discussions is that it is usually a comparison between two camps:

Camp A: Languages with mediocre static typing facilities, for example:

     -- C (weakly typed)
     -- C++ (weakly typed in parts, plus over-complicated
        type features) 
     -- TypeScript (the runtime is weakly typed, 
        because it's Javascript all the way down)
Camp B: Languages with mediocre dynamic typing facilities, for example:

     -- Javascript (weakly typed) 
     -- PHP 4/5 (weakly typed) 
     -- Python and Ruby (no powerful macro system to 
        help you keep complexity well under control 
        or take full advantage of dynamicism)


Both camps are not the best examples of static or dynamic typing. A good comparison would be between:

Camp C: Languages with very good static typing facilities, for example:

     -- Haskell
     -- ML
     -- F#
Camp D: Languages with very good dynamic typing facilities, for example:

     -- Common Lisp
     -- Clojure
     -- Scheme/Racket
     -- Julia
     -- Smalltalk

 
I think that as long as you stay in camp (A) or (B), you'll not be entirely satisfied, and you will get criticism from the other camp.


It's Camp D that I'm least familiar with here; outside of academic projects in lisp/scheme I've never used them for anything serious.

What exactly does it mean to have "good dynamic typing facilities"?


> What exactly does it mean to have "good dynamic typing facilities"?

To quote Peter Norvig on the difference between Python and Lisp (though you could apply it to most other mainstream dynamic languages vs Lisp):

> Python is more dynamic, does less error-checking. In Python you won't get any warnings for undefined functions or fields, or wrong number of arguments passed to a function, or most anything else at load time; you have to wait until run time. The commercial Lisp implementations will flag many of these as warnings; simpler implementations like clisp do not. The one place where Python is demonstrably more dangerous is when you do self.feild = 0 when you meant to type self.field = 0; the former will dynamically create a new field. The equivalent in Lisp, (setf (feild self) 0) will give you an error. On the other hand, accessing an undefined field will give you an error in both languages.

Common Lisp has a (somewhat) sound, standardized language definition, and competing compiler/JIT implementations that are much faster than anything that could ever come out of the Python camp. The latter is simply too dynamic and ill-defined ("Python is what CPython does"); making Python run fast while ensuring 100% compatibility with its existing ecosystem, without putting further constraints on the language, is akin to chasing a mirage.


What does “somewhat sound” mean?


I think he refers to some of the usual criticisms of Common Lisp:

1. The language specification is very big. This is true; it is a very big specification. On the other hand, this is mostly because the language spec also includes the spec for its own "standard library", unlike what happens in C or Java, for example, where the std. lib is specified elsewhere. CL's "standard library" is very big, because there are many, many features.

The other reason the spec is so big is that this is a language with a lot of features - you can do high-level programming, low-level programming, complex OOP, design-your-own OOP, bitwise manipulation, arbitrary precision arithmetic, disassemble functions to machine language, redefine classes at runtime, etc. etc.

Probably the extreme of the features is that there is an SQL-like mini-programming language built in just for doing loops (!), "the LOOP macro". On the other hand, you can choose not to use it. And if you use it, it can help you write highly readable and concise code. More info:

http://cl-cookbook.sourceforge.net/loop.html

2. The "cruft"; Common Lisp is basically the unification ("common") of at least two main Lisp dialects that were in use during the 70s. So there are some parts (mind you, just some) in which some naming or function parameter orders could have been more consistent; for example here everything is consistent:

    ;; access a property list by property
    (getf plist property)

    ;; access an array by index
    (aref array index)

    ;; access an object's slot
    (slot-value object slot-name)
... but here the consistency is broken:

    ;; gethash: obtain the element from a hash table, by key
    (gethash key hash-table)
There are also some things that seem to be redundant, like for example "setq", where "setf" can do everything you can do with "setq" (and more); or "defparameter" and "defvar", where in theory "setf" might be enough. But there are differences, and knowing such differences helps you write more readable and better code. And it's really nitpicking, for these are easy to overcome.

3. Because of the above, CL is often criticized because of being a language "designed by committee". But, unlike other "committee-designed languages", this one was designed by selecting, from older Lisps, features that were already proven to be a Good Thing, and incorporating them into CL without too many changes. So you can also consider it to be "a curated collection of good features from older Lisps..."

4. Scheme, the other main "Lisp dialect", has a much, much smaller and simpler spec, so it's easier to learn. On the other hand, this also means that many features are just absent, and will need to be implemented by the programmer (or by external libs), without any standardization. Conversely, due to the extensive standardization, Common Lisp code is usually highly portable between implementations, and often code will run in various CL implementations straight away, with zero change.

Historically, Scheme was more popular inside the academic community while Common Lisp was more popular with production systems (i.e. science, space, simulation, CAD/CAM, etc.). Thus, there used to be an animosity between Schemers and Lispers, although jumping from one language to the other is rather easy...


ANSI CL is, what, some 1100-something pages?

JavaScript is up to 885 pages. https://www.ecma-international.org/publications/files/ECMA-S...

The C++ 17 draft: 1623 pages. https://www.ecma-international.org/publications/files/ECMA-S...

C, which "is not a big language, and is not well served by a big book", according to Kernighan and Ritchie's 1988 introduction to K&R2, is up to 683 pages in C11. Almost triple the size of C90's 230 pages.

How about something non-language? USB 3.2 spec (just released Sep 22): 100+ megabyte .zip file download. Up from 2.0's 73.


0. There is nothing wrong with big standard libraries, so long as they are not redundant and the core language is small.

1. This is a serious criticism, but it has nothing to do with soundness.

2. There is absolutely nothing wrong with a language being designed by a committee, so long as the committee's members are all competent.

3. Back to 0.


>2. There is absolutely nothing wrong with a language being designed by a committee

It sort of has a bad stigma, because two well-known, unloved languages were designed by committee: COBOL and PL/I.


Today, those roles are played by C++17 and C11.


If you Google, you can find tons of good stuff written in Lisps. Julia is an up-and-coming language, but it is already highly regarded by the data science community.

> What exactly does it mean to have "good dynamic typing facilities"?

The ability to change the structure of your program at runtime will be at the top of the list for me. You can't do that with Ruby/Python.


>What exactly does it mean to have "good dynamic typing facilities"?

Picking Common Lisp as an example:

(NOTE: Some of these features are also present in good statically typed languages, so what I advocate is using good, well-featured languages, not really static vs dynamic.)

(NOTE 2: I'm sorry for being such a fanboy, but that thing is addictive like a hard drug...)

0. Code is a first class citizen, and it can be handled just as well as any other type of data. See "macros" below.

1. The system is truly dynamic: Functions can be redefined while the code is running. Objects can change class to a newer version (if you want to), while the code is running.

2. The runtime is very strong with regards to types. It will not allow any type mismatch at all.

3. The error handling system is exemplary: not only designed to "catch" errors, but also to apply a potential correction and try running the function again. This is known as "conditions and restarts", and sadly it is not present in many programming languages.

4. The object oriented system (CLOS) allows multiple dispatch. This sometimes allows producing very short, easy to understand code, without having to resort to workarounds. The circle-ellipse problem is solved really easily here. (Note: You can argue that CLOS is in truth a statically typed system, and this is partly true -- the method's class(es) need to be specified statically, but the rest of arguments can be dynamic.)

5. The macro system reduces boilerplate code to exactly zero. And also allows you to reduce the length of your code, or have very explicit (clear to read) code at the high-level. This brings down the level of complexity of your code, and thus makes it easier to manage. It also reduces the need for conventional refactoring, since macros can do much more powerful, automatic, transformations to the existing code.

6. The type system is extensive -- I am not forced to convert 10/3 to 3.333333, because 10/3 can stay as 10/3 (a rational data type). A function that in some cases should return a complex number will then return the complex number, if that should be the answer, rather than causing an error or (worse) truncating the result to a real. Arbitrary-length numbers are supported, so numbers usually do not overflow or get truncated. (Factorial of 100) / (factorial of 99) gives me the correct result (100), instead of overflowing or losing precision (and thus giving a wrong result).

So you feel safe, because the system will assign your data the data type that suits it the best, and afterwards will not try to attempt any further conversion.

7. The type system is very flexible. For example, I can (optionally) specify that a function's input type shall be an integer between 1 and 10, and the runtime will enforce this restriction.

8. There is an extensive, solid namespace system, so functions and classes are located precisely and names don't conflict with other packages. Symbols can be exported explicitly. This makes managing big codebases much easier, because the boundary between local (private) code and "code that gets used outside" can be made explicit and enforced.

9. Namespaces for functions and variables (and keywords, and classes) are separate, so they don't clash. Naming things is thus easier; this makes programming a bit more comfortable and code easier to read.

10. Documentation is built into the system - function and class documentation is part of the language standard.

11. Development is interactive. The runtime and compiler are a "living" thing in constant interaction with the user. This allows, for example, the compiler to immediately tell you where the definition of function "x" is, or which code is using such a function. Errors are very explicit and descriptive. Functions are compiled immediately after definition, and they can also be disassembled (to machine language) with just a command.

12. Closures are available. And functions are first-class citizens.

13. Recursion can be used without too many worries -- most implementations allow tail call optimizations.

14. The language can be extended easily; the reader can also be extended if necessary, so new syntaxes can be introduced if you like. They then need to be explicitly enabled, of course.

15. There is a clear distinction between "read time", "compile time" and "run time", and you can write code that executes on any of those three times, as you need.

16. Function signatures can be expressed in many ways, including named parameters (which Python also has, and which IMO are a great way to avoid bugs from wrong parameters / wrong passing order).


> 3. The error handling system is exemplary: Not only designed to "catch" errors, but also to apply a potential correction and try running the function again. This is known as "condition and restarts", and sadly is not present in many programming languages.

I found this extremely weird and (or hence) interesting at the same time. Where can I read more about this?


You are welcome, sir! How about this chapter of the famous book "Practical Common Lisp", available online for free?

http://www.gigamonkeys.com/book/beyond-exception-handling-co...


Googling for "common lisp condition system" turns stuff up. I looked at one point, and I don't remember finding an academic treatment of them.

They're (hand wave, hand wave) basically a matter of stuffing the current continuation inside the exception whenever you throw one, and then making use of that to provide more options whenever the exception is caught.


Languages like TypeScript and Racket are also interesting because they let you shift camps partway through the project - you can start off untyped and then add type annotations.

https://en.wikipedia.org/wiki/Gradual_typing


Perhaps it's because the languages you mentioned barely register in statistics (aside from Clojure, maybe).

Maybe it's simply hard to have both a good type system and a friendly learning curve?


The other possibility is that the industry suffers from anti-intellectualism, so we keep reinventing the same 2 languages.


Do you believe that this is plausible?


I think this has happened for a long time. First (70s), features from Algol-68, like structured programming or better flexibility for data types, were ported to other languages. Then (80s-present), the features from Smalltalk and Lisp started slowly to be ported to other languages, many times in an incomplete or unelegant way.

We're still doing this; for example, the latest production spec of the Java language finally incorporated a mechanism to pass functions as input parameters to a method. And the next version of Java (9) will attempt to have some interactivity, with a kind of REPL. This, coupled with the powerful facilities of good Java IDEs, will give Java developers of 2017 the level of interactivity and ease of development that Smalltalk and Lisp users have enjoyed since the late 70s. Sad but true.

Julia -an interesting language, by the way- borrows multiple dispatch from the Common Lisp Object System (CLOS), among other features. CLOS itself was a further evolution of the OOP brought to the table by Smalltalk, invented by a true genius: Alan Kay.

Rust is basically a "fixed C++", that is, a more usable, less annoying C++.

So it's difficult to say there are truly new things in programming languages. But not everything is limited to Smalltalk and Lisp -- Prolog, ML (and OCaml, F# and Haskell) do bring new concepts to the table, and are worth checking out.


You managed to pick the two things where Rust is actually not improving on C++, because it's both more annoying and less usable compared to C++.

There's a reason the expression "fighting the borrow checker" was coined.

You're confusing your own biases and preferences for facts.


I find this whole "fighting the borrow checker" thing a tad inflated. I personally don't "fight" it anymore, because it's a simple rule to anyone who's familiar with pointer arithmetic.

Also, the compiler usually tells you what exactly you screwed up this time and how to get out of the mess, which cannot be said about C++.


In the absence of a garbage collector, what people don't get is that it's really easy to screw up by creating race conditions or memory leaks.

If fighting the borrow checker is annoying, that's because you don't get memory safety otherwise.

The vast majority of vulnerabilities in the wild are created because of sloppy usage of C / C++, which is basically unavoidable in the absence of expensive static analyzers that become as annoying as Rust while not being as good.


>friendly learning curve

To be honest, I love Common Lisp - it might be the most powerful programming language out there - but it's not easy to learn at all. In part because, being a truly multi-paradigm language, you'd better make sure you are well versed in most programming paradigms first; otherwise you won't leverage the full power of Lisp. Not to mention the paradigm of meta-programming and DSLs, something that is usually new to programmers foreign to Lisp.

However, languages like Clojure and Smalltalk can be rather easy to learn, and they are fairly powerful.

Smalltalk was designed to be taught to kids!!


>Not to mention the paradigm of meta-programming and DSLs, something that is usually new to programmers foreign to Lisp.

Is it really? What about templates (as in C++ templates), macros, CSS and HTML? These are two examples of metaprogramming and two DSLs, respectively.

> However, languages like Clojure and Smalltalk can be rather easy to learn, and they are fairly powerful.

It's not like I don't believe you, but if this is true, then where's the popularity? Why isn't it there? I'm asking because I genuinely don't know.

EDIT: punctuation.


>Is it really? What about templates(as in C++ templates), macros, CSS and HTML?

"Lisp macros" go far, far beyond "C macros" ("preprocessor macros", and indeed go far beyond what you can do with C++ templates. You should take a look, but basically, explained in a few words:

In Lisp, code is data. Code is a first-class citizen. The functions and constructs that are there to manipulate data also manipulate source code with the same ease. So your code can manipulate code very, very easily. Writing code that creates code "on the fly" - be it at compile time or at runtime - is not only possible, it is also very easy to do, and it is 95% similar to writing regular code.

Thus, Lisp is sometimes described as "the programmable programming language."

>It's not like I don't believe you, but if this is true, then where's the popularity?

"Programming is pop-culture" -- Alan Kay.

The reasons a programming language gets highly popular are not always related to its quality. There are also other reasons. Consider Javascript, for example. Before the ES6 specification, it was plainly a horrible programming language, full of pitfalls and missing features. You couldn't even be sure of the scope of the variable you just declared!! But it became popular, simply because it was the only programming language usable on all web browsers.

C, for example, was never a great programming language. But it ran efficiently on any hardware, so it started as a (very good) alternative to assembler. And then got more traction.

Then object-oriented programming got popular, because it allowed you to do nice stuff (on the Smalltalk language, where it was very well implemented). So somebody said: OK, I want C with object orientation; and C++ was invented, which wasn't a very good object-oriented language, but since C was popular, and OOP was the next big thing, it got wildly popular.

C and C++ require you to manually manage memory, unlike Smalltalk or Lisp, where memory is automatically managed. So somebody at Sun said "OK, let's make a language with syntax similar to C++, but with automatic memory management", and Java was born; due to the small learning curve, and a LOT of marketing, it became wildly popular, although many of the problems of C++ were present, plus it introduced limitations of its own. (I, as a student, loved Java when I learnt it, after having had to use C++. How naive I was!!)

And the story goes on and on.

So it's more about riding the wave of popularity, rather than using the best tool for the job. It also has something to do with the triumph of UNIX over other operating systems. Otherwise, Smalltalk [what the groundbreaking Xerox machines used] and Lisp [what the groundbreaking Lisp Machines, and also the Xerox machines, used] would be way more popular.

It also has something to do with speed -- Lisp (in the 60s) used to be a very slow language. Smalltalk (in the 70s and early 80s) used to run very slow as well. They also required a huge amount of memory. Nowadays they are not really memory hungry, and they can run very fast.

Some problems are much easier to express in Prolog, or Haskell, than Java or C++ or javascript; but they aren't popular languages. Popularity sometimes is harmful...


JavaScript was very nice before ES6. All ES6 did was add syntactic sugar and meta-features. var in JavaScript is scoped to the function it is declared in. ES6 made the language way more complicated, and divided the community up into more dialects. The plan was to unite compile-to-JavaScript communities like CoffeeScript, but that didn't work, because there are more compile-to-JavaScript languages now than there have ever been.


The learning curve for Julia is very easy.


> Ruby

Ruby has some of the best meta-programming facilities out there. Yes, you can't manipulate syntax in the same way as in Lisp, but the fact that all methods are message passing, plus first-class blocks, makes tons of very powerful meta-programming possible. Basic features that otherwise look first-class are built on Ruby's meta-programming facilities, like `attr_reader` and friends. Funnily enough, the meta-programming facilities of Ruby are precisely what turns a lot of people off. The WTFs per minute of using something like ActiveRecord is super high for people with only passing familiarity, because there's so much that's defined through Ruby's meta-programming facilities.


I say this as someone that has used and loved Ruby for more than ten years now:

Smalltalk has worlds better dynamic and metaprogramming.

That said, Ruby does have a lot of power, but it's not of the same order as Self/Smalltalk etc.


>Ruby has some of the best meta-programming facilities out there.

Ruby has meta-programming facilities, but they pale compared to the ease of doing meta-programming in Common Lisp. In Ruby, meta-programming is an advanced topic (see for example the implementation of the RoR ActiveRecord). In Lisp, meta-programming is your everyday bread and butter, and one of the first things a beginner learns. Because it isn't too different from regular programming!!

[The same comment applies, mostly, also to Clojure, Racket, Scheme, and the other Lisps]


So where do Java & C# fit in those categories?


Probably Camp A. They have some static typing, but it's pretty basic. Then again, according to the article posted, they may be in the "sweet spot" for many purposes.


Where's Rust in here?


Camp C


How are the dynamic typing facilities in Smalltalk better than those in Ruby?


What makes you put Python and JS in the same basket here? Python is widely praised as a language that has a good dynamic typing system, with strong types and good type error handling - arguably better than Lisps (nil punning). JS is infamous for the opposite. Macros are quite orthogonal to this.


> Macros are quite orthogonal to this.

You ain't gonna find any sane way to combine macros with a powerful type system in a way that doesn't make a 140+ IQ a requirement for any programmer touching the code using these features in a real-world project...

The problem with programming language design is that the ideal/Nirvana solutions lie at the edge of, or beyond, the limits of human intellect. If you want something that can be learnt and understood with reasonable effort (as in not making "5+ years of experience" a requirement for even basic productivity on an advanced codebase), you're going to have to compromise heavily! The most obvious ways to compromise are throwing away unlimited abstraction freedom (aka "macros"), or type systems.

Sorry to break it to ya, but we're merely humans, and not that smart...


There is a programming language property called Restrictability - it means that you only need to know a subset of the features the language provides to become productive. The best languages have high restrictability without compromising on the high level features like powerful macros.

The point of having macros is that they allow you to solve problems that cannot be solved elegantly in any other way. But 95% of programmers don't need to solve such problems and can do very well without using macros.


> Restrictability - it means that you only need to know a subset of the features the language provides to become productive

Thanks, but.... NO THANKS! It's basically what makes languages like Scala or C++ horrible - a false sense of "you only need to know this subset of the language" and then you see that in real life: (1) nobody agrees on what that subset is and (2) you are going to have to hack your way through the most advanced frameworks and libraries (written by folk way smarter than you) and you are going to need to do it under unreasonable time pressure!

If a feature exists in the language you will be forced to understand it and become proficient at using it, whether you like it or not. Otherwise you're a "play pen programmer", only comfortable in his little patch of expertise.

I'm personally an "Expert Generalist" and like to be confident I can hack my way through anything this shitty life throws at me ;) This is kind of why I'm starting to love forcefully minimalistic, abstraction-wise-rigid, and intentionally "retarded" languages like Go nowadays :) (But yeah, when dynamic is the way to go, I'd prefer a Lisp with macros any time - one extreme or another, never the middle way, I'm not smart enough for it.)


> it means that you only need to know a subset of the features the language provides to become productive.

That only works when you work by yourself (or in a small team to whom you can dictate the language subset), and without any third party code.

> But 95% of programmers don't need to solve such problems and can do very well without using macros.

Languages that have great macro systems use them to bootstrap themselves. So when you use the standard, documented features, you're using macros.

E.g. if you're writing in Lisp and your file begins with (defun ..., you've just used a macro.


Sure, you can use them as a consumer all the time, but that doesn't mean that you need to write your own to program in Lisps, for instance. For the few that do need it, it's a worthy tool to have.


Rust has macros that seem well liked, and everyday stdlib constructs are implemented using them. Though I can easily believe that the average Rust macro is authored by a smart person.


>What makes you put Python and JS in the same basket here?

I'm very well versed in Python (I've delivered two financial software systems done in Python, written entirely by yours truly). However, its features and facilities pale in comparison to the languages I listed in camp "D".


There's one huge benefit to static typing people often forget: self documentation.

While, yes, top-quality dynamic code will have documentation and test cases to make up for this deficiency, it's often still not good enough for me to get my answer without spelunking the source or StackOverflow.

I feel like I learned this the hard way over the years after having to deal with my own code. Without types, I spend nearly twice as long to familiarize myself with whatever atrocity I committed.


Many dynamically typed languages offer excellent runtime contract systems (Racket, Clojure) that serve as implicit documentation at least as well as a statically typed language's types do. Often more so, because you can express a lot of things in contracts that are not easily expressed in type systems.


> because you can express a lot of things in contracts that are not easily expressed in type systems.

Can you give an example (or a few) of this?


You can put arbitrary functions in a contract. With static typing, that requires dependent types - and while I'm a fan, that's an enormous can of complexity to bust open.

Say you've got a function that takes a list of numbers and some bounds, and gives you back a number from the list that is within the bounds (and maybe meets other criteria, whatever). Your contract for the function could require not only that the list consist of numbers and that the bounds be numeric, but also that the lower bound is <= the upper bound, and that the return value was actually present in the input list.
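Not Racket, but the shape of that contract can be sketched as plain runtime checks (Haskell here, with made-up names, and without the blame tracking a real contract system gives you):

    -- A hypothetical bounded-pick function with its "contract" checked
    -- at runtime using arbitrary predicates, not just type shapes.
    pickInBounds :: [Double] -> Double -> Double -> Double
    pickInBounds xs lo hi
      | lo > hi   = error "contract violation: lower bound exceeds upper"
      | otherwise = case [x | x <- xs, lo <= x, x <= hi] of
          (r:_) -> r   -- within bounds and drawn from xs, by construction
          []    -> error "contract violation: no element within bounds"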


I consider reading the source code to see what something does a feature - if you can understand the code, that is. If the code is easy to understand, there will be fewer bugs.


Having programmed in languages ranging from Ruby to Coq, for web apps and games, I feel the sweet spot is somewhere in the neighborhood of Java/C#, i.e. include generics but maybe leave out stuff like higher kinds and super-advanced type inference (and null!).

The main use case of generics, making collections and datastructures convenient and readable, is more than enough to justify the feature in my view, since virtually all code deals with various kinds of "collections" almost all of the time. It's a very good place to spend a language's "complexity budget".

I wrote an appreciable amount of Go recently, with advice and reviews from several experienced Go users, and the experience pretty much cemented this view for me. An awful lot of energy was wasted memorizing various tricks and conventions to make do with loops, slices and maps where in other languages you'd just call a generic method. Simple concurrency patterns like a worker pool or a parallel map required many lines of error-prone channel boilerplate.
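For contrast, in a language with generics the parallel map really is a one-liner. A sketch in Haskell, assuming the widely used async package (fetchOne is a made-up stand-in for the real per-item work):

    import Control.Concurrent.Async (mapConcurrently)

    -- A parallel map over any Traversable: run the action for every
    -- element concurrently and collect the results in order. No
    -- channels, wait groups, or worker-pool boilerplate.
    parallelFetch :: [String] -> IO [String]
    parallelFetch = mapConcurrently fetchOne
      where
        fetchOne url = return ("fetched " ++ url)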


> An awful lot of energy was wasted memorizing various tricks and conventions to make do with loops, slices and maps where in other languages you'd just call a generic method.

I feel the same way going from languages with HKTs back to Java/C#...

Not sure why you think they're not as useful, it sounds like you're making the same argument as OP but just moving the bar one notch over...


I am. I think the OP is fundamentally right about the sweet spot being pretty far from either extreme, I just disagree slightly about where exactly :)

Subjectively, I use ordinary generics all the time, but see the need for HKTs only occasionally. It's entirely possible I'm not experienced enough to see most of their possible use cases, but then I'd wager most programmers aren't.


In retrospect, HKTs are arguably Haskell's greatest innovation, enabling extremely general abstractions and huge amounts of code reuse.
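A small illustration of the kind of reuse meant here - one function, written once against the Applicative interface, works for options, lists, IO actions, parsers, and so on (a sketch; `both` is a made-up name for what the standard library spells liftA2 (,)):

    -- Written once, for any Applicative f. Abstracting over the type
    -- constructor f itself is exactly what HKTs make possible.
    both :: Applicative f => f a -> f b -> f (a, b)
    both fa fb = (,) <$> fa <*> fb

    -- both (Just 1) (Just 2)  == Just (1, 2)
    -- both [1, 2] [10, 20]    == [(1,10),(1,20),(2,10),(2,20)]
    -- both getLine getLine    :: IO (String, String)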


In my subjective opinion, Haskell has taken abstraction way past the point of diminishing returns, at least for the problems I tend to work on.

A large portion of advanced Haskell type system features seem to be about emulating things you could do with side-effects. I guess I prefer Rust's approach to managing side-effects, or even just Scala's implied convention of: use 'var' very sparingly, and mostly locally. Yes, some guarantees get traded away, but so much simplicity is gained.

I'm not very experienced with Haskell, but I've written a fair bit of Scala and I've utterly failed to see the value in scalaz and similar libraries, despite trying them a few times. They always seem to add lots of complexity without a tangible benefit.

Coming at it from another angle, I just don't see many cases where I feel I have to repeat myself due to a shortcoming of, say, Java's or C#'s type system. If I could add one feature to either, it'd actually be support for variadic type parameters.


As a counterexample, C# needed expensive language extensions to accommodate both LINQ and Async/await. Both can be implemented in Haskell purely as a library, thanks to HKTs.

Both Java and C# tend to rely heavily on frameworks such as Spring to work around issues with the expressivity of the languages. This causes problems when one needs two frameworks (they don't, in general, compose). In Haskell, HKTs allow one to write polymorphic programs that are parametric with respect to certain behaviours and dependencies - no dependency injection framework needed.

Please don't judge Haskell using Scala and scalaz.


I'm not sure what Java expressivity problem Spring is meant to solve. XML configs are basically just a duplication of what would be done in a static initializer, except you lose type-checking and get to find your wiring mistakes at startup time instead of compile time. Autowiring annotations can be nice when you first use them, but become inscrutable magic once some other poor sap has to come along and make changes to the original project setup.

I just don't understand what is so horrible and inexpressive about a static initialization block.

The only possible purpose I see to Spring is if for some reason you really need to be able to change how your dependencies are injected at runtime. (90% of Spring apologists point to this, and 99% of them never use it in practice.) Even then, I don't see how a Spring XML config file (which I have seen run to 4000+ lines, to my horror) is better than just reading some settings out of a properties file to pick an implementation in your static initializer.


I guess passing parameters down manually through all the constructors gets too painful. The language is not expressive enough for a Reader monad! :)

Java's static initialiser blocks are too dangerous whenever one has threads.


Not sure about LINQ; I thought that was "just" syntactic sugar for a bunch of collection methods. Are you referring to extension methods as an unfortunate prerequisite?

But I think I get your general point: things like 'Control.Concurrent.Async' ('async'/'await') and 'Control.Monad.Coroutine' ('yield') are libraries that implement some very generic type classes: 'Functor', 'Applicative', 'Monad'. This then lets you use features that are generic over those type classes ('do' syntax, 'fmap', ...).

It's been many years since I had a proper look at Haskell. Maybe it just takes more practice than I had back then to fully "get it". But I still don't see those abstractions being that useful in everyday programming. They seem to have huge potential for hard to follow code as you need to mentally unpack and remember more layers of abstraction, and the gain is not clear to me. Even the features that have trickled down to C# are not _that_ crucial I feel. The way mainstream languages pick the most useful use cases of those abstractions seems pretty OK to me.

(Also, macros and compiler plugins are another interesting avenue towards very powerful abstractions, with a different set of problems.)

As for Spring and dependency injection, I don't follow how HKTs would help there. Could you give an example? Aren't DI frameworks mostly about looking things up with reflection magic to automate, and arguably just obfuscate, the task of wiring things up in 'main'?


>They seem to have huge potential for hard to follow code as you need to mentally unpack and remember more layers of abstraction

That's the beauty of abstraction without side effects, you don't need to unpack anything. If you know what the inputs are and the outputs are, you don't need to know how it works or what type classes are even used to transform certain things.

People use `sequence` all the time in Scala, not realizing it's only able to be implemented with HKTs of Applicative and Traverse. FYI, sequence flips a list of Futures to a Future of List, or a vector of Trys to a Try of Vector, etc.
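In Haskell the HKT is visible right in the signature - a minimal sketch:

    -- sequenceA :: (Traversable t, Applicative f) => t (f a) -> f (t a)
    -- One function performs all the flips above: [Maybe a] to Maybe [a],
    -- [IO a] to IO [a], and so on. Both t and f are higher-kinded.
    allOrNothing :: [Maybe a] -> Maybe [a]
    allOrNothing = sequenceA

    -- allOrNothing [Just 1, Just 2]  == Just [1, 2]
    -- allOrNothing [Just 1, Nothing] == Nothing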


Fair point about 'sequence'. There are probably a bunch of these I use regularly in Scala without realizing it. Though as a counterpoint, 'Future.sequence' wouldn't really lose _that_ much if it didn't return a collection of the same type. And I haven't yet felt the need for a generic `sequence`, which I'm sure scalaz has.

I don't buy your point about not needing to unpack side-effectless code, however. There are _always_ reasons to dig into code, be it bugs, surprising edge cases, poor documentation, insufficient performance, or even just curiosity. And those high-level abstractions tend to be visible in module interfaces too. I remember some Haskell libraries being very hard to figure out how to use if you didn't know your category theory :)


It's a pretty typical symptom I've seen a lot of hardcore FP developers exhibit: they forget how much time it took them to reach their level of mastery.

It's like spending ten years learning to speak Russian and then criticizing anyone who says that learning Russian is difficult.

Puzzling out scalaz code is difficult and requires an enormous investment in hours and practice, investment that a lot of people prefer to put into different learnings.


Mainstream developers forget just how much time they invest in learning the latest fad frameworks with new ad-hoc concepts and terminology. I guess "Hardcore FP developers" are fed up with this state of affairs and are looking towards mathematics to provide guidance and common patterns/names. At least any knowledge of mathematics will not become outdated!


Yea, puzzling out some scalaz code takes investment. On the other hand, the library is used for web apps, network servers, database based applications, streaming libraries etc.

It's incredibly multipurpose, more so than even Spring or Guava or LINQ, and these are things that developers regularly have to invest serious time in.

The argument is just that FP libraries (like Scalaz) have a bigger payoff in the investment.

At Verizon Labs we have 20+ microservices that I have touched/looked at. Some use Akka, some use Play, some use Jetty, some use Http4s, but every one makes use of Scalaz somehow.


> The argument is just that FP libraries (like Scalaz) have a bigger payoff in the investment.

It depends on the people, not everybody has the inclination to dive so deep into hard core FP and they will be more productive using a different approach.

Don't make the mistake of thinking you've found the only software silver bullet that exists and that people who don't use it "don't get it", which is another attitude I've seen a lot of hardcore FP advocates embrace.


Just like async/await, LINQ (and even enumerators) are tied to special syntax in the C# language. HKTs allow Haskell to provide very general resuable syntax, such as do notation.

What I meant by "polymorphic programs" as an alternate to DI, is something like this:

doStuff :: HasLogger m => Input -> m Output

The effectful function "doStuff" above is polymorphic with respect to which logging implementation is used, it could even be one that uses IO. All made possible with HKTs.
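To flesh that out, one plausible shape for the hypothetical HasLogger class (the class name comes from the signature above; the rest is assumption):

    -- A minimal sketch: the capability is a type class...
    class Monad m => HasLogger m where
      logMsg :: String -> m ()

    -- ...the production instance logs via IO...
    instance HasLogger IO where
      logMsg = putStrLn

    -- ...and business code names only the capability, never a concrete
    -- logger, so a test can swap in a pure instance that collects
    -- messages in memory.
    doubleIt :: HasLogger m => Int -> m Int
    doubleIt input = do
      logMsg ("input: " ++ show input)
      return (input * 2)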


Ok, the point about special syntax is fair, but as I said, I'm happy with the use cases that have trickled down to mainstream, and I'd argue there aren't _that_ many truly useful ones. I realize this is very analogous to how the Go programmer is somehow happy with the few generic collections they are granted :)

Your DI example seems to be an example of my earlier point about "emulating things you could do with side-effects". No HKTs are needed when you just pass an impure side-effectful Logger object. Or, as discussed in another subthread, you could do side-effect management with Rust-style uniqueness typing, which results in a less elegant but arguably easier to use type system. It's debatable, but it seems people struggle less with the borrow checker than with advanced Haskell.


Looks like we reached maximum thread depth so replying here.

I agree, the side-effectful choices are either a global, some DI container, or just passing it down.

For loggers, I think global lookup from some (pluggable) logging library is justified because logging is probably the most ubiquitous cross-cutting concern ever. For pretty much everything else, I think passing as a parameter is actually the best option. It's explicit and simple, and you don't even need to explicitly pass it around _that_ much if you store it in a field of a class that plays the role of a module. Most uses of the dependency will be in non-static methods, lambdas, or inner classes.

I dislike Reader because it's similar to a DI container (or a global) in that it's more work to figure out, for a given call site, what the last value written to it was. With parameters, you just climb the call chain.


I don't see how passing in a logger object explicitly is the same, this is what OO DI frameworks try to avoid, otherwise you'd also have to pass it down to other functions used inside. The example above works just like a Reader Monad, but we are not tied to any specific logging implementation.

I guess the side-effecting version is just to use some global registry to look up the logging implementation to use. But such code does not compose.


Rust doesn't have any way to manage side effects in types...


Don't mutable and immutable references with lifetimes count? Sure, one could argue whether the borrow checker is really part of the type system, but it's a compile-time check either way.

Yes, in the standard library, an immutable object can hide mutable state in e.g. a 'Mutex', and effects to the external system aren't wired through anything like monads or unique objects.

I see those as compromises Rust makes in the name of pragmatism and being a system'ey language. I don't necessarily like all of them, but I find the general uniqueness typing based approach interesting.

See articles comparing Clean and Haskell for an interesting historical perspective, including how both approaches could be used to model side-effects in a purely functional language. Haskell "won", possibly because it was seen as more generic and composable. I always felt Clean's approach had merit too, so I was really glad to see Rust bring the idea, or a closely related idea, to prominence.


Right, but you are addressing only part of the side-effects story. IO is another story, which Rust doesn't address.


The standard library doesn't, and most crates don't, but I'm pretty sure nothing prevents you from writing libraries in a style where all IO requires mutable access to some explicit unique "World" object, similar to Clean.

Passing a unique world object around is effectively the same as composing with the IO monad, and borrowing 'f(&mut world)' is basically equivalent to 'let world = f(world)'.
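
For a pure-language analogue of that claim, a toy sketch (the World type and its operations are invented, modelling stdin/stdout as lists):

  data World = World { stdinLines :: [String], stdoutLines :: [String] }

  readLine :: World -> (String, World)
  readLine w = (head (stdinLines w), w { stdinLines = tail (stdinLines w) })

  putLine :: String -> World -> World
  putLine s w = w { stdoutLines = stdoutLines w ++ [s] }

  -- Threading the world value by hand sequences the effects,
  -- just as the IO monad does behind the scenes.
  echo :: World -> World
  echo w0 = let (s, w1) = readLine w0 in putLine s w1

Clean used uniqueness types to enforce that each World value is consumed exactly once; Rust's ownership and &mut borrows give a closely related guarantee.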

Maybe someone will one day write a standard library in that style.


It is not going to be convenient, because Rust doesn't have higher kinds. I see people making this argument in other language contexts, e.g. OCaml, but those languages have no typeclasses, which makes writing monadic-style code extremely inconvenient.


> I think the OP is fundamentally right about the sweet spot being pretty far from either extreme, I just disagree slightly about where exactly :)

And I think an important take-away is that this perception is entirely subjective, colored by both of our experiences, preferences, and the kinds of problems we work on :)


But dynamically typed languages give you generic collections and data structures for free. Why would you need static types at all?


They emphatically do not. Ignoring types doesn't give you a type system "for free"; much the same way that building a shelf doesn't make you a librarian.


Dynamically typed systems don't "ignore types", they just handle them at runtime.


I just don't buy that Go is some sort of sweet spot because it doesn't have generics. Generics pretty much exist for maps and slices because they are needed in real programs; the language designers just don't let you make your own generic collections.


Yeah, far from finding a sweet spot, Go exists in a kind of type-system ghetto, because its type system is so crippled that users have to resort to code generation (go generate).

Neither Python nor Java programmers have to do that.


Yeah, Go is weird in that its static type system doesn't provide you with great static typing power; instead it's just there as a sort of sanity checker. If there's logic, they say, write it with data structures and functions. Have invariants? Enforce them yourself.

If Go is annoying in how little power it provides, that's fair, but other type systems can be just as annoying, because when given the ability, type astronauts will blast off into space, purely as a matter of honor or instinct.

Besides, code generation isn't all that bad. Java programmers will eventually find some kind of code generation in their build setup (serialization/schema tools).


There's nothing wrong with users independently choosing to use code generation. However, when a programming language starts to rely on it, that becomes a major problem.

We've been here before with the C preprocessor. There's nothing wrong with having a preprocessor, but in C it is necessary to use the preprocessor and that causes a lot of problems, like making it especially difficult to write tools.


Yeah, I've noticed that Go APIs are very stringly typed. The APIs are not very self-documenting, it is hard to figure out whether something is nullable or not, and libraries often require you to initialise data in a partially invalid state. The whole thing feels quite error-prone and flaky.


> Besides, code generation isn't all that bad.

It is the number one thing that makes C++ templates unusable: semantics defined by means of code generation.


The fact that maps/slices/channels already exist generically is what puts Go into the sweet spot. You have generic containers for the vast majority of use-cases, so the value-added consideration of being able to cover more use-cases with generic containers becomes a lot smaller.

In a hypothetical world where the designers never added the specific containers they did, you'd get a whole lot more value out of generics for containers. But it turns out, the designers used what seems on the surface like a kludge to get most of the benefits, while saving most of the cost. It's a perfect embodiment of the kinds of tradeoffs I'm talking about.


You have containers for all the use cases the designers thought of, but for every other use case you have it worse than Python, stuck doing code generation or type erasure. It is impractical to expect Go's designers to have foreseen the best trade-off for every codebase.

Architecture astronauting can be prevented with best practices and code review, not with language limitations. It's a fool's errand to try; code generation gives you all the complexity of generics and more.


> It is impractical to expect Go's designers to have foreseen the best trade-off for every codebase.

Which is not the argument made by anyone. Indeed, I explicitly acknowledge that there is a certain fraction of use-cases not covered by the builtins. So there isn't really any disagreement about this.

The question is how large this fraction is, how much it would benefit from generics, and how inconvenient/costly the existing workarounds are. Like all engineering questions, these are impossible to talk about when dealing in absolutes. And once you actually talk about them quantitatively, I have achieved the goal I had with the post: to change the debate into a quantitative one that explicitly acknowledges the tradeoffs involved.

> Architecture astronauting can be prevented with best practices and code review, not with language limitations.

I work at a company with probably some of the highest code-review standards in the industry. Even so, I disagree that code review is effective at addressing this.

> It's a fool's errand to try; code generation gives you all the complexity of generics and more.

If that's the case, where do the complaints about the lack of generics come from? It seems that Go really does have generics then, in your opinion?

Of course, that's a strawman and a misrepresentation of your argument. But what makes it a strawman, the difference between the existing workarounds and actual generics, is just as effective an argument for my side as it is for yours. Because codegen is so inconvenient, people bias heavily towards the builtins and away from custom data structures whenever they can get away with it, greatly reducing the overall complexity of the codebase.

So it would seem to me that this argument is logically flawed. Either codegen is a poor replacement, which leads people to write less generic code, so there is an effective reduction in complexity. Or codegen has the same effect on complexity, which would mean it is used just as much, meaning it can't be that bad a workaround.


Code generation achieves the same effect with more work, and without a standard abstraction in the language it takes more effort to understand. Using general-purpose primitives that are blessed to be generic similarly takes more work to understand when they expose too many underlying implementation details. A nice wrapper class would work much better.

I don't agree that using piles of built-in objects makes the code easier to understand. If I want a Tree<Node, Value>, how do lists of lists and integer pairs make my code easier to reason about? Or using code generation to stamp out reams of classes, one Tree for everything I want, and for everyone who uses my functions? How is encouraging either of those things a positive?
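
For contrast, here is roughly what the write-once alternative looks like in a language with parametric generics (a Haskell sketch): one definition, type-checked once, usable at any key and value types.

  data Tree k v = Leaf | Node (Tree k v) k v (Tree k v)

  insert :: Ord k => k -> v -> Tree k v -> Tree k v
  insert k v Leaf = Node Leaf k v Leaf
  insert k v (Node l k' v' r)
    | k < k'    = Node (insert k v l) k' v' r
    | k > k'    = Node l k' v' (insert k v r)
    | otherwise = Node l k v r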


In this thread: people will bring out the same tired arguments for or against static typing, without commenting on the actual content of the post, which was quite good!

I have come to see that type systems, like many pieces of computer science, can be viewed either as a math/research problem (in which generally more types = better) or as an engineering challenge, in which you're more concerned with understanding and balancing tradeoffs (bugs / velocity / ease of use / etc., as described in the post). These two mindsets are at odds and generally talk past each other, because they don't fundamentally agree on which values are more important (like the great startups vs. NASA example at the end).


I think this post was extremely hand-wavy. It restated the divide that is already well known, but didn't actually make any argument for why Go (or whatever) lies at a particular point on the curve, because it assumes that the way you program at different points on the curve is roughly the same, just with more type boilerplate. Higher-kinded types offer entirely new ways to program, and things like optional typing in Python make it all much more complex than just "how long do I spend writing and reading type declarations". I was left with the impression that the author is content with Go, and that's pretty much it.


I agree. The graph of static checking vs. lines of code should really be factored into: static checking vs. the amount of annotations needed to achieve that level; annotations you write vs. how much that slows you down; and annotations already written (in your own code or in the libraries you use) vs. how much that speeds you up. And those will vary wildly depending on both the language and the programmer.


It has been interesting to see the to-and-fro of arguments for and against static typing in the discussions here.

Though I am not a type theorist (I only dabble in compilers and language design), I have noticed that many people conflate static and dynamic typing with other, unrelated ideas.

Static typing has certain benefits but also certain disadvantages; the same is true of dynamic typing.

What I find interesting is that few people land in the soft typing camp: using static typing where it is applicable and advantageous, and dynamic typing where it is.

Static typing has a tendency in many languages to explode the amount of code required to get anything done; dynamic typing has a tendency to produce somewhat brittle code whose failures are only discovered at runtime. The implementation of static typing in many languages requires extensive type annotation, which can be problematic.

But what most forget is that static typing is, from the compiler's point of view, a dynamic runtime typing situation, even when the compiler itself is written in a statically typed language.

Instead of falling into either camp, we need to develop languages that give us the best of both worlds. Many of the features people here have attributed to static typing have rightly been pointed out to be features of the editors being used, not of the static typing regime itself.

Many years ago a similar discussion was held on Lambda the Ultimate, and the sensible heads came to the conclusion that soft typing was the best goal to head for. Yet in the intervening years, watching language design aficionados at work, they head towards full static typing or full dynamic typing and rarely in the direction of soft typing (taking advantage of both worlds).

So, the upshot: this discussion will continue to repeat itself for the foreseeable future, and there will continue to NOT be a meeting of minds on the subject.


Maybe part of the problem is that I can't picture what you're actually talking about with soft typing. I can tell you C#/.NET has the DLR, which allows you to use dynamic types whenever you want. Outside of a few gimmicks, you rarely see them used. I've rarely even seen them for quick prototyping: you mess around with them for prototyping, then the first time they go bad it's really obnoxious, and you realize you're compiling the code and writing function signatures anyway, so you might as well save the time later and do it right the first time.

Then there's the whole tooling aspect of trying to mix type systems. They are different lifestyles: dynamic programmers aren't going to start compiling their code to run it, and static programmers aren't going to switch to a language with weaker IDE-style tooling, most of which is built on the type system.

My conclusion is this: New languages should all be statically typed, because we shouldn't need new languages at all. We should be fine. The reason we need new languages at all, is because the trifecta of C++/Java/C# basically encompassed the entire statically typed world, but they're all infected with this fully overblown OOP obsession, and the null pointer bug--which newer languages have fixed, through more static typing. Basically we need to replace those languages with similar ones and then just stop making languages for a few decades, until whatever we're doing now looks as dumb as OOP and null pointers. In the long run, Go/Swift/Kotlin/Rust will take over the statically typed world and it's going to be great.


Soft typing could be characterised as having the compiler do static type analysis where it can, but leaving type checking to the runtime where it can't.

A simple example of this is a list. In statically typed languages, lists are homogeneous (this includes type unions). In dynamically typed languages, lists can be heterogeneous; essentially anything can be added at runtime.

In soft typing, we can indicate that a list is homogeneous and the compiler will ensure that this is true, or we can specify no static checking and it will be done at runtime.
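
For flavour, Haskell's Data.Dynamic approximates the runtime-checked mode from inside a static language, though having to wrap every element is exactly the kind of friction a soft-typed language would remove (a sketch):

  import Data.Dynamic (Dynamic, toDyn, fromDynamic)
  import Data.Maybe (mapMaybe)

  mixed :: [Dynamic]
  mixed = [toDyn (1 :: Int), toDyn "two", toDyn (3.0 :: Double)]

  -- Recover just the Ints; everything else fails the runtime type check.
  ints :: [Int]
  ints = mapMaybe fromDynamic mixed   -- == [1]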

Contrived, yes, but I regularly use other aggregates (tables and sets) that I do not want to be homogeneous.

One of the aspects that I like about functional languages is the polymorphism available, but in all those that I have come across, there is no seamless way to make a tree or list heterogeneous without declaring union types beforehand.

My problem with C#, C++, Java, and their ilk is that their generics multiply the amount of code.

How the IDE, compiler, and type system interact is a design decision and is not inherent to any type system.

One of the reasons I don't use certain mainstream languages such as C#, C++, or Java is that they don't provide the specific programming features that I desire.

I have looked at Go, Swift, and Rust and am not at all impressed by the "relative stupidities" within those languages. For other programmers, what they consider to be "relative stupidities" is entirely a matter of their experience and outlook.


Our industry has not yet even scratched the surface of what types can offer: types for enforcing architectures and controlling effects, types for checking the correct use and freeing of scarce resources, types for verifying protocol implementations, etc. Currently, half the industry is using schema-less JSON and dynamic languages, so it is really far too early to talk about diminishing returns in general.
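
As one small taste of "types for verifying protocol implementations", a phantom type parameter can track a resource's state so that out-of-order operations fail to compile. A toy Haskell sketch (all names invented, nothing real underneath):

  data Closed
  data Open

  newtype Socket state = Socket Int   -- the Int stands in for a real handle

  connect :: Socket Closed -> IO (Socket Open)
  connect (Socket fd) = pure (Socket fd)

  send :: Socket Open -> String -> IO ()
  send _ msg = putStrLn ("sending: " ++ msg)

  close :: Socket Open -> IO (Socket Closed)
  close (Socket fd) = pure (Socket fd)

  -- Calling send on a Socket Closed is rejected at compile time.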


There's a lot of great things our industry doesn't use: contracts, proper fuzz testing, cleanroom, formal specification, constraint solvers, _checklists_. We might (not necessarily, but _might_) be in a place where types are diminishing returns with respect to other low-hanging fruit.


Yes it's true that retrofitting better type systems into existing languages may not be low-hanging fruit. But developers have shown a willingness to adopt new languages when they see clear benefits.


> Yes it's true that retrofitting better type systems into existing languages may not be low-hanging fruit.

Disagree here, actually! Javascript (Typescript) and Python (mypy) are both seeing pretty big benefits from adding gradual typing.


Glad to hear it!


When you speak of contracts, are you referring to run-time contracts, e.g. as in Racket?


It's so funny how people argue for types everywhere, then use NoSQL databases and skip type checking on data validation.


I used to agree, but after seeing how easily versioning of schemas, stored procedures, etc. in conventional databases can turn into a clusterfuck, I have changed my mind. I have begun to like the idea of putting all the schema info into compiled applications that can't easily be changed on the server. MySQL et al. are the worst of all worlds.


In fact, the decades-old CSP model, upon which Go and Clojure's core.async are based, outlined compile-time assurance that there are no race conditions in your multi-threading. You are correct that these two modern implementations of CSP do not go there.


For data, schemas à la clojure.spec are a competing idea: they make the "type system" much easier to metaprogram and apply selectively.


I'd argue that any schema is a type system of sorts.


Well said. This article is essentially dismissing a technique that is barely used, with the argument that the technique is not the entire solution. Of course it isn't, but that doesn't change the benefits that it can bring.


The industry has other issues: the constant cruft, tech debt, and turnover. Lots of people make money through this; they won't accept making better software if it makes them appear smaller and too cheap.
