The memory safety problem isn't bad coders (medium.com)
250 points by steveklabnik 4 days ago | 220 comments

I remember when I first added ESLint to our large JavaScript codebase, and then again when I added Flow. I'm a perfectionist when I code, but it was shocking to see some of the things I'd written. Mistakes that I thought were so unlike me, but had been sitting there for months anyway, waiting to blow up.

The human brain wasn't designed to handle the complexity of large codebases. There are just too many compounding implications of everything for us to hold them all in mind at once. The very best coders miss things all the time.

Instead of seatbelts, I would compare tool assistance to aviation instruments. It's not primarily something that just catches you before a disaster; it's something that helps you make the most effective use of your mental resources. We don't ask pilots to fly commercial airliners without advanced software and dozens of readouts and displays to guide their attention and decision-making in an overwhelming sea of information. We shouldn't expect programmers to go without that kind of help either.

The comparison with aviation instruments is apt: I remember seeing a TV special that talked about how in the early days of aviation (circa WWI) the pilots' culture of seat-of-the-pants flying and bravado was resistant to suggestions that human senses are just not equipped to differentiate between certain inertial reference frames, one of which leads to a death spiral.

It was not until an early aviation safety pioneer came up with a demonstration on the ground (something like a spinning chair and blindfold - I forget the details) that pilots who went through the demonstration were shocked to discover they "fell for" the physical illusion they believed they were immune to.

I feel like programming is in the same state today. Especially the more C/C++ code I see in my career: I'm absolutely convinced humans should just not be expected to manually manage some of the things they try to do now.

C/C++, and in many cases Lisp, seem to be the most entrenched communities when it comes to the "culture of seat-of-the-pants flying and bravado". For contrast, JavaScript has as much flexibility and nearly as many foot-guns as those languages, but its community has been much more receptive to safety rails and static analysis. (Hopefully this doesn't start a flame-war; that isn't my intention)

How. Dare. You.

More seriously, JS has had its own resistance movements. TypeScript was actively disdained until, as far as I can tell, Angular switched to it. Probably partially because it was MS, but also because there was a lot of resistance to static typing despite the demonstrated safety benefits.

Yeah. My armchair analysis is that more flexible languages tend to attract smarter people, because they have a higher cap on how much your codebase can benefit from cleverness. Unfortunately those smarter people tend to over-estimate their own cleverness, and trying to take away someone's dynamic language features tends to be analogous to trying to take away Americans' guns.

Complex JavaScript codebases used to mostly be this way, but in recent years the JavaScript community has seen a huge influx of new, inexperienced programmers. There are obvious downsides to this, but one upside is that they don't have the pretense of being able to do everything perfectly without help, making them less resistant to things like TypeScript.

Being very careful not to start a flamewar, my armchair analysis is the opposite: JS has seen a large influx of serious backend developers, who generally have no time for silly arguments against static typing.

My guess is it's some of both, really. There's the old saw about levels of SWE expertise:

* Beginner: doesn't know much and knows it

* Intermediate: knows a lot, hasn't lived with the consequences

* Expert: knows a lot, including when not to use what they know

The two comments here seem to be articulating the community shifting between I->B and I->E respectively. I think it's safe to say we have both a maturing developer base and a huge influx of newbies. That intermediate "look what I can do!" stage still exists, probably as much as ever, but is being professionally tempered by the caution inherent in the other two cohorts.

The Intermediate stage is the most fun to watch, though, and probably a lot of what drives innovation. The coolest projects come from people who ostensibly should've known better than to try that thing.

Doesn’t static analysis require static typing? There certainly has been a strong resistance to the introduction of static typing in the JS community.

`let val = 3;`

This statement can be statically analyzed using the inferred static types that are already present. There is a lot of idiomatic JavaScript code that already has inferable static types in it. In fact, if you allow implicit `any` in the TypeScript compiler (that is, leave the `noImplicitAny` option off), then you never need to add any types.
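
A minimal sketch of that inference point, assuming TypeScript as the checker. The names `doubled`, `label` and `squares` are made up for illustration; nothing below carries a type annotation, yet every value has a static type.

```typescript
// No annotations anywhere: the checker infers each type from the initializer.
let val = 3;                               // inferred: number
const doubled = val * 2;                   // inferred: number
const label = "total: " + doubled;         // inferred: string
const squares = [1, 2, 3].map(n => n * n); // n and the result are inferred: number[]

// Uncommenting the next line is rejected at compile time, without running anything:
// val = "three"; // a string is not assignable to a variable inferred as number
```

The point is only that plain-looking JavaScript already contains enough information for a checker to work with; how much it catches depends on the compiler options in use.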

Static types are one kind of static analysis, but there is a lot more that can be done.

A linter can look at anything from micro-quality like "don't use more than one ++ operator on the same line" or "there's an = in this if test, this is probably wrong" to macro ones like "I couldn't prove your program terminates", "I couldn't prove that memory does not leak", or "a deadlock can happen here".

I'm not a very good developer, though I do have plenty of released-and-working-well things out in the world. I will say, it seems to me like static typing would reduce my creativity and flow and require me to spend more time planning stuff, which is not fun, and would make updates take more time as I patch stuff in all over the codebase. It's sort of the polar opposite of Ruby-style "duck typing".

Do they really help enough to be worth it?

I'm actually kind of surprised that programmers, of all people, would oppose guard-rails.

At university, it was a meme that no one's C program ever compiled the first time they tried.

I can't imagine even the best programmer in the world can avoid that kind of problem, without static analysis and static typing. Maybe you can have a 90% success rate on short programs, I'd believe that (my first-compile success rate on AoC was closer to 50%, and I was like #30 or so on the leaderboard [1]). But to do it perfectly, in real-world code? That I don't believe.

The nice thing about static typing is catching mistakes earlier, and making it easier to figure out what your mistake was. Without the type annotation on the function, when you get a crash a few days later, it's harder to figure out "is the function wrong, or is the code calling the function wrong?" With it, those questions are automatically answered.
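
To illustrate the "function or caller?" question, here is a small TypeScript sketch. The function `averageLength` is hypothetical; the annotation on its signature is what lets the checker assign blame to the call site.

```typescript
// The signature states the contract in one place: an array of strings in,
// a number out.
function averageLength(words: string[]): number {
  if (words.length === 0) return 0;
  const total = words.reduce((sum, w) => sum + w.length, 0);
  return total / words.length;
}

const avg = averageLength(["type", "checker"]); // (4 + 7) / 2 = 5.5

// A caller passing the wrong shape is rejected at compile time, so the
// error points at the call, not at the function body:
// averageLength("type checker"); // a string is not assignable to string[]
```

Without the annotation, the same bad call would run and the failure would surface somewhere inside the function, later and further from the actual mistake.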

The best I've heard is that for short scripts, static typing doesn't help _enough_ to be worth the extra time it takes. But with modern type inference, it takes very little extra time at all.

And the extra time it does take, to write out interfaces and stuff, you can avoid in TypeScript with type assertions and `any` assertions. But honestly, it's usually worth it: It's not like you _don't_ plan out the interface; you just used to keep them memorized instead of writing them down, and then regretted it a few weeks later when you wanted to make a change.

[1] To be fair, it was 50% because I was optimizing for programming speed for the leaderboard rather than accuracy; I'm sure I could have gotten closer to 90% if I were optimizing for accuracy. But anyone who argues that avoiding static types saves time has to accept that you make a lot more mistakes if you're trying to save time.

> it was a meme that no one's C program ever compiled the first time they tried

Ah but see, the compiler is a guard-rail. Technically, it's a static-analyzer. I don't think anyone is against making existing guard rails more helpful via editor integration; the conflict comes with adding new guard rails. Take the following example:

  let foo = {
    bar0: 12,
    bar1: 14,
    bar2: 16
  }

  // example 1

  // example 2
  for(let i = 0; i < 3; i++) {
    console.log(foo['bar' + i])
  }

Example 1 can be statically verified by a JavaScript type checker. Example 2 can't. There are advantages to being able to do the second thing - more code reuse, possibly less refactoring needed. But it's impossible to verify that those properties exist on that object without running the code. The argument is that oftentimes, the type checking is more valuable than the loss of "creative" freedom. But it does mean limiting the kinds of things you can do.
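
A sketch of the trade-off in TypeScript. The body of example 1 is not shown in the snippet above, so the direct property accesses below are an assumption about what it looked like; the `keys` list is an added illustration of a middle ground, not something from the thread.

```typescript
const foo = { bar0: 12, bar1: 14, bar2: 16 };

// Direct access (assumed shape of example 1): each property name is a
// literal, so the checker can verify that it exists on foo.
const direct = [foo.bar0, foo.bar1, foo.bar2];

// Middle ground: loop over a literal key list instead of building key
// strings. `keys` is inferred as a readonly tuple of the three literals,
// so foo[k] is fully checked and a typo such as "bar3" fails to compile.
const keys = ["bar0", "bar1", "bar2"] as const;
const looped: number[] = [];
for (const k of keys) {
  looped.push(foo[k]);
}
```

This keeps the loop but gives up the string concatenation, which is exactly the kind of restriction the comment is describing.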

> Ah but see, the compiler is a guard-rail.

Yes, I agree. It's a guard-rail which is missing in scripting languages like JavaScript and Ruby, and which some people are resisting adding.

> Example 1 can be statically verified by a JavaScript type checker. Example 2 can't.

With type assertions, it's not really limiting at all. For instance, I'd write example 2 as:

    for (let i = 0; i < 3; i++) {
      console.log(foo['bar' + i as keyof typeof foo]);
    }

If you aren't as experienced with TypeScript, you can also just throw around `any` assertions until the problem goes away:

    let foo: any = {
      bar0: 12,
      bar1: 14,
      bar2: 16
    }

Or if you wanted to be tricky, you could even just assert a specific key:

    for (let i = 0; i < 3; i++) {
      console.log(foo['bar' + i as 'bar0']);
    }

It's true that there are a lot of things typecheckers can't check. But that's no reason to give up on the things they _can_ check.

In case anyone missed it, example 2 is allowed in typescript.

It's not, not without an assertion:


Well, it's allowed in that it will still compile the code (because TypeScript is a superset of JavaScript), but it'll still yell at you.

Yes, without an assertion. Just change your compiler options.


I regularly work in a statically typed language (Scala) and a dynamic/gradually typed language (Python.)

I never find that static types force me to plan more – it's pretty easy to change them on the fly.

I frequently find that static types let the IDE highlight an error the second I write it, instead of waiting until running tests. And every codebase I've worked with has taken at least a few minutes to run tests (and sometimes much longer), so this is a pretty substantial savings.

Or in other words, static types free me up to be more creative, by requiring me to spend less energy & attention on certain types of errors that were common before.

In fact with the right IDE, static typing makes refactoring much easier. Right click on a parameter, rename, and it will rename anywhere where it is relevant (as opposed to a replace all).

To be fair, you can do that with dynamically typed Python and a good IDE too. I've used PyCharm on Python 2.7 for years and never really had any issues with refactoring.

You could probably become a better developer if you did more thinking ahead (or 'planning', as you put it.) If you are writing a bunch of variables, and you don't have a good idea of precisely what data they hold, there is very little chance that your program will work properly first time (not that thinking ahead guarantees it, of course, but it improves your odds.) For every programmer, there comes a point where you cannot effectively track how everything needs to fit together to work, unless you are doing some sort of planning.

Duck typing only works when it works, which is not as trivial a statement as it might first seem.

Respectfully, do I care if my program works the first time, or by the deadline? I don't care if the first ten times I run it, it just bombs out with interpreter errors. Doesn't seem to result in my having slower output than anyone else I'm working with.

If you truly knew what you were doing when you started, then everything would work the first time you tried it. Since that's not the case, you're clearly making mistakes, so why wait until running it to find mistakes when you could find them before even running the program?

Think of it this way: you only ran it 10 times, are you sure you found all the bugs? What about 11 times? Are you sure you found all the problems? Numerically speaking, if the first 9 attempts didn't work, then 90% of the time you ran it after coding, it was still broken. The fact that the program was broken a majority of the time you developed it should make you skeptical about whether it's even "correct" after you've determined it to be "finished". I understand you picked random examples, but this is still relevant.

It seems to me like you're arguing a common argument about picking tools which is "I've always done it this way, and it works for me, so why look for anything better?"

I agree with the sentiment, but types don't guarantee correctness. I would put such a developer on "TDD-duty", be thorough about code reviews and do some pair programming.

If all you aspire to is to be about the same as your co-workers, you will probably do just fine continuing as you are.

What are these 'interpreter errors' that you write of? It is actually quite unusual for someone to find even just one error in the interpreter for each new program he writes. What interpreter are you using?

One thing I can be pretty sure of: Wall, Van Rossum, Matsumoto and Hickey didn't produce their languages without some insightful thinking ahead.

I’ve been programming in Ruby for years (as a hobby / small passive income project), and about a year ago decided to reimplement a small part of it in Go. Both for learning and because it was a bottleneck in the application.

Recently I needed to fix some things in the original Ruby code, and it felt like freeclimbing after getting used to ropes... I called a function and started wondering how I could have any guarantees that it returned something useful. Benefits of static typing really ‘clicked’ for me at that moment. I still love ruby though

Well, I remember times when my program crashed because I tried to add a dictionary and an integer - so, yeah, they do help.

The problem lies in "enough": the overhead of static typing has to be lower, over the lifespan of a project, than the time spent fixing bugs like this one.

(Other things, like liability from bugs and crashes, may give an edge to static typing. The age-old debate is how much)
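
For a concrete version of the dictionary-plus-integer mistake, here is a TypeScript sketch. The `inventory` object and the "intended" fix are invented for illustration; the original bug was in some other language and its details are not given.

```typescript
// A dictionary of counts (hypothetical data).
const inventory: Record<string, number> = { apples: 3, pears: 5 };

// The mistake: adding a number to the whole dictionary. A type checker
// rejects this before the program runs, because '+' is not defined for
// an object and a number:
// const broken = inventory + 1;

// One plausible intent: bump a single entry instead.
const updated = { ...inventory, apples: (inventory["apples"] ?? 0) + 1 };
```

The check costs nothing here beyond the one annotation, which is the cheap end of the overhead-versus-bugs trade-off being debated.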

If you know what sorts of data your variables denote and your functions take and return, being explicit about it is not much of a burden for the benefits it brings.

It's hard to stereotype the "C/C++ community." The C++ community popularized RAII for safer management of mutex locks, reference counts, and other resources; smart pointers for null checking; and other ergonomic improvements motivated by the knowledge that getting those things right in plain C required an unrealistic and/or wasteful level of vigilance. When I encountered Rust I immediately recognized the same desire for safety and protection that motivated usage of RAII in C++.

And, funnily enough, back in my C++ days I remember dabbling in Common Lisp and loving that there were macros that achieved similar results.

I put a form of RAII into TXR Lisp, unified with GC finalization:


Oddly, most of the complaints against Lisp just don't acknowledge that it has actually had type safety for a long long time. Not just in Racket's types. But Common Lisp has had valid compilation steps for a long long time.

In fairness, at my company we use a custom dialect of CL which does not have a type system, so I may not have enough perspective to make broad statements about the Lisp community. Though, intuitively, I'd think that in a language all about dynamic data structures and self-modifying code, static analysis would be really hard. But I may just be ignorant in this area.

I can't claim a lack of ignorance in this area. It is always baffling when you see just what some of the older lisp machines were capable of.

Too many posts reduce this to some sort of "magic of lisp" message. Which is certainly fun, in many ways. However, I think most of it is just a lot of hard work that was done by folks. Much of which will never be considered by the rest of us, almost solely on lack of information.

So, yes, I imagine in many ways it is harder. Doesn't make it impossible, though. And indeed, most type checking can be almost trivially done. As an example, typechecking the arguments to a format command. Trivially easy in most invocations. Only when people do build up dynamic programs is this hard. Thing is, those programs are hard to even think of, so most of us don't write them.
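
As a rough illustration of why checking a format call is easy when the format string is a literal, here is a sketch in TypeScript rather than Lisp. The mini-format (`%d` for numbers, `%s` for strings) and the function `checkFormat` are hypothetical, and the sketch inspects runtime values where a real compiler would use the static types of the arguments.

```typescript
// Hypothetical mini-format: %d expects a number, %s expects a string.
type FormatArg = number | string;

// Returns a list of problems; an empty list means the call looks fine.
function checkFormat(fmt: string, args: FormatArg[]): string[] {
  const problems: string[] = [];
  const directives: string[] = fmt.match(/%[ds]/g) ?? [];
  if (directives.length !== args.length) {
    problems.push(`expected ${directives.length} argument(s), got ${args.length}`);
  }
  directives.forEach((d, i) => {
    if (i >= args.length) return; // already reported as a count mismatch
    const wanted = d === "%d" ? "number" : "string";
    if (typeof args[i] !== wanted) {
      problems.push(`argument ${i + 1}: ${d} expects a ${wanted}`);
    }
  });
  return problems;
}

const ok = checkFormat("%d items for %s", [3, "bob"]); // no problems
const bad = checkFormat("%d items", ["three"]);        // one type problem
```

The hard case is the one the comment names: when the format string itself is built dynamically, there is no literal to scan.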

[drifting off-topic…] This article about early aviation covers what you're talking about and I found to be a great read: https://www.theatlantic.com/past/docs/unbound/langew/turn.ht...

> (something like a spinning chair and blindfold - I forget the details)

The test they do is blindfold you and spin you in a chair. But they switch directions and you have to point your thumbs in the direction you are spinning. It is possible to determine which way you are, but it is freaking difficult and really you're probably cheating by knowing the initial spin direction and keeping track of sharp decelerations (change in direction).

Basically it is simulating how you might be spinning in clouds and not be aware of it. Unlike the chair simulation these spins are more gradual/subtle and you're not paying extremely close attention (your attention SHOULD be elsewhere, i.e. flying the plane).

I couldn't agree with your post more if I tried.

The first time I ran a static analyser over an old codebase of mine it was deeply humbling.

I'm a competent programmer who works deliberately and carefully and the tool still went "did you really mean to do this stupid thing?".

Of course sometimes you do mean to do the stupid thing because there is a good reason for it, and as long as the tool gives you the ability to say // @allow-stupid-thing: punch-myself-in-face it's all good.

Since then I throw as many linters, formatters and static analysis tools as I can reasonably get my hands on at every codebase.

Yeah. That's one thing I love about Flow/TypeScript as opposed to "real" compiled languages: you can actually leave type errors present if you don't know how to take care of them at the moment, and do a successful build. Or, if you know you're doing something safe and the analyzer just can't understand that, you can literally comment-out the error.

And of course, Rust has "unsafe" mode. In my experience it's important to have tiered strictness; start out with the highest constraints, but provide mechanisms for stepping that down in special cases where you really have to, without giving up safety completely if possible.
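
The "comment-out the error" mechanism can be sketched in TypeScript. The `// @ts-expect-error` directive is a real TypeScript feature (as is `// @ts-ignore`); the variable `legacyPort` is made up for illustration. Separately, `tsc` by default still emits JavaScript when type errors remain, unless `noEmitOnError` is enabled.

```typescript
// The directive suppresses the type error reported on the next line only;
// the rest of the file is still checked as usual.
// @ts-expect-error deliberately assigning a string to a number-typed variable
const legacyPort: number = "8080";

// At runtime the value is still a string: the suppression trades checking
// for convenience, it does not convert anything.
const runtimeType = typeof legacyPort;
```

That is the stepped-down tier in miniature: one line opts out, loudly and visibly, while everything around it keeps its guarantees.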

I too like this about Typescript - allows for fast spiking too, where you don't actually yet know what you want

Could you name them? Especially those that find more stuff than the others along with how you rate them on false positives?

I try to collect data to pass on to folks interested in trying static analysis.

Sure, though the line between linter, type checker and static analysis is blurry because of the languages I work in (so I'll use "suggests improvements without execution" as the definition).

For PHP:

PHP Inspections (EA Extended) (commercial variant; the free one is good as well) - Hands down my favourite for PHP because it integrates so well with my IDE; I wouldn't write PHP without it. Very few false positives. When it flags something I don't immediately understand, I go look at why it's flagging it, and I've learnt stuff I didn't know after nearly a decade of using PHP.

Phan - Very good and getting better all the time.

PHP_CodeSniffer (more useful for making sure you stick to a particular standard, say PSR2 - IDEs can do this but code sniffer can be part of your pipeline/commit handling)

PHPStan - Very powerful, but if you didn't start with this or are bringing it onto an existing codebase... well, it'll find a lot. On max it will flag what seems like everything, but it's configurable and it's the most technically correct.

For TypeScript - TSLint w/ tslint-microsoft-contrib.

For C# - ReSharper

For bash - ShellCheck

I also tend to use an IDE that has a decent degree of real-time analysis for the languages I use, which is nearly always IntelliJ (with its PHP and Python plugins) or one of its spinoffs (e.g. Rider on Linux) - I'm a JetBrains fanboy basically, I adore their tools.


LGTM recently came by my project offering to integrate with GitHub. It's found a decent number of bugs other linters and my test suite didn't find, and it didn't require any setup at all; you just open your project on their website.

Their staff also reacted really quickly when I asked them questions or reported bugs.

For the most part, their only false positives are things like unreachable code, which I sometimes used to leave in for future-proofing, but I eventually did things their way, because it does make sense that it could be confusing to a third party.

The only alert I've ever suppressed was a string search for "example.com", which could have matched "example.com.untrusted-attacker.com". It was just code for displaying a reminder, not anything security-critical, so I didn't think it was worth fixing.
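
For anyone wondering what the fix for that alert would look like, here is a minimal TypeScript sketch. The helper `isTrustedHost` is hypothetical, and it assumes the hostname has already been extracted from the URL.

```typescript
// Hypothetical helper: accept the trusted domain itself or any subdomain
// of it, and nothing else.
function isTrustedHost(host: string, trusted: string): boolean {
  const h = host.toLowerCase();
  return h === trusted || h.endsWith("." + trusted);
}

const attacker = "example.com.untrusted-attacker.com";

const naive = attacker.includes("example.com");              // true: the substring search is fooled
const strict = isTrustedHost(attacker, "example.com");       // false
const sub = isTrustedHost("www.example.com", "example.com"); // true
```

The substring search matches anywhere in the string, which is why the analyzer flags it; comparing the end of the hostname closes that particular hole.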

It's amazing how little we talk about tooling, both in work and in education.

I had an internship where I kept on trying to work on better tooling for the developers, only to be told that I should be working on the actual product because that's more important. Except, if I make the other developers' lives even 5-10 percent easier, then that will have a much longer effect than whatever work I can get done in 3 months. The most useful contribution I made in that internship was a document explaining all the features of the product with images, short videos and both the internal names and the client facing names.

And in school? Forget about it. Nobody shows kids how to use IDEs or editors properly. Nobody teaches git properly or command line tools. Nobody learns how to set up linters or heck, how to turn on -Wall -pedantic. Imo, every systems class should have a first lecture that explains Valgrind, proper compiler flags and Makefiles, and a last lecture that explains Rust. Realistically most students are not going to use Rust, but just understanding that it exists as a possibility can be useful.

Hey stranger, thanks for telling me that the lecture I gave last week was the right thing to do, even though I will be told that "those things do not belong to 'Data Structures and Algorithms'" :-)

> and Algorithms

If the complaints are annoying enough, I propose cheating - for example, working makefiles into the graph algorithms part of the course.
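
The sneaking-in is not much of a stretch: working out a build order from Makefile-style rules is a topological sort. A minimal sketch in TypeScript, where the `rules` table and the file names are invented for illustration and no real Makefile is parsed:

```typescript
// Hypothetical rules, as in a Makefile: target -> prerequisites.
const rules: Record<string, string[]> = {
  app: ["main.o", "util.o"],
  "main.o": ["main.c", "util.h"],
  "util.o": ["util.c", "util.h"],
};

// Depth-first topological sort: every prerequisite is emitted before the
// target that needs it; a dependency cycle is reported as an error.
function buildOrder(goal: string): string[] {
  const order: string[] = [];
  const state = new Map<string, "visiting" | "done">();

  function visit(target: string): void {
    const s = state.get(target);
    if (s === "done") return;
    if (s === "visiting") throw new Error(`dependency cycle at ${target}`);
    state.set(target, "visiting");
    for (const dep of rules[target] ?? []) visit(dep); // source files have no rule
    state.set(target, "done");
    order.push(target);
  }

  visit(goal);
  return order;
}

const appOrder = buildOrder("app");
// main.c, util.h, main.o, util.c, util.o, app
```

A real `make` additionally compares timestamps to skip up-to-date targets, which this sketch leaves out.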

Tools are an absolute must for large codebases.

Humans struggle with large codebases but by restricting what's allowed to certain well understood and prearranged patterns and by forcing them to write specs and docs they are possible to handle.

"The human brain wasn't designed to handle the complexity of large codebases."

Because I'm a very simple bear, I always try to find the most simple thing that works. Because I'm certain that the next programmer that comes along, which is most likely future me, will see my efforts and think "WTF?!"

Another facet is our human mind's propensity to make errors. Instead of blaming stupid users, just deal with it. Design artifacts and systems which prohibit errors.

FWIW, Donald Norman's book Design of Everyday Things completed my transformation from technophile to humanist. (I don't know how well it's held up since. Or what knowledge now supersedes it.)

"I would compare tool assistance to aviation instruments."

Great analogy. Many, many designers (of APIs, frameworks, UIs, simulations) conflate abstractions with mental models. Our tools should make the underlying systems obvious, vs trying to hide the complexity. And if something remains too complex to explain, try again, until a mental model that is simple and obvious is found.

I think that distinction between mental model and abstraction is something I've tried to put into words for a long time with little luck. Thanks for that.

I think that any person working with a system must be able to form a "morally correct" mental model of the entire system from top to bottom (as deep as it practically matters) or the abstractions of that system have failed.

When I read highly abstracted code, the times it gets frustrating is when I can't form a coherent mental model based on the abstraction so that I could understand how the system behaves. Usually that's because the abstraction is too generic or interwoven with a non-local mechanism that I can't predict.

The contrarian argumentation that I see on HN is always something to behold (I view it as a positive!).

An article comes out yesterday about programmers being a problem: comment section points in another direction.

An article comes out today that says that programmers aren't the main problem: comment section points out how awful all of us were, if we just go back far enough in time.

My main takeaway: Programming is hard.

Are you talking about monoliths vs microservices from yesterday? If so then I think that the points made are somewhat different; there is little reason to believe (that I have seen) that microservices are "better" than monoliths, but there is a lot of reason to believe that strong type systems (and other tools) provide long term benefits.

Not only with dynamic types, where more bugs are of the "very stupid" kind.

I run against my Rust code, and damn.

However, this also proves how MUCH better static typing is. The Rust linter provides hints about more serious stuff, not things that the type system already prevents!

In race cars (though maybe not F1), I’m told that the inertia of the driver would throw them out of the car on those turns if not for the safety equipment.

I think you can make the analogy work for professional drivers, but not for amateurs.

I'm not sure I follow this; are you referring to the great improvements in safety that F1 adopted reluctantly, only after a long campaign by Jackie Stewart who'd seen too many friends die?

I did a review of the entire backend of a pretty big project just now. Very very worth it. I'm going to plan for that every 6 months or so from now.

In general, I'm in favor of as many correctness and other checks at compile time as possible. Make tools as powerful as possible. I really liked this tweet:

"What if...

- your programming language required you to write useful docs,

- using those docs, it checked your program for mistakes,

- it even used the docs to speed up your program,

- this feature already exists!

And what if it was called static typing."

- https://twitter.com/DWvanGeest/status/1092095822559358976

Although I mostly agree with you, it's worth noting that static typing only goes so far. When I worked in a large-scale Java codebase, it always seemed like half the code only existed in order to "work around" the type system. (Don't even get me started on Spring, which might as well be a whole additional language on TOP of the actual application code.) I'm perfectly willing to grant that might just be a Java thing, but still, it was a huge and constant frustration.

Because of the above, types really don't seem like "useful docs" to me. I get extremely irritated when a library links to its API documentation and it's all just autogenerated lists of methods which proudly state their parameter and return types but say absolutely nothing about what they do.

That's one major thing I've always appreciated about the Ruby and Javascript ecosystems: because they can't rely on type information, most libraries go out of their way to describe how they work in depth.

I'm willing to grant that I didn't have a ton of experience with Java before moving on to other languages, and so I could very well be wrong... but one thing that definitely worries me about Typescript is the increasing Java-fication of browser languages.

Now, all that being said, I gave Crystal [0] a spin the other day and it's a freaking dream. Ruby, but with types, and compiled to LLVM? YES PLEASE.

0: https://crystal-lang.org/

I am beginning to suspect that Java is almost singlehandedly responsible for the prevailing opinion that static typing doesn't do anything useful and mostly gets in the way. Which is sad, because I'm also hard-pressed to think of a worse poster child for static typing than Java. It's like a stereotypical bureaucrat, asking you to fill out a mountain of paperwork before it can proceed to do a whole lot of nothing useful.

There is better out there. Sadly, for the longest time, the best examples didn't receive much interest outside of academia, on account of languages like Java being so entrenched. But it is worth spending some time with a member of the ML family to get a taste of what static typing could have been like.

Really? There's a ton of boilerplate in Haskell that goes into working around the type system (to the point where people have made alternate Preludes to avoid having to explicitly specify the goop), and I can't count the number of times I've had to massage types in Scala for no other reason than to push a class into a certain cats monad. Let's not get started on error types in Rust. I'm a fan of static typing, but let's stop this incessant drumbeat of propaganda saying that typing is free, and the pain comes from no-true-scotsman implementations.

> of a worse poster child for static typing than Java.

C, golang

Java is really the worst example for a typing system. It's basically the way you shouldn't do it. There are much better type safe languages like Haskell, Erlang, Rust, etc., even just Kotlin because it drops so much of the boilerplate and excessive unnecessary verbosity, while retaining the static typing and its benefits.

Kotlin is an improvement insofar as it does a decent job of resolving insanity like the "object-oriented-but-you-have-to-cross-your-fingers-behind-your-back-when-you-say-it" type system. But it's still not a great example for static typing, insofar as you're still working with Java's half-baked type semantics where the set of types the language understands is a superset of the set of types that are expressible on the underlying platform. Meaning that the set of types that can exist is quite a bit (I'm guessing uncountably infinitely) larger than the set of types that you can non-contortively deal with when doing generic programming.

Scala gets around this by attempting to extend the type system in deeply incompatible ways. Which mostly works, as long as you don't care about compatibility.

Clojure gets around this through a combination of dynamic typing and Rich Hickey including a section titled "I Don't Need Any Sour Grapes" in every lecture he delivers in public.

Of the three, TBH, I think Clojure has the best approach. You can't build proper static typing on top of the JVM without breaking compatibility with the rest of the Java ecosystem. At which point, why bother with the JVM at all?

You're right, yes. Thanks for elaborating.

> the worst example for a typing system

Not nearly as bad as C or golang.

Well, we were talking about static/strong typing. Java is a bad example for static typing because of all its issues, which shouldn't be taken as proof that static typing is bad, especially when there are plenty of great languages that do use it well.

> Java is a bad example for static typing because of all its issues,

So are C and golang, even more so arguably.

It seems strange to say that something (types) isn't worth anything more than nothing just because it isn't everything (complete documentation)

You should consider overhead and completeness. If people spend lots of time writing boilerplate and coding around the type system but aren't saving a corresponding amount of time, there's a fair cost to that, especially if this means you end up writing different classes of bugs but not really fewer of them. (This is basically an enterprise Java bingo card)

There's a certain argument that this is the sign of an inadequate type system: you're getting the overhead but it's not rich enough to be self-documenting or provide really rich analysis tools.

Types gain you some things, but there is an overhead, both of expressibility and of terseness. The costs (and gains for that matter) are different for different languages, and also different languages are used for different things, which make different trade-offs worthwhile.

This means that the trade-off Crystal gives could be way better than the one Ruby gives, despite Ruby giving a better trade-off compared to Java (for some use cases). For other problems Rust, Go or C could give a better trade-off.

It is not a discussion about if types, all other things equal, are good or not. The reason behind this is simple: Not all other things are equal and not all type systems are equal.

>Although I mostly agree with you, it's worth noting that static typing only goes so far. When I worked in a large-scale Java codebase, it always seemed like half the code only existed in order to "work around" the type system.

I don't think that's fair.

I'd say that "half the code only existed in order to" use patterns and OO hierarchies for the sake of it.

Java can be programmed as lightly as Python or as a full-on J2EE monstrosity. The difference is cultural ("idiomatic") not brought about by types.

> Java can be programmed as lightly as Python or as a full-on J2EE monstrosity. The difference is cultural ("idiomatic") not brought about by types.

I wish I had the money to burn to buy a billboard in Silicon Valley to broadcast this.

I find that the extreme majority of the hate Java gets isn't really about Java itself, but the extreme adherence to the worst paradigms and idioms I've ever seen in the programming world.

Stop writing so many classes. I'm sure everyone here has seen FizzBuzz Enterprise Edition [0]. Sure, it's a parody, but a lot of truth is said in jest. There's too much Java code that looks like FizzBuzz EE, and there's no damn reason for it.

> I'd say that "half the code only existed in order to" use patterns and OO hierarchies for the sake of it.

I really wish I could understand why so many Java programmers feel like you can't just write a class. Nope, you have to make an Interface, with an abstract class that implements it, followed by your Implementation class that extends the abstract. But directly calling that constructor is verboten! You need to create a Factory class (With associated abstract class and interface, of course!) that returns instances!

[0] https://github.com/EnterpriseQualityCoding/FizzBuzzEnterpris...

Java itself makes things a pain. Compare establishing a tls connection with a client certificate in python and in java. Guess which language has a secure one line way to do it.
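For reference, here is roughly what the Python side looks like with only the standard library. This is a minimal sketch, not necessarily what the comment above had in mind: `make_client_context` is a made-up helper name, and the certificate and key paths you would pass in are placeholders. It is a few lines rather than one; a true one-liner would presumably come from a third-party library such as Requests and its `cert=` argument.

```python
import ssl


def make_client_context(certfile=None, keyfile=None):
    """Build a TLS client context that verifies the server and,
    if a certificate is given, presents it as the client certificate.

    `certfile` and `keyfile` are hypothetical paths supplied by the caller.
    """
    # create_default_context turns on certificate and hostname verification.
    context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)
    if certfile is not None:
        context.load_cert_chain(certfile=certfile, keyfile=keyfile)
    return context


# Without a client certificate this still yields a verifying context.
context = make_client_context()
```

To actually connect you would wrap a socket with `context.wrap_socket(sock, server_hostname=host)`; that part is left out because it needs a real server.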

>Compare establishing a tls connection with a client certificate in python and in java. Guess which language has a secure one line way to do it.

Nothing in Java prevents a Java network lib from doing it in one line. In fact many third party network libs do just that.

(And Python had more or less a similar unwieldy network lib, which is why they now say "Requests is suggested" on their own standard lib documentation: https://docs.python.org/2.7/library/urllib.html ).

So again it's not the language (syntax + semantics), it's what's considered "idiomatic" and what's prevalent.

Still it is possible to write J2EE on any language (wrote J2EE on purpose).

I have seen it done in Clipper, C, C++ and I bet even Go will one day have its Go2EE, it just needs a bit more enterprise love.

> Still it is possible to write J2EE on any language

Right. The difference is that for some reason, writing Java in the J2EE style seems to be the default, and rather than stopping the practice, people choose to hate the language instead, as if they think the J2EE style is required by the language.

Yep, I love how Java bashers get to deploy micro-services at scale, using Docker containers, managed via Kubernetes, with DSLs written via Terraform APIs.

Formal specifications can do all that if you build the tooling for a given language and platform. Static typing is a subset of formal specifications. I suggest using the other phrase so they find more interesting stuff. Meyer's Design-by-Contract is an easy one you can do in basically any language.
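To make the Design-by-Contract point concrete, here is a rough sketch in Python. The `contract` decorator, `ContractError`, and `largest` are all invented for illustration; this is not Eiffel's mechanism or any particular library's API, just preconditions and postconditions checked at call time.

```python
import functools


class ContractError(Exception):
    """Raised when a precondition or postcondition does not hold."""


def contract(pre=None, post=None):
    """Attach an optional precondition and postcondition to a function.

    `pre` receives the call arguments; `post` receives the result
    followed by the call arguments. Both return True when satisfied.
    """
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if pre is not None and not pre(*args, **kwargs):
                raise ContractError(f"precondition of {func.__name__} failed")
            result = func(*args, **kwargs)
            if post is not None and not post(result, *args, **kwargs):
                raise ContractError(f"postcondition of {func.__name__} failed")
            return result
        return wrapper
    return decorate


@contract(
    pre=lambda items: len(items) > 0,
    post=lambda result, items: all(result >= x for x in items),
)
def largest(items):
    return max(items)
```

Calling `largest([])` fails with a `ContractError` naming the broken precondition instead of whatever `max` happens to raise, which is the documentation-plus-checking benefit being described.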

Having used a bunch of libraries which seem to think that their source code with types (or auto-generated docs based on the types) is enough to be considered documentation... source code with types or auto-generated docs based on types isn't enough to be considered documentation.

The problem with that reasoning is that nearly every language with static typing also has some kind of type inference. You'll just wind up with a bunch of autos/vars.

Type inference is perfectly ok; it's still statically type checked and all the parameters and return-types are properly typed.

Type inference takes away the main argument against static typing: That static typing requires you type in a bunch of useless obvious types.

The problem with eating pizza is nearly every restaurant with pizza also has some kind of ice cream. You'll just wind up with a bunch of people eating ice cream and pizza all over the place.

Like, type inference is pretty widely regarded as a definite good. Additionally, technologies like row polymorphism make it possible to have statically typed and inferred duck typing. To me it's hard to hear this as anything other than mission accomplished. We made a static language that looks and feels like a dynamic one.

If you don't like type inference, you'll need to offer additional explanation as to why it is bad. Because otherwise your statement only sounds like the one I made above. It sounds weird because one doesn't necessitate the other AND because they're both widely considered good.

Haskell has very powerful type inference but it also lets you ask the compiler "what type is this expression?" There even exists tooling that lets you automatically insert inferred type annotations. Sometimes the type it infers is more general than what you wanted, so you get the opportunity to fix it in a way that makes sense to you.

Similarly, the intellij plugin for Rust will reveal implicit type annotations.

Which isn't bad if the type inference is good.

Nearly all the type errors I see are things like

    var foo = "Hello World";

    bar = foo + 10;
That should scream its head off.

To be fair, that's valid in Typescript because it simply uses Javascript's coercion rules.

You could imagine it having built-in rules for JS' coercions:

    string + number -> string
    number + string -> string
Maybe your snippet is provocative to some people, but in real code it would quickly fail once you actually use `bar` and were wrong about your assumptions.

At which point it's similar to languages with `Int + Float -> Float` and `Short + Long -> Long` in that perhaps you wish those operators weren't defined but it's at least consistent. And you'd find the error once you've passed those results to a function that expected Int or Short.

It's not the fall, but the stop that kills you. So in case of TS we don't know what was the intent.

Maybe it was this.

function suffixWith10(s: string) { return s + 10; }

But if the intent can be discovered by analysis of usages of the variable later in the code, then TS will scream loudly.

This is more of a knock against the compromises TypeScript's wacky type system has to make for JavaScript compatibility than anything else. It's not possible in a language that was really designed for strong static typing like Haskell or Rust.

The kind of errors I'm interested in preventing are more like this:

    var foo = 5;

    var bar = 100;

    baz = foo + bar;
Why is that bad? Well... what do 5 and 100 mean in that program? Did you just add age in years to pixels from edge? How do you know?

Appending an integer to a string is comparatively sensible next to adding age in years to pixels from edge.

Then you should not use a general numeric when you really should be using Pixel and Age types.

My broader point is that low-level representation doesn't matter nearly as much as semantics, and yet by default type systems are obsessed with low-level representations.

Appending a number to a string is a perfectly sensible thing to do when you're displaying information to a user, yet adding two numbers or concatenating two strings could be completely senseless and proof that the development process has gone off the rails.

My dream type system is strict about semantics and only handles representation in specific circumstances. The simplest way to talk about this is in terms of units of measure: Height is a type. Whether it's in inches or centimeters is a representation. Whether that representation is a number or a string is another representation. Keeping track of whether you're showing height-in-inches or height-in-centimeters to a user is useful, but burdening your mind with whether this height-in-inches is a string or a number right now when the runtime can just as easily convert between the two is senseless, and fretting over height-in-inches versus height-in-centimeters adding person-height to shoe-sole-height to get height-in-shoes is similarly senseless.
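A rough sketch of that idea in Python, with the caveat that the checks here happen at runtime rather than in a static type system. `Height`, `Age`, and the variable names are made up for illustration. The value is stored in one canonical unit, inches versus centimeters is only a matter of how you construct or read it, and adding an unrelated quantity is refused.

```python
from dataclasses import dataclass

CM_PER_INCH = 2.54


@dataclass(frozen=True)
class Height:
    cm: float  # canonical storage; callers pick the unit at the edges

    @classmethod
    def from_inches(cls, inches):
        return cls(inches * CM_PER_INCH)

    @classmethod
    def from_cm(cls, cm):
        return cls(cm)

    @property
    def inches(self):
        return self.cm / CM_PER_INCH

    def __add__(self, other):
        # Only height + height is meaningful; anything else is rejected.
        if not isinstance(other, Height):
            return NotImplemented
        return Height(self.cm + other.cm)


@dataclass(frozen=True)
class Age:
    years: int


person = Height.from_inches(70)
sole = Height.from_cm(2.54)
in_shoes = person + sole  # mixing inches and cm is fine: same semantic type
```

`person + Age(30)` raises a `TypeError`, which is the "age in years plus pixels from edge" mistake being caught, just later than a compiler would catch it.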

Your dream type system is out there, yet not manifest in any popular or somewhat-popular language.

I've yet to see a strictly/strongly typed language that makes this easy to do. Through some combination of plugins, esoteric languages, macros, or boilerplate, sure.

We're decades behind what is possible due to the practical.

Some languages don't care either way, and will let you add those types.

> Why is that bad? Well... what do 5 and 100 mean in that program? Did you just add age in years to pixels from edge? How do you know?

You know because you should be using sensible variable names that make it obvious.

Type inference for static languages is just syntactic sugar.

The IDE and the compiler will both scream if you try something like

var x = "John Smith";

x = 5;

In C# var is always a place holder for the actual type which the compiler only accepts if it can figure out the type. If you do anything to the var that you can't do to the actual type it will still give you a compile time error.

Languages with type inference still require you to give enough information for a type to actually be inferred. Type inference isn't an escape valve for the things named in the GP comment, it's just a shorthand.

That's still good. In that case the language even writes the documentation for you!

I don't really use auto unless I'm interfacing with some templatized nightmare body of code where the typename is very difficult for my tiny human brain to interpret. Using "auto" is usually a code smell, because it means your type system is too complex for you to reason about.

This is a valid, strongly typed C# object:

var foo = new {bar = 1, baz = "Hello", boo = "World"};

You get auto complete help and compile time type checking.

foo.bar = "Goodbye";

Won’t compile. What would it buy you to not use ‘var’ and create a one time use class/struct?

I use auto almost always. Makes the language "feel" more dynamic but you still have type safety.

It would make me ask why you need a one time use struct at all and probably remove it.

More complicated example:

var seniorMales = from c in customers where c.Age > 65 && c.Sex == "male" select new {c.FirstName, c.LastName, c.Age};

foreach(var customer in seniorMales) { Console.WriteLine(customer.FirstName + " " + customer.LastName); }

Why would I create a class/struct for that use case?

Side note: this is why I find ORMs in most languages besides C# useless. Here, “customers” can represent an in memory List, a Sql table or a MongoDB collection and the LINQ expression will be translated appropriately to the underlying query syntax.

The “ORM” is integrated into the language and yes anyone can write a LINQ provider.
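For anyone who hasn't seen the expression-tree trick, here is a toy version in Python. It is only a sketch of the general idea, not how LINQ providers are implemented: the `Gt`, `Eq`, and `And` node classes and the sample rows are invented, and column names are treated as trusted identifiers. The same query object is evaluated against an in-memory list and also translated into a parameterized SQL `WHERE` clause.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Gt:
    column: str
    value: object

    def matches(self, row):
        return row[self.column] > self.value

    def to_sql(self):
        return f"{self.column} > ?", [self.value]


@dataclass(frozen=True)
class Eq:
    column: str
    value: object

    def matches(self, row):
        return row[self.column] == self.value

    def to_sql(self):
        return f"{self.column} = ?", [self.value]


@dataclass(frozen=True)
class And:
    left: object
    right: object

    def matches(self, row):
        return self.left.matches(row) and self.right.matches(row)

    def to_sql(self):
        lsql, lparams = self.left.to_sql()
        rsql, rparams = self.right.to_sql()
        return f"({lsql}) AND ({rsql})", lparams + rparams


senior_males = And(Gt("age", 65), Eq("sex", "male"))

customers = [
    {"first_name": "Al", "last_name": "Stone", "age": 70, "sex": "male"},
    {"first_name": "Bea", "last_name": "Reed", "age": 72, "sex": "female"},
    {"first_name": "Cy", "last_name": "Hale", "age": 40, "sex": "male"},
]

# Backend 1: evaluate the tree directly against Python dicts.
in_memory = [(c["first_name"], c["last_name"])
             for c in customers if senior_males.matches(c)]

# Backend 2: translate the same tree to SQL and let the database filter.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customers "
           "(first_name TEXT, last_name TEXT, age INTEGER, sex TEXT)")
db.executemany(
    "INSERT INTO customers VALUES (:first_name, :last_name, :age, :sex)",
    customers,
)
where, params = senior_males.to_sql()
from_sql = db.execute(
    f"SELECT first_name, last_name FROM customers WHERE {where}", params
).fetchall()
```

Both backends return the same rows, and the calling code never has to know which one ran.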

This is one of the cases where I would want you to write an explicit type for seniorMales. Reasoning about the LINQ expression is just complex enough that by not constraining its type to your expectations you can easily obscure a mistake in your thinking that would otherwise show up in the data types.

How so? You couldn't pass the anonymous type to another method or return it so the locality would be strong and the anonymous type is just a strongly typed POCO. Creating a class doesn't add anything semantically.

How do you feel about destructuring that is available in most languages?

I think the anonymous type is much better. It's type safe, you can easily see how it's defined but you don't pollute the rest of your code with one-off classes that make only sense within the function.

If I had to review the code and saw an explicit type I would recommend an anonymous type.

It is not better. I don't know what type FirstName or Age is. Can Age be negative? Is FirstName an array of char or a string? Can FirstName be more than 10 characters long? It could be a double value for all I can tell from that line of code!

The select statement is projecting the Customer type from either an external data source or an in memory source. That information wouldn’t be encoded into the POCO whether or not it was anonymous.

Changing select new {...} to Select new SeniorPeople {...} wouldn’t give you any more information. At most if I was writing that as part of a repository class I would be sending you back an IEnumerable<SeniorPeople>.

If you were to send me an expression to use in the where clause you would send me an


You have no idea what that expression is going to be translated to until runtime.

Either way, you are just working with non concrete Expression Trees that could be transformed into IL, SQL, a MongoQuery etc. All of the constraints would be handled by your data store.

You don’t even know before hand whether you are iterating over an in memory list, or streaming from an external data source.

I could switch out Sql Server for Mongo and you as the client wouldn’t be any the wiser.

And we haven’t even gotten to the complication if the result set was the result of a LINQ grouping operation.

You are constructing an elaborate strawman that does nothing to work in your favor. If anything, your essay just makes it clear that it is impossible to glean the data type from the LINQ expression itself.

Yet, any code using its result will contain implicit assumptions about that data. Therefore, in order to maintain type safety, the expected type of the return value must be stated.

That type must also be independent from whatever LINQ does internally to produce that return value. Otherwise, the LINQ providers wouldn't be as exchangeable as you claim.

You also included information about “constraints” like is it a positive or negative number and string length. That wouldn’t be encoded in the type. Why would I go look up the types either way? I’m using a strongly typed language, the IDE would tell me that anyway. While I am writing code, I see a red dot showing me immediately if I’m using the type incorrectly. Any assumptions that were incorrect, I would get immediate feedback.

Not necessarily. Is Age unsigned or not? This constrains the possible range of values. In other contexts, there is a lot more information inferred by the integer type used.

Do you actually work with C#, LINQ and generics?

Yes. I am quite familiar with them.

Then tell me how you can model in C# a type where a property can’t be negative or a certain length.

    class Foo {
      public uint Bar { get; set; }

      private char[] _baz;

      public char[] Baz { get { return _baz;} }

      public Foo() { _baz = new char[5]; }
    }
This is cheating a bit on the array part. But the length of that char array is set in stone and cannot be changed from the outside. However, the Bar property uses a standard C# datatype.

I honestly think that you haven't done any real work with LINQ or you are imagining problems that don't exist. The char array is not usable as string or we have to go back to 0 terminated strings like C.

I am not imagining problems that don't exist in the application domains I work on.

If your environment is so relaxed that aspects like signedness of an integer don't matter, so be it. But then you are also far removed from a domain where software correctness is non-negotiable.

It’s not aspects “like signedness”: signedness is the only constraint that you could embed in your defined POCO and still use as a select projection that would survive a LINQ expression that was turned into an expression tree. Your data store would enforce the other constraints. Your C style character array as a string wouldn’t work with either LINQ or any of the standard library.

Are you really trying to go back to the bad old days with Microsoft C/MFC/WIN32/COM where you had over a dozen ways to represent a string that you had to convert to and from depending on which API you were using?

And your class wouldn’t work as part of the LINQ example. EF for example wouldn’t know how to translate it.

How do you want to model this in C#? Write a full class with all kinds of checking for a single LINQ operation? That will be a very bloated codebase.
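For a sense of scale, here is roughly how much code the "class with checking" version takes. This is a sketch in Python rather than C#, and it enforces the constraints when the object is constructed at runtime, not in the static type. `SeniorPerson` and the specific limits are placeholders taken from the questions asked upthread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SeniorPerson:
    first_name: str
    age: int

    def __post_init__(self):
        # Reject values the rest of the code should never have to handle.
        if self.age < 0:
            raise ValueError("age must not be negative")
        if len(self.first_name) > 10:
            raise ValueError("first_name must be at most 10 characters")


ok = SeniorPerson(first_name="Al", age=70)
```

Whether a dozen lines like this per projection counts as bloat or as documentation is exactly what this subthread is arguing about.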

Not necessarily. The types may just have very long names. In C++ for iterating through STL container auto is a godsend. The types are very straightforward and easy to reason about but just long.

for (auto it = s.begin(); it != s.end(); it++) {

is much easier to write than

for (vector<int>::iterator it = s.begin(); it!=s.end(); it++) {

I try to avoid using STL because it A) throws exceptions, B) has really convoluted type names, and C) dynamically allocates memory. These are all things you avoid in the embedded space.

Makes sense for embedded.

Any use of templates has the same problem, including templates that are used to generate highly efficient static constants and non-branching-at-runtime code.


Templates (template metaprogramming) help to avoid exceptions. You can check invariants first and jump into efficient code.

or almost like python:

for (auto &element: s){

even better with structured bindings:

for (auto [key, value]: map){

why we can't have nice things, in one comment.

auto is not a code smell. On the contrary, it is used when your type system is easy enough to reason about that you don't need to write the actual type.

If it's so easy to reason about, why can't you just state the type of the variable? Doing so helps me understand what's popping out from the rvalue in assignment so that I can follow what your code is doing.

If you're worried about spending time changing type names during a refactor, look at it instead as an opportunity to evaluate the correctness of the code in the context of your replacement type. Use of the auto keyword avoids doing that and as a result enables you to create new and exciting bugs in your code.

> If it's so easy to reason about, why can't you just state the type of the variable?

because:

- easy to reason about and easy to type are two different things

- naming everything increases mental load

> Use of the auto keyword avoids doing that and as a result enables you to create new and exciting bugs in your code.

to the contrary, porting a lot of my code to use almost-always-auto did actually remove bugs in the form of silent type conversions happening - e.g. std::pair<std::string, int> instead of std::pair<const std::string, int> (yay memory allocations), bad fp conversions...

> “The problem isn’t the use of a memory unsafe language, but that the programmers who wrote this code are bad.”

This really goes to show how anti-worker the media is even among high-skill jobs.

In my mid 20s I now mentor a bit in coding. I’ve worked with young devs that inherently know many of the obvious security pitfalls that have caused massive security breaches a la Equifax.

Are devs at the front of these breaches bad? I don’t believe so. Many went through grueling multi-part interviews to land the position.

No, the issue is a management one. Anti-worker is a meme in America and this is just another manifestation of the disingenuous “lack of tech talent” whining used to recruit more H1B.

It’s also why unrealistic growth targets cause gaming tech workers to face layoffs despite record breaking sales.

Well, although it could be anti-worker, it could also be one-upmanship among programmers with fragile egos: I'm not a bad programmer; they are.

And I think ego is a barrier to accepting tools that prevent errors. To accept the tool is to admit to yourself and others that you are going to make the error (sometimes) without it.

I've seen the ego issue pop up a lot in some communities. The biggest example I can think of is the OpenBSD mailing list. Any time I'd see an outsider bring up Rust or other safer languages, the overwhelming sentiment is that languages with built-in safety features are for people too stupid to write C, and that safety features restrict good coders' ability to write performant code.

While that sentiment is certainly alive and well, that's not the actual reason why the OpenBSD crowd is not gung-ho about Rust: https://marc.info/?l=openbsd-misc&m=151233345723889&w=2

The takeaway of that thread is that OpenBSD's "shut up and code" mantra is applicable here. If you want a Rusty OpenBSD, then start replacing pieces of OpenBSD's C codebase with Rust (you might have to fork OpenBSD for awhile in the process).

You'll probably run into quite a few issues. Most severely, OpenBSD must be self-hosting (i.e. able to compile itself) on every hardware platform on which it runs; if a modification causes that to be no longer true, then either that hardware target must be dropped or the modification must be rejected. Considering that Rust doesn't support compiling for all the hardware platforms OpenBSD supports (let alone compiling a rustc that actually does run natively on each platform), Rust is immediately a nonstarter.

> This really goes to show how anti-worker the media is even among high-skill jobs.

The quote was regarding "social media" as in mostly the workers themselves, not "the media" as in the independent journalists reporting on the workers.

I think you're agreeing with the article in the end, and I also agree with it, but I'd like to note two things:

- Grueling interviews are not necessarily correlated with skill. In particular, I've done a lot of whiteboard coding and a handful of take-home exercises, and not once has anyone cared either if I was writing buffer overflows or if I was reaching to a buffer-overflow-prone style. On the whiteboard, on either side of an interview, it's generally a loose pseudocode with the understanding that errors are not to be checked, that syntax doesn't need to be exact, and the point of the exercise is not whether you had an off-by-one in a calculation. So the skills of either being awesome enough to write bug-free code in security-sensitive environments or humble enough to use tools (languages, analyzers, sandboxes, whatever) to protect yourself from inevitable mistakes are not tested at all.

- "Lack of tech talent" generally refers to the number of people being insufficient for the job, not the quality of the existing people. It is entirely possible to believe that the problems of insecure code are not a level-of-talent issue and that we still face a lack of tech talent. (I have certainly never heard it claimed that the problems we face with insecure code are that too many Americans are bad at coding and we need to hire smarter foreigners who don't write bugs.)

> "Lack of tech talent" generally refers to the number of people being insufficient for the job, not the quality of the existing people

The "we can't find enough people to do this job" complaint conveniently leaves out "...at the wage we offer".

The principles of supply and demand apply here - if a company pays well enough they will get enough quality candidates for almost any job. It's insane to believe that a company literally cannot find a good C developer.

Even if the problem was "bad coders", the answer is still better tooling.

The tooling we have now is pretty poor, in general. Sure, we can catch certain classes of bugs. Many bugs still occur in areas that aren't covered by those classes, even in the most robust languages.

Obviously memory-safe languages would help enormously. I also think that compilers are still in their infancy.

>> I wanted to avoid spawning a thread and immediately just having it block waiting for a database connection...

>> The problem is that the database connection would sometimes use a re-entrant mutex when it was acquired from the pool...

>> with a normal mutex we would be fine, since only one lock can exist and it doesn’t matter if we unlock it on a thread other than the one we locked it from...

>> Fundamentally, we just can’t have a re-entrant mutex be involved and also be able to pull the connection from the pool on a different thread than it is being used...
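The distinction in those quotes is easy to see in a small experiment. This is a Python analogue, not the article's Rust code or its connection pool: `release_from_other_thread` is a made-up helper that acquires a lock on one thread and tries to release it from another. Python's plain `threading.Lock` allows that, while the re-entrant `threading.RLock` tracks its owning thread and refuses.

```python
import threading


def release_from_other_thread(lock):
    """Acquire `lock` on this thread, then try to release it from another.

    Returns None if the other thread released it successfully,
    otherwise the RuntimeError raised in that thread.
    """
    outcome = []
    lock.acquire()

    def worker():
        try:
            lock.release()
            outcome.append(None)
        except RuntimeError as exc:
            outcome.append(exc)

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()

    if outcome[0] is not None:
        lock.release()  # still owned by this thread, so clean up here
    return outcome[0]


plain_result = release_from_other_thread(threading.Lock())
reentrant_result = release_from_other_thread(threading.RLock())
```

Here the mistake only shows up as an exception at runtime; the article's point is that Rust's `Send` check rejects the equivalent code before it ever runs.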

Truly good coders are very rare because it's not just about mastering all the available tools and abstractions, it's also about the ability to come up with abstractions that make it simple for anyone looking at the code to understand what is happening.

Good coders can write simple code to do complex things.

I can see how this article might be comforting for some, but "bad coders" is still a problem. There are a lot of amateurs in this industry (even ones with years of "experience").

You can create an extremely opinionated language which doesn't give any freedom for creativity, solutions and expression, and devs will still find a way to duck everything up.

IMHO, OP saw a problem but didn't arrive at the right solution.

I don't see how the article is trying to be "comforting," or dismissing the value of skill and education. I also don't see how the `Send` trait featured in the article "doesn't give any freedom for creativity."

I read it as claiming that skill and education are insufficient to prevent these bugs, and that automation is still valuable no matter how skilled the programmer is.

In a broad sense, the forces of safety and flexibility in a language tend to be at odds. Rust does a lot of clever things that push that curve some, but a big part of what makes it safer than C++ is what you can't do.

Rust is actually looser than C++ in some respects. For example, there are no type-based aliasing rules to worry about in Rust. This makes writing "type punning" style code less of a headache for me, personally.

Sure, I certainly agree there. I just thought the OP was going a bit overboard- stifled creativity wasn't an issue here at all.

I refuse to acknowledge that C++ is clearly more flexible than Rust. My favourite counterexample is Rayon. Even if you are able to implement something similar in C++ (I tried it), the job of the C++ compiler with understanding when it can inline functions without overhead gets so hard (vs the Rust compiler), that it becomes impractical.

Generally this has been my experience with higher order programming in C++ actually.

It's good to have bad coders. Not everybody can have an exceptional IQ.

Not everybody can be good, but everybody deserves a chance to be a coder. There are real problems that can be solved by a bad coder as well, that may help your life some day. Great coders are needed for large, scalable, hard problems, but there are little things they don't have time to work on.

It is NOT good to have bad coders, it's good to have a promising talent.

Also when can we finally drop this IQ thing? Your IQ does not guarantee you anything. Attention to details, perseverance, curiosity are the great promising qualities, not your IQ.

Coding is essentially a series of IQ tests. Seeing patterns easily, having lots of working memory... You're not going to be good without it. Life is unfair, and some get this given to them, and some don't. To be fair that probably partly applies to the qualities you listed as well. So it's true, IQ is not a promising quality, it's a requirement.

Is there any proof of developers having higher average IQ than other career paths or are we just assuming that because we're developers we're naturally more intelligent (with a dash of Dunning-Kruger for flavor)?

Google for "occupation iq" readily finds many surveys of different occupations with a fairly expected correlation: Professors and scientists on the high end of the spectrum and laborers on the other. Software engineers/CS isn't at the top but tends to rank fairly highly.

However, narrowing these stats to a single person as meaning "if you have a low IQ you can't be a good engineer" is incorrect, as that is totally throwing out: 1) IQ being only a single of many factors (reading about this from people in the field, I often hear IQ cited as the best single predictor we have, but still only accounting for ~30%) 2) IQ can change 3) IQ doesn't include domain knowledge, determination, or any other factors particular to a specific occupation.

> Is there any proof of developers having higher average IQ than other career paths or are we just assuming that because we're developers we're naturally more intelligent

I don't think the poster is claiming that programmers are smarter; they're claiming that what IQ tests evaluate, and what programmers do, are similar. But IQ isn't the be-all and end-all of intelligence. If you're a programmer, you might score high on an IQ test, not because of how smart you are, but because the skills being evaluated align. You might not be very intelligent at all, just good at IQ tests.

I don't think the parent comment is saying that developers have a higher IQ than all other professions, just that developers are smarter than average. I personally think that there is a huge range but would bet that the average programmer has a higher IQ than the average non-programmer. Of course, the same could probably be said for a number of professions.

You don't have to be dumb to stock groceries or be a line cook, but you don't have to be very smart either, and if you are smart you won't get a lot of intellectual challenge from work.

Another take is that DESPITE there being a lot of 'amateurs' in the industry, the world generally seems to be humming along as normal, if not doing very well.

> Another take is that DESPITE there being a lot of 'amateurs' in the industry, the world generally seems to be humming along as normal, if not doing very well.

Given that the count of software security incidents and data breaches is skyrocketing by all measures (e.g. https://www.varonis.com/blog/cybersecurity-statistics/), that's not remotely a reasonable view to take.

Data breaches are rarely correlated with amateur developers in my opinion. Breaches occur when a company makes financial decisions that it is cheaper for them to deal with the potential fallout of a leak than to prevent it in the first place.

More skilled labor won't solve this problem because it's fundamentally an issue of management and cost-saving measures.

The two factors are equally important; it's just that in certain circles, tooling and language assistance are undervalued and already-good programmers are told to "just get good".

One thing that clouds the issue here is that it really is true that better coders do write fewer of these kinds of bugs. That's a real correlation and not a fictional thing.

So I think what we need to do is acknowledge that but keep it in perspective. While better coders write fewer security bugs, even the best programmers still write some. So it can't be our only line of defense.

Also, suppose for the sake of discussion it were true that top programmers did write zero security bugs. Realistically, as an organization, how would you ensure you employed nothing but exclusively these programmers? Even if you make it a top priority, you can't guarantee it.

Especially since the way you become a great programmer is by starting off as a less-great programmer and getting experience. And while you're doing that, you're churning out software which is by definition written by a less-great or not-yet-great programmer.

And to take it further, let's just consider that we as programmers have a lot of work to do. The current scope of "things people are writing code for" is nowhere near the total scope of possible useful things we could be working on. We want to enable more people, even if they are not "the best" programmers, to build software, safely.

People who suggest "well, just hire better programmers" are incredibly naive and probably have never actually had to deal with the challenges around hiring programmers.

In fact, I think it's the case that the most consequential security issues tend to come from the best programmers. That's because the best programmers tend to be the ones working on high-impact projects such as the Linux kernel. The more widely used the software, the more impact the security issues in that software have.

No need to say 'bad', but there is definitely a shortage of experienced coders.

- Competition/Compensation for experienced coders has risen sharply

- There are many more inexperienced SDE's coming from colleges/bootcamps/etc

- Senior titles are often given to individuals who are still very early in their careers

Now, these factors might be necessary/good in the short term as software continues to eat the world.

But, let's not pretend that the talent shortage isn't a problem.

Way too many developers are working on proprietary implementations of what is fundamentally a content management system. We should be creating half a dozen of these things and doing a little light customization and a few addons. Instead we have a couple, and devs look down on people who use them.

Either we are lousy at picking projects, we have a broken culture, or the problem is that we have too many developers and so people can ramp up a bunch of projects that have already been done hundreds of times elsewhere.

We’ve added lanes to the proverbial highway and traffic just gets worse.

Somewhat hilariously, the problem isn't lack of talent in programmers, but lack of talent in companies. Someone on this thread claims everyone is being stolen by top tech companies; that means it's you who needs to make your business more attractive, not that there's a shortage of coders.

Experienced programmers know they don't have to settle for less.

The two issues aren't mutually exclusive.

The supply/demand balance for experienced engineers is such that many companies are (explicitly or not) choosing to go with less experienced engineers.

Where are you specifically experiencing “a shortage of tech talent”?

Your issues in recruiting, interviewing, and onboarding are not the fault of tech workers.

>> With a normal mutex we would be fine, since you only one lock can exist and it doesn’t matter if we unlock it on a thread other than the one we locked it from.

Sorry if this is a dumb question, but I'm confused. Aren't mutexes always supposed to have ownership which implies that only the locking thread can unlock them?

> Sorry if this is a dumb question, but I'm confused. Aren't mutexes always supposed to have ownership which implies that only the locking thread can unlock them?

Mutexes are supposed to ensure that exactly one thread can access a resource ("have the lock") at a time. There's no fundamental reason you can't pass the lock from one thread to another, as long as they don't both have it at once. But it may not be supported by the particular mutex implementation. It's not supported by the recursive mutexes the author was using, and I'd bet there are also non-recursive mutex implementations which don't support it. I agree with the author that it's great Rust can catch this sort of mistake.
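To make the handoff distinction concrete, here is a minimal sketch in Python rather than the article's Rust. It relies on the documented behavior of the standard library: a plain `threading.Lock` may be released by any thread, while a `threading.RLock` tracks its owner and raises `RuntimeError` if a different thread releases it. The helper name `release_from_other_thread` is made up for this illustration.

```python
import threading


def release_from_other_thread(lock):
    """Acquire `lock` here, then try to release it on a helper thread.

    Returns None if the release succeeded, or the exception it raised.
    """
    result = {}
    lock.acquire()

    def worker():
        try:
            lock.release()
            result["error"] = None
        except RuntimeError as exc:
            result["error"] = exc

    t = threading.Thread(target=worker)
    t.start()
    t.join()
    if result["error"] is not None:
        # The handoff failed, so this thread still owns the lock.
        lock.release()
    return result["error"]


plain_error = release_from_other_thread(threading.Lock())
reentrant_error = release_from_other_thread(threading.RLock())
print("Lock handed off:", plain_error is None)
print("RLock handed off:", reentrant_error is None)
```

In Python the failed handoff only shows up at run time; the point in the article is that Rust rejects the equivalent mistake at compile time.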

btw, I think recursive mutexes and handing off locks are bad ideas. Both for the same reason: I want short critical sections to improve contention.

* Code that uses recursive mutexes tends to be sloppy about this; it's unclear from reading a section of code whether it even has the lock or not. (This also sounds like a recipe for deadlock when you need multiple locks.) I'd much rather structure it so a given piece of code is run only with or without the lock. In C++, I use lock annotations [1] for this. If I need something to be callable with or without the lock, I might have a private "DoThingLocked()" bit, and a public "DoThing()" bit that delegates while holding the lock. This should also be more efficient (though maybe it's insignificant) because there's no run-time bookkeeping for the recursive mutex.

* Handing off the mutex to another thread also feels like a smell that you're holding the lock longer than you need. I don't recall a time I've ever needed to do it. From the description here, it seems totally reasonable to hold a mutex while getting a connection from the pool and while returning one to it, but not between. I'd think you could get the connection, then create the new thread (passing the connection to it).

[1] https://clang.llvm.org/docs/ThreadSafetyAnalysis.html
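A rough sketch of both points above, in Python instead of C++ (so there are no lock annotations, only a naming convention). The class `ConnectionPool`, the `_take_locked` helper, and the string "connections" are all hypothetical stand-ins; the idea is that the lock is held only while the free list is touched, and the connection itself is handed to the new thread, not the lock.

```python
import threading


class ConnectionPool:
    """Toy pool: the lock is held only while the free list is touched."""

    def __init__(self, connections):
        self._lock = threading.Lock()
        self._free = list(connections)

    def _take_locked(self):
        # Convention: caller must already hold self._lock.
        return self._free.pop()

    def take(self):
        # Public entry point: grabs the lock, then delegates.
        with self._lock:
            return self._take_locked()

    def give_back(self, conn):
        with self._lock:
            self._free.append(conn)

    def free_count(self):
        with self._lock:
            return len(self._free)


def run_on_new_thread(pool, results):
    conn = pool.take()  # lock is released again before the thread starts

    def worker():
        results.append(f"used {conn}")
        pool.give_back(conn)  # lock held only inside give_back()

    t = threading.Thread(target=worker)
    t.start()
    t.join()


pool = ConnectionPool(["conn-a", "conn-b"])
results = []
run_on_new_thread(pool, results)
print(results, pool.free_count())
```

With this structure a non-reentrant lock is enough, because no code path tries to take the lock while already holding it.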

Thanks, this adds a whole new perspective for me wrt mutexes. I wasn't aware of all these other usage patterns for them at all.

Most of my work is in the OS, drivers and low level space, and I'm a beginner there as well, hence the short critical sections under a single owner are the only places I had encountered them before.

A reentrant mutex will allow you to lock it multiple times so long as you're on the same thread. That means if Thread A takes the lock, and later some code on Thread A takes the lock again to get a connection to pass to Thread B, you'll have the connection the mutex was protecting on two threads at the same time. The Rust version of the mutex prevents this by making the data the mutex is protecting unable to be sent to other threads. That means you can only share the mutex, which will block on Thread B when you try to take the lock, as expected.
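The same-thread re-locking behavior is easy to see in a small Python sketch (not the article's Rust code). `can_reacquire` is a hypothetical helper; it uses a non-blocking second `acquire` so the plain lock reports failure instead of deadlocking.

```python
import threading


def can_reacquire(lock):
    """Take `lock`, then try (without blocking) to take it again on the same thread."""
    lock.acquire()
    try:
        got_it_again = lock.acquire(blocking=False)
        if got_it_again:
            lock.release()  # undo the second, nested acquisition
        return got_it_again
    finally:
        lock.release()  # undo the first acquisition


print("RLock re-acquired:", can_reacquire(threading.RLock()))
print("Lock re-acquired:", can_reacquire(threading.Lock()))
```

The reentrant lock lets the second acquisition through, which is exactly how code on one thread can end up handing out the protected resource twice.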

In almost any implementation you don't want to track the owner because it is only an unnecessary overhead and nothing else.

On a similar note, the article strikes me as a somewhat contrived example, because the most obvious implementation of a reentrant mutex also does not care about the owner thread.

Edit: Another reason why it feels contrived is that a reentrant mutex represented as a first-class data structure (in contrast to a monitor as a language-level construct) is to some extent only a hack to solve issues stemming from improper design.

The main problem is that being secure doesn't increase the revenue of companies substantially, otherwise C/C++ programs would be rushing to Rust. Still, the speed of improvement is great.

I'd love to see Firefox take over Chrome in speed and show that C++ is getting closer to being an outdated language.

Contrasting C and Rust completely misses the issue. Yes, obviously Rust will do away with some important classes of bugs and security vulnerabilities, but it's got nothing to do with addressing the problem. The problem is that there are billions of lines of C out there, and it will take many decades to replace them with software that's written in a memory-safe language. The issue is what do we do with all that existing software, as rewriting it at any reasonable cost or time frame is simply not an available option. A possible solution could be something like sound C static analysis tools that can guarantee no overflows etc. without a significant rewrite. The question we should be asking is how easy it is to use those tools, how scalable they are etc.

No, it is the issue. Before one can even consider the proposition that it might not be wise to write X in C, you first need to convince people that the tool (that is, C) is actually a problem. That's what the OP is targeting: people think the problem isn't the tooling, but the programmer.

The task of figuring out how to actually use the language is an entirely separate, though valid, problem. But it's not some giant mystery. The folks working on Firefox haven't set out to rewrite the entire thing in Rust. They're choosing targeted places where it works. Insomuch as I know, this has been a success.

But even if (assuming that safe languages have reached a point where they can be a suitable replacement to C in all circumstances, which they haven't yet) not a single new project is written in C from now on, it will take decades for our software to become more secure. The question is what do we do until then. There are some excellent tools that can be used, but aren't.

Who are you arguing with? Not me, because I didn't claim otherwise. And certainly not the OP. You're missing the part where people don't even recognize the problem in the first place. You have to fix that first.

I think enough people recognize the problem with C. But a rewrite is just not a relevant solution to the actual problem we're facing. It's a long-term solution, but that's not what we need now.

> people think the problem isn't the tooling, but the programmer

I'd extend it and say that choosing a bad tool is a sign of a bad programmer (even if the programmer is so skilled that they use C very well!). But the resistance to recognizing C/C++/JS as BAD, REALLY REALLY BAD, PLEASE DON'T USE THIS ANYMORE (only in rare exceptions, and even then don't use it, ok?) makes this a full circle.

We need more "good" programmers to say this, much like how people say "use of carbon/fuels is bad, and the fact that THE WHOLE ECONOMY OF THE PLANET depends on it is not an excuse to not replace it ASAP!"

And yes, why not make a concerted effort to build real replacements for well-chosen libraries? I think, using the Pareto principle, we can target the 20% most popular stuff and have a huge impact.

There's a whole segment of CompSci dedicated to doing that. Softbound+CETS and SAFEbound are two of the better examples giving C memory safety. Data Flow Integrity is a newer one that might be combined to bring security against data-oriented attacks.

The big issue with such tools is that C's lack of design and low-level nature make the tools have to be extra careful in ways that add extra overhead vs languages designed for verification (eg Gypsy, SPARK Ada). So, the slowdowns can be huge.

I still like them, though, given any approach that works turns the problem from "recode critical, legacy apps securely" to "buy a load balancer and some more servers." Huge improvement in feasibility.

Far as proving absence, RVI's RV-Match can do that against a full (or nearly so) semantics of C in K Framework. Their semantics, KCC, is mocked up like a GCC compiler to make it easy to try with code. It also gets stuck (fail-safe) on undefined behavior.

And AbsInt with Astree Analyzer, used by Airbus. Those are the three I know. Rosu's group open-sourced K with their language work and publications full of awesome stuff. Since I'm a "vote with your wallet" guy, I always mention RV-Match over the others to hook up a company whose people are hooking us up with great building blocks for free. Just figured I should be open about that. :)

> ... use a re-entrant mutex when it was acquired from the pool ... normal mutex we would be fine, we unlock it on a thread other than the one we locked it from ... re-entrant mutex remembers which thread it was locked from, we need to keep the resource on the same thread

Sounds like a "bad coders" problem to me! This design is so screwed up that no tool or smart compiler can save you from problems here.

I think this argument grants too much to the C-believer crowd by presenting a case where even an infallible programmer would have ended up making the mistake. The author is arguing with people divorced from reality on their terms.

Next, you'll get counterarguments from the peanut gallery proposing that merging the changes was just the fault of bad process, and as long as people don't make the mistake of applying a bad process, everything would be fine.

Possibly sounds like a bit of a design issue to me, author /assumed/ that reentrant mutexes wouldn't be added to the code, and this wasn't documented or tested for properly...

TL;DR: "[Coding perfectly and anticipating any possible change to how the tools work] are not reasonable expectations of a human being. We need languages with guard rails to protect against these kinds of errors."

I definitely agree with this. And it also applies to a lot more than what the article is focused on (low-level security). It seems that right now the entire programming ecosystem seems to jump to "you just don't understand it" rather than "this should be more intuitive, or at least safer by design."

There are several quasi-related variations. Arguments against higher level languages and abstractions.


If you have to use a garbage collected language (eg, just about any modern language) it's because you're too stupid to know how to manage memory properly.

[ various arguments against type safe languages, turning runtime errors into compile time errors ]

Higher level languages and abstractions are just bloat. [or are too bloated, etc]

Managed language runtime systems are too [ big | expensive | bloated | slow ] etc. (eg, the Java runtime, or .NET runtime, to a degree also Python, JavaScript, Lisp(s), etc)

Counter arguments:

Any sufficiently complex program will need the managed runtime, garbage collection, abstractions, type safety, etc.

Greenspun's tenth rule: (https://en.wikipedia.org/wiki/Greenspun%27s_tenth_rule)

Any sufficiently complicated C or Fortran program contains an ad-hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp.

We could just write everything in assembly language, er, . . . um . . no, in hex code. And key it into the front of the machine on toggle switches with blinking lights. We could! Yes, really! So why aren't we? Why isn't, say, LibreOffice written in assembly language?

I use high level languages, runtime systems, GC, etc because I'm not optimizing for cpu cycles and bytes. I'm optimizing for DOLLARS. A long time ago, in a galaxy far, far away, computer hardware was the most expensive resource. Today developer time is the most expensive resource. Everybody is happy to have their word processor upgrade a year sooner even if it means it uses a mere extra 500 MB of memory.

> I'm optimizing for DOLLARS.

From the outside looking in, I mostly do firmware and test stuff. I see two things that seem to be mistakes developers are constantly making.

Lack of consideration for how much cash flow each transaction represents. Selling clicks for ad revenue is orders of magnitude less than selling machine tools. And the number of transactions is, if anything, even more unbalanced. Few transactions with high revenue per transaction usually means scale is unimportant and you shouldn't be paying for it.

And a quote from Robert C. Townsend's Up the Organization: big companies didn't become big by acting like big companies. Trying to emulate GM is a path to failure. When the book was written people thought of GM the same way people think of Google and Facebook. The subsequent history is potentially a warning as well.

Personally I think 'scale' is important in early stages if and only if you're building a 'bomb', a company that's designed from the start to grow orders of magnitude in five years. So really people should ask where are we really honestly going to be in five years and target that.

The devil is in the details usually here. Bloat can absolutely affect your bottomline in many ways.

One man's 'bloat' is another man's features.

Consider both Microsoft Word and Notepad.

Which is more bloated? Which is more light weight?

Which has more powerful features, some that you don't even notice right away, like spellcheck, grammar check, etc?

Which has fewer features?

People usually complaining about 'bloat' in a high level language or its runtime are really complaining about features that they either don't use, don't understand, or don't even realize are there.

> Which is more bloated?

Microsoft Word

> Which is more light weight?

Notepad

> Which has more powerful features, some that you don't even notice right away, like spellcheck, grammar check, etc?

Microsoft Word

> Which has fewer features?

Notepad

Users who are complaining about bloat are usually complaining about either application performance, ease of use and basic task completion complexity and/or general cluttering of the UI. Longer than expected load times can lead users to believe an application is bloated as well. It's true that all these things might be caused by something other than bloat, but these are typical repercussions of adding general cruft and also these tend to be design decisions and implementation details separate from the feature that the bloat was added for.

Developers complaining about bloat (not the ones on HN giving a critique of a github repo after browsing it for a couple minutes, but the ones who are actually knee deep in source code trying to meet a deadline for a specific feature or bug fix and to whom the bloat is literally a gravity well affecting their development pace and agility) are usually referring to previous development cycles that they believe unnecessarily introduced complex libraries for simple tasks or outsourced application logic to dependencies that are not under control of the in-house development that have unnecessarily made the code base unwieldy and created more rotten code overall from a maintainability/extensibility vantage.

In both these cases, a good early signal of detrimental bloat would be someone taking out a PR that introduces 500MB to the application. Justifying it as "I'm maximizing for DOLLARS" makes me think of someone who just won the lottery deciding that as a result of their net worth increasing a thousand fold, they're going to start willingly paying 1000x the price for things and also buy 1000x the amount of things. Should anyone try to convince them of how this logic is going to work out (perhaps someone stuck behind them in line at the store as they ring out the tens of thousands of grocery items, each of which provides a special value most people don't use, understand, or even realize is edible), the lottery winner calmly explains to the well-meaning individual (who happens to have just as high a net worth as the lottery winner it turns out but has achieved this over time and will have a higher net worth going forward from here) that perhaps this well-meaning simpleton just doesn't realize how much money has just been won.

And sure, an application like Microsoft Word that has been actively developed for more than 35 years will likely have bloat, especially when maintained to preserve backward compatibility. There's a statistical minimum in this regard. Whether or not the bloat is as bad as it needs to be is another thing.

If the development team behind Word has simply been adding 500MB here and 500MB there and defending it with "we're not going to write the new features in assembly" or "this helps me as a programmer be more productive" or "this source modification is so large because of all the features you don't know about" or "what?! we're not in North Korea" or all of the above, they are 100% creating more work for themselves and this extra work definitely comes with a price tag. In fact, without really having much insight into the general culture or incentives over there, it's a reasonable bet they've taken steps to prevent developers from introducing tech debt and bloat with this sort of defense.

Either way, it's weird to point to Microsoft Word's accumulated debt and overall decreasing performance specs as a result of multiple decades of evolving development as a reason to dismiss bloat as a non-issue. If I were a career non-bloat advocate, traveling from university to university on my canola oil powered moped, I'd be pointing to Word as a reason to take bloat seriously and a reminder that the feature you're adding now will likely be on the opposing force of a future person's battle with bloat and you should fight it with all the flower power you can muster or risk them tracking you down with a vendetta after they realize it was your commit making their job shit right now.

I'm torn on this, because while we can always do better, sometimes there are things you just gotta know; this is the best we can do, so far. There are also lots of lazy programmers who think everything should be easy and intuitive, when in reality they need to do more to understand the tools they have and hone their craft. I refuse to call them bad, because I think most people can reach expert level. Maybe not master or grandmaster, but experts would do.

Wanting better, safer, and more intuitive tools does not presume that people don't and shouldn't grow.

Even with the tools there are plenty of bugs/challenging problems in just understanding and correctly expressing the problem domain in those safe and intuitive tools.

The two positions are not incompatible.

Completely agree. That said, we have what we have, and people tend to be very idealistic about these things (favoring one vs. the other). For example, I use Git and Make in C. Others claim "this is the future, everything should be intuitive and graphical". I honestly don't see a way forward for either technology to be more intuitive; better Git GUIs would be a plus for new users, but that's just the tip of the iceberg.

I'm not the type that thinks "you must do everything at the command line, or you're too stupid to be a decent programmer"; those people do exist. On the flip side, complaints about not being intuitive enough might be met with a shrug from me. Do we really want to use Eclipse and SVN, because you don't want to learn how to use the tooling? (of course, I'd never actually put it like that) Personally, I don't think it's a better solution. I'm not holding my breath for a complete overhaul of Git / Make, and am wary of trying something more novel for now.

Sure work doesn't need to completely halt while we wait for a perfect solution. But I think that where there are safer and more intuitive tools we should use and promote them.

> It seems that right now the entire programming ecosystem seems to jump to "you just don't understand it" rather than "this should be more intuitive, or at least safer by design."

Stockholm syndrome? Hazing? Fear that skills that moat off expertise will be devalued?

Saying that I should do something that the computer can do for me is telling me that my time is not worth anything. It's pretty insulting.

> This wasn’t caught when I finished writing the code. It was caught weeks later, when rebasing against the other changes of the codebase. The invariants of the code I was working with had fundamentally changed out from underneath me between when the code was written and when I was planning to merge it.

Aside from the wider question about bad coders, I don't understand why he didn't catch this when he wrote the code. Didn't it fail to compile?

I believe the implication is that when he wrote the code it compiled just fine. There were no reentrant mutexes, so one can assume the database connection he was working with conformed to Send at that point, and it was only with the addition of the reentrant mutexes that the connection lost the conformance to Send.

This article hits the nail on the head. When speaking with someone who argues that we don't need memory-safe languages, just better programmers, I always like to ask if they've ever had to use a debugger/ever made a single mistake programming. What you're asking of programmers is that they never make a mistake, whether it be a memory-safety bug or just a regular bug. It's simply not possible.

I have argued many times it isn't the language that is the problem, necessarily. Never have I argued against tooling, though. Rather, toolchains can be added to without being replaced.

That is my problem with this blog and most posts like it. Your tooling should not begin and end with your compiler. It is a vital part of your tooling, no doubt. But if all you did was compile it, you are playing a risky game.

The problem often is needing to fight with less technical higher-ups for adequate time and resources for tooling/testing/infrastructure work.

Reading the argument laid out in this post, I disagree. It most certainly is due to negligence. Human negligence. Poor communication and project management.

Maybe, you can't correlate this to "bad coders", however, if the coders have the issues described within this post then most certainly I would consider them "bad coders" when paired together.

As a Python user I understand the need for better invariant checking, but should it be encoded in types, contracts or conventions?

I vote for all three.

    * Types so I can prevent all the low hanging fruit.
    * Contracts so I can encode invariants that the types
      can't always express.
    * Conventions so that everyone is talking the same
      language and there is less confusion.
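For the Python case, the three layers in the list above might look something like this sketch. Everything here is hypothetical: the `Percent` type, the `make_percent` and `apply_discount` functions, and the `_cents` suffix convention are invented for illustration, and the type hints only help if a checker such as mypy is actually run.

```python
from typing import NewType

# Type: a distinct name so a static checker can flag a raw float
# passed where a validated percentage is expected.
Percent = NewType("Percent", float)


def make_percent(value: float) -> Percent:
    # Contract: the range constraint is something the type can't express.
    if not 0.0 <= value <= 100.0:
        raise ValueError(f"percent out of range: {value}")
    return Percent(value)


def apply_discount(price_cents: int, discount: Percent) -> int:
    # Convention: the `_cents` suffix says money is an integer number of cents.
    assert price_cents >= 0, "precondition: price must be non-negative"
    result = round(price_cents * (1 - discount / 100))
    assert 0 <= result <= price_cents, "postcondition: a discount never raises the price"
    return result


print(apply_discount(1000, make_percent(25.0)))
```

None of the three layers catches everything on its own, which is roughly the argument for using all of them.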

My goal wasn't so much to promote any specific answer, just to rebut the argument that the solution to memory safety bugs is to have better C programmers.

Yes. Often the difference between a mediocre programmer and an excellent programmer is the ability to use tooling effectively to understand how their program works and where it fails.

Types because they can be checked mechanically, simply enough that a programmer can understand the checking process and why it fails or succeeds, but are expressive enough (given some relatively cautious/widespread advances - HKT and some level of dependent typing) for every invariant I've ever needed.

The problem isn’t bad coders for security bugs at Microsoft. But outside is the realm of SQL injection vulnerabilities, unpatched software, default passwords left unchanged, hardcoded passwords in code, unprotected MongoDB instances exposed to the internet, XSS vulnerabilities, etc.

There the problem lies between the chair and the keyboard.

The argument "it isn't language, it's bad coders" is a pretty good way to identify bad coders.

Is anyone else bothered by the use of the word "coders"? Like, coding is something you do to medical records. We're programmers, god damn it!

What is it with these people and controlling the choices of other, quite possibly more experienced, programmers?

Why is it not enough to offer better tools and let the rest take care of itself? Or to solve problems using a tool that fits your way of thinking? Why do you need a cult/marketing effort if the language is as good as it claims to be?

The more of this bullshit I'm confronted with, the less inclined I am to ever let Rust slip into a project I'm involved in.

Because ecosystems are driven by mindshare (popularity), convenience, potential barriers to adoption, public shaming (eww, that's nasty, but .. people are people), and other psychological micro-foundations. Basically it's a cold war of persuasion. Sometimes leading by example works, sometimes by showing how awesome, cool, fast, safe your shiny stuff is, sometimes it works by appealing to people's sense of the "greater good" (how many Korean, Chinese, Iranian, Saudi, etc. democracy activists are in secret prisons because of broken C code).

And of course there's some truth to it. Look how Py2 is still not dead, because rewriting twisted is hard. (Not that anyone said it was easy.) And how long it took for distros to make Py3 the default, and how long it took for anyone to not default to Py2. And of course there were people even complaining about how Py3 broke all their nice code that worked before by accident.

So if collectively everyone had made a push some years ago, we would be long done with it. But of course organizing these things is an even bigger problem than just sitting down and firing off PRs to twisted.

In what way is this aiming to control programmers? This person is explaining how they would have experienced a difficult bug but a certain technology helped them avoid it. Seems like it's just "offer[ing] better tools".

You shouldn’t rely on your tools protecting you because they won’t.

I do part time work as an external examiner, sometimes for first year CS students, so I get to see a lot of silly code. Like the result of having been tasked with doing Fibonacci recursively. Which, as most of you no doubt are aware, can be done correctly in several different ways. The most basic is to simply implement it with its two base cases and recurse until you reach them for every call. A more efficient way is to implement it with a way of keeping track of the Fibonacci numbers that you have already computed, so as not to compute them more than once.

Both these ways work perfectly fine within the toolset, one just does it a lot better.
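For anyone who hasn't seen the two variants side by side, a minimal Python sketch follows. The function names are made up, and `functools.lru_cache` is just one convenient way of "keeping track" of already-computed values; a hand-rolled dict would do the same job.

```python
from functools import lru_cache


def fib_naive(n: int) -> int:
    """Plain recursion with two base cases; recomputes the same values many times."""
    if n < 2:
        return n
    return fib_naive(n - 1) + fib_naive(n - 2)


@lru_cache(maxsize=None)
def fib_memo(n: int) -> int:
    """Same recursion, but each value is computed once and then cached."""
    if n < 2:
        return n
    return fib_memo(n - 1) + fib_memo(n - 2)


print(fib_naive(20), fib_memo(20))
print(fib_memo(90))  # fine with the cache; the naive version would take far too long here
```

Both return the same results; the naive version makes an exponentially growing number of calls, while the cached one makes a linear number.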

> This is a very basic example, but most students solve it the inefficient way, but in most situations you’d really rather have the ones who did it better.

I would much rather have the student who solves it the inefficient way but who runs a profiler on their code with a real-world input, determines that their naive algorithm is unacceptably slow for certain inputs, and makes an adjustment to optimize it.

If I'm only ever going to use the first 16 Fibonacci numbers I don't really care if they're brute-force computed recursively. I'd much rather spend valuable engineering time on optimizing something that actually matters. Requiring students to prematurely optimize some ivory tower algorithm as a precondition to working in the real world is precisely why we have a shortage in the first place. You should be training the students to optimize the algorithm when it needs to be optimized, and not just hiring people who've memorized a few very specific algorithms and can bark them on command during your interview.

My response would be “just use a loop and get it done so we can move onto the next task”. There are only two pieces of state here. Let’s not drag out the big guns and make something hard to read.
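The loop with two pieces of state could be sketched like this in Python (`fib_iter` is a hypothetical name):

```python
def fib_iter(n: int) -> int:
    """Iterative Fibonacci: only the last two values are kept."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


print([fib_iter(i) for i in range(10)])
```

No recursion depth to worry about and no cache to manage, which is presumably the point about not dragging out the big guns.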

Fibonacci is too simple a case to demonstrate the power of recursion, and people have beef with the textbook examples. It has confounding factors that actually make recursion an overcomplicated solution, and not enough factors to make it a good example for dynamic programming. Also it simply doesn’t scale to inputs that only take microseconds to calculate.

It’s like those interview questions that leave you with more questions than answers, like “are they bad at interviews or just crazy?”

Exactly. It's like a golden opportunity for me to demonstrate some engineering chops and instead the interviewer expects me to just whip out some caching implementation for some nebulous reason because that's what they would have done. If there's a real problem with either of our naive implementations the proper thing to do is to tell us what the problem is and leave us free to solve it.

| You shouldn’t rely on your tools protecting you because they won’t.

Strong agree. It's like when your toilet overflows and instead of using a plunger (because sometimes this causes a splash that will get on your shoes) you just reach in with your bare hands. I've used this technique hundreds of times and I've never gotten any toilet water splashes on my shoes.

[I get what you're trying to say, but it doesn't come off very well. All things being equal, tools are better than not tools. Being able to think and having tools is better yet, but why not give people tools if there are other options.]

How does a plunger protect you? Does the water harm your shoes, or did you neglect to spray protectant on them? At some point, humans are involved and this requires a level of responsibility.

Whether you're responsible for maintaining your tools, so that they don't break, or maintaining your shoes so that when something goes wrong there are safety measures in place to ensure minimal impact.

I'd say these measures and responsibility mark a "good" "coder", and the lack of them marks a "bad" or "inexperienced" "coder".

I'm using quotations because people enjoy pontificating over language. Huge circle jerks that are a waste of time.

I don't really understand how this is relevant.

Sure, memoization is more efficient for recursive fibonacci, but in both cases the code is correct.

The author is presenting a case in which code is made demonstrably incorrect by seemingly unrelated changes, because those changes altered invariants about the objects the author was using - making his assumptions about which actions were safe incorrect.

That's a good time for tooling to throw warnings/errors. Where you seem to be proposing... nothing?

If I saw somebody solving it with iteration I would be suspicious that they have a problem with recursion. Unless performance was part of the problem of course.


The most efficient way is to use iteration...

If I am paying them, the most efficient way is to use Google....

The problem here is the author’s development process. There’s an “old” programmer adage in regard to version control: merge early, merge often. It addresses the author’s issue directly.

So yes, it’s not the code that’s bad. It's the development process. Bad (ok, fine, inexperienced) developer.


That's the point.

There is always something you haven't thought of (or did wrong), once you get to a certain level of complexity.

Regardless of language.

You'll have different kinds of bugs, but as long as there are programs they will contain bugs. Or the languages themselves will, or the libraries used, or the OS.

Assuming the presence of bugs/failures and providing powerful tools to deal with them like Erlang is a more realistic option if you ask me.
