Move Your Bugs to the Left (samwho.co.uk)
56 points by samwho on Sept 6, 2017 | hide | past | web | favorite | 27 comments



Ha! That's a great visual. I think that explains very well why I like strongly, statically typed languages (for large projects) - the larger a project is, the slower feedback is the farther right you go.


Thanks! I'm glad you think so. I couldn't agree more re: strong, statically typed languages. The more information you give your compiler, the more it can help you Do The Right Thing. :)


Just as a counterpoint:

Number of hours I've spent fixing type-related bugs in Python code = n

Number of hours I've spent figuring out how to make the compiler happy in C# ~= n * 5

I'm not being altogether serious (I've written a lot more Python than C# so it's not a fair comparison) - but you're ignoring a lot of useful debate about static vs dynamic, explicit vs implicit, type inference, languages for teams vs languages for hackers etc etc. It's fine to not go down the rabbit hole but at least acknowledge there is some debate on this topic.

Otherwise you give the impression that you think those of us who prefer dynamic languages are deluded oafs who are going to wake up one day, slap our foreheads and exclaim "Doh! It's only just occurred to me how useful explicit typing is. Silly me..."

EDIT - I've always enjoyed this post to python-dev by Cory Benfield explaining why he disliked the idea of introducing type annotation to the requests library.

https://lwn.net/Articles/643399/

Whilst it's not an indictment of all possible type systems it does give some flavour of how the mind sets differ in static and dynamic language communities. There is a genuine cost in expressiveness that surely can't be entirely dismissed out of hand.


That's a refreshingly nuanced comment in a debate where people tend to be very quick to pick a side and stick with it no matter what. You also make a very good point regarding languages for teams vs hackers - this is overlooked very often.

I'd just like to point out that grouping dynamically typed languages together and comparing them to statically typed ones as a whole is really not useful at all, due to the huge variations within each group in terms of correctness and expressiveness.

When people compare dynamic vs static languages they generally compare the languages they're familiar with and often assume that they are representative of the group to a sufficiently high degree. They're not. Using Clojure for example is a very different experience from using Python and not just cosmetically. Clojure with spec is a very different experience from plain Clojure. Haskell is very different from C#.

I'd encourage anyone with an interest in using the best tool for the job and increasing their effectiveness as developers to at least dabble in what I would consider to be the state of the art in practically usable, more or less mainstream languages in both camps - Haskell and Clojure (with an extensive use of spec).

You are likely to realise that statically typed languages can be way more expressive and less bend-over-backwards-for-no-good-reason-to-make-the-compiler-happy than C# or Java and dynamic languages can be significantly more capable of ensuring correctness and 'moving bugs to the left' than Python and JavaScript.


What they really want is "pattern matching on interfaces", aka Structural Typing. This is coming to MyPy soon in the form of "protocols": http://mypy.readthedocs.io/en/latest/class_basics.html
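A minimal sketch of what a structural protocol looks like, using `typing.Protocol` (which later landed in the standard library in Python 3.8; at the time of this thread it lived in mypy's `typing_extensions`). The class names here are illustrative, not from the linked docs:

```python
from typing import Protocol


class SupportsQuack(Protocol):
    def quack(self) -> str: ...


class Duck:
    # note: Duck never declares that it implements SupportsQuack
    def quack(self) -> str:
        return "quack"


def make_noise(animal: SupportsQuack) -> str:
    # mypy accepts Duck here because its method signature
    # matches the protocol structurally, not nominally
    return animal.quack()


print(make_noise(Duck()))
```

This is exactly "pattern matching on interfaces": any class with a matching `quack` method satisfies the protocol, no inheritance required.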


Found it nice as well, but one could argue that a couple of boxes are missing to the left: Requirements and Design. See for instance https://www.experimentus.com/itm/W_06_00055_The_Cost_of_Defe... (love linking to "waterfall methodology" pages - makes me feel old :))


Ahh, great point about design and requirements. If I do a revision of this post I’ll be sure to include them. Thank you :)


I do Python daily, but I also am familiar with Java and C++. In my experience with these languages I would argue that static type checking does not really help catching bugs other than the most trivial ones. Don't get me wrong, I like these languages, because they yield fast code, but being less prone to bugs is a myth.

Without a good unit test suite you are doomed in any language.

I actually think that writing bug-free code (ahem!) is most easily done in Python, because it is just much easier and faster to debug, test and fix.

Java is pretty okay too, and the code is fast. I would choose Java for any large project that does not have to deal with string processing. Although it is still much slower to develop in than Python, it has the best technology stack ever!

Well, when you get your C++ program to compile, it is still probably wrong. Only after running it with Valgrind to check for problems at runtime could I have some confidence that it is working. I would choose C++ only for specific algorithms and data structures that I need to be very efficient. One major advantage of C++ is that it is very convenient to interface with from any other language.

That said, note that I compared these three languages fairly unfairly and based my opinions on my own skills and experience. Still, I would bet that if you were to write any program in Python vs C++, the Python program would not only get completed much faster, but would also have fewer bugs.


"I do Python daily, but I also am familiar with Java and C++."

I think if you really want to participate in the "strong vs. dynamic type" conversation, you need to pick something like Rust or Haskell for your strong typing. Mere "manifest" typing is a very weak type of strong typing. It can be coerced into giving some nicer guarantees via some trickery with things like constructors setting private fields and validating them at the time. For people who advocate proactively preventing bugs via type systems, the type system that C++ or Java has is not very good at it.
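The constructor trick mentioned above (validating at construction so the type carries a guarantee) can be sketched in any language; here is a hypothetical Python version of the pattern, with an invented class name:

```python
class NonEmptyString:
    # "smart constructor" pattern: the invariant is checked exactly once,
    # at creation, so every holder of a NonEmptyString can rely on it
    def __init__(self, value: str) -> None:
        if not isinstance(value, str) or not value:
            raise ValueError("value must be a non-empty string")
        self._value = value

    @property
    def value(self) -> str:
        return self._value


name = NonEmptyString("samwho")
print(name.value)
```

In Haskell or Rust the compiler can additionally guarantee that no code constructs the value any other way; in C++, Java, or Python that guarantee rests on convention (private fields, module boundaries), which is the "trickery" being referred to.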

Even Go v. Python is a more interesting comparison than you might initially think. I prototype things in Go pretty much as quickly as I could in Python now, but the prototypes are way easier to transform into production code after the first couple of thousand lines. In fact, given how much time I spend with network servers and concurrency I am probably prototyping more quickly than I could in Python, and still having an easier time converting to production services. Go also has a terribly weak type system from the point of view of those who wish to proactively prevent bugs with them (weaker than C++ and Java), but since it has GC and array bound checks it grabs a lot of the low hanging fruit.

(See https://www.joelonsoftware.com/2004/06/13/how-microsoft-lost... , under the heading "Automatic Transmissions Win The Day".)


"I do Python daily, but I also am familiar with Java and C++. In my experience with these languages I would argue that static type checking does not really help catching bugs other than the most trivial ones."

Guido van Rossum seems to be more enthusiastic about static type checking. He has recently put a lot of effort into mypy[1], a static type checker for Python.

Dropbox, the company that Guido works for, has committed to adding static type annotations to their entire 4,000,000 line codebase and checking it with mypy. As of 2016 they had done 400,000 lines[2], and 700,000 lines as of 2017 (after 18 months total).[3]

Guido also mentioned that Dropbox's desktop client team, one of the most important teams at Dropbox, has mandated that all new code have static type annotations and be checked with mypy.[2]

[1] - http://www.mypy-lang.org/

[2] - https://www.youtube.com/watch?v=ZP_QV4ccFHQ

[3] - https://www.youtube.com/watch?v=7ZbwZgrXnwY
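For flavour, the annotations mypy checks are ordinary Python 3 function syntax; the code still runs unannotated-style at runtime, and mypy flags mismatches before you run anything. A hypothetical example:

```python
def greeting(name: str) -> str:
    # annotations are plain syntax at runtime; mypy enforces them statically
    return "Hello, " + name


print(greeting("Dropbox"))   # fine at runtime and under mypy
# greeting(42)               # mypy would flag the int argument before the
#                            # code ever runs and hits a runtime TypeError
```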


Thanks for the links!

I am watching [2] at the moment and I would just like to give my perspective on how I would approach this problem.

To rehash, the question in the video is that just by looking at the following code:

    for entry in entries:
        entry.data.validate()
What does it do? Without context it is hard to know, and one would have to search for all functions named validate(), which is hard, or rewrite it in a statically typed language, which is impractical.
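With type annotations the ambiguity largely disappears: the reader (and a type checker) can see exactly which validate() is being called. A hypothetical sketch of what the annotated version might look like (all names invented):

```python
from typing import List


class Payload:
    def __init__(self, value: int) -> None:
        self.value = value

    def validate(self) -> None:
        # hypothetical check
        if self.value < 0:
            raise ValueError("value must be non-negative")


class Entry:
    def __init__(self, data: Payload) -> None:
        self.data = data


def validate_all(entries: List[Entry]) -> None:
    # the annotation tells the reader (and mypy) that entry.data
    # is a Payload, so validate() can only mean Payload.validate()
    for entry in entries:
        entry.data.validate()
```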

The way I've been solving these situations is that I look into the unit test of the relevant function to understand what is going on. Unit tests also act as concise examples on how to use the code with example data. Also, if necessary, I can just debug the relevant unit test and step through the function and see in real-time what is going on.

But I am not trying to claim that type annotations are a bad idea, I just haven't used them myself yet.


I make enough "trivial" errors that I find static typing useful on larger projects. Static vs dynamic typing in this context is mostly about when you catch the bug, not "if" you'll catch it per se, so you're right that static doesn't necessarily mean fewer bugs. But it might still mean they're caught faster!

Individual type errors have occasionally eaten a distressing amount of my time in dynamically typed languages - in terms of debug time and context switching, when I make the mistake of, say, trusting Javascript docs, in a way that mostly doesn't happen in statically typed languages IME. They help keep these trivial bugs trivial. Sadly most types systems (static or dynamic) aren't leveraged well enough to catch, say, SQL data being passed to things expecting SQL commands, so some issues still slip through.

> Without a good unit test suite you are doomed in any language.

I write a lot of code that is difficult to meaningfully unit test. How do I properly unit test a platform-agnostic wrapper for gamepads - hook up a Raspberry Pi to some servos? How do I share that with our build server cluster? When is it a bug if Nvidia and Intel GPUs render my scene slightly differently - what specific fuzzy conditions qualify as "bug" vs "expected variation"? How do I catch a missing memory barrier in multi-threaded code with unit tests? Sometimes an ARM processor with its weaker memory semantics can help, but that still won't catch cases where triggering the bug requires the optimizer to do something specific.

> Well, when you get your C++ program to compile, it is still probably wrong.

While I agree in general, there are counterexamples. Even in the tire fire that is C++, static typing and compile time checks mean that if it still compiles, my rename refactoring in a large project was probably successful. If I rename something in Python, I have no such assurance, and probably updated the name where it was actually referencing something else, or failed to update the name where I thought it was referencing something else and it wasn't. So I have to test "everything". If I'm lucky, my manual audit of the codebase will show that this thing was only used by things that I know are well covered by unit tests. But let's be honest, I'm not that lucky.


The big problem with this is, the more left you are the less do the errors you get have to do with reality.

Fiddling around with (often unsound) type systems, writing mocked tests, dealing with lazy or unmotivated reviewers and QAs: these are all things that will cost time and money...


Isn't it common knowledge that bugs should be caught as early as possible?


"Just write the program so it works" doesn't seem like helpful advice.


It’s like good financial planning. Everyone knows you should practice it, but lots of people don’t know how. Or they don’t know all of the tips and tricks people have discovered over the years. Or they don’t know how to talk and ask questions about it.


> Prefer using languages with type systems for writing code that needs to be correct. If your language does not have a type system, defensively check that the data you’ve been passed is correct at runtime, and throw an error if it isn’t.

Hmmmm. Several decades of nuanced debate summarised inaccurately in one neat little hand-wavy paragraph.


Can there really be any serious debate about the idea that type-checking increases the likelihood of correct code? Pretty much all the arguments against it have to do with other things; I've never heard anyone claim untyped code is more likely to be correct.


> Can there really be any serious debate about the idea that type-checking increases the likelihood of correct code?

I agree. I didn't say otherwise. The passage I quoted said "prefer using languages with type systems for writing code that needs to be correct" which seemed fairly sweeping to me. Those "other things" you mention sometimes matter a lot.

I also, to a lesser degree, take issue with "If your language does not have a type system, defensively check that the data you’ve been passed is correct at runtime, and throw an error if it isn’t." This seems like an overly strict guideline. Most tests will check types as a side effect of whatever they are doing, and the advice to add runtime type guards into every piece of code I write is worse than telling me to just switch to an explicitly typed language.


In general, I agree with the advice. Even in some typed languages, like C#, it can make sense to open methods with guards like `if (someArg == null) { throw new ArgumentNullException(nameof(someArg)); }`. If your code is making some assumption about its input, it should validate that that assumption is true rather than assuming nobody will call it the wrong way, most of the time. Maybe if you're writing Ruby you'll check if the object responds to a particular method call rather than explicitly checking the type, but the principle is the same.
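In a dynamic language the same guard pattern might look like the following sketch (the function and its rules are invented for illustration):

```python
def withdraw(balance: float, amount: float) -> float:
    # validate assumptions about the input instead of
    # trusting that every caller gets it right
    if not isinstance(amount, (int, float)):
        raise TypeError("amount must be a number")
    if amount < 0:
        raise ValueError("amount must be non-negative")
    return balance - amount


print(withdraw(100.0, 30.0))  # 70.0
```

The point is that a bad call fails immediately, at the boundary, with an error naming the violated assumption, rather than somewhere deeper in the stack.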


> If your code is making some assumption about its input, it should validate that that assumption is true rather than assuming nobody will call it the wrong way

This kills readability. Stuff like that belongs in tests. Code should express the intent and logic and anything extraneous should be abstracted away or not be there at all. Code should aspire to reading like pseudocode.


If it's “private” code that is not externally exposed, that may be tolerable, though it increases coupling and may harm maintainability, especially if the system is large enough to have different teams supporting the function vs. some or all of the call sites.

If it is publicly exposed, you are more likely to need to validate assumptions (if they aren't statically enforced), otherwise you are prone to creating undefined behavior.


How are tests supposed to prevent new callers from violating the assumptions?


I wasn't thinking about library code but I guess that's a fair point. However I just don't recall this being enough of a problem in real world situations. And I don't recall many Python libraries that bother to do this.

I just looked through a few stdlib functions and other popular libraries and I'm not seeing much trace of validity checking as a standard working practice:

    >>> from random import randint
    >>> randint(1, "strawberry")
    TypeError: cannot concatenate 'str' and 'int' objects
This fails further down the stack - and would succeed for any types that allow addition with an int (and might fail somewhere further on in those cases)

This aligns with Python's encouragement of duck typing but directly contradicts what you seem to be claiming as a universal best practice?


I haven't worked in Python, so I can't speak to what Python programmers typically do. But I would much rather get an intelligible error like "you passed a float to the doMath function when it expects an int" or even "the 1st argument to the doWork method must implement the toFoo method" than that kind of error, because the fix is immediate and obvious. I'd prefer that even if we're talking about something that isn't really "library code." It may well be true that most people with this kind of view gravitate away from working in Python; I certainly hated working in Ruby.


Maybe that's it. That error is fairly obvious to me but I admit in some cases you have to traverse the call stack to figure out where the incorrect value got introduced.

I still feel I'd prefer the occasional inconvenience in tracking down a bug over some of the pain I've experienced in trying to adapt my brain to a statically typed language. As another poster commented it really only makes sense to compare a cutting edge dynamically-typed language with a cutting edge statically-typed language if you want to debate the strengths and weaknesses of each. Otherwise you're just comparing specific language weaknesses and strengths.


I don't know; to me it doesn't seem like much of a cognitive tax to have some guard clauses that can easily be ignored at the beginning of a function and declaring arg types is less annoying than digging through a method trying to figure out what in the world the expected input is. It gives you a way to look at a pretty minimal portion of the code to figure out how to use it and otherwise treat it as a black box, ignoring the implementation details (well, ideally; of course there is some leakage).



