
There's a distinction I've noticed ever since university CS:

The code-starers and the experimentalists. When debugging, the code-starers would, well, stare at the code and try to reason this out logically. There will be statements such as "this cannot be the case, because ...".

The experimentalists would run the program with variations in input and code and observe the effects in order to figure out what was really happening.

As you can probably tell, I am an experimentalist, and I find the code-staring approach puzzling: (a) the world is not as you expect it to be, so reasoning about what can or cannot be the case is of limited use, as your reasoning is evidently faulty (usually because of wrong assumptions). (b) There is a real world you can ask questions of; why would you not want real answers??

Philosophically and historically, I consider the move towards empirical science (Bacon etc.) probably the greatest advancement humanity has made to date. It certainly is dramatically more effective than any other method of separating truth from non-truth we have, whereas scholasticism...

I'm a "code starer" that uses experiments to inform the "staring" phase. I look hard at the code, think to myself "this can't happen... unless", I formulate the hypothesis and then use experiments to prove/disprove it.

While the shotgun approach to science (run random experiments and see what correlations fall out) can be useful at times, there's a benefit to figuring out precisely what you're looking for before you start looking.

> shotgun approach to science...random experiments

Not sure where you get that from someone extolling the empirical scientific method and science in general.

You obviously don't type random strings until by chance you hit on something.

Scientific method includes cycles of hypothesis, experiment, validation/refutation.

If you present your view as a simplistic dichotomy, you should not be surprised when it is taken as such, and if you put questions in your post, you should not be surprised if someone answers them.

The dichotomy was between experimentalists and code-starers. Not between random-code-typers and code-starers.

Your characterization of 'code-starers' is a caricature, and the questions you posed suggested that you did not fully understand the scientific method. TeMPOral's post was a reasonable response to it.

And "code-staring" is also not simply running your hand over what "must" be right and continually asserting that the bug is "impossible". It is exactly the step of debugging that informs the hypothesis to be tested.

Agreed. I can get caught up in a mess of different 'experimental' approaches and need that 'staring' phase to re-contextualize.

Especially when dealing with concurrency.

I think the experimentation helps identify the piece of code to stare at. Some people are just choosing staring at an earlier point than you are.

As with anything, I think you can swing too far. I had one junior developer under me who would constantly try different things, but this person was not an experimenter. There was no hypothesis being tested or reasonable mental model that they were working against. It was just a painful-to-watch spew of ill-formed and poorly thought out code of questionable syntax.

I tend to fall more on the experimenter side of things, but sometimes code staring is exactly the right approach.

Don't forget the "step-througher". I'm surprised nobody in this thread mentioned using a debugger. I tend to rely heavily on the debugger and stepping through the code. The two most common mistakes beginners make are 1. "printf debugging" and 2. simply trying different inputs and reasoning about what outputs they give. You just wrote the code yourself (or at least have the source)! It's not a black box with just inputs and outputs. Step through the darn thing and see if it's really doing what you think it should be doing.

I agree stepping through is incredibly useful, but I think you're wrong to characterize "printf debugging" as a mistake. A printout of several well-formatted printfs can be a good way to view more than one moment of a program at once. It's less useful for control-flow errors but can be good for other problems.
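As a rough illustration of "more than one moment at once" (a hypothetical sketch, not code from anyone in this thread; `running_average` is a made-up example function), aligned print statements inside a loop lay the whole history out as a table:

```python
def running_average(values):
    """Return the running average after each value, printing one trace row per step."""
    averages = []
    total = 0.0
    # Header row so the columns of the trace are self-describing.
    print(f"{'step':>4} {'value':>8} {'total':>8} {'avg':>8}")
    for step, value in enumerate(values, start=1):
        total += value
        avg = total / step
        # One aligned row per iteration: every intermediate state stays visible.
        print(f"{step:>4} {value:>8.2f} {total:>8.2f} {avg:>8.2f}")
        averages.append(avg)
    return averages


result = running_average([2.0, 4.0, 9.0])
```

A debugger shows one of these rows at a time; the printout shows all of them side by side, which is often what makes a drifting value stand out.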

I tend to write JS and languages that compile to it, and I feel debugging tools are the area most lacking in that space.

I have to admit I find stepping-through generally not too useful (although there are obviously exceptions when it works well).

For me the reason is that the bandwidth is too narrow. To make it work well you often have to guess just right, and the problem with guessing just right is similar to code-staring: your mental model of the code is currently wrong, therefore your chances of guessing just right are also not too good. And if you guess wrong, you often have to start from scratch. Another reason might be that I was scarred for life by early versions of gdb :-) And of course there's a lot of code nowadays you don't want to stop in a debugger, because it will behave quite differently (your network request just timed out...)

I rather collect diagnostics and then try to figure out what happened.

I am in the staring camp. Until you understand what the code is supposed to be doing, what is there to experiment on? Code is not as complex as most things in the world, and reasoning about it is actually extremely useful.

Moreover, I think that if you can't reason about the code you're looking at, you don't understand it and you definitely should not be changing anything in it. Otherwise you're like a cat under the car's hood, saying "I found the problem - the engine is made out of parts".

Experimenting and reasoning ("staring") solves two different problems: the "what" and the "why". Both are useful tools that can lift different tasks, and neither should be used to the exclusion of the other.

I've been a "code starer" for 30 years and emphatically wouldn't do it differently. I want the model of the program in my head to be so faithful that I can, and do, spot bugs by 'running' it there. I'll happily stare at code until that's the case. That state of things maximizes the speed I can work at relative to the stability of what I write. It's not at all uncommon for me to write code for weeks without even compiling once. And usually when I finally do compile and run, there is very, very little to debug.

This approach also answers (to my own satisfaction, if nobody else's) the old question "how can you refactor confidently without unit tests?" Because I can prove the equivalence of any refactoring in my head, the exact same way I can recheck an algebra problem and know it's right. To me, code is a math problem. I don't mean the arithmetic in the expressions. I mean the whole structure of the program is one big math problem.

Reasoning is one of the most powerful tools we have.

I'm a code starer. But I have my limits. If I can't reason about some piece of code and it meets the following criteria:

1. it's sufficiently complex; and

2. I need to understand it so that

3. I can extend it

Then I usually write an interface to it that I can reason about. That interface should add type information and will be peppered with assertions about invariants I expect to hold. As I understand more about the underlying code I add more assertions. If it's a dynamic language I will write many tests against the interface. And I just program against the interface instead of trying to dissect the beast and programming-by-hypothesizing.
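A minimal sketch of that kind of interface, under stated assumptions: `legacy_allocate` is a hypothetical stand-in for the code that is hard to reason about, and the invariants asserted in `allocate` are ones invented for this example, not taken from any real project.

```python
from typing import List, Sequence


def legacy_allocate(total, weights):
    """Hypothetical opaque code: splits `total` into integer shares by weight."""
    scale = sum(weights)
    shares = [total * w // scale for w in weights]
    shares[0] += total - sum(shares)  # rounding remainder goes to the first share
    return shares


def allocate(total: int, weights: Sequence[int]) -> List[int]:
    """Typed front door to legacy_allocate, peppered with expected invariants."""
    # Preconditions: what the wrapper is willing to pass through.
    assert total >= 0, "total must be non-negative"
    assert len(weights) > 0, "need at least one weight"
    assert all(w > 0 for w in weights), "weights must be positive"

    shares = legacy_allocate(total, list(weights))

    # Postconditions: invariants the rest of the program relies on.
    assert len(shares) == len(weights), "one share per weight"
    assert sum(shares) == total, "shares must add up to total"
    assert all(s >= 0 for s in shares), "no negative shares"
    return shares


print(allocate(10, [1, 1, 1]))
```

New code calls `allocate` only. If the legacy code ever violates an assumption, the failure shows up at the boundary with a message, and each new thing learned about the internals can become one more assertion.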

For projects where being correct is even more critical or bugs are more costly I'll even reach for higher-level tools like TLA+, Event-B, etc. It's amazing what these tools can do and I wish I'd known about them earlier in my career.

Can you talk about cases that you've modeled in TLA+? I love to read about practical uses of modeling techniques, because the more I code, the more I am convinced that unit testing is just a stab in the dark.


In late March 2016 I was working out a problem in Openstack that had the appearance of a race condition. It had to do with the vif_unplugged_event sent from Nova to Neutron. In certain situations, when the L2 service component in Neutron was behind on its work, it would fail to respond to the event before Nova carried on with its work, leaving the network in a bad state. We ultimately found a solution to the problem, but while I was in the middle of it I was talking about it to someone who graciously introduced me to the idea of modelling the system and using a model checker to find the race condition for me. I had heard about TLA+ from somewhere, once, so I listened.

I had been an enthusiastic enough student that this someone would become a good friend. Together we decided to work on some models of some underlying components in Openstack. Driven by the Amazon paper on their use of TLA+ on the AWS services it seemed like a worthwhile cause: see if we could convince open source projects to adopt these tools and techniques in the critical parts of their systems. Improve the reliability and safety of infrastructure projects like Openstack which continue to be used as components in applications such as Yahoo! Japan's earthquake notification service.

We haven't published our models yet but we started with a model of the semaphore lock in greenlet. It started as a high-level model using sets, simple actions, and some invariants in pure TLA+. We then added a model of the greenlet implementation in TLA+'s PlusCal algorithm language and used the model checker to prove the invariant in the higher-level specification still held when refined by the implementation model. We then refined the specification and the model in TLA+ until we had a PlusCal model of the semaphore that was very close to how the Python code was written. We didn't find any errors, which I think was satisfying.

We decided to take our little project to the Openstack design summit in Austin. My enthusiastic partner in maths and I found a handful of naive souls to come to an open discussion about formal methods, software specifications, and Openstack. It went quite well. We unfortunately haven't been able to expand on that effort as I'd lost my employment and he had to focus on his PhD thesis.

Needless to say though I've since used verification tools like TLA+ to model design ideas and I continue my studies in predicate calculus and logic-based proofs. I just don't talk about it too much at work. Tends to frighten the tender souls.

Update: I should clarify that the errors we were particularly interested in were deadlocks in the implementation of the semaphore.
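For readers who have not seen a model checker, here is a toy sketch of the underlying idea in Python. It is not TLA+ or PlusCal and not the greenlet model described above; it is a hypothetical two-lock example, and `find_deadlocks` is a name invented for it. It enumerates every reachable interleaving and reports states where no worker can move even though some are unfinished.

```python
from collections import deque


def find_deadlocks(orders):
    """Explore all interleavings of workers that take two locks, then release both.

    orders[i] is the order in which worker i acquires the two locks, e.g. (0, 1).
    Returns the set of reachable states in which nobody can move but not
    everyone is finished. A state is (program counters, lock owners).
    """
    n = len(orders)
    start = (tuple([0] * n), (None, None))
    seen, queue, deadlocks = {start}, deque([start]), set()
    while queue:
        pcs, owners = queue.popleft()
        successors = []
        for w in range(n):
            pc = pcs[w]
            if pc < 2:
                # Steps 0 and 1: acquire the next lock, but only if it is free.
                lock = orders[w][pc]
                if owners[lock] is None:
                    new_owners = tuple(w if i == lock else o
                                       for i, o in enumerate(owners))
                    new_pcs = pcs[:w] + (pc + 1,) + pcs[w + 1:]
                    successors.append((new_pcs, new_owners))
            elif pc == 2:
                # Step 2: release everything this worker holds and finish (pc 3).
                new_owners = tuple(None if o == w else o for o in owners)
                new_pcs = pcs[:w] + (3,) + pcs[w + 1:]
                successors.append((new_pcs, new_owners))
        if not successors and any(pc != 3 for pc in pcs):
            deadlocks.add((pcs, owners))
        for state in successors:
            if state not in seen:
                seen.add(state)
                queue.append(state)
    return deadlocks


print("opposite order:", find_deadlocks([(0, 1), (1, 0)]))
print("same order:    ", find_deadlocks([(0, 1), (0, 1)]))
```

With opposite acquisition orders the search reaches the state where each worker holds one lock and waits for the other; with a consistent order it finds nothing. Real tools like TLC do this over far richer specifications, but the exhaustive search is the part that testing by hand rarely covers.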

> "this cannot be the case, because ...".

What I should have added is that invariably, the problem would be in one of these places that they had eliminated by reasoning.

While you obviously need to think about your code, otherwise you can't formulate useful hypotheses, you then must validate those hypotheses. And if you've done any performance work, you will probably know that those hypotheses are also almost invariably wrong. Which is why performance work without measurement is usually either useless or downright counterproductive. Why should it be different for other aspects of code?
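A small sketch of what validating a performance hypothesis can look like, using Python's standard `timeit` module. The two functions and the input size are hypothetical choices for this example; the hypothesis under test is "`join` is faster than repeated `+=`", and the result depends on the interpreter and machine, so no timings are claimed here.

```python
import timeit


def concat_plus(parts):
    """Build a string with repeated +=."""
    out = ""
    for p in parts:
        out += p
    return out


def concat_join(parts):
    """Build the same string with str.join."""
    return "".join(parts)


parts = ["x"] * 10_000

# First check that both candidates compute the same thing.
assert concat_plus(parts) == concat_join(parts)

# Then measure: take the best of several repeats to reduce noise.
t_plus = min(timeit.repeat(lambda: concat_plus(parts), number=20, repeat=3))
t_join = min(timeit.repeat(lambda: concat_join(parts), number=20, repeat=3))
print(f"+= : {t_plus:.4f}s   join: {t_join:.4f}s")
```

The point is the order of operations: state the hypothesis, confirm equivalence, measure, and only then decide which version to keep.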

Again, needing to form hypotheses is obviously crucial (I also talk about this in my performance book, iOS and macOS Performance Tuning [1]); I've also seen a lot of waste in just gathering reams of data without knowing what you're looking for.

That's why I wrote experimentalist, not "data gatherer". An experiment requires a hypothesis.

[1] https://www.amazon.com/gp/product/0321842847/ref=as_li_tl?ie...

By simply experimenting without understanding, you run a good chance of producing code that is only correct by accident, or code that is technically correct but wildly inefficient. Also, by first taking the time to build up a mental model of what is going on, you might spend longer fixing the first and second bugs (which tend to be the easy ones), but you'll spend dramatically less time fixing the 12th and 13th bugs (which tend to be the really tricky ones).

I'm pretty sure, as you've heard from others, that everyone does a part of both. If you never look at the code and try to reason why it gives you the result you got in your experiment, how do you get to the point of changing the code? Surely you don't experiment with code changes until the process gives you the result you want.

I'm an experimentalist. For me, it feels too hard to reason about code. I prefer to see what happens until an error occurs, and for some reason I feel a lot more incentivized to reason about why it goes wrong at that moment. With experimenting I can ask questions about the context, hence I don't need to remember the context. Also, I'm not sure if I have the memory to remember the context. All I know is that I process information quite well, but my active memory is quite limited.

For example, I learned about pointers by debugging them and coercing random memory addresses (which were ints) into pointers. Once the concept was learned, I could reason about it.

So nowadays I can reason about code, provided I understand the concept. But I learn about the concept by debugging/experimenting -- in most cases.

The best way I know to debug a piece of software (or just to understand it) is to rewrite it. I keep rewriting it until a) the error presents itself or b) I have internalized the model so profoundly that I can understand where the problem is.

Of course I am a big fan of automated tests.

The old meaning of the word "code hacker" from the 1980s (before ycombinator took and redefined it) was someone who writes and maintains programs by running them with variations in input and observing the effects, until they basically work. "Basically" here is of course defined as passing over half of the unit tests, which of course are performed by the business users after the change is in production. Starers are the ones who didn't send a double to the aptitude test at their job interview.

This also explains why dynamic languages can be easier to work with, especially for less-than-expert programmers.

You can start experimenting with the program at runtime while the program still is very incorrect. The type checker would require that the program at least makes sense from a typing perspective.

I do both. Sometimes one solves the problem, sometimes the other does. I like having as many debugging techniques as I can.
