
Move Fast without Breaking Things - devy
http://blog.pragmaticengineer.com/the-startup-dilemma-move-slow-or-break-things/
======
white-flame
Racing a car offers a great analogy: in order to get faster lap times, you
sometimes have to drive slower.

Specifically, you need to slow down for corners in order to maintain your
control over the vehicle. Try to drive too fast and you end up wrecked. Even
if you don't wreck, correcting from having entered a corner too fast loses far
more time than just slowing down properly. Much like technical debt, if you
start a straightaway at a lower speed because you botched the prior corner,
the damage to your lap time carries through the entire stretch.

In the overall scheme of delivering software you're confident in, it is
faster to build stability into your development process than to try to blast
out features and bolt on stability later, once you're committed to the
initial brittle implementation.

~~~
nibs
The racing analogy applies in two ways, I think: talent is not evenly
distributed, and the masters are defined by their ability to decide when to
go fast and when to go slow.

~~~
sharps_xp
I agree. To be more explicit, talent is inversely correlated with how slow
your team's "slow" is going to be. I think the degree of slowness is what
management ultimately feels when they conduct retrospectives on
technical-debt reduction initiatives.

We recently did a major refactor of our Angular code (~1.5 months) and we've
been pretty fast, releasing every 2 weeks. We've begun to outpace a lot of the
backend development teams, so we now have more time to squash our backlog
tickets. We have the prettiest Jira issue reports out of all teams.

------
bobbygoodlatte
I think this post misses the point.

Company values are only meaningful when they differentiate that company from
others. Everyone wants to "Move Fast and Not Break Things".

"Move Fast and Break Things" was a value judgement. For Facebook, moving
quickly was worth the penalty of sometimes breaking things.

This is why most companies have lame values like "Do your best work" or "Be
honest". Great values, but nobody disagrees with them—so they're meaningless.

~~~
wdewind
I totally agree with this but would add a bit of nuance. I'd love to see all
companies adopt an "X at the cost of Y" value system, but there is another
side to stating you have values like "Integrity", which is the same as
Google's "Don't be evil" thing: it's a claim that you can be publicly called
on, and that's important as well. You'd be surprised how many companies out
there aren't willing to make a claim for something like integrity. But yeah,
who wouldn't say they value integrity when asked?

That being said, I'd also add that most values have those trade-offs
implicitly baked in. For instance, if you say you value integrity, you're
saying you're devaluing growth/money for the benefit of integrity.

~~~
bobbygoodlatte
Yep, I agree with your nuance.

FWIW I think integrity is a pretty okay company value. Depending on your
industry, valuing integrity may be a trade-off that other companies are
unwilling to make.

------
agentultra
I'd add: _think before you code_. Programming is hard because thinking is
hard. If you don't use tools to help you think rigorously you're going to ship
software with bugs regardless of how good your unit tests are or any other
strategy you use to cope with poorly designed software.

I don't believe _not thinking_ is going to help you move faster. You're just
going to spend most of your time fixing the problems you hadn't thought of
before you started shipping code. And yet this is what we're encouraged to do.
We're asked to fix the bugs, do the least amount of work to get it to go, and
fix the problems later. We're asked to not think and just code. There are
probably a handful of people in the world who can write complex systems in
code with consistently correct results. The rest of us need to be more
mindful.

We have wonderful tools available to us that can exhaustively check our
designs. Using a system like TLA+ allows you to model your system in some
pseudo-code and comes with a program that will exhaustively check your code
for dead-locks, invariant violations, termination, etc. I've seen it used to
find errors in graduate students' _binary search_ algorithms 9 out of 10
times. Imagine what it can do for non-trivial components of your application.

Part of moving fast is knowing how to avoid making errors in the first place.
The only way I know how to do that is to _think_. You can't do this properly
in code alone. You need higher-level tools to check your assumptions and find
holes in your thinking.

You don't see structural engineers slapping together the first thing that
works and patching the building later when parts of it fall down. You don't
sign waivers absolving the designers of responsibility before crossing a
bridge, in case it falls down while you're on it. Nor do you see engineers
accepting more money from people to build private bridges that are less likely
to fall down. So why do software engineers have no such liability for their
creations?

Why not take a few hours to write a proper specification for your distributed
locking mechanism and save yourself days or weeks worth of debugging -- or
worse -- liability suits when your mistakes harm the interests of your
customers or the public at large? Why not use tools that check your thinking
so that you can chase more interesting optimization and performance problems?

~~~
jh3
> You don't see structural engineers slapping together the first thing that
> works and patching the building later when parts of it fall down.

That's because if they mess up the first time people can die.

I'm guessing the likelihood of death due to a bug in most web applications is
probably much less than that of a bridge :)

Software is great because it's possible to iterate quickly. But ultimately,
you have a point. Developers should be willing to fix bugs they've created
and/or come across in their projects. We should also try to limit the
potential of bugs by releasing small chunks of functional work as often as
possible. "Releasing" in this sense can mean to production or just to your
qa/test/dev/stage environment. It's just important to have many eyes on the
product before it's live.

~~~
agentultra
Someone doesn't have to potentially die in order for software to be harmful.
Neither should it be the deciding factor about whether we need to think
critically about the software we create.

Besides, it's actually not too hard to write simple specifications for the
critical components of your system. It has practical benefits like helping you
to write good, rock-solid software. It helps you to clearly and precisely
communicate your ideas. You can shake out errors in your design long before
you write your code. And there are some errors you will only find by modelling
and checking your system.

Web applications have come a long way since simple CGI scripts. They often
require sophisticated orchestration mechanisms for managing distributed state
and processes. If you get that wrong you end up with corrupted data, deadlocks
and race conditions, etc. If you had a tool which helped you sift out those
potential problems that exist in your design, would you not use it? Or would
you rather risk introducing those errors in your code and hope that your
customers never encounter them?

The approach of throwing code into production and "iterating" is precisely the
problem I was addressing in my OP. In this approach we're asked to not think
and to fix our inevitable mistakes later on after users find them for us. We
have software to find a reasonable majority of them for us instead so why not
use it? It's already fairly common to use unit and integration tests (to make
sure our code operates as expected). Why not have a system for testing our
designs?

As I mentioned, it's shockingly easy to write an incorrect implementation of
binary search that may appear to work under your unit tests. Writing a
specification of the algorithm and checking it with a model checker will show
you problems you hadn't thought to consider. This is an algorithm that is
taught in the first algorithms class you take and yet even graduate students,
who scoff at writing specifications because it's _so boring_, often get it
wrong. That's because programming is hard and it is hard precisely because you
have to think. And thinking is hard. We should be using tools to help us
think.
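You don't even need TLA+ to get a modest taste of the idea: exhaustively
checking all small inputs in plain code will already expose most broken
binary searches. A sketch in Python (the buggy implementation here is
hypothetical, but the off-by-one is a classic):

```python
from itertools import combinations

def buggy_search(a, x):
    """A plausible-looking binary search with a classic off-by-one:
    the loop condition should be lo <= hi, so one-element ranges
    are never inspected."""
    lo, hi = 0, len(a) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if a[mid] < x:
            lo = mid + 1
        elif a[mid] > x:
            hi = mid - 1
        else:
            return mid
    return -1

def find_counterexample(search, max_len=6, universe=range(8)):
    """Exhaustively run `search` on every sorted list of distinct small
    ints and every target; return the first (list, target) it gets wrong."""
    for n in range(max_len + 1):
        for combo in combinations(universe, n):
            a = list(combo)
            for x in universe:
                got = search(a, x)
                ok = (0 <= got < len(a) and a[got] == x) if x in a else got == -1
                if not ok:
                    return a, x
    return None

print(find_counterexample(buggy_search))  # smallest failing case: ([0], 0)
```

The buggy version passes casual tests on longer arrays where the target sits
mid-range, which is exactly why exhaustive checking of the small cases pays
off.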

------
thinkingkong
I think it's weird that we've put stability and speed on the same axis. The
reality is that you can ship faster with more confidence by focusing on
stability from the beginning. Simply slapping some code together and going "it
works!" is - in isolation - a recipe for disaster.

~~~
nostrademons
Only if the requirements are known. If you have firm requirements, you can
test for them, you can plan for them, and you move through them much faster
when you don't get surprised by random bugs that need to be tracked down.

If you _don't_ have firm requirements, then each new fact you learn about
your market potentially means you may have to invalidate large swaths of work
you've already done. The more planning and testing you've done, and the more
thoroughly you've covered your edge cases, the more there is to invalidate.

Most of the big business ideas in the last decade were in areas of extreme
market uncertainty. Hence, companies that "move fast and break things" have
been at a big advantage. This may change in the future - VCs today are
enamored with Perez, and IMHO one of the signals that we've crossed over from
the installation phase to the deployment phase of the Perez model is when
performance, stability, and security become valued more than features & speed
of execution.
But I'd guess that we have a few more years of moving fast and breaking things
first.

~~~
st3v3r
Having tests can help you ensure that, when you do change one part of your
application, unrelated parts don't start breaking. And I don't really buy the
argument that, because something might be invalidated later, we shouldn't put
in the time now to make sure we get it right based on the info we do have.

~~~
nostrademons
You should know _which_ part of the software lifecycle you're in and adjust
your development practice accordingly.

When I first started my current startup idea, the product design was changing
literally multiple times a day. I didn't even bother writing any code, it was
all in pencil & paper notes & diagrams. Things have slowed down to about a
requirements change every 2-3 days now, and I write code but no tests. If you
spend 20% of your time fixing bugs and 80% of your time writing or rewriting
features, tests are not the bottleneck; it's not an improvement to spend 50%
of your time writing & rewriting tests so you can avoid the 20% spent tracking
down bugs.

When I was at Google Search, working on a mature product with billions of
users, every change got 100% test coverage, and you were _supposed_ to break a
test with every change because that's how you know your test suite is
comprehensive. And then it went out through a QA department and full release
process. But adding a link to the page then took 2 weeks and 600 lines of
code, while adding one to my startup takes 5 minutes and about 3 lines of
code.

~~~
dennisgorelik
> If you spend 20% of your time fixing bugs and 80% of your time writing or
> rewriting features, tests are not the bottleneck; it's not an improvement to
> spend 50% of your time writing & rewriting tests so you can avoid the 20%
> spent tracking down bugs.

That is a good line of reasoning about when auto-tests are needed.

However you did not take into account time that is required to discover that
something is broken. Without tests you do not know if your code change breaks
important features. So you should also consider how expensive it is to test
your product after every code change. You could skip testing it yourself and
delegate testing to end users, but turning users into testers can be pretty
expensive too (losing potential customers).

So auto-tests should be introduced much earlier than the "50% maintenance"
threshold.

~~~
nostrademons
You don't need to worry about losing customers if you have none. At the stage
I'm talking about, you'll know something is broken when you go to demo it and
it doesn't work, and presumably you'll be walking through your demo manually
before putting it in front of customers.

------
ktRolster
A lot of the bugs in our code are completely avoidable, but people get in the
mindset of, "I need to move fast, bugs don't matter!"

Even if you don't have QA, when you write a bug, at least ask yourself, "what
could I have done differently to avoid that bug in the future?" In some cases,
it's as simple as, "put repeated code into a function so it doesn't get
written twice." That will cut your chances of getting a typo in half.
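A hypothetical sketch of what that looks like in practice (Python, with
invented numbers):

```python
PRICE, QTY = 19.99, 3

# Before: the same expression typed twice; one copy picks up a typo.
shipping_total = PRICE * QTY * 1.07   # 7% tax
invoice_total = PRICE * QTY * 1.7     # typo: meant 1.07

# After: write the formula once, so a typo can only happen once.
def with_tax(amount, rate=0.07):
    """Apply a tax rate to an amount."""
    return amount * (1 + rate)

shipping_total = with_tax(PRICE * QTY)
invoice_total = with_tax(PRICE * QTY)
assert shipping_total == invoice_total
```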

~~~
ArkyBeagle
So, really - you actually see typo bugs in the wild? I haven't seen one of
those for 20 years.

I do see lots of cases where people didn't test it or used the wrong
mechanism.

I have to go slowly because somebody already went fast and broke it.

And a friend once told me "LZW can compress code faster than you can." :)
Which does not speak to the value of orthogonality of purpose in source code,
but I enjoyed the irony of the comment very much.

~~~
chipsy
Math-heavy code tends to involve a lot of very similar algorithms, leading to
a copy-paste-rename situation that isn't really helped by using a function.

As such it's incredibly common to end up with bugs from assigning to "x, y, y"
and not "x, y, z" etc. Sometimes a clever compiler will warn you that you have
done something odd, other times you'll be left to discover the runtime error.
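A tiny hypothetical illustration of that failure mode in Python:

```python
import math

def rotate_x(v, theta):
    """Rotate a 3-vector about the x-axis (correct version)."""
    c, s = math.cos(theta), math.sin(theta)
    x, y, z = v
    return (x, c * y - s * z, s * y + c * z)

def rotate_x_buggy(v, theta):
    """The same routine after a copy-paste-rename slip: the pasted
    second line was never renamed from y to z."""
    c, s = math.cos(theta), math.sin(theta)
    x = v[0]
    y = c * v[1] - s * v[2]
    y = c * v[2] + s * v[1]   # should assign to z
    return (x, y, y)          # and return (x, y, z)

# Rotating the y unit vector 90 degrees about x should give the z axis:
print(rotate_x((0.0, 1.0, 0.0), math.pi / 2))        # ~ (0, 0, 1)
print(rotate_x_buggy((0.0, 1.0, 0.0), math.pi / 2))  # ~ (0, 1, 1): silently wrong
```

No compiler warning, no crash, just a quietly wrong rotation, which is why a
spot check against a known-good case is worth the thirty seconds.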

~~~
ArkyBeagle
Yeah, there's that. I've printed out code like that and colored different
variables with different color highlighters to deal with it.

------
teacup50
I'm actually very happy to work somewhere that believes in "Move slow and
don't break things".

I believe that churn isn't progress, things with lasting value generally take
time, and some problems _can't_ be solved quickly. I enjoy working on
problems that benefit from careful, considered, time-consuming thought.

You might desire something different. That's OK, but nobody is obligated to
accept as a fait accompli the universal necessity or value of moving "fast"
(and hopefully not breaking things).

~~~
vinceguidry
Moving slow isn't an option for many businesses.

~~~
pjmorris
Corollary: Breaking things isn't an option for some businesses.

~~~
vinceguidry
Only if your industry is capital rather than labor intensive, in which case
you'll have elaborate procedures in place for safeguarding and maintaining
your infrastructure.

For everyone else, a certain amount of breakage is expected. When things
break, there are generally manual processes that can be put into place to keep
business moving.

~~~
pjmorris
From a story about an unintentional automatic Windows 10 update [1]:

'The action led to comments where life was actually being put at risk by the
unilateral action: "I needed to set up my department's bronchoscopy cart
quickly for someone with some sick lungs. I shit you not, when I turned on the
computer it had to do a Windows update."'

Question 1: Labor-intensive business or capital-intensive business? Question
2: Whose elaborate safeguard procedure failed?

[1] -
[http://www.theinquirer.net/inquirer/news/2450852/updategate-microsoft-is-reportedly-upgrading-pcs-to-windows-10-automatically](http://www.theinquirer.net/inquirer/news/2450852/updategate-microsoft-is-reportedly-upgrading-pcs-to-windows-10-automatically)

~~~
slededit
> 2: Whose elaborate safeguard procedure failed?

Since Windows was never intended to serve a life support/saving function - it
was the fault of whatever OEM chose to use it in that capacity. They took on
the onus to have it work correctly when they chose it as a platform.

~~~
DanBC
MS used to specifically include "Do not use this for safety critical stuff" in
the EULA.

Does anyone know when they took it out? And why?

> “The Microsoft software was designed for systems that do not require fail-
> safe performance. You may not use the Microsoft software in any device or
> system in which a malfunction of the software would result in foreseeable
> risk of injury or death to any person.”

------
swalsh
Ever notice how the engineering departments at these move-fast-and-break-things
companies grow exponentially? Thinking of Etsy, Facebook, and a few
others.

Part of it is that if they survived, they thrived... But I think the other
part is that the code starts to become sharded across people's brains. The
technical debt piles up so high in some places that it becomes impossible to
maintain productivity with the same number of people.

~~~
debacle
I think that's a non sequitur. Every startup engineering department grows
exponentially. It's the only way for an engineering department to grow.

------
grandalf
Move fast and break things for 0.001% of users.

In other words, try bold things, bold refactorings, etc. within bounds.

If a system is designed with good modularity, this can be done with very low
risk.

When someone deploys new code at Facebook, I recall reading that it initially
goes out to internal users, then to a small percentage of actual users, then
eventually to the entire system.
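One common way to implement that kind of staged rollout (a sketch; Facebook's
actual mechanism isn't described here, and all names are invented) is to hash
each user into a stable bucket and compare against the rollout percentage:

```python
import hashlib

def in_rollout(user_id: str, feature: str, percent: float) -> bool:
    """Deterministically assign user_id to one of 100,000 buckets for
    this feature; the first `percent`% of buckets get the feature.
    Hashing feature:user keeps each user's cohort stable per feature,
    so the same 0.001% of users see the change on every request."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100_000   # 0.001% granularity
    return bucket < percent * 1_000

print(in_rollout("user42", "new_feed", 100.0))  # True: full rollout
print(in_rollout("user42", "new_feed", 0.0))    # False: feature off
```

Ramping from internal users to 0.001%, then up to everyone, is then just a
matter of raising `percent` (with an internal-user allowlist checked first).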

This is simply a systems-based approach. The same applies when thinking about
code in a testing environment, staging server, etc. At each level there are
additional failure modes or adverse interactions.

When done stupidly, a cowboy approach can lead to downtime, poorly-reasoned
quick fixes, and blame within an institution.

Often the same cowboy/girl who acted stupidly and caused the bug is lauded as
a hero when they get things working again a week later, after the bug is
discovered... when it was their bad judgment (or "move fast" mentality) that
led the offending code to be shipped in the first place.

We must be honest with ourselves and our teams about bad decisions (even in
hindsight) and build processes that let us be bold when we need to.

------
julie1
Well, in physics terms, we would set the speed based on the half-life of what
we want to keep alive, weighed against the costs.

Mass transportation is an example of an industry where the half-life is
decades or a century. "Fast" for this sector is slower than the working life
of a coder.

Games made to support a product ad are ephemeral: one month. Faster than a
new release of Firefox.

Some dangerous radioactive isotopes have half-lives of 10,000 years, so
nuclear-plant-related automation should not be relying on any CPU.

A citizen's data should last the government as long as the citizen lives.
Governments, if they aim to remain stable, should consider the half-life of
their automation to be long, very long. Hence slow changes. Very slow.

So should banks. But banks have to adapt at the speed of trades.

Fast and slow can be set by doing something agile hates: careful business
analysis.

The problem is how we have coupled every business to the insanely costly
rhythms forced onto all economic activity by means of anticompetitive
business practices, like useless obsolescence driven by hardware monopolies,
cheap energy, and cheap, regulated, poor education.

Do we really need 1 PB hard drives when entropy predicts we will not be able
to find any relevant information - given we have too much data - without
increasing the means involved in costly ways?

We don't need fast-changing technologies; we need boring, slow-changing
technologies that are reliable.

------
aledesma
I have found that using BDD with Gherkin syntax can really cut down on many
concerns when it comes to software stability and ensuring business worth. Many
companies do not put enough focus on automated testing, and even when they do
they are only testing the surface level with simple functional tests. If one
simply puts the technical requirements into the BDD tests and has the
system validate every affected integration prior to shipping, then you simply
have less debt to pay off, as more bugs should be caught prior to shipping.
Couple BDD with writing a negative functional test for every bug that is fixed
and you should never trigger that specific bug again. In the end these tests
will take more time, but if that time is bundled into the time required for
the feature, and your code is already laid out in a testable and reusable
way, then it is a win-win.
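The negative-test-per-bug habit is cheap to adopt in any framework. A minimal
pytest-style sketch (the function, the bug, and the ticket number are all
invented for illustration):

```python
def average_item_price(cart):
    """Average price of the items in a cart of {"price": ...} dicts."""
    if not cart:  # fix for bug #1234: an empty cart used to divide by zero
        return 0.0
    return sum(item["price"] for item in cart) / len(cart)

def test_bug_1234_empty_cart_regression():
    """Pins bug #1234: average_item_price([]) raised ZeroDivisionError.
    If the guard above is ever removed, this test fails and the bug
    cannot silently ship again."""
    assert average_item_price([]) == 0.0

def test_average_item_price_happy_path():
    assert average_item_price([{"price": 2.0}, {"price": 4.0}]) == 3.0
```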

------
JBiserkov
Possibly related: "Move fast and fix things" by GitHub engineering.
[https://news.ycombinator.com/item?id=10739129](https://news.ycombinator.com/item?id=10739129)

------
jbob2000
I think this article misses what is meant by "Move Fast and Break Things". I
never took it to mean 'physically (or programmatically) break things apart
while moving quickly', I thought it was more about deconstructing things we
considered "normal". Like another way of saying "think outside the box", and
do that without hesitation or inhibition.

~~~
icebraining
_“Most companies mess up by moving too slowly and trying to be too precise.
When you’re…moving quickly or doing anything like this you want to make
mistakes evenly on both sides. We wanted to set up a culture so that we were
equally messing up by moving too quickly and by moving too slowly some of the
time. So that way, we’d know that we were in the middle.”_

[https://www.youtube.com/watch?v=h7_UNu7zEVs](https://www.youtube.com/watch?v=h7_UNu7zEVs)

------
cpfohl
Great article!

> Be Aware of What You Break

I work at Rollbar (www.rollbar.com), and that point is probably the key to our
existence. You can't afford _not_ to know what's going wrong. And the best way
to know what's going on is to get a full stack trace, with all the information
you need to reproduce the issue, into a platform that will notify the
appropriate people of the breakage.

------
frogpelt
Facebook/Zuckerberg have obviously been successful, given how many times this
particular motto has been dissected in blog posts and HN comments.

Try this search:
[https://hn.algolia.com/?q=fast+break](https://hn.algolia.com/?q=fast+break)

------
Animats
Von Braun's team launched over 600 V-2 rockets before any of them reached a
militarily useful target.

------
mgrennan
I'm not worried about the "Move Fast and Break Things Often" mentality. It
will only last until it kills a bunch of people. Think of code in cars,
planes, X-ray machines, pill dispensers, etc.

------
gavinpc
> In reality more often then not moving fast and breaking things will result
> in shipping scrappy software. This is because in the rush to get stuff out
> the fastest way time consuming things get skipped. Like user testing,
> automation, analytics, monitoring, manual testing - just to name a few.

It sounds like the author could use a dose of his own medicine. The first
sentence contains a (common) grammatical error. The third sentence is a
fragment. The second is nearly unparseable.

