
Startup Suicide -- Rewriting the Code - terrisv
http://steveblank.com/2011/01/25/startup-suicide-–-rewriting-the-code/
======
jacquesm
Except when it isn't!

Sometimes during the initial phase of building a product you realize you're on
the wrong road and it's actually faster to toss out what you've got and start
over. Typically I take about three tries to get it 'right', the first is to
get a good feel for the problem space, the second when I have a first working
version and the third one will actually last for a long long time.

The first two last for respectively as long as it takes to type them in and a
couple of days to weeks, and I think part of the secret here is to actually
plan to throw away that second version instead of hanging on to it after
getting past the point of no return in terms of sunk cost.

Most 'rewrites' are not done for valid reasons but simply because a new guy
was brought on board that has not yet learned to read code in order to
understand it but whose gut response to anything they didn't write themselves
is to throw it out and do a rewrite, even if that means killing the company.

Just look at Netscape to get an idea of what that mentality will do to your
corporation.

~~~
silverbax88
Just had this experience. Took an existing code base and worked about four
weeks to try and modify it to support new functionality. Had an epiphany that
'this shouldn't take this long', started from scratch and duplicated all
existing functionality as well as the new code within two days.

Sometimes tossing badly designed code seems like two steps backward, but that
isn't always the case.

~~~
jacquesm
The big trick of course is that you were in a good position to actually make
that call and it worked. The difference is that there are people that will
shout 'rewrite it' without being in a good position to make the call and as
some kind of personal 'NIH' syndrome.

I've done what you just did to a package that I maintained literally for years
for a company that I contracted for and at some point the same realization hit
me, it shouldn't be this hard to just toss this and do it again. But by then I
had a pretty thorough understanding of the problem _and_ of the flow of the
code that solved the problem (even if it was weighed down with a lot of junk).
Rewriting it made life much better. But typically if someone is new to a
project and needs to fix a minor bug and starts to say we should re-write all
of this they're just plain wrong.

~~~
igrekel
I don't think it is just NIH, the new people just have a very superficial view
of the amount of issues, choices and decisions that were faced in building the
system. That means they have a tendency to underestimate the costs that go
into building the system.

~~~
cdavid
Exactly, that's the core of the issue: it is very easy to overestimate the
advantages of the new design/rewrite. But even worse, it is very easy to
underestimate the advantages of the old design, because in a bad system (the
kind you want to rewrite), things are generally not that well specified.

That also defines what may and may not work as a rewrite: if your
application has a lot of external dependencies, and is used by customers in a
very tightly coupled way, rewriting it will take forever unless you don't care
about losing your existing customers (because you lose track of what works and
what does not). What makes matters worse is that you are more likely to make
those mistakes early in the business.

If your application does not have tight integration with the customer, then it
becomes much easier to replace it, one part at a time. Otherwise, you are
likely to just recreate the same monstrosity anyway once you managed to
support half of the features from the old version.

I think a rewrite makes sense in some cases, but the natural reaction should
be don't, especially when you don't have that much experience. Successful
rewrites are the exception, not the rule. To make a dubious analogy, it's like
the junior programmer who thinks that his bugs are actually in the
libraries/compiler he is using. The senior programmer knows it is almost
always his own fault, and the very senior one has a few stories about long
nights of debugging caused by compiler bugs in the old times.

------
ericb
I have watched a large rewrite fail and cost an engineering manager his job.
The next manager, perhaps learning from his fallen comrade, did something that
worked spectacularly well. He did a gradual, component focused rewrite. With
each release, they would carve out a part and rewrite only that chunk. For
anyone looking at the big rewrite, I would suggest this as an alternative.

~~~
silverbax88
I am a big proponent of gradual change like this. I've used it multiple times
on major enterprise systems.

I'm still baffled by why more people don't do this.

~~~
ericb
I feel like the mental stumbling block that stops people has something to do
with our ideas of purity and cleanliness. Maybe people intuitively feel
contact with the old system would make the new code unclean.

------
enjo
I'll lend my anecdotal experience here as well.

I was the fourth employee at a startup (that is still going strong nearly 10
years later). Pretty early on we were forced into a major pivot which saw us
make the transition from Palm OS to Symbian. At that point we were left with a
bunch of really difficult questions.

The code-base, as it was, was not in very good shape. It was full of bugs,
suffered from some _ahem_ questionable design decisions, and ran rather
poorly. On the other hand, it was the basis for a product already doing
millions in sales.

At that point I had grown into a technical lead role (we were probably more in
the 25 employee range at that point). I looked over everything and championed
a strategy in which we would simultaneously port the "bad" code over while
splitting off a much smaller team (that I ended up leading directly) to
rewrite everything from scratch.

That rewrite literally saved our business. Our 'old' code base had suffered
through that year as we struggled to patch it up enough to meet acceptance
requirements from a device OEM we were working with. It was a terrible
project, and no one was particularly happy. Meanwhile our direct competitor
had brought out a new product that raised the bar in terms of quality by about
1000%.

As we transitioned to the new, much saner, codebase we were able to very
quickly respond. We built better features that worked more reliably. Our next
experience with that same OEM went incredibly smoothly. They went from nearly
dropping us to being among our biggest champions going forward.

So in that case, rewriting the thing probably saved the business. In the
ensuing years, as more and more platforms have come about, that same code has
continued to evolve nicely to meet the demands of those platforms as well.

So I'll agree with jacquesm: except when it isn't indeed:)

------
aschobel
We did a full re-write last summer.

It was absolutely worth it and has since allowed us to iterate at a much
higher speed. It was a bit terrifying to be at a feature standstill.

Our re-write was from Java(struts2) + BDB to python(pylons) + MongoDB.

If anybody is interested we are giving a talk at PyCon.

MongoDB + Pylons at <https://catch.com>: Scalable Web Apps with Python and
NoSQL: <http://us.pycon.org/2011/schedule/sessions/131/>

~~~
zeemonkee
Off-topic, do you have any plans for moving over to Pyramid ?

~~~
aschobel
Yeah, Pylons 1.1 will have some forward-compatibility stuff with Pyramid. From
what we know it should not be hard to migrate since it provides a lot of the
same fundamentals as Pylons.

------
erikstarck
Famous rewrites:

\- Microsoft Word, "Project Pyramid". Never finished, the company decided it
would take too long to rewrite + keep up with adding new features.

\- Netscape 6. Practically killed the company. Dragged on for years.

\- IE4. Turned out OK and made IE the leading browser.

\- Ericsson AXE-N. Huge project to rewrite the successful AXE phone system in
an object oriented way. Failed miserably.

I'm sure you can think of a few more. I wonder what Microsoft did right with
IE.

~~~
bokonist
Windows NT - a complete rewrite of Windows that became the core of XP and
Windows Server. This was a major success, IMO; NT/XP was far, far more stable
than the 3.1/95 family.

~~~
nonane
It also took Microsoft almost 10 years and millions of man hours to get NT to
a stable state and iron out compatibility and performance issues before they
eventually replaced 95 with XP. It was a huge undertaking - probably not
something a startup can afford. IIRC Microsoft had a completely separate team
working on Win NT initially, led by David Cutler.

edit: fixed 'not something a startup can't afford' -> 'not something a startup
can afford'

~~~
Splines
> IIRC Microsoft had a completely separate team working on Win NT initially,
> led by David Cutler.

If you're interested in this, I highly suggest you read "Show Stopper!". It
provides some interesting insights into Microsoft's early days with NT.

[http://www.amazon.com/Show-Stopper-Breakneck-Generation-Micr...](http://www.amazon.com/Show-Stopper-Breakneck-Generation-Microsoft/dp/0029356717)

------
mjw
My lessons from painful rewrites:

Even if you have little time to invest in refactorings along the way, at least
do this: try and plan from the start to componentize wherever practical as you
go along -- as soon as you start to get a decent feel for how the parts fit
together, but not any sooner.

That means that if things do become a mess, at least you have the option to
rewrite or refactor different components without having to tear down the whole
edifice.

You'll never achieve the dream of perfect decoupling and don't die trying, but
at a minimum doing what you can to break a big problem up into smaller ones
will make it all a bit less scary from a psychological point of view.

Not going to help you if you get your component architecture wrong, either,
which is why you don't do it all upfront. But do try and decouple fairly
aggressively as soon as you _do_ get a respectably stable insight into the
structure of the problem (or of a particular part of the problem), because the
longer you leave it the harder it's going to be.

If the structure of the problem never stabilises, _and_ it's a complex
problem, then good luck to you.

It also helps a _lot_ if you start out with a framework and development tools
which make it easy to be modular and easy to develop in a modular fashion. As
always there's trade-offs between this and speed of development, but I suspect
the suggestion of an extreme mutually exclusive trade-off, eg along the lines
of "rails vs j2ee" is a false dichotomy. Both can (and should, and gradually
seem to) meet in the middle.

As always YMMV.

------
dailo10
I've also been part of a successful rewrite for an enterprise software
project. Keys to success:

1\. Cut features. Aggressively scoped the project to only core functionality
and cut down a bloated feature set by about 50%. This allowed us to complete
the rewrite much faster. Existing customers weren't forced to migrate, but
often wanted to because we'd...

2\. Deliver new features that weren't possible on the legacy code base. This
included:

\- 100x performance improvements (hours to seconds)

\- Versioning and audit tracking

\- Better, faster, sexier user interface

3\. Leverage new technologies and frameworks. This is an obvious part of any
re-write because it enables a team to move faster. Technology changes so
quickly. Think of how a small team today can accomplish so much more than a
large team stuck on a platform from 5+ years ago.

I'm not saying that a rewrite is always right, but I believe you can make a
good case for it. In this case, I guess you could consider it a part of a
business pivot.

 _Sometimes you just burn the boats..._

------
fleitz
You never want to 'rewrite' the codebase, but you do want to 'refactor' it.
The concepts are basically the same, but you gain the incremental benefits by
not starting from a new codebase. Rewrites always suffer from second system
effect and disperse your efforts away from the actual problem. Rewrites are
the economic equivalent of quantitative easing, it's just a nice way to avoid
dealing with the actual problem while pretending to fix it.

The process I always use for refactoring an existing project is to first
isolate the code that you want to remove, then create an interface that is
EXACTLY the same as the isolated code, and then build a new implementation
that uses the same interface. Then you change the interface so that the leaky
abstractions from the old interface are gone.
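
A minimal Python sketch of the first three steps, under stated assumptions: `UserDirectory`, `LegacyDirectory`, `NewDirectory` and `greeting` are hypothetical names invented for illustration, and the dictionaries stand in for real backends. It only shows callers depending on an extracted interface while two implementations sit behind it; the final clean-up of the interface is not shown.

```python
from typing import Optional, Protocol


class UserDirectory(Protocol):
    """Interface extracted from the isolated legacy code (hypothetical)."""

    def find_email(self, username: str) -> Optional[str]: ...


class LegacyDirectory:
    """Stand-in for the old implementation, now hidden behind the interface."""

    def __init__(self) -> None:
        # Keys mimic a directory-style naming scheme (assumed for illustration).
        self._entries = {"uid=alice,ou=people": "alice@example.com"}

    def find_email(self, username: str) -> Optional[str]:
        return self._entries.get(f"uid={username},ou=people")


class NewDirectory:
    """New implementation that honours exactly the same interface."""

    def __init__(self) -> None:
        self._rows = {"alice": "alice@example.com"}

    def find_email(self, username: str) -> Optional[str]:
        return self._rows.get(username)


def greeting(directory: UserDirectory, username: str) -> str:
    """Caller code depends only on the interface, not on either backend."""
    email = directory.find_email(username)
    return f"{username} <{email}>" if email else f"{username} <unknown>"
```

Because `greeting` accepts anything that satisfies the interface, the backend can be swapped one call site at a time while the rest of the system keeps running.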

This is the process I used to migrate from OpenLDAP to SQL Server. And yes,
there was a DN field in SQL Server for many years that emulated the DN
structure of LDAP, and we had a bunch of hacky stored procs that would emulate
LDAP semantics for search.

Rewriting is the 'easy' answer, but it's usually not the best. Even if you're
'refactoring' your app to another language it's usually best to create some
bridges so that both work in parallel, if you have a web app use some proxy
hackery, if you have a desktop app, use IPC. I've switched apps from Perl to
ASP.NET using this method.
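
For the web app case, the routing decision behind that kind of proxy bridge might look roughly like the Python sketch below. The backend URLs and the list of migrated path prefixes are assumptions made up for the example, not details of the migration described above.

```python
# Hypothetical backends: the old app and its replacement run side by side.
LEGACY_BACKEND = "http://legacy.internal:8080"
NEW_BACKEND = "http://new.internal:9090"

# Path prefixes that have already been ported (assumed for illustration).
MIGRATED_PREFIXES = ("/login", "/search")


def choose_backend(path: str) -> str:
    """Return the backend that should serve this request path."""
    for prefix in MIGRATED_PREFIXES:
        # Match the prefix itself or anything below it, but not "/loginhelp".
        if path == prefix or path.startswith(prefix + "/"):
            return NEW_BACKEND
    return LEGACY_BACKEND


def upstream_url(path: str) -> str:
    """Build the full upstream URL a reverse proxy would forward to."""
    return choose_backend(path) + path
```

Porting another area then means adding one prefix to the list, and rolling it back means removing it again.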

The other thing that refactoring in this manner benefits is team cohesion, by
switching languages slowly it allows the people proficient in the old language
to add a lot of value to the new team, transfer knowledge slowly, and also get
up to speed on the new language. When people feel like they are going to be
out of a job when the rewrite is complete they will not be thrilled about it.
When they have an opportunity to learn a new language, contribute meaningfully
to getting rid of things they always hated, they will be much more engaged.

------
silverbax88
No offense to the author, but going through one bad rewrite doesn't make
anyone qualified to declare the idea unilaterally unsound.

Having gone through dozens of complete rewrites, I can agree that engineers
too often want to start from scratch, because it seems easier to build it 'the
right way' than continue to wrestle with old code. But that doesn't mean it's
always the wrong idea.

I've seen it work brilliantly. I was over one rewrite where we were struggling
to get the existing code base to adapt, so I pared off a couple of devs,
rewrote the whole app in a few weeks (under a month) while the primary team
continued to support the existing app. Maintenance became a breeze.

But I've seen the other side of it, too. I've seen total teardowns and
chucking years of QA'd functionality go horribly wrong.

------
ajju
1) Often the perception that code quality is so bad that a rewrite is needed
stems at least in part from the "not built by me/us" syndrome (related to the
"not built here" syndrome). Developers tend to overestimate their ability to
write good quality code in real world situations.

2) A lot of folks here are talking about "throwing away the initial
prototype". That makes a lot of sense at an early stage when it's just you and
maybe two other people and progressively less sense as you grow larger modulo
some other factors. If the company in question has a 50M run rate, we are
talking about a really mature product. I don't think you can call it the
initial prototype any more.

My opinions are colored by observing, at close range, a failed multi-year
attempt to rewrite a mature product from scratch at a major corp and success
at refactoring in parts to significantly improve code quality at my own
startup.

~~~
matwood
_Often the perception that code quality is so bad that a rewrite is needed
stems at least in part from the "not built by me/us" syndrome_

That plus the lack of understanding of _why_ the code is complicated. Corner
cases and exceptions are a huge PITA.

~~~
Chris_Newton
The trouble with special cases is that they come in two very different
flavours. Some are essential complexity, inevitable consequences of the
problem you're trying to model. Others are accidental complexity, artifacts of
the development process, often things that came along when requirements
changed after the initial design was set and didn't fit in neatly but didn't
justify reworking the whole thing either.

You can never get rid of the essential complexity, but with the wisdom of
hindsight you can often produce a new design that integrates the accidental
special cases into a coherent whole. I've seen modules cut to 1/3 their former
size and various "can't fix" bugs eliminated as a consequence.

------
badmash69
A lesson can be derived from the Facebook, PHP and HipHop story. They did not
rewrite the PHP code in another language, they just made it go faster with
HipHop by compiling PHP into C++, and then compiling that into a binary.

The end users neither cared nor knew about this change. I do not think the
development or release timelines of new features were impacted at all.

What this example serves to illustrate is that one should consider all
possibilities before making a decision regarding code that has been in
production and has active users. Rewriting production code should not be the
only option you should have on the table.

~~~
derwiki
Sure, HipHop gives a fixed percentage increase. It feels a little strange to
compare this to a software re-architecture though, since fixing something
Fundamentally Wrong can have compoundingly good effects down the road.

~~~
OstiaAntica
HipHop makes PHP run as C++. It is a massive boost and a kind of a port.

<http://developers.facebook.com/blog/post/358>

------
sofuture
I think the answer is not to 'not rewrite' but 'rewrite well'. Rewrite, to me,
doesn't mean throwing out the whole thing -- it means start from a blank slate
and leverage what you have as you rebuild something that matches your problem.

I'm a big proponent of 'rewrite often'. Instead of building your software like
you're playing Katamari Damacy, take the time to rewrite to your current specs
as a whole -- playing Red-Green-Refactor on a bigger scale.

I know the article is addressing 'bad rewrites', but I think all rewriting
gets an unfairly bad rep.

------
agentultra
It's a highly contextual decision. If it's coming from an experienced engineer
who doesn't want to make more work for themselves than absolutely necessary; I
would take their advice in stride. If you've just hired college kids with no
experience, do the opposite of what they say.

I currently maintain a legacy web application built over 12 years ago in C++.
When it was originally built, it was that developer's first programming
project. It has since survived several aborted attempts to extend it with and
port it to Python before landing in my lap. It's horrible to say the least.

However, the approach to the problem was decided before I got here. It's a
smart plan and well executed by the very smart developer who worked here
before me. He's wrapped the legacy application with an FFI and has written a
slew of heavily tested code to sync the legacy data from the old application
to a relational database that they want the new platform to be written on.
From there most of the application has actually been ported to a Python web
framework. Those parts that haven't are still supported by the legacy
application. My job is to finish this process and then look at "re-
normalizing" the data and start re-developing and designing the features our
clients have been asking for.

The problem with this approach is that it's not cheap. It's not glamorous work
dealing with someone else's poor design choices, bugs, and lack of
documentation. It's not easy grasping the amount of complexity that goes into
running a system that maintains the legacy application and the new code in
parallel. A green developer simply cannot do it. People with the kind of
expertise it takes to manage this approach to dealing with legacy applications
come with a premium.

One of the first questions I asked was, "why didn't you just rewrite this?"
They certainly had enough time. The decision to take the approach we're on now
was made four years ago and was not expected to take this long. A rewrite,
even a mis-estimated one, would not have taken near as long and would have
been far cheaper. They also wouldn't still be suffering some of the crippling
bugs that are left in the legacy code that are affecting their customers _to
this day_.

"Rewrite," isn't an ugly word that should be avoided like the plague. In many
cases it's a very reasonable answer to a difficult problem. Like anything you
just have to evaluate the pros and cons effectively. Only experience can help
you there. So if your seasoned technical lead says, "rewrite," you might want
to consider it.

~~~
michaelchisari
Creating a legacy compatibility layer, so that you can rewrite each component
or area piece by piece is definitely the way to go. That's how I've been
reworking Appleseed into a component-based MVC, and it's worked really well.
People still can use the legacy code, they don't have to wait in the dark
while things get rewritten.

------
ladon86
I think that there is a scale depending on how far you've got with the
startup. The pain of a rewrite rises the further along you are, which is why
it's important to:

a) Make good architectural decisions (good luck)

b) Rewrite a lot as early as possible, as those decisions turn out to be
wrong.

You know, fail fast.

I'm at an early stage, and I have rewritten twice this year. The pain has
definitely been worthwhile, as my system is now beautifully designed and
organised.

It might be possible to design the perfect architecture on a whiteboard and
then go ahead and execute it, but that's an order of magnitude harder than
writing a subroutine and having it execute first time with no errors. And most
of us can't even do that regularly.

Your product is like a bit of jello which is solidifying fast; you need to
make the dramatic changes early to avoid being stuck with an ugly lump later
on.

------
spidaman
I led a counter case where a rewrite was very successful. There had been a
major component of the architecture that just approached the problem wrong at
its inception. Changes that one would have thought should take a few hours or
a day to make took weeks. And once you dug into the code, you learned why. It
lacked adequate tests, proper componentization, error handling and operational
visibility. It was far beyond refactoring, it was just ill conceived. Its
poor functioning had cascaded into other systems; they were riddled with hacks
to compensate for the problem system's deficiencies; technical debt had become
a cancer that spread around the architecture. At a certain point, we declared
the technical debt had reached technical bankruptcy and acquired buy-in across
the organization (execs to engineers) that we needed to start a new code base.

However, part of making the rewrite succeed was sucking it up and doing
continued maintenance on the legacy system. It was no fun but it had to be
done. Things that _couldn't_ be implemented in a reasonable time with the
legacy system but were high priorities were implemented in the new system to
assure that the win wasn't just one of purity of essence, it was enhanced
functioning. Enough was learned from what worked and what didn't in the legacy
system that we had a good deal of clarity on what requirements we wanted to
fulfill. The hand wringing over excessive feature creep and other foibles that
can make rewrites fail were attacked with discipline.

I've heard of many big rewrites that failed but don't buy the argument that
they demonstrate that it can't be done. It can.

------
codewhiner
I am torn.

We are not really a startup, but we are small and have a startup attitude
among the technology people. The platform we are running on goes back about 10
years, and it's starting to show.

There is a single shared database that is used by MS Access, classic ASP, and
ASP.NET applications. This means you can't change any one piece without
affecting all of the others. We've had people leave because of the resistance
to change inherent in the platform. Tiny little changes are very very hard to
make sometimes. Small changes can take weeks. Certain major changes might as
well be impossible.

But then all of the advice I've ever heard says "don't rewrite." What if we
don't rewrite, but build a new platform? Solve different problems than the
original?

~~~
jerf
Are you familiar with refactoring? It's difficult to give specific advice
because it depends so critically on the local landscape, but in general,
factor out a layer and give it a good healthy coating of unit tests, and
repeat until done. It is more complicated than I make it sound, sure, but in
some sense it is also really that easy. Factoring out the data access layer
sounds like a first step, creating some sort of service that actually unifies
the data access patterns and then moving up from there, but I can't guarantee
that.
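
As a rough illustration of "factor out a layer and coat it in unit tests", here is a small Python sketch using the standard `sqlite3` and `unittest` modules. The `CustomerStore` class and the `customers` table are hypothetical; a real system would point the same layer at its existing shared database.

```python
import sqlite3
import unittest
from typing import Optional


class CustomerStore:
    """Hypothetical data access layer: the only code that touches the table."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn

    def add(self, name: str) -> int:
        cur = self._conn.execute(
            "INSERT INTO customers (name) VALUES (?)", (name,)
        )
        self._conn.commit()
        return cur.lastrowid

    def name_of(self, customer_id: int) -> Optional[str]:
        row = self._conn.execute(
            "SELECT name FROM customers WHERE id = ?", (customer_id,)
        ).fetchone()
        return row[0] if row else None


def make_test_db() -> sqlite3.Connection:
    """In-memory database with an assumed schema, used only by the tests."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    return conn


class CustomerStoreTest(unittest.TestCase):
    def test_round_trip(self) -> None:
        store = CustomerStore(make_test_db())
        new_id = store.add("Acme")
        self.assertEqual(store.name_of(new_id), "Acme")

    def test_missing_customer(self) -> None:
        store = CustomerStore(make_test_db())
        self.assertIsNone(store.name_of(999))
```

Once every caller goes through a layer like this, the tests pin down its behaviour and the storage underneath can change without touching the callers.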

All but the truly worst worst scenarios are better met with this approach than
a true rewrite. In this case by "rewrite" I mean the creation of a new system
next to the existing system that doesn't work until all (or at least most) of
the new pieces are in place. If you've truly got an epic snarl, rewrite may be
the only option, but odds are you don't actually have _that_ epic of a snarl.
With the proper approaches and tons of unit tests, a refactoring is like a
rewrite in the end, except you have a running system the whole time. It can
actually be slightly slower in total, but you also get the value of being able
to choose when to stop and a continually improving system that is always
actually running; it's only slower vs. a rewrite that actually succeeds and
completes and that is _not_ a sure thing!

With a database backing you do also have the option of trying to bring up a
new system that also hits the old databases, but that will in practice require
the first step I laid out anyhow, the refactoring of the data access layer,
and once you've done that the use of the rewrite value goes down a lot.

~~~
codewhiner
Refactoring only goes so far in this case, since we can only realistically
refactor one application. Unit tests help the refactored application but they
don't guarantee that all of the other pieces (DB stored procs, MS Access,
classic ASP, assorted other bits) still work.

~~~
troels
Why can you only refactor one application?

------
grammaton
Ah, this old chestnut again. Just because Joel said it, doesn't make it true.
Granted, the line between "refactor aggressively" and "rewrite" can be pretty
blurry at times.

~~~
michael_dorfman
_Granted, the line between "refactor aggressively" and "rewrite" can be pretty
blurry at times_

Not at all. The line isn't blurry in the slightest. In the "refactor
aggressively" case, the code is continuing to run while you make the changes.
In the "rewrite" scenario, you start with a blank page, and don't have an even
minimally functioning system until you build enough of it to get to that
point.

In my experience, the advantages to the former approach cannot be overstated.

~~~
nollidge
I think that's an overly narrow definition of "rewrite". In fact, I'd argue
that piecemeal is the smart way to go about a rewrite, if that is at all
possible. In which case the rewrite is just an aggregate of refactorings.

~~~
michael_dorfman
That "narrow" definition is the one being used by Joel, and the other authors
who have weighed in on the debate. A rewrite isn't an aggregate of
refactorings-- for the purposes of this debate, it is the _opposite_ of an
aggregate of refactorings.

------
chadaustin
Man, I regret not spending more time on the quality of the code at IMVU. I'm
not a big fan of rewriting from scratch, but we've basically scaled a
prototype into a crazy successful business, and it's had some nontrivial
effects. For example, product owners now believe that you may as well timebox
all refactoring, because you can never get it all. I wrote about the hidden
costs of dirty code a while back:
<http://chadaustin.me/2008/10/10-pitfalls-of-dirty-code/>

In general I'd prefer to refactor, but you need to _explicitly_ adopt that
cultural shift early on.

------
InclinedPlane
Sometimes a full rewrite is useful and necessary, often it isn't.

The typical failure scenario for rewrites is that development teams don't
appreciate that they are more difficult than writing something from scratch.
When you create something new you have time for it to grow to scale, you have
time for bugs and design defects to be worked out of the system, you have the
ability to concentrate on all of these problems without distraction, and the
consequences of a late project (which should be expected) are less.

A naive rewrite will ignore the fact that the rewrite will take more time to
reach the QA level of the old system (especially at scale), that it'll take
more time to develop because resources are split between supporting the old
system and the new, that it'll be more difficult to support backwards
compatibility or data migration, and that it'll be risky to deploy (you either
do so incrementally or you take a huge risk and do a big-bang deployment).
Combined with the regular schedule pressure of software development what you
usually end up with is your typical big-bang integration/deployment CF.

Stepping into a time machine, going back to the origin of your software system
and rewriting it to be better is far, far easier than writing a new system to
replace an existing, running system that's supporting a large volume of
business.

An important question to ask is: when are you fully committed to the new
system? The right answer should be "when it's proven itself better", but
typically the answer is "when we've started coding it", the latter is hugely
risky. It's best to think of rewrites as creating an internal competitor that
has to prove itself at every point.
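
One way to make the "internal competitor" idea concrete is a shadow run: keep serving results from the old code while running the new code alongside it and recording every disagreement. The Python sketch below is only an illustration; `legacy_total`, `rewritten_total` and `shadow_run` are hypothetical names, and the totals are toy stand-ins for real business logic.

```python
from typing import Any, Callable, List, Tuple


def legacy_total(prices: List[float]) -> float:
    """Stand-in for the existing, trusted implementation."""
    total = 0.0
    for price in prices:
        total += price
    return round(total, 2)


def rewritten_total(prices: List[float]) -> float:
    """Stand-in for the rewrite that still has to prove itself."""
    return round(sum(prices), 2)


def shadow_run(
    old: Callable[[Any], Any],
    new: Callable[[Any], Any],
    args: Any,
    mismatches: List[Tuple[Any, Any, Any]],
) -> Any:
    """Serve the old result; run the new code too and record disagreements."""
    expected = old(args)
    try:
        actual = new(args)
        if actual != expected:
            mismatches.append((args, expected, actual))
    except Exception as exc:  # a crash in the new code must not reach users
        mismatches.append((args, expected, exc))
    return expected
```

Commitment to the new system can then be tied to the mismatch list staying empty under real traffic rather than to the day coding started.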

tl;dr Rewrites are a different kind of software project than normal
development, failing to appreciate that typically leads to failure of the
rewrite.

------
metageek
At one company, I went through two rewrites, one bad, one good. Sort of.

In January 1999, I was hired by appoint.net (which became eCal) to be their
Chief Scientist; my main job was to rewrite their Web calendar in C++. (The
original version was in ASP.) It was a disaster. I started by building the
infrastructure API, but I made it _much_ too complicated. The team working on
the rewrite got up to about 5 people before they admitted that they just
couldn't understand most of what I had done. We canned it sometime around
November, having spent something like 3 hacker-years on it.

The team then went off and implemented a bunch of API features that talked to
the existing application's database, but were written in Perl. They discovered
they _loved_ Perl, and were highly productive in it.

About June 2000, the company decided to pivot. The company's business was an
OEM calendar; the whole app ran on our servers, but anybody could pay us to
get a custom skin, accessed through their own domain. It was popular with dot-
coms who wanted sticky features to keep people coming back. One thing that we
billed as an advantage was that all the customer calendars were actually
running against the same database, meaning that someone who had a calendar
with Foo could send invitations to someone who had a calendar with Bar. (What
we didn't emphasize was that users could switch back and forth between Foo and
Bar at will. That was less sticky.)

But, in mid-2000, we started losing customers--not to competition, but to
bankruptcy. Management decided we needed to switch to the enterprise market--
and, in the enterprise market, Foo did _not_ want Bar users being able to see
their calendars. We needed a packaged solution; but our ASP implementation
couldn't do it.

So, we started on a rewrite. All Web stuff would be done in Perl; all database
transactions would be stored procedures. I was asked to build an
infrastructure for the team to use. I made a template engine--fairly standard
stuff these days, but nice; the team loved using it. I eventually got it
pretty fast, too, by compiling the templates down to Perl on demand. (We had a
budget of 140ms on the Web server and 140ms on the database, which was
computed to support 1,000,000 users on 4 Sun ES-450s--about $16K, I think. We
hit both budgets easily.)

It rocked. Technically, it was a complete success. Business-wise, well, no--we
released in June 2001, a time when enterprises were just not willing to spend
money. Our pitch was that it would save them money compared to Exchange or
Notes, and we were probably right; but they already _had_ Exchange or Notes.

------
fleaflicker
My favorite Joel essay: "Things You Should Never Do, Part I"

<http://www.joelonsoftware.com/articles/fog0000000069.html>

This discussion comes up a lot. People make strong arguments for both
approaches.

If you have an established product/business, rewriting is very dangerous. I
prefer to refactor aggressively.

~~~
shipit
I was about to post this very link! It takes upwards of 1 year to get stable
and rewriting at that point without considering refactoring and/or creative
solutions is cumulative team/company failure.

------
bpyne
Steve's advice is sound in some situations but one factor not mentioned by him
is the stability of the current platform. Are customers affected by bugs,
poorly thought out process flow, etc.? Rewriting simply to add new features
may be a losing proposition. But, rewriting because the current code base is
so poorly thought out that it affects customers...now we MAY have good
justification.

Also, the following struck me as odd, "Our CEO doesn’t have a technology
background, but he’s frustrated he can’t get the new features and platforms he
wants (Facebook, iPhone and Android, etc.)" Not knowing the situation better,
I'm a little leery that the CEO is driving the new features when a product
manager or someone closer to the customer base might be in a better position
to determine what's needed. Perhaps in this company the CEO is close enough to
the customer base, but I can't tell from the post.

------
kenjackson
One of the things I've seen people consistently screw up in rewrites is perf.

I was in the unfortunate seat more than a decade ago of having MY code up for
rewriting. It worked just fine, but people argued that I was the only one who
could work on it (not true, but it could have been cleaner, no doubt).

Anyways a small team dedicated six months to rewriting my server code. Six
months later they scrapped the rewrite. It ran at something like 20% my
sustained throughput. I used to joke that their code was, "copies all the way
down". And sure my code made a lot of use of inline asm and some nasty tricks,
but it flew.

Would I write it the same way again? Probably not. But when you're doing a
rewrite people have something to compare you against.

------
kennu
The interesting thing is how easily you can 'slide' into a full rewrite, when
your old code is aging legacy and everybody wants to get rid of it. I've
participated and seen it happen many times. It has never worked particularly
well.

In my experience, what you mostly underestimate is the amount of hidden
features lurking inside a mature product, which all need to be rewritten in
the new implementation. The stuff you can easily estimate is 20% of the work
(main features), and 80% is all the small stuff you afterwards realize you
also need.
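
The arithmetic behind that rule of thumb, as a small sketch. The function
name `total_rewrite_effort` and its parameters are hypothetical, and the 20%
share is the commenter's estimate, not a measured figure.

```python
def total_rewrite_effort(visible_estimate, visible_percent=20):
    """Scale an estimate of the easily-estimated work up to the whole rewrite.

    visible_estimate: effort estimated for the main features (any unit).
    visible_percent:  share of the total that estimate is assumed to cover.
    """
    if not 0 < visible_percent <= 100:
        raise ValueError("visible_percent must be in (0, 100]")
    return visible_estimate * 100 / visible_percent
```

With the default 20% share, a 3-month estimate for the main features implies
about 15 months in total: five times the number that felt easy to estimate.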

------
KevBurnsJr
This is why you MUST throw away the prototype.

Ditching bad code is much easier when there’s less of it. Throwing away the
first iteration takes discipline and must be executed cold and ruthlessly. It
is not enough to admit the code needs rewriting. You must be prepared to
delete code from day 1.

3 months is a good life cycle for a functional prototype. Budget for 1 month
of overhaul for the first 3 months of development and 1 month of overhaul for
every 1 month of development beyond that.
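
One way to read that budgeting rule, as a sketch. The name `overhaul_months`
is hypothetical, and charging the flat 1 month to any prototype younger than
3 months is an assumption; the rule above does not say what to do in that
case.

```python
def overhaul_months(dev_months):
    """Months of overhaul to budget for a given amount of development time."""
    if dev_months < 0:
        raise ValueError("dev_months must be non-negative")
    if dev_months == 0:
        return 0
    # 1 month covers the first 3 months; each month beyond that costs 1 more.
    return 1 + max(0, dev_months - 3)
```

Under this reading, a 3-month prototype costs 1 month to overhaul, while one
left to grow for 6 months costs 4, which is the argument for throwing it away
early.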

------
ankimal
I can vouch for this having gone through this myself. We built our beta in
about a year with a codebase full of "technical debt" and a very bulky
application which was "feature rich" but "usability poor" and also started
introducing all these back end problems (some a manifestation of our web
framework as well). So we thought we were "too cool for school" and started
architecting (in hindsight, "over-architecting") what was at heart a simple
application which just needed a facelift and some consolidation.

We distributed it into multiple pieces and started a massive re-write with all
these services which ended up with more code in the shared plugin (duplicated
across all services) than in the service itself. Eventually, we realized that
we were "over-engineering", cut our losses and quickly glued our little pieces
back together into a slightly slimmer application which had all that we really
needed to begin with - a facelift and some consolidation.

------
Aegean
This is one of my favorite lessons learned, also covered here:
<http://www.joelonsoftware.com/articles/fog0000000069.html>

------
singular
I think a good analogy is a messy house. Do you bulldoze the house and rebuild
it, or do you just go ahead and tidy what's in it?

Only in the most unbelievably dire and awful circumstances do you take the
former option...

~~~
mgkimsal
The situation is never 'tidy up' though, when it comes to software. When was
the last time your assignment was to 'move 4 buttons on the screen, and make
them shinier'?

Assignments to existing systems involve activities like adding new db fields,
new validation logic, integration with external systems, rerouting/duplication
of existing data in to new modules, and so on.

Comparing these to activity on a house, all would involve
construction/building of some sort. If the foundation is weak - to the point
where hammering a nail in a wall causes the floor in another room to cave in a
bit, most sane contractors would not get involved, or require severe
structural work to get things (back?) to a minimum safety code.

Admit it - you've worked on projects where introducing a variable in module X
causes havoc in some screen which seems unrelated to module X. You can harp
about test cases catching all this stuff, but the types of systems we're
talking about - the ones we're talking about replacing wholesale - _don't have
that infrastructure in the first place_. They weren't _built_ according to
best/good practices, which is why the people in the article were considering
rebuilding from scratch.

Only in software do we think we can order people to continue to work on
systems that are visibly falling apart without having to put in the required
infrastructure work to make sure things don't keep breaking.

~~~
singular
See my other reply below; there are definitely times when shit needs to be
done. Perhaps the analogy should be the state of the house in general; if the
foundation's fucked you really have no choice but to start again.

And yes, there are _definitely_ situations where that's been necessary and
I've worked with some. I've seen a piece of software that got so damn
complicated that the _only_ thing you could do with it was to add stuff
exactly according to the retarded design, anything even slightly varying from
that would have taken literally weeks to implement.

I think the point here is that utterly fucked software occurs way more times
than people would care to admit. A lot of the problem is the disconnect
between non-technical managers and coders. Coding isn't factory work.

------
Maro
It depends.

For example, if your startup is a web-based service written in PHP, then in
most cases you shouldn't rewrite it, because your customers won't get much out
of it, and you'll be left behind.

But if you're a database startup and your storage engine is doing too much
random disk I/O (which is slow), then you'll have to rewrite that part,
otherwise you don't have a usable product. However, you should keep in mind
that this is super-dangerous, so you should get something out the door ASAP,
even if it means taking terrible shortcuts.

~~~
robryan
Depends how bad the code base is, I can see security being a big problem in a
PHP code base loaded with technical debt. Speed is another one, certain things
in PHP are just going to be slow with bad implementation.

~~~
Maro
IMO the prudent thing to do as a businessman is to wait until your startup is
no longer a startup but a well operating business when you can deal with such
technical debt by hiring a smart guy who'll refactor piece-by-piece in the
background. Until then you just patch security issues and cache the hell out
of it.

------
bourbaki
For me it depends. I think that rewriting 100% of the code is crazy. I work
for some web startups, and especially in the beginning, with very little
experience, you or your colleagues might create something that fits the need
at that specific moment but won't hold up in the future: maybe it doesn't
scale well, or something similar. In that case, it is better to rewrite. You
might save the company and learn a lot from that terrible experience.

------
vannevar
Show me an example of a startup that scaled up massively and _didn't_ rewrite
their code. I'd suggest here that we have a case of correlation without
causation. Startups tend to need a big rewrite early in their history, which
is also coincidentally when they are likely to fail. People looking for a
reason (or an excuse) for failure will blame the rewrite. Not unlike the
'vaccines cause autism' meme.

------
didip
A rewrite is an overarching, daunting task. It's so much better to break
rewrites into multiple steps. I went through several major rewrites with
various startups so I think it works, but take it with a grain of salt:

1\. Throw away garbage first. Most companies, including startups, accumulate
garbage code fairly quickly. Just by throwing away old ideas that didn't work,
you have already achieved a lot. There are many benefits to doing this alone:
tests run faster, compile times drop, awk/grep get faster, etc. As a startup
founder/CTO, you can even hold a garbage-throwing party once every few
months. Every programmer I know loves throwing away garbage code.

2\. Ask stakeholders what the primary use cases are. Don't write a single line
before doing this because it will be wrong, again.

3\. When new technology is involved, perform load tests. Even the most trivial
load tests will do. Doing this will give you basic knowledge of how robust
the tools are.

4\. Rewrite 1 thing at a time and run tests in between so that your confidence
stays high.
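
Step 4 can be sketched as a parity check: keep the old implementation around,
rewrite one piece, and compare the two on the same inputs before moving on.
Everything named here (`legacy_slugify`, `new_slugify`, `parity_failures`,
`SAMPLES`) is a hypothetical stand-in, not code from any of the rewrites
mentioned above.

```python
import re


def legacy_slugify(title):
    """Stand-in for the old code whose behaviour must be preserved."""
    out = ""
    for ch in title.lower():
        if ch.isascii() and ch.isalnum():
            out += ch
        elif out and out[-1] != "-":
            out += "-"  # collapse any run of other characters into one dash
    return out.strip("-")


def new_slugify(title):
    """The rewritten version of the same piece."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def parity_failures(old, new, samples):
    """Return (input, old_output, new_output) for every input where they differ."""
    return [(s, old(s), new(s)) for s in samples if old(s) != new(s)]


# Sample inputs; in practice these would come from real traffic or fixtures.
SAMPLES = ["Hello, World!", "  spaces  ", "Already-slugged", "", "100% done"]

if __name__ == "__main__":
    failures = parity_failures(legacy_slugify, new_slugify, SAMPLES)
    print("mismatches:", failures)
```

An empty mismatch list is the signal to switch callers over to the new piece
and start on the next one; a non-empty list usually points at one of the
hidden behaviours the old code had picked up over time.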

------
GBond
counter point: Greenplum.

Anecdotal evidence that it is possible to fire all of sales, rewrite and
not only survive but also have a good exit.

<http://www.beyondvc.com/2010/07/emc-buys-portfolio-company-greenplum-more-behind-the-story.html>

------
droz
Sometimes when you have a blighted building on your hands, the only thing you
can do is burn it to the ground, clear the lot and start fresh. It just
doesn't make sense to replace a window here, the plumbing there when the
entire frame is rotten and about to collapse.

------
codelust
If Steve had written the ending of the post as the beginning, it would have
clearly conveyed what he was trying to say.

The key factors involved in such a decision are:

1) Why 2) When

I don't think it is a blanket vote for or against a rewrite.

The important bits come in the end under "Lessons Learned".

------
gozzoo
Joel Spolsky has written a very good essay about this using Netscape as an
example: <http://www.joelonsoftware.com/articles/fog0000000069.html>

------
brown9-2
"don’t rewrite the code base in businesses where time to market is critical
and customer needs shift rapidly."

I don't mean this sarcastically but in 2011 what businesses are there where
"time to market" is _not_ important?

