
How To Survive a Ground-Up Rewrite - codegeek
http://onstartups.com/tabid/3339/bid/97052/Screw-You-Joel-Spolsky-We-re-Rewriting-It-From-Scratch.aspx
======
freework
The trick to making a rewrite work is to only rewrite what you know how to
rewrite. What I mean by that is a lot of people will be working on a project
and will at some point say to themselves "This project is really hard, I can't
get anything done, I have no idea what is going on. My only solution is to do
a complete code rewrite". If this is your mindset going into a large refactor,
you are bound to either fail, or end up with a system that is just as badly
composed as it was before.

On the other hand, if you're saying to yourself "This code really sucks, they
wrote their own complex CSV parser, I can rewrite this to use the CSV library
in the standard lib" and things like that, no matter how big the task may
seem, it'll work out in the end.
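
To make the CSV case concrete, here is a minimal sketch in Python (my choice of
language for illustration, nothing in the article specifies one). The function
name `parse_rows` and the sample data are made up; the point is only that the
standard `csv` module already handles the quoting cases a hand-rolled parser
tends to get wrong.

```python
import csv
import io


def parse_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    # csv.DictReader copes with quoted fields, embedded commas and doubled
    # quotes, which a naive line.split(",") parser would mangle.
    return list(csv.DictReader(io.StringIO(text, newline="")))


# Hypothetical sample: a comma inside a quoted field and an escaped quote.
sample = 'name,comment\n"Smith, Jane","said ""hi"""\n'
rows = parse_rows(sample)
```

A replacement like this is a rewrite you "know how to do": the scope is one
well-understood function with a known-good substitute.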

~~~
krichman
That's a good point. It's actually hard for me to recognise when it is bad
code and when I am just feeling NIH syndrome.

~~~
nevinera
The important thing to realize is that it's _all_ bad code.

Our job as engineers is to prevent those criminal routines from murdering one
another by carefully confining them into their prison modules and never
letting them get too rowdy.

I am the Code Warden, and within these bytes my word is law!

------
Joeri
One thing to pay attention to is the never-quite-done migration. For example,
I worked on a deep refactoring (near rewrite) of a room reservation system
where the data model needed restructuring for faster querying (but with
identical data). The migration was done by making the old data model
automatically write to the new from triggers, and then migrating the read code
to point to the new. Of course, once the performance bottleneck from the read
code was solved, suddenly migrating the write code seemed way less important.
Higher priorities intervened, and the code is still in a halfway state two
years down the road.
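
For anyone who hasn't seen this pattern, a rough sketch of the halfway state
using Python's `sqlite3` (the table, column and trigger names are invented for
illustration, and the real system was certainly not SQLite): writes still go
to the old model, a trigger mirrors them into the new one, and reads have
already moved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Old data model: still the target of all application writes.
    CREATE TABLE old_reservations (id INTEGER PRIMARY KEY, room TEXT, day TEXT);
    -- New data model: restructured for faster lookups, same data.
    CREATE TABLE new_reservations (room TEXT, day TEXT, PRIMARY KEY (room, day));
    -- The trigger keeps the new model in sync with every old-model insert.
    CREATE TRIGGER mirror_insert AFTER INSERT ON old_reservations
    BEGIN
        INSERT INTO new_reservations (room, day) VALUES (NEW.room, NEW.day);
    END;
""")

# Write path: not migrated yet, still uses the old table.
conn.execute(
    "INSERT INTO old_reservations (room, day) VALUES (?, ?)",
    ("A1", "2013-05-01"),
)


def is_booked(room, day):
    """Read path: already migrated, queries only the new table."""
    row = conn.execute(
        "SELECT 1 FROM new_reservations WHERE room = ? AND day = ?",
        (room, day),
    ).fetchone()
    return row is not None
```

Once `is_booked` is fast, nothing visibly breaks if the write path never
moves, which is exactly why the last step tends to get postponed.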

Another danger I ran across was moving goalposts. Long rewrites will
typically be interrupted to extend the old system with new features, which
increases the effort to rewrite.

~~~
philsnow
I don't know the specifics (obviously) but this reads like creating a better
index on an existing table.

The gross part is (I'm guessing, obviously) that now each write to this data
store incurs an additional write and some cpu load on the data store servers
(you mentioned it was done via a trigger). But now the value of finishing the
migration that you started is just recovering this extra write and cpu.

~~~
ams6110
And as an aside I'd _strongly_ advise against using triggers in almost all
cases. They tend to result in systems where things happen by "magic" rather
than in well understood ways. If you're tempted to write a trigger, use a
stored procedure instead and invoke it explicitly.

------
InclinedPlane
There is one common mistake that teams tend to make when deciding on a
rewrite: committing to the rewrite too early.

Here are some ways to do a rewrite well:

Set up the rewrite as an alternate, competing product and let the two compete
in the marketplace. This requires having the resources to support developing
two such products at the same time using separate teams for each, which is not
something many companies can afford. The advantage to doing it this way is
that it gives the rewrite time to mature before it's fully replaced the old
system. What most people tend to not realize is that mature software with
fundamental flaws can often be superior to immature software on a better
foundation. Real-world use is orders of magnitude more effective as a QA
process than any sort of in-house testing, and always will be. Products that
have been tried by fire and made the cut should always be respected, no matter
how horrible their underlying code is.

If you look at a lot of the major switchovers in software you'll see this sort
of trend fairly often. For example, consumer Windows took several generations
and almost an entire decade to fully switch over to the NT core.

Another good way is to rewrite piece-wise, in place (refactoring). There are
lots of different ways to do this regardless of architecture; ultimately
you're essentially creating a bubble of new code that grows and grows until it
completely replaces all the old code. The advantage of this is that it takes
much less effort than a full rewrite and you can do it incrementally and
slowly. Often it's easier to mature each new component in isolation than to
try to mature an entire new product.

And, of course, there's the hybrid approach. Work component-wise in an "add +
deprecate" fashion. Create the new component, have it work in parallel along
with the old component, and then deprecate the old component and work toward
moving client code to the new component.
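
A minimal sketch of "add + deprecate" in Python, with made-up function names
(`total_price` as the old component, `total_price_v2` as the new one). The old
entry point stays in place and delegates to the new code while warning callers,
so client code can move over one call site at a time.

```python
import warnings


def total_price_v2(items):
    """New component: items is an iterable of (unit_price, quantity) pairs."""
    return sum(price * qty for price, qty in items)


def total_price(prices, quantities):
    """Old component: same behaviour, kept alive while callers migrate."""
    warnings.warn(
        "total_price is deprecated; use total_price_v2",
        DeprecationWarning,
        stacklevel=2,  # point the warning at the caller, not at this shim
    )
    return total_price_v2(zip(prices, quantities))
```

When the deprecation warnings stop showing up in logs or test runs, the old
function can be deleted.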

~~~
johnbellone
> This requires having the resources to support developing two such products
> at the same time using separate teams for each, which is not something many
> companies can afford.

This is definitely the ideal situation, but as you said most companies do not
have the resources to greenfield a separate project. Additionally, convincing
the higher-ups that it's worth it is a difficult task.

> Another good way is to rewrite piece-wise, in place (refactoring). There are
> lots of different ways to do this regardless of architecture, ultimately
> you're essentially creating a bubble of new code that grows and grows until
> it completely replaces all the old code.

> Work component-wise in an "add + deprecate" fashion.

Everywhere that I have worked this has been the route we've taken. We isolated
the old behavior until we were able to fully wean clients off the product (or
perform a data conversion and completely use the new code).

~~~
ucee054
_most companies do not have the resources to greenfield a separate project_

Well no, sometimes they have the resources but don't want to spend them, then
they end up spending more overall because they botch the replacement.

------
sps_jp
Having been the person responsible for migrating data from an old system to a
shiny new one, I wish I had read this a few years ago. However, since I will
be embarking on a rewrite-from-scratch project in a couple of weeks at a new
position, I am glad I read this now. Migrating data is hard and should be
given the same level of importance as the back-end or front-end development. I
remember vividly the cycle of "sleeping" with my laptop next to me churning
away over night, waking up every hour to check the process and having it fail
at odd times. As the author said, make your migration process repeatable,
testable and logged. Good luck.
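
What I wish I'd had back then looks roughly like this Python sketch. The row
shape, the dict standing in for the new database and the `transform` callback
are all assumptions for illustration. It skips rows that are already migrated
(repeatable), takes plain inputs (testable) and records every success and
failure (logged).

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("migration")


def migrate(old_rows, new_store, transform):
    """Copy rows into new_store, skipping any that were already migrated.

    Returns the ids that failed so a later run can retry just those.
    """
    failed = []
    for row in old_rows:
        row_id = row["id"]
        if row_id in new_store:
            continue  # already migrated, so re-running is safe
        try:
            new_store[row_id] = transform(row)
            log.info("migrated row %s", row_id)
        except Exception:
            # Log the traceback and keep going instead of dying at 3 a.m.
            log.exception("failed to migrate row %s", row_id)
            failed.append(row_id)
    return failed
```

Because a failed row never lands in the new store, fixing the transform and
running the same command again picks up exactly where it left off.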

------
16s
Over the years, I've noticed that many programmers who want to re-write just
aren't comfortable modifying other people's code, or it wasn't written in
their favorite language. So, they are critical of the code and urge a
re-write. These people generally make bad programmers. If you have people on your
staff that want to re-write this program, or switch to this framework (because
it's what they are used to), then watch out. That's a bad sign.

Competent programmers can maintain other people's code; those who cannot want
to re-write everything. Train wreck in progress. Run from these people, or
fire them if you're in a position to do so.

~~~
sergiotapia
I've run into a situation in the past where there were endless lines upon
lines of PHP code intermixed with random static functions and HTML markup.

I'm talking 40 files of 2,000 LoC each that all called the database whenever
they felt like it, all echoed out data and modified each other in weird ways.

I spent about two weeks trying to untangle the mess and threw the towel in at
day 15. We just rewrote it using a PHP framework that not only let us query
the database using a sane ORM, but also gave the project STRUCTURE and added
maintainability value.

Sometimes a rewrite is the best cure. ;)

------
ams6110
_We have a pattern we call shrink ray. It's a graph of how much the old system
is still in place._

Immediately brought to mind the first large shop I worked in out of school
(about 1992) where they had a big chart on one wall titled "Punch Card
Elimination" with a downward trending line. I was actually shocked that a
large, multinational manufacturer was using punch cards for anything,
particularly when my project involved building cutting-edge (at the time)
client/server systems that included wireless handheld devices on the client
side. My introduction to the counterintuitive world of enterprise IT: you will
often see opposite ends of the technology spectrum in use, simply because the
value proposition of eliminating the old was never high enough before.

------
derefr
> If you really are committed to just rewriting the entire thing from scratch,
> I don't know what to tell you.

What about when what you're rewriting is basically an intensely-overgrown
prototype that was never meant to run in production in the first place?

You're _supposed_ to completely rewrite those, aren't you?

~~~
stormbrew
It ceased to be a prototype when it went into production. No dissembling can
change that fact. At that point, all the caveats of how and what you can
rewrite take effect whether you like it or not.

------
dkuebric
Data migration becomes a lot easier the closer to "raw" data you store. The
farther away you are, the trickier it will be to write those migration scripts.
And, if what you're storing is rolled-up enough, you may not even have the
information you want/need any more. Note that the source data doesn't have to
be what you're serving queries from; it could be an archive with batch scripts
to re-process that's only dusted off in case of disaster/migration.
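
A small Python sketch of what I mean, with a made-up event shape and made-up
rollup functions. If only the per-day rollup had been stored, the per-user
numbers a new system might want could not be reconstructed; from the raw
events, either view is one batch job away.

```python
from collections import Counter

# Hypothetical raw archive: one record per page view.
raw_events = [
    {"user": "ann", "page": "/home", "day": "2013-05-01"},
    {"user": "bob", "page": "/home", "day": "2013-05-01"},
    {"user": "ann", "page": "/docs", "day": "2013-05-02"},
]


def views_per_day(events):
    """Rollup the old system serves; user and page detail is discarded."""
    return Counter(e["day"] for e in events)


def views_per_user(events):
    """Rollup the new system wants; only derivable from the raw events."""
    return Counter(e["user"] for e in events)
```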

------
Mz
This was great. I don't really code but I have been trying to migrate
handcoded websites to Wordpress since dinosaurs ruled the earth. I still
haven't figured out what I need to do to pull that off satisfactorily. Copying
pages requires me to then strip out html. I feel like I have never wrapped my
brain around getting the new navigation to really work even though navigation
problems (with an overly long and growing list of pages) was one reason for
the migration.

I am wondering what sorts of info I need to go looking for to get this
finished. I have thought about it before, contemplated asking around, and
realized I didn't even really know what questions I needed to ask.

I also found this, which struck me as helpful but not quite what I wanted to
read about:
[https://www.linkedin.com/today/post/article/20121226171356-6...](https://www.linkedin.com/today/post/article/20121226171356-658789-doubtliers-dangers-learning-from-the-exceptional?trk=mp-reader-card&_mSplash=1)

~~~
johnbellone
I've always just dumped the Wordpress database using the export tool. Once I
have the data, I rewrite the template.

Most of the time the Wordpress template is so wonky that it's better to just
rewrite it (especially if you're migrating away from Wordpress in the first
place).

~~~
Mz
No, migrating _to_ WordPress. But thanks for replying.

------
chiph
A lot of this rings (painfully) true.

In the past, we used subsets of data to test our migration tool. Always.
ALWAYS, there was something bizarre in the excluded subset that caused the
migration to break.

~~~
philsnow
Additionally, if your data is highly graph-structured and not trivially
shardable, merely selecting a valid subset can be complicated.

------
skastel
If you want more insight into the "Unhappy Rewrite" here's some of my lessons
learned, though I've learned a lot more in the 2+ years since I wrote it.
[http://dev.hubspot.com/blog/bid/58082/Leads-API-Lessons-
Lear...](http://dev.hubspot.com/blog/bid/58082/Leads-API-Lessons-Learned)

------
ldng
Rewrites are bad, true. But sometimes unavoidable. Really.

When you have a PHP project where there's a mix of procedural functions (the
code is at the top and the HTML at the bottom, so it's easy to split, right?
Hmm, and those includes, what do they do exactly?), classes (we used classes
to put our functions in, so we do OOP, right? Err... no) and parts of a
heavily customized Wordpress. Plus two or three half-started in-code
reorgs/rewrites that went nowhere because the devs jumped ship.

You're told you have to I18N the whole spaghetti meatball soup. You try a few
weeks and then see you're going nowhere and start to realize that you're
doomed to a rewrite. Oh, and you're a team of two. Sometimes you really have
to. And migrating data... such a pain in the ass! Text encoding, I'm looking
at you! Out-of-hand int codes (would not call that enum), I'm looking at you
too.

But. It's not always the case. And right now I'm starting to reorganize and
refactor a project that was already in decent shape. And it's much more
interesting, less straining, and the business owners have more visibility on
the work done and are happier.

So yes, whenever you can, incremental rewrite is the way to go. But never say
never. There are contexts where doing a rewrite is just plain necessary
because the code is in such a bad shape you just can't add any feature
anymore.

~~~
purephase
Voted you up because you basically just described the situation I'm in now.

While successfully in production, the web servers were crashing multiple times
a day (still Apache-modphp). So, I started to work on the PHP upgrade from 4 to 5
to get onto php-fpm + Nginx to stabilize the site. Lo and behold, the legacy
db abstraction code (poor man's ORM) was not playing well with later PHP
releases so I started down the path to patch what I could.

I successfully upgraded PHP, fixed a few of their outstanding issues, but I
was weeks into this project and had not completed any of the new feature sets
on my plate yet. Looking into the horizon I imagined months of wrestling with
legacy code and the deeper I went (there was a SOAP layer I hadn't even looked
at yet, they wanted REST/JSON) the more worried I became.

We evaluated a bunch of options and opted for a re-write. We're still deep
into it but there is light at the end of the tunnel and I'm pretty sure it
will be worth it in the end.

------
jacques_chester
Incrementalism is sometimes called Burkean or Oakeshottian conservatism in
political circles.

The analogy used is of a ship at sea. Replacing the ship in a single step,
while it is underway, is impossible. Replacing too many planks at once is
foolhardy. (I thought the analogy was Oakeshott's but I can't find the
original source).

So instead you replace the planks one at a time. Yes, this is tremendously
slow, wasteful and frustrating. But when the ship is a system on which you
utterly rely, there is no other safe path.

Hayek expanded on the problems of pure rationalism more generally; it's a trap
we as a profession tend to fall into. We substitute _our_ preferences (for a
tidy, orderly system) for an appraisal of actual reality. Real systems have
accumulated complexity that cannot be wished away. Worse: the complexity is
often essential and not merely accidental.

------
PixelPusher
Thank God I've never HAD to do a complete re-write of anything.

What the author describes sounds like a complete nightmare.

On the projects I've worked on, I would like to think that a rewrite will not
be needed.

If it does, it will be because we've now understood the problem space so well
that we can start from scratch and avoid all the 'pivots' that happened in the
original code base.

------
nugz
We fought off a complete re-write for a critical component in one of our
products for many months due to knowing the amount of effort/complexity
required, however in the end it was not the best decision to delay. The
decision was to work on product features rather than fix the known scalability
issues as we believed we still had enough headroom. We grew faster than
expected and ended up in a position where we had exhausted most of our
software optimisation options, database tweaks and even hardware options with
upgrading to SSDs, RAM etc. In the end the component was re-written around
the clock by a few engineers to leverage redis and msmq in a couple of months
and it solved all our problems, thankfully. A very stressful time. Lesson
learnt. It was a difficult call when managing a limited set of resources
deciding where to focus the attention.

------
japhyr
To all the people telling stories about rewrites they have been involved in, I
am curious if the code you were rewriting had tests?

If so, do you rewrite those tests? Or do you keep them, no matter how ugly
they are, because new code still has to pass all the old tests?

~~~
ollysb
I've done a few rewrites, in all cases they were projects that were written
entirely by junior developers (company took a while to learn the costs of
cheap developers) and with absolutely no tests whatsoever.

If you're rewriting a project then you're probably going to have to bin the
unit tests. If you're rewriting it's because you want to make some drastic
structural changes and unit tests are just too tightly coupled. Integration
tests, on the other hand, can be an absolute godsend: you can swap out the
entire stack and they'll still provide just as much value as before.
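
To illustrate why, a toy Python sketch. Both `legacy_slugify` and
`rewritten_slugify` are invented stand-ins for an old and a new implementation,
and the cases are made up. The checks only touch inputs and outputs, so they
don't care which implementation sits behind the interface.

```python
import re


def legacy_slugify(title):
    """Stand-in for the old implementation: character-by-character loop."""
    out = ""
    for ch in title.lower():
        out += ch if ch.isalnum() else "-"
    while "--" in out:
        out = out.replace("--", "-")
    return out.strip("-")


def rewritten_slugify(title):
    """Stand-in for the rewrite: completely different internals."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


# Black-box expectations, written against behaviour only.
CASES = [
    ("Hello, World!", "hello-world"),
    ("  Ground-Up  Rewrite ", "ground-up-rewrite"),
]


def check_behaviour(slugify):
    """Run the same behavioural checks against any implementation."""
    for title, expected in CASES:
        assert slugify(title) == expected, (title, slugify(title))
    return True
```

A unit test that poked at the loop inside `legacy_slugify` would be thrown
away with it; `check_behaviour` carries over unchanged.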

------
rlbisbe
If you are starting with some new language or technology that you are not very
fluent at, and months later you land in that same project with a wider
experience in the language, would you rewrite that code?

------
ukandy
I dread to think how many complete rewrites have never seen the light of day.

Get your data source in order, then refactor in small achievable projects from
there. Better for motivation and better for end users.

------
bengotow
If you're reading about surviving a ground-up rewrite when you're supposed to
be _doing_ a ground up rewrite, you're doing it wrong. Read about design
patterns, focus on your core business, and stop trying to focus on the "five
things to make your rewrite successful".

------
k3n
Flagging for sensational, link-baity title.

edit: unflagged after modifications

~~~
dshah
Ok, hadn't thought of that, but see your point.

I think the content is really good, so title changed -- don't think the author
will mind (he had two alternate titles for the original post).

~~~
codegeek
I updated the HN title as well. thx

~~~
k3n
Fine, fine....I'll unflag :)

Thanks for taking the time to update it.

------
michaelochurch
A ground-up rewrite is a sign that things have been done wrong. If your
software and infrastructure are modular, you can rewrite pieces without
tearing down the whole system.

Ground-up rewrites are occasionally necessary, always difficult, and often
hideously political. There's no binary right answer on this question.
Sometimes it's the only answer, but that's a terrible place to be.

My shortest job tenure ever (108 days in the winter of 2011-12) was in a
company where I was brought in to fix the cultural and social collapse that
had followed a badly-thought-out "rearchitecture" led by a 25-year-old CTO's
protege who (a) was on his first real job, having burned out of college and
landed in retail, and (b) didn't know what the fuck he was doing. I'm still
deciding whether I should share that story, in full, with the world.

~~~
ldng
You should. Cathartic. Just don't be too specific and don't name names.

~~~
michaelochurch
I've given the generally relevant pieces on my blog, with varying degrees of
cover. However, naming the specific company is what I've been debating over
the past year. Thus far, I've chosen not to do it.

The reason I'm hesitant is that good engineers (who don't deserve it) might
suffer more than the management (which does). I learned the hard way that
whistleblowing often hurts the more vulnerable good guys first.

It's a startup of about 100 people. I could demolish this company's reputation
with a few keystrokes, but the executives (who deserve it) would probably be
just fine. It's the engineers who'd suffer-- having a disgraced company on
their resumes-- and I hate that that's the case, but such is life.

What happened is that there was a "War of Three VP/Eng's". There was an
unofficial VP/Eng (call him Dan) who did a really good job. However, with
_four_ product pivots in three years (the CEO was an idiot, but kept being
able to raise capital because of family connections) there were legacy and
code-quality issues. Top management threw Dan (even though he was a true 2.0+
engineer) under the bus, and the CTO pulled in a 25-year-old protege (call him
Tony) who was fucking _terrible_. He knew basic programming, but this was his
first white-collar job; story was that he'd burned out of college and was in
retail before the CTO "rediscovered" him. Tony's quite intelligent and he can
talk with the best of them, but he's technically mediocre and a genuine
psychopath.

Dan had engineer support; Tony had management support but engineers disliked
him because he was an obvious case of an incompetent protege. Tony's
"rearchitecture" fucked the company badly. I was brought in as a 3rd VP/Eng
(and actually promised the title, at the 6-month mark) to resolve it.

There were good engineers on both the old and new teams, and resolving the
technical disagreements was easy. But in March 2012 it was clear that my real
reason for being hired into the company was to give management a credible
excuse to toss away half of the old team. I either had to sign a bunch of
documentation that'd be used to justify Pincus-type moves against 11 of my
colleagues, or give up my contention for the VP/Eng role since (apparently)
that's just what managers have to do. I chose the latter-- really didn't want
to be managerial in such a toxic company-- but Tony convinced management that
if I was throwing in the towel for VP/Eng, I should be shown out entirely.

This wasn't only out of ethical altruism. I did the right thing for good
reasons, but I also had no choice. If I had signed the papers and gotten a bunch
of undeserving engineers fired (or at least put through humiliating PIPs that
would probably involve equity clawbacks) then I'd have lost credibility and Tony
would have won. Tony (angling for CTO once the existing one burned out, which he
did around the time I left) teamed up with management to set it up so that I
would be the one getting dirty in the Pincus move. He could be the good guy
CTO and I'd be the bad guy VP/Eng, the rubber glove used for evil work and
thrown away. It'd be career suicide because no engineer would work with me
after that.

It wasn't the rewrite that killed that company. The rearchitecture was a
symptom of psychopathic management and a really horrible CTO-protege-cum-
unofficial-VP-Eng. Still, it makes me extremely skeptical every time I
interview with a company and hear, "We're throwing out all the old code"
because I know how politically fucked-up that often is.

