This is madness. There's no real advice about rewrites in this article. The reality of rewrites is that they often fail, cost a ridiculous amount of time and effort, don't actually fix the original problem and/or bankrupt the company. You can't just wave away those realities with "think about the downsides".
I have participated in a few (small) rewrites that were successful, and this would be my advice: Don't rewrite an entire project at once. Pick off smaller features first, get a feel for what your new approach will be like before you commit to the big stuff. This will often mean you have to migrate from a monolithic architecture to a services architecture first. This means your project will grow in complexity first. If you're confused about why a project first grows in complexity before shrinking, watch "All the little things" by Sandi Metz, it's the greatest programming talk ever recorded. If your rewrite gets stuck after adding all of that complexity, then you've played yourself. Worst thing that can happen is you rewrite your entire project, some users or customers start using the new project, but you've failed to rewrite everything and some chunk of your users are still using the old codebase. Now you've got twice as much code to maintain.
Before you start a rewrite, consider refactoring and rearchitecting the old codebase, there might be a gem in there that just needs some love and attention.
That said, there definitely are famous successful rewrites, and I think web application backends are especially suitable for rewrites. The most famous are Twitter and LinkedIn: Twitter went from a relational database backend to a more appropriate fan-out, message-queue-based backend, allowing them to scale beyond imagination, and LinkedIn went from a big Rails monolith to super fast Node.js microservices, reducing their hardware costs a hundredfold if stories are to be believed.
I think the only way to end up with a successful re-write is to be extremely realistic with the whole organization about the scope.
For a program that took 10 years to develop, a rewrite will probably take around 10 years as well to reach feature parity. Obviously, if you stop development on your successful app for 10 years waiting for the rewrite, you will fail. If you think 10 years worth of work can now be finished in 2 years with the benefit of hindsight, you will fail.
But, if you start your new app by first attacking new markets that the old one couldn't, so that the new app is genuinely valuable in itself, and then you slowly consolidate it by adding features that the old app had, while still maintaining a crew developing the old app as needed to address the existing markets, you can justify the investment and then some. Of course, this should only be done if it has become clear that the old program is really not possible/worth it to extend and continue in the long run, which can happen for various reasons.
There will be times where you will find massive shortcuts you can take, since odds are things you had to develop from scratch 10 years ago exist pre-packaged today. You may also be able to re-use parts of the old app that were actually in good shape and take them whole. But you should never rely on these things making your work shorter - they are nice boons, but the correct estimate is still in the order of magnitude of the old app.
I'm convinced that bad rewrites happen because coders get bored and care more about the possibility of slightly better code than about the whole future of the company.
Necessary rewrites happen because they did too much YAGNI thinking in version 1.
It annoys me so much that after honing my skill as an engineer for 15 years now I still fall into the trap of not enough YAGNI thinking. There's a precarious middle ground there for sure. Even though I still fall into the trap of over designing, I believe it's way worse than under designing. I don't remember a single instance where a 'YAGNI' under designing cost us much except designing and implementing the feature the right way, a cost which we'd incur anyway. In contrast I've caused my team to work with an overly complex codebase for years on multiple occasions.
The only time I've regretted overdesigning is when there's been significant original design work involved.
I don't think I've ever regretted overdesigning to meet a common standard or match how everyone else does something.
A lot of the YAGNI I see is in the form of "We don't need a framework, just do vanilla JS" or "I only need one of the features of this library, I'll just write it myself".
Almost all of my own big tech regrets have to do with building something "innovative". Even when they've solved the problem better than existing solutions they often become regrets, just for the simple fact that they aren't better by enough to justify the nonstandardness.
Ah, I think I wouldn't qualify those as YAGNI decisions, more in the premature optimisation direction. YAGNI is preparing for future integrations that might never happen, or preparing for supporting multiple databases, special users, special circumstances. It is always a decision of doing more work now, versus more work later.
The decision to not use a framework or a library is sort of the reverse in that you're making a decision that you can later reverse by reducing the complexity of your own code.
Definitely agree with you that a part of being a very effective development team is owning as little code as possible, by using as many existing solutions as possible.
YAGNI is difficult to get right because we can't see into the future. There's certainly been plenty of times where it would have been great if something was designed from the start with the new functionality in mind instead of having to hack it in. But there's also many times where thinking that we would need something that we didn't created maintenance and performance overhead.
I have participated in a handful of rewrites. Most of them were indeed unnecessary or harmful.
However, "X2" was definitely an example of an appropriate rewrite. It had the following conditions:
* The existing code "X" was the product of an early-phase startup which hadn't yet figured out what it was going to do. "X" already existed at the point we were still asking questions like, "Who is the customer?" with possible answers like "The NSA and similar entities" even though it eventually became a consumer product - and so obviously it was very flexible but had also been greatly distorted from its original conception to produce a product we were actively selling.
* The existing code was developed and maintained by a third party team. So the knowledge of how that system worked was something we'd need to absorb anyway to bring it in-house. Existing code reflected house style of that team, and X2 would be in our style.
* It was modestly sized. I think our plan said our small team (less than half-a-dozen people, not all of them available for this full-time) would rewrite it in one calendar year, with an MVP in maybe six months and from there simultaneously copying less important features from "X" and also new want-to-have features to get the finished "X2", that's roughly the timeline we followed.
* I was pretty confident we could execute from the initial description of the work to be attempted. I went into a meeting assuming (as I always do in all-hands meetings) that I'd get fired, and came out thinking I should brush up my Java (the chosen language for "X2") but it seemed very possible.
* The non-technical leadership actually wanted a new system. "We need a new pig" is how it was explained back to me by the CEO. (From "Lipstick on a pig" the idea that superficial changes to a product can't really disguise its flaws) so there was buy-in of this idea and nobody yelling at us about why the "X2" MVP doesn't deliver most of the things "X" didn't have yet.
[This software, and its predecessor, had code names but somebody let me name the successor software and I have no imagination so although it wasn't literally "X" and "X2" that's about how similar the code names were].
May I ask you something? In your experience, how would you approach the rewriting of an application with a mix of front end and back end code, bloated code from years of patches without any planning for a maintainable design (full of unused/buggy classes/methods/features), lack of tests and barely any documentation explaining its processes?
Well, you're not asking me, but this is a forum so off I go anyway. This can only be done with serious investment, so you need to ask your management if they are serious about the long-term future of your product. If they are, there are basically two options.
The first option is to develop an API layer over your old system, that is supported by a full integration test suite, and then either build a new frontend using that API, or port the old frontend over to that new API. As this effort progresses, your integration test suite and well defined API allow you to replace/rewrite portions of your backend as well.
The second option is to look at your most valuable customers, discern what features they use and rely on, and design a fully separate clean slate implementation of only those features in a new "V2" product.
In either case, what you're describing is a very tough situation, and any solution would be very risky. Management usually is not aware of tech debt, they might not know they've accrued a million dollars or more of engineering debt that they will have to pay off to solve problems like being able to quickly respond to new market developments, retaining high quality engineers or ballooning customer dissatisfaction.
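As a rough sketch of the first option above (an API layer over the old system, backed by an integration test suite), the idea could look something like this in Python. Everything here is hypothetical and only for illustration: `legacy_get_invoice_total`, `new_invoice_total`, `InvoiceApi` and the string format of the old system are made-up stand-ins, not anything from a real codebase.

```python
from typing import List


def legacy_get_invoice_total(raw: str) -> str:
    """Hypothetical old-system function: takes "12.5;3.25" and returns "15.75"."""
    total = sum(float(part) for part in raw.split(";") if part)
    return "%.2f" % total


def new_invoice_total(amounts: List[float]) -> float:
    """Hypothetical rewritten backend with a cleaner signature."""
    return round(sum(amounts, 0.0), 2)


class InvoiceApi:
    """The stable API layer. Frontends call this, never a backend directly."""

    def __init__(self, use_new_backend: bool = False) -> None:
        self.use_new_backend = use_new_backend

    def invoice_total(self, amounts: List[float]) -> float:
        if self.use_new_backend:
            return new_invoice_total(amounts)
        # Adapt the clean API to the legacy string-based calling convention.
        raw = ";".join(str(a) for a in amounts)
        return float(legacy_get_invoice_total(raw))


def integration_suite(api: InvoiceApi) -> None:
    """The same checks run against whichever backend sits behind the API."""
    assert api.invoice_total([12.5, 3.25, 4.0]) == 19.75
    assert api.invoice_total([]) == 0.0


# Run the suite against the old backend first, then against the replacement.
integration_suite(InvoiceApi(use_new_backend=False))
integration_suite(InvoiceApi(use_new_backend=True))
```

The point of the switch in `InvoiceApi` is that one portion of the backend can be replaced at a time while the suite stays unchanged.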
Thank you for your feedback. I'm also in favour of untangling the frontend from the backend as a first step. The problem here is that the frontend needs to be rebuilt in a different language and $boss wants to keep releasing new versions during the rewrite (obviously). Porting the old frontend lets us keep releasing new versions but doubles the time for the rewrite, while building a new frontend would leave the customers without updates for a long time.
I have never done what you describe, and so I can't say "from my experience".
However, I would say that Job #1 is to identify what the end users (and the customers if the two are different) think this does. Reproducing the system they use / paid for is the important goal of the rewrite; by definition, if you can't fill that role you failed. Technical considerations (it would be nice to have Continuous Integration, it should be in C# not Java, it should run in the Cloud) are necessarily less important.
If you currently have little or no documentation, this process produces valuable documentation, and it concentrates you on what's important versus unimportant in a way that studying the existing software as a programmer does not. The trickiest code in the software could be so vital that without it your rewrite is useless, or equally it could be never used, and you can't figure out which by staring at the program's code.
Once you have that high-level picture you're in better shape to drill down, and then I think you're in a better place to figure out whether a rewrite is appropriate or whether you can refine the existing software.
Not the OP you're asking the question to, but I was in exactly his case. I hate rewrites and joined a new company (this was 5 years ago) that was failing their rewrite (6 months in, 4 "verticals" to port, only halfway through the first one). The original code was a PHP/JS mess done in the Philippines that worked but was insanely awkward to change/deploy/monitor: think all the fads you learn at uni about microservices/design patterns/latest devops but no experience to tie it all together in a logical way - which is fine, startup, successful nonetheless, VERY cheap devs yet motivated.
So, the rewrite was a failure because the new team was smug, very smug. They were in Hong Kong, elites of Asia, trying to rework the "shit" done by the remote guys, and so they... well they over-engineered a monster from scratch in Grails (argl, I nearly died, having joined to do Java), and were hitting the limits of their Grails abilities (no tools, no debugger, no experience, no support, a new version just out the month before they started that changed all documented paradigms, an architect who only wanted Grails and nothing else and didn't care whether the product did anything or not).
So, here's what we did to eventually move the mess to production (at this point we have a dying Filipino software we can't maintain, because we fired all the devs there prematurely ofc, and a 25% done product in a state that's going to the wall even faster than the old one):
- Unsmug the team: teach them Grails (sadly, I was... the most experienced grails developer myself, against my will, having been a victim of it in all my previous jobs), isolate the most damaging devs
- Re-hire the PH team and move them to Hong Kong to fully maintain the old software, for 2 years, while they also learned Grails and moved to the new system with their precious new skill (they never thanked us, weird)
- Learn the old system: respect its decision algorithms that were not simple, understand why it was slow/ugly/messy whatever, rather than say it's because the devs were inferior: we slowly realized they were not
- Slowly move everything to the simplest possible technology rather than beautiful one-pager bullshit like Grails or Angular or whatever.
So my advice to you is: embrace the fact you will not do better. You will do worse or the same. Rewrite for a business reason, and if possible in smaller parts. Do not be so smug as to say "bloated code" or "without any planning for a maintainable design". All it takes is 2 months of money-making pressure to entirely destroy any code base, and guess what, people will want your software to make money, not just be beautiful.
I'm not trying to disrespect anyone's work. I'm just describing a situation. There are smart pieces of code there that solved the problems that were presented at each time, with more or less time to work on them, but they are poorly glued together. Similar problems were solved in completely different ways, which makes the application less homogeneous and requires extra effort to understand what's happening in each place. Lack of refactoring and tests left a lot of unused/unstable code that is only noticed when accessing specific areas/features of the application.
I understand the effects of time constraints and respect the years of dedication the developers gave to the application, but that doesn't make the situation any prettier.
> The reality of rewrites is that they often fail, cost a ridiculous amount of time and effort, don't actually fix the original problem and/or bankrupt the company.
The reality of rewrites is that they're often blamed for all of that, while the issue is not the rewrite. The issue is that companies that write bad software will, if given an occasion, write bad software again.
The issue with rewrites is actually that they're not large enough in scope. When you rewrite software, you should probably also rewrite contributors' interactions, the way the company handles power dynamics and, occasionally, the way accounting works at that company. Short of systemic changes, rewrites are likely to fail. But again, the company is also likely to fail at other initiatives as well.
Rewrites that fail in an isolated way are few and far between.
> Before you start a rewrite, consider refactoring and rearchitecting the old codebase, there might be a gem in there that just needs some love and attention.
OR it might be a dumpster fire that drives your engineers away, one that is fueled by unreasonable deadlines and expectations. Also, good luck refactoring your codebase at a snail's pace with noncommittal approval, on your own unpaid overtime.
I just started writing up a formal process on how to improve the small bits gradually. It's actually way more complicated than you think at first... Presenting a user problem, identifying root causes, proposing a solution, mapping out the individual components, listing all of the steps needed to complete a solution, working on each step, and validating each resulted in an improvement based on measurements. There's all these little details to consider to get each one of those right. Maybe this is all well described in some software project management book, but I've not seen it.
That's why proper modules and interfaces matter (and tests). With those you can swap one bit for another implementation with less risk of failure. It's a fine-grained migration path in a way.
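To make that concrete, here is a minimal sketch of swapping one implementation for another behind an interface, with one shared test both must pass. The names (`SessionStore`, `LegacyListStore`, `DictStore`, `contract_violations`, `greet`) are invented for illustration and assume nothing about any real project.

```python
from typing import Dict, List, Optional, Protocol, Tuple


class SessionStore(Protocol):
    """The interface callers depend on (hypothetical)."""

    def put(self, key: str, value: str) -> None: ...
    def get(self, key: str) -> Optional[str]: ...


class LegacyListStore:
    """Stand-in for the old implementation: linear scan over (key, value) pairs."""

    def __init__(self) -> None:
        self._pairs: List[Tuple[str, str]] = []

    def put(self, key: str, value: str) -> None:
        # Drop any previous entry for the key, then append the new one.
        self._pairs = [(k, v) for k, v in self._pairs if k != key]
        self._pairs.append((key, value))

    def get(self, key: str) -> Optional[str]:
        for k, v in self._pairs:
            if k == key:
                return v
        return None


class DictStore:
    """Stand-in for the replacement implementation."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)


def contract_violations(store: SessionStore) -> List[str]:
    """Shared test: returns a list of problems, empty if the store behaves."""
    problems: List[str] = []
    if store.get("missing") is not None:
        problems.append("unknown key should return None")
    store.put("a", "1")
    if store.get("a") != "1":
        problems.append("get should return what put stored")
    store.put("a", "2")
    if store.get("a") != "2":
        problems.append("put should overwrite an existing key")
    return problems


def greet(store: SessionStore, user: str) -> str:
    """Caller code only knows the interface, so either store can be passed in."""
    store.put("last_user", user)
    return f"hello {store.get('last_user')}"


assert contract_violations(LegacyListStore()) == []
assert contract_violations(DictStore()) == []
```

Because `greet` depends only on `SessionStore`, moving from one store to the other is a one-line change at the call site, and the shared check catches behavioural drift before users do.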
> The reality of rewrites is that they often fail, cost a ridiculous amount of time and effort, don't actually fix the original problem and/or bankrupt the company.
Citation needed.
All of this is based on random internet wisdom (e.g. Joel's condemnation, based on a different era, different delivery model, and different concerns than most of today's startup and enterprise code) and anecdotal evidence.
Well, talking of anecdotal evidence, I've seen lots of successful rewrites that made the thing faster and better.
Heck, most/all of the FAANG stacks are rewrites. Twitter, for example, as you've mentioned yourself, is not the same program that needed a restart every day to avoid a memory leak back in the 00s - they rewrote it. Reddit is not the original Lisp, and so on.
It's also about the scope. You immediately understood "let's rewrite the whole company code" and advised "don't rewrite the whole project". Well, who said it's "a" project to begin with?
Companies more often than not have tons of different services, and rewrites can be easily done for such isolated services (e.g. like the dl.google.com rewrite into Go).
Tons of core services of YouTube, CloudFlare, Google, Facebook, and tons of others have been rewritten in languages such as Go, Rust, Hack, etc., and they have written about the experience and results as a success. And it's not like survivorship bias, i.e. that they keep mum about failed cases, because we hear about failures all the time (just not concerning rewrites).
I agree, not sure why you are downvoted because you make valid points.
I can't think of any big public disastrous rewrites except for maybe frontend ones like Digg or enterprise modernizations for example when the government wanted to get rid of their 60s mainframes and the effort cost billions and was ultimately unsuccessful.
I'm generally against rewrites as they're often shortcuts for new developers that instantly reject a code base as 'too complicated' without understanding it.
However, I have had success in rewriting my own code base a number of times. This is a unique situation of a one-man-band that understands the business case, handles support, and every bit that went into every decision of the code base.
With this kind of clarity, when you've spent years with the problem domain, you can rewrite a project to get at the real business case: with years of experience, you now see what your customers actually wanted you to solve in the first place, or maybe what they evolved to want in the end.
Jim Keller advocates rewrites in a similar way as the only way to progress in chip design.
I think we could rewrite very basic things in exciting ways with this sort of attitude. For instance, we know a lot more about what we need in a desktop OS, an email client, a search engine, etc. Basic things that have gotten a lot of cruft over the years as they evolved. Taking a fresh look could be rewarding.
I guess one could reframe this as "first principles thinking" but with the caveat that the problem needs to be truly understood.
> “If I replace all of this Python code with Go it’ll be SO much faster!”
I write a lot of python, and the irony of this comment is that it’s probably true.
Worth the cost and time and effort? /shrug
Easier and faster to keep implementing features in? /shrug
…but out and out faster to build, deploy and run?
Yep, probably.
In my experience, when Python applications are slow it’s usually because your user logic is heavily implemented in Python, and despite alllll the hand waving, that is, in general, still slow. The package manager is slow and broken. It’s painful to deploy.
You can probably solve the problem in other ways, by picking parts to move to another language that provides an easy way to expose python bindings… but you know.
Making it faster by rewriting in Go will probably work, if your only goal is “runs faster”.
I think a lot of rewrites happen because new people come along and they are not interested in reading and learning existing code. They just want to write new code and in their mind "the way I do it is better". A prime example is the Javascript ecosystem with such a high churn rate it caused a new kind of burnout: Javascript fatigue. Programmers seem to forget when you throw away code, you're not just throwing away text, you are throwing away years of accrued knowledge.
I've met my share of engineers who wanted to rewrite everything from scratch into microservices, and the reason they gave was ‘well microservices scale and monoliths don't’ - without taking into account the maturity of the existing application, the scale it already operates at and the scale we expect to operate at, the team resources and skills required to support the rewrite, and the business requirements and goals.
I wonder if any companies purposely avoid hiring any dedicated professional developers, and have everything done by other professionals who happen to know some code, and how it works.
Developers always come with their own perverse incentive. They like the new, the clever, the smart, and the simple, and rarely appreciate the battle tested and extensible.
Having seen several projects written by people who "happen to know some code," I can tell you there's a good reason this hasn't caught on.
These sorts of people can work the very foundations of the company into an ungodly quagmire that would send you screaming back to the fad-chaser dev in a heartbeat.
Also, despite the meme, most mid-senior devs are actually keen on making their contributions valuable to the business, having gotten over this phase of fascination with new tools.
Mid senior devs do seem to be pretty good. Entry level devs who think they're senior cause most of the trouble....
I've seen some small projects by non-coders that are better than some real devs, because they can be really aggressive about avoiding original code. It might be hacky... but they find ways to leverage stuff that already exists.
> Good luck with your rewrite! I’m sure it’ll be everything you hope it’ll be. It’ll probably be finished way quicker than you expect, and life will be perfect afterwards.
The author seems to be generally pro rewrite in his article so I can't tell if this is sarcasm here.
A co-worker was adamant we needed to spend six months to re-write the application we were responsible for. As near as I could tell, his reasons were the following:
- The application was complicated.
- The application was written in Ruby, and he didn't like Ruby.
- Golang was way cooler.
Fortunately our director pointed out that there was no customer value to a rewrite, so the idea was shelved.
A classic Joel Spolsky post that completely failed to take into account the fact that Netscape rewrote itself into Mozilla, which then went on to eat IE5's lunch.
I'm a big fan (and user) of Mozilla but this take seems wrong to my memory.
Checking Wikipedia, the IE5 timeframe was them reaching their peak before stalling on the infamous IE6, and Firefox (sort of a rewrite of the rewrite) starting to make some small headway as they let IE stagnate.
i think spolsky is a bit too dogmatic about this. gentle reminder: they wrote a transpiler for a subset of vbscript so they could port their bugtracker to linux.
Very good post, that holds a lot of experience. Unfortunately these things have to blow up in your own face, often more than once, for a lesson learned.
> It’s a bit smarmy of me to criticize them for waiting so long between releases. They didn’t do it on purpose, now, did they?
Well, yes. They did. They did it by making the single worst strategic mistake that any software company can make:
A few of us inherited two code bases from a previous team, both for what are now products but were essentially prototypes.
Every release includes new things and rewritten things. If we add new on old, we grow technical debt; if we rewrite too much, fixes and features are delayed too long.
It’s a tough balance. Our code and issues are full of comments re how things should be done, with cross references we added as we figured out what things need to change together.
As Joel Spolsky notes in the twice-linked post, reading code is harder than writing.
Just today I eliminated hundreds of lines of code after writing maybe two dozen over the last few days.
The writing and deletion took no time at all. The reading and forensic mental compilation and execution took days and days.
That’s our golden rule: change nothing you do not completely understand - and you only really understand it if you can explain it in plain English to someone who doesn’t know the code.
The coolest thing is that as we go along doing cool things becomes easier and doing wicked cool things becomes possible.
Just this morning we had to force ourselves to remember that entire blocks of code with poor reporting of errors are as they are because until changes we made elsewhere just weeks ago, there was no point in doing better: the better reports had no place to go.
But after a major refactor initiated primarily to make maintenance and addition easier, we can now do things with dramatic - and wholly positive - UI/UX impacts that were simply impossible before.
I feel there are certain dynamics to the game. Back in the 90s rewrites were everywhere, and they failed a lot because people were doing them so recklessly.
Nowadays, people are usually too afraid to even talk about it. As a result, many things that could be straightforwardly solved by rewrites are delayed to the point of being virtually impossible, causing eternal suffering and leaving valuable features blocked forever.
sometimes repair and retrofit is actually a bigger job than a ground up rebuild. depending on the state of things, this may not be discovered until time and political capital has already been committed to the first approach. i think the key is to avoid dogma and remain agile, not in the buzzword sense, but in the literal sense. be willing and able to scrap approaches and reverse course if things start looking worse than anticipated.
If the current code is crap there is a reason it ended up like that, and the same forces will be in effect to cause the rewrite to end up in the same state.
Perform a root cause analysis for why the current code is in a bad state and why it doesn’t improve over time.
> Someone is likely to point out all the problems your rewrite will bring. You’ll find it much easier to convince people if you’ve already thought these through, and you bring them to the conversation yourself.
I think this is how it should be in a more perfect world.
and the purest implementation of an iterative development process.
The first implementation you write is and will always be a (throw away) prototype.
You learn a lot. You understand much more of the spec.
Maybe users got a go. Maybe it's just internal.
Now you write the first version.
It is better.
You understand even more.
Writing a proper prototype first used to be a thing.
As we all know once a prototype was getting close to finished
the budget changed and bam your prototype is in production.
It is not all that practical with huge applications.
Perhaps there is a lesson in that.
I knew about this being a bad idea, and I still fell for it. I bet on a big rewrite to avoid cleaning up technical debt. It did not work out well. I'm currently cleaning up the technical debt.
Here's the only way you should ever rewrite something:
1) Document exactly what the existing thing does.
2) Write high-level tests for this functionality.
3) Make sure there are no features that will need to be added to it. Make sure everyone is onboard with not adding new features to it. Make sure people understand that adding features to an in-progress rewrite jeopardizes the entire thing.
4) Rewrite it in the original language, reusing those high-level tests you wrote in step 2.
5) Let people know that they can add features again.
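Steps 1, 2 and 4 above can be sketched roughly as follows: record what the existing thing actually does, turn that record into high-level tests, and reuse those same tests on the rewrite. This is only an illustration under made-up assumptions; `legacy_slugify`, `rewritten_slugify`, `CASES` and the helper names are hypothetical, and a real system would need far more cases.

```python
import re
from typing import Callable, Dict, List


def legacy_slugify(title: str) -> str:
    """Hypothetical existing code whose behaviour we want to preserve."""
    out = ""
    for ch in title.lower():
        if ch.isalnum():
            out += ch
        elif out and out[-1] != "-":
            out += "-"
    return out.strip("-")


def rewritten_slugify(title: str) -> str:
    """Hypothetical rewrite, same language, aiming for identical behaviour."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


# Step 1: document what the existing thing does by recording its outputs.
CASES: List[str] = ["Hello, World!", "  spaces  ", "already-slugged", ""]


def record_golden(impl: Callable[[str], str], cases: List[str]) -> Dict[str, str]:
    return {case: impl(case) for case in cases}


# Step 2: a high-level test that compares any implementation to the record.
def failures(impl: Callable[[str], str], golden: Dict[str, str]) -> List[str]:
    return [case for case, expected in golden.items() if impl(case) != expected]


golden = record_golden(legacy_slugify, CASES)

# Step 4: reuse the same test on the rewrite; an empty list means parity
# on the recorded cases (and only on those cases).
assert failures(rewritten_slugify, golden) == []
```

Note that the two functions above would disagree on non-ASCII letters, which the recorded cases do not cover. That is the kind of gap step 1 is meant to surface before the rewrite ships.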
Rewrites (and other forms of greenfield development) are an important part of my work environment and compensation. Any equation on rewrite vs. maintain that fails to account for what the company and its employees want to be doing will miss an important factor. The ROI of a rewrite isn’t only in new features or performance or subscribers, but in staff retention and overall happiness.
Depends on the market for the software. In my personal experience, those proposing rewrites are usually tainted with NIH syndrome. Different strokes for different folks, but refactoring to modularize for targeted rewriting is more viable. New code means new maintenance headaches and probably continued maintenance on the legacy codebase.