While some of the issues could be chalked up to "not doing it right," at its core, the process of strangulation described in the article leaves the overall architecture in a much more confusing state for the lifetime of the project. And if you have to shift course, you've created vastly more tech debt than you had with the original service, because you now have a distributed systems problem. Unless you can execute on it quickly, I think it's a very dangerous way to fix tech debt: it avoids fixing the core issues and instead plans for a happy path where you can just replace everything.
If you absolutely think you need to quarantine the existing code, I'd recommend putting a dedicated proxy in place that routes either to the old service or the new service, and not mixing the proxy and the new code. That separation of concerns makes it much easier to debug, and vastly reduces the likelihood of creating a system of distributed spaghetti. What I’d really recommend, though, is understanding the core codebase that powers the business and making iterative improvements there, rather than throwing it all out.
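A minimal sketch of what I mean by a dedicated proxy, where the path prefixes and backend addresses are hypothetical placeholders:

```python
# A dedicated proxy whose only job is the old-vs-new routing decision.
# Prefixes and backend addresses below are hypothetical examples.

MIGRATED_PREFIXES = ("/billing", "/reports")  # endpoints the new service owns

OLD_BACKEND = "http://legacy.internal:8080"
NEW_BACKEND = "http://rewrite.internal:9090"

def route(path: str) -> str:
    """Return the backend that should serve this request path."""
    if path.startswith(MIGRATED_PREFIXES):
        return NEW_BACKEND
    return OLD_BACKEND
```

As endpoints migrate, their prefixes move into the routing table; neither service has to know the other exists, which is exactly the separation that keeps debugging sane.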
If you design a long-lived code cleanup project where the only option is "we can't stop this project for another 6 months or we'll have an even bigger mess on our hands," that's poor project design. In this way, rewrites are almost safer. They're usually "atomic," and you can "roll back" by throwing one out if you can't stay focused on it.
All this is based on dynamic business and product needs that shift every couple of quarters, and a technical team that's gaining new people and shedding more experienced folks every couple of years. That seems a pretty standard expectation to me, though I'll admit I'm mostly looking at this through the lens of early, mid, and late stage startups. It may be that big companies, which move slowly and can guarantee resources for a larger project for a long time, are a different story.
What do we do?
Certainly, there are times where that's not reasonable. Maybe the core codebase is on a proprietary technology. Maybe it's built around an EoL framework. Maybe it's built on a FOTM (flavor-of-the-month) language that you can't hire for at all anymore.
In those cases, which are rarer than developers like to admit, I think a piecemeal migration makes sense. In my experience, it's better to do it by altering the consumers. Whether it's a frontend that can point to another endpoint, an API gateway that can switch out the service it's pointed to, or a shim layer (a proxy), any of these can serve that purpose.
Having a service that is both the proxy and the new application is a poor separation of concerns, is very difficult to reason about, and makes it too easy to intermingle the logic between the old and the new, in my experience.
For example, refactoring "to provide abstraction so future work is easier" is 90% of the time an error.
The end result was me losing my "mojo" after a year of producing effectively nothing. I'm back together now but my god it's a terrifying feeling when you've been programming for over ten years and one day you can't make the code flow out of your hands anymore.
I must have missed that in the first read. Yes, do not do this.
If anything survives the hype cycle for microservices, I hope it’s this: if your company isn’t surviving more than a couple of technology cycles, especially when they seem to be getting faster, then what are you doing this for?
Stop trying to replace one system with another one. Yours will be old and busted someday too. And certainly don’t let one system be the gateway to the other. Put them both behind a thin layer that handles only a couple of concerns (say, auth, making sure all requests have a correlation ID, perf statistics, maybe pick 2).
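As a sketch, that thin layer is a few lines of middleware; the request/response shapes here are hypothetical stand-ins:

```python
import uuid

def thin_gateway(handler):
    """Wrap a backend handler with the layer's only two concerns:
    an auth check and a correlation ID. Nothing else lives here."""
    def wrapped(request):
        if "Authorization" not in request["headers"]:
            return {"status": 401, "body": "unauthorized"}
        # Tag every request so it can be traced across old and new systems.
        request["headers"].setdefault("X-Correlation-ID", str(uuid.uuid4()))
        return handler(request)
    return wrapped
```

Both the old and the new system sit behind the same wrapper; neither one is the gateway to the other.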
What we are both suggesting, I have still always called the strangler pattern. It just has three actors instead of two.
That makes sense, in practice I've seen multiple situations where it was one "monolith" directly in front of another, and rather than a strangler, it was a tumor that ended up with things intertwined. Treating it like a service oriented architecture is a good way to think about it.
You still end up in situations where logic needs to be shared. Do you duplicate Auth? What happens when a new user signs up, do you sync users over? Do they read from the same database(s)? I'll also agree about the over-hypedness of microservices, but I think breaking out services and "rewriting" them is, to me, the way to achieve the overall goal. Even in the strangler pattern, you'll end up needing to share functionality, and either you have two monoliths calling into each other, or a service-oriented architecture.
Authorization gets duplicated, but that’s going to happen at some point anyway (and how many times does it get duplicated in a microservice architecture?)
I’ve seen the tumor thing too. A few of the most egregious services (memory hungry, least coherent to the whole, etc) get split off to be independent, but lots of things do not. And sometimes the old team sheds people due to it not being the cool thing anymore and then you’re really fucked.
[edit to add] I have developed some very serious resistance to the 'low hanging fruit' model of development. I think it is the direct cause of the Lava Flow Antipattern.
If the point is to completely remove a problem from the system, starting with the most popular one and working toward the more boring ones means that at some point you will lose the argument in a prioritization meeting, the lava will cool and you'll be stuck with both. I would expect a higher likelihood of success if you start at one 'end' of the system and guarantee that high priority examples are evenly distributed throughout the effort.
At some point you have to ask how the company let things get this bad, and what they've done concretely to avoid it happening again. And whether you really want to participate in the heroic levels of effort it's going to take to keep the patient alive in 4 years.
Louis C.K has a joke about how his ankle was bothering him and the doctor told him, "Well, it does that now". That's it? "It does that now?!" Monologue monologue. The first time I laughed along with his joke. On subsequent viewings I said, "Hold on..."
Louis is a seriously sedentary guy. He's been letting things go even more than I have, which is saying something. But I do exercise. If I get to pick the feat of strength, I could make the leaderboard. When I bunged up my hip I got PT, and I went, and I could do the exercises. I didn't do all of them though, so it still bugs me from time to time (also on the last day the PT did something to my good knee, which is still bothering me a year later). If I were in proper athlete shape, they might offer more, because my prognosis for recovery from something more aggressive would be high.
Nobody is going to look at Louis, or listen to Louis, and offer that. Louis is not going to do any of that shit. He's just going to crack jokes about it, or about you... as is amply evidenced by that set. So yeah, your ankle does that now, sport. (BTW, my ankle is fucked up too, and I do the exercises he pantomimed. They do actually help quite a bit, which he would know if he had tried them even twice)
And yet we do it at work all the time. And I am seriously beginning to wonder what would happen if we just triaged these situations and walked away from some of them. If we just let the saner competitor win.
Nothing is perfect anywhere, sure, but there are degrees.
You get the spaghetti when you take your eye off the ball. When you start making excuses for why we are going to do dumb things just “for now”. And by the time you notice the three year old comments that say “this is temporary”, well, your processes are set. This is the culture, and if you make a stink, people are thinking more about “why now” than “why not”.
The only way I know to get out of that is starting the conversation early about how we can do the things we always do faster and more accurately, and how we can do things we’ve never tried before. But that tree takes a while to bear fruit. And it requires coworkers who are willing to try something new, because for sure as shit nobody has shown them how to do this before.
I read it like that at first. Think about it, it's also very true.
We should probably get that checked out.
Because what I've also noticed is true with respect to both of those phrases is...
It. Never. Actually. Does.
There are only so many people you get to teach in your lifetime, and only so many projects. If you only take the easy ones you never grow and you will reach fewer people. But if you take every hard luck case you also won't get far either.
A working relationship can be broken the way a married couple can be wrong for each other. Sometimes divorce is the correct course of action.
It's not enough to be right, you have to be productive too. And sometimes you only get both if you move toward people who are closer to agreeing with you.
Devs: Oh yeah... It does that now.
In the words of Monty Python, it was a dead parrot, and I came in to say "It's just resting... It's pining for the fjords!"
I'm almost a month in on a complete re-architect from scratch. Most customers are overjoyed that the business will be back soon, but I'm afraid their good will is going to be short lived if things aren't running soon.
I am sitting on a codebase whose oldest line of code is about 20+ years old and has evolved successfully in that time, such that the product it was 20 years ago and today's product are unrecognizable from one another. Its database schema is even older, encoding decisions made 30 years ago. Reasonable decisions at the time, but no longer reasonable.
I am working on a major rewrite which will take a year. The nature of the changes required means I cannot break it up into pieces and do it bit by bit, as I have been doing for the last couple of decades with major enhancements. A functioning product is all or nothing. As someone who is anti-rewrite, pro-evolve, and accustomed to working on old codebases, and with it being my own business so it's my own money on the line, the decision to embark on a 12-month rewrite is not taken lightly.
It is a refactor, but a refactor that is going to take about a year to complete, with nothing to show for it until it is finished. Of the old codebase, probably about 20% to 30% will be ripped out and replaced.
It's not my first major project in terms of keeping this codebase capable. I previously wrote compilers/runtimes and IDE modules to keep it going: to take a codebase written in a commercial/proprietary dev environment and bring it into something I have full control over. That effort took a year.
The difference in this case is that I am deliberately, intentionally throwing code away, a lot of it, for the first time and contrary to my instincts. Still, a post mortem might be interesting. My successful effort to code my way out of a proprietary dev environment was an interesting and risky project, discussed at length with the relevant dev communities at the time, and probably worth writing up one day.
With regard to your own project, I find it somewhat hard to believe that you can rewrite a 20-year-old code base in just one year's time. Assuming an output of 50 LOC a day and 250 work days a year (50 × 250 = 12,500 LOC/year), that means the system you are replacing is under 15k LOC.
1. Strangler pattern - which allows you to continually advance functionality, but will take longer and requires delicate care.
2. Complete rewrite - the only successful way to do this is to code freeze your legacy app. It's risky because it means keeping the product features stale for a while, but less cumbersome to advance once it's done.
Either way is risky. Choose your poison.
When you accept that the old code is expensive but not actually impossible to modify, you can do an incremental rewrite without Strangler-style proxying. You don't even need to prioritize replacement of the user-facing components; you rewrite only, or preferentially, as necessary to make user-visible feature improvements or bug fixes, unlike the Strangler Pattern’s preference for no-visible-effect initial replacements. (I call this “standard Ship of Theseus replacement.”)
This typically is even better for continuing to deploy new features than Strangler, but it does leave you with a system without a single clean boundary layer between new-style components (the replacement/proxy system in Strangler) and legacy. Instead you have a system with mixed islands of legacy and new components (possibly multiple styles of legacy, if a third set of standards is adopted before replacement completes). So, again, it has its own risks/costs, but it's a third option.
You mean the one for the legacy? (Which I wouldn't call the framework of choice, since it's inherited, the new one is chosen...)
Sure, it requires that the legacy code is, at least internally, supported or at least supportable, and sometimes that's not the case. Though on larger production systems, letting them get to a completely unsupported state is rare, and those are the systems where the choice between big-bang rewrite, strangler, and a more free incremental replacement is most consequential, IMO. If nothing other than a big-bang rewrite is possible, there's no choice, much less a consequential one.
Example - you wrote your app in a custom PHP using PHP 5.6 and you're doing a rewrite in PHP v7 in Laravel.
That's often the case, but the app itself is usually actively maintained (if lightly, sometimes, because the cost of change is high and there is a lack of skilled maintenance personnel) even if the underlying platform is outdated and possibly even out of support (and in some cases long abandoned by the vendor).
> That's my point, you can't incrementally rewrite that because you'd be rebuilding on a burning platform.
You absolutely can maintain software on an outdated and, often, unsupported platform (the latter can sometimes be a licensing problem, as you may not continue to have the legal right to use it.)
Huh? I mean, yes, you can technically continue to use an outdated framework, but what do you do when a CVE is identified and someone exploits it? Do you just sit on your hands and wait until you can upgrade the whole framework months later?
There are explicit risks associated with continued use of an outdated and unsupported framework, so why anyone would continue to build on a burning platform is beyond me...
Technically, you can do a big bang rewrite, but they fail at a very high rate on significant systems, and you're still using the unsupported system while you do the big bang rewrite, and quite possibly after it fails, so even in the best case you haven't eliminated the problem you point to with an incremental rewrite. So the basic problem exists either way. Incremental rewrites (strangler or otherwise) that prioritize the highest-risk components for earliest replacement are one plausible risk-mitigation strategy, but the right choice is going to depend on project details. When you eliminate real options because of fake "you can't do that" considerations, you increase the risk of choosing a suboptimal approach because you preemptively discarded the least-bad solution.
You keep using this term and I don't think you know what it means. The Strangler pattern IS an incremental rewrite. You've suggested that it's possible to introduce incremental rewrites to old code and all I was clarifying is that by refactoring code written in an old framework (e.g. PHP 5.6) does nothing to eliminate technical debt (adds to it, in fact). The Strangler pattern is most often used when you want to switch languages (e.g. Java -> Rails) or when an older paradigm doesn't have a straight migration path (e.g. WebForms -> .NET MVC). The new code using the new framework essentially strangles the old code.
For the record, I've advised over 100 software companies, of which I'd say about 60%, plus or minus, are experiencing some type of major rewrite, and of those, 9/10 are because they simply can't upgrade an outdated/unsupported/poorly architected software framework. Trying to refactor an unsupported framework is simply not an option. You (1) either migrate it to the latest (if possible) and refactor over time, (2) strangle it with the new framework, or (3) rewrite it. That's it. Every other topic discussed is simply one of those, semantically wrapped in some engineering jargon or nuance.
This is all consistent with the examples here: https://paulhammant.com/2013/07/14/legacy-application-strang...
- C++ -> Java Spring
- Powerbuilder/Sybase -> Swing
- VB6 -> .NET
- Java -> Rails
- Java/Swing -> Rails
The real world is FULL of these. I've seen many of them first hand. The rewrites you hear about in the "SV world" are either superfluous CTOs who are misguided into thinking they need to, for example, rewrite their Rails 4.2 app in Node because they think it'll get them more users, or represent real engineering feats that truly "blitzscale" startups entertain to maintain business continuity (Twitter's migration from Rails to Scala comes to mind).
I'm pretty sure I do. But I'm also pretty sure you don't know what the term “Strangler Pattern” means (specifically, that you think it is equivalent to “incremental rewrite” rather than one specific approach to incremental rewrite.)
> The Strangler pattern IS an incremental rewrite.
Yes, but not all incremental rewrites are the Strangler Pattern; that's why my first post in this subthread points out that the choice isn't exclusively between Strangler and big bang rewrite, because incremental rewrites are possible without the Strangler Pattern. In fact, I also discussed the specific differences that can arise between Strangler and non-Strangler incremental rewrites.
> and all I was clarifying is that by refactoring code written in an old framework (e.g. PHP 5.6) does nothing to eliminate technical debt
That's not at all what you said, though it's possibly what you meant, if you were writing very imprecisely. If it is, though, it's odd to the point of non sequitur as a response to anything I've written, because I never suggested refactoring code while retaining an old framework. I suggested an incremental rewrite similar to what is done in Strangler, but without (1) implementing a new-system proxy as a first (or, potentially, any) step, or (2) adopting an “old code is deleted but never modified in the course of the transition” rule.
> The Strangler pattern is most often used when you want to switch languages (e.g. Java -> Rails) or when an older paradigm doesn't have a straight migration path (e.g. WebForms -> .NET MVC).
Yes, though there is no particular reason that either of those cases require Strangler for incremental replacement.
> For the record, I've advised over 100+ software companies
Good for you, but that's not at all relevant to the discussion.
> You (1) either migrate it to the latest (if possible) and refactor over time, (2) strangle it with the new framework, or (3) rewrite it. That's it.
No, it's not, unless you are using “strangle” much more broadly than the Strangler Pattern, which isn't just incremental replacement but a particular strategy for incremental replacement, characterized most notably by placing a request-intercepting facade in front of the old system.
So why would I use the word "an" as in, "The Strangler Pattern is an incremental rewrite", as opposed to "the"?
> That's not at all what you said, though it's possibly what you meant,
Weird, the following was my first response to you. Shrug...
>Incremental rewrites assume that your framework of choice is still supported. In my experience, I would say 75% of the time someone is considering a rewrite, it's because their framework of choice is out of date, which makes it impossible to modify old code.
FYI - My choice of the word "impossible" was poorly chosen. It's not impossible, it's just stupid.
The original parent was basically asking "if we can't do strangler or big bang for a legacy app what else is there"? You suggested that incremental rewriting is a 3rd option and clarified that a Strangler Pattern is a subset of an incremental rewrite, which I agree. You implied that this meant continual use of an unsupported framework, which I sought to clarify and advise against.
> No, it's not, unless you are using “strangle” much more broadly than the Strangler Pattern, which isn't just an incremental replacement by a particular strategy for incremental replacement
I am indeed.
> characterized most notably by placing a request-intercepting facade in front of the old system.
This is incorrect. The Strangler pattern doesn't necessarily mean you strictly write a facade. Furthermore, like all design patterns, they're up for interpretation. What determines the difference between a router, an adapter, a proxy, and a facade? All can technically be used to intercept incoming requests.
Fowler, who popularized the Strangler concept, says:
An alternative route is to gradually create a new system around the edges of the old, letting it grow slowly over several years until the old system is strangled.
AND then says
In particular I've noticed a couple of basic strategies that work well.
Which implies that there are various strategies (not just one) that are enacted under the term Strangler Pattern. Hence my previous comment "alternatives discussed here are simply subject to semantics and nuance". You could certainly use Adapters, Routers, Decorators, Proxies, Bridges, etc. for design patterns used in an overall Strangler strategy.
I think we're actually saying the same things; you just seem to prefer to be overly and unnecessarily pedantic about your use of the terms Strangler and Facade.
 - https://martinfowler.com/bliki/StranglerFigApplication.html
- Compilers/IDEs that only exist on single machines (expensive mainframes, single user VMs, SSH hosts)
- Compilers/IDEs that must be air-gapped for any security reason (including CVEs and expired support, not just PII/PCI/confidential-quarantine reasons)
Things start to feel impossible when you have to use a Windows XP VM because it was the last OS on which Microsoft officially supported the VB6 IDE, who knows how to install and license the ActiveX controls anymore for development (some of the vendors don't even exist anymore to ask), and the VM is intentionally air-gapped/quarantined from internet traffic because it is an XP VM, so there are now some fun extra steps with virtual folder redirects and worktree-less clones as remotes to get git changes in/out. (If your VM stopped supporting basic copy/paste from the host machine, you feel you might go mad. Trying to grep VB6 code in VSCode to make navigation feel somewhat more sane due to scroll delays in the VM feels like its own growing madness, as VSCode believes you to be insane for using such an arcane language that looks but does not smell like VB.NET.)
Uh, not that that resembles my current situation or anything.
There are similar horror stories from say COBOL devs with airgapped mainframes.
Technically speaking, you can build software on an outdated and unsupported framework, sure. But why on earth would you, for example, continue building your software on PHP 5.6 (EOL Jan 2019) when, anytime a CVE is identified, you're basically SOL when it comes to patching it?
But once it's too late, no migration strategy rids you of the old codebase and its environment now. With the full rewrite, you need to maintain and run it until that's done. With the strangler, the old code still runs until it is replaced.
New customers were pushed to the new product. Existing ones were encouraged to do so and to temporarily live without prior features (usually with temp workers doing things manually) for a deep discount. Those who had to stay with the legacy system were told to expect nothing but bug fixes and compliance-related updates (for federal programs and reporting requirements), and that if they needed anything more than that, they'd either need to build their own bolt-on (there was a robust, if clunky, SDK) or pay contractors to do so.
It sucked, yeah, but it seemed like a reasonable way to go about such a transition that was always going to make people unhappy.
1. Do it sooner
2. Get full commitment from stakeholders
3. Agree on feature freeze
4. Get it done quickly
5. Don't over promise, esp about the timeline
6. Focus on delivering big/important items first (MVP)
7. Appoint a benevolent dictator, don't assemble a committee to avoid second-system syndrome
8. Have test scenarios ready (black box)
I will write a blog post when it's done successfully; otherwise I will hide under a rock.
It is also difficult to apply if we are not talking about a server/client app but a desktop app being rewritten in a different language or an incompatible GUI toolkit.
I've successfully strangled a large codebase that had these issues, though we did have the benefit of a client/server application so there was a place to actually define interfaces.
We started in the middle by creating a logical service layer to group all the bits of like functionality. We left the implementations alone, just moved them to align with the new "service" layer. We slowly worked our way up the stack, including defining a new client API, and then changed the existing API methods to be a shim on top of the new methods.
We were then able to update client code to use the new interface, but the old ones stuck around for about 24 months while we sunsetted older clients. The actual strangulation took about 2-3x as long as a stop-and-rewrite effort would have, but there were VERY few regressions, because we were still in a constant test-and-release cycle and managed the scope of strangulation changes in each release, AND all of our testing was still valid since we weren't changing inputs/outputs or any expected behaviors.
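A sketch of that shim arrangement, with hypothetical method and type names: the old API keeps its method name and return shape but delegates everything to the new service layer.

```python
class UserService:
    """New service layer; the real implementation lives here now."""
    def fetch_user(self, user_id: int) -> dict:
        return {"id": user_id, "name": "example"}  # stand-in data

class LegacyApi:
    """Old client-facing API, reduced to a shim over the new layer."""
    def __init__(self, service: UserService):
        self._service = service

    def getUserById(self, user_id):
        # Old name and old tuple-shaped return value, preserved for
        # existing clients until they can be sunsetted.
        user = self._service.fetch_user(user_id)
        return (user["id"], user["name"])
```

Because the shim changes no inputs, outputs, or expected behaviors, all the existing tests against the old API stay valid while the implementation underneath migrates.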
The next time you catch one, Lennie, avoid killing a mouse by strangling it.
Do you strangle it to avoid it, or do you avoid strangling it? The age-old question.
Thanks for pointing out that the titles actually differ.
Avoid rewriting a legacy system (by strangling it)
The important thing to note is that the new tool does not do everything the old tool does. The workflow is also different from the old one. However the customers loved the new one as it was simpler, faster and more robust to use.
Can you/anyone recommend any?
I mostly work in the android world and the chase for the new and shiny is real.
I see some new libraries get a lot of traction seemingly only because they are written in kotlin/coroutines, not because they offer a better solution (for the one I have in mind, they did not even bother trying to do a benchmark to compare it with the existing solutions).
The thing is, the Android dev ecosystem got WAY better in some aspects.
Having moved to some of the new and shiny myself: well-implemented MVI/MVVM architectures backed by Rx or Flow are very robust and give a good framework to develop on.
You still have to fight back against the zealots yelling that solution X, which works just fine, should be replaced by solution Y, even though it would take months of engineering work and does not really improve anything you care about (e.g. a 5% reduction in bytecode size is not something that's worth spending weeks on, nor is a network stack that hand-wavingly 'improves performance' with no benchmark made to actually ascertain where our hot paths are).
PS: as for the parts that got worse, the build system and Android Studio failed to scale quickly enough to keep up with the enormous increase in build complexity. As a result, they are slowly becoming less and less usable for large projects.
Sadly, it takes a lot more skill and courage to say the following in a way that other stakeholders care about: "We should use this tool because when I started using it, data flows were clear in a way that finally silenced some of the anxiety that I don't know what the fuck I'm doing and I am going to ruin everything."
However, it's always a challenge. For example, sometimes you have subsystems that are begging to be retired. For example, on one system we're maintaining about 20 KLOC of GWT code. For the last 10 years or so, it hasn't really been worth moving away from it, but there will be a day (that is rapidly approaching, I think) where the cost of supporting a mostly abandoned Java framework that compiles into JS outweighs the risk and cost of slowly replacing it.
There's a real difference between being pissed off with the choices your predecessors made, or lusting after the new, hot framework and saying: nope... this just isn't viable any more. Planning that transition isn't easy either. Again, it's one of the reasons I enjoy this kind of work.
And sometimes, you even just decide that you're going to work with what you've got. Ironically, though, this usually involves more churn, NIH and reinventing the wheel because code written 20 or 30 years ago did not have the facilities that we desire in modern development. You think, I'd love to enjoy the benefits of that new framework, but there ain't no way that we'll be able to use it. How do I get the benefits using the code I already have? Answer: you study what other people are doing and you build the same damn thing in your environment. Nobody builds the new-shiny for old stuff so if you want it, you have to build it yourself.
I enjoy bonsai trees. As trees grow, the branches become out of scale with the trunks. You can imagine that if your trunk is the size of a pencil, it doesn't take long for the branches to catch up. So if you want a tree that is in scale, you are constantly having to prune off branches and grow new ones. There is a saying that a bonsai tree is never finished until it is done. Code is the same way. There is no such thing as avoiding churn, unless you are truly trying to kill off your project. You always have to prune off branches and grow new ones, otherwise development will slowly grind to a halt, with the challenge of adding functionality becoming more and more complex. But if you prune your branches before they grow, you will end up with a stick in a pot. Or if you decide that you want to grow out every bud that pops out, you will end up with an impenetrable mass of confusion. Deciding which branches to grow and which to prune, unfortunately, requires good taste.
Given, however, what is being posited -- a legacy system that is not modular and which contains unrefactorable pathological dependencies -- the old system must also handle this case in parallel, in order to be in the correct state to handle future requests of a type that still need to be delegated to the old system.
This parallel implementation may have to persist well into the replacement process, and the requirement for it to do so may mean that you still have to do double implementation of features and fixes for most of the transition.
If your old system has dependencies that you don't understand, I don't see the strangulation method working at all.
> Here’s the plan:
> Have the new code act as a proxy for the old code. Users use the new system, but it just redirects to the old one.
> Re-implement each behavior in the new codebase, with no change from the end-user perspective.
> Progressively fade away the old code by making users consume the new behavior.
> Delete the old, unused code.
Here is the reality:
1. People do the above incompletely; their deletion of the old system slows down and then they move on to another project or organization, leaving a situation in which 7% of the old system still remains.
2. People iterate on the entire above process, ending up with multiple generations of systems, which still have bits of all their predecessors in them.
It is really hard to replace the functionality of a piece of code when you don't know 100% what that functionality is.
I'm working on moving some functionality out of a system - not replacing the system. And it's still extremely challenging to actually figure out everything that's going on with just the thing I'm moving out.
So far I'm stuck in the overthinking phase of the new application. And as the article states, I'm asked to keep adding new features to the existing application - nothing big (because individual things aren't big), but at the same time, I've been adding a REST API on top of the existing codebase for the past few weeks. It's satisfying in a way but it hurts every time I have to interact with the existing codebase and figure out what it's doing.
Plus we're not going to get rid of the existing application at this rate. I should probably set myself limits - that is, I'll postpone and refuse work on the existing application if it's not super critical. And quit if they're not committed to the rewrite before the summer.
Big software rewrites are extremely risky because they inevitably take more time than people are able to estimate, and the outcome is not always guaranteed.
An evolutionary approach is better because it allows you to focus on more realistic short-term goals and to adapt based on priorities. Strangling is essentially evolutionary and much less risky. It boils down to deciding to work around, rather than patch up, the old software, and to minimize further investment in it.
Also, there are some good software patterns out there for doing it responsibly (e.g. introducing proxies and then gradually replacing the proxy with an alternate solution).
The old code worked, but was slow. Adding features would make it slower. Lock-free queues and threads everywhere, packet buffers bouncing from input queues to delivery queues to free queues to free lists, threads manfully shuttling them around, with a bit of actual work done at one stage.
Replaced it all with one big-ass ring buffer and one writer process per NIC interface. Readers in separate processes map and watch the ring buffer, and can be killed and started anytime. Packets are all processed in place, not copied, not freed, just overwritten in due time.
It took a few months. Now a single 2U server and a disk array captures all New York and Chicago market activity (commodity futures excepted).
I kept the part that did the little work, scrapped the rest.
C++, mmap, hugepages FTW.
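A toy sketch of that design (the real system is C++ with mmap and hugepages; the slot size and count here are made up for illustration): one writer places packets into a fixed-size ring, readers index into the same mapping by sequence number, and old slots are simply overwritten in due time.

```python
import mmap
import struct

SLOT = 2048    # fixed-size packet slot (hypothetical)
NSLOTS = 4     # tiny ring for illustration; the real one is huge

# Anonymous mapping stands in for the shared file readers would also map.
buf = mmap.mmap(-1, NSLOTS * SLOT)

seq = 0  # single-writer sequence number

def write_packet(payload: bytes) -> None:
    """Writer: place the packet in its slot; no copies, no free lists."""
    global seq
    off = (seq % NSLOTS) * SLOT
    buf[off:off + 4] = struct.pack("<I", len(payload))  # length header
    buf[off + 4:off + 4 + len(payload)] = payload       # packet in place
    seq += 1

def read_packet(i: int) -> bytes:
    """Reader: read slot i in place from the shared mapping."""
    off = (i % NSLOTS) * SLOT
    (n,) = struct.unpack("<I", buf[off:off + 4])
    return bytes(buf[off + 4:off + 4 + n])
```

Because readers only map and watch the buffer, they can be killed and restarted at any time without the writer noticing, which is the property that made the redesign robust.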
What the article is saying is: don’t rewrite your code in one go, but rather cut the system into pieces that are independent and rewrite each in successive phases.
It’s kind of obvious, though. And the difficult part of the rewrite is actually slicing the original code into independent chunks. More often than not, legacy systems are riddled with leaky abstractions and dependencies (the infamous spaghetti code) that are hell to disentangle.
I've done this, but on a private branch, with a single merge to trunk in the end. Starting with complex integration tests, new interfaces were gradually defined and made the code testable, giving me the needed confidence.
Our legacy system was built as a desktop app for internal use that became difficult both to scale and to keep compliant with our regulatory obligations, so we began building out an API around its core business functionality and built various front ends to speak with it throughout our company.
It has been a middling success, mostly because change requires political capital that might not be there six months after initiating it. However, I think overall we've improved the product, and I don't think a massive rewrite would've gone nearly as far, due to political winds shifting and the rewrite getting deemed a waste of time by the new powers that be.
what is the lesson here?
Translation: after 7 months you stop mucking about and start trying to produce something useful.
In case you wanted to read the referenced article: this was the first thing I thought of. I appreciate Fowler's writing style and his sourcing. He always links some interesting stuff.