Lessons learned from rewriting code (huseyinpolatyuruk.com)
116 points by hpolatyuruk on June 15, 2019 | 132 comments

"Rewriting a system from the ground up is essentially an admission of failure as a designer. It is making the statement: “We failed to design a maintainable system and so must start over.”"

Or maybe the domain you are working with was vastly more complex than you imagined, so your nice little system turned out to be more of a prototype than a finished product. Or maybe the features needed were vastly different from what you expected when you started implementing the thing. I'm sure everyone can come up with a million other reasons.

But the problem I have every time with these "never rewrite your software" stories is that they are almost always one-sided. You rarely hear about the stories where a company decided not to do a rewrite and was out-innovated because every little feature took months to implement. Or failed because it could not attract competent developers because its codebase was a complete mess.

Yes, the rewrite might have been a partial or complete failure. But would the patient have survived without the operation anyway?

That doesn't sound like an 'or' to me. That sounds like a 'why'.

A rewrite also means no new features for a very long time. And my very cynical view is that people asking for a rewrite are motivated by the peace and quiet that happens during the first six months of the rewrite. In a world where people leave after 3 years and get a pass for their first few months, that represents a pretty large fraction of their tenure.

But it's a bit like rehab. You've decided to start over with a new set of habits and hope it sticks. But like rehab or new year's resolutions, the relapse rate is high.

My current take is that there is a paradox here. The teams that deserve a rewrite usually don't need a rewrite. There's a high degree of overlap between being capable of fixing your broken crap properly and being able to refactor to a better system.

The only rewrites I've seen work look like the Ship of Theseus. As far as management is concerned it's not a new system. But the old developers can barely recognize the code.

> I'm sure everyone can come up with a million other reasons.

Indeed. One reason for rewrite is that there isn't a smooth continuum of changes from vCurrent to vFuture. If the discontinuity will have to be bridged anyway, and if other intermediate changes aren't going to bring sufficient improvement, jumping into it might be the most efficient way.

In finance there is the concept of "risk-adjusted returns". In the case of a rewrite, time efficiency alone is not a sufficient framework for the decision, because the decision is laden with risk. For example, I would much rather take 2x as long to accomplish the goal if it's 10x less likely to result in catastrophic failure.

A failure to recognize the risk-vs-time tradeoff can lead to a failure to be motivated enough to "work the problem". Often a continuum of changes does exist; it might not be "smooth", and it could in fact be quite painful, but it will be way, way less risky.

Transitioning software incrementally is often so much less risky than betting the farm on a rewrite that in most contexts I feel a rewrite should just be ruled out completely without further discussion. Better to muddle through and try to figure out a way to survive without it: just by waiting you may come up with a better, less risky solution (due to new ideas, new technology, new market opportunities, etc.), instead of stalling your current product development and risking that you effectively destroy everything of value you've created so far through total stagnation.

On my 5th startup - I cannot agree more.

You only have so much cashflow - your first priority is to maintain and deliver for your customers. They pay the bills usually.

Architecture and engineering are important, and they need support and time to rework problems. But incremental improvement is almost always the best choice.

If you have the cash then you should consider not only what’s best for product maintenance, but also how you could use that cash to grow your business.

Sometimes when you're designing a system, you acknowledge it's bad (i.e., technical debt). Ultimately, your purpose is supporting the system (typically the business), not crafting artisanal design and code. I suspect there are many beautiful git repos that are deleted when companies close their doors.

Or maybe your customer base and system load has expanded by an order of magnitude beyond the original design.

This should not require a ground-up rewrite though, unless the code is a big ball of mud in the first place. You might need to change the overall architecture (e.g. split monolithic code into multiple services) and/or optimize some critical bottlenecks, but most code (business logic, UI, etc.) will be unaffected.

Writing scalable software is all about being as flexible as possible. If system load expanded by an order of magnitude and the original design wasn't flexible enough to scale with the load, that's a bad software design.

Or perhaps the first project was budgeted to only handle limited scale and prove feasibility. In which case it was a success, and now that feasibility is proven, scale can be budgeted for.

I mean, if people tried to build for mega scale from day one, every pie in the sky idea I have would start with Spark clusters and load balancers and Kubernetes...which obviously is all expensive, time consuming, and may simply not be worth the cost given the business objective.

Sure but then risk being dinged by the customer/management for "over-engineering something simple." How do you determine where the cut off point is?

If you try to build the most flexible, scalable solution on day one, you are very likely to make the wrong predictions about the architecture you will need in the future. You will probably need to rewrite the software anyway.

Worrying about scalability before you need it is a good way to ensure that you won't.

A lot of people might disagree, and this is just my opinion, but if you are using object oriented programming then in 99% of all cases you are not being as flexible as possible. Tying methods to data is an inflexibility.

With sufficient complexity under OOP you will always be writing code that feels like it is poorly designed because the paradigm promotes poor design and lack of flexibility.

> Tying methods to data is an inflexibility.

Tying methods to data is an inevitability (whether at run time or compile time). Can you give me an example of a function that does not depend on the data it operates on?

What makes OOP wrong is not tying methods to data, but tying data to ontologies. Limiting a function to operating on what something "is" is an artificial limitation. Limiting a function to operating on what something "has" or "can do" is a logical consequence.

>Tying methods to data is an inevitability (whether it is on run- or compile- time). Can you give me an example of a function that does not depend on the data it operates on?

I mean tying methods to data via a class. An instantiated class is a structure that has data and has methods specific to operating on said data. Either way, any function that takes void input, with any return type, is an example of a function that does not depend on data, but that wasn't my point.

Note that a function is different from a method contextually speaking.

>What makes OOP wrong is not tying methods to data, but tying data to ontologies. Limiting a function to operating on what something "is" is an artificial limitation. Limiting a function to operating on what something "has" or "can do" is a logical consequence.

We may be talking about the same thing. Though I'm not sure what you mean by this: "Limiting a function to operating on what something "is" is an artificial limitation. Limiting a function to operating on what something "has" or "can do" is a logical consequence." Can you please clarify?

We might want to agree on what we mean with "OOP", since it's such an abstract term we might be talking about different things.

> Note that a function is different from a method contextually speaking.

I disagree. A method is just a function with an implicit self/this parameter. I see methods as just namespaced functions with syntactic sugar to call them with dot notation from instances.

In Rust this is particularly prominent, since they support UFCS:

    whatever_instance.method(arg1, arg2)

    Whatever::method(whatever_instance, arg1, arg2)
...do the exact same thing. Other languages are more limited in their syntax, but I don't think it's inherent to OOP.

I think I see what you mean with "contextually speaking" but I can't really see the difference either in theory or in practice.
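A minimal TypeScript sketch of that equivalence (the `Counter` class here is invented for illustration): a method is an ordinary function stored on the prototype, and the dot-call is just sugar for passing `this` explicitly, much like Rust's UFCS form.

```typescript
class Counter {
  constructor(public n: number) {}
  // An ordinary function stored on Counter.prototype.
  add(delta: number): number {
    this.n += delta;
    return this.n;
  }
}

const c = new Counter(10);

// Dot notation: `this` is bound to `c` implicitly.
const a = c.add(5); // 15

// The same function, with `this` passed explicitly -- the UFCS-style call.
const b = Counter.prototype.add.call(c, 5); // 20
```

Both calls run the exact same function object; only the syntax for supplying `this` differs.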

> Can you please clarify?

I'll try :P Maybe with an example.

In TypeScript we can define types as "interfaces" (do not confuse with what Java calls "interface").

    interface Named {
      name: string
    }
Then you can make a function that takes Named things:

    function sayHello(named: Named) {
      alert(`Hello! My name is ${named.name}.`)
    }
My point is all functions are tied to the data they work with, whether explicitly or implicitly (via runtime errors).

This is what I mean with "has" or "can do". I couldn't care less whether that Named was instantiated as a Dog or a Person class, as long as it has a name. (Just to clarify, I don't think TS's interfaces are particularly good either, since they're tied to the actual object shape, what if my object has a field "fullName" instead?)

I agree with you if we only consider traditional OOP class-based abstraction. Modern forms of OOP on the other hand are a blessing for me.

- Classes should be limited to creating particular object instances that group together a bunch of interfaces.

- Class methods should only be used to manipulate that particular instance, with its particular idiosyncrasies, never as a means of abstraction. Only the class itself should know about its methods.

- Classes expose functionality via interfaces (that internally use the class methods).

- Therefore no function should declare an argument by class ("is"), only by interface(s)/trait(s)/whatever(s) ("has"/"can do").

Why use classes then? Mostly for the namespacing and the syntax sugar. The name "class" is probably not very fitting but it's probably hard to get rid of.

So, in essence: I think we agree on the problem but disagree on the culprit. It's not OOP. It's C++/Java/C#/etc.
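A minimal TypeScript sketch of that style, with invented names: classes only construct instances and implement the interface internally, and the function declares its argument by capability ("can do"), never by class ("is").

```typescript
interface CanSpeak {
  speak(): string;
}

// Classes only build particular instances; the interface is the abstraction.
class Dog implements CanSpeak {
  constructor(private name: string) {}
  speak(): string { return `${this.name} says woof`; }
}

class Robot implements CanSpeak {
  speak(): string { return "beep boop"; }
}

// Declared by capability, so it accepts any CanSpeak, whatever it "is".
function announce(speaker: CanSpeak): string {
  return `Announcement: ${speaker.speak()}`;
}

announce(new Dog("Rex")); // "Announcement: Rex says woof"
announce(new Robot());    // "Announcement: beep boop"
```

Swapping in a third implementation never requires touching `announce`, which is the whole point of abstracting by interface rather than by class.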

> Either way, any function that takes void input and with any return type is an example of a function that does not depend on data, but that wasn't my point.

I was sure you were going to say this :P We can agree it's hard to write a useful program with constant pure functions.

Just for fun: I'd argue it still depends on the data it operates on... in this case, the empty set!

>Just for fun: I'd argue it still depends on the data it operates on... in this case, the empty set!

Void is not an empty set. It is void. Nothing. I am referencing the categorical notion of void here. I'm not too familiar with Rust, but in Rust this is just an empty enum.

Note that functions with a void parameter can never be called because the void type can never be instantiated. You cannot instantiate nothing. This is a very theoretical concept, but I believe in Rust you can actually define (not call) a function of this type.

    enum SomeEmptyEnumType {}

    fn absurd(_x: SomeEmptyEnumType) -> i32 { 1 }
It's def hard to write a useful program made up of functions that can't even be called.

>So, in essence: I think we agree on the problem but disagree on the culprit. It's not OOP. It's C++/Java/C#/etc.

Took me a bit to parse what you're saying. I get it and you're right. What is the problem with C++/Java/C#/etc?

Void is just another name for the empty set. It's not an empty set, it's the empty set. Since it has no members it indeed cannot be instantiated.

As a Rust library: https://docs.rs/void/1.0.2/void/enum.Void.html Notice how the empty enum is called... Void.

In Rust you have "!" (never) too, which cannot be instantiated either but coerces into any other type.
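TypeScript has an analogous bottom type, `never`: it has no values, and it is assignable to every other type, so it coerces just like Rust's `!`. A small sketch (`fail` and `parsePositive` are invented for illustration):

```typescript
// `never` is TypeScript's bottom type: no value inhabits it.
// A function returning `never` must throw (or loop forever).
function fail(msg: string): never {
  throw new Error(msg);
}

// Because `never` is assignable to anything, the throwing branch
// type-checks where a `number` is expected:
function parsePositive(s: string): number {
  const n = Number(s);
  return n > 0 ? n : fail(`not a positive number: ${s}`); // never -> number
}
```

So the "cannot be instantiated, coerces into any other type" behavior isn't Rust-specific; any language with a bottom type gets it.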

> What is the problem with C++/Java/C#/etc?

Is-a polymorphism.

Polymorphism? Aren't interfaces a formalization of polymorphism?

Notice I said "is-a" polymorphism (I should've said inheritance to avoid confusion: class Person; class Employee is-a Person; class Client is-a Person; etc.). I mean an ontology of classes.

> What makes OOP wrong is not tying methods to data, but tying data to ontologies. Limiting a function to operating on what something "is" is an artificial limitation. Limiting a function to operating on what something "has" or "can do" is a logical consequence.

But the “is”es in proper OOA&D, and hence OOP (at least the ones you should be coding to), are of the form “is a thing that can do”.

What do you consider proper OOA&D in this case?

If you mean: never declare a function as taking a particular class instance ("is"), only interfaces/traits/mixins/whatever ("has" / "can do").

Then yes, we agree.

As soon as you declare a function as taking a class instance, you're limiting to what something "is". I don't think ontologies are particularly useful as a means of abstraction.

Yes, you can do better with careful architecture, but the footgun is still there and I'm just saying we should probably get rid of it instead of working around it.

You can accomplish exactly that in OOP with careful use of multiple inheritance or interfaces.

That's kinda my point.

Interfaces should be the main abstraction.

I don't think multiple inheritance (or inheritance at all) is particularly useful. Composition over inheritance. Interfaces over inheritance.

I don't care what something "is" and I don't think ontologies (classes, inheritance) are particularly useful as a means of abstraction.


Designing an SQLite data store and designing a Cassandra data store are rather different.

True, but initially decoupling the business logic from the specifics of data retrieval would allow you to write an abstraction layer when you need to switch stores rather than having to rewrite the entire thing.
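One hedged sketch of what that decoupling can look like, in TypeScript (the `UserStore` interface and in-memory adapter are hypothetical stand-ins; real SQLite or Cassandra adapters would implement the same interface):

```typescript
// Business logic depends only on this narrow interface...
interface UserStore {
  get(id: string): string | undefined;
  put(id: string, name: string): void;
}

// ...so each backend is just an adapter. An in-memory stand-in here;
// swapping stores means writing a new adapter, not rewriting callers.
class InMemoryUserStore implements UserStore {
  private rows = new Map<string, string>();
  get(id: string) { return this.rows.get(id); }
  put(id: string, name: string) { this.rows.set(id, name); }
}

// Business logic never mentions a concrete store.
function renameUser(store: UserStore, id: string, name: string): void {
  if (store.get(id) === undefined) throw new Error(`no such user: ${id}`);
  store.put(id, name);
}
```

The in-memory adapter also doubles as a test fixture, which is a nice side effect of keeping the interface narrow.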

Fighting for the independence of these separate pieces and layers of abstraction is one of the most difficult problems I’ve had being a developer.

Management always wants time savings: "well, if we just make these parts dependent, the fix will be much quicker", requiring deep explanations that justify some "hypothetical" future scenario (which always ends up arriving in a few months). Other developers, not familiar with the deeper understanding of the problem being solved, knowingly or unknowingly implement the shortcuts, adding technical debt and hidden complexity.

I don’t know what the balance is, but I find the only way to make a long lived system is to fight hard for the separation of concerns, or slowly fall prey to chaos.

> (which always ends up being in a few months)

I have seen something about technical debt that says that it's a win over the span of weeks, but a loss by the time of months.

I would say 90% of the time, there is a failure in design.

Most of the time the designer does not fully understand the problem because the product hasn't been tested yet. Additionally even after the product has been rewritten 90% of the time the new design is wrong.

Often the second design is slightly better or worse than the first design because the second design is also a shot in the dark. Why?

This is because there is no way to theoretically prove a design is the correct solution to the problem. The only way to prove a design is to test it. There is no science or proof behind "design" only "design principles" and anecdotes... all of which can be broken and tend to evolve over time.

This is why designers tend to be artists rather than scientists. Designers are hired for problem spaces where no theoretical solution exist.

If most designs are wrong how do products even exist? Well computers are fast enough to cover up design flaws. Hacks can be made to bridge gaps in designs. Overall designs are wrong but can be patched to make anything work.

Rewriting is a very healthy part of software engineering. Sometimes you just didn't anticipate what the company or product was going to do. The solution you wrote for 1k qps probably won't work at a million qps, and it is kind of silly to expect it to.

Or, did a rewrite and it launched them to success.

A pattern I've started seeing with most "bad" rewrites is that "bad" rewrites are probably usually motivated by burnout than actual architectural problems. People get tired of working on the code, and they would rather be working on something else hence "let's rewrite".

When you hear about "good" rewrites it's almost never framed in a way that is "the old code is bad and complicated." It's usually driven by an architectural need - for example it started in Python, but now we have scaled up and in order to avoid $1MM/month a rewrite in Go/Rust/C would be beneficial.

Or, in the case of changing marketing needs, it can be "we now better understand our customers and a rewrite can help us provide a more stable interface and help us iterate faster". However in this case a full 100% rewrite is never the case and it's usually done piecemeal or by the new system simply proxying requests to the old system.

When the excuse is simply the "code is a mess", then not only are you admitting that the original team failed to design a maintainable system, you are also trusting them to not make the same mistake again. And if their whole motivation is "the code is a mess", they will probably fuck up the redesign, as messy code is a symptom of bad design but not the root cause. If the root cause is burnout, and that isn't addressed, the rewrite may be just as bad.

Iterational design is a thing.

If the same team that wrote the crappy code gets a chance to learn from the mistakes and rewrite it part by part, now that they have the whole picture, it's easier and the end result might be maintainable and nice.

I've heard that the worst code is the 2nd version. You try to fix everything from the 1st version (real or perceived) and over correct as a result.

Yes, I've heard that too, but in our rewriting case it didn't happen; we didn't make that mistake. Usually developers make a lot of mistakes in the first version and learn many lessons. Now that they feel experienced, they try to implement everything in the 2nd version, and they come up with an overly complex system again. In our case, we cut many features and kept our code base small.

However, the third iteration is something that you can be proud of (but in my experience, halfway through the second iteration the funding gets cut off).

Also, architectural needs sometimes are not really needs. E.g. you have a program for DOS and think that you need a complete rewrite for Windows. Then someone heroically ports the DOS program over to Windows, because the programs are not compatible. You end up with two programs to maintain.

I've been a developer for almost 10 years and during that time have been involved in small rewrites and major rewrites. None of them went well, and there wasn't a single one that we didn't regret starting. What you think is a mess of legacy code is often less messy than you think, and a lot of the odd decisions were often made for very good reasons.

When you rewrite a codebase, you tend to over-engineer it for the vague goals of being "scalable" and "clean", but while clean on the surface, it often comes out far more complicated and with a similar number of bugs. Another way to look at it: you're taking what was previously an agile approach of iterative development and replacing it with a waterfall approach aiming at a grand end goal (parity with the legacy code).

Software is complicated and there's no silver bullet. Adopting React over jQuery will not reduce complexity as much as you think. Adopting microservices over a monolith will not reduce complexity as much as you think. The only thing that will ever reduce complexity is careful iterative development of good abstractions, inserted into the codebase one at a time.

> 1. Are you ready to throw away all that knowledge?

This is a massive fallacy. Rewriting code does not have to throw away knowledge. The old code is a specification for the new. You just have to read it.

I've been involved in several successful rewrites and I always use the old code as a precise description of how the software is supposed to work. You have to take the time to understand it. And if you don't understand the old code, you really shouldn't be rewriting it.

The code as "precise" description: The problem is that you can't tell if that weird "if" is correctly handling a corner case, or incorrectly handling a corner case, or is now a "can't happen" left behind after something else changed. You can't tell which from the code.

That means that tribal knowledge is crucial for a successful rewrite. And that means that you need to do the rewrite while you still have that knowledge available.

This is a good reason to have decent, intent revealing tests. Even if the code descends into a ball of mud, you still have the tests characterizing what the function of the system ought to be, allowing a rewrite with more than just blind faith and seat of your pants optimism.

Then again, code bases with suites of extensive tests probably tend to not be the real worst cases. But if they are, at least you have the tools to do lots of iterative/evolutionary refactoring. It's amazing the radical transformations that crappy code can go through when you gain confidence from good tests.

Having done huge multi-year refactoring efforts in both scenarios, I know which one I'd prefer.
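A characterization test in that spirit pins down what the code currently does, warts and all, before any refactoring. A toy TypeScript sketch (`legacyRound` is an invented stand-in for real legacy behavior):

```typescript
// Legacy behavior we want to preserve across a rewrite, bugs included.
function legacyRound(x: number): number {
  // Note the half-up behavior for negatives -- possibly a bug, but
  // callers may depend on it, so the test records it as-is.
  return Math.floor(x + 0.5);
}

// Characterization tests document what the code DOES,
// not what we wish it did.
const cases: Array<[number, number]> = [
  [1.4, 1],
  [1.5, 2],
  [-1.5, -1], // half-up, not half-away-from-zero
];
for (const [input, expected] of cases) {
  if (legacyRound(input) !== expected) {
    throw new Error(`legacyRound(${input}) changed behavior`);
  }
}
```

With a suite like this in place, a replacement implementation can be dropped in and judged against recorded behavior instead of blind faith.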

I don't agree. The code may not do precisely what was intended but it is deterministic. If you understand the language, you can understand the code.

Yes, you understand what it says. You don't know what it should say, though. The code can't tell you that.

[Edit: "Better architecture, but bug-for-bug compatible" probably isn't the goal of the rewrite.]

> Better architecture, but bug-for-bug compatible" probably isn't the goal of the rewrite.

Why not? From there you proceed to fix the old bugs.

I've been involved in 3 separate rewrites in different industries, at companies of different sizes, over the course of 20+ years.

Not one of them went well.

I've also been involved in maintaining legacy code during that time.

The code sucked, and changes were slow, but it went a LOT better than the rewrites did. We actually shipped things.

I have several experiences with both bad and good rewrites.

A common pattern among the bad rewrites is that they were too late. Rewrites should be early in a project. Expect to do perhaps 2-3 rewrites at the beginning.

Another pattern is that the team was too large. Too many chefs, and we ended up with an incoherent mess.

The good rewrites I’ve done were gradual. We agreed on the target state of the code and on the intermediate stages. Perhaps you could call these “big, slow, refactoring”, but they were in fact rewrites.

A couple of rewrites were so successful they changed the rules of the game entirely.

People who say rewrites are either all good or all bad are wrong every time.

I don't think anybody is opposed to rewriting modules when necessary. The discussion is around "ground up"-rewriting where the whole code base is discarded and starting again from scratch, like how Mozilla did.

You can perform a long series of refactoring so you in the end have rewritten everything. The important difference is if you in all the intermediate stages have a working program with all tests passing.

Well, to be fair, the Mozilla rewrite was a combination of runaway ambition/scope, too many cooks and too few of them Michelin class cooks. I wouldn't exactly say that the rewrite itself was the biggest problem they had.

When you look up "second system effect" there's a picture of the Mozilla logo. :)

The problem is it’s not linear. You can ship until you can’t (or can, but very, very slowly). It’s taking all of the hit now, or a little at a time until things completely grind to a halt.

I’ve been involved in a few rewrites as well. Some went well and we had a noticeable productivity gain. Some went badly and we were stuck for a long time fixing bugs and corner cases we had already addressed in the legacy stack. It really depends on the situation.

what motivated those rewrites?

One was pure vanity. We replaced a perfectly functional codebase in order to support a bunch of new functionality more easily. In the end, what we actually did implement could have easily been done on the old architecture.

Another one was a 15 year old codebase. We were also switching to Java. The process had taken 3 years by the time I arrived, and it was eventually canceled. They'd brought in a bunch of consultants (I was one of them), and everyone had their own idea and their own architecture for their parts. The pieces didn't play well together.

The 3rd one was just your standard rewrite of a spaghetti codebase. I still think it was the right choice to rewrite, but it took too long and the project was canceled.

I see... Rewrites have to be both justifiable and not take too long. I keep telling myself that: we have an old PHP code base that we would love to rewrite in Python, but it must be justifiable. I believe we will do it incrementally, with each new function written in the Python framework we have, plus any maintenance work that won't take too long to rewrite.

When I hear about rewrites, I always think about this comment:


So true, the opportunity to leave things better than as you found them is one of the primary rewards in fixing bugs in existing/old/legacy systems.

Turning one piece of software into another is something which offers a spectrum of techniques which range from micro-refactors up to full rewrites. If you find yourself reaching for the full rewrite option, you should realize you have suddenly reached to the very far, far end of the spectrum of options and are basically taking an extreme viewpoint. If you haven't gamed out and really tried to rule out options across the entire spectrum, you've failed -- not just because you've been negligent in terms of planning and trying to make the best choice for your organization/team, but also because if you do end up doing the rewrite, you've made a specific choice that, unlike most others, infuses a large amount of existential risk to the organization and product.

In the case of a full re-write, there really must be no other way to accomplish the goal. How many rewrites happen when there was literally no other way? Engineers seem to have a bottomless well of creativity when it comes to solving problems, but that creativity often evaporates when focused on the problem specifically of "How the hell do we get from A to B without a rewrite?" No pain, no gain.

Most rewrites are unjustified, either because the old code is good enough, or because the guys in charge of the rewrite cannot deliver better (whether or not the old code is good enough).

But some are.

Arthur Whitney creates each version of the K language from scratch (he is now at version 7). They are not backward compatible, but every single one so far has been better on the metrics that he cares about (succinctness, speed, and revenue).

Everyone says that the Mozilla->Firefox rewrite was unjustified, but ... the Mozilla codebase looked like a dead end to too many people working on it (from first and second hand accounts I heard), it's not clear that it was viable to continue with the old codebase -- and, even if it was, it is unclear that Mozilla would have been better off doing that: Microsoft was the unstoppable juggernaut playing dirty. IE won not based on technical merit (although it actually WAS the better one come version 4) - but arguably because it was integrated into the OS. In fact, it is quite likely that we would not have firefox today at all if they did not start the rewrite back then.

C# is essentially a rewrite of Java; done for legal reasons rather than technical reasons, but IMO the result is a much better technical solution than trying to retrofit and comply with the (legal) legacy of Java.

But ... yes, most rewrites are unjustified and often end in failure.

As a general rule you shouldn't rewrite everything from scratch. This does not mean that you shouldn't, on occasion, rewrite some parts of your system. Rewrites are expensive. On the other hand some code you think is bad really is bad, and it will be a net win to rewrite it.

To make a call like this, you really do need a pretty good understanding of your codebase, your business, your testing exposure, the capabilities of your team, and probably some other stuff as well. In reality, your understanding is never going to be as good as you really need it. This doesn't mean you should never rewrite any portion of your codebase, it just means that you really need to reduce the scope a lot to reduce the risk to a manageable amount.

At my last job all I ever did was rewrite software from scratch. Sometimes the technical debt has led you beyond a point of being able to continue using the solution you have. For example Flash (soon to be ditched by mainstream browsers!) and Silverlight (no longer supported on modern browsers), and heck Java on frontend as well. I rewrote an entire Silverlight application to HTML5 and JS for the frontend and the backend with ASP .NET Core.

Then I also rewrote a Java servlet whose codebase I never directly touched, but I was told you needed a special VM just to get it to compile; I rewrote it in Python using CherryPy. Sometimes the technical debt is not worth keeping around. I sure could have modernized the Java, but Python's a lot more approachable to the rest of the team and anybody can understand what is going on at a simple glance. I can run Python on Ubuntu without a hassle since it's available out of the box (our main OS in that office).

Am I the only one who will do an initial "proof of concept" implementation of something that I intend to rewrite in another language? E.g. I will write the "first draft" in Python or Racket because I know the language won't get in my way, and then if it needs to be redone in a lower level language or something that I'm less familiar with, then I have a reference implementation to compare it with.

The same concept can even apply without switching languages if you just want a "naive" version that you know is correct.
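That reference-implementation idea can be sketched in a few lines: keep the naive version as an oracle and check the rewritten version against it (both functions here are invented examples):

```typescript
// Naive reference: obviously correct, kept as the oracle.
function sumOfSquaresNaive(n: number): number {
  let total = 0;
  for (let i = 1; i <= n; i++) total += i * i;
  return total;
}

// Rewritten version (closed form); validated against the reference.
function sumOfSquaresFast(n: number): number {
  return (n * (n + 1) * (2 * n + 1)) / 6;
}

// Cross-check the rewrite against the reference implementation.
for (let n = 0; n <= 100; n++) {
  if (sumOfSquaresNaive(n) !== sumOfSquaresFast(n)) {
    throw new Error(`divergence at n=${n}`);
  }
}
```

The same cross-checking works across languages: run the Python/Racket draft and the low-level port on identical inputs and diff the outputs.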

1) Rewriting code is rarely, as the phrase suggests, about doing all the work over. Usually it is about restructuring the architecture and then adapting the old code gradually.

2) Throwing away code is not the same as throwing away knowledge. Provided that you didn’t lose the entire team, the knowledge is still there. The code is just the corpse of knowledge and it is worth just a fraction of what the team that wrote the code is.

A rewrite is a good opportunity to revisit and review that knowledge.

3) With experience you learn to balance two forces: perfection and pragmatism. You should strive for perfection in everything you do, but you have to demand of yourself that you can release in a reasonable amount of time.

4) Every good software project I’ve been on has viewed the code as something temporary. This forces you to think about how you will replace every bit of the code as you write it. From this, workable architecture emerges unless you succumb to perfectionism.

From your description your CTO’s instincts were right, but he might be missing the backbone to put you in your place.

Your team needs a more experienced lead developer to help you guys mature as developers. And you need to pay less attention to articles and blogs, be less worried about what others think and ready to learn from someone more experienced. That’s going to be hard since we programmers tend to have huge egos. Even when we’re inexperienced.

Rewrites, as opposed to refactors, are warranted when they address fundamental architectural limitations that hinder product success. There is a notable exception for cases where the software architecture effectively is the product and customers rely on its observable behaviors, good or bad (e.g. an RDBMS). Poor design is not the only source of architectural problems, and it is not always avoidable; maintaining unused flexibility and optionality in architectures is often extremely expensive.

In my experience, few companies rewrite systems with legitimate architecture issues. It is vastly cheaper in the short-term to redefine the business in terms of those limitations, which is what most actually do in these cases. The risk to this strategy, which I've seen manifest many times, is that a competitor without these limitations can change the expectations of the market and thereby render your product obsolete in surprisingly short order.

Rewrites are always extremely expensive, along many dimensions, but they are also sometimes unavoidable and can lead to much greater product success than without. It is much more complex than the mere state of your code base. The calculus for whether or not it is worth it doesn't lend itself to simple analysis.

I never really got the reason for doing this

    var condn = ...;
    if (condn) {
        return true;
    }
    return false;

Maybe there once was a breakpoint at the return false, or an assert, and it was just left like this in case the breakpoint is needed again.

Modern debuggers generally support conditional breakpoints, but they often incur a high performance penalty, so this reasoning makes sense to me.

Perhaps you need to log the value of “condition” twice (once in each of 2+ cases) and you don’t wish to evaluate condition twice.

Because it gives you the opportunity to descriptively name the expression, sparing others from parsing it mentally.

Storing the result of a Boolean expression isn't the strange part. Storing it in a well-named variable can make the code a lot easier to read; the name (hopefully) communicates intent, and it can separate a conditional action into separate steps.
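As a hedged illustration of that point (the function and field names here are invented), a well-named boolean both documents intent and can be returned directly, with no if/else needed:

```javascript
// Hypothetical example: the names are invented for illustration.
function isEligibleForDiscount(customer) {
  // Naming the expression communicates intent better than the raw comparison,
  // and the boolean can then be returned as-is.
  const hasLoyaltyStatus = customer.years >= 2 && customer.orders > 10;
  return hasLoyaltyStatus;
}

console.log(isEligibleForDiscount({ years: 3, orders: 12 })); // true
console.log(isEligibleForDiscount({ years: 1, orders: 12 })); // false
```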

The strange part is testing Boolean value... to return the same value as Boolean constants.

Why wasn't the Boolean result simply returned directly?

    var condn = ...;
    return condn;
The if statement isn't necessary. Even if you wanted to explicitly return only "true" or "false" - using the if statement to convert truthy and falsy results into true booleans - double not operators remove a potentially mispredicted branch.

    var condn = ...;
    return !!condn;
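For context, a quick sketch of what `!!` does in JavaScript (the sample values are chosen arbitrarily): it coerces any truthy or falsy value to a real boolean, without a branch.

```javascript
// !! coerces any truthy/falsy value to a real boolean without a branch.
const values = [1, 0, "text", "", null, {}];
console.log(values.map(v => !!v)); // [ true, false, true, false, false, true ]

// Boolean(v) is an equivalent, arguably more readable spelling.
console.log(Boolean("text") === !!"text"); // true
```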

Ah, sorry, I completely failed to realize that was the question being asked. :)

Can't the returning function be named appropriately?

I mean, it's the same thing as extracting the boolean expression into a function and calling that in the if statement. It's also easy to imagine a quite long expression that would benefit from breaking into two or three descriptive names. It's a good tool for your own sake when coding, to not get the logic tangled up.

Do you also think `if ((x > 3) == true)` is more descriptive than `if (x > 3)`?

To be honest I do think `== false` is clearer than an inconspicuous exclamation mark but I don't use it.


Glad that you agree with the parent.

Oh, sorry, didn't see the !

It's been discussed here before. I don't buy the descriptive argument per se; that's the function name's purpose. But another argument is anticipating further processing: the condition's description then becomes purposeful, and you just have to add code in the right branch.

One reason is that a code-coverage checker will tell you whether you're testing both cases.
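A minimal sketch of that idea (the predicate and names are invented for illustration): with one explicit return per branch, a line-coverage tool can report whether the tests ever exercised each outcome.

```javascript
// Hypothetical predicate with one explicit return per branch; a coverage
// tool can then report whether each return statement was ever executed.
function isAdult(age) {
  if (age >= 18) {
    return true;
  }
  return false;
}

// Tests exercising both branches mark both return lines as covered.
console.log(isAdult(21)); // true
console.log(isAdult(12)); // false
```

With a direct `return age >= 18;` the two cases collapse onto one line, so plain line coverage can no longer distinguish them (though branch-coverage tools still can).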

He lost me at the very beginning due to clumsy handling of boolean values.

I've been through successful rewrites. I think rewrites make sense when your product has scaled very fast with 50x more engineers or 50x more demands on it, so the context in what was made doesn't make sense anymore. You need a faster language, a stricter architecture and code discipline to deal with +100 engineers working on one product vs the previous 3, etc.

A rewrite in that case is more like: your old wooden four-story office is too small, so now you need to build a skyscraper out of steel and concrete. While the steel-and-concrete structure is being built, you keep and maintain your old office, since it's running your business and you have the budget to maintain both simultaneously. Steel structures require different building techniques than your wooden one.

That kind of hyper scaling is really rare outside of SF venture capital companies, so in most cases, you shouldn't do rewrites.

It's also possible that things that grow 50x tend to be newer, meaning less gradual accretion of cruft in the code. Being newer also means more remaining institutional knowledge in the organization about the technology.

For 50x growth, the new thing we are thinking of rewriting probably also has some simple core purpose (like some kind of social network), making it easier to grok the existing code and indeed to rewrite it.

By contrast a decrepit old payroll system will not have the growth dynamic, and will not be at all simple to rewrite - the code will be packed with small fixes and strange little features that are actually critically important, and that it would be disastrous to omit in a rewrite.

These tie in with your last point.

I was the main programmer on a web application with 250k lines of code; it handled a single very complex product selection/configuration process. The time came to expand it to handle multiple products in the same family of products.

We ended up, over six months, doing a complete rewrite that, from the outside, looked like refactoring (meaning, from the user's/product owner's perspective, the old stuff continued to work as we wrote the new stuff beside it, switching over as we went).

We launched on time and on budget. My last week was a death march but that was it. There was now only 50k LOC. Where the previous version was tightly coupled to the product specification, we now had an engine that could perform the same selection/configuration process for any product that could be articulated in the new DSL we'd developed; as a bonus, the specifications were standalone modules so we could use them in completely separate web apps.

The whole time we were aware that we were breaking the 'never rewrite' rules spelled out by Spolsky and others, but we were successful, the codebase took a huge leap forward both in speed of development and maintainability, and a subsequent project to add another set of products took half the time that was estimated.

I don't know what we did right, exactly, except perhaps that our rewrite was never about burnout or annoyance at the existing codebase; our decisions along the way were always justified by what we had to deliver and a reasonable feeling of an experienced dev that refactoring the old code would take more effort than starting over.

I'm not sure I buy the article's rejection of the rewrite they did, because he frames it entirely in terms of delay in updating and adding features, without seeming to consider that this was just legitimate technical debt they had to pay off. The old code got them into a position where they could rewrite it. How many startups fail because their attempts to get it right the first time prevent them from ever launching and getting the customers that will pay to fix it later?

> Consider refactoring before taking a step to code rewriting:

For anything but a trivial code base, refactoring is only an option if a suite of automated tests has already been written.

The article makes no mention of tests. I suspect the reason is that there were none. Refactoring would most likely have been impossible until an automated test suite had been written.

Many developers are confused about the purpose of an automated test suite. It's not to make your designs better. It's to allow you to refactor the spaghetti that will inevitably arise.

The article is not about refactoring; it's about rewriting the software. That's why I didn't discuss refactoring in detail. Automated testing is part of refactoring: you change a small amount of code, then run the tests to see that it still works as expected. That's how refactoring works, so there was no need to mention it.

That is just not true. There's always the option of refactoring: make the necessary changes and manually test the system. Even though I'm a big fan of automated testing (not only unit tests), most codebases don't have complete test coverage, yet we still refactor and change those systems.

I'm pretty sure I know the exact product being described. It is Anti-Malware by Malwarebytes.

Being a customer for many years, I saw the downward trend and it corresponds to that described in the article, both chronologically and by meaning.

P.S. I can add up a lot to this story from a customer's point of view. Let me know if you are interested, I will continue in a comment below.

Yes please!

My favorite strategy for code reuse is to make the API configurable. That means instead of coding, you configure the system as much as possible.

Only the core, which holds the implementation details, can be changed without affecting the rest of the system.

One benefit/downside is that whenever you change the core, either everything else keeps working, or everything breaks at once.

> Consider refactoring before taking a step to code rewriting

There's got to be a Ship of Theseus argument here, where over the course of time you refactor the whole thing until none of the original code remains. I guess the difference between that and a rewrite is that after each refactor you should still be able to ship it.

Wow, I kinda work in a legacy (3-4 years old), not very easy codebase, where new releases affect other parts and bring new bugs, but I cannot imagine the state of yours. Not being able to do an update?

How can a 3-4 year old codebase be legacy? That's not a lot of time for the platform or technologies it uses to become crusty and obsolete.

I often surprise myself - I see commits that are 4 years old that I could have sworn I implemented just a few months ago

Some codebases are legacy from the word go. Legacy is more about the maintainability than the age of the codebase.

In the book Working Effectively With Legacy Code the author defines “legacy” as “code without tests”.

That's not really a popular or even useful definition. According to that author, a Go microservice I wrote yesterday would be legacy code, while a 25 year old Turbo Pascal scientific instrumentation program running on MS-DOS, which some poor soul still has to maintain is not, because the programmers wrote a few tests for it back in the day?

Interesting. Is your definition of legacy code “code that was written <x> time period ago”?

Wikipedia defines legacy code in several ways. So sure, that term has multiple meanings. Code without tests is a definition mentioned in the “modern representations” section.

Wikipedia indicates the original meaning was code that was dependent on a deprecated underlying foundation such as on os / framework / etc.

This is a snippet from the “modern representations” section of the Wikipedia entry. Basically the idea is legacy code is code that is hard to change:

“Eli Lopian, CEO of Typemock, has defined it as "code that developers are afraid to change".[1] Michael Feathers[2] introduced a definition of legacy code as code without tests, which reflects the perspective of legacy code being difficult to work with in part due to a lack of automated regression tests. He also defined characterization tests to start putting legacy code under test.”
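As a hedged sketch of a characterization test (the function here is invented for illustration): rather than testing what the code should do, you record what it actually does today, so a later refactor can be checked against that recorded behavior.

```javascript
// Hypothetical legacy function whose intended behavior is undocumented.
function legacyFormatPrice(cents) {
  return "$" + (cents / 100).toFixed(2);
}

// A characterization test pins down what the code actually does right now,
// so a refactor can be verified against the recorded behavior.
if (legacyFormatPrice(1999) !== "$19.99") throw new Error("behavior changed");
if (legacyFormatPrice(0) !== "$0.00") throw new Error("behavior changed");
console.log("current behavior pinned");
```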


Correct, because the Turbo Pascal program can be safely modified and extended. If the need arises you can port it to a more modern environment. Code without any tests can't be modified without risk of unexpected breakage, so it is basically legacy the moment it is written.

Having tests is not a panacea. There's no guarantee that any of the tests are automatically well-written - they could be too narrow in scope, overly brittle, or be so numerous that each minor change requires hours of automated testing, making progress a huge chore.

Often you can't even run, let alone compile, some legacy code without a large amount of effort. That's the whole point of the term: such code relies on an obsolete, unsupported, or just ancient OS, hardware, or platform, often requiring emulation or a special development environment.

Redefining it to mean "code without tests" just confuses things.

To me "legacy" means when you are stuck with something outdated. Code is not legacy just because it is old, it is legacy if it can't be adapted to changing requirements. For example old mainframe systems where the source code has been lost.

Code can be brand new and on the most fashionable platform, but if it is undocumented and unmaintainable and the original developer is unavailable - then you have legacy code. Because you are stuck with it and can't keep it updated.

If on the other hand the Turbo Pascal program runs fine and you have test coverage which allows you to refactor and adapt to changing requirements - what is the problem? And if you want to port to a more modern environment or platform you can do that with low risk.

Of course tests can be badly written, that goes without saying. Nothing is a panacea - not even microservices in Go!

Maybe it wasn't built with containerization in mind. ;)

maybe it's javascript

I am working on Sciter (https://sciter.com), which first appeared in Norton Antivirus in 2007, so it has been used in production for 12 years already.

And I never did a full rewrite. The development started as "C with classes" back when std:: was not yet a thing, so I had, and still have, my own array<T>, string<T>, hash_map<T>, etc.

Since then I've added GPU support, Mac and Linux versions, scripting … a lot of features. But no full rewrite, and no need for one so far.

Only incremental refactorings when they're needed, e.g. when all platforms got reliable C++11 compilers I started using std::function instead of those ugly inner structures.

Am I lucky or what?

Why do you think electron is more popular than sciter? They seem to target similar things? Is 'sciter html' a bit different than 'chrome html', so you can't make a web app out of the same code base too?

"They seem to target similar things?"

No - different. Electron was designed to run web sites pretty much as they are. It is a standalone browser combined with a server (Node.js) under a common umbrella. So it's the same client-server thing - client in one process, server part in another process. Communication between them uses IPC rather than over-the-wire as in the browser/remote-server case.

And Sciter is an embeddable thing - it runs in the same GUI thread as the application. So Sciter is for native applications where window layout is defined in HTML/CSS terms.

So the only common things between Electron and Sciter is use of HTML/CSS.

Sciter actually is closer to WPF (HTML/CSS instead of XAML) than to Electron.

Or to GTK for that matter - GTK also uses CSS styled DOM as does Sciter.

The usual definition of "legacy system" is a system that uses tools, technologies, and techniques that an organization would not use for new development. In SF, things change on a time scale of weeks, so your Go microservice might be considered a legacy system. Here in flyover country, COBOL on the 360 might be considered apropos for use today.

I recommend reading about Domain Driven Design. For example reading the book: Implementing Domain-Driven Design by Vaughn Vernon

Also consider how much time you'd lose by not forking a well-designed, open source codebase.

Joel Spolsky wrote a classic piece about Netscape's rewrite in 2000 [1], the article links to it too.

[1] https://www.joelonsoftware.com/2000/04/06/things-you-should-...

This is referenced in the article.

> our CTO was handling everything about AntiMalware. He was the main developer...

There's (one) of your problems. Your CTO (just like managers) should not be coding, they should be defining and leading the technical strategy of the company.

If you happen to be part of a two-person team without any money, your CTO is going to need to code. At a team of 4-5 the handoff begins; by employee 11 they shouldn't be coding, except perhaps firefighting if necessary.

As part of a two person team, there are no CEO/COO/CTO/CFO roles...you're just two people building something.

It's a joke to give yourself a management title when there's nothing to manage.

It's not a management title, it's a role.

Whether you're a 2-person company or a 20k+ one, you still have to manage the financial, technical, and operational aspects of your business.

Same role name, different implications depending on scale.

It’s a role (CTO), but not one that exist on a small team.

If you are a small handful of people starting off, you don’t need a CFO, you need a bookkeeper and an accountant. You don’t need a CTO, you need a tech lead.

All that rest is a mix of confusion and puffery.

The puffery is in the inflation of the supposed prestige of such titles (both in puny and large settings).

There's a level of strategy, planning, influence, and execution that is completely missing at the small level - it's not even a matter of scale.

Ignoring that when giving yourself a C-level title is what I take issue with.

You're saying exactly the same thing I'm saying.

Same role name, different situational implications.

Whether you're the mayor of a small village or of Paris or New York, you're still the mayor.

Seeing CEO on the business cards of single person companies always makes me chuckle.

Depending on what form your business takes and where you file, you may have to designate who the CEO or President is. Of course, you needn't always put that on a business card.

All these CFO, CTO, CEO, CMO, etc. titles make me laugh when it comes to small companies. These titles make sense when we talk about the likes of Google, MS, and Facebook. If it's a shop of 10 people, you are not a CTO; you are a tech lead at best, with a few less experienced developers. The two co-owners of the business I work for call themselves joint CEOs, with only one line of management separating them from the lowest-level employees. I could call myself Vice President or COO or something like that, but that'd be idiotic knowing that I only manage a team of 10 people. This obsession and vanity with titles is complete nonsense.

Not true. A person can manage maybe 10 other people max. Beyond that, you start delegating.

So really there's no practical difference between a company of 11 people and a company of 11000 people from a management perspective.

The technological/practical concerns you'll be dealing with are totally different, of course, but as far as 'people management' stuff goes, there's little difference.

I agree with the part about the number of people being managed; however, there's usually a huge difference in management structure. Take some standard big corp and its tech department: you'd have a developer, senior developer, tech lead, and some sort of team manager. The team manager reports to the person who runs the particular division or department. The head of the department reports to some VP, who eventually reports to the CTO. In a small company it's more like developer => CTO, in which case the "Chief" loses its purpose.

fancy titles are useful when you are dealing with Big Enterprise Clients. "CTO of TinyStartup" means you get taken slightly more seriously than "some rando building stuff".

In fact, it's not even with the people you're directly dealing with where this becomes useful (as, if you're selling to Big Enterprise Client, you likely have a network inside the company anyhow). It's when the person you have a relationship with has to convince his/her boss to approve an invoice from your little shop.

Have to agree with this. Ever since I got promoted, the way people (outside the company) talk to me has changed a lot. Even though a good part of my daily activities are the same, people tend to take me more seriously. It does help when chasing people to make sure they do things, or even with customers who want to talk to 'the manager', even though I'd help as much as any other colleague of mine...

Would you rather buy $100K of software from a "Senior Sales Representative" or a "VP of Sales, Western US region". Titles mean a lot to certain people, even if they are all puffery.

Personally I wouldn't care. The reason is that, regardless of the company, I'd go on LinkedIn just to see who the person is. The deciding criterion would not be the sales rep's title. The functionality of the product, the support, their chances of staying alive for the next few years, etc. are much more important. However, I appreciate that's not necessarily how a lot of people think.

In a two person company you don't have a CTO and a CEO, you have "The Code Guy/Gal" and "The Other One".

> they should be defining and leading the technical strategy of the company.

At most companies (small to medium and a large number of industries), that isn't a fulltime job. Most companies have IT as support or are a very small segment of the total business.

This makes me think of GrooveHQ, who just recently forced all their customers onto their new, rewritten-from-scratch product.

This after one full year of their customers begging them not to.

So now they have a new modern (in the 2017 sense) product which is worse in many ways and with a whole set of new bugs. And some very unhappy customers.

Improve your products. Don't replace them.


Personal attacks will get you banned here. Maybe you don't owe the person you're attacking better, but you owe the community better. Would you mind reviewing the site guidelines and taking the spirit of this site more to heart when posting here?


I can not comment unless I see the code(s).

And yet you did.

Did I really?
