As far as I know, everyone else still works there. I bet some of them sleep at their desks trying to keep adding more knots to that awful legacy system.
And it didn't get that way WHILE I was there. That project had been around since 1998. They were so afraid of change that they were still using a C++98 compiler. So I didn't even leave that job with relevant (read: modern) C++ knowledge.
Funny thing: The server architect at my current job was one of the founding members on that project. He had the same kind of view towards it. I feel like that project has been in a doom spiral from the beginning. So 21 years at this point.
It blows my mind that there are these legacy code sweatshops out there barely holding software together like this. What a miserable existence.
This happened to me too!
I understand the reasons for that: it was highly specialized (scientific) software used in computer chip manufacturing. Most of the people on the team had a PhD in Engineering or Science, but few had extensive Computer Science training before joining the company.
The hardware industry moves slowly, because you don't just upgrade a fab. Some of the clients were on RH5, so that was our build target.
Part of the codebase was in Fortran (and for good reasons, too; the language is common in the HPC/scientific scene).
All this resulted in a codebase that was part brilliant, part byzantine, poorly documented, and not even compiling with a modern C++ compiler (hence the reluctance to start using C++11 or newer).
Thankfully, the automated test suite was holding it all together; but that was about it. As people left, they took with them systems knowledge that was written down nowhere, and nobody wanted to do a deep dive and document the still-known parts.
Predictably, the priority shifted to "quality" - that is, fixing bugs instead of innovating on the core functionality.
That was not the only reason why I left for better opportunities, but it was a big factor. It wasn't the legacy system that was scary - no scarier than a hairy prototype that gets the job done, really - it's that nobody was going to put in the effort and take the risk to start moving on from a decades-old "prototype" stage.
I believe that things finally started moving when the people there realized that there is no way but forward; I hope they are using modern C++ now. But that train was set in motion after I left.
The definition of "legacy software" in this discussion is: software that grew so many "temporary" fixes and workarounds instead of necessary architectural changes that it's in permanent maintenance mode. Entire parts of it are untouchable because nobody understands what they do anymore: complexity grew exponentially with the abundance of these hacks, exacerbated by system knowledge evaporating with engineer turnover.
That applies neither to the C++ language and its compilers (rapidly evolving in the past decade), nor, generally, to projects written in it (that's not a property of the language).
On that note, we've integrated some Fortran code written in the 80s that I wouldn't call "legacy" under this definition: the algorithm was clear, the implementation was documented well enough that modifying it, if necessary, would not have been hard, and using it with our flow was very simple (it was one of the flavors of gradient descent algorithms that converged much better than several others).
Pretty much this. What else am I supposed to think when I read this?
We aren't on the ARPANet anymore - I expect even low-level programming languages to use native Unicode. The way C equates "char" with both character and byte is fundamentally broken.
(This is also - or more? - an issue with Unicode. IMHO we should have increased the byte size to something like 32 bits (during the transition to 64-bit words?) to be able to fit one character per byte - the increased hardware cost for text storage would have been quickly compensated by the decrease in developer costs. But here we are.)
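To make the complaint concrete, here's a minimal C++ sketch of the char/byte conflation: the standard library counts char units (i.e. bytes), not characters, as soon as non-ASCII text shows up.

```cpp
#include <cstdio>
#include <cstring>

int main() {
    // "héllo" encoded as UTF-8: 'é' is two bytes (0xC3 0xA9), so the
    // byte count and the character count disagree.
    const char *s = "h\xc3\xa9llo";
    std::printf("%zu\n", std::strlen(s)); // prints 6 (bytes), though there are 5 characters
}
```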
Careful with that axe, Eugene.
Well, the issue is that they did it a bit sneakily: they removed all the legacy code they didn't understand. So the code is much more elegant, it has been moved to C++11 or 14, it ticks every good-practice box. There is only one slight issue: it does not work. It somewhat works, but it is not reliable and fails regularly in unexpected ways. And they started 5 years ago, and haven't delivered any business value since then.
At the beginning it was okay because they had some leeway, but now they are blocking the release of new products, and our market share is in free fall.
Heads have started to roll.
To be fair, a few years down the line their team will likely be more productive and efficient, but I am still not sure that the cost of the rewrite was justified. Still, the article is very on point about the risk of not paying down your technical debt.
This is exactly why rewrites or huge refactors typically fail. The new programmer sees code, doesn't understand why it's there, concludes the last programmer was an idiot, and deletes said code. Unknown to the new programmer, that code handles some weird edge case.
A rewrite should spend 80% of the time understanding the old code and 20% writing the new. But that's no fun for most programmers, who just want to code in the latest shiny, so the new code ends up broken.
"Chesterton's fence is the principle that reforms should not be made until the reasoning behind the existing state of affairs is understood. The quotation is from G. K. Chesterton's 1929 book The Thing, in the chapter entitled "The Drift from Domesticity":
In the matter of reforming things, as distinct from deforming them, there is one plain and simple principle; a principle which will probably be called a paradox. There exists in such a case a certain institution or law; let us say, for the sake of simplicity, a fence or gate erected across a road. The more modern type of reformer goes gaily up to it and says, "I don't see the use of this; let us clear it away." To which the more intelligent type of reformer will do well to answer: "If you don't see the use of it, I certainly won't let you clear it away. Go away and think. Then, when you can come back and tell me that you do see the use of it, I may allow you to destroy it."
The lights stayed on the whole time. We retained the ability to switch back to the old system by setting a feature flag until everyone (including the business users) was confident that the new system was working as well as - preferably better than - the old one.
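For illustration, a minimal sketch of that feature-flag kill switch (all names here are hypothetical; the flag could just as well live in a config service or database):

```cpp
#include <cstdlib>
#include <iostream>
#include <string>

// Old and new implementations, both kept deployed behind the flag.
std::string process_order_legacy(const std::string &order) { return "legacy:" + order; }
std::string process_order_v2(const std::string &order) { return "v2:" + order; }

std::string process_order(const std::string &order) {
    // The flag is read at runtime, so flipping it back requires no redeploy.
    const char *flag = std::getenv("USE_LEGACY_ORDER_PATH");
    if (flag && std::string(flag) == "1") return process_order_legacy(order);
    return process_order_v2(order);
}

int main() { std::cout << process_order("#42") << "\n"; }
```

Once nobody has flipped the flag back for a few release cycles, the legacy path (and the flag itself) can be deleted.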
Throwing out code we didn't understand was simply unthinkable, because we weren't allowed that 6 months of fantasizing that everything would be great before launch day - whatever you were working on now would be expected to go into production within a week or two, without upsetting the rest of the business in the process.
No, it wasn't very fun. But it was satisfying work. I think it was my boss's boss who observed that making all the team members happy won't make a project successful, but everyone's ultimately happy to see a project successfully executed.
As an avid writer of comments, I am convinced they also help me to form thoughts and remember them later, so at times I will write the comments before writing the actual function.
If that is too bold: commenting while the thing is still in your head makes sense anyway - it saves you time later and helps everybody else who will look at your code.
A short summary, preferably including a hint at which subsystem is impacted by the change.
Then explain in detail the context of the change: root cause of a bug and the gist of the solution, use case(s) behind a new feature and how it can be used, ...
Yes, often there's a bug/project tracking tool being used, and the commit message contains a reference to the relevant entry there. But from experience I know these tools tend to change: the old one gets decommissioned, data gets migrated, what was once the primary identifier is now a mere field or comment in the new system, access rights get messed up, ... Trying to understand the history then turns into an archeological expedition through various eras long gone... unless the commit messages are sufficiently self-contained.
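To make that concrete, here's a made-up example of the shape described above (subsystem prefix, root cause, solution, and the tracker reference quoted inline rather than merely linked; all names and IDs are invented):

```text
netio: retry DNS lookups that fail with EAI_AGAIN

Customers behind flaky resolvers saw sporadic "host not found"
errors at startup. Root cause: getaddrinfo() reports transient
failures as EAI_AGAIN, and we treated them as fatal. Fix: retry
up to 3 times with exponential backoff before giving up.

Tracker: PROJ-1234 ("Startup fails intermittently on site X") -
summary quoted here in case the tracker doesn't outlive this repo.
```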
This would have saved quite a bit of headache at my last job actually.
In general, I've found value in figuring out how to improve existing systems where feasible rather than trying to migrate to a new system, since the existing system probably has advantages the new one won't. At minimum, people are already familiar with the existing system.
That, I think, is the right way.
When you're at Microsoft and can just walk up to the best researchers and programmers in the world, maybe. When you're at some corporation where you have to spend half a day on the phone to get your computer unlocked by desktop support, and a request to change a config on a web server becomes a ten-foot-long email chain about whose fault it is that we need this change, I don't think people have any motivation to modernize piece by piece.
Then there is the issue that you'll have to explain why part of the application is in .NET Core and part is in .NET Framework 3.5...
Maybe this? https://blogs.msdn.microsoft.com/rick_schaut/2004/02/26/mac-...
Couldn't find any other reference, nice reading!
And I say that as someone who has committed a few, in both senses of the word.
The point is to be deterministic, and it can only be deterministic if it is small.
Except none of the contractors agree on what materials to use. So one section is steel, another is wood, and yet another is brick. Meanwhile there is a 3rd party outside attempting to load the whole place onto a truck and ship it somewhere else.
And then someone else comes along and asks, "This does fly, doesn't it?"
'a series of small behavior-preserving transformations, each of which "too small to be worth doing"' https://martinfowler.com/books/refactoring.html
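For illustration (a toy example of my own, not one from the book), a single such transformation (Extract Function) might look like this:

```cpp
#include <cassert>

// Before: the discount rule is buried in an anonymous boolean expression.
double charge_before(double base, int days) {
    return (days > 30 && base > 100.0) ? base * 0.9 : base;
}

// After one "too small to be worth doing" step: behavior is identical,
// but the rule now has a name the next reader can trust.
bool qualifies_for_discount(double base, int days) {
    return days > 30 && base > 100.0;
}

double charge_after(double base, int days) {
    return qualifies_for_discount(base, days) ? base * 0.9 : base;
}

int main() {
    // Behavior-preserving: the two versions agree on every input we try.
    assert(charge_before(200.0, 31) == charge_after(200.0, 31));
    assert(charge_before(50.0, 31) == charge_after(50.0, 31));
    assert(charge_before(200.0, 10) == charge_after(200.0, 10));
}
```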
Given the very first example from TFA, the problem is solved (i.e. the new requirement is satisfied) by only adding a few lines of code. In fact, a massive refactoring was not required. I would suggest that this is actually A Good Thing(tm) and may even be indicative of a Good Design(tm). If every new requirement requires the system to be overhauled, you're definitely in a worse situation.
The lack of tests/testing is a wholly separate issue.
If the new requirement comes with new tests - better yet, tests for both the old behavior and the new - all the better: you can refactor the system at a later time to make it cleaner and simpler while still meeting the requirements, since you now have a fuller picture of what the actual requirements are.
Refactoring in the face of every new requirement smacks of poor initial design.
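Concretely, pinning the old and the new behavior as described above might look like this toy sketch (the function name and pricing rules are invented; any test framework would do):

```cpp
#include <cassert>

// Toy stand-in for an existing function; the rules here are made up.
int shipping_cost(int weight_kg, bool express) {
    int base = (weight_kg <= 1) ? 5 : 2 * weight_kg;
    return express ? 2 * base : base; // the newly added requirement
}

int main() {
    // Pin the old behavior as-is, so a later refactor can't silently change it.
    assert(shipping_cost(1, false) == 5);
    assert(shipping_cost(10, false) == 20);
    // And pin the new requirement: express shipping doubles the cost.
    assert(shipping_cost(1, true) == 10);
}
```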
I really, really hated YAGNI about ten years ago but have come around a bit. Our users are empirically insane. You cannot guess what an insane person will do next, and trying will only make you insane, too.
I’ve gotten a lot of good mileage and a lot less stress by following the Rule of Three (architecting on the third example). I’ve learned to spot bullshitters versus sincere YAGNI folks (there are many more of the former). The critical factor is identifying which decisions are reversible and not investing much energy in them, doing everything you can to delay irreversible ones, and, failing that, making people pay attention to what they are choosing.
This is true.
The idea that a system is well-designed if it allows changes/unexpected new requirements via small, mostly-additive, easily testable changes is also true.
But it is also true that systems remain well-designed until they aren't any more, because too many things have been changed/added. And identifying when that watershed has occurred (or, for extra seniority points, identifying in advance when it is likely to occur) is critical to good engineering over the long term. The point at which "you can refactor the system at a later time" becomes "that used to be the case, but now we need to actually pay down the debt, the tradeoffs have gotten too bad" is the most valuable to identify.
Isn't that the main issue? I understand not wanting to make invasive changes when you're not familiar enough with a system, but the solution seems to be:
1. get familiar with the system (doing small non-intrusive changes is a good way to start to become familiar with it)
2. do more invasive refactoring
For sure, don't start directly with the invasive changes, but at some point you need to get a good understanding of the system you are working on, or are responsible for maintaining.
If you require "full understanding of our system" for me to add some new functionality in a module, then chances are pretty good that your system has a bunch of problematic dependencies, no?
> If you require "full understanding of our system"
I wrote "familiar enough", and "good understanding", not "full understanding". "enough" and "good" will of course depends on the context.
> then chances are pretty good that your system has a bunch of problematic dependencies, no?
In practice you need to understand some level of the context in which your module exists, hopefully not all of it, though of course it would be better to be able to just focus on the module itself. By "system" I don't necessarily mean the whole, complete infrastructure; it can be the module.
My point was that if people are blocking changes because of lack of understanding of something, the solution would be to actually get some level of understanding.
Edit: also, I assume good faith from gate keepers.
Interestingly, the whole situation was almost exactly what he describes here, and a full year later I had failed to correct it (and probably made it worse). Here's what I did, so that others may learn from it:
* Since the team had stalled on that project for two years, I pretty much sidelined it and attempted to get us in a habit of getting wins so that we'd feel comfortable with changes. Result: We just accelerated the muddying of the codebase. The team wasn't constantly upset anymore but the size of the improvements you could make was decreasing - a sign that you're in the doom spiral.
* Set aside time for us to perform the refactors, but allowed the team to identify the primary pain points and focus on them. Result: The muddied codebase was like an underwater cave; the silt had settled in established places, and the attempted refactors just stirred up more mud. With two years of living with an on-call rotation, the team's primary focus was on removing the things that caused them direct pain, not their underlying causes, because they felt the time we had was insufficient to do anything meaningful.
* I championed automated testing, reproducible builds, and atomic changes, but failed to really make the argument: the net takeaway was "doing it that way would be great, but it would be too slow". It didn't really take; perhaps I should have enforced that approach rather than trying to win people over by argument. We got somewhere with the testing, but it was a big effort, and it cost me a little bit on the culture side since it had to come top-down from me.
* I tried 'leading from the trenches', so to speak, but that was unsustainable. Things worked while I was there, working as an engineer along with everyone else, but I had to sacrifice other crucial things that were necessary to sustain that state.
In the end, things got somewhat better in terms of attrition and big system failures, but the doom spiral was still there. So it was like adding x years until failure rather than putting us on the path to success. I'd like to think that now, with some distance, I have a much better picture of what to do.
To be honest, I really don't think that adding a slew of manual testers is really the solution. The iteration speed will plummet because now there is one more hinge to the arm controlling output.
The annoying bit about the whole thing? None of that is useful to me now. When you're trying to build a startup you wish you had problems like this because it means you are already successful. Ugh. Maybe some day it will be of use.
OK, what? Please share! You told us what didn't work; in retrospect, with distance, what would you have tried differently?
Believe me when I say the day you switch the old system off will never come in the vast majority of scenarios like this. The result is either wasted development time or, probably worse, you now have two systems to maintain and keep running.
We recently moved from a heap of Matlab code - started as a student project in 2001 and grown into a huge tool used in the industry today - to a new implementation in C++ and Python. It has been a huge success with our customers.
The reason why people warn against rewriting is that it's a risk, a gamble, and often a conceit by the programmers. Programmers will also often spectacularly underestimate how hard a full rewrite will actually be.
You're taking something that works, and attempting to recreate it. You can find lots of examples where rewrite projects went spectacularly wrong. A commonly cited example was the Netscape rewrite (which killed a hugely successful company).
Your gamble paid off, but it's almost always the worst decision you can make. There are even examples in this thread of rewriting gone wrong.
If you have reason to think you can do better this time - e.g. the team has learned how to avoid unexplainable crashes - then you could apply this knowledge to fix the issues in the current code, which would be much less risk and take less time.
For us, the first project acted as requirements analysis. In most cases, bad software is mainly the result of a lack of proper requirements. In hindsight, it's easier to make a complex tool coherent.