This is a great article, I can relate. I specialise in replacing large parts of codebases with code that does the same thing from a business point of view but makes future changes cheaper to make. One thing I think is worth mentioning is the political aspect of this sort of work. The people in power need to be comfortable with the fact that you will be introducing risk without immediate reward. That is a tough sell to someone who is used to putting out fires and writing root cause analysis reports to management. Sometimes this can't be done and you have to hide your refactor in real business change work. This is not fun because it usually makes it look like you're a slow dev.
In addition, most developers are fiercely defensive of their code and you need to be aware of that when you choose to replace it. A trick I find useful is publicly declaring, in your team meeting, how useful you found member X's tests in covering your refactor. Or their comments or documentation or domain knowledge. When you are picking their brains for implementation specifics, try to sympathise with them when you see a bit of hacky or confusing code. Say "I've had to do something like that before because of xyz". It will save face and you will get more out of the developer. Never criticise; they will know what they have done wrong without you telling them. Just be nice.
If you are looking for devs that can do this sort of work effectively then get them to read code in an interview and explain what it does to you. They can even offer suggestions and you get to see how they deliver criticism. Not complex algorithmic code but simple vast swaths of business junk.
That's basically been my job the last five years. As soon as I joined the company (a scrappy startup of 30 people), I started refactoring large portions of mission critical code prioritized mostly by how terrified other coworkers were of touching it. In the beginning, folk were rather skeptical, but now that the company has grown by an order of magnitude, my earlier work has apparently become a topic of folklore in other parts of the company. As it usually goes, I spend most of my time these days mentoring and in meetings, but I still try to find time to refactor more fragile bits of code before they fall over completely. I encourage my team to tackle technical debt head on rather than work around it whenever possible. Far too often, folk spend more time avoiding solving a problem by patching around it, and it's usually because they're too afraid to dive in and change code that's hard to understand and therefore scary. For whatever reason, I've always had a can-do attitude when it comes to that type of work. You're paying me to get the job done, so give me an impact driver and the biggest hammer you've got. If I break something, it usually means it wasn't built strong enough to begin with. I've broken a lot of stuff over the years...
> If I break something, it usually means it wasn't built strong enough to begin with. I've broken a lot of stuff over the years...
Based on my experience, this statement scares me :)
Not to pass any judgment on your impact or abilities, but the types of devs who have been the most challenging for me to work with are those with this attitude who aren't quite as good as they think they are. It can be incredibly toxic to the rest of the dev team and generally bad for business.
You need to have a very strong handle on both the business side and the tech side to do this type of work effectively. Meaning: no matter how much technical debt there may be, some stuff cannot afford to be broken. Judging risk there is quite challenging, as you need a holistic view. I would strongly caution people against diving in and making sweeping changes if they don't have this.
The other internal flag that went off is refactors that improve parts of the codebase in isolation while leaving a less cohesive/congruent codebase as a whole. This is often worse in the long run than just patching it, and can actually make changes harder.
Disclaimer: I am in mostly a management role now so you can take the above with an appropriately sized grain of salt.
Certainly you want a refactoring effort to improve reliability and maintainability rather than harm them. I am a strong proponent of writing an "architecture document" before touching code to do anything but patch a straightforward bug, and soliciting feedback on it long before code review. This is precisely what develops that holistic view you mention. One of the first things I tackled in this codebase was to introduce abstractions to enable unit testing of code that was previously considered not unit testable. As the team has grown, we've developed processes to ensure that everyone explicitly considers risks and how to mitigate them whenever they make a code change.
I also agree with you that it's best to be pragmatic when it comes to developing software for a business. Code that's ugly but works is perfectly fine. When it no longer works one day, patching it to keep the lights on is the right course of action. When the same ugly code breaks over and over, though, it's time to solve the root of the problem. Sometimes there's inherent risk in doing that, and things break; it's necessary to do it for the long term good, though.
I try to write code that doesn't need to be touched again, but is pleasant enough to dive back into should you inevitably need to extend or debug it. I also try to reuse existing code and improve it as needed rather than create what I call "parallel codebases". I try to mentor my coworkers to do the same. If achieved, then it's a huge productivity multiplier.
I think I'm pretty easy to work with. I am confident in my abilities as a software engineer, but I'm also relatively modest. I try to respect work that was done before me and carry the good parts forward if it ends up needing refactoring. I prefer to let less experienced coworkers tackle problems similar to problems I've solved in the past while providing mentorship, so that they can learn similar lessons. I've avoided management because I know I'm bad at it, but I try to support management however best I can. I also throw the occasional team homemade pizza party when there aren't pandemics. Notably, I also tend to be able to work with the stereotypical difficult-to-work-with devs that you mention. My coworkers generally seem to say nice things about me to my face and behind my back, and upper management seems to reflect their appreciation financially. Honestly my biggest interpersonal problem at work right now is that newer employees seem to hesitate reaching out to me for fear of wasting my time. Therefore I try to make it known that I spend as much time staring at the wall as possible during work hours.
Given time, every developer will end up in some company, doing projects, having to go through vast requirements, documentation, databases, interfaces and codebases to understand them.
So I think you are referring to people who don't do projects but stay in one company for a long time, versus people who do projects (and are sometimes called consultants), and in general how they communicate with each other.
Yes, absolutely never criticize - as a manager, the #1 thing that makes me start to hate a report is when they complain about other people's work. Most of the time they don't understand why code was written the way it was (Chesterton's fence), and even when they do and are making valid complaints it's just a dick move that doesn't help.
Trust me, the lead and manager both know when someone sucks, they don't need to hear more about it. And if you're wrong with your criticisms, you just demonstrate that you suck. Literally lose lose.
This is one of the big reasons I prefer static typing.
When looking at some unfamiliar code in an unfamiliar codebase, I can reason about the code much faster when I can see what functions return, and quickly go check their types out if the type is unknown. This makes me much more productive.
I helped maintain a 250kLOC Python program. I came in when it was already over 200kLOC. I spent so much time, every time, just trying to figure out what was going on, because I never knew what something returned.
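To make that concrete, a minimal sketch (all names hypothetical): with the unannotated version you have to read the body, or run it, to learn what comes back; the annotated version answers the question at the call site.

    from dataclasses import dataclass

    @dataclass
    class Order:
        order_id: str
        version: int

    def load_order(raw):  # what does this return? go read the body...
        return Order(order_id=raw["id"], version=int(raw["version"]))

    def load_order_typed(raw: dict[str, str]) -> Order:  # answered up front
        return Order(order_id=raw["id"], version=int(raw["version"]))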
Static typing is great until people ignore it. They pass around Map<String, String>, use magic strings for the keys, and stuff in Integers, Enum::toString, and UUIDs as values. And also building XML blobs by hand instead of serializing an object of a concrete class, and also deser’ing it manually on the other end (which receives it over SOAP no less).
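For what it's worth, the same anti-pattern translates directly into Python type hints; in this sketch (names hypothetical) the annotation is technically honest but tells the reader nothing, because the real schema lives in magic string keys:

    import uuid
    from enum import Enum

    class Status(Enum):
        ACTIVE = 1

    # stringly-typed: every value crammed into str, keys known only by convention
    def build_payload(user_id: uuid.UUID, status: Status, retries: int) -> dict[str, str]:
        return {
            "uid": str(user_id),        # actually a UUID
            "status": status.name,      # actually an enum
            "retries": str(retries),    # actually an int
        }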
What little quality research there is on programming productivity does support the idea that dynamic typing is a productivity hindrance as codebases get larger, for exactly this reason. Some studies actually have video data of the programmers at work, and you can see them having to hop around to function definitions more often to figure out what they are supposed to pass in, etc.
I don't understand why people say this about Java and C# vs. Python. You can easily write concise code. There will be more lines because of things like brackets and some other simple things, but c'mon, it's not 2005.
Yep, indeed. Most of my head scratch moments with big code bases aren't about anything as technical as types. It's not as if knowing the types would have made me realize what the code actually does, which is the main hurdle.
In my experience, static types allow me to ignore a lot of code, which helps when dealing with large codebases.
Sure, they won't do magic with 1MLOC of spaghetti code, but in a reasonably sane codebase they make it much easier to know which parts to care about and which you can ignore for the time being. Again, that's my experience at least.
Not necessarily. It's possible to write quite terse code in Java, especially Java 10+ (var, streams, etc.). And now with records and pattern matching, even more so.
Lombok is the holy grail of Java boilerplate removal.
With proper usage of spring-data, spring-cloud-stream and other spring-* libraries, Lombok'd Java code can be very terse. If you follow conventions, repositories, REST clients, mappers etc. are often defined only by interfaces and annotations; the actual implementations are generated.
The downside is that the entrance to the full-blown spring-* world has a steep learning curve; there is a lot to read at spring.io.
This is a great article. I felt like it was describing a job I recently left, especially this piece:
> It’s fine to have less experienced people working on a large system as long as they have the elders overseeing their work. In the world where senior titles are handed left and right, that is often not the case and it’s how you end up with a very fragile system that is suitable for a replacement as soon as it was built
Then I got to the advice part of the article and had to laugh.
Read the documentation? What documentation. Not a single scrap existed.
Look at the tests? I'd love to, but they never wrote any.
Code comments? Nah.
Use the IDE for intellisense? Great idea except the database models are in a different project so the furthest you can get is the compiled definitions that were copied into this project.
The method that eventually kind of worked was "use the debugger for absolutely everything."
It was honestly one of the most miserable experiences I have ever had.
This is one of my biggest gripes. Someone (I think Uncle Bob) said that good code is self-documenting, which is BS in 95% of cases. Yeah, you don't need to document the convertMinsToSecs() method, but most real-life codebases are full of edge cases, shortcuts, temporary solutions and half-complete reorganizations. So people use this as an excuse to write no comments at all, whereas a few words of comments would save hours of investigative work for future developers working on the codebase.
> Someone (I think Uncle Bob) said that good code is self-documenting, which is BS in 95% of cases
Agreed. For example, comments about Why-do-this and Why-not-do-that can be necessary, even if the code shows what happens.
Imagine you're in a taxi, and it suddenly takes a wrong turn, now heading towards Surprise-City instead. Then you know what is happening: you're going to Surprise-City.
But wouldn't you also want to know Why?
So then it's nice if the taxi driver explains Why: "I'm buying milk for my kitten."
I think the 'Linux kernel coding style' explains comments pretty well:
> my priority for comments is that they should answer "why?" and "why not?" questions. Why does the method/function do it this way? Why didn't it choose that other, perhaps more obvious route?
That's not necessary in every case. But it's true in a good number of them. The code alone can never tell you that - but it's often invaluable during evolution/refactoring.
If it's complicated enough that I'm only understanding it because of the context of the last week, we need the comments. Anything that can speed the reverse engineering in 6 months when it breaks is helpful, because then you can quickly decide that we got different input or if we missed an edge case or whatever.
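A contrived Python sketch of the distinction (all names hypothetical): good naming can carry the what, but only a comment carries the why.

    ORDER_IDS = {"c1": [1, 2]}
    ORDERS = {1: "order-1", 2: "order-2"}

    def list_orders(customer_id: str) -> list[str]:
        # Why two lookups instead of one joined query? The orders live in a
        # legacy shard we can't join across yet; revisit after the
        # (hypothetical) migration lands.
        ids = ORDER_IDS.get(customer_id, [])
        return [ORDERS[i] for i in ids]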
While I agree to some extent, the problem with comments is that they need to be maintained in order to stay helpful: code comments updated when the code changes, general comments when the context changes, etc. This is work in itself: the developer has to remember to do it, and the reviewer has to remember to look for it. In my experience, people tend to forget or just don't bother, which means that someone else ends up with a contradictory, outdated, confusing comment further down the road.
Somewhat surprisingly, it depends. For instance, in our case application support is responsible for/interested in the documentation, and they make sure that developers create/update the relevant documentation for each of the release items. Stale comments, on the other hand, need to be identified and addressed in code reviews, and in practice we're not always good at that.
So the advice is bad, as most people (including myself, no doubt) will write mediocre code, just by the shape of the distribution (assuming it's normally distributed, which is a strong assumption, but without data it's probably reasonable).
Advice that relies on people caring about their craft/having the skills to do the work well doesn't scale, so it's bad advice where those things aren't true.
You can come a long way by avoiding direct calls in "if" clauses and the like, using descriptive variable names, and not trying to be clever for the sake of being clever.
For example, instead of

    if (order.version > 1) ...

assign it to a descriptive variable:

    bool orderHasChanged = (order.version > 1);
    if (orderHasChanged) ...
IMO this makes it much faster to read and understand, because it says something about the intent. It can also be easier to spot bugs.
It's a bit more to write, but I find it makes a big difference when coming back to the code later on, and typing is usually not the limiting factor when writing code.
I think Uncle Bob got cancelled /s, but politics aside, I don't think he was necessarily wrong about good code documenting itself; it came with a lot of direction about what exactly constitutes good code. If people don't bother to hone the skills of good types and methods, well named and with clear responsibilities, of course they're not clearing the bar to drop the comments. Having grown more senior, I believe I have gotten a little better at expressing meaning and intent through the code itself, and I'm surely writing a lot fewer comments because of it, which seems a win overall.
> Use the IDE for intellisense? Great idea except the database models are in a different project...
If for whatever reason you can't link the project properly, as a tip for the future, I've ended up just making a separate copy of the project which included a reference to the uncompiled project.
And in the worst instance I actually used a project reverse-generated from the binary. I gradually refactored that auto-generated code into more readable code too!
I actually tried to make a separate copy of the project once but the project had to be in a specific folder on the hard drive. Trying to put it somewhere else broke so many hardcoded absolute paths that I gave up fixing them all.
I'm currently in exactly that situation. No comments on tiny workarounds, no high-level docs on any feature, any kind of code criticism interpreted as a personal attack. I've worked on teams where I could easily add 1000+ lines of well-tested, incremental code a week, but here I can barely reach 200.
Although it's not just the programmers' fault: the product team keeps pushing for changes and never allows any time for code simplification.
The advice about using both grep /and/ the IDE is very good. They are often framed as being in opposition to each other, but in reality they're just tools. IDEs are great when they work, but it's entirely possible to confuse them.
I keep hearing that I should get a better IDE, especially from Java developers, who seem to have nicer IDEs than us C++ schmucks, but even the best IDE will not save you when your program is really an interpreter for some ad-hoc, unspecified dynamic language implemented on top of YAML or XML.
I highly recommend having editor shortcuts for ripgrep, fd and clangd. Also remember you can use the .rgignore file.
Visual Studio Code is nice in this respect: when you use the terminal window, the output is parsed and you can ctrl+click on a filename or (filename:line) to jump to it.
Speaking of IDEs: I work in video games development, on a huge codebase that's over a decade old with heavily templated C++ code. I switched off the IDE "suggestions" a long time ago; Visual Studio is wrong about incorrect/missing code like 90% of the time. Just hit compile and read the errors. I have files that VS shows as nearly entirely wrong, squiggly lines everywhere, and yet they compile and link fine. And the opposite, where VS doesn't see any issue at all but they don't build. Or they build fine using MSVC but not in Clang, or vice versa, and VS has no idea.
Yeah, Visual Studio is absolute and complete garbage with C++ code; it constantly identifies correct code as having errors, and it's not just a "big project" thing, it happens in very small projects, even "projects" that have a single file. I really don't get it... I also don't understand why IntelliSense doesn't update itself with the results from the compiler.
If you get the chance, when you encounter something like this that is reproducible (or at least seems obvious what's going on), you can use the Report a Problem tool and capture as many relevant diagnostics as possible. I don't work on the C++ tools team, but generally the folks working on VS are highly interested in getting detailed bug reports.
> In 1980’s Tim Berners-Lee realized that the documents are hard to find at CERN, so he started imagining a system of interconnected documents that would supposedly solve this thorny problem for good. Nowadays we know this invention as the internet.
> Despite 40 years of improvements and the internet becoming a part of our daily life, we still face the same problems. You can talk to another person half way across the world while watching a funny cat videos, but somehow we still struggle with finding the important project documents
TIL. But I think the author meant the Web[0]. IIRC the internet originates from ARPA.
The article mentions the importance of comments and documentation inline in code. I tend to agree: well-written code is great and all, but a good comment can bring in context external to the code and make _why_ the code is what it is clearer to future readers. And reviewers. Comments explaining _what_ code does largely aren't needed; that's evident from usage. But the _why_? Some people would say code which can't be explained by one-liner comments is "too clever". Well, I'm inclined to disagree: it's hard to fit a full historical justification for the awkward handling of an edge case into a single line. I once wrote a 17-line comment above an 8-line diff; that much context felt justified to explain the odd code. A reviewer, hilariously, had this to say:
> This right here is "here be dragons" commenting level Double Dragon.
When I come back, long in the future, to code I've written, I think I'll be happier to have the detailed "commenting level Double Dragon" comments over the more ambiguous yet traditional `// HERE BE DRAGONS`. Mostly because those comments will give me the context I need to know whether any of that _why_ has changed, and thus in what way it's likely safe to change the commented code.
To me, there is also the question of change frequency. Of course it's a guesstimate all the time, but things that look to be rarely changing (and usually end up being high-impact) deserve over-documentation. Something like our AWS VPC, DNS and DHCP setups, their expected change modes and their effects are very well documented. It rarely changes, and if it breaks, everything breaks. The fundamentals of our disk setups and disk encryption in Ansible are carefully documented because if that breaks, things will get hairy.
In those cases, having a lot of documentation speeds up changes, because you can store months and years of deliberation and decisions in these comments.
I really connected with the writing. Thank you for taking that time.
There is a lot in my current context that the writing resonates with. Nice to find that others have been on this path too, and that we benefit from a lot of the same techniques.
Thanks for putting this out there as an invitation to draw people together.
An interesting way to approach the documentation issue discussed in this article is 'wiki bankruptcy': when a wiki goes stale, simply tell all devs to save what they think is important before deleting the whole thing outright. Then, they can recreate those pages into a new wiki. Read more about it here:
Over the years I have 'bankrupted' several supporting systems, some more than once. I've deleted shit like
- old tickets
- documentation / wikis
- old infrastructure
- old backlogs
I'm actually going through this process now with my current team. There's so much stuff written by our predecessors that is just no longer relevant. So, I've set up or renamed our Jira/Confluence spaces and am moving/copying back in only the content that is still relevant to us. Everything else will be archived.
In this way, everything which comes out the end of this process:
- is ours
- has recently been seen/reviewed by at least one pair of eyes
- is still relevant to the business and the product
I navigate codebases by cat'ing all the files together (prefixed with filename) and piping it into vim.
I was shocked how much I learned from this seemingly-horrible technique. For example, the Python files that actually get deployed are often quite different from the ones in source control. For TensorFlow, at least.
I regularly read 1M+ lines of code this way. Not an exaggeration; vim scales, nothing else does.
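Roughly, the technique amounts to something like this (a sketch; the author's actual `merge` and `ft` helpers aren't shown), piped into vim with `python cat_tree.py | vim -`:

    # cat_tree.py: concatenate every .py file under the current directory,
    # each prefixed with a filename banner you can search for in vim
    import pathlib
    import sys

    for path in sorted(pathlib.Path(".").rglob("*.py")):
        sys.stdout.write(f"\n===== {path} =====\n")
        sys.stdout.write(path.read_text(errors="replace"))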
If you read/scan 10 lines a second, you still need over 24 hours non-stop to read a 1M+ code base.
I doubt a random file ordering is helpful! Especially if you lose code navigation features like "go to definition".
Suppose I want to "go to definition" for a class named Saver.
/^class Saver\>
19 times out of 20, this works. It's also instant; my vim will likely get me there faster than your IDE's go to definition functionality. (Looking at you, pycharm!)
Here's my flow.
>>> import tensorflow as tf
>>> tf.train.Saver
<class 'tensorflow.python.training.saver.Saver'>
>>> from tensorflow.python.training import saver
>>> saver
<module 'tensorflow.python.training.saver' from '/usr/local/lib/python3.7/site-packages/tensorflow_core/python/training/saver.py'>
Then I open /usr/local/lib/python3.7/site-packages/tensorflow_core/python/training/saver.py.
Suppose I want to know: Where are all the places that Saver is used in all of tensorflow?
time find . -type f -name '*.py' | xargs merge | ft py
Vim: Reading from stdin...
real 0m0.930s
user 0m0.243s
sys 0m0.414s
/\<Saver(
:%v//d
I did that by highlighting "being added to the GLOBAL_VARIABLES", ctrl-c, then pressing "u" to undo the :%v//d, then / followed by ctrl-v.
That might sound hard, but with muscle memory I don't even think about it -- it's like explaining how you open a can of food. Do you really think about where you place your fingers, or the pressure of your nail on the flap of the can? No, you just open it up. Same thing here; it's automatic.
Way faster than IDEs, and I get just as much (or more) info.
I'd love to use pycharm, but the slowness keeps pushing me back to this technique.
How do you handle finding occurrences of a specific variable with the same name as another one? Simple text search has the power equivalence of a dull butter knife.
Nothing wrong with this (big fan of vim myself). I can see the appeal and simplicity.
I just want to add that in Intellij you can do Ctrl + Shift + R "Saver" and it will search in all files (as dumb text matches, not usages), plus optional checks on extensions, etc. This is pretty fast too, and quite convenient, since it has a preview for each file.
Not saying it is better, but it is an alternative.
I'm happy you mentioned that, because it highlights a very important difference: your IDE would show you all "Saver" matches in some checkout of TensorFlow, right? But you usually don't want to see the latest version of TensorFlow. You want to see the current version that's installed, which is somewhere under /usr/lib.
I haven't found any IDE that can easily and effortlessly do that.
I agree that things like "Go To Definition" can be pretty bad (especially when a codebase has code that is auto-generated, but you haven't figured out how yet).
But I'm curious, what benefits do you see in your approach over just doing a grep in the folder (or e.g. Ctrl+Shift+F "Find in Files" in Pycharm)?
grep type approaches are really good when part of your application is generated as SQL strings to make tables (which is a pathology sadly common in most data science codebases).
I do something similar but with tags and without the merging. Vim's tag navigation is very powerful: split a window with the target tag and see, side by side, the class and its descendant, for example.
I use `venv`s for every project inside the project folder, so uctags also generates tags for all installed libraries and I can "drill all the way up" to classes and definitions.
It's also possible to have the system libraries show up in the tag database; it's just a matter of telling uctags which paths and files to include/exclude, or alternatively using another tag file for that. Vim can use multiple tag files.
The real pain point is keeping the tag file updated. gutentags makes this a bit easier for me.
> I use `venv`s for every project inside the project folder, so uctags also generates tags for all installed libraries and I can "drill all the way up" to classes and definitions.
> It's also possible to have the system libraries show up in the tag database; it's just a matter of telling uctags which paths and files to include/exclude, or alternatively using another tag file for that. Vim can use multiple tag files.
Oh?
Yours is the first system I've found that has this very important feature; the whole reason I do it my way is that I can drill down into the actual installed libraries, whereas IDEs almost always fail. (It's hit or miss. Yeah, theoretically you can configure the IDE properly if you spend your life becoming an IDE master and have a 96-core workstation, but it never seems to "just work.")
If you ever do a writeup of how precisely you've set up your environment, do ping me! I'm https://twitter.com/theshawwn. I'd be very interested and would happily retweet it.
I use PyCharm, and as long as I set the virtualenv I'm using as the Project Interpreter in Preferences, it lets me "drill down" to the library code as well.
Yep, I didn't see why the author decided to snipe at Emacs and Vim at the end of the piece. Maybe 4 or 5 years ago, that criticism of Emacs would have been valid, but packages like flycheck and lsp-mode have brought Emacs at least to parity with many IDEs. In terms of Git alone, Magit has catapulted Emacs past any other editor, IDE, or Git GUI client.
- Areas vs perimeters: perimeters are linear, areas quadratic. This is why you really, really want tests. Tests black-box a component and test it from the perimeter, basically the external API (a minimal sketch follows this list). Only once something needs to be changed do you need someone who understands the insides of that component. But the testing is kept small, and the error domain is kept small, so that you might have different people fixing different components.
- AvP, part 2: people's brains can index a lot, e.g. you know where the tests are and what the components are called, but they can't map that much. Your engineers will know what line to change for the parts they've mapped, but they'll have to spend time if they only have an index to where it might be.
- AvP, part 3: documentation can mean a map or an index. Rewriting the implementation in prose is bound to go wrong. The version-control method makes a lot of sense here: it connects locations to technical decisions.
- Visualising is to ensure you have held down the complexity. If the 2D box-and-line chart of your project is just a huge blob, you've done it wrong.
- You need to comment code, but try to keep it to one-liners. If you can't explain in one line what some snippet does, it's probably too clever. Also don't think that everyone will understand it just because you gave everything sensible names. Your code might be read by someone used to reading a different language. Or more importantly there's some domain specific reason why something needs to be done a certain way, and you don't want the next person to forget that.
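On the first bullet, a minimal black-box sketch in Python (`parse_order` and its contract are hypothetical): the test pins down only the perimeter, so the insides can be rewritten freely.

    def parse_order(raw: str) -> dict:
        # component under test; the implementation is free to change
        fields = dict(pair.split("=", 1) for pair in raw.split(";"))
        return {"id": fields["id"], "version": int(fields["version"])}

    def test_parse_order_perimeter():
        # exercises only the external API, never the internals
        order = parse_order("id=42;version=2")
        assert order == {"id": "42", "version": 2}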
It’s worth spending some time learning the grep or similar CLI tools that can quickly find the files containing the relevant keywords you are looking for.
Even though it is great advice, there is some sadness in the fact that it should even be advice. Apart from knowing how to write code, shouldn't the next most basic skill be knowing how to look up code?
Nitpick: it needn't be a CLI tool; a proper text editor or even an IDE lets you do the same thing. Many text editors also have pretty good indexing and will show matches on hovering the mouse. Never good enough to blindly trust, but usually faster, and it does a good job as a 'quick win' for the first attempt.
This is a skill I've noticed that many developers don't have, or don't have sufficiently. This lack manifests itself e.g. when I review a PR that removes feature XYZ. I do `rg xyz` and `fd xyz` to see if there's anything that was forgotten to be removed related to that feature. Very often there is.
> This is a skill I've noticed that many developers don't have, or don't have sufficiently
Yes, and I have trouble understanding how that is possible. Ok if you've never programmed and are just a beginner, but otherwise? Or does it depend on the kind of code? I assume this gets taught in programming / CS course, no? Or maybe not, and that is the problem?
I think there’s an argument that instructors’ time is better spent on other things, but, yeah, students should be exposed to this stuff somehow or other.
I love keeping Sublime Text installed even for iOS work for just this reason: the lack of (smart) autocomplete and a debugger makes it hard to use day-to-day, but the raw speed at which it can navigate code is just awesome.
We manage a codebase that is well over a million lines of code, and has a history dating back >5 years.
One of our answers to this problem is extreme amounts of standardization. We might have 1mm LOC in platform services alone, but it is spread across 50+ types and each looks almost identical. Everything uses the same persistence mechanism, migration technique, error handling, configuration provider, etc. Dependency injection + reflection + standardization (interfaces/abstract types) is where you can get into some really powerful leverage regarding keeping things organized and sane. Ultimately we have ~8 "flavors" of thing that developers usually need to worry about.
Our end game answer is to get away from the code altogether. We are starting to view code as glue between what would ideally be configuration-based implementations and the nasty real world which must be mutated in icky ways. So, instead of writing code for a module every time you need to implement it, make it once and in a generic way, have it take a configuration object, and then expose a web UI around configuring that thing. Then, all that code is reduced to JSON being passed around. When you are dealing with pure data, you can get away with the most ridiculous things. Cloning objects, versioning, validations, relational queries, et. al. becomes trivial. If you have 1 stable domain model throughout that is 3NF or better, you can use SQL to do basically everything.
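As a toy sketch of that idea (all names hypothetical): the module is written once, generically, and a customer implementation is reduced to data that can be cloned, versioned and validated like any other JSON.

    import json
    from dataclasses import dataclass

    @dataclass
    class ExportConfig:
        source_table: str
        columns: list[str]
        destination: str

    def run_export(cfg: ExportConfig) -> None:
        # one generic implementation instead of one hand-written module per customer
        print(f"exporting {cfg.columns} from {cfg.source_table} to {cfg.destination}")

    # bootstrapping the next customer is cloning a JSON contract, not writing code
    contract = '{"source_table": "orders", "columns": ["id"], "destination": "s3://bucket/out"}'
    run_export(ExportConfig(**json.loads(contract)))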
Edit: One more thing I would note is that a big part of why we are able to support this codebase is because we have adopted a sort of "hive mind" developer mindset, where everyone tries to role play this ideal of a developer who would best be suited for the task. We acknowledge that our codebase is not a place for much "fun" and the best analogy I could come up with is its like doing something in a nuclear power plant control room. You just gotta do it by the book every time, and then you get to go home to a safe and happy community. It's not like we employ volunteers.
Sounds like ideology lock-in. Let's hope you never get a problem which does not fit your current architecture well, or else you'll end up spending weeks or even months solving an otherwise trivial problem.
I've worked with "configuration-based implementations" and in my experience they are hard to work with (no debugging, incomplete documentation and implementation, little flexibility), require a staggering amount of infrastructure, are hard to test, and approach a programming language over time.
I agree with the concern, but we have had a very long time to refine our architecture. Some would call it an ideology lock-in, I would say we solved our problem domain in a deep and meaningful way and would prefer to stick with these proven approaches. Our entire codebase was rewritten approximately 4 times before we got to the point of being confident enough to push forward with a data-driven/configuration approach.
When you are writing the same business logic hundreds of times and only 10-20 discrete things are different between each implementation, it starts to make a hell of a lot of sense to expose those things as parameters to be configured. It's simple economies of scale at this point for us. Despite our small size, we are trying to get out of a "move fast & break things" startup mindset into a more stable "lets take this to 1k customers now" mindset (we provide a B2B application in a small market, so 1k is a huge target).
For us, our company doesn't become profitable until we can scale our operations by 5-10x without any more headcount. The only thing we could come up with that would allow for this is configuration-driven techniques in which entire customer implementations can be cloned as simple JSON contracts for purposes of bootstrapping the next customer. Developers are removed from most of the product implementation process, and can focus more on core product value which is now levered hundreds of times over due to being exposed as configuration contract.
I am NOT arguing that one should seek out to build a configuration-driven system from day one. That would probably be the biggest mistake you could make. You have to already have a mostly-functional product that people already want to buy/use before you can even consider this approach. Even then, you should probably expand your target market and inject a few more use cases & rewrites before you jump over that chasm. Having a squeaky-clean domain model that addresses all potential use cases is the bare minimum prerequisite, IMO.
How was the culture of the “hive mind” developed and maintained in the organisation? I can imagine there are challenges you’ve faced to keep it working
Start small and grow carefully. Not every developer is a good fit for this type of approach and the amount of discipline we require.
We actually started looking at an approach where new hires would come in on a 6-12 month contract basis. The whole idea would be that there would be no hard feelings either way at the end if it didn't work out. If both sides felt like this was a good fit, we explore longer-term options with more benefits.
The way we do software is unconventional. We are in a very constrained environment from a security perspective. No containers, nothing can be in the cloud, all data must live on the same physical host, software delivery is tricky, etc. These constraints make the work we do somewhat unappealing to a certain crowd of developer who seeks to maximize their exposure to shiny new things.
Put differently, we use boring old technologies (with a few exceptions) and set expectations that we are going to continue to use those indefinitely. Any hopes of "mixing things up" should be reserved for future endeavors on our roadmap and personal side projects (which we encourage). I don't think any of this is unreasonable or unrealistic. We are in the business of selling software to other businesses in a sensitive market. We are not making DLC for AAA videogames.
Tons of complex code and no one to share the local knowledge? Isn't that a setup for failure... yet the org is still in business, probably generating revenue.
So I'd say the leverage is, as always, in understanding the dynamics and politics that often brew in places with "monstrous" codebases.
Debug the people functions, so to speak, and it may eventually help you navigate that codebase. No one needs to be another hero; no one needs to burn out while single-handedly fighting the beast!
Well, politics often are messy and more unpleasant than the code at hand, so we dig and rant...
In such a case, I would try to limit the scope, instead of trying to learn the secret language of the gods who stitched the whole system together so it would make money.
My rule #1: WTF?!...But they must have had a reason for that.
Rule #2: Keep your code changes in-style; bug-free, of course, but similar idiomatically.
Rule #3: Try to do something, then try to do it together with others.
Maintaining a large codebase is not so much about tools as it is about finding ways not to do it alone.
Are there any visual "code flow" interpreters? Something that would separate the 1000s of interactions between functions and show flow lines between them?
I think the thing is: the code that gets things done will confound automated visualization tools.
It's basically like expecting decompiler output to be super helpful. It may help your understanding a little bit, but much will be lost. Yes, there are heroic decompiler stories, but time is involved.
Also, most code has macros or helper functions or automatic code generation or something that obfuscates what you're really looking for. You will have to develop a system to untangle this organically.
What will help:
Peruse the source code. Try to follow the flow. If you have tools to jump back and forth between a function call and its definition, use them.
Fix some bugs. Follow the stack traces up and down.
Ask people how stuff works. Put in the time. And do the other stuff mentioned in this article. The osmosis method is really how you'll get it.
There is no use viewing the whole graph at large. But "zooming" into a node and seeing the connections going to and from it is really useful, and not at all "magical". As mentioned in the sibling post, Sourcetrail can do it really well.
Maybe one could limit the graph to the happy path to begin with.
The output of a profile-guided optimisation run could be used to discover that graph, and only the touched functions would be drawn. Such a graph could be useful for getting started with the codebase, without being overwhelmed by all the edge cases.
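A rough Python sketch of the "only draw what a real run touches" idea, using sys.setprofile rather than PGO output: exercise the happy path under `record`, then render the collected edges with a graph tool.

    import sys
    from collections import defaultdict

    edges = defaultdict(int)  # (caller, callee) -> call count

    def _profiler(frame, event, arg):
        # record an edge for every Python-level function call
        if event == "call" and frame.f_back is not None:
            edges[(frame.f_back.f_code.co_name, frame.f_code.co_name)] += 1

    def record(func, *args, **kwargs):
        sys.setprofile(_profiler)
        try:
            return func(*args, **kwargs)
        finally:
            sys.setprofile(None)

    # e.g. record(main), then emit DOT for graphviz:
    # for (a, b), n in edges.items(): print(f'"{a}" -> "{b}" [label={n}];')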
If you use VS Code with Ruby or Java, check out the AppMap extension. Its core function is similar to what you've described: diagramming execution flow and component relationships. It's dynamic analysis, so it captures data snapshots as well.
https://marketplace.visualstudio.com/items?itemName=appland....
I wrote one for Tcl years ago, as I started working on a really complicated product with around 50k lines of code (or more, distant memory).
Maybe I should give it another crack for modern languages - there is an even greater need for it these days with dependency injection and microservices being common.
The way I see it, the need stems from needing to understand what is REALLY going on, as opposed to what the code is saying should be going on.
Yep, Sourcetrail can do that, for the languages it supports. (It has an SDK so additional languages can be added, with effort.) Give it a method and another method, and it will draw a line from point A to B (with all the functions in between) using static analysis, plus you can explore before and after to see what calls what. You can even see field usage, though there it can sometimes be confused; understanding varies by language. But it's still really useful. It doesn't yet support cross-language integrations, but it has a lot of potential now that it's open source. It works great for individual developer use; for team use I'd want to try my hand at porting it to React or the web in order to more easily share views with others, and perhaps use a central database. For now you can build Sourcetrail projects as part of a CI system to share them with other team members.
In addition to Sourcetrail, I also recommend adding OpenTelemetry for distributed projects or flame graphs for less distributed ones. Some of the videos Honeycomb.io put together really highlight the value of distributed tracing, such as this one: https://youtu.be/GuIWQ-EF7YE and the OpenTelemetry Collector makes it simple to filter telemetry, route it to services or drop a majority of traces which don’t have exceptions, for example.
One day I hope OpenTelemetry tracing can be baked into any language the way flame graphs tend to enjoy first-class support in Java, and that tools like Sourcetrail can be baked into IDEs such that runtime metadata is available just by hovering your mouse over modules and functions. Kind of like CodeLens shown here, but for understanding the code: https://docs.microsoft.com/en-us/azure/azure-monitor/app/asp...
Neither JetBrains Space nor GitHub yet analyzes code beyond dependencies/security issues/CI, but they might in the future.
Finally, there are tools like https://backstage.io/ which hint at a future where developers build their own infra tools for the rest of the company to use… but that hasn’t extended much into the realm of modelling, documentation or telemetry yet. Folks might be lucky if they have a hosted copy of SourceGraph right now… the future, I think, builds on all of these ideas.
Can you use Sourcetrail on proprietary codebases as well? I see it's GPL, and according to my understanding it's okay to use it on proprietary software as long as you don't make any modifications to the Sourcetrail software itself. Are there any hidden commercial licenses I should know about before I try it out on my company's codebase?
I’m not a lawyer but if you’re not embedding GPL code output into your code, you’re fine. Using GPL code to write or reason about code under a different license is not the same thing as having GPL software output a copy of its own GPL-licensed code, for example: https://softwareengineering.stackexchange.com/questions/5221...
The only other risk is letting your company’s proprietary code be visible by third-parties but Sourcetrail runs locally on your computer and can run completely offline.
As to Sourcetrail’s licensing— it previously had a closed license and was supported by a startup with a number of employees. It recently went open source and can be supported financially through Patreon: https://www.sourcetrail.com/blog/open_source/
In my opinion, apart from a very tiny codebase that can fit in someone's head, there is not much difference between a medium and a large codebase.
You just have to use grep and the IDE to get around the parts of interest for the task at hand.
The article describes pretty much what I'm facing at my job. We have a monolithic Ruby on Rails application with lines of code in the millions.
Still, the general mantra is that comments are not allowed, and I can only agree with the author that this makes non-standard parts of the code extremely hard to comprehend. I would love to work, for once, on an application of that size with a few comments here and there.
In my day-to-day work I depend a lot on our test suite. If I can't even find the tests that cover a part of the code, I just break the code on my branch and let the CI tell me which tests fail. There are probably better ways to do this with test coverage tools, but this method seems fairly straightforward to me.
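For reference, in the Python world coverage.py's "dynamic contexts" answer this directly, recording which test executed each line without having to break the code first; I'd expect Ruby's coverage tooling to have analogous options. A sketch, assuming coverage.py 5.0 or later:

    # .coveragerc: record the covering test function per line
    [run]
    dynamic_context = test_function

    # then run:
    #   coverage run -m pytest
    #   coverage html --show-contexts   # report lists the tests hitting each line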
Documentation is something we started doing recently. I feel that as long as the documentation is not directly connected to the code, it is hard to keep in sync. We even have PR templates that remind you to update the documentation, but the shape of the documentation is just too different from the code to have a straightforward intuition about when it needs to be updated. What mostly happens for us is that the feature owner at some point realizes that the documentation pages are not accurate anymore and rewrites them.
Sadly, our commit messages are useless 50% of the time, so they serve more to identify who to talk to than to explain why a change was made. PRs and commit messages are great documentation; I wish we would use them more. In my company the idea is more that a change should be so small that no explanation is needed, but I feel this idea misses the point that code can't explain *why* something was done.
This is definitely an area for further improvements. Are there best practices someone could point me to?
I think that the only thing that really works are code reviews. The best engineers on the team should have enough time to review what is being committed and provide suggestions for improvement. Things will start gradually improving, but it usually takes a very long time before you see any progress.
As someone already mentioned in this thread, an important part of this transformation is not forgetting about the political aspects of such a cleanup process. People don't like to hear criticism, so you will probably encounter a lot of pushback in the beginning.
How do you write your commit messages? For us, in most projects there's a git hook that forces you to put the Jira ticket number in the commit message (and the branch). So if you have to know why a change was made, you at least have the context of what the task was, which helps.
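Such a hook can be tiny; a sketch in Python (the PROJ-123 ticket format is hypothetical, adjust the regex to your tracker):

    #!/usr/bin/env python3
    # .git/hooks/commit-msg (make it executable): git passes the path to the
    # commit message file as the first argument
    import re
    import sys

    with open(sys.argv[1]) as f:
        message = f.read()

    if not re.search(r"\b[A-Z][A-Z0-9]+-\d+\b", message):
        sys.stderr.write("commit rejected: no ticket reference (e.g. PROJ-123)\n")
        sys.exit(1)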
This was a nice read and is recognizable; probably a large part of Dilbert comics could fit in here...
There are projects that last for multiple years, with large TEAMS whose only job is to untangle existing complex landscapes. Most of them fail.
Since these teams do consist of pretty smart people, I think one of the funny things you could do is list the things these people say when they start this adventure on day 1: "ah yes, let's just grep stuff", "I will start examining tests", "I will make a spreadsheet of all interfaces" and "I will do interviews with older developers". About 6 months later, the spreadsheet has become a separate application so complex that it is a complexity project of its own. The amount of documentation found is now a couple of million separate documents, the data models found for the gazillion databases fill a library in themselves, and the realization sets in that the end of the universe is probably nearer than the end date of the project trying to understand the environment. And no, it does not help that every developer or business person ever involved left the company long ago.
Comments: yes, I agree. 80% is logical and does not need a comment. 20% are the pieces of code coming out of meetings that lasted hours and ended with strange outcomes that no one will ever understand without understanding why things were set up the way they were. And then there is the 20% added by junior developers who had no clue but just changed stuff here and there. It is hard to make that distinction, because from the outside they look alike. Anyone trying to change the code to make it "logical" will remove the 20% illogical code and produce something that maybe even works but is no longer in line with the desired results; also a junior mistake.
Tangential, but relevant for complex systems and the organization of them and their code:
I’ve started to look at BPMN, a thing I used to shun (bloated java enterprise junk that just slows coding down), as a way to actually help organize code.
If you have a process layer on top that describes exactly what is supposed to happen, and you organize code accordingly, it makes changes to complex systems easier to reason about.
I know there’s a lot more to it than this, but in my mind concepts from domain driven design, minimal service scopes and a de-coupled process layer really can help.
I guess this is why "tech" versions of BPMN-ish systems have appeared, for example Netflix Conductor and Uber's Temporal/Cadence.
Split stuff up into relevant domains and describe their relationships and processes. Try to stay away from massive codebases spanning domain boundaries if possible.
No silver bullets anywhere, of course, but this is a live topic at my employer this very moment. :)
The two best bits of advice I got about writing code when I started were: 1) start off by saying what you're doing, and 2) there is always an 'Else'. "If you can't say in words what your method/function/class/... is doing, you probably won't code it right either." Sure, I'll leave it off simple stuff, but it really helps me keep focused. And when the debugger suddenly drops me into the middle of something, it's a lot faster to scroll to the top to see what is supposed to be going on.
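A minimal Python sketch of both habits (names hypothetical): say what the function is for up front, and spell out the 'else' even when it feels redundant.

    def recompute_balance(entries: list[int]) -> int:
        """Recompute an account balance from its ledger entries."""
        if entries:
            return sum(entries)
        else:
            # the "else" you're tempted to skip: an empty ledger is a real,
            # legitimate state, not an impossible one
            return 0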
I also agree that it's complete macho crap about not needing comments. Hell, I come back to something I was working on six months ago and don't remember what I was doing. Far better to have some comments telling me than to spend time looking back over more code to figure out what this piece is supposed to be doing.
> Resist the temptation of fixing the parts that you find horrifying, because first you can’t fix it all and second you will get crushed by the complexity of the system. Mark those places down as a horrifying place to be and keep them in mind when it’s time to refactor.
Or you will run into the territory of someone else.
And then you try navigating an Akka-based codebase with its ask pattern. There's no way to tell where the control flow goes with inheritance in the mix.
For very large codebases, this is often not an option. I know of very large 'let's write Cobol mainframe to Java' projects, burning 10s of millions of euros, that were just thrown away because they could not actually get it working in the end.
And this is not limited to mainframe projects; it happens with (large) more recent projects (Java/C# mostly) as well.
For sure, it's not an option for these existing systems.
But when building a new system, design it and plan for it to be retired when certain criteria are met (e.g. when it hits 100 reqs/sec, 3 years from first release, when LOC hits 100k).
Absolutely: I was only commenting on a blind 'just rewrite it' as that will always be the first reaction of tech people and quite often it is simply not feasible. But agreed, by design it can work.
If you are asking me if I’ve experienced a successful rewrite? Yes, numerous, and the more successful ones have happened as a result of planning.
I’ve also been part of rewrites that have gone poorly, due to a lack of planning, where the legacy software fails in unexpected and unanticipated ways, which requires a rushed attempt to fix the issue (which fails) and a subsequent rushed attempt to replace the core functionality when the fix doesn’t work (and also fails because “core” tends to be larger than you initially think).
Knowing ahead of time when software isn’t going to be useful any more isn’t really an option, it’s just an acceptance of what is already going to happen.
This only "works" for companies that have unlimited VC funds to light on fire, for companies who have to actually make money, this is in no way something you can do.
This is the equivalent of bulldozing your house and building another because your hot water heater broke.
It works for every company who has software they maintain, and while I’m sorry you don’t think it’s a good idea, I think the issue is more with your lack of understanding than the idea itself.
Specifically, your analogy to construction is a bad one - software is not construction, and one critical difference is the cost of rebuilding is many orders of magnitude cheaper.
When you include the reality of obsolescence into your design, you are actively anticipating and accounting for problems as they’re outlined in this article, which is always a good thing, and will always improve your planning and its outcomes.
Burying your head in the sand by expecting to never have to rebuild something is very poor project management, and not how competent software shops operate, period.
1) Don't apologize to me because you don't understand how business works.
2) If you need to make a profit (and NOT just light someone else's money on fire), you can't afford to just throw away your profitable product(s) and rebuild them because some devs are easily triggered prima donnas who want to whine because "legacy." That's not how business works in the real world.
3) Companies (again, the ones who must make a profit) aren't going to get clients that way. Clients invest in software because the large upfront costs and small maintenance costs will amortize over the many years they intend on using this product. They aren't going to fund rewriting the product from scratch multiple times.
4) Clients are not going to fund a rewrite every few years, because the cost of rewriting is not "many orders of magnitude cheaper"; that's a complete and total fiction. I know this because I've participated in ROM estimates for clients.
Of course, if your "business" has unlimited VC funds coming in, then it's backwards land. But most businesses exist to make a profit, not to spend someone else's money.
I found cscope essential when working with a 10+M loc C codebase.
I wish there were more cscope like tools for other languages, easy to setup and editor agnostic.