The codebase of an internal system our team inherited is pretty old (14 years) and large, and there's a feeling that a lot of it is unused (dead code or ancient features nobody touches) and can be deleted. Fortunately, it's a monolith, and the only entry points are HTTP endpoints defined in a few config files. So we've made a script which downloads all access logs from the live server for the last 3 months and tries to find endpoints with zero or very low usage (it did find quite a few unused or rarely used endpoints). I'm planning the second step to be a script which builds a dependency graph and finds everything that is only reachable from unused endpoints. I wanted it to be a one-time endeavor, but this blog post made me think it should be run on a regular basis.
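For the curious, a rough sketch of what that first step looks like, assuming standard combined-format access logs and a plain-text list of endpoint path prefixes pulled from the config files (file names and the "low usage" threshold below are made up):

    # endpoint_usage.py - count hits per configured endpoint and flag low-traffic ones.
    import re
    import sys
    from collections import Counter

    # Matches the request line of a combined-format access log entry.
    REQUEST_RE = re.compile(r'"(?:GET|POST|PUT|PATCH|DELETE) ([^ ?"]+)')

    def load_endpoints(path):
        with open(path) as f:
            return [line.strip() for line in f if line.strip()]

    def count_hits(log_paths, endpoints):
        hits = Counter({ep: 0 for ep in endpoints})
        for log_path in log_paths:
            with open(log_path, errors="replace") as f:
                for line in f:
                    m = REQUEST_RE.search(line)
                    if not m:
                        continue
                    request_path = m.group(1)
                    # Attribute the hit to the longest matching endpoint prefix.
                    matches = [ep for ep in endpoints if request_path.startswith(ep)]
                    if matches:
                        hits[max(matches, key=len)] += 1
        return hits

    if __name__ == "__main__":
        # Usage: python endpoint_usage.py endpoints.txt access.log [more logs...]
        endpoints = load_endpoints(sys.argv[1])
        hits = count_hits(sys.argv[2:], endpoints)
        for ep, n in sorted(hits.items(), key=lambda kv: kv[1]):
            marker = "  <-- candidate" if n < 10 else ""
            print(f"{n:8d}  {ep}{marker}")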
Be very careful using this methodology. This is how you end up deleting stuff only called in rare but critical cases:
- during outage
- at end of year
- during audits
- when the one special customer that paid a fortune for an obscure feature decides to use it
It might be the case that none of this applies to your application, but it’s why “check what got hit recently” (3 months is an eye blink in the business and govt world) is dangerous.
I'm a Googler owning a system with a "stop the robot uprising" button that we seem to press every couple years. Needs code in _a lot_ of microservices that looks dead for all intents and purposes. It gets a bit annoying in various reports, but telling Sensemann to never ask about these files again was as simple as copy-pasting a line from the description to a comment on the PR.
It's an internal tool, so if we accidentally delete something that's actually used, they will let us know via the support desk or by visiting our office :) Sure, it's dangerous, but the code will still be there in git history, so it can be reverted - it's not deleted permanently. We're not planning to delete data. Before my time,
there was already an experiment where they modified the code on a few pages to always return an error, to see if anyone would complain - and no one did. In any case, there will be a human in the loop to double-check the HTTP endpoints suggested for removal.
>unused code even if it ends up breaking something for someone for a while.
There are industries where this mindset is business ending. It's pretty much only in high-volume consumer markets that you can afford that kind of indifference to your customers.
Breaking customers in this way is also immoral. You have no idea how you are impacting their lives with your previously-working but now broken software. Everything is mission critical to someone.
It’s so refreshing to see that written out. This is my reality. For my business, which variously runs email, e-commerce, DNS, and similar for other businesses, our services are mission critical for every customer, and they don’t expect us to touch things we don’t need to touch, or to do things they haven’t asked for specifically.
The idea of deleting code without a _thorough_ review of exactly where, why, and by whom it is used for what, is a non-starter. I wouldn’t think of deleting anything unless it was truly dead, and even then I’d take it offline and archive it so I knew what code touched customer data.
I actually enjoy this stability. I previously worked for a company with tens of millions of end users (at the time, on-prem and cloud solutions) and a code base stretching back for (at the time) almost 15 years. I don’t miss the pressures that came with that.
>Otherwise this is just making dev life easier on the cost of their users.
Not necessarily so, I'm expecting removal of dead code will enable faster delivery of new features, and also enable easier refactorings to make code more robust, because the dev team won't have to take unused features into consideration. The code we inherited is very complex and very brittle, so adding something often breaks something else and devs spend way too much time figuring out all the interdependencies. A major complaint is that the dev team doesn't deliver fast enough.
"I'm expecting removal of dead code will enable faster delivery of new features"
Well, for sure it does. Getting rid of bloat always frees up energy - but only if it really is dead code. Otherwise you can introduce rare bugs, or break things in unexpected ways. And yes, sometimes the fastest way is to just try it out and see how things run. If it is not medical equipment, it might be fine even on a live system. It really depends on your project and users. But most users really like stability. Especially if they need that software at that moment, because they have deadlines as well.
Once the code isn’t being continuously integrated, it will quickly stop compiling if you restore from vcs. Leaving endpoints in the codebase but intercepting calls and returning deprecation errors like your predecessors did is a nice middle-ground. Leave that running for as long as the stakes and usage patterns demand, then delete.
As others say- not suggesting you don’t delete. Delete is the best refactoring! Sounds like you’re already doing the sensible extra work of chasing consumers to check endpoints really aren’t used under any circumstances.
It would be great if consumer-driven contracts were more widely adopted, so we could move more confidently when deleting stuff. It’s a constant source of annoyance in large orgs how quickly we lose track of dependencies. Keeping visibility of lineage and dependencies has a great payoff if you can build it in from the start.
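As a minimal illustration of the consumer-driven idea, without committing to any particular contract-testing tool: each consumer publishes a small contract file listing the endpoints it depends on, and a deletion candidate is checked against all of them. The directory layout, JSON schema and endpoint name below are assumptions, not a real tool's format:

    # Hypothetical sketch: check a deletion candidate against consumer contracts.
    import json
    import pathlib

    def consumers_of(endpoint, contracts_dir="contracts"):
        """Return the names of consumers whose contract declares the endpoint."""
        dependants = []
        for contract_file in pathlib.Path(contracts_dir).glob("*.json"):
            contract = json.loads(contract_file.read_text())
            declared = {e["path"] for e in contract.get("endpoints", [])}
            if endpoint in declared:
                dependants.append(contract.get("consumer", contract_file.stem))
        return dependants

    if __name__ == "__main__":
        candidate = "/api/v1/legacy-report"  # hypothetical deletion candidate
        users = consumers_of(candidate)
        if users:
            print(f"Keep {candidate}: still declared by {', '.join(users)}")
        else:
            print(f"No consumer declares {candidate}; safe-ish to deprecate")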
Noticing that this is a real issue with your service, finding who to email, writing an email and waiting for the fix to be made/deployed can be a non-trivial amount of work that you seem to expect all of your users to be willing to donate.
OK, some context: it's a homegrown CRM used by the sales, analytics and marketing departments + additional infrastructure services. We have a dedicated JIRA service desk just for it, with at least 1 dev always on call (because it's very buggy). All users learn how to open tickets during onboarding, so there's no need to find who to email. There's also direct communication between the team leads of the CRM and other departments (+ monthly meetings). I agree that if we break their workflows it will waste everyone's time - that's why I'm not planning to make removals fully automatic without a human in the loop. The script's suggestions will have to be approved first. The thing is, over its 14-year lifetime, countless integrations were added that are now long obsolete, along with features for very specific workflows requested by people who no longer work at the company, etc. The code was also written in a very brittle way ("big ball of mud"), so supporting all of it is a burden for the dev team - adding new features, or doing major refactorings to make the code more robust, takes much longer than it should, because devs have to account for features no one actually uses. For example, there's currently a dev porting it to PHP 8 who has to go through every file to fix code broken by backward-incompatible changes in PHP 8 - even though some of that code is never actually called.
This is why some sort of blacklist feature should exist to ignore such critical cases that are not intended to be deleted (but then again this defeats the purpose of the methodology).
I wonder if APM performance traces could be used to determine unused code paths.
A lot of these tools sample stack traces at intervals, collecting statistics of how often functions are used. If you had enough samples, then with very high confidence you could find code paths that are never used in production.
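As a rough sketch of how that aggregation could look - the sample-dump format here (one "module:function" frame per line, stacks separated by blank lines) and the file names are assumptions, and absence of samples is evidence, not proof:

    # Report functions that never appear in any sampled stack trace.
    def functions_seen(samples_path):
        seen = set()
        with open(samples_path) as f:
            for line in f:
                frame = line.strip()
                if frame:
                    seen.add(frame)
        return seen

    def never_sampled(all_functions, samples_path):
        seen = functions_seen(samples_path)
        return sorted(fn for fn in all_functions if fn not in seen)

    if __name__ == "__main__":
        # `all_functions.txt` would come from static analysis of the codebase.
        with open("all_functions.txt") as f:
            known = {line.strip() for line in f if line.strip()}
        for fn in never_sampled(known, "stack_samples.txt"):
            print("no samples:", fn)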
Happy to see someone applying some science to our field instead of just going with a hunch.
That said, as kortilla already mentioned, in most cases 3 months is too short a time to get a complete answer.
IIRC "Principles of Network and System Administration" by Mark Burgess has more about it, but intuitively you have to look for the largest cycles that exist - and there are also, again as mentioned by kortilla, events that don't happen on a schedule.
One thing I try to do, when I come across code that I have to research, is to write down in the documentation for the function (Javadoc or similar) why it exists and the conditions under which it can be deleted.
You can also try to add some instrumentation code that notifies you when something is called. That way you can end up like me, who every New Year's Day gets a weird SMS at 14:00 or 14:01 from a system that no one can find - but at least I know that some system I can no longer remember still runs somewhere :-)
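A minimal version of that kind of instrumentation, in Python - the "notification" here is just a log line (swap in email/SMS/webhook as needed), and the decorated function name is a made-up example:

    import functools
    import logging

    logging.basicConfig(level=logging.WARNING)

    def notify_if_called(func):
        """Log loudly the first time a suspected-dead function actually runs."""
        fired = False

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            nonlocal fired
            if not fired:
                fired = True
                logging.warning("Supposedly dead code was just called: %s.%s",
                                func.__module__, func.__qualname__)
            return func(*args, **kwargs)
        return wrapper

    @notify_if_called
    def legacy_year_end_report():  # hypothetical suspect
        return "report"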
Seems like a good methodology; a caveat would be that the deletion should start coeff * 14 years from the time the watch starts, where coeff is a project complexity coefficient. If the watcher doesn't see a part of the code called in another coeff * 14 years, then it's probably safe to delete it.
There is a meta-meta situation surrounding really good software management.
You can knock up some code that say solves a specific business problem right now.
(meta:0)
But you need an environment that can take a new piece of code and deploy it and test it (meta:1)
how is that code running - this is shading from production monitoring into QA and performance (meta:2)
Compare all the running code and its performance against the benefits of replacing code or going back to level 0 and just fixing a business problem (meta:3)
Then this death eater - meta 4 I think.
And to me this is why comments like "software needs to solve business problems" are naive - once you start using software you need more software to manage the software - it's going to grow till it consumes the business.
And the literal meaning is “Scythe Man” (which, I'm sure you can guess at this point, is a compound noun of “Sense” for scythe, and “Mann” for man. Both the German Sense and English Scythe derive from the proto-germanic Sagu meaning to cut or saw.)
Looks like this was made by a team in Zürich, which is mostly German, so I imagine it came to them fairly naturally, and who doesn't want to pick cool names for hackathon projects.
It was kind of an internal joke at Google for German-speaking teams to make German-named projects. (It's maybe only a joke that makes sense to the infamous German sense of humor.)
Turkish and Finnish have whole sentences that fit in single words. German is rather short-worded by comparison.
The main difference between German and English here is one of spelling: Germans like to write their compounds together, English speakers like to put spaces in between.
Writing compounds together like that wasn't always the norm in German spelling, either. Especially before the printing press rules were more fluid.
English would still be the same language, if you wrote supremecourt or summerolympics without a space. German would still be the same language, if you wrote Bahn Hof or Sommer Nachts Traum.
I am purely talking about changing the spelling. The Fugen-S is something you actually pronounce, so if you want to keep the language intact and spelling phonetical, you would keep the Fugen-s.
Yes, it might be a bit weird to you. But it's no weirder than other changes we make to words in German because of grammar. Like 'kleiner Hund' vs 'kleine Katze'. Adding an 's' on the previous word is no worse than adding an 'r'.
It's actually not always clear. There are "Einkommensteuer" and "Einkommenssteuer", and you could write and pronounce them either way. Same for "Bahnhofstraße" and "Bahnhofsstraße". But yeah, I guess these are some rules that developed around the practice of writing words together.
It's also interesting when you split up the words. For instance "gebrandmarkt" results in "brand gemarkt". So you extract the "brand" from "ge-...markt".
Your corner cases are mostly about compounds where the next word starts with an s or sch sound. There's also selbständig vs selbstständig, where both variants exist with speakers of the language, and 'official' German picked first one and later the other variant as the 'official' one.
> It's also interesting when you split up the words. For instance "gebrandmarkt" results in "brand gemarkt". So you extract the "brand" from "ge-...markt".
They should also look at deleting third_party directories and replacing them with package manager dependencies. For example Qt WebEngine embeds Chromium, which embeds Blink, which embeds wpt, which embeds a bunch of Python libraries and one of those Python libraries (six) is embedded in three other locations within Chromium.
Code repetition is only cheap if you abandon and forget about the repeated code. Any time you need to care about the code - to fix bugs or security issues, or to improve performance - it is expensive to do each of those things once; repeating them multiple times can mean an order of magnitude difference in the worst case.
This is a good plan but I've often found that with for example, Debian, the official packages are well behind the times compared to the direct third party code.
As a mostly volunteer based distro, Debian definitely can't keep up with everything, but Ubuntu inherits the majority of their package versions directly from Debian unstable. For more core things like GNOME they can be more up to date, but they also employ people to do that work directly in Debian instead of in Ubuntu. They are also heavily moving towards dropping apt and adopting snap packages, which means faster updates because each snap package is a self-contained filesystem unconnected to the rest of the distro.
The most important part of this is that the build units are hermetic and all dependencies are explicit. This is why you need to use something like Bazel/Blaze vs older build systems like make where identifying what's used, particularly when you get into meta-rules, becomes all but impossible.
As the article points out, you also have to look at what's actually run. This is the real advantage of Google infrastructure: the vertical integration so if a binary is run on Borg, or even on the command line, that can be tracked.
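To illustrate the first half (the static build graph), here's a rough sketch using `bazel query`; the target label is hypothetical. Note this only tells you that nothing else in the repo depends on a library - for top-level binaries you still need the run tracking mentioned above:

    # Flag a Bazel target as a deletion candidate if nothing depends on it.
    import subprocess

    def reverse_deps(target, universe="//..."):
        """Return all targets in `universe` that (transitively) depend on `target`."""
        out = subprocess.run(
            ["bazel", "query", f"rdeps({universe}, {target})"],
            capture_output=True, text=True, check=True,
        )
        return set(out.stdout.split())

    def looks_unused(target):
        rdeps = reverse_deps(target)
        # rdeps() always includes the target itself, so a single result means
        # no other target in the repo depends on it.
        return rdeps <= {target}

    if __name__ == "__main__":
        print(looks_unused("//legacy/reports:exporter"))  # hypothetical label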
I worry about archival and enough history for diagnosing long-standing subtle issues that take a long time to surface as bugs. This is not theoretical: apparently a TeX bug picked up after many years had been there from the start.
I used to do archaeology on Google's monorepo (i.e., looking far into the past of MapReduce, the search engine, ads, and other products) and it wasn't really that hard. Heck, there was even a synthetic filesystem where you could just cd to a historical commit # and see a view of the repo at that point in time (Google's version control is based on always-increasing, globally shared commit numbers).
Your project is also interesting. I don't plan on ever adding write support. The old Python version was already using git as a library via gitpython, instead of shelling out via the command line. The new version will use Rust's gix.
Performance, even for the old Python version, was pretty decent. That probably came from using git via a library and being careful about fuse caching. I used the 'low level' API that libfuse provided, instead of the 'high level' one. The old version also already supported opening arbitrary commits, tags and branches, they were represented as different folders.
Another feature that's maybe important for performance: because everything was read-only, operations like OpenFile could be implemented as no-ops in such a way that the kernel doesn't even send us a request anymore.
(It was read-only in the sense that any changes would come from the git side. User initiated file system operations could not make any changes to the data.)
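Not the FUSE code itself, but a tiny sketch of the underlying GitPython idea - reading a file at an arbitrary commit/tag/branch without touching the working tree, which is roughly what those per-commit folders expose read-only (the paths and revision below are made up):

    from git import Repo  # pip install GitPython

    def read_at(repo_path, rev, file_path):
        """Return the contents of `file_path` as of revision `rev`."""
        repo = Repo(repo_path)
        commit = repo.commit(rev)           # accepts SHAs, tags and branch names
        blob = commit.tree / file_path      # walk the commit's tree like a path
        return blob.data_stream.read().decode("utf-8", errors="replace")

    if __name__ == "__main__":
        # Hypothetical example: what did README.md say two releases ago?
        print(read_at(".", "v1.2.0", "README.md")[:200])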
Not really. If you're in the very rare situation where you need to diagnose a bug in long-since dead code, you can just view repository synced to where the version was cut.
Sure; but working at another very large company, I can say I wish we had this.
Old unused code is a huge problem for us. The coordination costs of trying to update company wide problems are made much more severe by old code.
I wish we had something like this. We’re large enough we’d need our own system anyway. We don’t have a monorepo, and we don’t use tools so many others do.
Sure, but this kinda tool only works because of their monorepo, build system, deployment run tracking etc. Let's say they open sourced it, you'd have to basically adopt the entire google stack to make use of it
Code deletion is much easier when you have a typed language with no metaprogramming. I guess Go and C code is easier to tree-shake like that than, say, Python or JS.
Sensenmann is able to delete whole libraries independent of programming language due to how the dependencies are defined.
Programming languages would make more of a difference if deletes were happening at a more granular level, e.g. deleting unused functions, but this article doesn't touch on that.
Not sure if the parent comment was edited or not, but at the time I'm reading it, it just says "typed", not "strongly typed". Strong versus weak typing isn't something I've seen a consistent definition of that everyone agrees on (although most definitions I've seen would consider C to be somewhat weakly typed because generally almost any type can be coerced to any other type without much ability to tell if the result is valid or not), but statically versus dynamically typed is pretty well defined, and C certainly is a statically typed language.
My point is that it's not typed strongly enough to make certain kinds of static code analysis easily doable. It is totally possible to encounter a type mismatch at runtime because the compiler didn't complain (because the code is technically valid?). There are languages where this is much different, e.g. Java.
It's definitely possible to encounter a type error at runtime in Java due to the compiler not checking certain things. Java allowing subclasses across arrays is somewhat notorious for this sort of pitfall; in the classic example of `Dog` and `Cat` being subclasses of `Animal` (or implementations of the `Animal` interface; the issue is the same either way), Java will let you declare an array of type `Animal` but assign an array of `Dog` to the variable, which then will throw an error only at runtime if you try to put a `Cat` in the array:
interface Animal {}
class Cat implements Animal {}
class Dog implements Animal {}
Animal[] animals = new Dog[1];
animals[0] = new Cat(); // This will throw an error at runtime
If I'm remembering correctly, the property of not throwing type errors at runtime due to things that could be caught at compile time is called "soundness", which like static versus dynamic typing is much more formally defined than strong versus weak typing. I definitely agree that C has a lot of pitfalls in this regard, but I think you might be surprised how many mainstream "strongly typed" languages have these sort of issues.
They may use Blaze (Bazel) for every language to allow for tracking all dependencies. This helps with reducing the costs of CI by running tests specific to the changed code, and aids code deletion.
It's not that simple. A lot of dead code doesn't look dead to static analysis. E.g. gated by long-forgotten feature flags, or platform checks that are no longer relevant.
This is only the simplest possible reaper tool. You can (and Google does) also write tools to find flags that have only ever been set to one value, codepaths that haven't been exercised in months, databases that haven't been accessed in years, etc.
Discoverability. It's a lot easier to search one repo than it is to search a set of repos. For the latter you need to have all the repos listed somewhere, and have them be accessible.
SourceGraph is modeled after Google's internal code search tool (to the point I've seen ads on facebook that say "Hey Xoogler, miss code search? Try Sourcegraph").
It's just as easy to index multiple repos as it is to index one, which means that the same goes for searching. Why would it be any different for one as opposed to many?
We use GitLab in the company I work for. There may be repositories created by others in the company, that depend on repos I work on, where I don’t have access to said other repos. So to me these are invisible. If everything was in one monorepo, it’d all be visible to me easily.
That’s more of a culture thing. Your company chose to enforce more granular permissions. Perhaps there are good reasons behind it, e.g. code for different clients being under different contracts and NDAs.
That's a question of policy, not of repo structure. A monorepo can still have certain parts of the repo protected by access control. One repo doesn't mean "all files share one read permission".
Haha, yeah, I’m sure it works for Google. I was trying to say that when searching billions of lines, the challenge is in the fact that it’s billions of lines, not in whether it’s one or multiple repositories.
I’m not sure. I also thought that these big repos have to use sparse checkouts to not use too much space on the developers’ machines. So you would have to use an external code search index anyway.
On a non-google level just being aware of code sitting around costing resources is pretty important. Often, tests and maintenance are just ignored or not calculated in as cost (be it time, money, effort or otherwise). It is almost in the same realm as "I don't know why it works" which is as dangerous as "I don't know why it doesn't work".
The most difficult part about code deletion is practicing the Chesterton's Fence principle:
> In the matter of reforming things, as distinct from deforming them, there is one plain and simple principle; a principle which will probably be called a paradox. There exists in such a case a certain institution or law; let us say, for the sake of simplicity, a fence or gate erected across a road. The more modern type of reformer goes gaily up to it and says, “I don’t see the use of this; let us clear it away.” To which the more intelligent type of reformer will do well to answer: “If you don’t see the use of it, I certainly won’t let you clear it away. Go away and think. Then, when you can come back and tell me that you do see the use of it, I may allow you to destroy it.”
While this tool certainly does the job of proposing code deletions, that's the easier part. The harder part is knowing why the code exists in the first place, which is necessary to know whether it's truly a good idea to remove it. Google, smartly, is leaving that part up to a human (for now).
"This code has been dead for six months" is a very good heuristic that the code is not relevant. I do occasionally reject the sensenmann CLs, but only very very rarely. This isn't weird code that nobody knows why it exists but it is currently doing something. This is code that cannot execute.
Code that only triggers from a yearly holiday, disaster alerts, leap years, or the like, would have longer periods of going unused and likely be very problematic if removed. Unless by dead code you mean unreachable code in which case it shouldn't exist in the first place and I agree should be removed.
Dead code is shit that has a build rule and no other build rules in the entire repo depend on it. Or private functions that have no callers in an application that isn't built with any kind of reflection capabilities.
Or code that is behind if(false), System.exit or a null pointer dereference. Finding those kinds of things is one of the best parts of modern compilers; it's nice that they've extended that to the whole ecosystem.
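As a toy illustration of that kind of check - done here in Python with the ast module rather than inside a compiler - flagging statements guarded by a literal if False:, the analogue of code a compiler would mark unreachable:

    import ast
    import sys

    def find_if_false(source, filename="<string>"):
        """Return (filename, lineno) pairs for blocks guarded by a literal `if False:`."""
        tree = ast.parse(source, filename)
        findings = []
        for node in ast.walk(tree):
            if (isinstance(node, ast.If)
                    and isinstance(node.test, ast.Constant)
                    and node.test.value is False):
                findings.append((filename, node.lineno))
        return findings

    if __name__ == "__main__":
        for path in sys.argv[1:]:
            with open(path) as f:
                for fname, lineno in find_if_false(f.read(), path):
                    print(f"{fname}:{lineno}: block guarded by `if False:` is dead")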
Yes, the nice thing about blaze/bazel + sensenmann is that you can very accurately say "this code was not built into a binary that has run in the past 6 months".
Sometimes you still want it (e.g. python scripts that are used every once in a while for ad-hoc things and might go months between uses), but usually the right thing to do is productionize stuff like that slightly more (and also test it semi-regularly to make sure it hasn't broken).
Nah, there's stuff that scans the entire repo regularly for all kinds of interesting purposes, and that's ignoring the fact that `atime` isn't available or a source of truth in piper.
Like conceptually I believe this could be wrong in both directions, since there's heavy caching of build artifacts, you can totally build a transitive dependency of some file without actually reading the file (and potentially do this for a relatively long period of time, though I don't think that will happen in practice), and stuff will regularly look through large swaths of files that aren't necessarily run.
Because piper[1] isn't a normal filesystem. It's often accessible through a FUSE-based api, so it appears like a filesystem to some users sometimes, but it can also be accessed over an RPC api. So the concept of "atime" isn't really a fit, because, well, you access the filesystem from a view that is based on your workspace, (think, akin to the git commit hash being in the filepath), and the "real" underlying file isn't necessarily on your machine.
So under a reasonable definition of atime, the atime of most files is never, because on any given arbitrary commit, you don't access/build everything.
> you cache an entire file instead of just reading it
You don't cache the file, you cache the file's outputs, keyed by a hash of the file (or all the files which are deps of a particular output). With bazel, you have a shared build artifact cache that can securely and reliably be shared across every user at a company with tens of thousands of engineers. If I build some target `:foo`, which corresponds to `foo.o`, generated from `foo.c`, but someone built `:foo` five minutes ago, as long as `:foo` hasn't changed (and you can check that the file hasn't changed without reading it because the fancy filesystem stores a hash of the file alongside the actual file), I won't actually read `foo.c` or go though the motions of building `:foo`, I can just pull `foo.o` directly from the object cache and use that in any dependencies, which means that I can build my output without ever invoking a compiler, which is really cheap and fast.
You could argue that upon reading the hash you should update the atime, but that's extremely expensive since now you have to do writes instead of idempotent reads a bunch of times a second.
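For illustration, a toy version of the content-addressed caching described above: outputs keyed by a hash of their inputs, so a cache hit means the expensive build step never runs. Real systems hash the whole dependency closure, share the cache across users, and - per the comment above - get the hashes from the filesystem instead of re-reading files; this just shows the shape.

    import hashlib
    import pathlib

    CACHE_DIR = pathlib.Path(".build_cache")

    def input_digest(*input_paths):
        """Hash the inputs (a real system would read stored hashes, not the files)."""
        h = hashlib.sha256()
        for p in input_paths:
            h.update(pathlib.Path(p).read_bytes())
        return h.hexdigest()

    def build_with_cache(inputs, build_fn):
        """Return the build output (bytes), reusing a cached artifact when inputs match."""
        CACHE_DIR.mkdir(exist_ok=True)
        cached = CACHE_DIR / input_digest(*inputs)
        if cached.exists():
            return cached.read_bytes()       # cache hit: the compile step never runs
        artifact = build_fn(inputs)          # cache miss: do the real work
        cached.write_bytes(artifact)
        return artifact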
I mean that's functionally what they do, just more accurately since it isn't read time, but time it was built into a binary since that information is accessible, and because the object cache isn't filesystem like and doesn't have an "atime".
And you still need to map back from a cached output foo.o to every transitive dependency of it, of which there could be thousands. That can be done, but it requires invoking the build system. So it's nothing like just checking a timestamp on the file.
Nah, like conceptually they are both writing serialized bytes and like indexing them so they can conceptually be found easily and like read back. This is probably why you didn't like explain why they are different.
Different industries I guess. The new financial year comes around every 12mo. Good luck explaining to the accountants that you deleted their end-of-year reconciliation reports because they didn’t run them every 6mo.
I would expect that those binaries would be built and deployed constantly, even if they just woke up once a day to check whether it was mid-April or whenever the financial year ended. I would also expect that they would be integrated with other reporting that happened on a daily, weekly, monthly, and quarterly basis.
Do you really have an entire application binary that is only deployed and run once a year?
> Do you really have an entire application binary that is only deployed and run once a year?
I have ones which are run when the need arises. It could be a year, it could be ten. But I know they work and I don't need to touch them, nor would I want to delete them.
At Google there’s not really such a thing as code that never changes. Your dependencies will shift beneath you, even if it’s just the compiler itself. More likely though the libraries you use have been updated, a robot or two have come through and applied automated fixes etc etc.
So if we’re using it once a year, (we do have tools like that, maybe not strictly annually but certainly rarely) then you do indeed want it to be automatically built on some kind of recurring basis so it’s ready to go when you need it. And the tooling makes it (relatively) easy to have that happen.
If your once-a-year binary has any dependencies in common with others, then you need to at the very least be rebuilding and re-running the tests when those shared dependencies change, or else you'll find at the end of the year that it doesn't build, let alone work. And the compiler and stdlib count as shared dependencies.
The fact that you can so very straightforwardly declare that code which hasn't been used in 6 months is irrelevant is extremely revealing about what leads to the deplorable state of "modern" software development.
Google's response to Chesterton's Fence is: "if you liked it, then put a test on it".
I used to update the internal version of numpy for Google and if people asked me to rollback after I made my update (having fixed all the test failures I could detect), and they didn't have a test, well, that's their problem. The one situation where that rule wouldn't apply is if I somehow managed to break production and we needed to do an emergency rollback.
I shed a tear when some of my old, unused code was autodeleted at Google, but nowadays my attitude is: HEAD of your version control should only contain things which are absolutely necessary from a functional selection perspective.
At Google, because everything is in the mono repo and nearly everything around it is in some level of flux, if you don’t have a test for your code, it gets broken pretty regularly.
Most teams are pretty understanding when they break you on some untested code, and will work with you to fix it, but the very first step in working together is for you to write a test that shows the issue.
The resultant culture is heavily biased toward writing tests before something bad happens to you.
I don’t think you understand Sensenmann fully based on this post. At Google basically everything in use has a Bazel-like build target. This means the code base is effectively a directed “forest”/tree-like data structure with recognizable sources and sinks. If you can trace through the tree and find included-but-not-used code by analyzing build targets, you can safely delete it. There are even systems (though not covering everything) that sample binaries’ function usage you could double-check against.
> why the code exists in the first place
If the code is unreachable it’s at best a “possibly will be used in the future” and most likely simply something that was used but not deleted when its last use was removed (or a YAGNI liability).
If you can find a piece of code included in build targets but unreachable in all of them, it’s typically safe to delete. And it’s not done without permission generally, automation will send the change to a team member to double check it’s ok to delete/nobody is going to start using it soon.
Yep. You would almost always know before the deletion is even committed or the change proposed whether it would cause a build or test breakage. And if the automated safety checks and manual review didn’t catch such a breakage, you’d easily rollback. The deleted code is always “still there”, you have to jump through a ton of justified hoops to do a true deletion, which Sensenmann won’t be doing.
It should also be clear that this article is about "deleting" code from an active project, not about "deleting" it entirely from the version control system. Thus, any code "deleted" through the described process could still easily be restored if necessary.
From reading the comments, I get the impression that there is a much deeper misunderstanding going on. This seems to be not so much (or not at all?) about identifying rarely run code paths, but about build units that have been abandoned on the dependant side.
So if there's a fancy onLocusSwarmAttack() that has never been run outside a test in a module to help applications deal with various kinds of datacenter outages called TenPlagues, it won't be in the crosshairs of Sensenmann. But if one day that module gets a successor (perhaps BeiWeltuntergangScheibeEinschlagen if it's made by the Zurich team?) and all code is supposed to eventually migrate to the new one, Sensenmann will tell you that TenPlagues has finally left the building when (if) it has happened.
Again, just my impression from the article and the discussions here (many seem to think it's about onLocusSwarmAttack()), probably resulting from a linguistic barrier between the language used at Google to talk about their monorepo and more conventional English. I guess they have done some quite deliberate shifts to help people "think monorepo"?
As a counter to Chesterton's Fence: sometimes the fastest way to understand what something does is to remove it and see what complains. You might get only 1 complainer for every 10 fences you take down. Putting that one fence back up takes much longer than taking it down, but the time saved from removing the other 9 unnecessary ones makes it a net win. And this time you can add Documentation to the rebuilt fence.
A further counterpoint: if you follow the Fence proponents' logic to its conclusion you can never remove any code which is clearly an absurd situation.
I think the real logical flaw is that Fencers (as I will now call them) put the blame on the person who removes an apparently useless fence. But they're wrong. The real blame lies with the person who built the apparently useless fence and didn't put a sign on it explaining why it shouldn't be removed.
> A further counterpoint: if you follow the Fence proponents' logic to its conclusion you can never remove any code which is clearly an absurd situation.
No, that would only be the case if one would never understand any code. Chesterton's Fence consists of two parts ("understanding some code" as a precondition to "removing some code"), and leaving one or the other part out makes it some other thing than what Chesterton's Fence means.
> The real blame lies with the person who built the apparently useless fence and didn't put a sign on it explaining why it shouldn't be removed.
Chesterton's Fence is not about blame, or the past in general - it is about how to deal with things that are in the present. (Although I agree that the original fence-builder should have left a note or two!)
The principle says you can’t remove it until you understand why it was there. It’s more about doing due diligence.
I follow the principle when I remove code, and it’s a reason why good code comments are important. “Oh yeah, this was written for [x] which is no longer a thing, we can remove it now”
And not putting up a sign explaining why it's a necessary fence is a bad decision. Avoiding removing all unlabeled fences because they might, maybe, potentially be useful is also likely a bad decision if taken to its conclusion.
Chesterton's Fence is a reminder to actually get up, walk over to the fence and skim the multiple labels and notes the builders left there - because the big problem is usually someone looking at a thing that came before them, and assuming they understand what it was for, without bothering to actually check it.
The article is about the removal of _dead_ code. So, not a "fence across the road" - it's a fence that was moved to the side of the road, already cleared. The question is just whether to dismantle the fence or keep it there just in case.
+1. And, it's in version control forever. It's not as if it entirely disappears. Like one of the sibling comments mentioned, I only rarely reject Sensenmann CLs.
That's worth explaining: it's automated code deletion, but the owner of the code (a committer to that directory hierarchy) must approve it, so it's rare there's ever a false deletion.
You raise a good point, and I would answer it with agree and disagree:
Agree: Yes, you are correct, merely observing that a code path was never executed in the last 6 months is not the same as understanding why the code path was created in the first place. There might be the quite real possibility of an infrequent event that appears just once in every two years or so (of course, this should also be documented somewhere!).
Disagree: Pragmatically, we have an answer if the code path was not executed after 6 months of use in production and test: we know that, with very high probability, the code path was created either by mistake (human factor) or intentionally for some behavior that is no longer expected from our software. To continue the Fence metaphor, regarding Sensenmann: after 6 months, we know about the Fence that 1) it has no role to play in keeping out the stuff that we want out (that was all done by other fences that had contact with an animal at least once) and 2) that it might have been built to keep out flying elephants or whatever, but no such being was observed in the last 6 months (at least the fence made no contact with it, which it then should have!) and it probably went away.
That said, having a human in the loop is probably a good idea.
Conducting an inventory (of services) would be cool too.
If a chunk of code isn't actually deployed somewhere, mark it as a candidate for culling.
Probably requires some kind of metadata provenance for deployed artifacts.
For Java, I thought Maven had a stock manifest.mf entry for the source repo. Alas, a quick search only reveals that the archiver plugin has an entry for the project's "url".
I'm generally not fond of monorepos, but reading this and right away being targeted by this tweet https://twitter.com/nikitabier/status/1652764613962760196 made me think about how much time goes into decisions that won't impact users directly, and how hard it becomes to justify that at regular companies in an environment where CEOs are jumping on the layoff bandwagon for no reason.
Sincere question: what is interesting or novel about this? Is it just the scale or did I miss some subtle aspect?
This is more (or less?) the same as industry best practices, just scaled up. There is a challenge in scaling up, as there is more potential for someone to mess it up. But it's the same technique.
> the same as industry best practices, just scaled up.
That's like saying S3 is the same as ext4 - they're the same, just scaled up! This is a poor argument; you'll note that S3 and ext4 are entirely different things, not "challenges", but fundamentally different implementations.
Google is the only company I've ever worked for that automatically deleted dead code, let alone across a company of 100k+ SWE.
Fair enough and thanks for the reply. Still, for anything bothering engineers more than a bit repeatedly, anyone will write tools to remove the manual burden.
Our internal practice is to delete code if you suspect it's unused, run tests, and if it doesn't affect any tests, go for it. This could be automated, but it is not pressing enough, so we didn't automate it yet.
We could though, and it may even be a good idea, but I still don't get the novelty. But I appreciate your point of view.
Your proposed process is exactly the wrong way around. You'll end up keeping dead code just because it has tests, and delete code that's still used in prod just because it happened to be untested.
This is one of the details that the blog post goes into. Sounds like it's not as trivial and obvious a problem as you think it is, and you would have benefited from just not dismissing the post because of that.
Not to criticize your POV and argument directly, but, in the end, a lot of things, especially like these, are always easily subjected to the "we could do it, but just didn't bother to yet" kind of argument, and, when it comes down to the real work, things are much harder than they superficially appear to be. So yeah this isn't new... But... You know eheh
Well, I'd politely agree to disagree. Google scale is defined by novel, radical approaches, like for example inventing map/reduce, writing papers on LLMs others then implement successfully, or creating something like Kubernetes.
The specific topic here is not one of those google problems to me, as I can compare it to other problems we already solved. But yes, we could miss that critical point where a totally different problem domain emerges just from one order of magnitude more, so fair game.
Thanks for the reply and I meant it from similar experiences at my $DAYJOB where any initiative like this always uncovers a lot of dusty corners! Thanks for the great and polite reply, it's rare nowadays, so kudos!
I think the takeaway is that at Google's scale, even if you think some minor problem is not pressing enough to be automated, it will become pressing soon enough.
In the perspective of software engineering economics, the scale is important. Everyone knows it's good to clean up unused code but they just don't care because they think it doesn't yield a short term ROI for themselves. Then why don't we bring the cost down and see what happens? Automation changes this equation.
Very interesting. At our company we have a spring cleaning event next week, and one of the goals is to delete old code. Some of the comments mention some good techniques. Anyone experienced with deeper analysis, e.g. BPF, in this field?
I liked most of the blog, but it bothers me to see stuff like, "... just as with the introduction of unit testing 20 years ago...".
No, unit testing was NOT introduced 20 years ago. As an example, Perl 1 was released about 35 years ago with a unit test suite that got run on every install. Every version of Perl has done so since, and since CPAN came along, most Perl modules have followed suit. This was the secret sauce behind Perl's reputation for being so portable.
Nor was Perl a pioneer. In fact unit testing was used in the 1960s on the Apollo program, and was even called unit testing. I believe that the concept can be dated back to a 1950s textbook but I can't find the reference.
This refers to Google, I'm pretty sure. Early Google didn't believe in unit testing and it took a few particularly stubborn engineers to demonstrate the value of the practice and convert the culture to promoting unit testing.
I was hoping to spark a conversation about this approach as no such thing was mentioned in the article even though it should be possible to do.
The whole point of these comment sections is to have a discussion. If the point of this site was just to read articles there wouldn't be a comment section.
I think your question was phrased in a way that suggested you had not understood the article. The described system detects unused libraries and binaries, but does not seem to look for dead code paths in source files.
Dead code path detection is an interesting topic which I think the linked article doesn’t address at all.
Profiling is inherently probabilistic and I don't think it should be used. Anything that inspects the runtime (dynamic) behavior of code isn't good enough for a code deletion tool. Only static analysis will do.
And with enough samples we can learn what code paths are being taken. You can have your dead code bot look at the number of samples that have been collected to know if it should be enabled for something. For example, if you have 6 months' worth of samples of a piece of software running on 10k machines, you will get a good idea of what code isn't being used.
>Anything that inspects the runtime (dynamic) behavior of code isn't good enough for a code deletion tool.
I disagree, since code that is being run is by definition not dead code.
When the developer reviews the CL they can use their judgment to determine if the proposed deletion makes sense. The bot can remember not to send a CL for that same code in the future.
This is hyper dangerous. People are prone to accepting robot CLs and even a small error rate of deleting important code will be a big problem. GWP is also only able to instrument a very small number of requests.
It sounds like the same issue exists with this approach of removing unused libraries and binaries that have not been run in a while. My suggestion is just expanding it from binaries that haven't run to libraries that haven't run. You could even just put this information into some code health dashboard. If code isn't run, that means either it isn't providing value or it isn't being tested in production. How can you be sure some rare failure case actually works and doesn't take down the system if you never test it to see if it works?
>GWP is also only able to instrument a very small number of requests.
As I mentioned before this could be limited to popular services where it is enabled and where a small number of a giant number of requests is still a big number.
> It sounds like the same issue exists with this approach of removing unused libraries and binaries that have not been run in a while.
Not really, it's the difference between knowing that something isn't plugged in and hoping that it isn't powered on.
Libraries that aren't run are deleted; you're suggesting expanding this to stochastically unexercised code paths within libraries that are run (as opposed to provably unexercisable code paths - I think there are tools that do do this, but they're opt-in instead of on by default).
> If code isn't run that means that it isn't providing value or it isn't being tested in production.
Or it can't show up in profiling for any number of reasons (I'm immediately brought to the idea of a CHECK fail, which I don't think would ever show up in profiling, but could be exercised regularly, and if it was included intentionally it's likely preventing some kind of data corruption issue, so removing it would be terrible).
Yes, but then you run into signal-to-noise issues. It's much easier when I can trust that 90 or 95 or 99% of the automatic deletions will be legitimate, than if only 50% or 20% are, then they become an annoyance.