What was the nicest codebase you've inherited without the original author(s)?
140 points by _randyr 17 days ago | 68 comments
What made this particular codebase stand out above others? What did the original authors of the code do well?

Documentation? Tests?

Perhaps nothing about the code itself, but rather the surrounding environment?

I inherited EASTL (EA's internal implementation of the C++ STL designed to be more suitable for use in games) for a while after the original author left EA, although I'd already done some work on it while he was still there. It's now open source on GitHub. There's also a paper out there outlining the original motivation and design goals.

It was generally very well written and where it differs from the standard there's usually an interesting reason why. It had extensive unit tests that ran automatically across a very wide range of supported platforms and compilers including all major desktop, mobile and console platforms. It is generally much more legible than the STL implementation used by Microsoft (which was one of its design goals) while also often more efficient. It's the STL so it's mostly fundamental algorithms and data structures and widely useful utilities of general interest rather than very domain specific business logic.

19 upvotes, 1 hour, no answers...

How often do you look at some code and think "an idiot wrote this", and then realise you're looking at your own code?

Maybe all code is technical debt? That is, maybe every piece of code you inherit is bad because you now have more to learn and understand, no matter how nicely structured/documented/tested it is.

Sorry OP, I got nothing positive!

All code is technical debt. We should not write it unless it serves a purpose of high enough value.

I like the idea but find it an overstatement that trivializes the nature of technical debt: at some level one is talking about things that "pay off now" but require "interest payments" in the future, and while one can kind of squeeze the line `for (let i = 0; i < array.length; i++) {` into that framework, experience says that that line by itself almost never requires a maintenance debt payment; if you have a list then you almost certainly want to iterate through it, and refactors are driven by the size of the list, not by the bugginess of the looping construct...

I think what's really at stake is that code and data are a sort of inventory: they are not what you are selling, but they get turned into what you are selling. Inventory always has a carrying cost, and generally people underestimate that because they are only looking at the direct cost of storage, not how the presence of the inventory itself gets in the way, makes getting to other things harder, makes bottlenecks harder to see.

And that's where you see that debt is the wrong metaphor, because debt has the particular property that you can pay all of it off, and that would be a good thing. By contrast, inventory is a good thing in the right place: it means that if one thing stops working, the system can still continue for a while. Operating with truly zero inventory everywhere is possible, but it's not done because it would drive you out of business. Similarly, deleting all of your code is not desirable in the way that getting rid of all of your debts is.

Designing an API to have a separate messaging layer from its business layer from its data management layer from its data fetching layer is a technical debt; the fact that any change in the system now needs to be distributed across 10 different places in the code base is your interest payment. I would argue that you would like to derive all of these from some shared source of truth to remove those interest payments, and when you do, I no longer think it's a bad thing for you to have a homebrew HTTP framework that has those separations in its internal functions.
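The "shared source of truth" idea can be sketched concretely: one declarative schema from which each layer derives its piece, so adding a field touches exactly one place. A minimal sketch (all names here are illustrative, not from any real framework):

```javascript
// One declarative schema is the single source of truth for a "user" record.
const userSchema = {
  id:    { type: "number", column: "user_id" },
  email: { type: "string", column: "email_addr" },
};

// Messaging layer: derive the wire/JSON shape from the schema.
function toMessage(row) {
  const msg = {};
  for (const [field, spec] of Object.entries(userSchema)) {
    msg[field] = row[spec.column];
  }
  return msg;
}

// Data layer: derive the SELECT column list from the same schema.
function selectColumns() {
  return Object.values(userSchema).map(s => s.column).join(", ");
}

const row = { user_id: 7, email_addr: "a@example.com" };
console.log(toMessage(row));   // { id: 7, email: 'a@example.com' }
console.log(selectColumns()); // "user_id, email_addr"
```

Adding a field to `userSchema` now updates both layers at once; no per-layer interest payment.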

I don't think that's true. If it's properly abstracted and implementation details are encapsulated, well written code can serve as documentation.

> How often do you look at some code and think "an idiot wrote this", and then realise you're looking at your own code?

Actually when I have to modify my old codebase I am usually pleasantly surprised and it is much more tidy than I am expecting it to be. It's an amazing feeling when you read your own code and go "wow, I wrote that, what a nice way of doing that".

I am not really sure what that says about me... maybe my expectations are too pessimistic on average...

Maybe you are not growing as a developer? ;)

Anytime I spend a ton of time making code pristine, it takes so long that I never accomplish anything useful. What's better, a polished half turd or a stinky full turd? As a user, a full turd. As a developer looking to fantasize over having time to finish something, a polished half turd. As someone who wants to be productive on a team without getting murdered, probably neither.

The better we get, the more complicated problems become barely within our reach

This is a great summation, and could well be one of the great quotes of software engineering.

But remember that debugging is twice as hard as programming, so if you build a system as complex as you are able to build, it will be too complicated to debug. :-)

This is where I'll just modify a quote attributed to Einstein as a reply: "Software should be as simple as possible, but not simpler."

The complexity of software should not be dictated by anybody's skill at creating and navigating complex designs, but by the complexity of the problem. Second system syndrome comes from youngsters not heeding this advice. Somehow it takes an inordinate amount of experience to know what you don't need.

> How often do you look at some code and think "an idiot wrote this", and then realise you're looking at your own code?

Actually never. The reverse actually happened: I looked at some code and thought "a wizard must have written this", and then realised I am looking at my own code. :-)

No, seriously: I could immediately reconstruct the intentions I had for code that I wrote 15 years ago and never touched afterwards. For this reconstruction process, the emotions that went into the code lines provide a strong mnemonic. This is also the reason why I actually need very few comments in my own code if it is just for me (of course, if other authors want to contribute, these are very important - but in that case, I prefer to ask them directly what kind of guidance they actually need).

What is much harder is to get into a foreign codebase. Even if it is of high quality (it often isn't), it takes a lot of time to get deeply into the thought process on which the original authors based their code structure.

Wow, that's some wisdom right there :) +1.

Whole new meaning to "best code is the one that doesn't exist".

There's a talk by Greg Young regarding this topic called "The art of destroying software" [1]. It's a fairly interesting talk about designing your software with deletion in mind.

[1] https://vimeo.com/108441214

It may be a good message, but it's hard to take it in through such a condescending and off-putting delivery.

Ken Thompson: "One of my most productive days was throwing away 1,000 lines of code."

There was a time when I'd read some code and think, that's the way I'd do it, then the variable names look like ones I'd use... and turns out I was reading my own code that wasn't more than a week or two old--I used to do all-nighters in the office at the start of my career. It was pretty cool, some days I'd find code done and it was like having magic helper elves.

The night my CS final project was due, I went out drinking. I don't recall coming home and I don't recall working on the project. But I do recall getting very, very, very intoxicated.

The next morning, I woke up late in a panic. I was gonna email some excuse to the prof in a last ditch attempt to salvage a decent grade in the class. I ran to my computer desk in the corner and turned on my ginormous CRT monitor to find... the confirmation for my final project in my email inbox! I had somehow scored 100% on the project, despite having no memory of what must have been several hours of intense programming and debugging.

Perusing the code later that morning, I was stunned at how clever, clear, and concise it was. It was, at that time, the best code I had ever written. Were it not for a few telltale grammatical peculiarities, I wouldn't have believed I was its author.

That was the first and last time I ever got blackout drunk, but it made me a firm believer that the Ballmer curve exists.

As entertaining as your story is, I find it very hard to believe. If you were as drunk as you claim to have been, there is no way you were capable of writing "the best code I had ever written". A high level of intoxication severely impacts your attention span and ability to focus deeply. I speak from experience, and I'd wager it would be the same for almost everyone.

If this story actually happened, I'd hazard a guess that someone else wrote that code, not you.

The hard part is that the good codebases I've inherited were ones where I overlapped the original authors, or there was some chain of developers. Picking up a codebase with no context is relatively rare in my experience.

> How often do you look at some code and think "an idiot wrote this", and then realise you're looking at your own code?

Every single day. I am my own worst enemy.

>Maybe all code is technical debt?

I am still removing jQuery like dirty splinters.

But also I am using Vue.js... (ES6 modules are awesome). BUT, I am designing everything I build with Vue to be easily replaced with web components down the road. No extra plugins, no complications that are super specific to Vue if I can help it. (fool me twice... prototype.js, sigh)

I am expecting to get rid of Vue.js in 5 years or less, depending on how long it takes for web components to catch up (or another framework to replace Vue).

That being said, some of the vanilla js code I wrote years ago still works great, no need to replace it.

So, maybe technical debt is a trade off we can _manage_ with our systems consciously instead of "by accident".

I often say prototype.js was the FormMail.PL of Javascript.

I inherited a popular open source charting library. People like it because it has a data model - you can automatically brush and filter between charts.

It is intentionally a leaky abstraction so that there is always a way to customize it if it doesn't work the way you want.

There is only enough design to make it reusable, no claim to a "grammar" or other highfalutin abstractions.

There have been dozens of contributors and many parts are inconsistent.

Yet it still gets a ton of use 7 years on because it does 80-90% of what you want.

I've learned so much from this. Worse is better. Don't box yourself in with designs you don't understand.

Solve one big problem and after that be humble and let users work around anything you didn't think of.


If I reveal the name, then I must also admit it has plenty of bugs.

And I haven't kept up with merging PRs because it is a lot of work to test and integrate code. (Help welcome!)

With that out of the way: it's dc.js
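For readers who haven't used it: the "data model" mentioned above is crossfilter, which all dc.js charts share, and it is what makes brushing one chart filter the others. The core trick can be sketched in miniature (toy code, not the real crossfilter or dc.js API):

```javascript
// Minimal sketch of crossfilter-style coordinated filtering: each chart owns
// a "dimension"; a dimension sees every other dimension's filter but not its
// own, so brushing one chart updates all the rest.
function crossfilterLite(records) {
  const dimensions = [];
  function dimension(accessor) {
    const dim = {
      accessor,
      filter: null,                         // predicate, or null = unfiltered
      filterBy(pred) { dim.filter = pred; },
      // Records visible to this dimension's chart.
      visible() {
        return records.filter(r =>
          dimensions.every(d =>
            d === dim || d.filter === null || d.filter(d.accessor(r))));
      },
    };
    dimensions.push(dim);
    return dim;
  }
  return { dimension };
}

const cf = crossfilterLite([
  { year: 2018, amount: 10 },
  { year: 2019, amount: 5 },
  { year: 2019, amount: 7 },
]);
const byYear = cf.dimension(r => r.year);
const byAmount = cf.dimension(r => r.amount);

byYear.filterBy(y => y === 2019);        // "brush" the year chart
console.log(byAmount.visible().length);  // 2 -- the amount chart updates
console.log(byYear.visible().length);    // 3 -- a chart ignores its own filter
```

The leaky-abstraction point above fits here too: because the dimensions and groups are exposed rather than hidden, users can always reach underneath a chart when it doesn't do what they want.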

DC.js is the greatest most customizable charting library I've ever used. Thank you for that.


A few years ago, I inherited the whole infrastructure for a fintech startup. It was a few hundred instances in AWS. The company was possibly the first to operate fully in the cloud and pass financial regulations for that.

At the heart of it, there was a short list of instances to run, with their purposes. Almost everything was automated around that. You could add a line to order any resource in the world and have it running in the next 5 minutes: instance provisioned with standard OS setup and patches, DNS and aliases up, permissions for developers set, and services deployed.
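A setup like that usually reduces to one declarative inventory file that everything else is derived from. A hypothetical sketch of what such a "short list of instances" might look like (all names invented for illustration):

```yaml
# instances.yml -- the single list the automation reads.
# Adding a line here provisions the box, DNS, access, and deploys the service.
- name: api-1
  role: api-server
  region: eu-west-1
  team: payments
  env: production
- name: batch-1
  role: nightly-batch
  region: eu-west-1
  team: data
  env: production
```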

It was extremely efficient and well organized. I'm not sure who wrote it, but I'm pretty sure he's the only guy in the world who figured out how to use AWS, accidentally.

While I worked there, I updated it to support provisioning in any region, EBS backups, automated firewall groups and a few other things. Everything was tagged consistently with purpose/team/environment for identification and billing.

It was neat. I doubt I will ever again find a company that can set up hardware or manage resources at all decently.

To conclude: a coworker told me that new guys were hired after I left, and they undid most of it over the next 2 years.

Sounds like he basically wrote his own CloudFormation

It was mostly using ansible actually. The integrations are pretty good, better than what you can get with Terraform or CloudFormation.

Well, I'm only a lowly WordPress developer, but I'll say that most WordPress plugins are a PITA to extend the functionality of, but that the WP Store Locator plugin was very nice to work with. Just sensibly implemented object-oriented programming and extensive use of WordPress hooks.

Hey another fellow WordPress developer!

Another great plugin is Metorik's helper plugin! Bryce is amazing and he's so responsive and helpful. If you're looking for a tool to extend Woocommerce functionality, definitely check it out.

I inherited an internal tool from someone who left the company. They weren't a software developer. Everything had comments: not function comments/XML headers, but behavior comments. Some were "someone who knows how to code should clean this up" or "I want to do x but couldn't figure it out, this is close". It lacked clear abstractions; it was mostly one file. It wasn't even in a repository. The program was, however, purpose-built for use by the developer and people in a similar role at the company.

The team who was supposed to own it hated it, so I get to work on it :)

There's a saying about how it's easier to bring design to something without design than bring good design to something with bad design.

My current work's codebase. I've never seemed to like other people's code, not even my own after a month. But this code is very good. I would attribute it to the strong set of rules and design patterns everyone follows. All code/services are broken up into `unit`s which all follow the same pattern of how they're called, how the code is arranged, etc. This means there is no ambiguity for new starters trying to contribute to the codebase. In addition, our linter doesn't allow you to push code if it detects any errors. There are a few other elements involved too, but I think the most important thing is keeping a standard of where each bit of code should live, how it should be called, and ensuring everyone uses the same conventions (i.e. single quotes instead of double; all these small things add up).

> In addition to that our linter doesn't allow you to push code if it detects any errors.

I just added a couple of linters to a new project and am looking forward to having the computer flag any obvious errors before allowing a git commit.

Is it git hooks that run a linter before push?
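Typically yes. A minimal sketch of a `pre-push` hook that runs a lint step (the default `true` command below is a stand-in; a real hook would invoke eslint, flake8, or whatever the project uses):

```shell
#!/bin/sh
# Save as .git/hooks/pre-push and mark it executable (chmod +x).
# Git runs this before transferring anything; a non-zero exit aborts the push.
LINT_CMD="${LINT_CMD:-true}"   # stand-in; e.g. LINT_CMD="npx eslint ."

if ! $LINT_CMD; then
    echo "pre-push: lint reported errors, push aborted" >&2
    exit 1
fi
echo "pre-push: lint clean"
```

The same idea works as a `pre-commit` hook to flag errors before a commit is allowed.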


I've inherited a couple codebases at work that were pretty nice. They stood out among others because they did a good job at doing one simple thing very well. Libraries that start having a lot of feature creep, such as trying to support a bunch of different other libraries or trying to do things too cleverly to support a ton of different ways to use it, always end up being a mess.

So if the codebase were good, I would not be assigned to it. However, I always look to fix the following issues:

1) lack of extensibility. This comes from poorly scoped projects. Both previous dev and product manager didn't understand the value in what they were making.

2) built in headaches. This is related to (1) but it's kind of the opposite problem. I see deployment tools that don't make the options object available to read in all contexts, or unhelpful automations, like silent failures. This is often from someone inexperienced trying to be clever.

3) terrible engineering practices - storing prebuilt native binaries in git, deploying a custom (read unsupported) version of a tool like gpg or perl. This can represent just a terrible engineering culture, but I often find these practices can be sourced to someone with a title like Director of Research

4) Lack of scalability - this is the least worrisome thing I run into. It takes experience in big problems to know ahead of time with any accuracy where the bottlenecks will show up. If this were the only problem I ever ran into, I'd be a happy camper.

I once "inherited" an Adobe ColdFusion codebase that hadn't been touched in about 15 years. I guess the reason it hadn't been touched in so long was because it simply worked. It was a well structured project and easy to get in to. Implementing some minor new features was straight forward and thanks to an existing test suite easy to test.

Needless to say I was pleasantly surprised by that.

I inherited a well unit-tested JS MVC frontend written with classes using Resig's Simple JavaScript Inheritance and jQuery. It pretty much looks like modern React code, with a Pythonic style to match the Python backend (self = this everywhere, snake_case, etc.), but written in 2009. Currently in the process of migrating it to vanilla TypeScript with very few difficulties so far. A few months later and I could have been dealing with a CoffeeScript POC.

On the other hand I also inherited an angularjs frontend written by an ape.
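For context on why such a migration is straightforward: Resig's Simple JavaScript Inheritance emulated classical inheritance before ES2015 classes existed, so that style maps almost mechanically onto modern syntax. A toy sketch of the correspondence (not code from the codebase described above):

```javascript
// Resig-era style: an object literal of methods passed to Class.extend,
// with init() as the constructor and self = this to survive callbacks.
// (Class.extend itself omitted; assume Resig's implementation.)
//
// var Widget = Class.extend({
//   init: function (name) { this.name = name; },
//   label: function () { var self = this; return "widget: " + self.name; }
// });

// The same thing as a modern class; elsewhere, arrow functions replace
// the self = this idiom because they don't rebind `this`.
class Widget {
  constructor(name) { this.name = name; }
  label() { return `widget: ${this.name}`; }
}

console.log(new Widget("chart").label()); // "widget: chart"
```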

I was new in development, had some notions about how to program, and stumbled upon sup (https://sup-heliotrope.github.io/), a MUA written in Ruby that basically incorporated all the Gmail wisdom into a local, curses client. I was hooked. Looking at the code, everything was clearly defined and understandable, with boundaries set exactly where you'd expect them. I owe William Morgan, its creator, for my interest in programming and the reason I started this career. Maybe some of it is also due to Ruby itself?

If you don't know wmorgan, he's the one who created trollop and the leveldb-ruby gem. Any Ruby practitioner should know what I'm talking about.

Code written in Haskell is often really easy to pick up. Due to its pure functions it can easily be picked up function by function, and the GHC REPL gives you an easy but powerful way to poke around. Of course there are also the crazy type-level EDSL libs which are hard to dissect, but in general Haskell has been the best language for me for picking up code written by others.

Maybe, I'm a bit skeptical, but I'm not a very good Haskell programmer.

I see too much symbol soup and point-free style makes it tougher to understand the code.

There must be a correlation between something working well and being nice to work with, and people staying at those companies to work on those projects.

Inherited https://broadbandmap.nz/. The frontend was done by an external contracting company; it looked like it should once I got it to run.

They gave us a stock rails CMS/static site generator that did the frontend part.

Almost a full rewrite, as we were a Python shop and there was a mess of jQuery pubsub involved, so you wouldn't know what was happening in what order, or whether there would be a bug because something wasn't subscribed to the event bus in time. Almost goto level.

We met with the external company a few times, and they made it clear to us that they weren't getting paid to fix tech debt or make the handover process smooth for us; they were just getting paid to make it look like the design.

This is actually a better state than the other projects I can think of, more because it's a small project and the levels of technical debt are thus lower because there's not as much code/complexity, not because the code that was there was good.

All the tests were failing (very flaky by design, using browser driving to check for very specific bits of text on the page that had long since been replaced), so along with the almost-full rewrite we also deleted all their tests and wrote our own.

I needed to extend a framework that a thesis had been written about.

The thesis explained all the high level concepts.

And every line of code was commented.

It made it quite easy to understand why and how he was doing things.

Comments go a long way to help people, despite the "the code is the only documentation" dogma.

Deis Workflow is a really nice codebase, and I like to talk about it. As "Team Hephy", we committed to maintaining Workflow and supporting newer versions of Kubernetes when Deis went to Microsoft. Their release architecture and testing framework were just really top-notch. The number of things that just haven't gone wrong in the 2 years since they stopped committing and we took over is itself probably the biggest testament.

There are over 100 repositories with code in different languages, and I can honestly say that each time I've ever needed to go in and work on something, usually from a position of little or no knowledge about the code or internals other than as a black box, I just find really well-organized code that seems like it was thoughtfully put together by a bunch of people that assumed they were always going to be the ones who would be stuck dealing with the consequences of whatever decisions they made every day.

https://github.com/teamhephy/workflow (from https://github.com/deis/workflow)

If you're looking for some training wheels for your beginner Kubernetes experience, you could do a lot worse! The product itself is basically a "Bring-your-own-Infrastructure" open source Heroku work-alike.

https://web.teamhephy.com / https://blog.teamhephy.info / https://docs.teamhephy.info

Back in the day I would inherit or have to work on large codebases written in Delphi/Pascal from several different authors. While there were a few eyerolls here and there, I never recall being overwhelmed or overworked trying to discover what I needed to know.

But then, that's Delphi: a pure pleasure to work with. I only wish I had more use cases for it. I would put it back in the lineup immediately if I did.

Believe it or not, it was a large (~1M lines) in-house "CMS generator for different clients" type of thing written in Perl. This was right around the time 2nd-gen Rails clones were starting to be a thing, a fair bit after Perl was the popular web language of choice.

The people who wrote it were competent at writing maintainable software. It was well thought out in terms of design and discoverability and tooling, it was well tested, and the authors cared a lot about not obfuscating purpose.

It had good dev ramp-up docs and a pretty clean commit history.

I think probably the largest differentiating factor for this particular codebase was that it wasn't developed under duress of time-pressure and the previous authors were past the point in their programming careers where they learned how to not overestimate their own ability or underestimate unknowns in their domain. The company was relatively small, technically competent at the top, and culturally less concerned with inflating profits than making enough money to live comfortably and do their thing.

I didn't inherit them, but I've had to read through portions of the sqlite and Mercurial codebases and found both to be very well written. Good code can and does exist. It rarely exists in the private closed source sector, however, because teams either are unwilling to invest in it, lack the proper skills in their developers, or both.

I recently took over GNU LibreDWG, which had an especially nice codebase. The infrastructure was severely lacking: not enough warnings and probes enabled in autotools, many missing configure settings, totally bonkers unit tests written by some late Google Summer of Code students. But the base architecture was excellent, with a DSL-like specification language (via C macros); it was easy to extend, and it works perfectly well as documentation. Readable code.

Documentation had the proper GNU framework, of course without man pages, as with most such GNU stuff, but this was also easy to add. The other testsuite parts were also easy to add; DejaGnu still rocks, gtest would have been a horror show.

GNU coding standards are a godsend.

People who inherit code will always complain about something as that's just the nature of it. But often those complaints are more to do with the inheritor not understanding the intent. Code comments in those cases should describe the underlying design decisions or there should be thorough documentation of the design that explains it.

One wouldn't know about the best code they have inherited because beautiful code denies its existence so perfectly.

Unix V7 which I worked on porting. Some say it was the last Unix kernel that one person could completely understand.

It wasn't technically inherited software, since no one imposed it on me, but when I shop around for open source libraries I tend to prefer those that have some in-code docs and a few high-level tests to check that the APIs keep working from one version to the next.

The worst codebases I inherited, on the other hand, are a lot more memorable. The absolute worst was a ~500k-ish line WordPress/BuddyPress mess of undocumented spaghetti, complete with a slew of modifications to the WP and BP core files. It took me weeks to move the latter into separate plugins in order to upgrade the mess.

An unreleased idTech2 (ish) game.

id Software writes beautiful code, or did at the time.

Rails codebases I've looked at are usually in pretty good shape. I attribute it to the conventions that are fairly strongly enforced.

I have inherited many rails codebases and do not share this experience. I have found that often teams end up not using typical conventions and start re-inventing things well before it was necessary.

To be fair, I usually seem to find myself given rails 2.x and 3.x codebases which have sometimes been in production for close to a decade. That is a lot of time to build serious technical debt.

PHP dev who inherited a Java Spring Boot application. Sooooo much hidden magic going on. This was about a year ago, and since then I've become more comfortable with the Spring way, but I still feel a little uncomfortable with it, even though it's the cleanest codebase I own.

Use a language that prioritizes maintenance and correctness instead of prioritizing speed-to-develop-a-prototype and brittle abstractions tailored for onboarding users who expect familiarity, and there will be a lot fewer expletives involved. There are real tradeoffs involved here. Without tradeoffs, there would be no need for choice and thus no need for strategy. Once those tradeoffs are determined, the crux of a strong strategy is how the chosen activities reinforce one another to drive an unfair advantage.

A simple, direct php site without a lot of weird, novel bullshit in it.

I haven't inherited it but the C source of the simple terminal st is really really clean and good: https://st.suckless.org/

I highly recommend it, it's only a couple thousand lines so it's worth the read. Never before have I seen such readable and clean C code. I read it and then implemented my own terminal emulator from scratch, which we now use on an embedded platform at work to debug problems that cannot be debugged using a PC (for example because Ethernet and serial connections to the device are broken).
