Ask HN: What's the largest amount of bad code you have ever seen work?
428 points by nobody271 on Nov 13, 2018 | 578 comments
I think I've broken my own record with this one: ~2500 lines of incoherent JavaScript/C#. Works though.

I maintain an application in the construction industry space. It was created 25 years ago by a construction worker who had never written a single line of code before, but because he caused a lot of problems on the construction site, they gave him a Programming 101 book and let him build it.

15 years later the app was close to half a million lines long, a huge bowl of spaghetti code. The only comments in the whole codebase were timestamps. I don't know why he dated his code, but I find it fascinating: he basically never deleted anything, so you can find the different time frames in which he discovered various concepts. There is a use-exceptions-instead-of-if period, there is the time when he discovered design patterns, and there is the time before he learnt SQL, when all the database queries were done by iterating over whole tables, and so on. I am sure I will find a commented-out Hello World somewhere in the code someday.

I have been working on this codebase for 10 years. Code quality has improved and major issues have been fixed, but there is not enough budget to actually rewrite the whole system, so it is still more or less a huge spaghetti monster, and I have gotten used to it.

I have to salute this construction worker for building a solution that is apparently so valuable for the business that they can’t simply replace or rewrite it. This probably means that it solves a real problem for them, and adding 30.000 lines of code per year without any formal training or much tooling is no small feat either. I understand the criticisms and laughs here from the “real” software developers, but damn it’s just impressive what people can create on their own given enough time and motivation.

It is impressive indeed. The coding started right around the time Windows 95 was released. There was no Stack Overflow, and they didn't really even have internet back then. The programmer (as far as I know) didn't even speak English, so he had access to a book or two in German and the code snippets in the help section of Delphi. At the same time, creating applications with a UI had only just started, so there was very little experience available, especially in rural Austria.

The company did try to migrate to other software a few times, but the software is so specific to the given industry and the legislation of a small country that the companies who tried to create similar software usually went bankrupt soon after.

Rural Austria? Please provide a company name ... or at least first and last character :)

Somewhere out there, there's a software developer who was assigned the task of building the team's office using 30,000 bricks, making all kind of spaghetti patches to prevent it from falling over, and the construction workers are laughing about it on a construction worker forum.

And this is the wall where he discovered you can use cement to bind the bricks. And here he even mixes that cement with sand and water. This is a safer place to stand.


A spaghetti monster which solves a real business problem can be improved, chunked into pieces, gradually rewritten, whatever improves maintenance. If need be, there will be funds and time for doing so.

By contrast, an impeccably architected, layered, no-design-patterns-omitted product which solves no business problem... oh, the horror.

One of my first jobs in the industry was really similar. I ended up sitting down with a friend and rewriting it in C#. We didn’t have permission, but no one knew until our codebase was already in working shape (a month or so). We got away with it because the original codebase was so bad that it hadn’t shipped in 5 years. Months of 0 productivity were normal. My friend and I went on to rewrite the entire suite of products over the course of a year or two. We then started our own business. The rewrites were the most successful products that company had ever had in the modern era. Rewriting is not always a bad idea, and it can be less expensive in the medium run. Few seem to realize this, thanks to Joel Spolsky’s blog post on the matter being seen as dogma.

We had a bunch of code at work that everyone sort of begrudgingly used that I wrote almost 20 years ago, not knowing too much about what I was doing. I have recently rewritten it – about 100kLOC of messy C++ turned into 25kLOC of bliss. The test coverage we had ensured that we didn’t have to worry about anything breaking. I hate myself a bit less :)

I feel like you could make a huge chart on the wall showing the epochs and what the previous developer didn't know at that time. Like "why would it be done this way? Ah 1997, let's see on the chart... Ah right, Greg doesn't know SQL here"

Haha that would be amazing!

What kind of problems do you cause at a construction site that do not get you fired, but reassigned to programmer with a Programming 101 book?

Probably fell from the roof a few times, or was not really handy with a hammer or something.

The company is quite fascinating. They started around 1945, and over the years they've become a small conglomerate. There are three or even four generations working together, and once they like you as a person they will find something for you to do.

Is the original programmer still at the company in any capacity?

No, he left before I was hired as a consultant. I never got the chance to talk to him in any form. I am the only person who has touched the source code in the past 10 years. I don't know why he left or where he went.

From my experience this means they had a strong union.

Perhaps he played too much Minecraft and tried building a Turing machine from the materials on the construction site?

C can't be that different from redstone right?

Like reading the journal you found in the abandoned house you just moved in and it belonged to the previous kid that used to live there. Sounds like a movie.

Oh yeah, it might be a comedy where two people argue about spaces vs tabs, and then along comes our function of 20k lines in a single block of code that has never heard of using either of those... nor of moving repeated code into its own function.

This sounds quite awesome, terrible, and he sounds like he was jumping into a very deep end. Quite sad that he seemingly didn't have someone to tutor/mentor him moving to this role. Given that it worked, and there was a learning curve as you described, over a decade, he seems to have had some big amount of determination to get things to work.

I recall a civil engineering suite of programs that had been converted from Basic into Fortran IV.

The BASIC was so old it only had two-character (yes, two) variable names, and the Fortran code made liberal use of arithmetic IF statements!!!!

An example of one is IF (S12 - 1.0) 13, 13, 12
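
For anyone who hasn't met one: an arithmetic IF evaluates the expression and jumps to the first label if the result is negative, the second if it is zero, and the third if it is positive. A rough Python sketch of the example above (the function names are made up for illustration; only the expression and the labels 13 and 12 come from the original line):

```python
def arithmetic_if(value, negative_label, zero_label, positive_label):
    """Return the label a Fortran arithmetic IF would jump to."""
    if value < 0:
        return negative_label
    if value == 0:
        return zero_label
    return positive_label


def next_label(s12):
    # Mirrors: IF (S12 - 1.0) 13, 13, 12
    # Negative and zero both go to 13, so this reads as
    # "if S12 <= 1.0 go to 13, otherwise go to 12".
    return arithmetic_if(s12 - 1.0, 13, 13, 12)
```

So the one-liner is really just a two-way branch on `S12 <= 1.0`, written as a three-way jump.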

>> there is not enough budget to actually rewrite the whole system,

As pointed out elsewhere, this is definitely solving a real problem, the longevity of the app is the proof.

Instead of rewriting, can you replace parts of it with newer idioms? An MVP/PoC for a newer way of solving a problem that the software solves (AR may fit here), with some tangible gains (the gains matter more than the technology), can lead to approval of a mini budget for that MVP, and who knows what that can lead to.

Well, maybe. The biggest problem is that it uses a database from the 80s, and in such a way that sometimes it acts like a database and sometimes it is used by copying files (one file per table) around random directories with a custom locking mechanism.

The app consists of maybe 20 different codebases that generate around 30 executables, kind of randomly fetching source files from different codebases as the programmer saw fit. That, plus random "fixes" to system modules/components, makes it very, very hard to do much groundbreaking work.

That's fascinating and gave me a genuine laugh.

I think you won! I had a similar experience at my first professional job, a CAD/CAM application that solves building structures for antiseismic regulations. Every developer had his own lib, with duplicated code and different bugs in each lib, no comments, every screen built from copy-pasted code of another screen, and no tests at all. Undiscovered bugs were out there for many years without any way of knowing, along with special sleep functions producing intentionally slow code.

"lava flows" anti-pattern

What is he up to now?

I don't know. The company is based in rural Austria, so it took the company quite some time to find another engineer who would take on this project (me). I have never met him or had any other contact with him.

Well, if you're working on a codebase for 10 years, then "no budget" is not really an excuse, sorry. As a responsible engineer, you should have either convinced management to spend some of your time on refactoring main parts, or cleaned it up yourself bit-by-bit every time you touch something. 10 man-years should be enough for a program that was created in 10 years by a single rookie dev.

Sure, if this were my job then I would do it, but I am just a contractor with a set number of hours devoted to the project. The first few years were spent fighting fires, as the company needed this very specific software to function. Currently it is just about keeping an eye on it so that it keeps running, and occasionally fixing a report or updating data pipelines.

I would say that a rewrite would cost about 2 million euros, which is a really big price tag for a company that uses this system as a back-office tool.

Some companies have back offices that they've spent considerably more on. Airline companies, for instance, may have a tool that lets the person at the gate check who a passenger is, what their deal is, etc. And then GDPR happened, along with the bill to ensure that every rule is followed to the letter, and suddenly 2M€ isn't that bad after all...

Years ago as an intern at Microsoft, I had code go into the Excel, PowerPoint, Word, Outlook, and shared Office code.

Excel is an incomprehensible maze of #defines and macros, PowerPoint is a Golden Temple of overly-object-oriented insanity, and Word is just so old and brittle you'd expect it to turn to dust by committing. There are "don't touch this!"-like messages left near the main loop _since roughly 1990_.

I had a bug in some IME code causing a popup to draw behind a window occasionally, and a Windows guru had to come in to figure it out by remote-debugging the windows draw code with no symbols.

I learned there that people can make enormous and powerful castles from, well, shit.

"Don't touch this" around the main loop can mean being able to make promises about responsiveness, reliability, etc.

Frequently there are critical code sections where it is much easier to tell people "don't touch it" rather than training people how to work on it safely.

When that is the case, would it not also be a really good place to explain why not, or provide a link to the place where such an explanation is provided?

It is important to point out that most computer systems are running non deterministic operating systems.

For example, code running in JVMs on top of non deterministic operating systems sometimes behaves in really odd ways. Sometimes a main loop is stable for reasons nobody understands.

In reality there often isn't time.

Getting it done > Getting it done properly as far as management is concerned.

Ever had this discussion with a coworker?

Coworker: "I hadn't enough time to do it right"

You: "Given enough time, how would you do it differently?"

Coworker: "............" (crickets)

IMHO it's not related to deadlines only ; the "not enough time" argument is often a comfortable fallacy, keeping us from facing the limits of our current skills.

I found it to be especially true with testing. I've lost count of how many times I've heard "we didn't have time to write (more) tests". But testing is hard. And when given enough time, these developers don't magically start doing it "right" overnight.

Knowing when something isn't right is easier than knowing how to do it right. So I would be wary of saying it is a lack of skill on the part of your co-workers.


I had to write a bespoke popup window launcher for a large gambling company in the UK. The games were mainly the awful slots games that you see in motorway service stations. These are basically one arm bandits on steroids.

There is a lot of logic that was in JavaScript that should have been in C# and I had to design it correctly to work with a third party Proprietary CMS system and I had to manage session tokens on 3 to 4 third party systems. Not easy.

It took me about 2 weeks of just reading the code and absorbing it, drawing lots of diagrams of how data flowed through the system and then porting that logic over to C# in a way that would work with the CMS system in a logical and OOP fashion and handling auth tokens effectively.

I've quickly realized that there is never enough time.

If anyone feels that there is enough time then (depending on the position: dev/manager/client/etc.) they start slacking, moving focus, moving deadlines, moving staff, demanding more features/support/documentation or new requirements analysis, or getting more pissed about smaller bugs.

>I found it to be especially true with testing. I've lost count of how many times I've heard "we didn't have time to write (more) tests". But testing is hard. And when given enough time, these developers don't magically start doing it "right" overnight.

Bingo. It's not necessarily only skills though. It can be myriad reasons and "no time" is just the easiest excuse they can think of. In big companies I've often seen the company process prescribing such bad tools that a good TDD testing strategy is impossible to do with those tools, but they won't move away from them, because somebody in purchasing already bought 10,000 licenses for this bad tool (which is often just a bad GUI, which doesn't really help, except for selling the thing).

The worst tool was a bad GUI where you couldn't even define functions you want to use, that had slow (>1h), non-deterministic, test execution, for a unit test.

To even write unit tests effectively you need to write your code in a certain way.

In C# this normally means using IOC + DI.

Also almost nobody I know does proper TDD. I know it is very convincing when one of the TDD evangelists shows you how to write something like a method to work out the nth number in a fibonacci sequence using nothing but just writing tests.

In reality, 95% of the developers who even write tests write the logic first and write the test afterwards to check that the logic does what it should.

>To even write unit tests effectively you need to write your code in a certain way.

>In C# this normally means using IOC + DI.

I've become quite partial to functional programming in the last few years. Side effect free functions with functions as interfaces for DI lend themselves perfectly to TDD and data parallel async without worrying too much.

C# is now slowly taking over most of the good features from F#, but I think the culture won't transform so easily.
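
A minimal sketch of the "functions as interfaces for DI" idea, in Python rather than C# or F#. The `greeting` function and its `today` parameter are hypothetical names invented for this illustration. The point is only that the dependency (here, the clock) is passed in as a plain function, so a test can substitute it without an IoC container:

```python
from datetime import date
from typing import Callable


def greeting(name: str, today: Callable[[], date]) -> str:
    """Build a greeting; the only outside dependency is the injected clock."""
    current = today()
    if (current.month, current.day) == (1, 1):
        return f"Happy New Year, {name}!"
    return f"Hello, {name}."


# Production code would pass the real clock:
#     greeting("Ada", date.today)
# A test passes a fixed one instead:
#     greeting("Ada", lambda: date(2018, 1, 1))
```

Because the function has no hidden state, the test needs no mocking framework, and the same function can be called from several threads or tasks without coordination.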

and let’s not forget the ever popular “bug-driven testing”.

Unlike your coworker, I _always_ have a plan. Often a dozen of them. With various pros and cons for each.

But also unlike your coworker, I probably _figured out how to do it right_ in the time given. It's pretty rare that I don't have time to do it right; it does happen (especially with extreme instances of scope creep and requirements drift), but it's rare.

Which I guess is your point? The time excuse is just an excuse, and a good developer writes good code.

The one that gets me is when you've designed something as simple as possible and then to make it "simpler" people insist on making it less general and paradoxically more complex in a small way.

Related to that is the "obvious performance fix" that doesn't perform faster that keeps burning up time for years long after it was proven to not be faster because freshers never found out about it and the oldsters forgot.

This is a complete fallacy IMO. The time cost is tenfold down the line, when people are attempting to reverse engineer it in order to maintain it.

Presumably, down the road, you'll either be a defunct company or doing well enough to afford 10 times the manpower to fix things.

Facebook was a spaghetti code mess in the beginning. I'm sure it caused them some growing pains, but moving too slowly early on would have likely been more costly.

> Presumably, down the road, you'll either be a defunct company or doing well enough to afford 10 times the manpower to fix things.

Only in startup land which is still a small fraction of our industry.

Most places will never have 10 times the manpower to fix things and are hurting themselves by not doing them properly in the first place.

> Facebook was a spaghetti code mess in the beginning. I'm sure it caused them some growing pains, but moving too slowly early on would have likely been more costly.

Survivorship bias: for every Facebook, how many potentially viable companies never got off the ground because users couldn't tolerate using their steaming pile?

Notably, MySpace is often cited as a company that failed because their codebase was terrible, which prevented them from adding features as quickly as Facebook could (despite having many more engineers at the time.)

I don't believe that for one second considering the hacks that Facebook has had to do around the limitations of PHP.

I'm completely opposed to the view that bad craftsmanship is acceptable because of time constraints. You are paying for it very dearly, very soon. It is of the utmost importance to write the best code you can from the beginning, and I don't believe it slows you down very much, if at all.

If you've ever seen a software product where something that should take a weekend takes months to get out, it's often not because the problem is more complicated than you'd think, but because of a mangled, complex codebase which prevents anyone from getting real work done.

Edit: Removed a bunch of redundancy.

Both you and this comment's parent are correct. Which doesn't say anything about the nature of software engineering or project management, but more the importance of taking context into account when considering advice.

In this case, the evidence is that excel/word etc are doing fairly well...


The long and short of it is that as a contractor I have to get it done. It will probably be me making the changes later, and I make sure I put these things called comments in.

Also, developers pretending that code quality is an either-or proposition are presenting a false dichotomy. You can write 80% of it in a correct manner, and the other 20% could be just hacks to get it done in time. You can't write the perfect system.

So I am sorry you are the one being fallacious.

Try telling that to a pointy-haired boss.

To be fair, if you are implementing a feature or fix in Word and think you need to edit the main loop - 99% chance you are wrong and the fix would be better placed elsewhere. And 99% chance that an edit will cause regressions or changes in behaviour elsewhere.

Yup ! Code is the how.... Comments are the Why

1990 you say.... maybe the programmer was just an MC Hammer fan.

Waited a bit for this question to pop up, but it didn't, to my surprise. So:

> ...remote-debugging the windows draw code with no symbols

Why, specifically, were no symbols available? I can't come up with an explanation. Surely old symbols are kept. Do checked builds take longer to iterate on (ie build), or something?

That's a great question! I didn't mean to imply that we absolutely couldn't have used symbols -- the answer is just because he was able to figure it out without them and it was less effort to try without first.

Office and Windows are different teams and units, so one dev on one team typically wouldn't have access to all of the symbol info for the codebase of the other. Setting that up takes some hoop-jumping, so he tried without and ended up figuring things out just fine over a few hours.

What I wanted to demonstrate was that in that moment all he had available was shit, and he still managed to push the castle higher.

Wow, I see.

I must admit, I do very much wonder what kind of environment Microsoft would be if teams were less segregated. I found http://blog.zorinaq.com/i-contribute-to-the-windows-kernel-w... in the comments, which seems to hint at the same sort of theme somewhat - particularly the bit about contributing to teams other than your own. There's a strong notion of isolation.

This is just thinking out loud, a response is not required. Everywhere has pros and cons. I'm (even with all this moping) actually less hesitant about MS as a whole than the rest of FAANG (except for N, which I also don't see a problem with) - not because of the whole "new MS" thing, or GH, but because everyone else seems to have fewer scruples than I consider to be a viable baseline. So there's that. :)

It's just kind of sad to see these kinds of inefficiencies, and it would be cool to eliminate them. Of course, it'd unleash organizational chaos for a while, but of course it would be totally worth it.

Maybe in the year 2050 some future intern/employee will be adding to the Office code and wonder about the "Don't touch this" code relics of the past that someone from long ago left for future generations.

I bet in 2050 people will puzzle over the microservice-cloud-javascript-Go-caching legacy systems that were developed in 2018 and be scared...

I mean, in 2018, I'm scared witless of all that. Anything cached in background to someone else's computer by something as unstable and slow as ECMAScript is just... a very bad idea. Keep programs local, and don't use Web coding / scripting for anything but the Web itself through a free-standing browser.

Can you guess what happened here:

https://news.ycombinator.com/item?id=15745250 ?

It appears that a bug in an Office component was fixed by manually editing the binary. Is that probable?

and this tradition proudly continues with Windows 10!

The worst program I ever worked on was something I was asked to maintain once. It consisted of two parts. The first was a web application written in ASP. The second portion was essentially Microsoft Reporting Services implemented in 80,000 lines of VB.NET.

The first thing I did was chuck it into VS2010 and run some code metrics on it. The results were, 10 or so Methods had 2000+ lines of code. The maintainability index was 0 (number between 0 and 100 where 0 is unmaintainable). The worst function had a cyclomatic complexity of 2700 (the worst I have ever seen on a function before was 750 odd). It was full of nested in-line dynamic SQL all of which referred to tables with 100+ columns, which had helpful names like sdf_324. There were about 5000 stored procedures of which most were 90% similar to other ones with a similar naming scheme. There were no foreign key constraints in the database. Every query including updates, inserts and deletes used NOLOCK (so no data integrity). It all lived in a single 80,000 line file, which crashed VS every time you tried to do a simple edit.

I essentially told my boss I would quit over it as there was no way I could support it without other aspects of work suffering. Thankfully it was put in the too hard basket and nobody else had to endure my pain. I ended up peer reviewing the changes the guy made some time later and a single column update touched in the order of 500 lines of code.

There was one interesting thing I found with it, however: there was so much repeated/nested if code in the methods that you could hold down Page Down and it would look like the page was moving the other way, similar to how a wheel on TV looks like it's spinning the other way.

When I do code necromancy, instead of understanding the code, I try to understand what it does, how it interacts with the world, and then recreate that behavior.

Meaning I capture all the I/O and recreate it. SQL, HTML, PDF, CSV, whatever. Serialize everything, before and after. And then do diffs on the outputs to see if new code reproduces expected behavior.

Much easier than dead code removal, code deduping, incremental changes, backfilling tests, etc.

Once someone captures, documents what the code is actually doing, the real refactoring begins. Removing unnecessary queries. Grooming data. Simplifying schemas. Aligning the app with the biz. Etc.

This is often the only sane thing to do.

It’s my default approach with thorny scientific code: get everyone to agree on what the output should be for a bunch of relevant inputs, get the original system’s output, hash it, and then write a bunch of tests in a new project that all assert that these inputs produce outputs whose hashes are as follows...

then never look at the guts of the horror again.
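
The capture, hash, and assert approach described above might look roughly like this. Everything here is an assumption for illustration: `legacy_report` stands in for the original system, `new_report` for the rewrite, and the inputs are invented. In practice the recorded hashes would be saved once from the real legacy system and checked into the new project:

```python
import hashlib
import json


def legacy_report(rows):
    """Hypothetical stand-in for the original system's output."""
    return "\n".join(f"{r['id']},{r['total']:.2f}" for r in rows)


def new_report(rows):
    """Hypothetical rewrite that must reproduce the legacy output exactly."""
    lines = []
    for row in rows:
        lines.append("{},{:.2f}".format(row["id"], row["total"]))
    return "\n".join(lines)


def output_hash(text):
    """Hash an output so the test stores a fingerprint, not the full text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def record_golden(system, inputs):
    """Run the trusted system once and remember one hash per input."""
    return {json.dumps(i, sort_keys=True): output_hash(system(i)) for i in inputs}


def check_against_golden(system, golden):
    """Return the inputs (as JSON strings) whose output hash no longer matches."""
    mismatches = []
    for key, expected in golden.items():
        if output_hash(system(json.loads(key))) != expected:
            mismatches.append(key)
    return mismatches


# Invented sample inputs; real ones would be agreed on with the users.
INPUTS = [
    [{"id": 1, "total": 9.5}],
    [{"id": 1, "total": 1}, {"id": 2, "total": 2.25}],
]
GOLDEN = record_golden(legacy_report, INPUTS)
```

A test in the new project then just asserts that `check_against_golden(new_report, GOLDEN)` is empty. One caveat with hashing: a failure tells you that an output changed, not what changed, so it can help to keep the raw outputs around for diffing.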

> spinning the other way

Hahhahah. That is the funniest shit I have read in a long time!

At my first gig I teamed up with a guy responsible for a gigantic monolith written in Lua. Originally, the project started as a little script running in Nginx. Over the course of several years, it organically grew to epic proportions, by consuming and replacing every piece of software that it interfaced with - including Nginx.

There were two ingredients in the recipe for disaster. The first is that Lua comes "batteries excluded": the standard library is minimalist and the community and set of available packages out there is small. That's typically not an issue, as long as one uses Lua in the intended way: small scripts that extend existing programs with custom user logic (e.g. Nginx, Vim, World of Warcraft). The second is that Lua is a dynamic language: it's dynamically typed, and practically everything can be overridden, monkey patched and hacked, down to the fundamental iterators that allow you to traverse data structures.

This was the playground for the guy to create his own reality.

Lacking a serious standard library, he crafted his own. Where, in a normal world, a file rename function (for example) would either do the job or return an error to the caller, he chose a different approach. Functions were autonomous, highly intelligent pieces of code that tried to resolve every possible problem, entangled with external logic, so grokking the behaviour of the most fundamental things was challenging, let alone understanding fragments of code composed of library calls.

Lacking an OO model in Lua, he built his own. I could spend a lot of time describing what was wrong with it, but it suffices to say that each object had SIX different 'self' or 'this' pointers, each with slightly different semantics. And highly entangled with external unrelated logic, of course.

I'll save the stories about the scheduler and time series database he built for another time.

I'm a masochist and would very much like to hear about the scheduler and time series database!

I've seen personal reality building happen in Lua several times already. It's very seductive to intelligent solo artists who are given a lot of freedom.

To be fair, I've also seen it happen in C, C++, and JavaScript.

"personal reality building" is the greatest quote in this whole thread, it's perfect. Will be using.

That's spot on, the guy is indeed one of the most intelligent, knowledgeable and dedicated persons I've ever met. Actually a really good guy.

Ruby as well. I went down that path once. I learned my lesson.

“This was the playground for the guy to create his own reality.”

Cue the Neil Gaiman Sandman-styled comic about his dark adventure

Well, OO libraries for Lua were neither popular nor standardized until not long ago, so the default suggestion was actually to create your own OO lib, which many projects ended up doing (mostly in the same way), which led to some trouble.

For replacing nginx with lua, that is curious. Openresty is not being used? No web framework? (lot of lua dead web frameworks in the road, by the way)

Also, "creating your own reality" is usually a bad thing in any language. That usually happens when you're developing something alone, for long and didn't give maintenance to other peoples code much in the past.

In another related point: lua is NOT supposed to be used as glue/extension code. It was designed so that it would be easy to do so. It is that "you can" more than "you should".

Lua doesn't come with batteries included, by design, and doesn't have a plethora of libraries available or (cough cough) easy to pick from, but they do exist and they do solve most problems. Nonetheless, truth be told, most of the good APIs for Lua one sees nowadays were not available 1 or 2 years ago, or were not mature enough.

To conclude, the lack of type hinting or optional static typing in Lua can create problems for bigger projects if good design, testing and documentation are not enforced from day 1. Most scripting languages suffer from this. You guys could try "ravi" to get this (almost) for free.

This is gold.

Part 2 please???

We've got a Robert Heinlein fan here...

Also, give us part 2 =D

Behold, academia.

The only maintainers of this code, ever, have been grad students and postdocs. I estimate there have been about 12-15 generations worth. This code has supported hundreds of publications in its lifespan.

A codebase that began life in 1987, in C. First ported to matlab in 1999. First source control was added (as SVN) in 2015. Between 2015 and 2018, there were 6 commits total, yet 3 people graduated out of the lab from it. Probably 100,000 loc total, of which I estimate maybe a third is ever used. 1400-line matlab functions are normal-ish. I've found loops nested 11 levels deep.

It's a series of psychophysical experiments. Each experiment exists in at least 4 different versions side by side in source, each named slightly different, often by incorrect datestamp of last modification. Version control across machines is not well maintained, so you have to diff everything before you can copy or move files lest you accidentally blow something away completely.

Oh, and it's mexed and wrapped for use on a mac on exactly one snow leopard machine, hardware from 2007.

edit: I think this counts as a job, not a student experience, because I am not a student. I just have to clean this mess up once in a while.

Yeah. At this point I think teaching source control and abstraction should be part of the "scientific method" part of the course.

I think it is a problem in general with code for experiments. You just need to change a tiny bit for new experiment and you don’t want to ruin earlier experiment.

There's a very simple solution: guard your tiny bit of code with e.g. a command line flag that defaults to "off", and commit that.

It's a little more work sometimes, especially when your experiment changes things structurally, but it pays off over and over.
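
A minimal sketch of the flag-guarded approach, assuming the experiment is a Python script driven from the command line. The flag name `--new-stimulus-timing` and the two durations are hypothetical, picked only to show the shape:

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="Run the experiment.")
    # store_true flags default to False, so existing invocations are unchanged.
    parser.add_argument(
        "--new-stimulus-timing",
        action="store_true",
        help="use the experimental stimulus timing (off by default)",
    )
    return parser


def stimulus_duration_ms(args):
    """Pick the timing; the old behaviour stays the default path."""
    if args.new_stimulus_timing:
        return 250  # hypothetical value for the new experiment
    return 500  # hypothetical value used by the earlier experiment


def main(argv=None):
    # argv=None makes argparse read the real command line.
    args = build_parser().parse_args(argv)
    return stimulus_duration_ms(args)
```

Running the script with no arguments reproduces the earlier experiment, and only an explicit `--new-stimulus-timing` opts into the change, so both variants live in the same committed file.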

Isn't this what git branches are for?


There's a great article somewhere about how the normal version control flow doesn't really work for this style of computing.

You want to keep both "versions" of code live and active in the same place at the same time (often in the same notebook).

People end up with methods named methodName, methodName2 etc, which isn't very good. But once you see the workflow you understand why normal version control doesn't work either.

There should be a solution to this, but AFAIK there isn't yet.

No, branches break down badly when you have many experiments. You end up with a bunch of incompatible versions of the system that you need to merge together later which can be a huge mess (depending on the size of the changes).

By all means, branches are great for super-prototypey early code, but once you know that you want to keep the ability to run the experiment around, guard it with a flag and merge it into mainline to avoid nightmare merges later!

A customer-facing dashboard. Yes, a dashboard. How bad can a dashboard be, you ask? Well, for one, the dashboard had tabs, and each tab was a separate webapp hosted on a separate server. And each team was responsible for developing and maintaining the webapp that it was in charge of (i.e. the User team in charge of the Users webapp, the Feature1 team in charge of the Feature1 webapp).

Now add on the fact that different teams had varying levels of frontend competence. This led to some webapps being in (badly written) React, some in Angular, some in JQuery and one in Angular2 as well. Some were java API backed, some were NodeJS backed and one used Ruby in the backend. Oh yeah, and each had different datastores as well.

Now alongside this, there's no central auth framework, so each webapp had their own way of how to determine user auth (there was a shared cookie thankfully on the *.company.com domain), so there were 6-7 possible login and logout pages as well.

When the company had a brand redesign and needed all customer facing stuff to re-align with new design, we had to literally re-design 6 dashboards which have used different paradigms and different tech stacks.

To this day, the dashboard still exists and is used by customers (main dashboard for a company valued at $500M+), and the most common issues are related to auth (e.g. random logouts when switching tabs), data inconsistency (they have crons which update data from one DB to another, but it's not immediate) and inconsistent design and UI behaviour (since the JS is also different for each app), which pisses off many users.

To this day, I'm not sure who signed off on this pointless dashboard design.

To some extent this sounds like Spotify. I have been told it's basically iframes stitched together, and each frame is owned by a team [0]. I do think Spotify has some better auth communication though.

0: https://www.quora.com/How-is-JavaScript-used-within-the-Spot...

Aha! This is the Spotify model!

I've run into exactly this issue at a past job: multiple webapps run by different teams that were "tabs", with a central auth system. I turned up, saw this and couldn't believe my eyes. Basically because teams couldn't agree on working together, they made this horrid attempt at a service-based architecture and the result was chaos. I know monoliths are not popular, but they have their place in early stage companies where moving fast vs. having a trendy design is critical. Company went out of business a few years later...

Talk about shipping the org chart.

Sounds eerily familiar to something we had at the investment bank I worked at. I'm guessing this is in finance and these apps are from different departments/desks?

Teams not collaborating is the giveaway.


Sounds like many resumes were polished at that company.

Sounds like how they do it at Spotify

We have absolutely no idea how to write code. I always wonder if it's like this for other branches of engineering too? I wonder if engineers who designed my elevator or airplane had "ok it's very surprising that it's working, let's not touch this" moments. Or chemical engineers synthesize medicines in way nobody but a rockstar guru understands but everyone changes all the time. I wonder if my cellphone is made by machines designed in early 1990s because nobody was able to figure out what that one cog is doing.

Software is a mess. I've seen some freakishly smart people capable of solving very hard problems writing code that literally changes the world at this very moment. But the code itself is, well, a castle of shit. Why? Is it because our tools (programming languages, compilers etc) are still stone age technology? Is it because software is inherently a harder problem than say machines or chemical processes for the human brain? Is it because software engineers are less educated than other engineers? .....?

It's like that in other fields of engineering too, when they are making something they haven't built before. That's the essential part: for example, a lot of construction is really just rebuilding the same thing that's already been built 100,000 times in the past 100 years.

When they attempt to build something new, it often ends up like software – tremendous overruns in both cost and schedule.

C.f. the only new nuclear power plant being built in the West, which is $4 billion over budget and ten years late: https://en.wikipedia.org/wiki/Olkiluoto_Nuclear_Power_Plant#...

The reason why this feels more commonplace in software is that we're usually designing something new. Software has essentially no reproduction costs, so there's no reason for anybody to design software that's a carbon copy of something you can already download or buy off-the-shelf. That's not the case in engineering of physical products or works. New buildings are needed all the time, even if they're performing exactly the same function as the building in the adjacent lot.

> It's like that in other fields of engineering too, when they are making something they haven't built before.

Exactly this. I like to use an analogy of building bridges :)

  - OK, we have this valley and we want to drive cars over it. What do we do?
  - Hmm, we could just make the road along the bottom of the valley
  - Wouldn't that cause the cars to fall because of the steep angle?
  - Good point. Maybe if we built the road with a lot of twist and turns
  - Then it's a slow and long drive.
  - Maybe we could build some sort of catapult setup to throw the cars
    across the valley
  - That would make safety a concern though. Is there a way we could use
    helicopter rotors to suspend a road in mid air over the valley?
  - Or what if we attach the road to each side of the valley and make a 
    road that's strong enough to not crack in the middle under its own weight?
  - Yeah, that sounds like a good idea. How do we make sure it can hold its
    own weight, though?
  - .....etc

> Maybe we could build some sort of catapult setup to throw the cars across the valley

This was played for laughs by German satire news website "Der Postillon" (like The Onion, but in German): "Department of Traffic to replace ramshackle bridges by jumping hills"

Nicely photoshopped picture at: https://www.der-postillon.com/2014/11/lander-ersetzen-marode...

Very interesting point.

I must admit that when I saw valley in

> OK, we have this valley

I immediately thought about SV, and then as I read

> Maybe we could build some sort of catapult setup to throw the cars across the valley

I imagined some kind of scenario happening in an alternate reality where a startup was trying to dream up a viable way to actually achieve this in SV.

It was funny because of the fact that a lot of the ideas people do come up with are probably ideologically very similar in magnitude and ridiculous impossibility.

It's not the only one being built in the west, EDF is having tremendous difficulty delivering the other ones. Flamanville ( https://en.wikipedia.org/wiki/Flamanville_Nuclear_Power_Plan...) is also very late, and given that I'm not quite sure why the UK ordered one too (https://en.wikipedia.org/wiki/Hinkley_Point_C_nuclear_power_...).

Or you find that the ground can't support the bridge and it collapses. If you're lucky you find out before that happens and redesign the bridge.

Most software, including on this thread, is not truly new, or at least can be built from known, reliable tools and patterns.

It's because software that solves the problem the business intends to solve, and can be maintained without unreasonable time and effort, is good enough. Code readability and maintainability problems are not business problems unless they impact developer productivity to the extent that feature development becomes too slow to tolerate.

After all, if work inside the codebase does not make a difference to the people who use the software, in terms of reliability or features, is it really worth doing?

The trick is to find a balance where efforts to improve code quality actually improve outcomes for the business and the users -- otherwise it's not justifiable to take time away from feature development, which is what the users actually care about.

not all software is solving a "business problem". sometimes it's academic, or hobby, or government. but always the code is shit.

I think it's because software is inherently easier and less critical...castles of shit actually work. Software is fairly unique in that level of quality and reliability needed to have significant economic benefit is very low, and the consequences of system failure are rarely severe. That's not the case in most branches of engineering. The complexity levels are also off the charts, in large part because economic value is so loosely coupled with quality. In many cases more turds are more profitable.

The difference is, an elevator is a precisely defined problem and never needs to have its functionality improved and changed.

In fact, you could build the exact elevator almost identically. Software is never like that — it’s dynamic and constantly changing throughout the life cycle of the software.

> an elevator is a precisely defined problem

Ha! As a mechanical engineer, no, nothing in reality is ever precisely defined. Consider that every single part in the elevator has a tolerance: none of the parts are exactly the same in every elevator. Did you account for thermal expansion? What about wear? Fatigue?

> Software is never like that — it’s dynamic and constantly changing throughout the life cycle of the software.

But elevators need to be shipped right the first time and can't make mistakes; with software, it almost never works perfectly no matter how much time passes.

> with software, it almost never works perfectly no matter how much time passes.

There is a person inside me -- whenever this is said -- that wants to shout "No! You can prove correctness of your program!". But this complicates the issue even more, since afaik no elevator's correctness is proven but it just works. Mechanical stuff somehow just magically works without proof whereas it's still debatable if proven software works (did we prove the software that proved it?). I don't know what the difference is. Maybe an elevator has a well-defined structure: it goes up and down, opens the door etc, even though the implementation can change (as you explained). But maybe software isn't like that. Idk.

> Mechanical stuff somehow just magically works without proof whereas it's still debatable if proven software works (did we prove the software that proved it?).

It might seem like magic, but, of course, there is actually a science to it. Depending on the situation, a part can work if a certain length is 10.000 or if it's 10.001. In software, that is never the case: a value that should be 10 but is actually 10.001 can stop everything. In engineering, there is a limited amount of leeway at every step of the process, and everything is slightly overengineered by some factor of safety to ensure that this is the case. For example, if an elevator cable is rated to hold 10,000lbs, the elevator will be sold as having a maximum capacity of 8,000lbs. In correct usage (weight less than 8,000lbs), there is a sufficient factor of safety on the cable so that it won't break even if the cable is cracked, worn, etc.

I mean, we have that kind of thing in software too, when it comes to e.g. memory usage or runtime (in a sense, anywhere we start hitting physical reality, really).

What should we make these buffer sizes? There's not really a right answer, but there are definitely some wrong ones. Make it big enough to handle the expected use cases and pad a bit extra. Works exactly like a tolerance in physical engineering.
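
To make the parallel concrete, here is a small sketch. The 1.25 factor falls out of the cable example above (10,000 / 8,000); the 4096-byte expected size and 1.5x headroom for the buffer are numbers I made up, and the function names are just illustrative.

```javascript
// Physical side: advertise less than the part can actually take.
function ratedCapacity(breakingLoad, safetyFactor) {
  return breakingLoad / safetyFactor;
}

// Software side: size the buffer for the expected case, then pad it.
function paddedBufferSize(expectedBytes, headroom) {
  return Math.ceil(expectedBytes * headroom);
}

console.log(ratedCapacity(10000, 1.25)); // 8000 (lbs, as in the cable example)
console.log(paddedBufferSize(4096, 1.5)); // 6144 (bytes, assumed numbers)
```

Same idea in both directions: pick the expected load, then leave margin so being slightly wrong about it doesn't break anything.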

Mechanical things allow a surprising amount of modularity, compared to software, and therefore make it easy to get test coverage.

E.g. your cable is rated for 10,000 kg and 6-month inspections, your driveshaft can go 5 million revolutions or 9 months before it needs to be greased...test all those things separately, put them together and you have a stateless system that if it works today will work tomorrow.

Turns out physical reality is robust in ways code written by mere mortals can only dream of.

Physical reality is linearity and bell curves. And when it's not, you can usually get away with pretending it is.

Software is... not.

Physical systems work perfectly, until they don't. Roofs and bridges collapse sometimes, even though you'd think we would have figured them out by now. Rockets explode because materials behave slightly differently than the engineer expected.

Building things is really hard. I don't think software engineering is in any way special. It's just that the problems we solve with software are often more complicated than the problems we solve in meatspace and failure is much more benign so that QA is not done with as much diligence.

Indeed. Writing software is akin to drawing the design of the elevator, not building an individual copy of the elevator. Each elevator is equivalent to one instance of the software deployed to run in one place. The same design/code is used to build/create multiple instances. So when we write new software to solve a new problem, we are designing something new and unique - not another blueprint for the same elevator.

> I always wonder if it's like this for other branches of engineering too?

EE here. It's not. The entry bar is much higher: most of the time difficult projects are given to senior engineers. It takes a degree and many years of work to get there.

> Or chemical engineers synthesize medicines in way nobody but a rockstar guru understands

Been there. Not at all.

> cellphone is made by machines designed in early 1990s because nobody was able to figure out what that one cog is doing

Same. Part of the reason is that any physical component adds cost, weight, and so on to a project.

In software you can throw code and random libraries at a problem and nobody will notice until vulnerabilities start to pop later on.

We gotta finish these bridge foundations by this Friday, guys. The sprint ends next week and it’s a short week with Thanksgiving coming up.

I think a lot of programming could be described as "formalizing the logic of a business" which is a very strange and interesting problem.

Programming also involves things like "formalizing and automating the representation of knowledge in general" which is a holy grail of philosophy since Leibniz's time.

We're always building on top of preexisting ontologies and logics which sometimes fail to even make sense, or which make it tedious to express things that we would want to consider elementary (Unix, TCP, Java, GTK+, SQL, etc).

And we're always vulnerable to being smacked on the head by yet another "Falsehoods Programmers Believe About X" post detailing the myriad ways in which we routinely falsify and oversimplify to deal with the boundless complexity of actual reality and the horrendous details of legacy bureaucracy.

Brilliance is not an indication of an ability to write clean code.

In my experience, code needs to be re-factored 3 times before it starts to look coherent and clean.

But 99% of projects can't justify that kind of expense.

But good code is very feasible, just takes time.

I'll expand this further by saying yes to all of the above with the addition that to get it perfect in my experience requires about 7 iterations.

Yup. Diminishing returns on that though. I'm happy with 3. Unless it's API or public code used by a wide audience then it needs a lot of eyes and has to be very clean.

The best code I've written is when rewriting some code from scratch having a clear picture of the solution, or when I solve the same problem a couple of times.

Solving complex stuff takes a lot of effort, and usually most companies do not have the resources or the motivation to go back and rewrite a product.

Not exactly relevant to the overall discussion but I've been dreaming for a long time about programming an elevator such that it arrives faster and skips other floors if a user frantically pushes the call button repeatedly. Same for traffic lights with push buttons.

Maybe it's a good thing that I'm not an engineer.

There are apocryphal stories about elevators that let you push/hold buttons to cancel floor selections (eg, https://www.youtube.com/watch?v=eQSdKe5kArA) or skip floors. I often forget to try them, but on the few occasions I've remembered nothing has happened.

All elevators accept a standardized fire/security key that enables "exclusive" mode that lets you tell the lift "go to this floor" and it will do so: https://www.youtube.com/watch?v=1Uh_N1O3E4E (a reasonable sink of 1hr)

As for traffic lights I'm aware (at least in Australia) that they all tie back to a realtime system that lets a control room instruct lights to turn green and so forth. (Presumably the immediate action is that the lights that are currently green immediately turn orange, then red after the normal delay.) You can get summarily fired for misusing this though. I've also heard (IIRC on a TV documentary-type show) that ambulances tie into this system with live GPS tracking such that lights turn green as they approach, but this is sufficiently fantastic that I'm waiting to trust-but-verify it before I quote it with confidence.

Under normal use some traffic lights are traffic based while others have timers. The ones that are traffic based will take into consideration whether the pedestrian crossing button has been pushed and reduce the threshold value needed for the actual traffic lights to turn red. The other timer ones - like the really annoying ones outside my local mall :) - are on a fixed timer, and I try to run for that one when I know it's about to turn green, or just accept and twiddle my fingers while I wait if I miss it.

It'd be nice if elevator simulators were satisfying to write. Although, actually... I've just realized that with Unity you could have _quite_ a lot of fun :D... (hmm, I've had GPU on my wishlist for about 15 years now, now I really want one)

There is one fundamental difference between software engineering and any other type of engineering. It's that the feedback of the rules that govern the outcome is instant and received by multiple human senses.

If you hit something with a hammer the result and response is instant. You can feel the nail went deeper before you look, you can hear you hit the head correctly before you think about it. You can feel the vibration and hear and feel and see the wood crack before you could read a sentence about it.

You don't have to compile the hammer and run the hit and read the result from the screen and if you logged the right things you will see something about the result based on what you logged but not quite everything because that would be an unintelligible mess. If you compiled the right version of hammer that is.

You don't even know for sure if you are holding a hammer. Of course that's not called a hammer but a unique tool you downloaded because they said it is the new best tool of the year - and there are several famous new tools every year. Though you cannot be sure if that tool helps in your job until you try to use it as you don't know if you are using a hammer or an excavation machine and both can be suitable for the task one being slightly bigger though.

I don't think it's any of the reasons that you listed. To a large extent, I blame management - they push sub-optimal solutions because it's generally the fastest way to resolve the immediate problem. This generally results in poor solutions being piled up on top of each other that become progressively more impossible to maintain.

This despite the fact that if additional time were allocated to come up with a better solution, tomorrow's software would be easier to integrate on top of today's, and with a higher quality of code.

But the reason that managers behave like this is because they can - and this I think has more to do with the ethereal and timeless nature of software than anything else.

In the real world it simply is not possible to put together a physical artifact with the sort of compromised constructs that appear in software.

There are costs of material to consider; costs of production. There's degradation over time.

All of these aspects of physical constructs provide a strong motivation to produce a quality product - not out of beneficence, but because (quality engineers aside) it's cheaper and it's really the only feasible option.

>I wonder if engineers who designed my elevator or airplane had "ok it's very surprising that it's working, let's not touch this" moments.

I honestly sometimes do wonder. Judging by the elevators in my building some things even barely work at times. They go down often and require parts replacements that take weeks to arrive. They're barely 3 years old.

> They go down often

Isn’t that one of the two main features of the product?

"to go down" means "to be in a non-functional state" as in "Facebook went down". Compare to "to go up" like "servers are up" i.e. servers are functioning.

I think maybe you didn't get the joke

Your 2nd to last one is right I think. Other forms of engineering are constrained by physical world rules. In software we have to recreate reality and redefine and model how the real world system works. Mech eng / chem eng etc don’t have to recreate reality, physics just works and will always work.

We have plenty of ideas on how to write code well. The catch is that they all make development slower and more expensive (at least up front, maybe the technical debt will come back to bite, maybe not).

The larger a code base grows, the worse it will get. Once in a while you get a problem that can only be solved by a "hack". And once in a while you need to make performance optimizations. Then time goes by, things are forgotten, the language changes, the OS gets replaced, the machine gets replaced, etc.

Chemical synthesis of medicine is highly scrutinized and fairly well understood, but there are a lot of other products that large chemical companies make in great quantity that are basically made using Black Magic. Tinker with it until it works, then don't mess with it!

One reason is that there's still a divide in understanding of the requirements to actually solve a problem between management and the engineering team. Unrealistic deadlines are given, and as a result, shit is produced.

At one stage a company I worked for was considering licensing the code for a school time-tabling application, rather than paying the company to do the (fairly minor) changes we required to meet (non-USA) state requirements. The company was started by a couple of teachers, the same people who wrote their product. It was tens of thousands of lines of Pascal code, but with not a single variable name or function name that made any sense; everything was A, AA, AA1, X, XX, XX2 etc. I spent a few days looking at it, then recommended we keep paying the somewhat steep cost for the modifications. Then at least if anything broke it was on them to fix it.

Incidentally we had a small falling out with this company, and they were refusing to update their executable until this issue was resolved. This looked likely to affect some hundreds of schools and their timetables. I did some checking, and it turned out their 'non-updated' executable was doing a simple date check on the local PC; if it was past a certain date, the executable refused to run. So I did a quick hack in our application that involved:

  - setting the local PC date to prior to the 'cutoff date'
  - running their executable with the required parameters, and grabbing the results
  - setting the local PC date back correctly

This led to interesting negotiations as they were puzzled why their 'gun to our heads' no longer appeared to be working, and things were resolved to the benefit of both parties soon after.
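
For the curious, the shape of that date hack can be sketched roughly as below. This is not the original code (the comment doesn't show it); `clock`, `runVendorTool` and the dates are hypothetical stand-ins, injected so the sketch runs without touching a real system clock.

```javascript
const ONE_DAY_MS = 24 * 60 * 60 * 1000;

// `clock` and `runVendorTool` are hypothetical hooks, not a real API.
function runWithClockBefore(cutoff, clock, runVendorTool) {
  const realNow = clock.now();
  // Step 1: move the date to just before the cutoff.
  clock.set(new Date(cutoff.getTime() - ONE_DAY_MS));
  try {
    // Step 2: run the tool and grab its results.
    return runVendorTool();
  } finally {
    // Step 3: always put the date back, even if the tool throws.
    // (This ignores time spent inside the tool; fine for a date-level check.)
    clock.set(realNow);
  }
}

// In-memory stand-in for the PC clock, for demonstration only.
function makeFakeClock(start) {
  let current = start;
  return { now: () => current, set: (d) => { current = d; } };
}

const cutoff = new Date("2000-06-01T00:00:00Z"); // made-up cutoff date
const clock = makeFakeClock(new Date("2000-06-15T12:00:00Z"));
const result = runWithClockBefore(cutoff, clock, () =>
  clock.now() < cutoff ? "timetable generated" : "refused to run"
);
console.log(result, clock.now().toISOString());
```

The `try`/`finally` is the important part: if the date isn't restored after a failure, every other program on the machine inherits the wrong date.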

Code obfuscation + blackmail negotiations ... awesome company to do business with.

I took over a Perl project where every SQL call was an exec to a java program which would make the query.

The largest madness was a J2EE mess where persistence was achieved by taking your current object, sending a message bean to the server, it would touch the database and return the result which was being polled for (making it synchronous). The amazing thing is that the client and the server were the same J2EE instance. So Class A could have just called Class B. Instead it was A -> turn it into a message bean -> send it to the "server" (same machine) -> unwrap it into A again -> transform it into B -> message bean it back to client -> unwrap into B.

Literally three months of 8 people ripping all of that out and replacing it with A.transform() // returns B
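
A rough JavaScript stand-in for the before and after, assuming JSON strings in place of the J2EE message beans. The class bodies and the uppercase "transformation" are invented; only the shape of the round trip comes from the description above.

```javascript
class B {
  constructor(payload) {
    this.payload = payload;
  }
}

class A {
  constructor(payload) {
    this.payload = payload;
  }

  // The direct call that replaced everything else.
  transform() {
    return new B(this.payload.toUpperCase()); // placeholder transformation
  }
}

// The old shape: wrap A in a message, "send" it to the same process,
// unwrap it, transform it, wrap B, send it back, unwrap B.
function transformViaMessages(a) {
  const request = JSON.stringify({ type: "A", payload: a.payload });
  const received = JSON.parse(request);
  const b = new A(received.payload).transform();
  const reply = JSON.stringify({ type: "B", payload: b.payload });
  return new B(JSON.parse(reply).payload);
}

const direct = new A("order-42").transform();
const roundTrip = transformViaMessages(new A("order-42"));
console.log(direct.payload === roundTrip.payload);
```

Both paths produce the same B; the message version just adds two serialization steps and, in the original, a polling loop on top.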

Oh, and at the same time, none of this was in source control. It was being worked on by 12 people up until that point. They didn't know who had the current files which were in production. So my first job was taking everyone's files and seeing which ones would compile into the classes being used in production, then checking those into source control. Up until then, they just kept mental track of who is working on which files.

Reading this gave me a thought, which is that I would like to see a series of artistic renderings of physical analogies for such sorts of Rube-Goldberg-Machine-esque software monstrosities. This one sounds like it'd be an assembly line where in each step of the process, the piece-in-progress would be packaged in a shipping box and sent through the mail C/O the person manning the next station 15 feet away.

Was there a reason they did not use DBI to call the code?

I had been told he bragged about it being "job security" since nobody could understand what he had written. I never spoke to him directly as I was his replacement...

The java mess was a pattern copied from within the company where it was used correctly. The main product of the company was built around some very complex scheduling software. It was a black box and communication was entirely through inter-server beans. This was copied to intra-server, which makes no sense at all, and it was used everywhere.

"job security" coupled with "I was his replacement" says that his plan worked about as well as his code...

Indeed. It was quite the entry to the company for me as well. "What are you working on?" "Fixing Dave's code." "Good luck."

But my most auspicious entry to a company was where the guy I was replacing had been arrested for stealing data from a competitor. My first two days were spent recovering logs, downloads and other traces of what he had done and handing it over to the lawyers. (the circumstances were not mentioned in the hiring process)

"I took over a Perl project where every SQL call was an exec to a java program which would make the query."


"Oh, and at the same time, none of this was in source control."

Owwww, that's painful! 12 people, spaghetti Perl that did its DB lookups via Java, and none of it in source control?

Owwww, owwww, owwww....

IMHO sometimes it's fine to use messaging within an application that runs on 1 device, purely for decoupling purposes.

My first job was sysadmin of a third tier ISP back in the dialup days. The account management and provisioning system that ran EVERYTHING was probably close to 100k lines of csh. Everything was done via a UI that the shell generated as a sort of curses-style interface.

What was horrible about it was that it controlled everything from who got a website, active domains, what POPs users could dial into, metered billing, you name it. And it did all of this by manipulating flat files of pipe-delimited data on a central server, then rcp’ing those files to the various machines, then rsh’ing to the various machines and kicking off THEIR scripts, which parsed the source files and generated their own files, which called another set of scripts that parsed THOSE files and generated the software config files.

This included doing things like updating init scripts so that new IPs got added to interfaces, and what email server a user was provisioned on, so it had to generate new exim configs with routing rules.
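
To give a flavour of one stage in that pipeline, here is a toy sketch of "parse the pipe-delimited file, emit a config". The field layout (`user|domain|mailHost`) and the output line format are invented for illustration; this is not real exim syntax and not the original csh.

```javascript
// Hypothetical field layout for one record per line.
const FIELDS = ["user", "domain", "mailHost"];

// Turn the flat file into objects, skipping blank lines and comments.
function parseFlatFile(text) {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "" && !line.startsWith("#"))
    .map((line) => {
      const values = line.split("|");
      return Object.fromEntries(
        FIELDS.map((name, i) => [name, values[i] ?? ""])
      );
    });
}

// Generate one made-up routing line per user.
function toRoutingLines(records) {
  return records.map((r) => `${r.user}@${r.domain}: ${r.mailHost}`);
}

const sample = "# user|domain|mailHost\nalice|example.net|mail1\n\nbob|example.org|mail2\n";
console.log(toRoutingLines(parseFlatFile(sample)).join("\n"));
```

Now imagine several layers of this, each stage's output being the next stage's input on another machine, and you can see why touching any field was scary.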

All this to say that it all worked, but I dreaded having to go in to manipulate anything. Adding a server at least had a dedicated procedure so that was fine, but anything else was a nightmare.

Case in point - as part of a gradual plan to remove this nightmare, I swapped out the radius server that they were using for one that could support a database backend, and modified the local config generator script to make a new config for the new software as a stopgap until I could get it into a database.

The config file had a series of fields that just had numbers in them, and after much digging, it seemed like that controlled whether a terminal dial-in user was presented with a menu of options, and what options. I had to reimplement that logic for the new software, made a mistake, and accidentally removed the option for UUCP for the 10 customers that were still using UUCP. One of them was on an ISDN line and their mailer decided to continuously redial looking for the UUCP, racking up thousands of dollars in carrier rate charges for the weekend that it took anyone to notice something was broken.

I had a similar job to this when I was a grad.

I got given an IDE that was written in Korn shell to maintain. Not as mission critical as this sounds, but it was the only way to edit, compile, link and deploy around 6000 COBOL programs that made up a very large and expensive financial services platform. It also integrated with the SCM (unix RCS!), did checkout, checkin, merging, branching and all manner of amazing things.

There were probably 30 devs who used it, all running on an HP-UX server.

It was very powerful, but a total nightmare to look after.

Wow this is legendary. I’d love to direct a short film or TV series that revolves around an IT/software team using a massive csh codebase like this. I’d love to generate some training montage / diagram sequence shots of the system being built by the characters, maybe make some cool Blender / Adobe Premiere overlay screen splits of the high-level architecture as the team references certain aspects of the system.

Make it like the IT crowd, but for a software team. That would be golden.

I'd take something less fictionally dramatic and more along the lines of reality TV (ala home makeovers/Kitchen Nightmares/Bar Rescue): a team of crack engineers untangling the mess and laying out the best practices for future development. The concept is even ripe for booze sponsorship.

This is incredible. Terrifying, but incredible.

The project was around 10 million lines of code. Some of it was written in a very specific version of Fortran that was a PITA to compile. One fun experience was opening a file and seeing it was created on my birthday. Not in the sense of just matching the day and month of my birthday. The code was literally written on the day I was born.

I recall rewriting a several-thousand-line program written in a preprocessing language that compiled to Fortran. It was home-rolled by some random guy who was so ahead of the game at the time that he basically wrote an extension of Fortran out of all the macros he had made, rolled into one thing.

The rewrite was 100 lines long.

> The code was literally written on the day I was born.

That made me LOL.

Ten years ago I was called in to remediate a new web application which had been subcontracted to an Indian development company. The PHP developers who'd put it together evidently didn't know about classes, and each page in the application was hundreds, sometimes thousands, of lines of spaghetti code, most containing the same duplicated (but subtly changed) blocks providing database connectivity etc. Security had not been a concern either; passwords and credit card details were stored unencrypted in database text fields.

I was called in because, while most of the application worked, some of the requested features were not yet complete. When I made my initial recommendation (scrap the whole thing and start again) I was told the client's board would not agree to that because of the money already invested and the fact that the board had seen a demonstration proving that "most of it worked".

It took two developers eighteen months to beat this sorry mess into a maintainable state while ensuring it remained "usable". It would have taken one third of that time to rewrite it from scratch.

> most containing the same duplicated (but subtly changed) blocks

I had a similar experience with an offshore company. This was during the early-mid 2000s, at the height of the offshoring era when $10/hour programmers in India were aplenty and everyone felt their job was soon to be outsourced. Turns out, no joke, they were being paid per line of code.

Despite having to maintain the heap of crap, I was amazed at the brilliance of their maniacal dark-art ability to implement as little functionality with as many lines of code as possible. It was like code golf in reverse.

What sorts of techniques did they use? I'm very curious to see/hear examples.

    function isTrue(v) {
      var result;
      result = v;
      if (!isFalse(result == true)) {
        return result;
      } else {
        return isFalse(result);
      }
    }

    function isFalse(v) {
      var result;
      result = v;
      if (!isTrue(result == true)) {
        return isFalse(result); // Tail recursion so this is fine.
      } else {
        return result;
      }
    }

    if (isTrue(myBool) == true) { // TODO: Could this be refactored to isTrue(isTrue(myBool))?
      return true;
    } else if (isTrue(isFalse(myBool))) {
      return false;
    } else if (isFalse(isTrue(myBool))) {
      return false;
    } else {
      return isFalse(isFalse(myBool));
    }

i hope to god that this is just code you made up to be facetious.

Haha, yes I made it up. The infinite recursion should have been a dead giveaway :P

There's also a bug on line 7 that I didn't notice - it should be isTrue rather than isFalse. I can't edit the comment anymore to fix it.

This is literally insane.

I had a very very similar experience. I saw the craziest code in my entire career while debugging performance issues. Even though they were using a PHP MVC framework, they were pulling every record from a db table to iterate over to find the record using PHP string compare functions. I still can't believe it. The dev shop I worked for back then was even in the habit of hiring multiple teams for the same project in the hopes one actually completed it and it was still cheaper.

> pulling every record from a db table to iterate over to find the record using PHP string compare functions

As horrible as this sounds, this is actually good from a refactoring point of view because it should be straightforward to rewrite it to use actual queries.

Perhaps technically true, but the entire point of the MVC is you can do simple record retrievals with a single line of code using auto generated active record models. One line vs a monstrous dirty function isn't acceptable.
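For anyone who hasn't run into this pattern, here is a minimal sketch of the two approaches, in Python with the standard-library sqlite3 module rather than PHP. The `users` table, its columns and both function names are made up for illustration; none of it comes from the codebase described above.

```python
import sqlite3

def find_by_scan(conn, email):
    """Anti-pattern: fetch every row, then compare strings in application code."""
    for row in conn.execute("SELECT id, email FROM users"):
        if row[1] == email:
            return row
    return None

def find_by_query(conn, email):
    """Let the database do the filtering: one parameterized query, one row back."""
    return conn.execute(
        "SELECT id, email FROM users WHERE email = ?", (email,)
    ).fetchone()

# Hypothetical in-memory table, only for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [(f"user{i}@example.com",) for i in range(1000)],
)

scan_result = find_by_scan(conn, "user500@example.com")
query_result = find_by_query(conn, "user500@example.com")
```

Both calls return the same row. The difference is that the second one lets the database use an index (if there is one) and only ships a single row to the application.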


10 years ago PHP didn't have classes

PHP 5 was released 14 years ago in July, 2004. One of its new features was OOP. Classes were supported at that time. In fact, a comment from 14 years ago shows an example class [1].

[1]: http://php.net/manual/en/language.oop5.php#46290

Sure it did. Classes in PHP showed up in version 4.0 18 years ago.

Why didn’t you rewrite it from scratch and then just replace the codebase in 1 giant commit? Seems doable if you have the balls.

I used to work on hospital lab software.

* It was over 20 years old by the time I started

* It was written in Fortran

* Variable names were single and double digits

* Each Fortran program would run in isolation but had a shared memory process

* It was formerly a terminal program but a weird Java frontend was created so everything looked like a Windows GUI

* All program names were four letter acronyms

* All data was stored in fixed width binary "flat" files

* It previously was under CVS version control, but each install slowly drifted apart, so each site had its own unique features and bugs.

* I once had to move a significant feature from one install to another using only patch files generated from the work done on the original install.

The worst for me was a rescue project: a site for a US tech sector Public-private partnership. Nothing too complicated: recurring donations, paid events, a small members area. They had sent it to an Indian firm to build in Drupal 7 - not a lightweight system to begin with.

I would like to say "cue the stereotypes for Indian developers" and we could all have a good laugh. But no. This is more like Heart of Darkness. They must have traveled to the darkest corners of the subcontinent to find a mind capable of the eldritch horrors we found there. We started keeping a wiki of design patterns, to save us WTF time. Here are a few.

* Memory management. What's that? This site required 16 GIGABYTES as a PHP memory limit in order to load the front page for an anonymous user.

* Security. What's that? Part of the reason it required so much memory, is that it would include a complete debug log of the site's entire transaction history, including PII and credit card numbers, in every session. Meaning any vulnerability was a critical vulnerability.

* They would arbitrarily include the entirety of Zend framework to access a single simple function. This happened several places in the codebase, with different versions of Zend committed.

* Can't reach the ERP to get the price for an event? Let's set it to $999,999.00 and proceed with the transaction.

* Invoice numbers were random numbers between 1-1000. Clashing numbers meant a fatal exception that would fail to store the invoice or payment record... But not until after payment had been processed. Birthday paradox means this happened a lot.

* The developers used arcane bits of Drupal API to do totally mundane things. Like, if you know about hook_page_alter, you know there's a setting in the UI for the frontpage. But we'll just use hook_page_alter instead.

* Write queries using Drupal Views, rewrite the query in code, override the views query with their custom (identical) version using an unusual API hook, just to add a sort.

I could go on, but I think you get the picture. Eldritch horror of a codebase.
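To put a number on the birthday paradox bit, here is a rough sketch in Python. It assumes the invoice numbers were drawn uniformly and independently from 1000 possible values, which is my reading of "random numbers between 1-1000" and not something I can confirm from the code; the function name is made up.

```python
def collision_probability(n_invoices, n_numbers=1000):
    """Probability that at least two of n_invoices share a number,
    assuming each number is drawn uniformly and independently."""
    if n_invoices > n_numbers:
        return 1.0  # pigeonhole: more invoices than available numbers
    p_all_distinct = 1.0
    for i in range(n_invoices):
        # The (i+1)-th invoice must avoid the i numbers already taken.
        p_all_distinct *= (n_numbers - i) / n_numbers
    return 1.0 - p_all_distinct

for n in (10, 38, 100):
    print(n, round(collision_probability(n), 3))
```

Under those assumptions the odds of at least one clash pass 50% at 38 invoices, and by 100 invoices a clash is all but certain.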

Holy crap. You already earned yourself an upvote for this sentence:

> * Memory management. What's that? This site required 16 GIGABYTES as a PHP memory limit in order to load the front page for an anonymous user.

I'm tempted to log out and create 5 new accounts to upvote for each bullet point :-D

I used to own a C++ application that was a morass of abstractions and indirections so it was impossible to reason about. It took a number of hours to compile.

On one infamous occasion we were making a relatively small patch release. The debug version worked fine but the release version crashed systematically. Even when we backed out all the changes we had the same behaviour. We were screwed.

Until one of the team had a bright idea. She stripped strings from the debug build and tested it. To our surprise it not only worked, it was only slightly bigger than the previous release version and it was also slightly faster! We shipped.

This experience was the trigger to make me go all-in on a full re-write that I had been contemplating. One of only a couple of times in my career that I've made that decision on a major piece of software.

The re-write was a huge success. It was also about 10% of the original in terms of LoC. The day our testing finished, we held a ceremony where we deleted all the old code from the current version.

This caused a slightly different issue. At the time, code metrics were starting to get fashionable but LoC wasn't yet the pariah it became.

So, a couple of days later I got a concerned call from the metrics guys. Apparently, we had deleted more code than all the other teams combined had added in the previous measurement period. This caused their metric calculation to barf. Their solution? We should add all the code back in! This led to a somewhat heated argument that ended up with me persuading them that deleting code was good and they should, at least, abs(LoC) it. It didn't make the metrics any more useful but meant that we had an application we could reason about. Happy days.
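A toy illustration of why a signed LoC total falls over when one team deletes a lot, and what abs(LoC) changes. This is a Python sketch with made-up team names and numbers; I have no idea what the metrics tooling actually computed.

```python
# Hypothetical (lines_added, lines_deleted) per team for one measurement period.
changes = {
    "rewrite_team": (5_000, 50_000),
    "team_a": (12_000, 1_000),
    "team_b": (8_000, 2_000),
}

def net_loc(added, deleted):
    """Signed line delta: a big cleanup shows up as a negative number."""
    return added - deleted

# Signed total: the deletions swamp everyone else's additions.
signed_total = sum(net_loc(a, d) for a, d in changes.values())

# abs(LoC): every team's delta counts as activity, whichever way it points.
abs_total = sum(abs(net_loc(a, d)) for a, d in changes.values())

print(signed_total, abs_total)
```

With these numbers the signed total goes negative while the abs version stays positive, which is about all that can be said in its favour.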

> Apparently, we had deleted more code than all the other teams combined had added in the previous measurement period. This caused their metric calculation to barf. Their solution? We should add all the code back in!

ah yes, when bad metrics become targets

From 10+ years ago when I worked at PayPay, webscr (the frontend at the time, where you'd log in) was a total of 2GB of C++ that would get compiled into a 1GB CGI executable, deployed over some ~700+ web servers.

Debug versions would never get compiled, as I'm told the resulting file was too large for the filesystem to handle.

Apparently a great deal of the code was actually inline XML.

They knew this was a bad pile of technical debt, too: at one point, a senior/staff engineer gave a company presentation where they brought a fanfold printout of the main header file for this monstrosity, and literally unrolled it across the entire stage.

I think you meant to say PayPal?

Oh whoops, thank you, yes!

PayPay is a Japanese payment company

Job security.

I've worked on a CMS that was partially done in .NET, Iron Python and used an XSLT templating system to generate HTML for the front end.

The architecture looked like something from the Early Java days.

The system used Iron Python / C# in the following way.

1. A web request would hit the CMS.

2. There was a massive switch statement to work out how the query would be rewritten.

3. If the URL was prefixed with processor, it would attempt to find the processor in the db.

4. The code would then find the Python script associated with the processor.

5. The processor would then spin up a command line instance in a hidden command window on Windows Server.

6. The processor would have to return XML that had to be built up using strings (no ElementTree for you).

7. This would return the XML to the C#, which would then try to render it through the XSL transform.

If this failed at any point: silent failure. There was no easy way to debug (there was a magic set of flags that had to be set in Visual Studio, otherwise you couldn't debug the Python scripts).

To get the software to build on a new machine, it took a contractor 4 months to reverse engineer an XP machine. None of it was documented anywhere.

It used ImageMagick to generate thumbnails on the fly, which doesn't work too well with Windows Server.

The lead engineer was an alcoholic. He used to go to the local pub for 4 hours in the middle of the day and come back smelling like a brewery.

I actually laughed while reading this. I've encountered something eerily similar, but it was a Ruby app. Almost exactly your steps 1-3, however, what was returned from the db were ruby method names, which were then invoking other pieces of the code through some deep, twisted ruby magic.

Uncovering what the hell was actually happening was like peering into the mind of a psychopath.

Once myself and 3 other engineers spent 4.5 hours trying to figure out how to send an email from this treacherous app (a feature which had stopped working months before we showed up). After those 4.5 hours, none of us even came close.

To top the whole thing off, page load times were in the minutes. The standard joke from the users was that they "get to take lots of coffee breaks."

Much to the (new) manager's credit (and my sanity), we got buy in to just build a new version and let the old one die. I left the team shortly after the 2.0 beta launch, but God I wish I would have stuck around a little longer so I could have seen the official end of life for that calamity of an app.

In a weird way I kinda like these totally mental systems. It really ups your skill level for debugging if nothing else.

I've worked with lots of proprietary CMS systems and I kinda got into a groove with working with them. I knew exactly how to manipulate the system within the parameters of said system.

I kinda found it challenging. When a system is well built, I find it boring.

When you are just above the Ballmer peak.

We had a developer meeting a few weeks before he left the company and he said "I designed the system in the pub" ...

I like my beers but this guy was on another level.

I wanted to tackle a rewrite of a legacy LISP program, used by AutoCAD, to take in a list of 3D coordinate points, and output a DWG of a 2D projection (from which to create a jig on which to assemble a prototype). Cool stuff, right? My intent was to create a "proper" script in AutoCAD to actually create 3D objects from the input data, and let the program do the projection to a 2D drawing. (I'm still not sure if that would have been possible; I never got that far.)

I had never written in LISP before, so I bought a book!

I was having trouble reading the program on the computer, so I printed out the roughly 15,000 lines. Not a lot, I know, but it was about an inch and a half thick stack of paper. I started going through it.

It consisted of LOTS of subroutines. Thousands. Each one neatly formed; no more than a couple hundred lines. It gave me hope.

It read in the text file, created a blob of a string, and then passed that blob to the first subroutine. Then it passed to the next subroutine. And then the next, and so on. As far as I could tell, it never called a subroutine a second time, and it never returned to the starting method. Given the strangeness of LISP, I couldn't figure out what it was doing, or why.

The guy who wrote it had retired, and we didn't get along anyway, so I didn't try to chase down what his thinking was.

I gave up.

To my knowledge, they're still using that program 17 years later.

Bad is not a good measure. But taking 'bad' to mean large, convoluted, zero documentation: I was once at a financial services company that had a 25 million+ LOC mainframe COBOL application that had been under active development since 1969. This was a batch and CICS system. It was spaghetti on every level: database (DB2, VSAM, ISAM), the screens of the app, the batch jobs, the COBOL. It was truly astounding. It was also the source of about $500 million in revenue. They were doing software as a service in the 1970s. It's still going today. Customers in that space don't have many options.

Sounds like a market opportunity. What vertical market is this, more specifically?

The only homeless former software developer I ever met had once been a cobol specialist.

The domain is mutual fund accounting. The system in question had grown organically over decades to encompass everything related to mutual funds: account record keeping, shareholder statements, broker dealer record keeping, commission payments, cost accounting, you name it, they do it.

I agree it is an opportunity, but the barrier to entry is very high. Reaching feature parity is a multi-year project with a large team and domain experts.

Market opportunity? Sounds more like the best way to throw away your sanity. Some things are just beyond repair.

Since when is a greenfield app on a new product addressing a market with only old entrenched players “throwing away your sanity?”

Sorry, I did not realize you meant to create a new software from scratch. I thought you were talking about supporting/extending/refactoring this old pile of madness. Thanks for the clarification!

I suppose you were referring to the statement "Customers in that space don't have many options". Your statement is reasonable and there might be market opportunities to create new software. But I somehow have the feeling that it would take a tremendous amount of time to recreate something that checks the same boxes as the old system. But I do not work in that sector so my judgement might be totally off.

Thing is, a competing product does not (at launch) have to match the entrenched one feature-for-feature. It just has to tick the MOST IMPORTANT boxes while being advantageous in other ways (such as being built on FAR newer tech, more reliable etc.)

Being built on newer tech isn't an advantage from a customer POV. It's only an advantage when the newer tech is better, and tbh I have a hard time believing that you'd match the performance and security of a mainframe that easily.

Being an order of magnitude cheaper would be compelling to these same companies though.

It's only a market opportunity if the software is bad from a customer POV.

There's a lot of engineering that goes into/went into mainframes.

How are the chances to rewrite this castle of shit? Being a developer I would hate working on that pile of garbage but judging from management? Well if it works it works. Being pragmatic ain’t that bad.

With that kind of stuff, clients usually also depend on defects and accidental behavior, and they are not happy touching their own client code that integrates the service with their systems, because it is of the same quality as the service.

Sooo... what space is that?

In my first year of university, I wrote a Java Swing game which had classes for literally every part of the GUI (I was young, carefree and unwilling to yield on the principles of OO I'd been taught earlier in the year). I think it had somewhere in the region of 100 classes, of which only about 3 had the real logic in them.

Now I've put it in perspective, I went on to an intern position between years 2 and 3 of uni. I was handed a lovely piece of code which had:

- Around 300 classes

- 3 or 4 layers of nested generics

- Factories, factory factories, generator factory factories

- 90% of the parameters were passed in from the build engine running the code, so it was impossible to run locally, ever.

- 0 tests

- Some 100 pages of documentation, which had been lost until I was about halfway through my placement (and mostly documented how to set up and run it, not how to maintain it)

Seriously, this thing was designed to the extreme, made to be generic to every single scenario in existence.

So what did it do? It took items from a customer facing system, transferred it onto the internal work tracking system. Then when they were updated in the modern system, mirrored the relevant updates back to the legacy.

The best part? Every time the internal work tracking system updated (once every 6 months), this thing broke horribly and it was practically impossible to fix. Even if you managed to set up stuff so you could work in a development environment, it still connected to the customer facing system, so you had to be incredibly careful what you did during testing.

It wasn't the biggest in terms of LOC, but it astounded me just how much effort went into designing this behemoth (apparently the guy who wrote it squirreled himself away for a year to write it, then moved to Canada, and was famous in the department for having one too many beers during an outing to a local Indian).

I still have the occasional nightmare about it!

I once had to look at a client’s code to determine if/how we’d go about taking over their application. Their only developer threatened to quit and this is when they realised it would be best to outsource this and reduce the bus factor.

It was a huge folder (not repo - and there were zip files of different “versions” of the code in there). The main monster was a huge Visual Studio solution with hundreds of targets, one would be an application for entering some data, the other was for entering data from a hardware device (a scale if I remember right), etc.

The main source of truth was an MSSQL database to which all these apps would connect as root. There is no backend as such to ensure access control & consistency, and any misbehaving app could essentially trash the entire DB.

Database credentials were hardcoded in every app’s main entrypoint, with earlier “versions” of the credentials commented out below.

I thought that surely these must be either staging DBs or at the very least there would be network-level access control meaning the DB wasn’t accessible from outside... but no - I managed to connect to their production DB as root from a random, untrusted location. I do not know if MSSQL uses encryption by default but I would bet good money there was none and they were essentially connecting to their DB as root, over plaintext, from hundreds of different locations across the country without any kind of VPN.

In terms of code you obviously have your standard & expected “spaghetti monster” with UI & business logic scattered everywhere. What struck me the most was an empty exception handler around the main entrypoint.
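For anyone wondering why the empty handler stood out, here is a minimal sketch of the difference. It is in Python rather than C#, and `run_app` is a hypothetical stand-in for whatever the real entry point did; the original code is not reproduced here.

```python
import logging

logger = logging.getLogger("app")

def run_app():
    """Hypothetical stand-in for the application's real entry point."""
    raise RuntimeError("database connection lost")

def main_swallowing():
    """The anti-pattern: any failure disappears without a trace."""
    try:
        run_app()
    except Exception:
        pass
    return 0

def main_logging():
    """Still a catch-all, but the failure is recorded and reported."""
    try:
        run_app()
    except Exception:
        # Logs the message together with the full traceback.
        logger.exception("unhandled error in main entry point")
        return 1
    return 0
```

With the first version the app "succeeds" no matter what happened inside it, so the first sign of trouble is whatever state it left behind in the database.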

In the same folder there was also source for an iOS app. Didn’t look at it but I don’t see any valid reason why this should be in the same place as the Windows apps.

Thankfully I no longer work there, and even if I did, I have no major C# experience (which gives me a very convenient excuse not to touch this mess).


Are you me? :-)

Had almost the same experience, minus the database. Friend of the owner wanted to buy a company and asked us to evaluate their code to see if it was maintainable enough to add new features. I got a zip of hundreds of firmware projects each representing a different version. They were all on the same basic platform but with different hardware features #ifdef'd, or customized for a particular customer. The code itself wasn't that bad (not that good either!), but their developer clearly had no idea what Version Control meant.

In the end I gave the thumbs up and he bought the company, then ended up having to redesign the product from scratch since much of the originally designed-in components were no longer available. He did his Due Diligence for the software, but ignored the hardware side!

> In the same folder there was also source for an iOS app

A true monorepo

I worked for a company that makes sortation devices (conveyors etc.), and I inherited an old product from a previous colleague.

The software was largely a standard base with modifications done for the individual customer to suit their business needs.

In this particular project, it was a batch sortation, meaning we received a large batch file from their mainframe, which would then be parsed and executed by the controller software.

Everybody feared this particular project, and estimates on new functionality were sky high, but I didn't think much of it until I had to modify the batch parsing code. I was met with 22000 lines of C code in a single function. This single function was modified multiple times each year, usually adding more code in the process.

It took me the better part of a month to refactor it into "manageable" chunk sizes, and in the end I was left with a 1700 LOC function that was still "too big", but nobody really understood how it worked, and we couldn't test it, so I just left it at that.

After my refactor we could implement new functionality somewhat faster, but in the end it was still a very complex algorithm, so despite it being in smaller functions, you still had to be very careful when modifying it.

(I am a regular of HN. I don't want this attributed to my acct.)

Previous job. Was sysad. This code runs most of the academic US internet.

Everything was Perl5. There were over 150 different major applications written with it, ranging from 40 lines to 500k lines. The older the most recent commit, the worse. Touching any of this would cause errors, either in itself OR in associated applications! You'd be working on thing A, get it working well, and 4 weeks later thing B would fail horrendously.

The worst was a tie. One contender was a variable holding a 6-line SQL query, packed to the brim with function calls, that ended up expanding to something like 50 lines.

The other was a gem in code that hadn't been touched for 7 years. This wasn't at the top, or even in a config file. It was hardcoded in the middle of the Perl5 program...

     $db_server = (servername);
     $db_user = root;
     $db_password = (password);

Other dishonorable mentions are as follows:

1. No primary keys for the main database....

2. Goober had the idea of storing pictures in said MySQL database. 70 GB of pics...

3. Redhat 4 still in production, along with RH 5.

4. It's everyone for themselves. The goal is you hobble it along enough for the next oncall. Let them get hit with it.

5. Running iron from 10 years ago. Contracts pull in $$$, but you're dealing with paleo-datacenter crap

6. Just retired LTO3 tapes. Now they have "shiny" LTO5....

You know what's terrible? This could almost be any one of about three of the Perl contracts I've had in the last decade (apart from the US part, obvs.)

Yeah, the more I dig around how perl shops work, this seems endemic.

Now, I work in a heterogenous Windows/Linux shop with non-paleo hardware. We're even deploying on AWS in a limited fashion.

This place has its warts too, but everywhere has something. But that previous place... I'm surprised it still hasn't crashed and burned. Their networking was solid tho. Just anything with Linux was a tire-fire.

And as one more gem, there was a program that terrified the fuck out of me. A fellow engineer showed me this update tool that would update a router remotely. Great. Well, if you add the -n (customer) flag with the customer number, it would update all the routers for that customer!

It was spectacular, and terrifying at the same time. I asked them for their testing procedure, and it was a -t (customer-testing). If they forgot the -testing, well.....

> 2. Goober had the idea of storing pictures in said MySQL database. 70 GB of pics...

Is that really a bad idea? The pics would not be any smaller outside the database.

At a former employer someone created a framework to try and create a generic process to process and load data (i.e. ETL). It was written with the best of intentions, but was terrible.

This framework was given to an offshore team, and they were told to use it to farm out hundreds of requests. The framework was inflexible enough that they each started adding snippets here and there to make it work for those projects, with no code review.

When I joined there were well over a hundred different projects, all using this framework at the core, most having little bespoke tweaks to make the code work. Every failure was essentially a new failure in a new way.

It was a useful experience - it's one of those experiences that teaches me via a negative example. This was the worst example of how "roll your own" resulted in incredible technical debt I've seen.

Had something similar happen to me. I gave a new guy a task of 'figure out luigi or airflow to help us process nightly data better than crontab'.

He came back with a custom python/c++ framework where literally every function was called handle(), and in c++ it used type inference to figure out which handle() to call.

Apparently this is what he wrote for his last firm.

We quickly found something else for him to do but he didn't last much longer after that anyways.

My own! For a university assignment we had to make a simple Excel clone as a group.

We divided the work and I ended up working on the formula parser. I spent a week thinking about it and couldn't figure it out (I wanted to work it out from scratch). Eventually I had a flash of insight: I know how to parse simple formulas, so I can use string replacement to recursively rewrite a formula until I can parse it.

By the time I had written all of it, I didn't understand how it worked anymore, but it did work!

FormulaParser ended up being longer than the rest of the codebase combined, and I eventually learned the other groups did it with a regex and ~50 LoC...
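For comparison, a compact formula evaluator does fit in roughly that many lines. This is a sketch in Python using a regex tokenizer plus recursive descent, which is a standard technique and not necessarily what the other groups did. It only handles numbers, `+ - * /`, unary minus and parentheses (no cell references), and all the names are made up.

```python
import re

# A number, or any other single character (operators, parens, whitespace).
TOKEN = re.compile(r"\s*(?:(\d+\.?\d*)|(.))")

def tokenize(text):
    """Split a formula into floats and single-character operator tokens."""
    tokens = []
    for number, op in TOKEN.findall(text):
        if number:
            tokens.append(float(number))
        elif op.strip():  # ignore stray whitespace matches
            tokens.append(op)
    return tokens

def evaluate(text):
    """Evaluate an arithmetic formula with the usual operator precedence."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def expr():  # expr := term (('+' | '-') term)*
        value = term()
        while peek() in ("+", "-"):
            if take() == "+":
                value += term()
            else:
                value -= term()
        return value

    def term():  # term := factor (('*' | '/') factor)*
        value = factor()
        while peek() in ("*", "/"):
            if take() == "*":
                value *= factor()
            else:
                value /= factor()
        return value

    def factor():  # factor := number | '-' factor | '(' expr ')'
        tok = take() if peek() is not None else None
        if isinstance(tok, float):
            return tok
        if tok == "-":
            return -factor()
        if tok == "(":
            value = expr()
            if peek() != ")":
                raise ValueError("expected ')'")
            take()
            return value
        raise ValueError(f"unexpected token: {tok!r}")

    result = expr()
    if peek() is not None:
        raise ValueError(f"unexpected token: {peek()!r}")
    return result
```

For example, `evaluate("1 + 2 * (3 - 1)")` gives `5.0`. Each grammar rule is one small function, which is the part I couldn't see when I was rewriting strings.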

Most large enterprise systems that are more than a few years old tend to be pretty bad, and almost all of them work just fine, as long as you respect the 'dragons be here' signs and you don't attempt to fix that which isn't broken.

Millions of lines of code (or even 10's of millions of lines) are really not all that exceptional. The original programmers are typically in some home or are pushing up the daisies.

I came here about to complain about some projects at my previous companies. But reading this thread makes me feel ashamed for getting angry with them, compared to how much more pain you guys went through.

Thanks for all the sharing. Young devs (like me) should really read and appreciate this thread.

My previous job. More than a million lines of code for a glorified CRUD app, with more than a $7 million annual budget.

Accruing technical debt was a process feature. More bad code that everyone is afraid to touch means more budget for terrified developers and testers, and insane networked database design means more budget for servers and sysops. The fear leads to meetings, the meetings lead to suffering, and suffering leads to the dark side. It still works according to spec, and is human-fixed quickly whenever it doesn't, but the poor quality of the codebase is likely costing at least $2 million per year.

Read that last paragraph in the emperor’s voice from star wars episode 6.

I'm guessing by bad, you mean ugly. I'm fairly convinced there is no "good" code, or that it's such a minuscule subset that you'll likely never encounter it in your career. Every year I look back on last year's stuff I've written, and I can find ten ways to clean things up.

In my opinion, if it works, and provides the utility it was designed to bring, then it doesn't matter. If it makes money, then who really cares!

> In my opinion, if it works, and provides the utility it was designed to bring, then it doesn't matter. If it makes money, then who really cares!

This makes sense. However, if the code 1) is hard to understand (according to the developers available) and change and 2) needs to be changed, it costs money.

Eg, read about the expense banks are now incurring trying to maintain COBOL systems. Whether the code is "bad" is debatable. But the fact is that they have a hard time finding people who can work on it.

Sure, but you're acting under the premise that "if we just did it 'right' the first time, we wouldn't have this mess". What I'm saying is that only under very few circumstances does it ever work out that way. Particularly with long-standing systems and their software. It just builds over years, nothing you can really do about it.

> Sure, but you're acting under the premise that "if we just did it 'right' the first time, we wouldn't have this mess".

I think you're right that every long-lived code base will have warts. And I don't think that means that the original builders were wrong-headed.

But if you've got a decades-old system that nobody understands anymore, you've got a huge liability. You can't ship features to compete, you can't fix bugs, you can't comply with new regulations. You can't even rewrite confidently because you don't know what the old system does.

There must be things you can do as a code base ages to keep it maintainable, allow incremental rewrites, etc.

You may not want to reinvent the wheel, but you have to change the tires.

>I'm guessing by bad, you mean ugly. ... In my opinion, if it works, and provides the utility it was designed to bring, then it doesn't matter. If it makes money, then who really cares!

"Ugly" is subjective; I'd define "bad code" as "difficult to reason about". If you're introducing someone new to the project, how long do they have to stare at it until they can grok it? Good code is code that makes sense, where things are documented and clearly named and encapsulated and have a flow that makes them understandable.

This doesn't necessarily mean it isn't ugly. Often, it's uglier than "clever code" which does something succinctly but less obviously.

> If it makes money, then who really cares!

If it makes money now, you might care about continuing to make money in the future. Being able to reason about and change your product should help with that, but so will more money to throw at it, as this thread shows.

300,000 lines of Python…

The company had bought Ellington, a Django based CMS, but the team basically rewrote the entire thing using multi table inheritance (unintentionally), so everything in the database had two copies, and we had over 70 tables, hundreds of gigabytes, disasters every week and tons of bugs... I discovered this more than a year later and nobody was even aware. Even the DBA wasn’t aware the tables were duplicates.

Not the largest, but the most insane.

At an old employer there were 15,000 lines of batch script across 14 .bat files on a Windows laptop. Old director of IT used it to onboard new customers. It basically copied a DB and turned "CHANGE ME" in some columns to the client's name.

It had it all. 5k lines of date validation, 3k lines of "UI", 400 goto statements, hard-coded passwords, versioning by incrementing the file names (leading to a bunch of code that was never called), and to top it off a static IP granted to the laptop that was used as part of authentication.

Took me two weeks to unravel it and replace it with ~20 lines of Ruby.
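The Ruby isn't shown here, so the following is only a guess at the shape of the job (copy a template database, then swap the "CHANGE ME" placeholder for the client's name), written in Python against SQLite. The function name `onboard_client`, the `targets` mapping, and the use of SQLite are all assumptions for the sake of a runnable sketch.

```python
import sqlite3

PLACEHOLDER = "CHANGE ME"

def onboard_client(template, client_name, targets):
    """Copy the template database and fill in the client's name.

    `targets` maps table name -> list of columns that may hold the
    placeholder. Table and column names are treated as trusted
    configuration here, not as user input.
    """
    new_db = sqlite3.connect(":memory:")
    template.backup(new_db)  # copy the whole template database
    for table, columns in targets.items():
        for column in columns:
            new_db.execute(
                f'UPDATE "{table}" SET "{column}" = ? WHERE "{column}" = ?',
                (client_name, PLACEHOLDER),
            )
    new_db.commit()
    return new_db
```

The template itself is never modified, so there is no need to version it by renaming files.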

Later, all of my complaining on Facebook led an old professor to invite me back to give a talk on the importance of code quality!

More than a million lines with a single file containing more than 100,000 lines of spaghetti code consisting of macros and badly indented C code in a single function! This powered the GUI for an entire generation of low cost phones (pre Android era).

A while ago, I was tasked with maintaining production code written in R by an enthusiastic junior developer. He loved R so much that it blinded his ability to use the right tool for the job.

Instead of Python or Ruby, in which my company had many experienced developers, he wrote web applications in R and eventually handed them over to me. He even persuaded our bosses to invest in RStudio Server and had an instance installed on one of our machines. It wasn't only the choice of programming language that made me furious; it was also the quality of the code. He mixed up snake case and camel cased variables all over the code. In addition, the same name would refer to different things, eg. `abc` and `Abc` and `a_bc` would mean totally different things. And stuff that could be written in a simple Sinatra or Flask application were written in R Shiny.

As a non-R person, I quickly learnt the language, (while mentally cursing it all the way for the bad choices it had made and the terrible inexplicable syntax it had) but getting used to this bad code was quite a challenge. We had several top tier clients whose reports were critical and reliant on this R code and it would frequently, randomly fail while maxing out on memory, no matter how much you threw at it. Debugging was another issue and I struggled with this codebase for 8 months while this junior developer had moved on to other technologies.

Eventually, my main role almost switched to devops, which I hated, because I enjoy writing web applications and good code that doesn't need much maintenance or devops. In the end I realized I couldn't take responsibility for this anymore, as it would cost me my reputation, and I really didn't like the way the company handled the situation either. They were quite supportive of the junior dev, encouraging him to move on to newer technologies while he half-assed everything and dumped it on other people who already had other responsibilities. They did this so that they could show off at meetups "We use the latest tech stack..blah blah" while adding 0 value for clients.

So, I quit the company, along with a dozen others and never looked back. But, I did learn quite a lot..my my.

> In addition, the same name would refer to different things, eg. `abc` and `Abc` and `a_bc` would mean totally different things.

That's not too unusual. In Java `Camel` would usually be a class and `camel` an object, and in Prolog `Camel` is a variable whereas `camel` is an atom. Not sure about R though..?

> And stuff that could be written in a simple Sinatra or Flask application were written in R Shiny.

I don't know R Shiny, but the examples look neat and simple.[0] Are you sure this is not just a case of "I don't like X" rather than the code being bad?

[0]: https://shiny.rstudio.com/articles/basics.html

Well, that nomenclature works well for classes and objects, but if you mix up cases within an object, I think we can all agree it's really bad code?

As for R Shiny, those examples all look fine, but that wasn't the case with my code.

Here's a metaquestion:

Is it possible for any codebase to NOT eventually (given enough time) become a crufty pile of garbage?

I suspect (but have no real evidence... yet) that SOME of this spaghetti garbage is due to the traits of procedural, OOP, mutable languages. But this would then imply that things like functional-language codebases have much longer lifespans... and I don't have evidence for that... but I'm hoping someone can chime in

I think it's not like a natural law. But several things help with it. Changing maintainers and committers for example. When the original ideas are lost, so is the structure of the code.

Also the quality of the test suite makes a huge difference in combination with the willingness of developers to refactor. No testsuite automatically means no refactoring. With a testsuite the question remains whether the organization penalizes or encourages larger code changes (sometimes with old code bases managers demand pinhole-surgery only).
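To make the "no test suite, no refactoring" point concrete, one common way to get a safety net for code nobody understands is a characterization test: record what the code does today and fail if that ever changes. A minimal sketch follows; `legacy_price` is an invented stand-in for the legacy code, and the recorded values are simply what it currently returns.

```python
import unittest

def legacy_price(quantity, customer_type):
    """Stand-in for an old function nobody fully understands anymore."""
    price = quantity * 10
    if customer_type == "gold":
        price = price - price // 10
    if quantity > 100:
        price = price - 50
    return price

class LegacyPriceCharacterization(unittest.TestCase):
    """Pins down what the code does today, not what it 'should' do."""

    # Expected values were recorded by running the current code once.
    RECORDED = {
        (1, "regular"): 10,
        (1, "gold"): 9,
        (101, "regular"): 960,
        (101, "gold"): 859,
    }

    def test_behaviour_is_unchanged(self):
        for args, expected in self.RECORDED.items():
            with self.subTest(args=args):
                self.assertEqual(legacy_price(*args), expected)
```

With something like this in place, a refactoring that accidentally changes behaviour shows up as a failing test instead of a production incident.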

And then there is this funny thing that perfectly acceptable code bases age without any change to their code. Idiomatic C++ code from the '90s is not clean from our perspective, even if the person who wrote it was a dedicated and smart programmer following the best practices they could get hold of.

With functional code bases you might see similar patterns. A Haskell person of today may look at an older Common Lisp code base with the same reservations a Java programmer has about a Visual Basic 3 app.

I just sat down and wondered how a tech stack potentially used by a startup could age. In the future I think we will see many more dependency problems. When you get to maintain a Django/Node.js/Ruby on Rails application, you always take the PyPI/RubyGems servers for granted (or your local mirror on Artifactory). Think about the time in 40 years when the dependencies are not available. Or the small languages we sometimes see and can still find tutorials for: how will it feel to take over a Lua codebase in 40 years? I hope that enough docs stick around, but already when I browse the web for Smalltalk stuff most links are broken, because information actually can disappear from the web.

Or database technology. How will that NoSQL db appear to a maintainer in 30 years?

So while today we are unhappy about having to maintain software that was written without version control, future generations may have it even worse, because they won't even have a monolithic code base in front of them but something with dependencies they can no longer install.

Linux kernel is massive and has been around for decades now and it’s probably millions of lines.

It certainly has a lot of cognitive overhead since you need to always keep in mind the context in which the code you are writing will be running (to reason about concurrency etc.), but it’s relatively easy to understand and well written.

Yeah, I think some FOSS projects qualify; I'd say OpenBSD.

OpenBSD has a very sane culture in which deleting code, or reformatting code without changing the compiled object, are good things.

Most organizations don't see the value in that.

I worked on a roughly million-line Scala codebase a couple of years ago. It started its life in 2007. The code wasn't the greatest, but because all developers throughout its lifetime had been militant about functional programming, refactoring it wasn't too difficult. When everything is referentially transparent, you can fix problems pretty easily.
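A small illustration of why referential transparency makes refactoring cheap, in Python rather than Scala and with invented names (`Invoice`, `gross`, `with_discount`): when functions only depend on their arguments and never mutate them, you can extract, reorder, or inline calls without changing the result.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Invoice:
    net: int          # amount in cents
    tax_percent: int

def gross(invoice: Invoice) -> int:
    """Pure: the result depends only on the argument."""
    return invoice.net + invoice.net * invoice.tax_percent // 100

def with_discount(invoice: Invoice, percent: int) -> Invoice:
    """Returns a new value instead of mutating the old one."""
    return replace(invoice, net=invoice.net - invoice.net * percent // 100)

def total_inline(invoice: Invoice) -> int:
    return gross(with_discount(invoice, 10)) + gross(invoice)

def total_refactored(invoice: Invoice) -> int:
    # Same computation with the sub-expressions named and reordered.
    # This is safe because neither call can change `invoice`.
    full = gross(invoice)
    discounted = gross(with_discount(invoice, 10))
    return discounted + full
```

If `with_discount` mutated its argument instead, the two versions would give different totals, and every such reordering would need a careful read of everything the calls touch.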

It's mostly a function of the effort put into the quality of the code base.

Currently, I work (among other things) on a 25 year old code base that acts as an interactive, terminal-based interface for a CMDB.

It has its problems (like mostly not using the exception mechanism of the programming language it's implemented in; probably wasn't very reliable back then, or maybe didn't exist), but all in all it's OK to work with. Most changes touch only 1 to 3 files, most of which are pretty short (<100 lines of code, typically).

There are "here be dragon" areas, but not too many.

This site/paper isn't actually (entirely) in jest and tries to explore that exact question: http://www.laputan.org/mud/

> I suspect (but have no real evidence... yet) that SOME of this spaghetti garbage is due to the traits of procedural, OOP, mutable languages.

It's comfortable to blame the tools but in my experience it's a people problem, not a tool problem. I've looked back on my own code from years ago and wondered what I was thinking! I'm currently rewriting one of my own projects because it was done in a hurry with the requirements half-specified and my heart was not in it.

But I've seen things from other people, insane twists of logic that can hardly be imagined.

One of the projects I don't work on originates from the 80's. There is tons of actual code from the 80's in this product. It has a Windows GUI. It has a web interface. It started out on Unix at some point but now runs on Windows. It's written in a language that doesn't exist anymore. It costs millions of dollars. OOP vs. Functional is not really the question.


Edit: I actually like the following article much better, found it in the See Also section of the link above. https://en.wikipedia.org/wiki/Software_rot

The worst I've personally seen was a many tens of millions of lines C++ Qt GUI app. Won't mention the name to protect the guilty but it's something some people here might recognize.

It didn't really need to be that big. A lot of its size was a result of pathological over-engineering. Apparently a previous (of course) engineer who built much of the original app thought boost::bind and functional binding was awesome. He also absolutely loved template meta-programming.

The code base was full of templates that contained templates that contained templates that contained... layers upon layers upon layers of templates, generics, bind, and so on. The horror. The compile times were awesome too. Local builds on an 8-core machine took 15-20 minutes.

I once made a commit to that code base where I replaced two entire subdirectories of files and tens of thousands of lines of code with one function. It took me weeks to understand and then finally to realize that none of it was necessary at all. It was all over-engineering. The function contained a case statement that did the job of layers of boost::bind and other cruft.
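The real code was C++ and isn't reproduced here, but the flavour of the change can be sketched in Python with `functools.partial` standing in for boost::bind. Everything below (the handler names, the `ui/` prefix) is invented for illustration.

```python
from functools import partial

# "Before": behaviour assembled by binding callables to callables.
def call_handler(handler, event):
    return handler(event)

def with_prefix(prefix, inner, event):
    return prefix + inner(event)

def describe(kind, event):
    return f"{kind}:{event}"

handlers = {
    "click": partial(call_handler, partial(with_prefix, "ui/", partial(describe, "click"))),
    "key": partial(call_handler, partial(with_prefix, "ui/", partial(describe, "key"))),
}

def dispatch_bound(kind, event):
    return handlers[kind](event)

# "After": one function with a plain branch that does the same job.
def dispatch_plain(kind, event):
    if kind == "click":
        return f"ui/click:{event}"
    elif kind == "key":
        return f"ui/key:{event}"
    raise KeyError(kind)
```

Both versions return the same strings; the second one can be read top to bottom without unwinding three layers of binding first.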

I definitely had a net negative line count at that job. The experience helped to solidify my loathing of unnecessary complexity.

It also made me respect languages like Go that purposely do not give the programmer tools like complex templating systems, dynamic language syntax, etc. It's not that these features have no uses, but honestly their uses are few. I've used Go for a while now and have found maybe two instances in tens of thousands of lines where I missed generics. I imagine I'd miss operator overloading in heavy math code, but that's about it. The problem is that these features are dangerous in the hands of "insufficiently lazy" programmers that love to over-engineer. I'd rather not have them and have to kludge just a little than to deal with code bases like the one I described above ever again.

I never actually saw the code that makes this work (thank Christ) but at the previous place I worked, we did everything in SQL. Believing SQL Agent to be 'too limited', my boss had taken it upon himself to reimplement it... in SQL...

Just let that sink in for a moment.

There were 20+ tables all modelled after how SQL Agent does its own scheduling, 30+ stored procedures for interacting with it, and it was all intended for use with a GUI that was (naturally) never written. A relationship diagram of the tables sat on my personal Wall of Shame board for most of the time I was there. And yes, I had to use it from time to time. It was installed in every database for 200+ clients, in production, UAT, development, you name it.

The most delicious irony is, the only way such a thing could run automatically... was to use a SQL Agent job to poll it once a minute.

And yes, I'm aware you can programmatically create SQL Agent jobs in SQL, so he couldn't even use that excuse. The only thing it had over SQL Agent was that it could be 'event driven'. All that ultimately meant was that the job would only run if a condition was met.
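Stripped of the 20+ tables, "event driven" in that sense comes down to a poller that checks a condition before running each job. A minimal sketch in Python (not SQL), with hypothetical names `Job` and `poll_once`:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Job:
    name: str
    condition: Callable[[], bool]   # the "event": is there anything to do?
    action: Callable[[], None]

def poll_once(jobs):
    """One tick of the poller; returns the names of the jobs that ran."""
    ran = []
    for job in jobs:
        if job.condition():
            job.action()
            ran.append(job.name)
    return ran
```

Whatever calls `poll_once` once a minute is still a plain time-based scheduler, which is exactly the irony described above.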


As many experienced commenters below have already noted, nearly any piece of substantial, revenue-generating, long-lived code will ascend to a place full of dragons and mysticism. The more interesting debates revolve around:

1) How did it get that way, and what can we learn from that?

2) Is it inevitable that all software will end up like that?

3) How can an organization ever successfully sunset or move to a more maintainable system? Or should they even aspire to that?

Code that is easy to change is changed until it's no longer easy to change.

Also: Well-designed components are easy to replace. Eventually, they will be replaced by ones that are not so easy to replace. -- Sustrik's Law

Edit: Added Sustrik's Law

https://www.reuters.com/article/us-usa-banks-cobol/banks-scr... describes banks struggling to maintain ancient COBOL systems as the developers who understand them are dying off.

I don't know exactly what they should have done and when, but it seems like rewrites are going to be necessary, and it sure would be nice to start rewriting a system while you still have someone who can explain what it does.

In the post-Y2K years, I heard a lot about how the next order of business was going to be to replace all those systems in health care, banking, etc., especially as a generation went into retirement. But here it is 2018 and there are still articles like this. Food for thought.
