"A complex system that works is invariably found to have evolved from a simple system that worked. A complex system designed from scratch never works and cannot be patched up to make it work. You have to start over with a working simple system."
– John Gall (1975) Systemantics: How Systems Really Work and How They Fail
It's why I'm always very skeptical of new languages and frameworks. They often look great on a PowerPoint slide, but it's not clear how they'll look on something complex and long-lasting.
They usually pick up warts added for some special case, and that's a sign that there will be infinitely many more.
There's a fine line between "applying experience" and "designing a whole new system around one pet peeve". But it's a crucial distinction.
Most languages are much older than we think. But early adoption is key to getting to the point where people "trust it". D isn't that much younger than C and its variants, and it's older than C#. But it never quite got the adoption needed to push its development to the level of C#.
About all that happened with ANSI C was that prototypes were created (the major addition), type promotion rules were altered, plus 'const' and 'volatile' were added. ANSI also added 'void *'.
I've a vague recall about 'void' existing in unix C compilers before that, having read a version of the above memo in a unix manual ('papers' section) and it mentioning 'void'.
Disrespect is part of progress; respectful humans are liable to be blind to flaws. Just as part of youthful creativity is disregard for what has come before.
It's a double-edged sword: ancestor-worship blocks progress, but throwing the baby out with the bathwater also blocks progress. Real fundamental progress comes from the tiny minority that avoids both.
If all disrespecting is to belittle and look down upon, then fair enough, I agree with you. What I meant, in perhaps an ill-phrased manner, was that overemphasised respect can often lead to stasis, where people might not want to change in case they are seen as disrespectful. Hence my use of disrespect, in that it is a relative judgement, and which can and has been used to discourage creative difference or just difference in general.
> "designing a whole new system around one pet peeve"
BAHHAHAH! So…you mean React. If I hear the word hook as if it alone can solve complexity in web dev one more time I’ll…eh, I’ll do nothing actually. But my point still stands. React solves asynchronous event driven behavior well, but that’s all. Everything else in React projects is, well, everything else.
I have an alternate theory: about 10% of developers can actually start something from scratch because they truly understand how things work (not that they always do it, but they could if needed). Another 40% can get the daily job done by copying and pasting code from local sources, Stack Overflow, GitHub, or an LLM—while kinda knowing what’s going on. That leaves 50% who don’t really know much beyond a few LeetCode puzzles and have no real grasp of what they’re copying and pasting.
Given that distribution, I’d guess that well over 50% of Makefiles are just random chunks of copied and pasted code that kinda work. If they’re lifted from something that already works, job done—next ticket.
I’m not blaming the tools themselves. Makefiles are well-known and not too verbose for smaller projects. They can be a bad choice for a 10,000-file monster—though I’ve seen some cleanly written Makefiles even for huge projects. Personally, it wouldn’t be my first choice. That said, I like Makefiles and have been using them on and off for at least 30 years.
> That leaves 50% who don’t really know much beyond a few LeetCode puzzles and have no real grasp of what they’re copying and pasting.
Small nuance: I think people often don’t know because they don’t have the time to figure it out. There are only so many battles you can fight during a day. For example if I’m a C++ programmer working on a ticket, how many layers of the stack should I know? For example, should I know how the CPU registers are called? And what should an AI researcher working always in Jupyter know? I completely encourage anyone to learn as much about the tools and stack as possible, but there is only so much time.
If you spend 80% of your time (and mental energy) applying the knowledge you already have and 20% learning new things, you will very quickly be able to win more battles per day than someone who spends 1% of their time learning new things.
Specifically for the examples at hand:
- at 20%, you will be able to write a Makefile from scratch within the first day of picking up the manual, rather than two or three weeks if you only invest 1%.
- if you don't know what the CPU registers are, the debugger won't be able to tell you why your C++ program dumped core; when it can, that will typically enable you to resolve the ticket in a few minutes (because most segfaults are stupid problems that are easy to fix when you see what the problem is, though the memorable ones are much hairier). Without knowing how to use the disassembly in the debugger, you're often stuck debugging by printf or even binary search, incrementally tweaking the program until it stops crashing, incurring a dog-slow C++ build after every tweak. As often as not, a fix thus empirically derived will merely conceal the symptom of the bug, so you end up fixing it two or three times, taking several hours each time.
Sometimes the source-level debugger works well enough that you can just print out C++-level variable values, but often it doesn't, especially in release builds. And for performance regression tickets, reading disassembly is even more valuable.
(In C#, managed C++, or Python, the story is of course different. Until the Python interpreter is segfaulting.)
How long does it take to learn enough assembly to use the debugger effectively on C and C++ programs? Tens of hours, I think, not hundreds. At 20% you get there after a few dozen day-long debugging sessions, maybe a month or two. At 1% you may take years.
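To put rough numbers on that, here is a back-of-the-envelope sketch. The 40 hours of study, the 8-hour working day and the 250 working days per year are my own assumptions, picked only to match "tens of hours"; plug in your own.

```python
def working_days_to_learn(hours_needed: float, learning_fraction: float,
                          hours_per_day: float = 8.0) -> float:
    """Working days until `hours_needed` hours of study have accumulated."""
    if not 0 < learning_fraction <= 1:
        raise ValueError("learning_fraction must be in (0, 1]")
    return hours_needed / (hours_per_day * learning_fraction)


# Assumed figures: 40 hours of study, 8-hour days, about 250 working days a year.
for fraction in (0.20, 0.01):
    days = working_days_to_learn(40, fraction)
    print(f"{fraction:.0%} learning: {days:.0f} working days "
          f"(~{days / 250:.1f} years)")
```

Under those assumptions the 20% learner gets there in about five working weeks, and the 1% learner needs roughly two working years.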
What's disturbing is how many programmers never get there. What's wrong with them? I don't understand it.
You make it sound easy, but I think it's hard to know where to invest your learning time. For example, I could put some energy into getting better at shell scripting but realistically I don't write enough of it that it'll stick so for me I don't think it'd be a good use of time.
Perhaps in learning more shell scripting I have a breakthrough and realise I can do lots of things I couldn't before and overnight can do 10% more, but again it's not obvious in advance that this will happen.
I agree. And there's no infallible algorithm. I think there are some good heuristics, though:
- invest more of your time in learning more about the things you are currently finding useful than in things that sound like they could potentially be useful
- invest more of your time in learning skills that have been useful for a long time (C, Make) than in skills of more recent vintage (MobX, Kubernetes), because of the Lindy Effect
- invest more of your time in skills that are broadly applicable (algorithms, software design, Python, JavaScript) rather than narrowly applicable (PARI/GP, Interactive Brokers)
- invest your time in learning to use free software (FreeCAD, Godot, Postgres) rather than proprietary software (SolidWorks, Unity, Oracle), because sooner or later you will lose access to the proprietary stuff.
- be willing to try things that may not turn out to be useful, and give them up if they don't
- spend some time every day thinking about what you've been doing. Pull up a level and put it in perspective
I agree. An additional perspective that I have found useful came from a presentation I saw by one of the pragmatic programmers.
They suggested thinking about investing in skills like financial investments. That is, investments run on a spectrum from low risk, low return to high risk, high return.
Low risk investments will almost always pay out, but the return is usually modest. Their example: C#
High risk investments often fail to return anything, but sometimes will yield large returns. Their example: Learning a foreign language.
Some key ideas I took away:
- Diversify.
- Focus on low risk to stay gainfully employed.
- Put some effort into high risk, but keep expectations safe.
- Your mix may vary based on your appetite for risk.
> invest your time in learning to use free software (FreeCAD, Godot, Postgres) rather than proprietary software (SolidWorks, Unity, Oracle), because sooner or later you will lose access to the proprietary stuff.
I think you have a solid point with Postgres v Oracle and I haven’t followed game dev in a while, but your FreeCAD recommendation is so far from industry standard that I don’t think it’s good advice.
If you need to touch CAD design in a professional setting, learn SolidWorks or OnShape. They’re what every MechE I’ve ever worked with knows and uses, and they integrate product lifecycle aspects that FreeCAD does not.
10 years ago people said the same about KiCAD and Blender :) No guarantee that FreeCAD will be the same - but you can be pretty confident it will never go away.
However, if the goal is to be a full time employee doing CAD, I would of course go with one of the most established tools. Potentially even get some certifications.
One simple approach is that the second, or at least third, time you deal with something, you invest time to learn it decently well. Then each time you come back to it, go a bit deeper.
This algorithm makes you learn the things you'll need quite well without having to understand and/or predict the future.
If it’s a tool you use every day, it’s worth understanding on a deeper level. I’ve used the shell probably every day in my professional career, and knowing how to script has saved me and my team countless hours of tedious effort with super simple one liners.
The other thing that’s worth learning is that if you can find tools that everybody uses regularly, but nobody understands, then try to understand those, you can bring enormous value to your team/org.
That’s an insightful comment, but there is a whole universe of programmers who never have to directly work in C/C++ and are productive in safe languages that usually can’t segfault. Admittedly we are a little jealous of those elite bitcrashers who unlock the unbridled power of the computer with C++… but yeah, a lot of people pay the bills with day jobs in C#, JavaScript, or Python and are considered programmers by the rest of the industry.
Both have strong limits for writing complex code. TypeScript is one attempt at an answer because, bad as JavaScript is for large programs, the web forces it. I prefer a million lines of C++ to 100k lines of Python, but if 5k lines of Python will do, then C++ is way too much overhead. (Rust likely plays better than C++ for large problems from scratch, but most large problems have existing answers and throwing something else in would be hard.)
Many devs never even come into challenging projects like that. For a large part of the dev community it's just simple webapps for most of their professional life.
> I completely encourage anyone to learn as much about the tools and stack as possible, but there is only so much time.
That seems like a weird way to think about this. I mean, sure, there's no time today to learn make to complete your C++ ticket or whatever. But yesterday? Last month? Last job?
Basically, I think this matches the upthread contention perfectly. If you're a working C++ programmer who's failed to learn the Normal Stable of Related Tools (make, bash, python, yada yada) across a ~decade of education and experience, you probably never will. You're in that 50% of developers who can't start stuff from scratch. It's not a problem of time, but of curiosity.
> I mean, sure, there's no time today to learn make to complete your C++ ticket or whatever. But yesterday? Last month? Last job?
That seems like a weird way to think about this. Of course there was no time in the past to learn this stuff, if you still haven't learned it by the present moment. And even if there were, trying to figure out whether there perhaps was some free time in the past is largely pointless, as opposed to trying to schedule things in the future: you can't change the past anyhow, but the future is somewhat more malleable.
To be clear: I'm not suggesting a time machine, and I'm not listing any particular set of skills everyone must have. I'm saying that excusing the lack of core job skills by citing immediate time pressure is a smell. It tells me that that someone probably won't ever learn weird stuff. And in software development, people who don't learn weird stuff end up in that 50% bucket posited upthread.
> I'm saying that excusing the lack of core job skills by citing immediate time pressure is a smell. It tells me that that someone probably won't ever learn weird stuff. And in software development, people who don't learn weird stuff end up in that 50% bucket posited upthread.
Or the whole chain of work culture is bad and people do not have adequate down time or brain juice to pursue these. Additionally, how many do you want to learn? I have dealt with Makefiles, then recently someone decided to introduce taskfile and then someone else wanted to use build.please and someone tried to rewrite a lot of CI pipelines using python because shell scripting is too arcane, while someone decided that CI was super slow and must be hosted on premises using their favorite system (was it Drone or whatever, I forgot). Eventually, things become so numerous and chaotic that your brain learns to copy-paste what works and hope for the best, as the tool you have spent time learning will be replaced in a few months.
A lot of these technologies share a common base that can be pretty small. Once you learn Make and concepts like targets, recipes, and dependencies, it's easier to learn Ansible or GitHub Actions even though they don't solve the same problem. It's the same when learning programming languages and whatever the tool of the week is. But that requires spending a bit of effort to go under the hood and understand the common abstractions instead of memorizing patterns and words.
> Or the whole chain of work culture is bad and people do not have adequate down time or brain juice to pursue these.
And... again, I have to say that that kind of statement is absolutely of a piece with the analysis upthread. Someone who demands a "work culture" that provides "down time" or "brain juice" to learn to write a makefile... just isn't going to learn to write a makefile.
I mean, I didn't learn make during "downtime". I learned it by hacking on stuff for fun. And honed the skills later on after having written some really terrible build integration for my first/second/whatever job: a task I ended up doing because I had already learned make.
It all feeds back. Skills are the components you use to make a career, it doesn't work if you expect to get the skills like compensation.
This is the 40% that OP mentioned. But there's a proportion of people/engineers that are just clueless and are incapable of understanding code. I don't know the proportion so can't comment on the 50% number, but they definitely exist.
If you never worked with them, you should count yourself lucky.
We can’t really call the field engineering if this is the standard. A fundamental understanding of what one’s code actually makes the machine do is necessary to write quality code regardless of how high up the abstraction stack it is
I don't consider that an equal comparison. Obviously an engineer can never be omniscient and know things nobody else knows either. They can, and should, have an understanding of what they work with based on available state of the art, though.
If the steam engine was invented after those discoveries about steel, I would certainly hope it would be factored into the design (and perhaps used to make those early steam engines less prone to exploding).
A lot of material science was developed to make cannons not explode - that then went into making steam engines possible. The early steam engines introduced their own needed study of efficiency-
Yes and they’re far less efficient and require far more maintenance than an equivalent electric or even diesel engine, where equivalent power is even possible
Steam engines currently power most of the world's electrical grid. The main reason for this is that, completely contrary to what you said, they are more efficient and more reliable than diesel engines. (Electric motors of course are not a heat engine at all and so are not comparable.)
Steam engines used to be very inefficient, in part because the underlying thermodynamic principles were not understood, but also because learning to build safe ones (largely a question of metallurgy) took a long time. Does that mean that designing them before those principles were known was "not engineering"? That seems like obvious nonsense to me.
Steam engines are thoroughly obsolete in the developed world where there are natural gas pipeline networks.
People quit building coal burning power plants in North America at the same time they quit building nuclear power plants, and for the same reason. The power density difference between gas turbines and steam turbines is enough that the capital cost difference is huge. It would be hard to afford steam turbines if the heat was free.
Granted people have been building pulverized coal burning power plants in places like China where they'd have to run efficient power plants on super-expensive LNG. They thought in the 1970s it might be cheaper to gasify coal and burn it in a gas turbine but it's one of those technologies that "just doesn't work".
Nuclear isn't going to be affordable unless they can perfect something like
There is some truth in what you say. Though steam engines still power most of the power grid (especially in the "developed world") their capital costs are indeed too high to be economically competitive.
However, there are also some errors.
In 02022 24% of total US electrical power generation capacity was combined-cycle gas turbines (CCGT), https://www.eia.gov/todayinenergy/detail.php?id=54539 which run the exhaust from a gas turbine through a boiler to run a steam turbine, thus increasing the efficiency by 50–60%. So in fact a lot of gas turbines are installed together with a comparable-capacity steam turbine, even today.
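For a rough sense of where a gain of that size comes from, here is an idealized sketch. The 38% gas-turbine efficiency and the 33% efficiency of the steam cycle running on the exhaust heat are illustrative assumptions of mine, not figures from the EIA page, and the model ignores losses in the heat-recovery boiler.

```python
def combined_cycle_efficiency(gas_turbine_eff: float,
                              steam_cycle_eff: float) -> float:
    """Idealized combined cycle: the steam cycle converts a fraction of
    the heat that the gas turbine rejects in its exhaust."""
    rejected_heat = 1.0 - gas_turbine_eff
    return gas_turbine_eff + rejected_heat * steam_cycle_eff


def relative_gain(simple_eff: float, combined_eff: float) -> float:
    """Fractional improvement of the combined cycle over the gas turbine alone."""
    return (combined_eff - simple_eff) / simple_eff


# Assumed, illustrative efficiencies (not measured data).
gt_eff, steam_eff = 0.38, 0.33
cc_eff = combined_cycle_efficiency(gt_eff, steam_eff)
print(f"combined: {cc_eff:.1%}, relative gain: {relative_gain(gt_eff, cc_eff):.0%}")
```

With those assumed inputs the combined plant lands a bit under 60% efficiency, a relative gain in the 50–60% range over the gas turbine alone.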
Syngas is not a technology that "just doesn't work". It's been in wide use for over two centuries, though its use declined precipitously in the 20th century with the advent of those natural-gas pipeline networks. The efficiency of the process has improved by an order of magnitude since the old gasworks you see the ruins of in many industrial cities. As you say, though, that isn't enough to make IGCC plants economically competitive.
The thing that makes steam engines economically uncompetitive today is renewable energy. Specifically, the precipitous drop in the price of solar power plants, especially PV modules, which are down to €0.10 per peak watt except in the US, about 15% of their cost ten years ago. This combines with rapidly dropping prices for batteries and for power electronics to undercut even the capex of thermal power generation rather badly, even (as you say) if the heat was free, whereas typically the fuel is actually about half the cost. I don't really understand what the prospects are for dramatically cheaper steam turbines, but given that the technology is over a century old, it seems likely that its cost will continue to improve only slowly.
as if 1.5 hours of storage was going to cut it. I've been looking for a detailed analysis of what the generation + storage + transmission costs of a reliable renewable grid is that's less than 20 years old covering a whole year and I haven't seen one yet.
To be honest, I don't think anyone has any idea yet (other than crude upper bounds) because it depends a lot on things like how much demand response can help. Demand response doesn't have to mean "rolling blackouts"; it could mean "running the freezer during the day when electricity is free". Will people heat their houses in the winter with sand batteries? Will desiccant air conditioning pan out? Can nickel–iron batteries compete economically with BYD's gigafactories? What about sodium-ion? Nobody has any idea.
I was pleased to calculate recently that the EV transition, if it looks something like replacing each ICE vehicle with the BYD equivalent of a Tesla Model Y, would add several hours of distributed grid-scale storage, if car owners choose to sell it back to the grid. But that's still a far cry from what you need for a calm, cloudy week. Maybe HVDC will be the key, because it's never cloudy across all of China.
Sensible-heat seasonal thermal stores for domestic climate control (in some sense the most critical application) have been demonstrated to be economically feasible at the neighborhood scale. PCM or TCES could be an order of magnitude lower mass, but would the cost be low enough?
We don’t have to assume, because we know. We can calculate and measure the efficiency of gasoline and diesel engines, and electric motors. We know that electric motors are highly efficient, and ICE engines are not.
The problem is that software is much more forgiving than real-life engineering projects. You can't build a skyscraper with duct tape. With software, especially the simple webapps most devs work on, you don't NEED good engineering skills to get it running. It will suck of course, but it will not fall apart immediately. So of course most "engineers" will go the path of least resistance and never leave the higher abstractions to dive deep in concrete fundamentals.
Sure if you are doing embedded programming in C. How does one do this in web development though where there are hundreds of dependencies that get updated monthly and still add functionality and keep their job?
The current state of web development is unfortunately a perfect example of this quality crisis. The tangle of dependencies either directly causes or quickly multiplies the inefficiency and fragility we’ve all come to expect from the web. The solution is unrealistic because it involves design choices which are either not trendy enough or precluded by the platform
Yes, and I should overrule half the business decisions of the company while I am at it. Oh, and I'll push back on "we need the next feature next week" and I'll calmly respond "we need to do excellent engineering practices in this company".
And everybody will clap and will listen to me, and I will get promoted.
...Get real, dude. Your comments come across a bit tone-deaf. I am glad you are in a privileged position but you seem to have fallen for the filter bubble effect and are unaware of how most programmers out there have to work if they want to pay the bills.
I know a lot of people have terrible jobs at profoundly dysfunctional companies. I've had those too. That situation doesn't improve unless you, as they say, have the serenity to accept the things you cannot change, the courage to change the things you can, and the wisdom to know the difference.
Not everyone has a position where they have the autonomy to spend a lot of effort on paying down technical debt, but some people do, and almost every programmer has a little.
I think it's important to keep in view both your personal incentive system (which your boss may be lying to you about) and the interests of the company.
The serenity in question boils down to "I'll never make enough money to live peacefully and be able to take a two-year sabbatical so let's just accept I'll be on the hamster wheel for life and I can never do anything about it".
No. I'll let my body wither and get spent before my spirit breaks. I refuse to just "accept" things. There's always something you can do.
BTW is that not what HN usually preaches? "Change your job to a better one" and all that generic motivational drivel [that's severely disconnected from reality]? Not throwing shade at you here in particular, just being a bit snarky for a minute. :)
RE: your final point, I lost the desire to keep view of both my personal and my company's incentive systems. Most "incentive systems" are basically "fall in line or GTFO".
Before you ask, I am working super hard to change my bubble and get a bit closer to yours. To say it's not easy would be an understatement on the order of comparing a description of being hit by lightning with actually enduring the lightning hit. But as said above, I am never giving up.
But... it's extremely difficult, man. Locality and your own marketing matter a lot, and when you have been focused on technical skills all your life and marketing is as foreign to you as are the musical notes of an alien civilization... it's difficult.
Funny enough I'm an A̶I̶ML researcher and started in HPC (High Performance Computing).
> if I’m a C++ programmer ... should I know how the CPU registers are called?
Probably.
Especially with "low level"[0] languages knowing some basics about CPU operations goes a long way. You can definitely get away without knowing these things but this knowledge will reap rewards. This is true for a lot of system based information too. You should definitely know about things like SIMD, MIMD, etc because if you're writing anything in C/C++ these days it should be because you care a lot about performance. There's a lot of stuff that should be parallelized that isn't. Even stuff that could be trivially parallelized with OpenMP.
> what should an AI researcher working always in Jupyter know?
Depends on what they're researching. But I do wish a lot more knew some OS basics. I see lots of things in papers where they're like "we got 10x" performance on some speed measurement but didn't actually measure it correctly (e.g. you can't use time.time and be accurate because there's lots of asynchronous operations). There's lots of easy pitfalls here that are not at all obvious and will look like they are working correctly. There's things about GPUs that should be known. Things about math and statistics. Things about networking. But this is a broad field so there are of course lots of answers here. I'd at least say anyone working on AI should read at least some text on cognitive science and neuroscience because that's a super common pitfall too.
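A CPU-only sketch of that timing pitfall, using a thread pool as a stand-in for any asynchronous backend (a GPU queue, for instance). The `slow_task` helper and the 0.2 second delay are made up for illustration. The point is only that stopping the clock at submission measures the launch, not the work.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_task(seconds: float) -> float:
    """Hypothetical stand-in for work that runs asynchronously."""
    time.sleep(seconds)
    return seconds


def timed_submit(wait_for_result: bool, seconds: float = 0.2) -> float:
    """Time one task, optionally waiting for it before stopping the clock."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        start = time.perf_counter()  # monotonic, high-resolution clock
        future = pool.submit(slow_task, seconds)
        if wait_for_result:
            future.result()  # block until the work has actually finished
        elapsed = time.perf_counter() - start
    return elapsed


launch_only = timed_submit(wait_for_result=False)
full = timed_submit(wait_for_result=True)
print(f"clock stopped at submit: {launch_only:.4f}s")
print(f"clock stopped at result: {full:.4f}s")
```

The first number looks like a huge speedup and is simply wrong. Real asynchronous frameworks have their own way to wait for completion, which I'm not naming here because it depends on the library.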
I think it is easy to not recognize that information is helpful until after you have that information. So it becomes easy to put off as not important. You are right that it is really difficult to balance everything though but I'm not convinced this is the problem with those in that category of programmers. There's quite a number of people who insist that they "don't need to" learn things or insist certain knowledge isn't useful based on their "success."
IMO the key point is that you should always be improving. Easier said than done, but it should be the goal. At worst, I think we should push back on anyone insisting that we shouldn't be (I do not think you're suggesting this).
[0] Quotes because depends who you talk to. C++ historically was considered a high level language but then what is Python, Lua, etc?
you don't have the time because you spend it bruteforcing solutions by trial and error instead of reading the manual and doing them right the first time
I feel like your working environment might be to blame: maybe your boss is too deadline-driven so that you have no time to learn; or maybe there is too much pressure to fix a certain number of tickets. I encourage you to find a better workplace that doesn't punish people who take the time to learn and improve themselves. This also keeps your skills up to date and is helpful in times of layoffs like right now.
Seriously? Yes, you should read the docs of every API you use and every tool you use.
I mean, it's sort of ok if you read somewhere how to use it and you use it in the same way, but I, for one, always check the docs and more often even the implementation to see what I can expect.
Actually it is trivial to write a very simple Makefile for a 10,000 file project, despite the fact that almost all Makefiles that I have ever seen in open-source projects are ridiculously complicated, far more complicated than a good Makefile would be.
In my opinion, it is a mistake almost always when you see in a Makefile an individual rule for making a single file.
Normally, there should be only generic building rules that should be used for building any file of a given type.
A Makefile should almost never contain lists of source files or of their dependencies. It should contain only a list with the directories where the source files are located.
Make should search the source directories, find the source files, classify them by type, create their dependency lists and invoke appropriate building rules. At least with GNU make, this is very simple and described in its user manual.
If you write a Makefile like this, it does not matter whether a project has 1 file or 10,000 files, the effort in creating or modifying the Makefile is equally negligible. Moreover, there is no need to update the Makefile whenever source files are created, renamed, moved or deleted.
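To make the idea concrete without arguing over Make syntax, here is the same "search, classify, apply a generic rule" logic as a Python sketch. The `RULES` table, the rule names and the `build` directory are hypothetical. In GNU make the equivalent pieces are `$(wildcard ...)`, `$(patsubst ...)` and pattern rules such as `%.o: %.c`.

```python
from pathlib import Path

# Hypothetical mapping from source suffix to the name of a generic rule.
RULES = {".c": "compile_c", ".cpp": "compile_cxx", ".s": "assemble"}


def plan_build(source_dirs, build_dir="build"):
    """Return (source, object, rule) triples for every recognized source file.

    Only the list of directories is given; the files themselves are discovered.
    """
    plan = []
    for directory in source_dirs:
        for source in sorted(Path(directory).iterdir()):
            rule = RULES.get(source.suffix)
            if rule and source.is_file():
                # Simplification: objects are flattened into one directory,
                # so two sources with the same base name would collide.
                obj = Path(build_dir) / source.with_suffix(".o").name
                plan.append((source, obj, rule))
    return plan
```

Adding, renaming or deleting a source file changes the plan without anyone editing a file list, which is the property being described.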
If everything in your tree is similar, yes. I agree that's going to be a very small Makefile.
While this is true, for much larger projects, that have lived for a long time, you will have many parts, all with slight differences. For example, over time the language flavour of the day comes and goes. Structure changes in new code. Often different subtrees are there for different platforms or environments.
The Linux kernel is a good, maybe extreme, but clear example. There are hundreds of Makefiles.
Different platforms and environments are handled easily by Make "variables" (actually constants), which have platform-specific definitions, and which are sequestered into a platform-specific Makefile that contains only definitions.
Then the Makefiles that build a target file, e.g. executable or library, include the appropriate platform-specific Makefile, to get all the platform-specific definitions.
Most of my work is directed towards embedded computers with various architectures and operating systems, so multi-platform projects are the norm, not the exception.
A Makefile contains 3 kinds of lines: definitions, rules and targets (typical targets may be "all", "release", "debug", "clean" and so on).
I prefer to keep these in separate files. If you parametrize your rules and targets with enough Make variables to allow their customization for any environment and project, you almost never need to touch the Makefiles with rules and targets. For each platform/environment, you write a file with appropriate definitions, like the names of the tools and their command-line options.
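The split between definitions and rules can be sketched in Python like this. The platform names and the exact flag choices are assumptions for illustration only, and a real setup would keep each platform's definitions in its own file.

```python
# Hypothetical per-platform definitions: tool names and options only.
PLATFORMS = {
    "linux-x86_64": {"CC": "gcc", "CFLAGS": ["-O2", "-Wall"]},
    "arm-none-eabi": {
        "CC": "arm-none-eabi-gcc",
        "CFLAGS": ["-Os", "-mcpu=cortex-m4", "-mthumb"],
    },
}


def compile_command(platform: str, source: str, obj: str) -> list[str]:
    """Generic rule: the same for every platform, parametrized by definitions."""
    defs = PLATFORMS[platform]
    return [defs["CC"], *defs["CFLAGS"], "-c", source, "-o", obj]


for name in PLATFORMS:
    print(name, "->", " ".join(compile_command(name, "main.c", "main.o")))
```

Supporting a new platform means adding one entry of definitions; `compile_command` itself never changes.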
The simplest way to build a complex project is to build it in a directory with a subdirectory for each file that must be created. In the parent directory you put a Makefile that is the same for all projects, which just invokes all the Makefiles from the subdirectories that it finds below, passing any CLI options.
In the subdirectory for each generated file, you just put a minimal Makefile, with only a few lines, which includes the Makefiles with generic rules and targets and the Makefile with platform-specific definitions, adding the only information that is special for the generated file, i.e. what kind of file it is, e.g. executable, static library, dynamic library etc., a list of directories where to search for source files, the strings that should be passed to compilers for their include directory list and their preprocessor definition list, and optionally and infrequently you may override some Make definitions, e.g. for providing some special tool options, e.g. when you generate from a single source multiple object files.
Sure, but this will require you to know how to tell the compiler to generate your Makefile header dependencies and if you end up making a mistake, this will cause silent failures.
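For anyone who hasn't looked inside one: GCC and Clang can write the header dependencies as small Make-syntax files when given `-MMD` (optionally with `-MP`), and the build then includes those files. A simplified Python sketch of reading one follows. It assumes file names without spaces or colons, and the sample content is made up.

```python
def parse_depfile(text: str) -> dict[str, list[str]]:
    """Parse Make-style 'target: dep dep ...' rules from a dependency file.

    Simplified: assumes no spaces or colons inside file names.
    """
    rules: dict[str, list[str]] = {}
    joined = text.replace("\\\n", " ")  # fold backslash-newline continuations
    for line in joined.splitlines():
        if ":" not in line:
            continue
        target, _, deps = line.partition(":")
        rules.setdefault(target.strip(), []).extend(deps.split())
    return rules


# Made-up sample in the shape such files usually have.
sample = "main.o: main.c util.h \\\n config.h\nutil.h:\nconfig.h:\n"
print(parse_depfile(sample))
```

If the dependency file is missing or never included, nothing records that `main.o` depends on `util.h`, so editing the header rebuilds nothing and nothing complains. That is the silent failure.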
I like Makefiles, but just for me. Each time I create a new personal project, I add a Makefile at the root, even if the only target is the most basic of the corresponding language. This is because I can't remember all the variations of all the languages' and frameworks' build "sequences". But "$ make" is easy.
I disagree. Make is - in its simplest form - exactly a "simple plain shell script" for your tasks, with some very nice bonus features like dependency resolution.
Not the parent, but I usually start with a two line makefile and add new commands/variables/rules when necessary.
Make is - at its core - a tool for expressing and running short shell-scripts ("recipes", in Make parlance) with optional dependency relationships between each other.
Why would I want to spread out my build logic across a bunch of shell scripts that I have to stitch together, when Make is a nicely integrated solution to this exact problem?
`Just` is also popular in this space, but tbh I think Make is a better choice.
Make is included in most Linux distros, including the ones available for WSL. It's also included with Apple's developer tools. It's been used for decades by millions of people, and is very mature.
If you use it in a simple way, Make is almost identical to `Just`. And when/if you want Make's powerful features, they're there and ready for you. Make's documentation is also exceptional.
I've used a bunch of these kinds of tools over the years, and I've never found one that I like more than Make.
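To make the "recipes with optional dependency relationships" idea concrete, here is a toy model in Python. It is only a sketch of the concept, not how Make is implemented: the `build` function and the rule table are made up for illustration, recipes are plain callables instead of shell snippets, and timestamps and cycle detection are ignored entirely.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

# Hypothetical rule table: target -> (prerequisites, recipe).
Rules = Dict[str, Tuple[List[str], Callable[[], None]]]


def build(target: str, rules: Rules,
          done: Optional[Set[str]] = None,
          order: Optional[List[str]] = None) -> List[str]:
    """Run the recipe for `target` after its prerequisites, each at most once.

    Returns the targets in the order their recipes ran.
    No cycle detection: a circular rule table would recurse forever.
    """
    done = set() if done is None else done
    order = [] if order is None else order
    if target in done:
        return order
    deps, recipe = rules[target]
    for dep in deps:
        build(dep, rules, done, order)
    recipe()
    done.add(target)
    order.append(target)
    return order


log: List[str] = []
rules: Rules = {
    "all": (["test"], lambda: log.append("all done")),
    "test": (["compile"], lambda: log.append("run tests")),
    "compile": ([], lambda: log.append("compile sources")),
}
print(build("all", rules))
print(log)
```

Asking for "all" pulls in "test", which pulls in "compile", so the recipes run bottom-up without anyone stitching scripts together by hand.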
I’d be curious to hear your ratio. It really varies. In some small teams with talented people, there are hardly any “fake” developers. But in larger companies, they can make up a huge chunk.
Where I am now, it’s easily over 50%, and most of the real developers have already left.
PS: The fakes aren’t always juniors. Sometimes you have junior folks who are actually really good—they just haven’t had time yet to discover what they don’t know. It’s often absolutely clear that certain juniors will be very good just from a small contribution.
My personal experience:
- 5% geniuses. These are people who are passionate about what they do, they are always up to date. Typically humble, not loud people.
- 15% good, can do it properly. Not passionate, but at least have a strong sense of responsibility. Want to do “the right thing” or do it right. Sometimes average intelligence, but really committed.
- 80% I would not hire. People who talk a lot, and know very little. Probably do the work just because they need the money.
That applies to doctors, contractors, developers, taxi drivers, just about anything and everything. Those felt percentages have been consistent across 5 countries, 3 continents and half a century of life.
PS: results are corrected for seniority. Even in the apprentice level I could tell who was in each category.
From my 40 years in the field, I see much the same trend. I wouldn’t call 5% of developers “genius”—maybe 1% are true geniuses. Those folks can be an order of magnitude better at certain tasks—doing things no one else can—but only within a limited sphere. They also bring their own baggage, like unique personalities. Still, I believe there’s always room for genius on a big team, even with all the complications.
Typically, upper management wants smooth, steady output. But the better your people are, the bumpier that output gets—and those “one-percenters” can produce some pretty extreme spikes. If you think of it like a graph, the area under the curve (the total productivity) can be way bigger for a spiky output than for a flat, low-level one. So even if those spikes look messy, they often deliver a ton of long-term value.
Being able to set up things and truly understanding how they work are quite different imo.
I agree with the idea that a lot of productive app developers would not be able to set up a new project ex novo, but often it is not about true understanding so much as knowing the correct set of magic rules and incantations to make many tools work well together.
At my work I've noticed another contributing factor: tools/systems that devs need to interact with at some point, but otherwise provide little perceived value to learn day-to-day.
Example is build system and CI configuration. We absolutely need these but devs don't think they should be expected to deal with them day to day. CI is perceived as a system that should be "set and forget", like yeah we need it but really I have to learn all this just to build the app? Devs expect it to "just work" and if there are complexities then another team (AKA my role) deals with that. As a result, any time devs interact with the system, there's a high motivation to copy from the last working setup and move on with their day to the "real" work.
The best solution I see is to meet the devs halfway. Provide them with tooling that is appropriately simple/complex for the task, provide documentation, minimise belief in "magic". Tools like Make kinda fail here because they are too complex and black-box-like.
Same goes for anything "enterprisey". Last time I set up a big project, I made a commitment that "we should be able to check out and build this whole thing, for as long as humanly possible".
The local part is my big problem too. I used Azure DevOps at work. I find clicking through the UI to be a miserable experience; I'd love to have it running locally so I could view inputs and outputs on the file system. Also, YAML is an awful choice; no one I know enjoys working with it. The whitespace issues just get worse and worse the longer your files get.
Spoken like someone who has not tried what you are describing. There are two moving parts to your response: a locally hosted runner awaits jobs from GitLab itself, which doesn't help running _locally_, and the other part is that -- back when it existed! -- trying $(gitlab-runner exec) was not a full fledged implementation of the GitLab CI concepts, making it the uncanny valley of "run something locally."
However, as of v16 there is no more exec https://gitlab.com/gitlab-org/gitlab/-/issues/385235 which I guess is good and bad. Good in that it no longer sets improper expectations that it could have plausibly done anything, and bad in that now it joins GitHub Actions[1] in not having any _local_ test strategy aside from "boot up gitlab/gitlab-ce && echo good luck"
1: yes, I'm acutely aware of the 3(?) implementations/forks of nektos/act that claim to do GHA but, again, readme is not software and I can tell you with the utmost certainty they do not do as advertised
Not to mention that the rest of the environment is missing: you probably can't push to, e.g., the organization's Docker registry/Artifactory/whatever from your local dev machine (if it's even visible from your machine in the first place). And those are arguably the most interesting parts that you want to test the integration with.
I think GP got confused, it's not running the runners locally, it's running the CI steps locally (see the other sibling replies).
For example, running tests locally exactly the same way as in the runner - sometimes I have to open a debugger in the middle of a test to see what exactly went wrong. Our tests run in gitlab in a particular docker image, and I've been adding a "make test" that runs the same tests in the same image locally, with additional flags to have full interactivity so the debugger works if needed.
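A rough sketch of what such a local wrapper could boil down to, written in Python rather than Make. The image name, mount point and test command are placeholders I made up, not anything from the setup described above; the only real things here are the standard `docker run` flags (`--rm`, `-v`, `-w`, `-i`, `-t`).

```python
import shlex
from typing import List


def docker_test_command(image: str, workdir: str, test_cmd: List[str],
                        interactive: bool = False) -> List[str]:
    """Build the argv for running the test suite inside the CI image.

    `interactive=True` adds -i/-t so a debugger inside the container
    can read from the terminal.
    """
    cmd = ["docker", "run", "--rm", "-v", f"{workdir}:/src", "-w", "/src"]
    if interactive:
        cmd += ["-i", "-t"]
    return cmd + [image] + test_cmd


# Hypothetical image and test command, for illustration only.
argv = docker_test_command(
    image="registry.example.com/ci/tests:latest",
    workdir="/home/me/project",
    test_cmd=["pytest", "-x"],
    interactive=True,
)
print(shlex.join(argv))
# To actually run it: subprocess.run(argv, check=True)
```

The point is that CI and the developer both call the same command, and the only difference on a laptop is the interactivity flags.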
Strong agree. The best workflow I've seen uses CICD as a very thin wrapper around in-tree scripts or make files.
If a Dev can run some/all of the "cicd" stuff locally, they can see, control, and understand it. It helps tremendously to have a sense of ownership and calm, vs "cicd is something else, la la la".
(This doesn't always work. We had a team of two devs, who had thin-wrapper CICD, who pretended it was an alien process and refused to touch it. Weird.)
+1. The only CI tool that I've seen really organize around this principle is Buildkite, which I've used and enjoyed. I'm currently using GitHub Actions and it's fine but Buildkite is literally sooooo good for the reasons you've mentioned.
The office coffee machine is not „set and forget”, but you wouldn’t expect the entire responsibility for its maintenance to be evenly distributed between all people that use it. Similarly, CI needs ownership and having it fall on the last developer that attempted to use it is not an efficient way of working.
Make is one of the simplest build tools out there. Compared to something like Grunt, Webpack, etc. it’s a hammer compared to a mining drill.
The solution is to not use tools used by large corporations because they are used by large corporations. My unpopular opinion is that CI/CD is not needed in most places where it’s used. Figure out how to do your builds and deploys with the absolute fewest moving pieces even if it involves some extra steps. Then carefully consider the cost of streamlining any part of it. Buying into a large system just to do a simple thing is often times not worth it in the long run.
If you really do need CI/CD you will know because you will have a pain point. If that system is causing your developers pain, it isn’t the right fit.
If you think cmake is a good example of more complex than make, then you haven't seen automake/autoconf, the first thing I thought of. You can find tons and tons of configure scripts that check if you're running ancient versions of Unix, check that a byte is 8 bits wide, and run a ton of other pointless checks. They don't do anything with all that information (don't think for a moment that you can actually build the app on Irix), but the checks for it have been passed along for decades like junk DNA.
Firstly, automake/autoconf is not `make`, but a different piece of software, and secondly, that you know all those details about it is because it is not black-box-like.
I never said it was. It's a script to generate a script to generate a makefile, more or less.
If it wasn't black box like, why do people keep blindly copying tests which check things which haven't been relevant for decades and in any case would require a ton of manual effort to actually use for something?
Yeah, I think this is the real issue. Too many different tool types that need to interact, so you don't get a chance to get deep knowledge in any of them. If only every piece of software/CI/build/webapp/phone-app/OS was fully implemented in GNU make ;-) There's a tension between using the best tool for the job vs adding yet another tool/dependency.
Make and Makefiles are incredibly simple when they are not autogenerated by autoconf. If they are generated by autoconf, don’t modify them, they are a build artifact. But also, ditch autoconf if you can.
In the broader sense: yes this effect is very real. You can fall to it or you can exploit it. How I exploit it: write a bit of code (or copy/paste it from somewhere). Use it in a project. Refine as needed. When starting the next project, copy that bit of code in. Modify for the second project. See if changes can be backported to the original project. Once both are running and are in sync, extract the bit of code and make it into a library. Sometimes this takes more projects to distill the thing into what a library should be. In the best case, open source the library so others can use it.
They are also extremely limited. Timestamp-based freshness is often broken by modern VCSes. Git doesn’t record timestamps internally, so files can (and often do) have their mtime updated even when their contents are the same, causing unnecessary rebuilds.
They also are utterly unable to handle many modern tools whose inputs and/or outputs are entire directories or whose output names are not knowable in advance of running the tool.
I love make. I have put it to good use in spite of its shortcomings and know all the workarounds for them, and the workarounds for the workarounds, and the workarounds for those workarounds. Making a correct Makefile when you end up with tools that don’t perfectly fit into its expectations escalates rapidly in difficulty and complexity.
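For the mtime complaint specifically, the usual alternative is to compare content instead of timestamps. A minimal sketch in Python, assuming you are happy to keep a small state file of digests next to the build (the `changed_files` helper and the `digests.json` name are made up for the example):

```python
import hashlib
import json
import os
import tempfile
from pathlib import Path
from typing import Iterable, List


def file_digest(path: Path) -> str:
    """SHA-256 of the file's bytes; unaffected by mtime changes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def changed_files(paths: Iterable[Path], state_file: Path) -> List[Path]:
    """Return the paths whose content differs from the recorded digests,
    then record the current digests for the next run."""
    paths = list(paths)
    old = json.loads(state_file.read_text()) if state_file.exists() else {}
    new = {str(p): file_digest(p) for p in paths}
    state_file.write_text(json.dumps(new))
    return [p for p in paths if old.get(str(p)) != new[str(p)]]


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "main.c"
    state = Path(tmp) / "digests.json"   # hypothetical state file name
    src.write_text("int main(void) { return 0; }\n")
    print(changed_files([src], state))   # first run: nothing recorded yet
    os.utime(src, None)                  # bump mtime, content unchanged
    print(changed_files([src], state))   # compared by content, not by mtime
```

This costs a read of every input on every run, which is the trade-off that makes plain timestamps attractive in the first place.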
They are simple but very often wrong. It's surprisingly hard to write Makefiles that will actually do the right thing under anything other than "build from scratch" scenarios. (No, I'm not joking. The very existence of the idea of "make clean" is the smoking gun.)
I disagree, but I think once a project gets beyond a certain level of complexity you may need to move beyond make. For simple projects though I usually do something like:
This isn't perfect as it causes a full project rebuild whenever a header is updated, but I've found it's easier to do this than to try to track header usage in files. Also, failing to rebuild something when a header updates is a quick way to drive yourself crazy in C, it's better to be conservative. It's easy enough that you can write it from memory in a minute or two and pretty flexible. There are no unit tests, no downloading and building of external resources, or anything fancy like that. Just basic make. It does parallelize if you pass -j to make.
That's effectively "make clean" just slightly more automated. Btw, what happens if you change CFLAGS? Does anything get compiled if no files have changed?
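On the CFLAGS question: in a purely timestamp-driven build, changing a variable is not a file change, so nothing looks out of date. One workaround I've seen is to write the flags to a stamp file, touch it only when the flags really changed, and make the objects depend on it. A sketch of the stamp logic in Python, with a made-up helper name and stamp file name:

```python
import hashlib
import tempfile
from pathlib import Path


def update_flags_stamp(stamp: Path, flags: str) -> bool:
    """Rewrite `stamp` only when `flags` differ from what it records.

    Returns True if the stamp was (re)written. Leaving the file alone
    when nothing changed keeps its mtime old, so timestamp-based
    dependents do not rebuild needlessly.
    """
    digest = hashlib.sha256(flags.encode("utf-8")).hexdigest()
    if stamp.exists() and stamp.read_text() == digest:
        return False
    stamp.write_text(digest)
    return True


with tempfile.TemporaryDirectory() as tmp:
    stamp = Path(tmp) / "cflags.stamp"             # hypothetical stamp file
    print(update_flags_stamp(stamp, "-O2 -Wall"))  # first run writes the stamp
    print(update_flags_stamp(stamp, "-O2 -Wall"))  # same flags, stamp untouched
    print(update_flags_stamp(stamp, "-O0 -g"))     # flags changed, stamp rewritten
```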
I use makefiles all the time for my projects; projects that are actually built with something else (ex, gradle, maven, whatever). My makefiles have targets for build, clean, dependencies, and a variety of other things. And they also have inputs (like "NOTEST=true") for altering how they run. And then I use make to actually build the project; so I don't need to remember how the specific build tool for _this_ project (or the one of many build tools in a project) happens to work. It works pretty well.
The idea that git offers a 'clean' command was revelatory to me. Your build system probably shouldn't need to know how to restore your environment to a clean state because your source control should already know what a clean state is.
That's sort of essential to serving its purpose, after all.
I haven't yet run into a scenario where there was a clean task that couldn't be accomplished by using flags to git clean, usually -dfx[0]. If someone has an example of something complex enough to require a separate target in the build system, I'm all ears.
[0] git is my Makefile effect program. I do not know it well, and have not invested the time to learn it. This says something about me, git, or both.
The problem with `git clean` is -X vs -x. -x (lowercase) removes EVERYTHING including .env files and other untracked files. -X (uppercase) removes only ignored files, but not untracked files.
If there is a Makefile with a clean target, usually the first thing I do when I start is make it an alias for `git clean -X`.
Usually, you want to keep your untracked files (they are usually experiments, debugging hooks, or whatever).
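Since the -x/-X distinction is easy to mix up, here is a tiny Python model of it. It only models the file-selection rule as I understand it from the git-clean documentation (plain `git clean -f` removes untracked, non-ignored files; `-x` also removes ignored files; `-X` removes only ignored files). It does not model `-d`, `-f` or directories, and the file names are invented.

```python
from typing import Dict, List

TRACKED, UNTRACKED, IGNORED = "tracked", "untracked", "ignored"


def would_remove(files: Dict[str, str], mode: str = "") -> List[str]:
    """Which paths a clean would delete.

    mode "" models plain `git clean -f`, "x" models `-x`, "X" models `-X`.
    `files` maps each path to its status as git would classify it.
    """
    if mode == "":
        doomed = {UNTRACKED}
    elif mode == "x":
        doomed = {UNTRACKED, IGNORED}
    elif mode == "X":
        doomed = {IGNORED}
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return sorted(path for path, status in files.items() if status in doomed)


# Hypothetical working tree.
tree = {
    "main.c": TRACKED,
    "main.o": IGNORED,        # build output matched by .gitignore
    "scratch.c": UNTRACKED,   # local experiment, not ignored
}
for mode in ("", "x", "X"):
    print(repr(mode), would_remove(tree, mode))
```

In this model `-X` is the only variant that leaves the scratch file alone, which is why it makes a reasonable stand-in for a clean target.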
This. I have no urge to have git "clean" my project, because I'll lose a ton of files I have created locally. Rather, I want the project to know what it creates when it builds and have the ability to clean/purge them. It's a never ending source of frustration for me that "gradlew clean" only cleans _some_ stuff, and there's no real "gradlew distclean".
I know there are local-only git settings, but AFAIK ignore-status won't protect files from being removed/overwritten by all possible git commands, it just means they won't be accidentally staged/committed.
That’s why I usually write them from scratch and don’t let them get over 100 lines long at most. Usually they are around 30 with white space.
make clean makes lots of sense but is not even strictly necessary. In the world where all it does is find all the *.o files and deletes them it’s not a bad thing at all.
I think Makefile is maybe the wrong analogy - the problem with most people and makefiles is they write so few of them, the general idea of what make does is at hand, but the muscle memory of how to do it from scratch is not.
But, point taken - I've seen so much code copy-pasta'd from the web, there will be like a bunch of dead stuff in it that's actually not used. A good practice here is to keep deleting stuff until you break it, then put whatever that was back... And delete as much as possible - certainly everything you're not using at the moment.
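The "delete until it breaks, then put it back" routine is mechanical enough to sketch in Python. This is a greedy single pass, so it is an approximation: it assumes you have some check that tells you whether the thing still works, and it can miss lines that are only removable together. The `prune` name and the example check are made up.

```python
from typing import Callable, List


def prune(lines: List[str], still_works: Callable[[List[str]], bool]) -> List[str]:
    """Drop every line whose removal leaves `still_works` true."""
    kept = list(lines)
    i = 0
    while i < len(kept):
        candidate = kept[:i] + kept[i + 1:]
        if still_works(candidate):
            kept = candidate   # line i was dead weight, leave it out
        else:
            i += 1             # removing it broke something, keep it
    return kept


# Hypothetical copied config and a stand-in "does it still build" check.
copied = ["CC = gcc", "LEGACY_IRIX = 1", "CFLAGS = -O2", "# from old project", "all: app"]


def builds(cfg: List[str]) -> bool:
    return "CC = gcc" in cfg and "all: app" in cfg


print(prune(copied, builds))
```

In real life `still_works` is "run the build and the tests", which is slow, so this is more of a Friday afternoon job than a per-commit one.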
This is exactly the problem I face with many tools: Makefiles, KVM setups, docker configurations, CI/CD pipelines. My solution so far has been to create a separate repository with all my notes, shell scripts, example programs etc. for these tools, libraries or frameworks. Every time I have to use these tools, I refer to my notes to refresh my memory, and if I learn something new in the process, I update the notes. I can even point an LLM at it now and ask it questions.
The repository is personal, and contains info on tools that are publicly available.
I keep organisation specific knowledge in a similar but separate repo, which I discard when my tenure with a client or employer ends.
I'm usually contractually obligated to destroy all client IP that I may possess at the end of an engagement. My contracts usually specify that I will retain engagement specific information for a period of six months beyond the end of the contract. If they come back within that time, then I'll have prior context. Otherwise it's gone. Occasionally, a client does come back after a year or two, but most of the knowledge would have been obsolete and outdated anyway.
As for LLMs, I have a couple of python scripts that concatenate files in the repo into a context that I pass to Google's Gemini API or Google AI studio, mostly the latter. It can get expensive in some situations. I don't usually load the whole repository. And I keep the chat context around so I can keep asking questions about the same topic.
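For anyone curious what such a concatenation script might look like, here is a guess at the general shape in Python. It is not the actual script: the suffix filter, the `=== path ===` header format and the character budget are all assumptions, and sending the result to any API is deliberately left out.

```python
import tempfile
from pathlib import Path
from typing import Iterable


def build_context(root: Path,
                  suffixes: Iterable[str] = (".md", ".sh", ".py"),
                  max_chars: int = 200_000) -> str:
    """Concatenate matching files under `root` into one prompt string.

    Each file is preceded by a header with its relative path. Stops
    before the first file that would push the total past `max_chars`.
    """
    wanted = set(suffixes)
    parts = []
    total = 0
    for path in sorted(root.rglob("*")):
        if ".git" in path.parts or not path.is_file() or path.suffix not in wanted:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        chunk = f"=== {path.relative_to(root).as_posix()} ===\n{text}\n"
        if total + len(chunk) > max_chars:
            break
        parts.append(chunk)
        total += len(chunk)
    return "".join(parts)


with tempfile.TemporaryDirectory() as tmp:
    notes = Path(tmp)
    (notes / "make.md").write_text("Use $@ for the target name.")
    (notes / "image.png").write_bytes(b"\x89PNG")   # skipped by the suffix filter
    print(build_context(notes))
```

Passing a subdirectory as `root` is the cheap way to avoid loading the whole repository.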
The best term for this is Cargo Cult Development. Cargo Cults arose in the Pacific during World War II, where native islanders would see miraculous planes bringing food, alcohol and goods to the islands and then vanishing into the blue. The islanders copied what they saw the soldiers doing, praying that their bamboo planes and coconut gadgets would impress the gods and restart the flow of cargo to the area.
The issue of course is the islanders did not understand the science behind planes, walkie-talkies, guns, etc.
Likewise, cargo cult devs see what is possible, but do not understand first principles, so they mimic what they see their high priests of technology doing, hoping they can copy their success.
Hence the practice of copying, pasting, trying, fiddling, googling, tugging, pulling and tweaking hoping that this time it will be just right enough to kind of work. Badly, and only with certain data on a Tuesday evening.
I don't think of this as being cargo cult development. Cargo culting has more to do with mimicking practices that have worked before without understanding that they only worked within a broader context that is now missing. It's about going through motions or rituals that are actually ineffective on their own in the hopes that you'll get the results that other companies got who also happened to perform those same motions or rituals.
What OP is describing isn't like this because the thing being copied—the code—actually is effectual in its own right. You can test it and decide whether it works or not.
The distinction matters because the symptoms of what OP calls the Makefile effect are different than the symptoms of cargo culting, so treating them as the same thing will make diagnosis harder. With cargo culting you're wasting time doing things that actually don't work out of superstition. With the Makefile effect things will work, provably so, but the code will become gradually harder and harder to maintain as vestigial bits get copied.
I would almost call this the "boilerplate effect".
Where people copy the giant boilerplate projects for React, K8, Terraform, etc. and go from there. Those boilerplates are ideal for mid to large scale projects. And it's likely you'll need them someday. But in the early stages of development it's going to impart a lot of architecture decisions that really aren't necessary.
That's a great phrase. A perfect example of what you're talking about is actually built-in to the `helm` tool for creating re-usable kubernetes configs: `helm create myapp` creates the most convoluted junk I've ever seen in my life. As a new `helm` user I was extremely confused, as I had been expecting a minimal example that I could start from. Instead, I got a maximal example. Thankfully a more experienced engineer on the infra team confirmed that it was mostly unnecessary boilerplate and I could remove it.
Something to consider for anyone else building tools — boilerplate has costs!
Seeing this exact effect where I am currently working. Main available CI/CD tool is a customised and centrally managed Jenkins fleet. It's pretty much impossible to avoid using and seldom needs changed - until it does. Some attempts have been made at centralised libraries and patterns - but even that requires knowledge and study that most won't know is available or be given time to acquire.
So when the inevitable tweak or change is made it's made in the easiest, cheapest way - which is usually copying an existing example, which itself was copied from somewhere else.
I see exactly the same in other teams' repositories. Easiest path taken to patch what already exists as the cost/benefit just isn't perceived to be there to be worth prioritising.
> only worked within a broader context that is now missing
> because the thing being copied—the code—actually is effectual in its own right.
I don't understand how the second disproves the former.
In fact, a cargo cult works because there's the appearance of a causal linkage. It appears things work. But as we know with code, just because it compiles and runs doesn't mean "it works". It's not a binary thing. Personally, I find that belief is at the root of a lot of cargo cult development, where many programmers glue things together and say "it works" because they passed some test cases, but in reality code shouldn't be a Lovecraftian monster made of spaghetti and duct tape. Just because your wooden plane glides doesn't mean it's an actual plane.
But...it doesn't? That's the whole definitional point of it. If action A _does_ lead to outcome B, then "if we do A, then B will happen" is not a cargo cult perspective, it's just fact.
Sorry, the quotes around "work" were implicit. I thought there was enough context to make that clear. That just because some things actually work doesn't mean it works for the reason it actually works.
This is what I meant by cargo cults working: there is a _belief_ in a causal connection where there is none. The Melanesians really did believe that their actions would cause cargo to return. It's not just about appearance. It is about not understanding the actual causal chain and misinterpreting the causal variables.
Measurement is HARD. It is very common that if you do A then B will happen YET A is not the cause of B. Here's a trivial example: a mouse presses a button and it gets fed. In reality, the mouse presses a button and a human feeds the mouse. It is not the button. It may even be impossible for the mouse to know this. But if the human leaves, the mouse can press the button all they want and they will not get fed. I hope you can see how this can happen in far more complex ways. As the chain of events gets longer and more complex you can probably see why it is actually a really common issue. Because you have literal evidence that your actions are causal while they are not.
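The mouse example fits in a few lines of Python, if that helps. This is just the toy scenario above written down, with a made-up `fed` function standing in for the hidden mechanism:

```python
def fed(pressed: bool, human_present: bool) -> bool:
    """Hidden mechanism: the human does the feeding, not the button."""
    return pressed and human_present


# What the mouse observes while the human happens to be around.
presses = [True, False, True, True, False]
observations = [(p, fed(p, human_present=True)) for p in presses]
press_predicts_food = all(p == f for p, f in observations)

# The intervention the mouse cannot run: remove the human.
fed_after_human_leaves = fed(True, human_present=False)

print(press_predicts_food, fed_after_human_leaves)
```

Every observation the mouse can collect is consistent with "button causes food", and the one variable that would separate the two explanations is the one it never gets to change.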
For actual cargo cults, yes. Cargo Cult Development just used the name to invoke a comparison. When CCD is being practiced, devs are doing mystical steps because it's part of the incantation. They wouldn't keep doing them if the project then never worked.
Your definition is extremely unlikely to ever be practiced, because those developers would be fired for never getting anything working, and so it's not really a helpful one imo.
Concrete examples of what I think actually counts as cargo culting:
* Incorporating TDD because it's a "best practice".
* Using Kubernetes because Google does it.
* Moving onto AWS because it's what all the cool companies are doing.
The key thing that makes cargo cult development a cargo cult is that it's practices and rituals adopted without any concrete theory for what a bit is supposed to do for you in your context. You're just doing it because you've seen it done before.
This is different than small scale copypasta where people know exactly what they're trying to accomplish but don't take the time in any given instance to slow down and re-analyze why this bit of code looks the way that it does. They know that it works, and that's enough in that moment.
If we're going to go back to the original analogies that started it all, what you're describing as cargo cult would be more similar to islanders using machinery that was left behind without knowing how it works or how to maintain it. They don't strictly need to know that in order to gain actual concrete value from the machinery, but it would be better in the long term if they knew how to maintain it.
Right, yes, exactly this. Using a tool or process without full understanding of its operation, or how to use it in different ways, is not _ideal_ - but it's a different thing (similar! but different) from "doing something just because everyone else is doing it". A Makefile-copier might not know the full stack of concepts that undergird their working configuration - but they do know (in the "true justifiable knowledge" sense) that it _is_ a working configuration, and they know why they're adopting it.
Hmm, fair, my definition certainly doesn't apply to CCD ("we do it because Google does it") - but I still maintain (as a child commenter elaborates) that there's a difference between "the reason I'm doing this is because other people do it" and "the reason I'm doing this is to achieve my given aim. I don't know the exact functionality by which this leads to the aim being achieved, and I might not be able to modify this to achieve other aims - but I do know that it will get me where I want to go".
Neither is ideal - but, the latter is much less harmful IMO.
I'd say it is true for both. There's evidence that the actions cause the events. They correlate. It's why people start doing the actions in the first place. By the exact reasoning you use, if it didn't "work" (appear to work) then the cult dies off pretty fast (and they do). Rationally irrational. It's good to be aware of because with high complexity systems it is easy to fall into these types of issues, where you are doing A and you _believe_ you are causing B, but there is no real relation.
> Just because your wooden plane glides doesn't mean it's an actual plane
But if your wooden plane can somehow make it to Europe, collect cargo, and bring it back to your island, what you're doing is definitely not cargo culting.
It might not be actual engineering, maybe you don't understand aerodynamics or how the engine works, and maybe the plane falls apart when it hits the runway on the return flight, but if you got the cargo back you are doing something very different from cargo culting.
That's why copypasta doesn't count as cargo culting. It accomplishes the same task once copied as it did before. It may do so less reliably and less legibly, but it does do what it used to do in its original context.
>> Just because your wooden plane glides doesn't mean it's an actual plane
> But if your wooden plane can somehow make it to Europe, collect cargo, and bring it back to your island
Sure, but these are categorically different and not related to my point.
> That's why copypasta doesn't count as cargo culting.
Let me quote wiki[0]
The term cargo cult programmer may apply when anyone inexperienced with the problem at hand copies some program code from one place to another with little understanding of how it works or whether it is required.
Cargo cult programming can also refer to the practice of applying a design pattern or coding style blindly without understanding the reasons behind that design principle. Some examples are adding unnecessary comments to self-explanatory code, overzealous adherence to the conventions of a programming paradigm, or adding deletion code for objects that garbage collection automatically collects.
Even in the example it gives the code will "work." You can collect garbage when the language already does that, you'll get performance hits, but your code won't break.
If "it doesn't _work_" disqualifies something from being cargo cult programming, then there would be no cargo cult programming. Who is shipping code that doesn't compile or hits runtime errors with any form of execution? You couldn't do that for very long.
Let's take an airplane example. Say you want to copy Boeing[1]. You notice that every 747 has a coffee maker on it. So you also make a coffee maker. After all, it is connected to the electrical system and the engines. Every time you take out the coffee maker the airplane fails. So you just put in a coffee maker.
A cargo cult exists BECAUSE _something_ is "working". BECAUSE they have evidence. But it is about misunderstanding the causality. See also Feynman's "Cargo Cult Science"[2]. As dumb as people are, there's always a reason people do things. It is usually not a good reason and it is often a bad reason, but there is a reason. People will even give you "causal" explanations for things like astrology.
> not only what you think is right about it: other causes that could possibly explain your results
His explanation explicitly acknowledges the experiment works. In fact, even the math to explain the experiment "works". But it is wrong. Related is von Neumann's elephant, where Freeman Dyson had evidence that a theory explained an experiment, yet it was in fact wrong. Evidence isn't sufficient to determine causality.
To quote the original source that Wiki cites and is derived from:
> A style of (incompetent) programming dominated by ritual inclusion of code or program structures that serve no real purpose. A cargo cult programmer will usually explain the extra code as a way of working around some bug encountered in the past, but usually neither the bug nor the reason the code apparently avoided the bug was ever fully understood (compare {shotgun debugging}, {voodoo programming}).
This is categorically different than the kinds of copypasta that TFA is talking about, and it's different in that the copypasta in TFA does serve a purpose.
There's a world of difference between copying something whose implementation you don't understand but whose function you do understand versus copying something which you vaguely associate with a particular outcome.
This is mentioned in footnote 1. Concretely, I don’t think this is exactly the same thing as cargo culting, because cargo culting implies a lack of understanding. It’s possible to understand a system well and still largely subsist on copy-pasting, because that’s what the system’s innate complexity incentivizes. That was the underlying point of the post.
For me, there are many cases where I copy-paste stuff I've written in the past b/c some tool is a pain-in-the-ass and I can't afford the mental context switch. I usually do understand what's happening under the hood, but it's still cognitively heavy to switch into that "mode" so I avoid it when possible.
Tools that fall into this category are usually ops-y things with enormous complexity but are not "core" to the problem I'm solving, like CI/CD, k8s, Docker, etc. For Make specifically, I usually just avoid it at this point b/c I find it hard to avoid the context switch.
It has nothing to do with miraculous incantations--I know the tradeoff I'm making. But it still runs the risk of turning into the Makefile Effect.
It’s always hoped (but rarely shown to be true) that by making templates, teams will put thought into their K8s deployments etc. instead of just copy/pasting. Alas, no – even when the only things the devs have to do is add resource requests and limits, those are invariably copy/pasted. If the app gets OOMkilled, they bump up memory limit until it doesn’t. If it’s never OOMkilled, it’s probably never touched, even if it’s heavily over-provisioned (though that would only matter for the request, of course).
This has spawned a cottage industry of right-sizing tooling, which does what a dev team could and should have done to begin with: profiling their code to see resource requirements.
At this point, I feel like continuing to make things easier is detrimental. I certainly don’t think devs need to know how to administer K8s, but I do firmly believe one should know how to profile one’s code, and to make reasonable decisions based on that effort.
I do know how to profile my code, and I'll also continue to bias towards not doing it now. Even if it could mean more pain later.
I think part of the problem is that the pain it causes is quite visceral, but the opportunity cost is pretty abstract, so it's a lot easier to just focus on the pain and forget about what you're gaining.
I agree, and I think the key distinction is in understanding. In a cargo cult there's a lack of understanding, whereas I'll often copy and paste code/config I understand to get something done. Usually this is for something I don't do very often (configuring nginx, writing some slightly complicated shell script etc.). I could spend an hour reading docs and writing the thing from scratch, but that's likely gonna be wasted time because there's a good chance I'm not going to look at that thing again for a few years.
And of course every one of those tools has to have their own special language/syntax that makes sense nowhere else (think of all the tools beyond make, like autotools, etc)
I don't care about make. I don't care about learning make beyond what's needed for my job
Sure, it's a great tool, but I literally have 10 other things that deserve more of my attention than having my makefile work as needed
Honestly is this not how it should be done? There's always going to be a more elegant approach for sure. But in general, we don't want developers to keep rewriting the same code again and again. Avoiding that is part of entire design paradigms. I'd like to talk to the dev who doesn't copy-paste and writes everything from scratch.
The article does kind of mention this in footnote '1', for what it's worth:
> The Makefile effect resembles other phenomena, like cargo culting, normalization of deviance, “write-only language,” &c. I’ll argue in this post that it’s a little different from each of these, insofar as it’s not inherently ineffective or bad and concerns the outcome of specific designs.
I think I fall very much into the "beginner of beginner stages" of understanding programming. It sounds like then, if I want to avoid that "cargo cult" mindset, then a structured flow of:
Does this then mean that, if someone truly wants to "escape the island, and fly the plane" as it were, it comes down to "university is the 'truest' way"?
Note: Yes, I realize it's hard to speak in absolutes, that there are plenty of exceptions to generalities, and that all people have various degrees of justifications of I-can't-do-that-itus; I'm talking more in terms of optimal theory. That, the optimal route to avoid cult-like behavior is to understand the whole thing, and that "the whole thing" comes from higher education, right?
Logically at least, it would seem that even diligent studying with books as a means to meet/surpass the "completeness" of university would still be... inadequate in some regard when compared to in-class time with learned educators. (Again, supposing that the same person worked just as hard doing either option, etc.)
An engineer should learn first principles and master the tool rather than dancing around it or immediately reaching to replace it with something else. This is why the "replacement" tool "just" is fundamentally terrible: it doesn't do dependency checking and it optimizes for the wrong things. You instantly lose efficiency by throwing away the power and simplicity of makefiles (GNU extensions often needed).
Instead, (GNU or vanilla) makefiles are ideal for very simple, portable projects. Make is everywhere.
For anything complicated, use a proper build system that doesn't rely on autotools, like CMake or Bazel.
Another factor is frequency of use. I use LaTeX to do big write-ups on the order of once per year or less. LaTeX at the level I use it is not a hard tool, but I generally start a new document by copy-pasting a previous document because there is a lot of detail about how to use it that I'm never going to remember given that I only use it for a few weeks once a year.
I usually try to avoid the "makefile effect" by learning the technology I use reasonably frequently (like e.g. Makefiles, Shell Scripts, ...).
However, despite the fact that I used to use LaTeX very much, I always copy-pasted from a template. It is even worse with beamer presentations and TikZ pictures where I would copy-paste from a previous presentation or picture rather than a template.
For TikZ I am pretty sure that the tool is inherently complex and I just haven't spent enough time to learn it properly.
For LaTeX I have certainly spent enough time on learning it so I wonder whether it might be something different.
In my opinion it could very well be a matter of “(in)sane defaults”. Good tools should come with good defaults. However, LaTeX is not a good tool wrt. this metric, because basically all my documents start something like
Most of this is to get some basic non-ASCII support that is needed for my native tongue or enable some sane defaults (A4 paper, microtype...) which in a modern tool like e.g. pandoc/markdown may not be needed...
Hence the purpose of copy-pasting the stuff around is often to get good defaults which a better tool might give you right out of the box (then without copy/paste).
Copy-pasting itself is not bad per se. What's bad is copy-pasting without understanding the why and how.
For LaTeX I also copy-paste a whole lot from older files, but I don't feel bad because (a) I wrote these files before, (b) I know exactly what each line is doing, (c) I understand why each line is needed in the new doc.
I wrote a relatively large amount of TikZ code earlier in my life (basically used it as a substitute for Illustrator) and for this library in particular, I think it just has so much syntax to remember that I cannot keep it all in my brain for ever. So I gladly copy from my old TikZ code.
\usepackage[utf8]{inputenc} now is the default, at least; you don't need to include it anymore. And diacritics work out of the box, no need to write weird incantations like G\"{o}del anymore.
I use it more often and also start with a copy-pasted header that includes:
* all packages needed for my language (fontenc, babel, local typography package)
* typical graphicx/fancyhdr/hyperref/geometry packages that are almost always needed
* a set of useful symbol and name definitions for my field
If you are not writing math or plain English-only text, LaTeX is batteries-not-included.
This is “Copy-Pasta Driven Development” [0] and it’s not even related to makefiles. It’s related to the entire industry copying code from here to there without even knowing what they are copying.
TBH I think Copilot has made this even worse, as we are blindly accepting chunks of code into our code bases.
Blame the business people. I tried becoming an expert in `make` probably at least 7 times in a row, was never given time to work with it daily until I fully memorized it.
At one point I simply gave up; you can never build the muscle memory and it becomes a cryptic arcane knowledge you have to relearn from scratch every time you need it. So I moved to simpler tools.
The loss of deep work is not the good programmers' fault. It's the fault of the business people.
I wouldn't say so. Make is very simple and you can grasp the basics within an hour or so, if you're familiar with shell scripting (as it's basically a superset of shell scripts, with the dependency graph on top). Then all you have to do is just-in-time learning, which is mostly searching for a simpler pattern than what you're currently doing.
If I had a nickel for every time I have seen a Makefile straight up copied from other projects and modified to "work" while leaving completely unrelated unnecessary build steps and targets in place.
You know, a makefile is documentation. That's why you should probably never copy one (except for a single line here or there). There's space for commenting on a few things, but your target names and variables should explain most of what is going on there.
Anyway, the article and most people here seem to be talking about those autotools generated files. Or hand-built ones that look the same way. But either way, it's a bad solution caused by forcing a problem to be solved by a tool that wasn't aimed at solving it. We have some older languages without the concept of a "project" that need a lot of hand-holding for compiling, but despite make being intentionally created for that hand-holding, it's clearly not the best tool for that one task.
You find the first part in your stack that is documented (e.g., make is documented, even if your makefile is not) and use that documentation to understand the undocumented part.
You then write down your findings for the next person.
If you don’t have enough time, write down whatever pieces you understood, and write down what parts “seem to work, but you don’t understand“ to help make progress towards better documentation.
If you put the documentation as comments into the file, this can make copy&pasting working examples into a reasonably solid process.
There are certainly a lot of tools that are more complicated than necessary, but Make as a tool isn’t a good example of that, IMO. With modern tooling, more often than not the complexity problem is compounded by insufficient documentation, the existing documentation being predominantly cookbook-style and not explaining the conceptual models needed to reason about how the tool works, nor providing a detailed and precise enough specification of the tool. That isn’t the case for Make, which is well-documented and not difficult to get a good grasp on, if one only takes the time to actually read the documentation.
The cookbook orientation mentioned above in turn leads to a culture that underemphasizes the importance of learning and understanding the tools that one is using, and of having thorough documentation that facilitates that. Or maybe the direction of causation is the other way around. In any case, I see the problem more in too little time being spent in creating comprehensive and up-to-date documentation on tooling (and designing the tooling to be amenable to that in the first place), and in too little resources being allocated to teaching and learning the necessary tooling.
I wouldn't say this is necessarily a bad thing. I wrote my first version of a Makefile with automatic dependencies and out-of-tree builds 10+ years ago and I have been copying and improving it since. I do try to remove unneeded stuff when possible.
The advantage is that one can go in and modify any aspect of build process easily, provided one takes care to remove cruft so that the Makefile does not become huge. This is very important for embedded projects. For me, the advantages have surpassed the drawbacks (which I admit are quite a few).
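For anyone who hasn't seen one, a Makefile with automatic dependencies and an out-of-tree build can be fairly short. What follows is only a sketch, not the poster's actual file: it assumes GNU Make with GCC or Clang, and the names `app`, `src` and `build` are made up for illustration.

```makefile
# Sketch only: assumes GNU Make and GCC or Clang.
# app, src and build are hypothetical names.
TARGET    := app
SRC_DIR   := src
BUILD_DIR := build

SRCS := $(wildcard $(SRC_DIR)/*.c)
OBJS := $(patsubst $(SRC_DIR)/%.c,$(BUILD_DIR)/%.o,$(SRCS))
DEPS := $(OBJS:.o=.d)

# Default goal: link the program inside the build directory.
$(BUILD_DIR)/$(TARGET): $(OBJS)
	$(CC) $(LDFLAGS) -o $@ $^ $(LDLIBS)

# -MMD writes a .d file next to each object listing the headers it uses;
# -MP adds dummy targets so a deleted header does not break the build.
# The order-only prerequisite (after the |) creates the directory first.
$(BUILD_DIR)/%.o: $(SRC_DIR)/%.c | $(BUILD_DIR)
	$(CC) $(CFLAGS) -MMD -MP -c -o $@ $<

$(BUILD_DIR):
	mkdir -p $@

# Pull in the generated dependency files; the leading "-" ignores
# files that do not exist yet, e.g. on the first build.
-include $(DEPS)

.PHONY: clean
clean:
	$(RM) -r $(BUILD_DIR)
```

Most of this is the same from project to project, which is probably why it ends up being copied rather than rewritten.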
You could, in theory, abstract much of this common functionality away in a library (whether for Make or any other software), however properly encapsulating the functionality is additional work, and Make does not have great built-in support for modularization.
In this sense I would not say Make is overly complex but rather the opposite, too simple. Imagine how it would be if in C global variables were visible across translation units. So, in a way, the "Makefile effect" is in part due to the nature of the problem being solved and part due to limitations in Make.
I am that someone else because I seldom edit the makefiles and I forget things. That's why I try to trim unused targets and recipes and I try to keep it documented.
In the end it is no different from any code that's suffered from 10 years of tuning and it can get ugly. Maybe Make is even somewhat worse in this respect, but then again it does not need to be changed often.
Isn't this a very generic phenomenon? I would argue it applies broadly. For example, in budgeting you usually start from last year's budget and tweak that, rather than start from scratch. Or when you write an application letter, or a ServiceNow ticket, or whatever. Now I regret that I have brought ServiceNow into the discussion, it kills the good mood....
But as I understand it, and I am not an accountant (IANAA?), for non-ZBB budgets last year's budget is usually used as a starting point and increases are justified.
"Here's why I need more money to do the same things as last year, plus more money if you want me to do anything extra".
I'd be curious what our man Le Cost Cutter Elon Musk does for budgeting?
I have observed the Makefile effect many times for LaTeX documents. Most researchers I worked with had a LaTeX file full of macros that they have been carrying from project to project for years. These were often inherited from more senior researchers, and were hammered into heavily-modified forks of article templates used in their field or thesis templates used at their institution.
This is a great example of an instance of this "Makefile effect" with a possible solution: use Markdown and Pandoc where possible. This won't work in every situation, but sometimes one can compose a basic Beamer presentation or LaTeX paper quickly using largely simple TeX and the same Markdown syntax you already know from GitHub and Reddit.
That won’t solve any problem that LaTeX macros solve. Boilerplate in LaTeX has 2 purposes.
The first is to factor frequently-used complex notations. To do this in Markdown you’d need to bolt on a macro preprocessor on top of Markdown.
The second one is to fine-tune typography and layout details (tables are a big offender). This is something that simply cannot be done in Markdown. A table is a table and if you don’t like the style (which is most of the time inadequate) then there is no solution.
I have made a conscious effort in the past to never copy/paste the initial fleshing-out of a Makefile or a PHP class, or HTML boilerplate, or whatever. Like, for years I stuck to that. Then I stopped making that effort because there is no upside. Or rather, there is no downside to copy+paste+modify. It's faster and you save your brain power for things that actually matter.
There's a subtle difference between a nice template and a fully-working implementation that you then modify though.
(e.g. in that they were designed with different goals in mind, so the former is likely to have stopped at the point where it was general enough, to save you time, but not too specific to create footguns).
Bonus points if your template explicitly has fail patterns that prevent your code from silently failing.
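One way a Makefile template could carry such a fail pattern is sketched below. It assumes GNU Make; `DEPLOY_ENV` and `deploy.sh` are hypothetical names, not taken from any real project.

```makefile
# Hypothetical template: DEPLOY_ENV and deploy.sh are made-up names.
# Stop with a clear message instead of running with an empty value.
ifndef DEPLOY_ENV
$(error DEPLOY_ENV is not set; run e.g. "make deploy DEPLOY_ENV=staging")
endif

.PHONY: deploy
deploy:
	./deploy.sh "$(DEPLOY_ENV)"
```

Whoever copies the template then gets a loud error rather than a deploy with a blank argument.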
* using fully automatic dependencies is a good idea
* never committing generated files is a good idea (avoid hysteresis)
It is fundamentally very difficult to get all three of these at once; automatic dependencies often require generating files ahead of time, but generating files often involves needing to know the dependencies or at least their paths ahead of time.
These days the trend seems to be to commit "generated-in-place" files, which avoids some of the historical problems with the last (at the cost of introducing others). I don't claim this is optimal.
I did that, then I needed to tweak things so I added options, then I needed to use the package somewhere that needed to be self-contained, so I started copy-pasting ;). I've done similar things with makefiles, tox configs, linter settings (all of which started from an initial version I wrote from scratch).
I suspect the real reason this effect exists is that copy-pasting is the best way to solve the problem, due to a varying mix of: there being no way of managing the dependencies, needing to avoid (unmanaged) dependencies (i.e. vendoring is the same, only we have a tool managing it), the file (or its contents) needing to exist there specifically (e.g. the various CI locations) and no real agreement on what template/templating tool to use (and a template is just as likely to include useless junk). Copy-pasting is viewed as a one-time cost, and the thing copy-pasted isn't expected to change all that much.
I guess that there's a very important difference between copying something that you understand (or at least the details of which, like syntax, you can easily remember - here comments become important),
and copying something that you not only do not understand, but also did not make in the first place, and never understood!
Okay, but to me, copy-pasting working code (even with some extra unused bits) really looks no different from inheriting a library-provided base class and then extending it to one's needs.
That's literally the basis of all software. There is no need to invent "a Makefile effect/syndrome"
Yes, that's an indication that a code sharing mechanism is needed but not implemented. Copy-pasting solves that. You don't expect people to rewrite an HTTP client for every project which interacts with APIs, do you?
I think this is a good point. As somewhat of a tangent I have vaguely been thinking of the difference between copy pasting and explicitly extending for a bit.
It seems that in many cases, adapting copy pasted code has some benefits over importing and adjusting some library code. https://ui.shadcn.com/ is an example of going the copy paste direction. It seems to me this is preferable when tweaking the exact behaviour is more important than keeping up to date with upstream or adhering to an exact standard. If you customize the behaviour a lot the extra abstraction layer only gets in the way.
This insight might be a bit mundane. But I remember myself bending over backwards a bit too much trying to reuse when copy pasting is fine.
Well, I expect people to understand http clients and if things don't work to be sufficiently knowledgeable to recognize when they have a performance problem and figure out why they have it. For that one needs language, library and networking skills which to a degree most developers have because they do it every day.
At issue however are niche skills. We are dealing with the long tail of a distribution and heuristics which work most of the time might not - the author mentions e.g. security. The way I look at this is risk i.e. security, bus factor, disruptions due to software moving from state "works and is not understood" to "broken and is not understood" and last but not least ability to predict behavior of this niche technology when it is going to be pushed into a larger project.
> the tool (or system) is too complicated (or annoying) to use from scratch.
Or boring: some systems require boilerplate with no added value. It's normal to copy & paste from previous works.
Makefiles are a good example. Every makefile author must write their own functionally identical "clean" target. Shouldn't there be an implicit default?
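To make that concrete, here is the sort of near-identical `clean` target that gets carried from project to project. It is a sketch assuming GNU Make; `hello`, `SRCS` and `OBJS` are illustrative names only.

```makefile
# Hypothetical project: the names hello, SRCS and OBJS are illustrative.
TARGET := hello
SRCS   := $(wildcard *.c)
OBJS   := $(SRCS:.c=.o)

# Link step; the .c -> .o compilation comes from make's built-in rules.
$(TARGET): $(OBJS)
	$(CC) $(LDFLAGS) -o $@ $^ $(LDLIBS)

# The boilerplate in question: make has no implicit "clean",
# so each project spells out what is safe to delete.
.PHONY: clean
clean:
	$(RM) $(TARGET) $(OBJS)
```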
C is not immune, either. How many bits of interesting information do you spot in the following excerpt?
#include <stdio.h>
int main(int argc, char** argv)
{
    printf("Hello\n");
    return 0;
}
The printf alone is the real payload, the rest conveys no information. (Suggestion for compiler authors: since the programs that include stdio.h outnumber those that don't, wouldn't it be saner for a compiler to automatically do it for us, and accept a flag to not do it in those rare cases where we want to deviate?)
> since the programs that include stdio.h outnumber those that don't
I don't think that is true. There is a lot of embedded systems C out there, plus there are a lot of files in most projects, and include is per file not per project. The project might use stdio in a few files, and not use it in many others.
> Makefiles are a good example. Every makefile author must write their own functionally identical "clean" target. Shouldn't there be an implicit default?
At some point you have to give the system something to go on, and the part where it starts deleting files seems like a good one where not to guess.
It's plenty implicit in other places. You can for example, without a Makefile even, just do `make foo` and it will do its best to figure out how to do that. If there's a foo.c you'll get a `foo` executable from that with the default settings.
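The same built-in rules also work inside a Makefile. As a rough sketch (assuming GNU Make, with `foo.c` and `bar.c` as hypothetical source files), a file can consist of a couple of variables and one dependency line with no recipes at all:

```makefile
# Assumes GNU Make; foo.c and bar.c are hypothetical source files.
# These variables are picked up by the built-in compile and link rules.
CC     = cc
CFLAGS = -Wall -O2

# No recipe given: make links foo from foo.o and bar.o using its
# built-in rule, and builds each .o from the matching .c file.
foo: foo.o bar.o
```

So the guessing is already there for building; it just stops short of deleting things.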
> The printf alone is the real payload, the rest conveys no information.
What are you talking about? Every line is important.
#include <stdio.h>
This means you need IO in your program. C is a general purpose language, it shouldn't include that unless asked for. You could claim it should include stuff by default, but that would go completely against what C stands for. Code shouldn't have to depend on knowing which flags you need to use to compile successfully (at least not in general like this).
int main(int argc, char** argv)
Every program requires a main function. Scripting languages pretend they don't, but they just wrap all top-level code in one. Having that be explicit, again, is important for a low level language like C. By the way, the C standard lets you declare it in a simplified manner:
int main(void)
Let's ignore the braces as you could just place them on the same line.
printf("Hello\n");
You could just use `puts` here, but apart from that, yeah that's the main payload, cool.
return 0;
The C standard actually makes this line optional. Funny but I guess it addresses your complaint that "common stuff" perhaps should not be spelled out all the time?
So, here is the actual minimalist Hello world:
#include <stdio.h>
int main(void) {
    puts("Hello world");
}
Thank you, but this thread was not about writing good code, but rather how often one ends up uncritically copying existing "legacy" parts without even attempting to understand them.
I probably used the wrong words: "conveys no information" was meant as "is less meaningful than the printf". Just like switching on the PC every morning is essential, but if you ask me what my job's about, I wouldn't mention it.
In the same vein, I'm convinced that the printf is the part that expresses the goal of the program. The rest, the #include, the main(), even with the optimizations that you suggested, is just boilerplate, the part that is usually copied and pasted, not because it's not useful and not because it's too difficult to get right, as the original article claims, but because it's boring.
This also happens with tools you have to use but don’t get much payoff from—like internal tooling. At work, we have a shitty in-house feature flag service. It breaks all the time and is super finicky. Learning it properly doesn’t really help me, so I mostly copy and paste my way through it.
Another example is jq. I use it occasionally, and ChatGPT handles the syntax pretty well. For me, learning it properly just isn’t worth the time or effort.
Makefile syntax is also well understood by ChatGPT. If you want to know a suitable way for doing some task, ChatGPT can do it. It can also explain what another Makefile is doing.
Here's an example of a (similar) prompt I used recently: "Write me a makefile that executes a script inside a docker container. I want the script to be someprefix-<target-script> that calls /app/<target-script>.sh inside the container."
I don't have to care about Makefile syntax anymore for the most part.
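For reference, a hand-written answer to that prompt can be quite small. This is only one possible sketch, not what ChatGPT produced: it assumes GNU Make, an already running container, and a made-up container name.

```makefile
# Sketch only: the container name is an assumption, and the container
# is assumed to be running already.
CONTAINER ?= my-app

# "make someprefix-migrate" runs /app/migrate.sh inside the container;
# $* expands to whatever matched the % in the target name.
# No file named someprefix-... is ever created, so the recipe runs
# every time (unless a file with that exact name happens to exist).
someprefix-%:
	docker exec $(CONTAINER) /app/$*.sh
```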
> Another example is jq. I use it occasionally, and ChatGPT handles the syntax pretty well. For me, learning it properly just isn’t worth the time or effort.
This resonates with me, I was in exactly the same position when I needed to do something with `kubectl` JSON output - just ask ChatGPT because I couldn't be bothered to learn the unintuitive syntax.
Interestingly I _can_ blame the tool, because I started using Nushell[1] which has built-in JSON manipulation that provides a MUCH simpler syntax, and I have learnt this properly because it was that easy.
Nushell is awesome. Too bad the incompatibilities are still a bit too much for me to use it as a daily driver.
It’s easy to blame the tool, but sometimes the problem space is inherently complex. With limited time, building the right abstraction is an immensely difficult job. LLMs fixed this issue for me, and I’ve stopped complaining about unintuitive but ubiquitous tools.
This only happens because people treat build code at a lower standard than app code. IMO you should treat all code with the same rigour. From build scripts to app code to test code.
Why write hacks in build tools when you wouldn’t do so in your app code?
We build tool code with the same quality as the app code. That’s why most of the tooling we use is written in TypeScript: type safety, code reuse…
I would argue the main reason is that Make is just bad. There are easier to use alternatives such as scons or rake that don't have this effect applied to them.
Why do some tools have this problem, and others not?
I think it's convention over configuration. Makefile can do anything, so every project is different and needs different configurations, and everything must be configured. Which means that when I use a tool like that, it's sooo many decisions to make, that I just copy something that I know works.
If instead it was some sane defaults, it would be pretty apparent where it deviates. And instead of thinking of hundred things and which to choose, I either don't think about them, or think "do I have a reason to configure this instead of using defaults?"
Makefiles have an even more interesting issue: They lost their main purpose. In many, many projects that I've seen, they only consist of phony targets. No dependency tracking is used whatsoever.
How many Makefiles are there that just wrap npm, pip, or some other tool like that? A Makefile is supposed to be the build system, not trigger it.
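The difference between the two styles fits in a few lines. This is a sketch with hypothetical names: the npm scripts, `app.c` and `app.h` are placeholders.

```makefile
# Task-runner style: every target is phony, so make re-runs the
# command each time and tracks nothing. (The npm scripts are hypothetical.)
.PHONY: build test
build:
	npm run build
test:
	npm test

# Build-system style: app.o is rebuilt only when app.c or app.h is
# newer than it. (File names are illustrative.)
app.o: app.c app.h
	$(CC) $(CFLAGS) -c -o $@ $<
```

Only the last rule uses what make was designed for: comparing timestamps of targets and prerequisites.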
Okay but make is a shitty build system. What it does have going for it is you can nearly universally expect it to be already installed or easy to install. That makes it a good way to name commands shorter in a portable way, with some dependencies maybe thrown in.
It’s used for the same reason we write shell scripts
> It’s used for the same reason we write shell scripts
Only worse since it also uses $ for its variables leading to "thing:\n\t env FRED=$$HOME/thing some-command -p $$AWS_PROFILE $(OTHER_THING) -d $$(date +%s)" level of squinting
So for those using it as a task runner from the 70s, without dependency tracking, now it's just laziness over a shell script that has "dependencies" in an imperative and shellcheck-able way
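Spread over several lines, the two kinds of `$` look like this. It is a small sketch where `OUT_DIR` and the `show` target are invented purely for the demonstration.

```makefile
# Hypothetical variable and target, only to show the two kinds of "$".
OUT_DIR := build

.PHONY: show
show:
	@echo "make variable:  $(OUT_DIR)"
	@echo "shell variable: $$HOME"
	@echo "shell command:  $$(date +%s)"
```

A single `$` is expanded by make before the shell ever sees the line; `$$` is how a literal `$` gets through to the shell.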
calling this "Makefile" effect is a terrible disservice. one could as easily call it "PHP" effect, "YAML" effect, etc. pick whichever language you'd personally like to denigrate.
there is nothing that makes makefiles inherently more or less susceptible to this. if it's more common, it's because people don't want to take the time doing more solid engineering and clean design for something like a ci/cd config or a makefile, being viewed as ancillary or less important. and so they often don't want to learn the language, so monkey-see-monkey-do.
as sibling comments state, this is better called cargo cult or maybe copy-pasta. and i've seen it with any language: c, c++, python, scripts, config files, anything. i even see it in chatgpt answers because it's regurgitating someone else's copy-pasta.
The reason why it seems to apply to makefiles in particular is because most people think life is too short to bother learning and understanding makefiles so it seems to happen there more than anywhere else.
Also no matter how complicated and subtle you think your makefile is, true experts will tell you it's wrong and you instead copy their apparently over-engineered, hard to understand makefile
> Also no matter how complicated and subtle you think your [thing] is, true experts will tell you it's wrong and you instead copy their apparently over-engineered, hard to understand [thing]
not unique at all to makefiles, probably not even in the top ten [things] that "true" experts like to "help" with
> Does it need syntax of its own? As a corollary: can it reuse familiar syntax or idioms from other tools/CLIs?
I’m with the author here 100%. Stop inventing new syntaxes and formats for things that don’t need it. It’s not clever, it’s a PITA when it doesn’t work as expected at 3:30 on a Friday.
I see this effect in Java Maven pom.xml files. It's hard to get a straightforward answer on why each build step is needed, what each attribute means, what parts are optional or mandatory, etc. There seems to be a culture of copying these XML files and tweaking a few things without truly understanding what the whole file means. I briefly looked at Ant and Gradle, and their ecosystems don't look any better. The build configuration files seem to have too much unexplainable magic in them.
> I briefly looked at …Gradle… The build configuration files seem to have too much unexplainable magic in them.
This is largely due to the use of groovy. When the Kotlin DSL is used instead, it can usually be introspected by (eg) IntelliJ. Otherwise, it’s pretty opaque.
Unless you know this, there's zero way you will come up with this by typing `configure` and using just auto-completion. Might as well use Groovy and a String for the name of the thing you're configuring. Good tooling would be able to auto-complete from there whether it's Groovy or Kotlin (or Java etc).
That wasn’t my experience a few years ago with a large groovy-dsl project. Since groovy will take a look in several different namespaces to automatically resolve things in a script, editors I tried had no hope of telling me what anything was.
Also, groovy allows modification of private instance variables which leads to … un-fun situations. I converted tens of thousands of lines of groovy to Kotlin. A lot of those lines were automated. Too many were not automatable for myriad reasons.
As far as the magic in Kotlin, I can easily click through all keywords and jump to the implementation in IJ. Groovy (at the time and in the project I was in) was utterly hopeless in this regard.
Groovy closure delegates' type can be declared, giving as much information as with Kotlin. The reason you couldn't follow the code was that the people who wrote those things either didn't declare types, or IntelliJ wasn't using the type declarations (I believe Groovy support in Gradle files is less good than in general Groovy files, where the IDE does support this). You're correct that some plugins will resolve things dynamically and those cannot be resolved by the IDE. But that's not the fault of the language, if you're going to rewrite in Kotlin with types, you could just as well add types to your Groovy declarations for the same result.
Imo, the only solution is to avoid boilerplate generators and the parent poms projects like spring boot use for things like pom files: you can look at the boilerplate to get ideas for what might be necessary, but, if you’re starting a project, write the pom yourself. It’s a pain the first couple times, but it gets easier to know what you need.
Honestly for Java I really like Bazel. You should give it a shot. I have a project with a self contained jvm and jars from maven central. It's more explicit than the other options but way less magical IMO.
I guess this is an effect of declarative programming and layered abstractions. The declarative syntax and abstraction are an answer to code being repetitive and long and hard to follow, but this then creates its own issues by making it harder to reason (especially for beginners or occasional users) about what is actually going on. The price for learning how to get it right just becomes much higher with every layer of abstraction in between, because you always have to learn what's going on underneath the "cushions" anyway.
For me typical examples are Terraform configurations with their abstracted configuration syntax, which just mimics some other configuration (e.g. AWS) and executes it in an environment that I don't necessarily have access to. Of course I'm not going to run endless experiments by reading documentation, assembling my own config and running it in painfully slow CI pipelines until it works. I'd rather copy it from another project where it works and then go back to work on things that are actually relevant and specific for the business.
I end up doing the copy-paste thing quite a lot with build tools; it was very common in Ant, Maven and then in the Scala build tool. When your projects all have the same fundamental top-level layout and you are doing the same actions over and over, you solve the problem once, then you copy and paste it and remove the bits that don't apply.
With these types of tools there isn't much you do differently. They don't give you much in the way of abstractions; it's just a list of actions which are very similar between projects. Since with them you are typically working in declarations rather than the usual programming primitives, it often fundamentally comes down to "does my project need this build feature or not?".
Yeah, I've always been mystified by the idea that writing a new Makefile is some kind of wizardly mystery. Make has its design flaws, for sure, but how hard is it really to write this?
I haven't tested what I just typed above, but I'm reasonably sure that if I biffed it in a way that makes it nonfunctional, it will be obvious how to correct the problem.
I mean, not that you can't do better than that (I'm pretty sure anyone experienced can see some problems!), or that there aren't tricky and annoying tradeoffs, but it just doesn't seem like a big activation barrier the way people sometimes make it out to be?
Maybe those people just need to spend an afternoon once in their life working through a basic make tutorial? Maybe not the first time they work on a project using make, but, maybe, after the fifth or sixth project when they realize that this somewhat primitive inference engine is going to be something they interact with daily for years? At some point you're getting into "lead a horse to water" or "teach a man to fish" territory. There's a limit to how much you can empower someone who's sabotaging themself.
There's a slightly less minimal example in https://www.gnu.org/software/make/manual/html_node/Simple-Ma... with a full explanation. You can read it in a few minutes, but of course you have to experiment to actually learn it. The whole GNU Make 4.4.1 manual in PDF form is only 229 pages, so you can read it after dinner one night, or on your commute on the train over the course of a few days. And then you'll know the complete rules of the game.
Good news, you can change the output then! And for as much as you might not like its generated Makefiles, I assert that $(cmake -G Ninja) is 100,000,000x more "wtf" than -G Makefiles
I disagree about the autotools ones, I find them very sane although autotools itself can die in a rotting dumpster out back. And take m4 with it.
Ninja isn't really a reasonable comparison to make. Ninja is explicitly not designed to be human authored like Makefiles are. Ninja is designed to be an output from some build system generator like CMake and doesn't make affordances for humans as it's intended for machine generation and machine consumption so it can be _very_ fast (which it is).
Yeah, m4 is powerful in its way, and has an extraordinary strength-to-weight ratio (check out the implementation in Software Tools in Pascal) but ultimately I think it turned out to be a mistake. Make, by contrast, turned out to be a good idea despite its flaws.
Specifically, any generated makefile that refuses to take advantage of GNU make is necessarily going to be horrible.
BSD make is ... viable I guess, but only really worth it if you're already in the ecosystem - and even then I can't guarantee you won't hit one of its silly limitations.
Same with programming: You just copy some old code and modify it, if you have something lying around.
Same with frameworks (Angular, Spring Boot, ...). The tools even come with templates to generate new boilerplate for people who don't have existing ones somewhere.
A better name for this might be the JCL effect, as even experienced mainframe sysprogs copypasta the JCL it takes to build their COBOL programs from a known-good example and then mutatis the mutandis, rather than attempt to build a mental model of how JCL works from the impenetrable documentation and write JCL de novo.
It's no big deal to me to write a small Makefile from scratch. My editor (Emacs) even knows to always use tabs when I hit TAB in a Makefile, removing the confusion of whether I inserted tabs (correct) or spaces (horribly incorrect) on the lines with the commands to build a particular target.
> However, the occurrence of the Makefile effect in a simple application suggests that the tool is too complicated for that application.
The author's overall point is fine (specifically, that one should consider developer cut-and-paste behavior as an indicator of unnecessary complexity in a tool). However, when discussing the designer's perspective, I think the author should have taken a broader view of complexity.
Much of the complexity in Makefiles stems from their generality; essentially, the set of problems to which a Makefile can be a solution. Substantively reducing this complexity necessarily means eliminating some of those use cases. In the case of make, this is clearly possible. Make as a Unix tool has been around for a looong time, and one can look at the early versions for an idea of how simple it could be.
But the rub is, simplifying make doesn't necessarily reduce complexity. Once armed with a simpler, but more limited make, developers are now tasked not only with knowing the easier Makefile syntax, but also knowing when make isn't an appropriate solution, and when and how to use whatever tool exists to fill the gap. Compounding this is the fact that documentation and shared knowledge regarding which tool is appropriate for which problem is much harder to come by than documentation for the tool itself. This can easily lead to the tool choice equivalent of developer cut-and-paste behavior: "so-and-so uses build tool X so I must use it too", "if you're doing (general description of problem) the only build tool you ever need is Y", "I used Z before, so I'm just going to make it work again".
Essentially you can think of make as one "verb" in a sprawling and uncoordinated domain-specific language that targets building things. Developers need some level of proficiency across this language to succeed at their work. But trading complexity that must be mastered in one tool for complexity that must be mastered across tools can very easily increase overall complexity and promote its own kind of "Makefile Effect", just at a different level.
EDIT: Some might prefer the term "Cargo Culting" rather than "Makefile Effect" here. I suggest they are the same behavior just in different contexts.
I see this often in our codebase. It was mostly written by ex-C# developers who were new to writing Go, and there are many ham-handed C#-isms in there. At some point, someone took a guess at how something should be, then subsequent changes were done by copy-paste. Years down the road, another copy-paste job happens, and when I point out that the patterns within are not good (like, can actually be buggy), I get a confused response, because that is what was there.
There is an implicit assumption that the code written espouses best-practices, but that is far from the truth.
Happens to us at my day job too. The codebase is primarily C++. My most recent horror story is that I was stepping through some code in an object that was statically initialized and a variable that was initialized as `static const double foo = nan;` had a value of 0 in it. This was very surprising to me.
I look at how we defined nan and it turns out that nan is a singleton that was initialized in a DLL somewhere. My variable was being initialized before the singleton nan was initialized. I asked around, and someone with access to the previous version control system (we migrated to git in 2016) discovered that this was part of the original commit to that VCS back sometime in 2003-2006 or something. We think that was probably from before our C++ compiler was updated to support C++98 and `numeric_limits` was added.
So of course I moved this over so that accessing our special nan singleton is just a static constexpr call to `std::numeric_limits<double>::quiet_NaN()`. But our special nan singleton is used all over the place in our codebase. So of course I have to check to see nobody's doing something weird with it.
Of course they are.
There are about a hundred lines of code that boil down to `if (foo == special_nan_singleton) { /* ...handle nan */ }` which of course...isn't how nan works. This is a no-op and the compiler just straight up compiles it out of the binary. This happens a lot. Including fundamental infrastructure, like the special JSON serializer somebody reinvented.
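The same IEEE 754 rule is easy to check outside C++. A minimal Python sketch, used here only for illustration; the `is_missing_*` helper names are hypothetical and not taken from the codebase described above:

```python
import math

nan = float("nan")

def is_missing_broken(x: float) -> bool:
    # Mirrors `if (foo == special_nan_singleton)`: NaN compares unequal
    # to everything, including itself, so this never returns True.
    return x == nan

def is_missing_fixed(x: float) -> bool:
    # A dedicated predicate is the reliable way to detect NaN.
    return math.isnan(x)

print(is_missing_broken(nan))  # False
print(is_missing_fixed(nan))   # True
```

In C++ the corresponding predicate is `std::isnan` from `<cmath>`.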
IDK I feel like Go suffers from this a lot. I have seen a lot of Gava, Guby, and G# over the last few years. It happens in Python a lot as well. Some people just love to write Java in Python and the new type hints make it even easier.
Apart from not knowing how (or being unable) to start from scratch, this is about frequent vs. infrequent use. The latter also means not-that-important-in-the-overall-landscape, brings forgetting, and gets perceived as lower "ROI", so one (hopefully) finds that-last-working thing and copy-pastes it into.. whatever Frankenstein.
So, it's the tool's fault that the user chose it, and it's the tool's fault the user never learned how it works?
This is like taking a hike up a rocky hill because the trailhead had a smooth path, later tripping over a rock, and then blaming the rock.
I'd redefine the Makefile (or YAML, Bash, etc) effect as:
Tools that are easy enough that people try to use them
without learning how they work first, and hard enough that
people later blame the tool when they crash into their
own ignorance.
> Complex tools are a necessity; they can’t always be avoided. However, the occurrence of the Makefile effect in a simple application suggests that the tool is too complicated for that application.
This footnote actually made me think about IDEs and the JS toolchain even more than makefiles.
If I'm writing a small project (say, 10 code files) surely an IDE where most people only know how to use 4 of the 1000 buttons is overkill, and I'd use a makefile.
Similarly surely 10 code files with 10 config dotfiles to set up a JS environment and tooling for dependencies, versioning, linting, transpiling, etc is overkill too.
- Basic javac/gcc/swiftc/whatever commands are simple, even if they can scale up through every niche via configuration options.
- Basic makefiles are simple, even if they can scale up to something like the xnu makefile tree (the most complex make system I've encountered).
- Let's not talk about JS.
I'm hesitant to use the word "lazy" to describe people who do what the author is describing - not just because I sometimes do it myself but because I believe that laziness is a derivative observation of time constraint, executive function exhaustion, and other factors. It also reminds me of the classic "I'm going to learn X, which handles/wraps Y, so that I can avoid learning Y", which is generally a bad pattern of motivation.
At its core this feels like a failure to understand (or failure of others to teach) fundamentals / first principles of the tools being used.
> Think about CI/CD setups, where users diagnose their copy-pasted CI/CD by doing print-style debugging over the network with a layer of intermediating VM orchestration. Ridiculous!
I don't think the author understands the point of "CI/CD systems". And I don't really blame them, because workload orchestration systems have been abused and marketed to the point where we call them CI/CD systems instead. Sure, if you think the point of CI/CD is to just provide a unified build and deploy it somewhere, you can write that in whatever language you like, and not need to know a bunch of YAML-fu.
But the whole point of workload orchestration systems is to configure the flow of workloads between machines as they inherently execute on different machines. The status quo is to debug over the network because, fundamentally, different machines will be connected by a network and the workload orchestration system is figuring out which machine to put it on.
If you think you can just run your CI/CD on a single server without networking or virtualization, I have some very large, parallelized testing suites to show you.
> If you think you can just run your CI/CD on a single server without networking or virtualization, I have some very large, parallelized testing suites to show you.
Nowadays you can get a single server with 256 cores and several terabytes of memory. I would be interested to learn what kind of testing suites have actual needs beyond that.
Without virtualization though is definitely no problem. The whole docker/k8s/whatever shtick is mainly because devs think it's more fun to invent new things than to learn how to use old ones properly. At least as long as you're running your own code on your own hardware, there is not a single thing solved by virtualization that wouldn't be solved equally well (or better) with traditional tools like environment modules and Slurm.
For a start, any suite that takes >X hours on a single node, especially compounded if you have a large team of developers.
> At least as long as you're running your own code on your own hardware
Assuming you keep a consistent env/OS across all nodes you will want to run said code. Which can be difficult, even just between two users on a single node.
Not to mention the fact that a lot of (most?) code needs to (A) interoperate with other people's code and (B) at least sometimes run on other hardware.
> For a start, any suite that takes >X hours on a single node, especially compounded if you have a large team of developers.
If your testing suite takes several hours to run on a 256 core server, and this is something you want to run on every commit by every dev, then you have a problem with your testing suite. Running it on k8s is just slapping a bandaid on it.
> Assuming you keep a consistent env/OS across all nodes you will want to run said code. Which can be difficult, even just between two users on a single node.
> Not to mention the fact that a lot of (most?) code needs to (A) interoperate with other people's code and (B) at least sometimes run on other hardware.
Yes, this is the problem that has been solved since the 90s by environment modules. This is how clusters and supercomputers work. There is no virtualization, just tcl and shell scripts.
Clusters and supercomputers are able to fully control the hardware & OS. Environment modules are useful in that setting but do not solve OS differences, not to mention that they themselves need to be installed/maintained. A lot of the time developers do not work on the same cluster as you (say, for any open-source code) and environment differences can be significant.
I regularly use large clusters and even there docker/singularity are very helpful at times (very simple example, glibc requirements).
When you are talking about well-written code that only requires posix (and nothing beyond) and does not interface with hardware, etc. etc. then virtualization seems crazy.
And you can get desktop workstations with similarly high core counts and RAM. You're missing the point. Your strategy can either depend on whether or not you can afford to buy out larger and larger vertically scaled servers, or you can plan for horizontal scaling. Almost nobody is willing to sign off on the vertical scaling strategy because the sheer presence of a ceiling frightens executives.
And yes, in enterprise, two things are usually at play: you need to test a system where the architecture includes the combined architecture of multiple corporate acquisitions, more than one of which were Vertical Monsters, and more than one of which presumed horizontal scaling; and where deployment scripts must be run from behind a no-ingress-permitted firewall, which means having workload orchestration runners installed behind that firewall.
> the tool (or system) is too complicated (or annoying) to use from scratch
Unless you're doing trivial things, any tool or system will require some setup (which people call "ceremony").
Tools and systems can be easy to use from scratch only if they are either super-specialized or they impose significant constraints on what you will be doing with that tool and how you will be doing it.
Such tools are usually very tightly coupled to a specific job/environment/task and are hard to keep around, keep updated and to evolve.
Make and similar tools instead are generic and can be adapted. The fact that you can reuse the previous work done is actually a feature. You can dive as deep as you want or need. They're very widely used so it's not an issue to keep updated. They're so widespread you can find people already familiar with those tools. Learning such tools is a great investment because you can keep using them over and over across projects and companies.
Some of those tools are either timeless (gnu make) or have a very long life (more than a decade, which is very long for this industry).
Anecdotal example: I learned a bit of apache ant while in high school because my laptop at the time (a netbook with 1GB ram and an atom processor) could not run NetBeans decently, so I had to learn a bit of apache ant and resort to writing and maintaining my own build.xml file. Fast forward 14 years and I see a build.xml file in the $FAANG codebase I was working on. That learning did pay off beautifully many years later.
The article is shortsighted, if anything it's promoting a shallow way of working. You are supposed to learn about the tools you use.
The tacit assumption of the OP is that it is better to do something else. That is, to start with some sort of first principles and create from scratch the artifacts that your project needs. It is telling that his example is a build artifact - these tend to change infrequently. The only way for one human mind to truly "dwell" in the space of project builds is to maintain many of them at once, as many as it takes to fill your days with nothing but build concerns.
"Tools that enable this pattern are harder to use securely..." Harder than what? A totally custom build? One made from first principles? I would, in fact, argue the exact opposite. And in fact I would argue that copy-paste-modify serves practitioners very well, especially when it comes to on-boarding. If you disagree, do the gedankenexperiment where you imagine joining a team with a totally custom build versus one with a lightly edited common make file. Which experience would you prefer, all else being equal?
Copy+tweak happens IRL all the time. There's no reason everyone who bakes should have to reinvent biscuits from scratch. There's no reason chip manufacturers should have to reinvent N-type P-type sandwiches from scratch. The existence of adaptations of previous success does not suggest that baking, or physics, or Make, is overly complicated.
Make has to be one of the more unfairly maligned languages out there. Most “replacements” purport to solve problems make doesn’t have, and are strictly worse than make at what they do.
Anyway, the GNU make manual is a good read for anyone that needs to edit a makefile or design a project build. So is “recursive make considered harmful”.
I think this is completely normal for tools that you seldom program. I write makefiles a couple of times a year. I've been using make for more than 40 years now and I use it every day, but I seldom program it, and when I want something more than simple dependencies I often clone something that already works.
amazon's internal build tool experiences this same phenomenon. engineers are hired based on their leetcode ability, which means the average engineer has gaps in their infrastructure and config tool knowledge/skillset. until the industry's hiring practices shift, this trend will continue.
As an undergrad, I did group projects with people who quite literally could not compile and run any actual project on their system outside of a pre-packaged classwork assignment, who essentially could not code at all outside of data structure and algorithm problem sets, who got Google internships the next semester.
But they were definitely brighter than I when it came to such problem sets. I suppose we need both sorts of engineer to make great things
A design philosophy called "Progressive Disclosure" tries to tackle this problem, where a tool is supposed to present a minimal set of functionality initially to allow a user to be productive without being an expert and progressively "reveal" more complex features as the user becomes more familiar with the tool and attempts to do more complex things.
I've heard the programming language Swift followed this philosophy during development, though I've never written any Swift code to know how well it worked out.
Honestly, my .zshrc file started out as a .kshrc file that was passed down to me by an older developer about 20 years ago, when I was still in university. I've added and removed a lot of things over the years, but there are still a few parts of it that I don't totally understand, simply because they work and I've never had a reason to think about them. The guy I got it from, in turn, got his from someone else.
In the old days, I had a .fvwm2rc config file that I got from my boss in the university computing center. I had no idea how it worked! And neither did he -- he got it from a professor when he was in university.
This is pretty thought provoking. I think the issue is "80% of the use of this complicated tool is for very simple ends". From there you get a lot of "I can't be bothered to learn git/make/sed/helm/jenkins, all I'm doing is X 15 minutes a year". My guess is SWEs hate ceilings, so we don't want to use tools that have them, even though they'd be far more fit for purpose. We also don't want to build tools with ceilings: why limit your potential userbase/market?
This is how I feel about systemd unit files for things that I used to use crontab for. They aren't even particularly complicated. But editing cron files was self-explanatory. I learned it once and I did not need to look it up ever. Whereas, systemd unit files, I still have to look up every single time. There's something wrong with that. They are of course very much superior in many ways... but not all.
To me it seems fine that a tool that is both complex and versatile needs a config file that is beyond memorization. So I think this line of reasoning has limitations.
I could see it with say CLI tools though. Like if I need to reference my notes for a CLI command then that may well indicate a failure in tool design.
>repeatedly copy a known-good solution and accrete changes over time.
Alternative phrasing would be that it evolves. Arguably there is a positive trajectory there
On the other hand, there are cases where (beneficial/desired) verbosity prompts copy-paste and tweaking - not due to complexity but from some form of scale or size of the input.
In many cases this is a sign of something that should be dynamic data (put it in a db instead of conf) but that's not always the case and worth the tradeoff in the moment.
Old IBM mainframe scripting in JCL https://en.wikipedia.org/wiki/Job_Control_Language (so "OS JCL" now, I suppose) used to have a terrible reputation for this, but I've never actually touched the stuff myself.
> However, at the point of design, this suggests a tool design (or tool application) that is flawed: the tool (or system) is too complicated (or annoying) to use from scratch.
As someone who teaches and sees college-level students ask chatgpt what's 1 + 1, I disagree that it has anything to do with complexity or annoyance.
I do write makefiles de novo (including in corporate settings). But I start "backwards" with the "clean" and "distclean" targets, then get a single basic debug build target working. From there, I find it relatively easy to expand to larger and more complex operations. Brick by brick.
This happens to me all the time with bazel. It is too complicated and the documentation sucks so I just look for prior art and copy paste that. Sometimes I have to do something novel and it takes me several days of deep diving the bazel source code to figure out how to do something.
The author mentions that copy-pasting code by itself is not a bad thing. The problem with the phenomenon they describe is that people copy-paste files around because they _don't understand them_, and end up with stuff that works, but is inefficient and hard to debug.
The traditional Unix man page or list of options output with --help is often a firehose of details that most devs will never use. Sometimes there are a few examples shown of common use cases which is a good place to start learning the tool.
Sure, but IME even when the tool in question is incredibly well-documented (like Django, or some other popular library), and has plenty of examples, most still don’t read the docs.
I don’t know how to deal with that mentality. I don’t mind showing someone how I came to an answer, but I also expect them to remember that for the next time, and do some searching of their own.
We recycle known good stuff to avoid getting bogged down and introducing fresh flaws.
The admonition to know what we're doing and act deliberately applies to so much in life, but flies in the face of Milton Friedman's point in "I, Pencil" => https://youtu.be/67tHtpac5ws?si=yhheE1Y5ELfjWXs-
Yeah I'm just not wasting my life (or professional time) learning Groovy, Maven, DotNET project files, DotNET 4.8, Gradle, Azure DevOps, Grafana, Prometheus, Docker, Docker compose, Kubernetes, Jenkins, et al.
I need those things once at project setup. I copy-paste and change a bit.
Why copy-paste? It's a proven strategy with a high success rate despite little effort behind it.
I also don't want to learn templating for every little DSL I need to write one file per project with.
But if you love doing it "the right way", you're welcome to do that work for me.
Any long-lived project's build will have to be updated/improved at various times throughout its lifetime so there needs to be somebody around who truly understands the build.
The "$@" is the output (or target, think of @ as a bullseye on a target), and the "$<" is the input (think redirection). The only other commonly used variable is "$^", which expands to all of the prerequisites.
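To make the three variables concrete, here is a toy Python sketch of how a single recipe line gets expanded. `expand_recipe` is a hypothetical helper, not part of make, and it ignores everything else make does (`$$` escaping, pattern rules, order-only prerequisites). It treats `$<` as the first prerequisite and `$^` as all prerequisites:

```python
def expand_recipe(recipe: str, target: str, prerequisites: list[str]) -> str:
    """Expand $@, $< and $^ in one recipe line (heavily simplified)."""
    first = prerequisites[0] if prerequisites else ""
    # GNU make lists each prerequisite only once in $^; keep first occurrences.
    unique = list(dict.fromkeys(prerequisites))
    return (
        recipe.replace("$@", target)
        .replace("$<", first)
        .replace("$^", " ".join(unique))
    )

# Compile step: one source in, one object out.
print(expand_recipe("cc -c -o $@ $<", "foo.o", ["foo.c", "foo.h"]))
# cc -c -o foo.o foo.c

# Link step: every prerequisite goes on the command line.
print(expand_recipe("cc -o $@ $^", "prog", ["main.o", "util.o"]))
# cc -o prog main.o util.o
```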
For C and C++* projects under ~100k lines I wouldn't bother with incremental builds - I have a 70k C project with a single translation unit that builds in under 1s on my machine.
* C++ requires some discipline to not explode build times, but it can be done if you don't go nuts with templates and standard headers.
i think you got the wrong eponymous law, pournelle's iron law of bureaucracy (which i see happening all the time, btw.) has nothing to do with this issue.
Part of the low-code/no-code story is that conventional programming requires programmers to not just decide what has to be done but in what order to put those things. (This is connected with parallelism because if tasks are done in a particular order you can't do more than one at a time.)
An Excel spreadsheet is different from a FORTRAN program, for instance, because you can write a bunch of formulas and Excel updates the results automatically without you sequencing things.
Topological sorting is an easy approach to finding a valid order to do tasks in. It's used frequently in build systems but rarely in other software, so it contributes to build systems seeming to be from another planet.
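A valid order of this kind can be computed with a topological sort, and Python's standard library ships one in `graphlib` (3.9+). A minimal sketch; the dependency graph below is made up for illustration:

```python
from graphlib import TopologicalSorter

# Each key maps a target to the set of things it depends on.
deps = {
    "app": {"main.o", "util.o"},
    "main.o": {"main.c", "util.h"},
    "util.o": {"util.c", "util.h"},
}

# static_order() yields every node after all of its dependencies.
order = list(TopologicalSorter(deps).static_order())
print(order)
```

Any order it prints puts the sources before the object files and the object files before `app`; nodes that don't depend on each other (here `main.o` and `util.o`) are the ones a build tool is free to run in parallel.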
---
I work in Java a lot and I used to hate Maven, because, if you look at Maven as "an XML vocabulary" you're going to frequently find "you can't get there from here" and looking for answers on Stack Overflow is going to dig you in deeper.
The right way to think about Maven is that, like (part of) Spring, it is a system for writing XML files that configure a group of Java objects. Seen that way, writing a Maven plugin should be a second resort; the moment you're scratching your head wondering if it's possible to do something, you should (1) make sure you can't "go with the flow" and follow conventions, then (2) write a Maven plugin.
The thing is, a Maven plugin is just an ordinary Java class. You're a Java programmer, you know how to do things in Java. All the skills you use every day apply; you're no longer in a weird, mysterious and limited environment. That's part of the "makefile problem": you probably build your code (edit files in Java, C, whatever) 1000s of times for every time you change something about your build system. On a team you can be very productive if you know the main language but have no idea how the build works (if the build works).
When you try this, though, you often run into political problems. In most places, for instance, only a few people on the team have the authority to create new Maven projects (a plug-in is a class defined in its own project). Maybe that makes sense because you don't want them breeding like rabbits, but generally most systems are badly partitioned as it is, and I think many programmers wouldn't want to have the fight it would take to create a new project.
People are accustomed to builds being SNAFU and FUBAR.
When I first saw Maven I was baffled that, as a remote worker on a system that had about 20 programmers and about 20 projects, I couldn't build the system successfully at all. The build worked maybe 70% of the time at the office and people didn't seem to worry about it. I could live with that because they were building large complex systems that they were always throwing away and I was building simpler spike prototypes that worked.
I worked at a number of places where builds were unreliable in ways that seemed quantitative rather than qualitative. Eventually I realized the problem was really simple: if you were using snapshot builds, a lot of people were working on a project, and you didn't have any transaction control, you would often get a set of snapshots that were not compatible with each other.
Most teams don't take builds seriously enough. I've found it rare to meet an engineering manager who can answer the question "how long does the build take?" and fantasize of going to a job interview, asking that question, and if I don't get an answer, standing up and walking out.
For many projects I've worked on I've built "meta-build systems" where the assumption is that you have 12 Maven projects and 4 npm projects or something like that (aren't most of us using mixed languages in the React age? Why isn't this treated as a first-class problem?), and such a system can be smart about what gets built and what doesn't, what is running out of snapshots and what can have a fixed version, etc. It merges in changes from develop automatically, and if seven steps are required that take a total of 15 minutes I ought to be able to think about something else entirely for 15 minutes and hear a "beep" when it's done.
Trouble is we don't take builds seriously enough: it's a technical problem and it's a political problem and we often don't face the technical problems because of the political problems.
Is it not a problem which is basically COMPLETELY SOLVED by LLMs?
The reason this happens is that Makefiles (or CI/CD pipelines, linter configs, bash scripts) are more or less "complete languages" on their own, which are not worth learning when you can do ... exactly what the author says (copy/pasting/modifying until it works) 99% of the time.
But LLMs in general know the language so if you ask "write a minimal Makefile that does this" or even "please simplify the Makefile that i copy/pasted/modified", my experience is that they do that very well actually.
Completely solved? I'd say exacerbated beyond recognition. We have tools to let us get by so much farther without understanding anything, so it probably becomes less of a problem in more cases. But it basically guarantees that all but the most curious will not understand how the system actually works. Everything becomes magical copy/pasting from the most advanced information retrieval system with LLMs.
But an LLM is literally a "person in the room" that actually knows how it works.
The simplification and explanation abilities of ChatGPT are off the charts in precisely these cases. I honestly don't understand why I'm being downvoted.
You wanted a simpler make file before? Go spend 2 days learning the make syntax.
You want a simpler make file now? Just ask ChatGPT for it, along with an explanation of all the concepts, and you'll get it.
Imagine if copy-pasting LLM output is the valuable part of human participation in software development. The act of copy-paste sounds like drudgery, not problem solving. Another user phrased it best:
– John Gall (1975) Systemantics: How Systems Really Work and How They Fail
https://en.wikipedia.org/wiki/John_Gall_(author)#Gall's_law