Fake it until you automate it (understandlegacycode.com)
293 points by RebootStr on Jan 9, 2023 | hide | past | favorite | 49 comments

The thing about automation is that it is really hard to automate workflows completely while getting everything right. One approach I like is the "Pareto automation"[0]. Normally, you can automate the most fragile or tedious parts of a workflow with little effort and keep doing everything else manually. Also, sometimes you don't even need automation, you just need a checklist.

[0]: https://ricardoanderegg.com/posts/musings-thoughts-software-...

My favorite trick is simple.

I don't "fully" automate. I partially automate, then build my workflows from these composite parts. Composition.

For a start, I file all my bash scripts/notes/snippets in Vimwiki. Zettelkasten is truly brilliant.

Then... I write almost everything I do INTO Vimwiki (git is for programs, Vimwiki is for snippets; Vimwiki is managed with Chezmoi). Hence, whenever I need to do... anything... I can build that workflow out of past snippets from the Vimwiki filing system.

Over time, my workplace and home workflows get automated. You know a workflow is automated once you find a "set" composition that you use for two weeks straight without variation.

I only bundle "set" workflows, which haven't changed, into discrete scripts.

Checklists may sound dumb and boring, but they are actually great! They make it easy to just do the job and not think much about its structure, somehow reducing the overall tediousness. Checklists, together with snippets in some sort of knowledge base (I simply use OBTF.txt with folds), help with processes that are too hard to automate quickly and/or completely. This also enables gradual automation and knowledge collection rather than living in an all-or-nothing world. I also find it very useful when your knowledge base grows without being shaped for and nailed down to a specific project with its own quirks. It becomes your own well-commented stdlib for doing things.

I also love the idea of having the hard-to-automate stuff encoded in checklists (or creating checklists as a first step in formalizing a process). My biggest issue was that every time I looked, there weren't tools around that would support this. If there were a web-based, multi-user tool (ideally open source) that let you define checklists and then create multiple "instantiations" of them to produce some form of paper trail of the execution, that would be great!

> Normally, you can automate the most fragile or tedious parts of a workflow with little effort and keep doing everything else manually. Also, sometimes you don't even need automation, you just need a checklist.

That’s what the post’s approach gives you: a checklist you run in order, with some steps automated.

This seems like the same basic concept as https://blog.danslimmon.com/2019/07/15/do-nothing-scripting-...

I remember when that article first made the rounds. I had been doing something like that for a few years already and the article did a good job making all the same points I would have made.

In hindsight, the biggest challenge that I've found is getting your "do-nothing script" to become the authoritative source of how to do something. The fact that the steps themselves are still manual means that it's easy for people to create their own manual side paths to cover new situations and it's hard to get those added back into the script.

If your CI/CD server is running all the steps then the only way to make something happen is to get it into the build script. If there are humans in the loop then there will always be new things that come up, people who do them in slightly different ways, and stuff that doesn't get committed back into the scripts.

My interpretation was always that the do-nothing script was just a starting point - once you have it, you can gradually start automating the steps.

Once the script actually automates a few of the more tedious steps, people are much more likely to use it. And when they use it, they're likely to help maintain it when new situations need it.

Eventually, you may end up with a script that does automate it all, and can be put in CI/CD. (Hey, I can dream, can't I?)
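For anyone who hasn't seen the pattern, here's a minimal sketch of a do-nothing script in Python. The step names and the release-branch wording are hypothetical; the point is the structure: every step is code, but most steps just print instructions and wait, and automating a step means replacing its body while the checklist skeleton stays the same.

```python
# Minimal "do-nothing script" sketch. Manual steps print instructions
# and pause; automated steps do the work themselves. The surrounding
# runner stays the same as steps get automated one by one.

def wait_for_enter():
    input("Press Enter when done...")

def create_release_branch(context):
    # Still manual: the script only tells the operator what to do.
    print(f"Create a branch named release-{context['version']} and push it.")
    wait_for_enter()

def run_tests(context):
    # Already automated: this step does the work instead of describing it.
    print("Running the test suite... (placeholder for a real command)")

def run_procedure(steps, context):
    done = []
    for step in steps:
        step(context)
        done.append(step.__name__)
    return done  # which steps ran, handy for logging/auditing

if __name__ == "__main__":
    run_procedure([create_release_branch, run_tests], {"version": "1.2.3"})
```

The executable-script framing is what makes the later comments about gradual automation work: swapping a `wait_for_enter()` body for a `subprocess.run(...)` call doesn't change how anyone invokes the procedure.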

Yes, I agree 100%. But until you get to the point where it's fully automated and running through CI/CD then you're constantly fighting new manual steps and ways to do things.

> And when they use it, they're likely to help maintain it when new situations need it.

Unfortunately, "likely" means "not always", which turns into something that is no longer a real source of truth, and you're back to people having their own undocumented way of doing things.

I've also found that it's easier to bikeshed when you're printing out commands for people to run. Either because they see the command more clearly or they have different dependencies installed.

It's still better than nothing, don't get me wrong. It's just become clear to me that if you don't build your "do-nothing script" with a real plan to get it automated (i.e. not just a dream that "we'll automate this one day") then it will quickly rot.

In general I find this concept pretty helpful. I make functions in my bash profile with TODO statements for things I have yet to completely automate but still want an easy way to look up.

My only issue is that the deploy is probably the worst thing to leave manual. Having worked at a company that had a rather large and ever-changing list of steps to complete in order to deploy (which took 1-2 hours), it's something I find myself prioritizing now.

Prior discussion of that link: https://news.ycombinator.com/item?id=29083367

I personally like the name "checklist script" (also "checkscript") for that, which I found in the comments at the time. I think it easily conveys the two purposes of the thing (a checklist gradually becoming a script).

The author of this blog also has published a library for people looking for a more advanced version of that script: https://github.com/danslimmon/donothing

This is bottom-up automation for when top-down automation can't figure out what is really needed for a deployment. Once your weird CLI deploy thing is complete, it can be handed to a proper automation software team as a very clear spec.

But one side note: if you have sysadmin skills, can write some shell scripts, and have used more than one distro, you are effectively a Docker expert. Configuring a docker image is just sysadmin 101 with little bitty shell script fragments.

One thing I noticed is that knowing how things can be automated causes teams to actually get slower at getting stuff done.

First, because automation takes some upfront effort to get done.

Second, because once it is set up it frequently acts to prevent change.

I find it is useful to consider alternative solutions to the problem like a checklist (which I call a program that runs on people) which allows you to set up a process in minutes (+ potentially a meeting with your team) and then get back to the project. Once the project has been running for some time you will be in a better position to automate it. First, you will already have some experience with running the process (with the checklist). And second you can plan the task of automating at lower priority and without it becoming a blocker in your more important project.

Yes, it is considered technical debt. Not all debt is bad.

I was doing this a year ago. Now I'm redoing it. In my case, what I missed is that if the app is complex enough you should treat the deployer like a separate product.

Different teams might want to use it for different things, deploy different versions with it, etc. So it should be in a separate repo with its own tests and its own version string. Otherwise iterating on it is hell because you're doing so in a space that's ergonomic for app development, not deployer-tool development.

That way when your deploy target changes (goodbye helm, hello custom k8s operator), your users can keep using the stable deployer version while you work on the new one.

Following that, switching between the old way and the new way becomes an atomic action, which makes it much easier to triage issues of the "it worked with the old way but breaks with the new way" sort.

Or at least... I hope that's what I missed. Because if not I'll be reredoing it next year.

That is just the next iteration on the "fake it until you automate it" path.

And it can actually be a mistake to try to jump from zero directly to fully automated nirvana.

Agreed. I tried to jump too early. More faking it for longer would've been faster in the long run.

> So it should be in a separate repo

If it's in the same repo, that means all of the features of the deployer can match the deployed code automatically - the deployer is made for and tested for that version.

With different repos, you add complexity. Now you have to support deploying multiple versions so there are more code paths and there's version checking etc. And more complexity also means you need more tests.

Having everything together in the same repo is great if

- you don't care about tests that span multiple versions of the app (e.g. patterns of usage around database upgrades)

- you don't care about git bisect sessions that evaluate a new test against old versions of the app

- you never want to try the new app on the old infra or the old app on the new infra

- you're not worried about tests written by people not on your team whose failures indicate scenarios that don't require action on your part (but might require action on theirs).

if it's not in the same repo, it means that the deployer can be updated and rolled out independently, and scale with its own resources.

if it's in the same repo, you add complexity -- now I have to ensure deployments of two unrelated code bases don't collide, there are more cross-team conflicts, and I have to clone down a 200gb repo.

Right, and not just one 200gb repo. If I want to do a git bisect of the app while holding the deployer constant I have to wrangle at least two of them.

For simple command-based automations, I've found just [0] to be a good replacement for makefiles with .PHONY targets. It has a cleaner syntax and fewer pitfalls compared to makefiles.

[0] https://github.com/casey/just
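For a feel of the difference, here is a hypothetical justfile (recipe names and the wrapped commands are illustrative): every recipe is "phony" by default, so there's no `.PHONY` bookkeeping, and recipes can take arguments natively.

```just
# deploy to the staging project
deploy-staging:
    npx firebase-tools deploy --only functions -P staging

# run the test suite, forwarding any extra arguments
test *ARGS:
    pytest {{ARGS}}
```

Invoked as `just deploy-staging` or `just test -k smoke`, much like the make targets discussed elsewhere in this thread.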

This really does look great. It makes a lot of the common things I want to do with a Makefile really easy.

The big problem I find is that changing process on a big team is hard. One of the great things about make is that it's usually installed on most developers' machines, which reduces the friction when trying to get a team on board.

Looks pretty cool, thanks for sharing!

And if you're a shop that does your automation with Python you might want to look at https://www.pyinvoke.org

Others in comments have (rightly) called out the similarity to "Do Nothing Scripting", so I'll instead post a broader (but complementary) article on automation, perhaps my favourite technical "position paper" of all time - "Manual Work Is A Bug": https://queue.acm.org/detail.cfm?id=3197520

Try including the IT support cost of the collateral damage unskilled workers can do with administrative roles. You will end up supporting hundreds if not thousands of operational features.

I am not suggesting you should give a toss, but at least admit what happens before you inflict the decision on your team.

Have a wonderful day =)

God, this is good advice. Living this hell at the moment. Just a complete "throw every feature at the wall" environment where nothing is documented and it's an infinitely expanding Kessler Syndrome-like failure of "automation".

Backpressure is required in order for a distributed system to continue to run successfully. Half the time the people who think they need a new feature just want to be unique.

The critical observation is that these people won't always be around or have this deployment plan fresh in their minds. You need most of your deployments to follow the same kinds of rules, so that when you look at a misbehaving service that you haven't had to deal with in ages, you can figure out what's going on and why it's wedged. Every new thing it can do is tribal knowledge that's not captured anywhere. It might have already walked out the door in fact.

I think the assumption is that the developers already manage deploying as well, but it's manual.

looks like "Do-Nothing Scripting: the key to gradual automation" https://news.ycombinator.com/item?id=29083367

Please don't do this on a $0 MRR project. Wait until it's a viable company before spending time automating everything.

They're talking about legacy software there. In my experience, on modern projects it takes from minutes to a few hours to make a push-to-deploy hook, e.g. using GitHub Actions/Travis/Jenkins.

If you're very early, you want to automate enough to be able to iterate quickly.

That's not the same as automating "everything", of course.

So it's like uber and self-driving cars?

Wow, please don't ship that. A whole suite of CLI tools for a custom deployment pipeline is my worst nightmare if something goes wrong. The purported benefits:

> It gets people into the habit of running a single command to initiate deployments

npx firebase-tools deploy --only functions -P staging

is also one command

> It’s a more obvious source of truth for putting the deployment instructions

If that command were put into CI/CD, that would be an even more authoritative source of truth.

> Because it’s already an executable script, it makes it easier for us to automate some of the steps listed

The executable script can't be run by CI/CD and is another potential source of bugs/mistakes.

The real "fake it until you automate it" for deployment would've been ssh'ing into the server, running git pull, and restarting the server. That's no longer possible with serverless nowadays, but this seems to go from taking something that's relatively straightforward to something that's very complex.

> npx firebase-tools deploy --only functions -P staging

Better yet:

    make deploy-staging
Even when it's just wrapping something else, I strongly prefer to use make(1) as the universal all-in-one command.
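A sketch of that "universal entrypoint" style of Makefile (the targets and the wrapped commands are illustrative, taken from examples elsewhere in this thread): every target is declared `.PHONY` because it names a task, not a file make should build.

```make
.PHONY: deploy-staging deploy-prod test

deploy-staging:
	npx firebase-tools deploy --only functions -P staging

deploy-prod:
	npx firebase-tools deploy --only functions -P production

test:
	pytest
```

The value is uniformity: whatever the underlying tool, the team types `make <task>`.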

I don't know if that pattern has a name, but I really love that on my team if there is a script it should be runnable through the Makefile. I know some people are very good at remembering commands, but I'm not, and being able to do `make docker`, `make init`, `make eval`, etc... really simplifies my workflow.

There can definitely be some downsides to this approach. For example, let me see what the CI script does to deploy. "make deploy", ok, let me check out the Makefile. Ok, that calls some python module "deploy.py". I tend to get frustrated with all the indirection and would prefer to just see the deploy steps in the CI script itself.

I mean, that example is only frustrating because `make deploy` and `python deploy.py` are both pretty simple, so they are easy to remember.

One common pattern that I'll see, though, is deploy.py not being top-level, so `python deploy.py` becomes `python scripts/deployments/staging/deploy.py`. With the Makefile wrapper you still get the benefit of not wondering where `deploy.py` is.

Another is the configuration file also having a default version that is not obvious. Something like `configuration/deployments/staging.yaml`.

Now you might argue that I'm exaggerating (and I think it's fair to say that some refactoring could be valuable), but the simple `python deploy.py` now actually is `python scripts/deployments/staging/deploy.py -c configuration/deployments/staging.yaml`, which is significantly harder to remember than `make deploy-staging` for example.

I am a big advocate for and user of this pattern with Make.

However there is one non-trivial downside with (GNU) Make, and that is the non-visibility of env vars set with `export` when running `make -n`. That is, if your Makefile looks like this:

  export FOOBAR=123

  .PHONY: do-it
  do-it:
  	some-deploy-command
Then `make -n do-it` will not show the exported FOOBAR variable. This makes it somewhat more difficult to audit or inspect with `make -n`.

The output from `make -np` is incredibly verbose and isn't easy to filter with standard CLI tools, which makes this doubly frustrating. You basically need to write an AWK/Perl/Python program to parse it. If there was one feature I'd pay good money to add to a new version of GNU Make, it's an option to emit more-easily machine-readable output from `make -p`, or to ship a script in the GNU Make package that parses it into something standard like JSON or XML.
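As a sketch of what such a filter might look like, here's a rough Python heuristic that pulls variable definitions (with their origin comments) out of `make -p` output and emits JSON. The exact database format varies across GNU Make versions, so treat the regexes as assumptions rather than a spec.

```python
import json
import re
import sys

# Rough heuristic for the GNU Make database format: each variable is
# printed as an origin comment ("# makefile (from 'Makefile', line 1)")
# followed by the definition line ("FOOBAR = 123"). Not a full parser.
ORIGIN_RE = re.compile(r"# (environment|makefile|default|automatic|command line)")
VAR_RE = re.compile(r"([A-Za-z_][A-Za-z0-9_]*)\s*:?=\s*(.*)")

def parse_make_db(text):
    variables = {}
    origin = None
    for line in text.splitlines():
        m = ORIGIN_RE.match(line)
        if m:
            origin = m.group(1)
            continue
        m = VAR_RE.match(line)
        if m and origin is not None:
            variables[m.group(1)] = {"value": m.group(2), "origin": origin}
        origin = None  # a definition must directly follow its origin comment
    return variables

if __name__ == "__main__":
    # Usage: make -p | python parse_make_db.py
    print(json.dumps(parse_make_db(sys.stdin.read()), indent=2))
```

Even this crude pairing of origin comment and definition gets you something `jq`-able, which is most of what's missing from the raw `-p` dump.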

It's nice to be able to run the same commands locally as you do in CI. Keeping the deployment logic outside of the pipeline-specific format allows you to do this.

Make is great because it's usually installed everywhere. The downside is that it has a lot of edge cases. Breaking out early into a better language for the more complex tasks can make sense.

I can certainly sympathize with not liking the extra layers, but if you put the steps directly in CI then you either duplicate local/CI steps (and they inevitably get out of sync), or CI becomes the only way to deploy (which, depending on your situation might be fine). It can be a reasonable thing to do, but it's a pretty sharp tradeoff.

I don't think that depending on CI for deployments is the worst thing you can do. You get two good signals: 1) this compiled on my machine, and it compiles on some other random machine, so it's looking likely that this binary will start up on the prod servers OK; 2) there is some proof that the test suite passed before you ship your thing to production.

I'm really pretty OK with the fact that if CI is down, and there's a prod emergency, and the test suite is broken, not everyone on the team can fix that. Escalate to someone who can, the 0.5 times a year this happens; the rest of the time, everyone clicks "merge" on their pull request and the code filters out into production quickly and efficiently. They don't want to think about it, but the customer gets their features as soon as possible. Not the worst thing. Way better than "our release person is on vacation, we'll resolve your outage next week".

> make init

Even better, make this recipe's dependencies work so it doesn't do anything if you don't need to run init, then have everything else depend on init. That way no one needs to remember to run init first.
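One common way to do that (a hypothetical sketch; the stamp-file name and the commands are illustrative) is to have init produce a stamp file, so make can tell whether it has already run and only reruns it when its inputs change:

```make
# init only reruns when requirements.txt changes; the stamp file
# records the last successful run.
.init.stamp: requirements.txt
	pip install -r requirements.txt
	touch .init.stamp

.PHONY: init test
init: .init.stamp

# every other target depends on the stamp, so init happens implicitly
test: .init.stamp
	pytest
```

With this, `make test` on a fresh checkout runs the install first, and on subsequent runs skips it entirely.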

Agreed. I like keeping make as my universal abstraction for all commands in a project. I keep a template to use on each new project with all the best-practice fixes I've found over the years, plus my own documentation, as comments, for the make behaviors I often forget. I have the same for bash scripts too.

I like to also use the same commands in my build pipeline. It ensures I'm running the same commands as I do locally, with the added benefit of abstracting the deployment logic from whatever CI system you use.

One issue I've found with this approach is having to install make on the build system, which these days is a slim docker container.

Some devs still run windows, but I think you can install some kind of equivalent.

But make kinda sucks for this kind of thing. The CLI tool we use, based on pyinvoke, gives you tab completion for all the commands and all the options, which by itself is worth the switch from make.

Nothing wrong with rolling with something like that until there are more than a few people working on the project and you have actual customers.

Overengineering infrastructure early on is fun, but unnecessary and something that can relatively easily be tackled later.

A pretty decent chunk (not even half) of my tools are things that someone thought should only happen in CI/CD; then it breaks and there's no way to debug it. People are not fond of using your tools when they create blocking issues that are potentially open-ended.

Most of your tools should run in CI/CD. That's the goal, and a pretty high priority one. But they can't only be run there.
