You need to be able to run your system (catern.com)
291 points by catern 7 days ago | 161 comments

Well this is timely.

Just this morning, I’ve been extending a Firebase app I developed for a very large enterprise customer, and I’ve been constantly reminded of some of the things the author speaks about as I tried to recreate the FaaS environment locally.

Not only did I waste valuable time setting everything up locally when the same would’ve been trivial in an old-school web framework; I now have an explosion of environment permutations and smells in my code, and I don’t completely trust that local and prod will behave the same so I will have to do extra testing.

Despite Firebase coming a long way - some of these local emulation facilities plainly didn’t exist or barely worked a couple of years ago - there is no way I would choose it for a new project. The only module that has genuinely saved me time is auth; none of the rest have made me happier or more efficient than a web framework. Even for prototyping (the reason why I picked Firebase initially) I would still pick Django or Express or whatever else over this. (And I haven’t even delved into the big elephant in the room, which is lock-in.)

Regarding costs and scalability, there are a gazillion (fairly easy) things I could’ve done to achieve a similar cost range while bringing my own stack.

FaaS and the like may have their particular use cases but for anything that has even the slightest potential to turn into a full-blown app I’d go for a good old monolith every time.

For realtime requirements, I'm keeping a close eye on Supabase, which is building an open-source realtime Firebase-style API on top of the Postgres WAL. Theoretically, it's just Postgres with a bunch of services on top, so if you bring your own database migration and fixtures system, you could run a copy locally. Not sure that tooling is fully there yet (and it could use some of the model-layer bells and whistles I remember from the days when Meteor was the way you'd go for realtime), but the dev experience is very promising.



Yes, I also think that realtime is something where Firebase or Supabase may help. The app I'm talking about has no such requirements, and likely won't have them ever.

Even then, if I were to build an app with a predictable realtime requirement I would (a) find a full-stack framework with this as a core feature, or (b) use a framework without this core capability but look very hard for the simplest library that would let me meet the requirements, and only if all else fails (c) isolate the realtime component into something like Firebase and build everything else in the core stack. (I haven't tried Supabase, but from what I understand it might fall more into (b) than (c), since you can bring your own infra.)

One of my favorite apps, Clubhouse.io (the project management solution, not the controversial audio social network), mimics realtime by just having a really, really optimized polling endpoint for the last timestamp the project was touched, then the client refreshes as needed (which would hit cache if you account for distributed systems propagation, if you even need to worry about that). It's an amazingly simple solution and one I'd recommend for anything in B2B modalities.
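To make the polling idea above concrete, here is a minimal in-process sketch. It is not Clubhouse's actual implementation; `Project`, `PollingClient`, and the `last_touched` field are hypothetical names, and a real client would make an HTTP request against a cached endpoint rather than read an object directly.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Project:
    """Hypothetical server-side record; only the timestamp matters here."""
    data: dict = field(default_factory=dict)
    last_touched: float = 0.0

    def update(self, key, value, now=None):
        # Every write bumps the "last touched" timestamp.
        self.data[key] = value
        self.last_touched = time.time() if now is None else now


class PollingClient:
    """Hypothetical client that refreshes only when the timestamp moves."""

    def __init__(self, project):
        self.project = project
        self.seen = -1.0  # nothing fetched yet, so the first poll refreshes
        self.snapshot = {}
        self.refreshes = 0

    def poll(self):
        # Cheap check: compare timestamps, fetch full state only on change.
        touched = self.project.last_touched
        if touched > self.seen:
            self.snapshot = dict(self.project.data)
            self.seen = touched
            self.refreshes += 1
            return True
        return False
```

The expensive fetch happens only when the timestamp changes, so the hot path is a single comparison that is easy to cache.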

I'll preface this by saying that I haven't even looked at Firebase in 5 years, so apologies if I'm completely off base.

Why can't you just use a "live" development environment in Firebase?

Obviously this isn't practical for some cloud services due to cost (AWS Redshift is the first example that comes to mind), but I don't see this being an issue with Firebase. If a company is paying six figure salaries to their developers, surely they can afford to give them dev environments on whatever cloud provider they use.

Requiring an internet connection doesn't seem like it would be an issue either. Who in 2021 is building web apps with no internet access? I can tether to my phone practically anywhere, and even when I'm on a flight I have Wifi access.

The product I work on uses a few live AWS services in dev, and for the past 5 or so years everything I've worked on has needed some kind of internet access to run. I can count on one hand the number of times this has been an issue. Dev/Prod parity is a good thing in my books. It's been more common for me to run into issues on prod due to slight differences in how things work between running things on my MacBook and on AWS, not just due to service emulators being different from the real thing, but due to intrinsic differences between running things on a MacBook and on remote Linux servers (even with docker).

> Who in 2021 is building web apps with no internet access?

Defense contractors? Government entities? Entities worried about chain of trust in packages? Choice of upgrade path instead of forced? Ability to keep working when the net is down?

There's a reason for lots of people to still want 'on prem' and to not want to have the whole Internet as a dependency.

> Defense contractors? Government entities? Entities worried about chain of trust in packages? Choice of upgrade path instead of forced? Ability to keep working when the net is down?

I think you partially misunderstood the parent comment - he was talking about requiring the internet for building the app ("Who in 2021 is building web apps with no internet access [while developing]?"), but you responded to using web apps w/o internet ("Who in 2021 is building web apps [for use] with no internet access?").

A bit of your point still stands, but, as the sister comment correctly pointed out, when you're building a web app for offline usage, Firebase is not the right choice of technology.

I don't think that is true. Having worked in regulated environments, the dependencies are mirrored internally with something like Artifactory and must be approved for use after a license vetting, etc. The application build process would not touch the public internet to retrieve any content.

I think you're missing the original point. If your app is built on top of something like Firebase, then clearly your org has approved that use. So when developing the app, you should have access to Firebase.

If your org has not approved any cloud services, then you can run everything on your development machine (database, webapp, cache, etc.), or a provisioned dev environment if the full stack is too heavy for a single workstation.

How many of those people running airgapped systems are using firebase or s3 though?

Due to govcloud, S3 + docker is increasingly normal. We're helping a team who have that... but not much else (ex: no outside internet).

In utilities/industrial... docker, but not s3.

> Why can't you just use a "live" development environment in Firebase?

I have a live (as in online) dev environment with its own DB, which is good for most database reading/writing.

However, for Firebase Functions (FaaS), I really don't like the dev experience of using an online environment. You have increased latency for everything, but above all, logging is terrible.

So you have to set up the local emulators, which is not nearly as easy as doing `python manage.py runserver` or `nodemon` or what have you. I had problems with the setup this morning, and I've had to modify the code to write to the emulators instead of live functions depending on the environment (there is no way to do it purely through env variables, you have to do it through code).

I'll grant that it's improved massively over the last year or so - when I started the project I couldn't get the emulators to work at all, and SO and Github were riddled with issues on the topic.

On the topic of databases, I say it's good for most reading and writing, but last time I checked the data from emulated databases is not persisted so it's only good for fleeting tests, but not for playing around with data locally. I'm sure I'm not alone in that I often like to play around with bulk operations (such as big initializations or fetching) and manually explore the data (the "live" data explorer in the form of Firebase console web app is atrocious, there are infinitely better local solutions like pgAdmin, or the JetBrains' database explorers)

All in all the DX is just about acceptable, but I'm not impressed given that the product is almost a decade old, massively adopted, and led by Google. My original point is I believe web frameworks give me "tighter" DX, and I don't see a reason to pick Firebase over other options as a complete stack most of the time.

The complete emulator suite seems to be mostly suited for running tests. When working on firebase functions, I find running only them (`firebase emulators:start --only functions`), with everything else connected to the dev firebase project online, works pretty well. Persistent data but fast feedback loop for what I'm working on. You do need some scripts to manage what you are connecting to from the front end, as if you're just working on that it's probably nicer to not need the emulators at all.

I agree that it's just barely acceptable. I wish the web console wasn't so slow, and the data explorer needs a major overhaul.

> Who in 2021 is building web apps with no internet access?

Over the past 20 years, a lot of software development has shifted to apps that run in the client's browser.

But, there are still a lot of computers which are not connected to the internet. Sometimes it is because of security, sometimes environmental factors, sometimes it is simply because the use case does not necessitate it.

I have a computer that uses a web app that isn't connected to the internet on my desk right now -- a printer.

I imagine mostly because deploying to the cloud is slow - it would be the new waiting for compiling.

A fast-paced edit-compile-debug cycle is a huge development productivity benefit, and doing it via unit/integration tests and what have you is nice, but not equivalent.

My company was doing this when I joined (and to some extent still does this, but most of our data is in postgres now). It works ok as a development environment, but it's massively constraining for automated testing.

With a local instance, I can just spin a new one up for each test run. You can't do that with firebase projects (they do now provide an emulator which is just about workable).
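A minimal sketch of the "new instance per test run" idea, using an in-memory SQLite database as a stand-in for whatever local database you actually run. The `customers` schema and the helper names are assumptions made for illustration only.

```python
import sqlite3

# Hypothetical schema, standing in for your real migrations.
SCHEMA = "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"


def fresh_db():
    """Create a brand-new, empty database for a single test."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    return conn


def count_customers(conn):
    return conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
```

Because each call to `fresh_db()` returns an isolated instance, one test's writes can never leak into another test, which is the property that is hard to get from a shared hosted project.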

A thing I've been noticing is that PaaS companies (e.g. Netlify, Cloudflare, Heroku) are shipping tools that let you test locally on your own machine. The experience is unmatched. I'd bet it's what led Google to (attempt to) do the same thing for Firebase, although I've noticed as well that their emulation is still off.

Overall, I hope this is a trend that continues. The fact that these things are sometimes baked into their clients, which are usually freely licensed and developed in the open, available for patching, is something that hugely eliminates some of the lock-in risk associated with building on top of platforms. Worst case scenario? The company either goes out of business, deprecates some corner of its platform that you're relying on, or becomes prohibitively expensive to use, so you spin up a VPS and daemonize the client. This doesn't work for large projects, but for anything that is relatively small (but not so small as to characterize the development costs as trivial), it's sufficient.

anecdote: about a year ago we had a Rails app on heroku and ran into a problem where the tests passed locally but hung in the remote heroku-CI pipeline. We followed the instructions for running the tests locally but ultimately weren't able to reproduce the hang, despite it happening every time in Heroku CI. Their support was good but ultimately it felt like some developers in the CI/container/platform team poked at it for some time-boxed session and couldn't figure out exactly what was going wrong. The problem had started after upgrading some gems, and we eventually solved it by upgrading some other (unrelated) gems. Somehow there was a very precise set of gem-versions that were incompatible with something else in the hosted environment. I've forgotten details but the tests that hung involved capybara or some part of the front-end / "integration" testing, which IMO tends to be flakier than backend-only tests.

Anyway sharing here just to note that while thankfully it was a CI issue and not a production issue, it did convince me that "test locally on your own machine" tooling still had some subtle gaps vs. what was happening in "the cloud."

Well my experience with Heroku has always been that the code barely cares that it's being hosted there: a few config/environment variables here and there and that's it. Nothing to do with Firebase, where your project structure, front-end code, scripting utilities, config files, etc. become coupled with Firebase even if you try your best to contain the coupling.

I'm interested, what does Heroku offer in the way of "local testing" anyway? I just use a web framework's normal test runners in my local shell.

I work on the Firebase team and have been working on the Firebase Emulator Suite for the past year or so. We really want every Firebase developer to be able to run their app locally and offline, although I'll be the first to admit there are rough edges.

If you or anyone else wants to discuss this more and give us some specific feedback on what went wrong, please reach out:

  * https://groups.google.com/g/firebase-talk - our mailing list, good for open-ended discussion
  * https://github.com/firebase/firebase-tools - GitHub repo for the Firebase CLI and emulators if you have specific bugs to report.
Or if you want to reach me personally I'm samstern@google.com, always happy to chat 1:1 about this stuff.

> I now have an explosion of environment permutations and smells in my code, and I don’t completely trust that local and prod will behave the same so I will have to do extra testing.

I always run into this problem with microservice architectures, and it's a pain. There might be a B2B opportunity there, "recreate my distributed architecture as a service" or something...

> recreate my distributed architecture as a service

Isn't that essentially Docker Compose?

I liked Firebase for prototyping, but for the same reasons you mentioned, I feel I should have just gone with Redis or something else that I can easily run locally.

I get the feeling that Firebase (and some parts of AWS) are meant as services for larger enterprises that are really hooked on GCP (or AWS).

I fundamentally disagree with the article - sorry! I used to believe it - for many years. But as systems continue to add essential (as opposed to accidental) complexity, the only way to run production is to run production.

Why do you want to be able to run the app elsewhere? To test new features? To reduce regressions? Reasonable goals, but at the end of the day the only thing that is identical to production is production. Unless you have a real time copy of all of the data and you continually run copies of all real time production requests, it's not production. It might be the same code and a very similar infrastructure, but without the same data and load, you are going to get regressions in production. Maybe it's flaky historic data or unexpected patterns of load, but whether we like it or not, we're all already testing in production.

I'm a huge fan of unit tests, CI, and all of the other common best practices to reduce the number of bugs that are identified in production, but you also need to have the kind of tooling and processes required to minimize recovery time and to be comfortable with testing in production. Small, easily testable, quickly shipped units of work and some flavor of feature flagging so you can dark ship code, recover quickly from outages, and do things like canary roll outs to ensure the new queries don't break with the production data at scale!
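For what it's worth, the feature-flag and canary part can be sketched in a few lines. This is only an illustration under assumptions: `bucket` and `is_enabled` are hypothetical helpers, not the API of any particular flagging product.

```python
import hashlib


def bucket(flag, user_id):
    """Map a (flag, user) pair to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode()).digest()
    return int.from_bytes(digest[:4], "big") % 100


def is_enabled(flag, user_id, rollout_percent):
    """Dark-ship at 0, canary at a small percentage, fully on at 100."""
    return bucket(flag, user_id) < rollout_percent
```

Hashing instead of random sampling means a given user stays in the same bucket between requests, and raising the percentage only ever adds users to the rollout.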

I disagree. Yes, fully replicating the production environment is not possible for big apps or apps with customer data, no discussion there. But when you want to debug components, test error conditions etc, a copy is extremely important. You can do "[DEBUG] added some logging" commits left, right and center, but it's not going to replace a debugger.

Additionally, you might want to create custom or corrupt datasets. Yes, you can theoretically add a flag, but then this flag needs to be checked everywhere and might come with its own bugs. Using the "customers_debug" table instead of "customers" table (for example) works as well, but then you replicated staging in your production environment - added complexity which is definitely _not_ needed.

Lastly, this misses the other points the article makes - a local copy allows you to freely play with the configuration, shut down related services etc. You can not do that in production.

But I'll give you that - resiliency in production is still very necessary and you usually won't be able to replicate everything locally. The ability to run local isn't everything - but that was not the point the article was making.

Debugging code running in the production environment with access to production resources and production versions of external services is extremely useful and not always impossible to arrange -- I think it's a reasonable expectation for good development teams to have this capability in many applications ...

I was unable to run the codebase for a project I joined once and switched to using heroku one-off dynos (servers from that app's cloud provider -- these nodes are identical to production nodes excepting that they don't receive external requests and can be segregated in logs) as my development environment ... did some dumb things with ssh and tmux and was able to persuade all my tools that all the remote node's processes were accessible from localhost -- with a little finagling being able to use all the tools normally used for local development and debugging directly against production provided an extremely productive set of lenses for peering into the behavior of production resources -- especially when onboarding into the new company's codebase ...

> Why do you want to be able to run the app elsewhere? To test new features? To reduce regressions? Reasonable goals, but at the end of the day the only thing that is identical to production is production. Unless you have a real time copy of all of the data and you continually run copies of all real time production requests, it's not production. It might be the same code and a very similar infrastructure, but without the same data and load, you are going to get regressions in production. Maybe it's flaky historic data or unexpected patterns of load, but whether we like it or not, we're all already testing in production.

That is letting the perfect be the enemy of the good. Sure, there are classes of bugs you'll only be able to find in production, but there are also large classes of bugs you can find in a development/test environment. In general, it's a good thing to minimize the bugs you find in production. It's certainly less stressful.

Of course the development environment will never be the same as the production environment, due to the difference in load of the stateful services like databases and queues. I think we are all in agreement about this.

For me, the article is mostly concerned with the stateless services. Let me rephrase the article's main "ethos" in a different way, if I may.

If all your stateless services are running in the same kubernetes cluster in production, and you accidentally destroy the cluster (ahem terraform ahem), would you be able to automatically re-deploy them from scratch? Does anyone really understand the dependencies? Is there any circular dependency which would require some hack?

And if you can't do that in the local development environment, what makes you think that you can recover from a production incident like this?
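The dependency questions above can be checked mechanically once the dependencies are written down. A small sketch using the standard library's `graphlib` (Python 3.9+); the service names in the usage note and the `deploy_order` helper are made up for illustration.

```python
from graphlib import CycleError, TopologicalSorter


def deploy_order(dependencies):
    """Return an order in which services can be started from scratch.

    `dependencies` maps each service to the set of services it needs
    running first. Raises ValueError if there is a circular dependency.
    """
    try:
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError as exc:
        # graphlib puts the detected cycle in the second element of args.
        raise ValueError(f"circular dependency: {exc.args[1]}") from exc
```

For example, `deploy_order({"web": {"api"}, "api": {"db"}, "db": set()})` lists `db` before `api` before `web`, while a map in which two services depend on each other raises an error instead of requiring a hack at redeploy time.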

When your production environment involves thousands of servers and massive databases, at least one testing stage is essential.

You need CI and regression tests running on smaller stages with mock data, because a bad rollout can take some time to revert, even if it didn't include a rogue database query or migration that impacted your data. Knowing how to roll back quickly is important, but an ounce of prevention is worth a pound of cure with large services.

And IMO you shouldn't let people "test in production" when the changes involve a production database; those should be off-limits. It's all fun and games until someone forgets which instance they're logged into.

You'll still get plenty of regressions, but the goal is to avoid major issues that impact most of your users/customers/etc.

The tech giants are each their own unique thing and not generally a pattern that anyone else can or should follow.

Most organizations, for example, do not have hyperscale volumes and face no insurmountable barriers in setting up parallel environments for testing.

IMHO, if you can't rebuild all of production from scratch, minus the sensitive data, then you don't have a full understanding of your production system, and that's where serious problems begin.

I'll carry that torch when I feel it's appropriate. But the reality is, this practically requires making the decision to use free software and keep the entire open source community at a long arm's length.

Why? Because business doesn't give two shits about your productivity. They care about one thing, the productivity of the software. And so they will buy other goods and services that will at some point be closed source. And then you're the one that has to integrate them.

Are you deploying to GCP? Already you can't run your full stack locally. My last week was spent hacking Google's Anthos CLI tool, nomos, into a kpt function. Kpt is open source; nomos is not. It's not even source-available. It's just proprietary. And the platform it interacts with, you can't run locally. Go ahead and check out the troubleshooting page for one Anthos tool.


No sections on spinning up a local env. You can't self-host or run locally, Google won't let you.

This isn't isolated to just Google. People need to get it into their heads that open source is the realm of business and not the realm of hackers. You're a cog in their machinery.

If you can't run free software for your whole stack, then your troubleshooting steps will necessarily involve understanding at least two environments, one of which will be a very black box.

Devs, if you want a better world, start contributing to copyleft software.

It seems like running a separate instance in the cloud counts as running the system, though? The key point is that you, the developer, get your own instance.

> Devs, if you want a better world, start contributing to copyleft software.

Your version of "a better world" is one in which authors, musicians, movie producers, artists, and all other creative types are free to make money from their works, but not software developers. When they try to do it, it's immoral. No thanks.

What? I didn't mean to imply that. I'm an associate member of the FSF. I pay my dues with the money that comes from my corporate job which is open source all the way. I use Emacs, and keep my personal hardware as close to free as I can. One day when I'm into the ecosystem enough, I'll start contributing code to GNU projects.

The more we can build up free software as an alternative to open source, the more the business world will be forced to use it. They need your dollars more than anything.

And my claim is that the fundamental tenets of the FSF are abhorrent fanatical nonsense, made clearly evident if you replace "software" with "novel". They are deeply offended by the idea that your computer could be running proprietary software that you're not free to modify, re-sell, or do whatever you want with; but are they equally offended when your Kindle contains a book you're not allowed to modify, re-sell, and do whatever you want with? Do copyrights on novels offend them as much as copyrights on software?

Contributing to open source software is one of the most effective forms of altruism in history -- but it is altruism! Permissive licenses like MIT (and, to a degree, the weak-copyleft LGPL) embrace and enable that altruism. Copyleft licenses and the FSF are built around the idea that you are entitled to it, and it is immoral to write commercial/proprietary software. That is fanatical and unhelpful.

Unless you think it's always immoral for anyone to make money from a copyrighted work? In that case I'd be a lot more interested in hearing that perspective, because at least it's not hypocritical.

Copyleft does not forbid making money. It is perfectly legal to sell a GPL product.

Also, with a novel on paper, you generally have rights similar to those the GPL grants for software, such as reselling it, making modifications (I can just write on the paper!) and that sort of thing. While scanning and distributing copies is not allowed, quite a few GPL companies also crack down on redistribution by simply not providing updates anymore (I believe this approach is taken by the Linux grsec patchset developers).

Copyleft doesn't mean you can't make money from it. Software marketing is hard, sure, but if it fits your case, copyleft can actually be turned into a revenue source by selling alternative licenses that allow users to keep their modifications private ("dual-licensing").

Our server went down for a couple of days and if we could "run our system on localhost", I'm positive we would have been back online very quickly as opposed to the multiple days it took to track down stored procedures not in version control. Front-end was left twiddling their thumbs during the outage because the server wouldn't run on local and our frontend wouldn't run without a server (we neglected updating our frontend model mocks for years).

Did we learn our lesson from the outage? A big _nope_. I suspect it's because being able to run a somewhat complicated system on local requires thinking in brand new ways with benefits that aren't very obvious from the outset. After that experience, I sympathize a lot with the author's points and hope to work in an environment (ha!) where spinning up a docker container is all it takes to have a _full_ dev environment.

I often set up a "how to create a dev environment" wiki and then we exercise it many many times.

IBM got a bad shipment of laptop hard drives that exhibited an MTBF of about 2 years, and our equipment dept bought a stack of laptops from that batch. Over a summer we had 6 machines go belly up. Mine was number 5. People still looked at me as if I had announced that I had stage 3 cancer. Oh you poor poor man. I found this reaction disappointing.

By then the process was about as documented as any I've had. It just took me a day to get it up and running (because the base image left a very slow step until after 2nd boot, which I still maintain is dumbness squared).

A coworker from that cohort had an experience that I still use as an example. He tried to put his work laptop in the back seat of his car. He missed and hit the door frame. Killed the laptop. Similarly, taking your laptop down the stairwell could be a one way trip to the trash bin.

If the information is important for us GET IT OFF OF YOUR COMPUTER. As soon as you know. Put it in storage, or at the very least in some coworker's head/computer. If you do this, consistently, then losing your machine is a shitty inconvenience, but nothing more dire than that.

Strongly agree. In the last three years I've rebuilt my macOS laptop from scratch twice (dead hard drive, and a dev tool run amok), and once swapped it for a different size because I really wanted the larger battery.

My co-workers were appalled, but I had less than a day of downtime each time. With continuous backup to an external drive (always on while I'm working), a password manager I can also store certs in, and keeping everything important in a cloud-backed git repo, there isn't a single thing on my work computer that's hard to replace. It's been fantastic.

Nix is one part of the solution to this problem. About a year ago I lost my data to a corrupt file system. I tried to recover it for some time, eventually gave up, and was back to a working machine, with all the software installed and configured as it had been previously, within two hours of a disk wipe. Most of that time was waiting while nix downloaded and installed packages from the internet.

I wish I could easily find beginner-friendly Nix tutorials. Even the official ones immediately assume you know their terminology.

Do you have any good links?

Nix Pills was the "standard" intro text I started with a couple of years ago. At the end, you'll be able to navigate packaging with Nix in general, even if you're a little unfamiliar with some of the modern idioms and tools. It's very approachable, not especially short, not especially long.


Local backup and remote backup. Always.

I only write code on my laptop, but I sync my projects folder with my cloud desktop. It's saved me from several crappy accidental losses.

You must invest heavily in development tooling. You must invest heavily in an architecture model that caters to local as a first class citizen. You must invest heavily in developer mindshare to actively and continually reinforce that local is a must-have, not a nice-to-have. And to do all that takes an excruciating amount of effort beyond just building your products. Ultimately it's a tradeoff: what are you actually trying to achieve? At some point you'll cut a corner, or a developer on your team will, or the next person will. Eventually the system doesn't work locally because of edge cases, and the only way you claw this back is by mandating, as a policy for everyone, that it has to work locally.

I built a thing with a local-first view and I still battle this every day, because whether we succeed or not will not be dictated by whether it worked locally, but by actually shipping a product that people want and are willing to pay for.

Tradeoffs. Sometimes you have to just let go of the purist view point.

> In my experience, they're always wrong. These systems can be run locally during development with a relatively small investment of effort. Typically, these systems are just ultimately not as complicated as people think they are; once the system's dependencies are actually known and understood rather than being cargo-culted or assumed, running the system, and all its dependencies, is straightforward.

Whenever I hear these statements, it always sounds like "I need to have an identical copy of this skyscraper in order for me to be able to replace one tap on floor 42."

Also good luck running a system that operates on a few hundred terabytes of, for instance, YouTube data locally.

Also, running the whole system locally is usually a pretty good way of creating a "distributed monolith" -- yea, there might be microservices, but also a dozen assumptions here and there that different parts of the system are being deployed simultaneously (usually they are not), or that certain distant parts of the whole system share some behavior that can be changed simultaneously.

So no, you don't need to run the whole system locally. On the contrary, you need to be able to run the smallest part of it (hello, microservice) locally, and that part should be responsible for one thing. APIs and frontends can easily share a JSON schema to make sure they send and receive valid data, and each service can have tests against that schema, the ultimate source of truth for them.
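A rough sketch of what "tests against a shared schema" could look like. The validator below is a hand-rolled toy that understands only `required` and a few `type` keywords, not a real JSON Schema implementation, and the `SCHEMA` contents are hypothetical.

```python
# Hypothetical contract shared by the API and the frontend.
SCHEMA = {
    "type": "object",
    "required": ["id", "title"],
    "properties": {"id": {"type": "integer"}, "title": {"type": "string"}},
}

TYPES = {"string": str, "integer": int, "boolean": bool, "array": list, "object": dict}


def matches_type(value, type_name):
    # bool is a subclass of int in Python, so exclude it explicitly.
    if type_name == "integer":
        return isinstance(value, int) and not isinstance(value, bool)
    return isinstance(value, TYPES[type_name])


def validate(payload, schema):
    """Return a list of error strings; an empty list means valid."""
    if not isinstance(payload, dict):
        return ["payload must be an object"]
    errors = [f"missing field: {name}"
              for name in schema.get("required", []) if name not in payload]
    for name, rule in schema.get("properties", {}).items():
        if name in payload and not matches_type(payload[name], rule["type"]):
            errors.append(f"wrong type for {name}: expected {rule['type']}")
    return errors
```

Each service would run its own fixtures through `validate` in its test suite, so a breaking change to the contract fails a fast local test rather than surfacing after deployment.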

Boom, suddenly I can develop "big system" piece by piece in isolation on my 7-year old macbook with no problems against tests / storybook / debugger.

> Whenever I hear these statements, it always sounds like "I need to have an identical copy of this skyscraper in order for me to be able to replace one tap on floor 42."

The building industry also has the problem the OP describes:


The problem as I see it is that people who go all in on unit tests tend to be dogmatic about it and suffer the above type of issue whereas the people who want, broadly speaking, to run things as realistically as possible are pretty aware of the real constraints.

Moreover, modeling larger is also frequently cheaper, because the real thing often comes for free while creating elaborate, frequently changing unit test mocks has very high opex.

Oh yea, and also you get that fuzzy "I clicked random things around, so everything should work, because I run everything locally" feeling. ¯\_(ツ)_/¯

> Also good luck running a system that operates on a few hundred terabytes of, for instance, YouTube data locally.

Why can't that be scaled down to a few hundred megs of data running through a system, end-to-end, on the local machine?

Not being able to scale down seems like a code smell to me. Is there so much overhead in the microservices that even at idle, or very low usage, they can't even be started?

For instance PostgreSQL's performance might be very different depending on the size of the data a query has to operate on. Also if you deal with something like search results, it gets pretty annoying to deal with "well, I don't have enough data locally for this particular thing I develop".

Again, talking here about the "run the world and click on things" approach vs developing against tests / API schemas / whatnot.

I've had pretty good results with scaling down real world data and using it to run tests against.
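One way to scale data down, sketched here under assumptions (the `user_id` key and the row shapes are hypothetical), is to sample by a hash of a shared key rather than at random. The same key always lands in the same bucket, so related rows in different tables are kept or dropped together.

```python
import hashlib


def keep(key, percent):
    """Deterministically decide whether records with this key are sampled.

    The same key always gets the same answer, so rows that share a key
    (say, a user and that user's orders) stay consistent with each other.
    """
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    return bucket < percent


def scale_down(rows, key_field, percent):
    """Return only the rows whose key falls into the sampled buckets."""
    return [row for row in rows if keep(row[key_field], percent)]


# Hypothetical tables: every order references a user.
users = [{"user_id": i} for i in range(1000)]
orders = [{"order_id": i, "user_id": i % 1000} for i in range(5000)]

small_users = scale_down(users, "user_id", 10)
small_orders = scale_down(orders, "user_id", 10)

# No sampled order points at a user that was dropped.
kept_ids = {u["user_id"] for u in small_users}
assert all(o["user_id"] in kept_ids for o in small_orders)
```

The percentage is roughly, not exactly, honored, and anything that depends on data volume (query plans, for example) will still differ from prod.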

Postgres performance will be different to prod, yes. That's all part of the realism - cost trade off. All models are wrong, some are useful.

The point is that diving headfirst into making matchstick model unit tests is dumb when you could build something to test against at reasonable cost which is a lot more representative of reality.

IMO this is an obvious point if you adopt a cost/benefit approach to development but it's often impossible to see for the dogmatists.

I think we're talking about slightly different things here. Running the service that I need to develop right now with a lot of local data (or even a read-only connection to the production mega-db) is one thing. But needing to run that service with a lot of data only to be able to develop some other part of the system (that is not even related to that particular service, but "nothing works" without it) is a pretty annoying development experience.

Load / performance testing isn't always feasible in a local environment, sure. Everything else, though? It ought to be.

What if the error only occurs after ~1TB of data has been created?

Well then you have to do some bug hunting, but at least you'd have a fully working version of the system in miniature that you can test any hypotheses against.

To be honest, short of issuing every developer with their own production-grade environment to debug against I am not sure what exactly would satisfy this line of questioning.

There's nothing wrong with running your app locally. We're writing software, not building skyscrapers.

You don't need hundreds of terabytes to run an app with test data.

Monoliths aren't bad.

Running app locally -- yes, there's nothing wrong with that. Running 25 different apps that compose "the platform" only to be able to develop something -- sounds overkill to me.

If you have 25 apps to run locally, chances are that your system is over-engineered.

It depends what kind of system you are running.

NASA had two versions of the space shuttle software developed in parallel by teams at IBM and Rockwell. None of the devs had a whole space shuttle.

Of course there are always exceptions. More than likely, people on here aren't thinking about NASA software.

Which is probably why they ran it all locally first!

> You don't need hundreds of terabytes to run an app with test data.

Well you could just mock the data and code, but then, according to the author:

> "Run an individual service against mocks" doesn't count. A mock will rarely behave identically to the real dependency, and the behavior of the individual service will be unrealistic. You need to run the actual system.

If your data is too large to replicate locally then downloading a subset of data is better than nothing.

Test data is not a mock.

What matters here are the odds that the production system will behave the same way as your tests. For functional requirements (not performance), the normal situation is that if you replace your data, those odds do not decrease a lot. If you want to test performance, that changes, and you may need an environment as large as your production one.

> and you may need an environment as large as your production one.

Yes, but isn't that exactly what the author suggests?

The author is suggesting running your code (all of it) in a separate environment, that isn't prod. There is a passing acknowledgment that data exists, but nothing more about it. Very likely, he doesn't talk about data because bringing all of your data into another environment is indeed not viable for a lot of people, or even legal for some.

If you replace all of your data, it's still your code running there. But you must have some data, or your code won't run, and it must look like real data, or your environment will be fake again... where "look like real data" is completely problem dependent.

If your assumptions of a system fail on a single box they will fail more dramatically when they are distributed. If your system is designed such that it can't be executed on a single box then you're setting yourself up for a world of pain.

> Also running the whole system locally is usually a pretty good way of creating a "distributed monolith"

I'm not sure how you came to this conclusion. Nowhere does he mention monoliths or microservices. Even if that distinction is made it is good practice to be able to run your architecture on a single box as a goal.

Going through that exercise alone will force the designer to think about how the application scales up and down, having representative samples of data for test suites, coupling between services and so on.

The orders of complexity increase dramatically for each component running on a separate machine for the system under inspection.

Being charitable, the things you mention do have their place but they don't address many of the problems you'll face when building a complex system.

Testing in production is the best thing we have ever done with our customers.

The key is to be able to quickly undo some scary new thing and get back to a known-good point. Also, you need to make sure the testing you are doing is easily distinguished from normal business activities. For us, the humble feature flag is all it took to get unblocked on moving mountains of complex scary things to production. Being able to reassure the customer that we can instantly revert a piece of experimental functionality has been a game changer.
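A rough sketch of what such a flag can look like, assuming an in-memory store and a made-up pricing change (the names `new_pricing`, `price_v1`, and `price_v2` are illustrative only). Scoping the flag to specific accounts is one way to keep test activity distinguishable from normal business, and `disable` is the instant undo.

```python
class FeatureFlags:
    """In-memory flag store; a real system would back this with shared config."""

    def __init__(self):
        self._accounts = {}  # flag name -> set of account ids that see it

    def enable_for(self, name, account_id):
        self._accounts.setdefault(name, set()).add(account_id)

    def disable(self, name):
        # The "undo" button: one call turns the flag off for everyone.
        self._accounts.pop(name, None)

    def is_enabled(self, name, account_id):
        return account_id in self._accounts.get(name, set())


def price_v1(cents):
    """Known-good behavior (hypothetical 20% markup)."""
    return cents * 120 // 100


def price_v2(cents):
    """Experimental behavior (hypothetical 18% markup)."""
    return cents * 118 // 100


def quote(flags, account_id, cents):
    """Route to the new code path only when the flag is on for this account."""
    if flags.is_enabled("new_pricing", account_id):
        return price_v2(cents)
    return price_v1(cents)


flags = FeatureFlags()
flags.enable_for("new_pricing", "test-account")
assert quote(flags, "test-account", 10000) == 11800   # experimental path
assert quote(flags, "real-customer", 10000) == 12000  # unaffected
flags.disable("new_pricing")
assert quote(flags, "test-account", 10000) == 12000   # reverted
```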

We used to spend months agonizing over how the test environment is such a poor facsimile of production and complaining about vendor XYZ not setting up various things that we would need to prove our code works as expected. We used to make insane bullshit promises about how if things worked well in staging the move to production should be flawless. Those were some dark times for us.

Both of these things are true:

1. Developers can be far more productive when they can run everything locally.

2. Until you’ve deployed into and tested against production, you don’t know if your shit actually works.

In fact, they’re complementary: developers should have excellent local dev facilities and excellent facilities for running experiments in production.

I completely agree. We have written a simulator for the production system that allows us some reasonable degree of make-believe regarding our local development efforts. We also use this in our QA environments, and any other system which isn't permitted direct access to the production systems.

Everyone on the team is aware of the caveats as well. We say things like "It worked on simulator, but we still need to test in live".

Isn't this just blue-green deployments but pushed into your app's codebase instead of the deployment posture? I have no doubt that this works for you but why not just flip between two versions of your app?

Yeah, never underestimate the value of solid rollback steps. Never deploy anything without some plan to get the system into the previous state, ideally with a button push. And for those rare cases when you can only roll forward, like with a major schema change, make sure that is easy to do too. There’s nothing like the stress of a broken build to make you stupid.

Major schema changes can also be done in a backwards compatible/recoverable way - I like to follow this procedure. Specifically the "The Five Phases of a Live Schema Change" section: https://queue.acm.org/detail.cfm?id=3300018

We have spent a lot of time to make our schema as stable as possible in the face of potential changes. In places where we might end up with 100+ columns that vary wildly depending on context of use, we use JSON documents which can store objects of arbitrary complexity. This is a very delicate balancing act for us, because we also like to be able to write SQL queries against important business keys, and having those wrapped up in JSON can make it difficult to get good, fast results. In many cases we will duplicate the storage of some information so that it can be indexed as well as be available in the complex object graph.
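To illustrate the duplication described here, a small sketch using SQLite from the Python standard library (the `orders` table, the `customer_key` column, and the document shape are all assumptions for the example, not the commenter's actual schema). The full object lives in a JSON text column, and one business key is copied into its own indexed column so plain SQL can find rows quickly.

```python
import json
import sqlite3


def open_store():
    """Create an in-memory database with a JSON column plus an indexed key."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        " id INTEGER PRIMARY KEY,"
        " customer_key TEXT NOT NULL,"  # duplicated out of the document
        " document TEXT NOT NULL)"      # full object graph as JSON text
    )
    conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_key)")
    return conn


def save_order(conn, order):
    """Store the whole document and copy the business key alongside it."""
    conn.execute(
        "INSERT INTO orders (customer_key, document) VALUES (?, ?)",
        (order["customer"]["key"], json.dumps(order)),
    )


def orders_for_customer(conn, customer_key):
    """Look up by the indexed column, then decode the full documents."""
    rows = conn.execute(
        "SELECT document FROM orders WHERE customer_key = ? ORDER BY id",
        (customer_key,),
    )
    return [json.loads(doc) for (doc,) in rows]


conn = open_store()
save_order(conn, {"customer": {"key": "ACME"}, "lines": [{"sku": "A1", "qty": 2}]})
save_order(conn, {"customer": {"key": "INITECH"}, "lines": []})
assert orders_for_customer(conn, "ACME")[0]["lines"][0]["sku"] == "A1"
```

The cost of this design is that the writer must keep the copied column and the document in sync, which is why funneling all writes through one function like `save_order` helps.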

This is quaint and wholesome. I long for the simpler days when I could agree with this, blissful in my naivety of large scale organizations.

Nowadays, I accept this reality is largely impossible and you must always draw some boundary. This doesn't mean all your developers should use a shared MySQL because nobody knows how to bootstrap the database- but it means you have to consciously decide where you sit on the continuum. Always expecting to run the whole system on your laptop (or even in a cloud) is also an unreasonable expectation, unless you're Netflix and your revenue per employee is into the millions. The reasons for this are many and complicated, but at a high level the work required to make it happen will cost too much and won't be a priority.

I'll quote here from the excellent "Testing Microservices the Sane Way" by Cindy Sridharan:

> asking to boot a cloud on a dev machine is equivalent to becoming multi-substrate, supporting more than one cloud provider, but one of them is the worst you’ve ever seen (a single laptop)

https://copyconstruct.medium.com/testing-microservices-the-s... "Full stack in a box- a cautionary tale"

This webpage has no CSS and no JavaScript, just a `<head>` with `<title>`, an `<h1>` for the heading, and a number of `<p>`s for the text. It's styled according to my browser's settings for font, size, etc. How refreshing!

I’m so glad that reader mode exists to make it at all readable on mobile.

That said, such design is perfect for precisely this kind of customisability.

It's perfectly readable on my mobile devices. Do you mind if I ask what it's lacking on yours?

Font is very small, I need to blow it to 300% to be able to read it without squinting. Black on white is also quite blinding and very uncomfortable when scrolling.

Edit: I think, though, that Apple is to blame here by setting the defaults to ridiculous values.

Font is too small.

It is quite elegant.

If the browser's default style was a little better, it could be even more elegant - not a full window width, for example. Could browsers make such improvements to their default style? Or are assumptions about the default baked into sites in such a way that changes would break them? Could browsers detect those assumptions and provide an appropriate default accordingly?

it looks unprofessional, it is hard for me to take technical advice on the internet from someone running an unprofessional looking site.

There is another blog post written by the author, arguing that docker containers are harmful, that I strongly disagree with, but it is not totally unexpected given the design choices of the site.

Their site doesn't provide any opinion styling. Blame your client.

Why do you even bother responding to someone who uses such unprofessional software?

I wondered if the GP post was from a frontend guy who had a different definition of "professional".

Clicking through their profile to their business/project website was a loading screen for 5+ seconds. I never got a true "professional" impression as I closed the window before it finished.

the outpouring praise of no CSS and no javascript websites in 2021 is tiring and not reflective of what most internet users outside of the hackernews bubble want in my opinion. Your intuition was correct, I am a Senior Frontend Engineer in California.

If the author's "system" is as minimal as that web page, I can see how the advice about being able to run the entire thing would make sense.

Two recent anecdotes from two different projects:

Project 1, Elixir.

> "Run an individual service against mocks" doesn't count.

We got an error in production a couple of weeks ago. The code was pretty old so it was a surprise. Found the problem, a rare case we never thought about, and the culprit: almost all the tests using that function were run against a mock of that function because it requires quite an elaborate database setup to work. The function itself was tested only in a couple of unit tests which exercised only the naive cases. Big mistake!

Project 2, Python.

We spin up EC2 VMs to run some long-running CPU-bound tasks and then kill them. Of course we can do that in a development environment. We're already using SQS in development. And yet spinning up remote VMs takes longer than doing it locally. So we built a couple of scripts around VBoxManage (VirtualBox) to run those VMs locally. Booting them from an SSD is really fast and it works when working offline. We have two classes with the same methods, one wraps VirtualBox, the other wraps EC2. The configuration file of the application picks the right one given the runtime environment.
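The two-classes-one-interface arrangement might look roughly like this. It is a sketch, not the commenter's code: the class names, the `vm_backend` config key, and the choice of start/stop (rather than launching and terminating fresh instances) are assumptions. The EC2 class assumes it is handed a boto3 EC2 client.

```python
import subprocess


class VirtualBoxBackend:
    """Wraps the VBoxManage CLI for local development."""

    def __init__(self, run=subprocess.run):
        self._run = run  # injectable so tests need no VirtualBox install

    def start(self, vm_id):
        self._run(["VBoxManage", "startvm", vm_id, "--type", "headless"],
                  check=True)

    def stop(self, vm_id):
        self._run(["VBoxManage", "controlvm", vm_id, "poweroff"], check=True)


class Ec2Backend:
    """Wraps an EC2 client; `client` is assumed to be a boto3 EC2 client."""

    def __init__(self, client):
        self._client = client

    def start(self, vm_id):
        self._client.start_instances(InstanceIds=[vm_id])

    def stop(self, vm_id):
        self._client.stop_instances(InstanceIds=[vm_id])


def make_backend(config, ec2_client=None):
    """Pick the backend named in the (hypothetical) application config."""
    kind = config["vm_backend"]
    if kind == "virtualbox":
        return VirtualBoxBackend()
    if kind == "ec2":
        return Ec2Backend(ec2_client)
    raise ValueError(f"unknown vm_backend: {kind}")


# The rest of the application only ever sees start() and stop().
backend = make_backend({"vm_backend": "virtualbox"})
assert isinstance(backend, VirtualBoxBackend)
```

Because callers depend only on `start` and `stop`, the job-running code is identical in both environments; only the backend's behavior (boot time, networking) differs.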

This is an incredibly bold assertion:

> Developers of large or legacy systems that cannot already be run in their entirety during development often believe that it is impractical to run the entire system during development. They'll talk about the many dependencies of their system, how it requires careful configuration of a large number of hosts, or how it's too complex to get reliable behavior.

> In my experience, they're always wrong. These systems can be run locally during development with a relatively small investment of effort. Typically, these systems are just ultimately not as complicated as people think they are; once the system's dependencies are actually known and understood rather than being cargo-culted or assumed, running the system, and all its dependencies, is straightforward.

In my 28 years of professional experience working in 9 different organizations, only one of them did fully run the core system in development, and another two of them could have done that with enough time and effort. For the other six, there was no reasonable way to run the full system in development.

> Typically, these systems are just ultimately not as complicated as people think they are

In fact, most long-running systems are quite a bit more complicated than people think they are.

Granted, a variable but usually large chunk of that complexity can be removed/simplified, but at their core, these 'legacy' systems, the ones that make real companies work, do their job because they have, over many years, figured out how to successfully come up against all kinds of real world complications.

To be clear: I get what (I think) this person is fundamentally trying to say: the more expansive/end to end a development environment is, the better. That it's better to have as wide a view into the system you're working on as possible. Good stuff!

Aren’t you making the point of the author?

Most places don’t have a way to run in development because they haven’t sat down and gotten the application to a running state. Every year it gets ignored, things get harder and more tedious, but never impossible.

I see both sides of this argument. Often legacy systems are tied to their deployment methodology/dogma and the world moves on from that model. Without a lot of resources poured into maintaining that methodology and upgrading it, the expected environment can age out of current practices far enough that running it locally does take a lot of effort.

I think docker is an obvious enabler of this because people can just spin up an image from 6 years ago in a container and keep running against old libraries/dependencies. Even worse is depending on that container makes it difficult to upgrade dependencies to modern versions of software because the container is so old.

Maybe a corollary to the article is, "without constant maintenance, software becomes un-runnable."

It seems tantamount to declaring that a system can never interact with the outside world.

If I'm creating tax filing software, do I need to spin up a local instance of the IRS to test with?

> If I'm creating tax filing software, do I need to spin up a local instance of the IRS to test with?

No, integrate directly with the test endpoints provided by the IRS [1].

[1]: https://www.irs.gov/pub/irs-schema/Portal%20Transition%20Gui...

You can possibly stub out the IRS with realistic test cases that you care about.

> "Run an individual service against mocks" doesn't count.

> "Run most services in a mostly-isolated development environment, calling out to a few hard-to-run external services" doesn't count.

I think they're talking about services that you or your firm develops, or maintains, or ships. In this case the IRS is none of those things.

Yes, obviously this. People who don't understand this either just want to argue, or haven't thought about this at all.

In the spirit of the article, I think that the approach would be to ask your partners to provide a sandbox/test environment that your sandbox/test environment can interact with, or test accounts in production at least.

> In my 28 years of professional experience working in 9 different organizations, only one of them did fully run the core system in development, and another two of them could have done that with enough time and effort. For the other six, there was no reasonable way to run the full system in development.

Only ~20 years here but I've worked almost exclusively on three or four large systems with test environments despite their complexity. I think it may be the nature of the companies/business that makes the difference.

Ancient (1980's code) hospital ADT+billing system (medseries 4 if anyone cares) on an AS/400. We kept the last generation of hardware around to run the test/demo environment which was end-to-end connected via HL7 to all the other test environments. That included the other thing I managed, Cerner Millennium, a hodgepodge of a medical records system on AIX and Oracle, where again the old hardware got used as the test and build systems. The test environment was critical because even though it was made up of a dozen commercial software packages talking to each other over HL7 and flat files (and serial for lab devices and who knows what else) it was critical to make sure the whole environment worked when doing upgrades or new development. We kept a full copy of prod data for final integration testing. The tooling for (most of) the software supported replicating environments.

The largest example is Photon/Ubiq at Google; end-to-end test environment running with deterministic sampling to save resources at all stages of the pipeline so everything from injection through logjoining and aggregation to the final database updates were continuously running the latest code for testing before a release to prod. Thousands of prod machines devoted to it. That was in addition to the integrated tests and unit tests during the development process.

At FB the statistical counter system I worked on also kept a continuous test pipeline running, again with event sampling to reduce cost, to ensure everything would work properly before a new release.

My takeaway is that where it matters companies are willing to spend the money and people to have full end-to-end testing, often in realtime, and have some leeway to save on resources with subsampling, smaller datasets, etc.

> We kept the last generation of hardware around to run the test/demo environment which was end-to-end connected via HL7 to all the other test environments.

If I understand right (and I may not, both your description and the OP), this may violate the OP's prohibition I've quoted below. What do you think?

> "Run an individual service in a shared stateful development environment running all the other services" doesn't count. A shared development environment will be unreliable as it diverges more and more from the real system.

The OP seems to be awfully strict.

"Run all the services that make up the system in an isolated development environment" was maintained in all the test environments. It'd be a moderate disaster to mix live patient/billing data between prod and test and so the test systems linked together with independent test/demo/prod HL7 gateways.

Right. If I understand it, it does seem to me to be an acceptable and reasonable solution, and I can't think of a better one.

But isn't it the "shared stateful development environment" the OP is saying is unacceptable? Since every developer doesn't get their own "last generation of hardware"?

But I may be misunderstanding the OP?

I am not trying to criticize what you did; I am trying to understand and possibly critique the OP. The OP seems unrealistic to me, despite its protestations that it really isn't; but I may be misunderstanding the OP.

Shared stateful development environment implies that "hard-to-test" things are not tested.

In that sense, sure; we had one active directory, one physical network, one physical workstation to trigger commands, one office chair, no test version of me, etc.

So if the OP's language is not careful then it could include things in "the development environment" that don't make sense.

I'll agree in general with the sentiments of the author, but for a lot of people it isn't practical, because things are already gigantic, overcomplicated, etc, and it's out of your hands. So instead:

Be prepared to run _any part_ of the system locally, like databases, message queue thingies, web applications & services, etc. This gives you lots of flexibility and control you wouldn't have otherwise. The up-front work can be painful, but it pays off over time.

Know how to proxy missing parts: For example, have a local nginx server proxying requests to a shared server that can handle some things.

Set up integration tests that actually bang on running applications (over, say, HTTP) instead of testing little blocks of code. Even if these only run on-demand (full runs may take hours, require overnight scheduling, bleagh) they'll be super-handy.

Ultra-distributed systems are a nightmare, but you do what you gotta do. In the end you'll end up being your own "devops" engineer and know your architecture better.

Edit: A lot of people think of "mocks" as unit-testing types of things, but "mocking" an entire web server is often the way to get out of an integration-testing pinch, especially when some production-only remote thing interferes with the critical path you're on.
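As an illustration of "mocking an entire web server", here is a sketch using only the Python standard library. The "rates" service, its `/rates/USD` route, and the `fetch_rate` client are hypothetical stand-ins for whatever remote thing is in the way.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class StubHandler(BaseHTTPRequestHandler):
    # Canned responses for a hypothetical remote "rates" service.
    ROUTES = {"/rates/USD": {"currency": "USD", "rate": 1.0}}

    def do_GET(self):
        body = self.ROUTES.get(self.path)
        status = 200 if body is not None else 404
        if body is None:
            body = {"error": "not found"}
        payload = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, format, *args):
        pass  # keep test output quiet


def start_stub():
    """Start the stub on a free local port in a background thread."""
    server = HTTPServer(("127.0.0.1", 0), StubHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


# Ignore any configured HTTP proxy, since the stub is local.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def fetch_rate(base_url, currency):
    """Code under test: talks real HTTP, unaware it is hitting a stub."""
    with _opener.open(f"{base_url}/rates/{currency}") as response:
        return json.load(response)["rate"]


server = start_stub()
base_url = f"http://127.0.0.1:{server.server_address[1]}"
assert fetch_rate(base_url, "USD") == 1.0
server.shutdown()
server.server_close()
```

Unlike an in-process mock, the client code exercises its real HTTP path (URLs, headers, JSON decoding), so only the remote side is faked.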

In hardware, it's not enough to design the object itself. Objects need to be made. At large scale, they need to be manufactured. When you manufacture novel objects, you must also design all of the object-specific tooling that goes into producing your thing, like molds and fixtures, which are just as important in producing your object as the design of the object. It could not be made at scale without designing those fixtures and molds. For complex objects, creating effective molds or fixtures is a high-skill job whose difficulty is equivalent to that of the object designer.

After working on developer infrastructure at a big company for a few years now, what I've realized is that most companies don't spend very much time at all designing the 'molds' and 'fixtures' their software teams need in order to manufacture a worthwhile object that they know will work. They pay lip service to the importance of testing and solid dev tooling, but they're reluctant to invest in it. (I don't blame individuals here — they're behaving rationally given market pressure — but it's a sad state.)

At my organization I'm unable to get even a virtual machine instance (not local) for testing and CI, yet they are perfectly content with testing software on the production server... Actually they are able to spin up some virtual machine for me - but without root access... So if I need some software or library I need to ask them and wait for days...

Plot twist: can't run any virtualization software on my development laptop and we are cut off from "the internet" (so no ssh or remote cloud).



Developers need full access to their machines or they are stifled.

This organisation is not interested in you doing quality work.

Your options are therefore to either resign yourself to not doing quality work, or leave that organisation, ideally making sure that management knows exactly why you're going.

(Of course, the chance of management actually caring is about one in thirty seven trillion - if they cared about quality of work or employee satisfaction, your machine wouldn't be locked down. So not bothering to tell them is fine too. Tell them if you're feeling charitable)

Bring the sysadmin a box of donuts (or scotch, if it’s that kind of place) and ask again.

Barring that, if you have admin rights and your laptop is running Windows you can enable Hyper-V[0] and get a great hypervisor that comes in the box. (If you’re already using WSL2, you’re already doing virtualization). If you’re on macOS, try to find something that uses Hypervisor.framework (multipass[1] is good if you can get along with Ubuntu) so you don’t need to install any kexts.

But this is really something you should bring up with your manager, it sounds like it’d be difficult to get any good work done without being able to run tests in a not-production environment.

0: https://docs.microsoft.com/en-us/virtualization/hyper-v-on-w...

1: https://multipass.run/

Beer and donuts don't work as it's "policy". As for Hyper-V: virtualization is disabled at the BIOS/UEFI level, and I'm unable to change it.

Canary deployments. Don't replace a production version, but install the new version next to it. Switch between versions via loadbalancer or use queues to decouple different implementations from storage. Adjust your development process to favor small incremental backwards compatible changes and feature flags (make deploying your code separate from enabling/disabling a feature). Lots and lots of monitoring and alerting.

Life finds a way.

Within the boundaries of the space is a solution, we just don't know where the boundaries are most of the time.

A bunch of years ago we were in a similar predicament: we needed to tune some software that was data dependent, yet we were cut off from prod data, and the test data was hot garbage.

It took weeks to get new software into production. We had one code push left before launch but we still needed to tweak with production traffic. So we did what everyone does, ship remote firmware updates. In the last code push we included `eval` and the ability to POST code to an endpoint. We were just doing some query rewriting, but the code could have run anything.

Once the last build made it through dev and test, we had a serverless deployment platform that allowed us to dynamically update the code in realtime.


What is keeping you at this company?

Very good money for my skillset (IMO I'm a below-average programmer), "prestige" (the company is in the TOP 5 in its field), and the things I do are not ordinary boring stuff like web apps - but rather specialized stuff.

The interview process in tech is often long and laborious, and there’s no guarantee the next place will be better.

Tell your manager you can't get work done up to industry standards. The high-level problem here is one of resource allocation and policy: you will have a similar technical problem in the future and you need to be confident that it's solvable, which requires management to be willing to resource solving it. You need either some other team's management to decide that this pace of development is something they will prioritize solving or your management to "go rogue" and get you set up on public cloud.

But if you want a solution to this specific problem, the first suggestion I'd have is to install a non-root-requiring package manager like Linuxbrew or Nix (which is better if you can get a directory /nix writable by you, but workable even if not). Then you can install software without contacting the sysadmins and waiting.

Alternatively, do you have unprivileged user namespaces, i.e., does "unshare -Urm" work and get you an apparent-root shell? If so, you can install rootless Docker in your homedir, or use https://github.com/containers/bubblewrap . (A few years back I wrote https://github.com/twosigma/debootwrap which ties together bubblewrap and debootstrap to get you an unprivileged Debian chroot.)

Get them to install docker on the VM and add you to the docker group. You can now use docker like VMs to run tests.

So this is a sample of the reason why I like Elixir (or Erlang). Your 'traditional' microservice architecture, running on Google/Amazon/Azure infrastructure, using lots of their pieces is very hard to test in an end to end way and you've got no chance of running it up locally. Kubernetes provides an abstraction layer that improves this but it's not perfect. Even a Django framework probably has a load of workers doing stuff off process.

In Elixir, what you've got is a lot of Elixir. Sometimes you need other external services and there's some work there, but you can often get a surprising amount of your system in one codebase, running on one or more instances of the VM. It makes testing SO much easier.

We've seen one counterexample recently: NASA Perseverance Rover.

They emphasized that this was their first time running everything in the real environment, although I'm 99.999984% certain that they've tested it in the closest setup possible. I'm also certain that there must have been a few glitches here and there, but the overall system worked. There's a process of mitigating and containing the faults and failures, and NASA is known for that.

I don't disagree with the principle of OP, but just repeating it in a dogmatic way doesn't make things better. I want to see practical examples and ways to fill the gap (between the dev and prod).

Most hardware engineering is done ahead of time without a full production style environment. This is because the cost of iterating is much too high. You can't build a bridge every time you want to try a new cable or bolt. It forces designers to make models and assumptions about their systems and, inherently, puts downward pressure on complexity. It also forces them to truly understand the principles behind what they are building.

The fact that Perseverance and other Mars rovers have been successful is amazing and took an incredible amount of work to accomplish. These are complicated systems that were vetted using models and simulations without ever having been run "in production". This comes at a high cost.

Critical software is never tested in production or run "in system" before it is deployed. Airplanes, banks, medical systems all require extensive validation through testing on models and simulations. You can't test your changes for the first time on a live aircraft or living tissue. Costs reflect that.

The truth is, a lot of software is not critical. You can get away with hacking / trial and error type development and never fully understand the system you are helping build. Frankly there is a lot of money to be made providing brand new services that are unreliable or quirky or ephemeral because those services never existed before.

My point is that how you test and validate your software product depends on your application. Sometimes the costs of "running everything" don't make sense, and sometimes it's physically impossible. I agree that you should always advocate for the highest fidelity testing your business case can afford, but be prepared to settle for less than everything and rely on your engineering skills to buy down risks in the gaps.

Such terrible advice. Several times I've been involved in companies where this approach was followed, and in others where you could only run your own service with mocks, and we had far, far better results with the latter approach.

Just imagine an Amazon developer running Amazon on their own laptop (or cluster, or server, or whatever... multiply it by the number of devs... it is insane).

I once met a guy who was trying to debug an issue he was having in production on a sprawling Django code base which he couldn't run in any other environment. It was impossible until he got the system running in the local environment.

I also came on once to a set of projects where there wasn't a good separation of the environments; test and prod shared the same database, b/c it was "too hard" to replicate the prod data back to test. Consequently, no real DBA work could be done b/c it couldn't be tested, only small hot-patchy kind of work.

Separation of concerns is a slightly different problem, but the root cause of the issue is similar: you need to be able to replicate the complexity of your production system as much as possible in a lower environment, otherwise you won't be able to make changes as significant as you'd like to your system.

I don't think this is terrible advice. While I recognize there are limits to what is possible to run locally or in lower environments, the central notions of 1) separation of concerns and 2) matching the complexity of the lower environment to production as much as possible are, I think, good advice.

So many +1s

> Developers of large or legacy systems that cannot already be run in their entirety during development often believe that it is impractical to run the entire system during development.

This is just a flat out ridiculous statement this author makes. My wife works on an automation system intended to respond to incoming events in the form of positive tips in surveillance system collections that in turn cues further tipping rules to retask those collection systems. One of the external dependencies is the entire spy satellite control system of the United States. How are you supposed to run this in your development environment? Some things you have to mock.

I think the spirit of the article is: you should be able to "download real data from yesterday until now", and be able to run the system locally (or in a stage server wherever). Sure, with some mocks or not, but that's super convenient for testing, debugging, etc. without having to risk nasty "write tests" in production

Downloading real data at a lot of companies sounds like a PII nightmare.

Stating the obvious but the problems start when you pretend you're Amazon on 1/100000th of Amazon's budget, and you can't keep up with the costs and complexities of trying to be like them. The author's advice is probably derived from smaller-scale experience - not necessarily even startup work, just low-budget, small, inexperienced teams, etc.

ha you'd be surprised how many services run their own development instances per developer

This post to me reads like the author has given up on defining contracts between systems.

While I think it's useful to be able to exercise a system for real, I generally prefer to do it in production rather than attempt (and fall short of) reproducing production in an isolated environment.

Meanwhile, I'll invest my time in defining and understanding the contracts between services, and develop against those contracts so I don't have to run the entire stack of connected pieces every time I want to make a small change.

Yes, if you can not run the real thing locally you should not be touching a production system at all.

I worked in so many places where nobody knew how the system worked and the most invaluable thing I do is to bring the production system into localhost.

It pays off real quick and people get amazed how fast I can solve long standing issues in the system by using tcpdump and co.

> In my experience, they're always wrong.

In my experience, if someone claims that someone else is always wrong, then he is wrong.

> "Run most services in a mostly-isolated development environment, calling out to a few hard-to-run external services" doesn't count.

That's pretty harsh, as it seems to rule out any system that integrates 3rd party features, such as payments.

Payment processors all (?) have a test API you can run against. Stripe, for instance, gives you a separate account with an API key that behaves the same but no money moves. You can test your whole process including exceptions, reconciliation, and chargebacks.

Running a service that handles payment without testing the entire flow is insanity.

external services don't always have a test API, and it's not always as cut and dry as processing user payments. sometimes it's less important than that, and testing it would require replicating an external server to be able to test it. in those cases the tradeoff of time spent vs. value isn't as clear.

I don't think GP was saying they didn't test, they are calling out to the test apis, still 3rd party. One should go a step further and stub out or have a local payment processor so that everything runs locally and with no delay.

That breaks the "no mocks" rule.

"Mostly-isolated development environment" is more important here than "external services", IMO.

I've been at organizations where local development required connecting to a MSSQL instance in a server closet at HQ. This is unacceptable, and devs will constantly be at war with each other.

Mocking your analytics provider (or whatever) seems much less problematic to me.

Yeah. I think the rule should be that you need to be able to run your system; at the point where you're interacting with an external system (one that you don't fix the bugs in yourself), running your system "locally" should treat it as an external system too.

Stubbing out payments is really nice: you can have end-to-end integration test flows that execute in milliseconds. Use the affordances that the concept of interfaces provides.
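To make the stubbing idea concrete, here is a minimal sketch of what it might look like. Everything in it is hypothetical (`PaymentGateway`, `FakeGateway`, `checkout`, the `ch_N` charge IDs); it is not any real processor's API, just an interface with an in-memory implementation behind it.

```python
from dataclasses import dataclass, field
from typing import Protocol


class PaymentDeclined(Exception):
    """Raised by a gateway when the charge is refused."""


class PaymentGateway(Protocol):
    """Hypothetical interface; a real adapter would wrap the processor's SDK."""

    def charge(self, customer_id: str, amount_cents: int) -> str: ...


@dataclass
class FakeGateway:
    """In-memory stand-in: no network calls, so flows run in milliseconds."""

    declined_customers: set = field(default_factory=set)
    charges: list = field(default_factory=list)

    def charge(self, customer_id: str, amount_cents: int) -> str:
        if amount_cents <= 0:
            raise ValueError("amount must be positive")
        if customer_id in self.declined_customers:
            raise PaymentDeclined(customer_id)
        self.charges.append((customer_id, amount_cents))
        # Made-up ID format, only for illustration.
        return f"ch_{len(self.charges)}"


def checkout(gateway: PaymentGateway, customer_id: str, cart: dict) -> dict:
    """Application code depends only on the interface, not on a vendor."""
    total = sum(cart.values())
    try:
        charge_id = gateway.charge(customer_id, total)
    except PaymentDeclined:
        return {"status": "declined", "total": total}
    return {"status": "paid", "total": total, "charge_id": charge_id}
```

A test can construct `FakeGateway(declined_customers={"bob"})`, run `checkout` for both a good and a declined customer, and inspect `gateway.charges` afterwards, all without leaving the process.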

We used a variant of this[0] when I was at Canva (and I'm sure they still do). Having a hard requirement that you are able to fake out any external dependency in dev environments provides a massive boost to dev speed. The investment is tiny too - often a single interface and a day of dev time to build out a functional fake service for, say, S3 or SQS.

[0] https://product.canva.com/hermeticity/

Is it possible to use, say, S3, and meet these criteria?

(Then, something more complex or special purpose, like say Amazon Elastic Transcoder or something).

I had to actually do something like this.

Due to the hilarious nature of the business, getting an S3 account to use for 'local' (i.e. dev machine) development was a huge chore.

We wound up creating an Interface to abstract our usage of S3, and wrote a 'local' provider that just used the local filesystem. Of course, many use cases that would not work....

But the big -advantage- of it was, we at least knew that if we ever wanted to switch away from S3, it wouldn't be hard. The Interface had a spec that could be used to ensure that any plugin would work with our application's usage.

It wasn't -that- much more work.
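For anyone curious what that shape might look like, here is a rough sketch, not the code described above. The names (`BlobStore`, `LocalBlobStore`, `check_blob_store`) and the three-method surface are assumptions for illustration; a real S3-backed provider would implement the same interface and be run through the same spec function.

```python
import tempfile
from pathlib import Path
from typing import Protocol


class BlobStore(Protocol):
    """Hypothetical interface covering only the operations the app needs."""

    def put(self, key: str, data: bytes) -> None: ...
    def get(self, key: str) -> bytes: ...
    def delete(self, key: str) -> None: ...


class LocalBlobStore:
    """Filesystem-backed provider for dev machines (stands in for S3)."""

    def __init__(self, root: Path) -> None:
        self.root = Path(root).resolve()
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, key: str) -> Path:
        path = (self.root / key).resolve()
        # Refuse keys that would escape the root directory.
        if self.root not in path.parents:
            raise ValueError(f"invalid key: {key!r}")
        return path

    def put(self, key: str, data: bytes) -> None:
        path = self._path(key)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)

    def get(self, key: str) -> bytes:
        try:
            return self._path(key).read_bytes()
        except FileNotFoundError:
            raise KeyError(key) from None

    def delete(self, key: str) -> None:
        self._path(key).unlink(missing_ok=True)


def check_blob_store(store: BlobStore) -> None:
    """Spec that any provider (local or real) should pass."""
    store.put("reports/a.txt", b"hello")
    assert store.get("reports/a.txt") == b"hello"
    store.delete("reports/a.txt")
    try:
        store.get("reports/a.txt")
    except KeyError:
        pass
    else:
        raise AssertionError("deleted key should be missing")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        check_blob_store(LocalBlobStore(Path(tmp)))
```

The spec function is the part that buys portability: any replacement backend only has to pass `check_blob_store` to be a drop-in. Features the local version cannot imitate (presigned URLs, lifecycle rules, and so on) still need testing against the real thing.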

> We wound up creating an Interface to abstract our usage of S3

Which violates these requirements, right? Using something that mocks S3 instead of actually being S3?

(If I was going to do that, I'd just use open source Minio S3 clone, assuming it adequately cloned the S3 features I needed).

I think there is always an ideal case, but if you can't get there 100%, doesn't mean you shouldn't try, ending up with 95%.

Well, if you use S3 then it has to be, because what this person says HAS to be true. So, yes, you can use S3, so we need to carve out external services to say we don't need to run those services personally. So, what if you have a separate business unit that you don't share code with that created the service? Well, I guess we will make an exception for that too, but they REALLY need to make sure their stuff works. Ok, what about the group on the 4th floor? I don't really want to run their code because wow that's a hassle and it is on a completely different stack and test system. Well then just treat it as external. Same with J's service, they sit next to me. And it's really the same stack but it's still hard to run it as a complete system to test. Well that's ok too then I guess. Ok we are back to microservices and why they are useful.

All that said, I agree with the sentiment of the post, testing the system as a whole can reveal issues before they get to prod. But maybe that means you need to harden your contracts and internal SLAs.

Sure, if you use a separate bucket for your test instances, and reset it between tests.

This can probably work with more complex services too.

to avoid conflict between two entities running tests at once (say via CI), I guess it would need to use Amazon API to construct the bucket and tear it down around each test?

But I'm not sure that would comply with the requirements, they seem to forbid running against external services at all, no? '"calling out to a few hard-to-run external services" doesn't count.'

Yep, to use AWS S3, you'll need a lock to ensure a single job at a time, or create buckets programmatically for each test.
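The "create buckets programmatically for each test" option can be sketched roughly like this. The `create` and `destroy` hooks are hypothetical callables supplied by the caller, so the sketch makes no claims about any particular S3 client; with a real client, `destroy` would also have to empty the bucket before deleting it.

```python
import uuid
from contextlib import contextmanager
from typing import Callable, Iterator


def unique_bucket_name(prefix: str = "ci-test") -> str:
    """Random suffix so concurrent CI jobs never share a bucket."""
    name = f"{prefix}-{uuid.uuid4().hex}"
    # S3 bucket names are limited to 3-63 characters.
    if not 3 <= len(name) <= 63:
        raise ValueError(f"bucket name has invalid length: {name}")
    return name


@contextmanager
def temporary_bucket(
    create: Callable[[str], None],
    destroy: Callable[[str], None],
    prefix: str = "ci-test",
) -> Iterator[str]:
    """Create a bucket for one test and always tear it down afterwards."""
    name = unique_bucket_name(prefix)
    create(name)
    try:
        yield name
    finally:
        # Runs even if the test body raises, so failed runs don't leak buckets.
        destroy(name)
```

Each test wraps its body in `with temporary_bucket(create, destroy) as bucket:` and never sees another job's data, which removes the need for a global lock.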

I think every team needs to decide how much to follow the original advice, there is certainly a balance between simulating external service locally (with possible associated behavior changes) or using an external service.

If you do want to run everything locally, there are servers which offer S3 API, like "minio" and "fake_s3". Probably would be very nice for localhost operation. But if you need a complex thing like Transcoder this is not going to work.

I know it says "no mocks", but you could use Minio to have a local S3 alternative. Their client can even be used as a frontend to both S3 and a Minio instance.

That's available for S3, although I think it would still violate OP's requirements, but in spirit it'd be good.

Not usually available for most other AWS services you might be using though. Say, Elastic Transcoder for transcoding A/V media.

I would bet 78% of managers trying to get their developers to go faster will coach them to try to get that PR done in one day instead of 3 RATHER than investing the three days in taking heed of this advice. The main reason developers are slow is because of problems described in this post. But, investing in that change will be really, really scary.

There really needs to be an expectation during sprint planning meetings that developers get a chance to adjust a "multiplier" for all story points, as a way to keep in everyone's minds the long-term costs of short-term decisions to cut corners by foregoing proper testable designs and good engineering practices (including security).

One approach I've suggested in the past is that every time a developer has concerns about a design and is over-ruled by a manager, the developer should take a jelly bean from a jar that starts full of jelly beans. The team can keep track of the weight of the (contents of the) jar, and use the inverse of the "fullness" fraction as a multiplier for how long a ticket will take.

For example, if a story has a complexity of 2 points, but the jar is two-thirds full, then the ticket should be expected to take as long as a 3 point ticket would. That might not be exactly right for every ticket, but this process might at least be more accurate than ignoring the hidden cost of accumulating technical debt.

Of course this means that once the jar is empty, all estimates become infinity, or undefined, but if it ever gets to that point then at least the developers will have had some jelly beans to cheer them up. Similarly, a manager can have the satisfaction of "seeing" developer productivity increase when, at the end of a sprint focused on paying down technical debt, the developers add more jelly beans to the jar, to restore the fraction towards unity.
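The arithmetic is simple enough to write down. A small sketch, assuming the jar is weighed in whole grams (the function and parameter names are made up for illustration):

```python
import math
from fractions import Fraction


def adjusted_points(points: int, current_grams: int, full_grams: int):
    """Scale an estimate by the inverse of the jar's fullness fraction."""
    if full_grams <= 0 or not 0 <= current_grams <= full_grams:
        raise ValueError("need 0 <= current_grams <= full_grams and full_grams > 0")
    if current_grams == 0:
        return math.inf  # empty jar: the estimate is unbounded
    fullness = Fraction(current_grams, full_grams)
    return points / fullness


# A 2-point story with the jar two-thirds full behaves like a 3-point story.
assert adjusted_points(2, 200, 300) == 3
```

Using `Fraction` keeps the two-thirds case exact instead of producing something like 3.0000000000000004.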

Sounds a lot like the budget described in the Google SRE book. Agreed.

And then it gets interesting when you are not able to run your system by yourself. Which happens in plenty of circumstances.

E.g., you are writing software for a planned device which will operate on Mars. First, the device doesn't exist yet; second, it's hard to simulate the Mars environment locally.

I did such projects for very expensive devices. Think of nuclear power plants or Formula 1 race cars. Thankfully you cannot complain loudly "I need to be able to run this thing locally!". Your job is also to figure out how to simulate the environment, and to do a proper job without being able to test everything beforehand.

I’ve always been curious, if you’re working on a really complex system with lots of disparate services (or even if you are using a managed database like Spanner for example), what does your development environment look like? Do you spin up containers for all the services? Run a compatible RDBMS instead of the managed database? All my experience has been with systems that can be set up locally - how do you go about developing/testing/debugging without that?

What does "system in its entirety" even mean? Should an Amazon developer be able to run every Amazon property simultaneously on their laptop?

As long as you have strict contracts between parts of your system (whether that is in a form of microservices, IPC, classes or whatever else), a change made in isolation should be perfectly fine.

I mean 'users' are part of the entire system

Tell that to people who design rockets.

They seem to be given an allotment of several tries that result in blowing up their systems.

Imagine you're working at MS and have to build/run windows from VS.

Monolith, sure.

Microservices, no.

The easiest way to follow this advice is to bang on your production system directly.
