
> So what have we now? A “mono repo” codebase, because clearly a Git repository per function in your system would be too much, a large deployment descriptor per fine-grained component, which Spring maybe called “Controller” but is now called “Function”, and instead of combining them all on your desktop, you send them off to someone else’s mega-mainframe [AKA "the cloud"]. You deploy, get an error message, and login to CloudWatch to see what actually happened - it’s all batch-driven, just like the bad old days, so progress is slow. At least we’re not having to walk to the printer on every try, but that pretty much sums up the progress of the last half century. Oh, and a “Function” will be able to handle a single request concurrently, so here’s your AWS hosting bill, we needed a lot of instances, I hope you won’t have a heart attack. Yes, we run workloads that your nephew can run on his Raspberry Pi 4, but this is the future of enterprise.

I've mentioned as much to several C-suite executives and professional venture investors I know who tend to wax philosophical about "exponential growth in cloud services and big data" and whatnot. Invariably, their response has been to tell me that I just don't get it, or that I make no sense. They give me puzzled looks and maybe even get a bit exasperated with me.

Surely others here on HN have run into similarly minded individuals too. They're not hard to find.

Plus ça change, plus c'est la même chose.




Everyone's chasing "scalable", mostly for no reason at all.

There's also this enduring myth, repeated even here, about how much you save by not having "devops", when only very big companies needed it in the first place. In reality, a few minutes per month is the most devops any startup running its own VMs or servers would need, after an hour or two of setup. And that hour or two is often less than the faffing around needed to get all the labyrinthine moving parts of AWS/Azure/etc. working.

And the whole thing becomes a transparent farce when you look at providers like Azure, whose own admin interface is so slow and prone to random breakage that you have to ask:

How is this scalable?

It's clearly not scalable for themselves, so how is it "scalable" for everyone else?

There are benefits - for me, mainly hassle-free deploys[1] - but the whole thing sometimes feels like a massive con to sell the performance of a potato by advertising it as a jet engine.

[1] Like a month back I deployed a demo prototype web app I wrote in 3 hours onto Azure straight from GitHub with about two clicks, and got the contract - can't really beat that.


> There's also this enduring myth, repeated even here, about how much you save by not having "devops", when only very big companies needed it in the first place.

I dunno, the first company I worked at was very small, under 30 people, sales included, and I was hired on as an SA. If they had been using serverless, they would not have needed to hire me.

Deployments and system upgrades were very toilsome. The dev lead would spend a few hours deploying new code, and we’d only do it around midnight in case there was a problem (our customers were almost entirely US based). I spent a lot of my time automating OS upgrades on the web servers behind our proxy. All of this work would have been unnecessary for that small company if they had been using serverless.

Additionally, as mentioned before, the majority of their traffic was during the day, so they could have scaled down automatically at night, cutting costs. Would this be cheaper than running machines 24/7? I’m not sure, but I can see why you’d be interested in at least looking into it.


> Deployments and system upgrades were very toilsome. The dev lead would spend a few hours deploying new code, and we’d only do it around midnight in case there was a problem (our customers were almost entirely US based). I spent a lot of my time automating OS upgrades on the web servers behind our proxy. All of this work would have been unnecessary for that small company if they had been using serverless.

Not using serverless doesn’t mean there can’t be a CI/CD pipeline.


Yeah, of course. I’m not saying that serverless is the only CI/CD solution; many businesses build CI/CD pipelines on top of serverless APIs. However, most serverless platforms take care of a lot of the deployment process out of the box, which would have been enough for the business I was describing in my comment.


You don't even need serverless for those things. Any of the automated deployment/orchestration systems of the last ten years solves those problems (Ansible, Terraform, etc.).
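
For the kind of toil described upthread (OS upgrades, code deploys), it can be as small as this - a rough sketch, where the "webservers" inventory group and the playbook name are made up:

    # unattended OS package upgrades across every host in the "webservers" group
    ansible webservers -b -m apt -a "update_cache=yes upgrade=dist"

    # roll out the application with whatever playbook encodes your deploy steps
    ansible-playbook deploy.yml --limit webservers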


Yeah, it’s true, but it requires a lot more config. The job I mentioned in my comment involved thousands of lines of Ansible to perform operations you wouldn’t even need to consider, or could perform in a few clicks, when using a serverless platform.

Don’t get me wrong: serverless platforms are not a CI/CD silver bullet. Many organizations build pipelines on top of serverless deployment APIs, but serverless certainly does take care of a lot of things for you, not to mention the other features mentioned elsewhere in my comments.


You're supposed to encapsulate those in reusable roles; see an example methodology here: https://yourlabs.org/posts/2020-02-08-bigsudo-extreme-devops...

And since we have containers it's become much easier, actually: deploy a Traefik container that watches the dockerd socket and self-configures HTTPS and routing when your other script starts a container. Really, it's much easier than back in the day: https://github.com/jpic/bashworks/tree/master/vps
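
Roughly like this - a minimal sketch, where the domain, e-mail, image name, and Traefik version tag are placeholders:

    # Traefik watches the Docker socket and builds its routing/TLS config from container labels
    docker run -d --name traefik \
      -p 80:80 -p 443:443 \
      -v /var/run/docker.sock:/var/run/docker.sock:ro \
      -v traefik-acme:/letsencrypt \
      traefik:v2.10 \
      --providers.docker=true \
      --providers.docker.exposedbydefault=false \
      --entrypoints.websecure.address=:443 \
      --certificatesresolvers.le.acme.tlschallenge=true \
      --certificatesresolvers.le.acme.email=admin@example.com \
      --certificatesresolvers.le.acme.storage=/letsencrypt/acme.json

    # any app container started with labels gets picked up automatically, certificate included
    docker run -d --name app \
      --label "traefik.enable=true" \
      --label "traefik.http.routers.app.rule=Host(\`app.example.com\`)" \
      --label "traefik.http.routers.app.entrypoints=websecure" \
      --label "traefik.http.routers.app.tls.certresolver=le" \
      myorg/app:latest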


Wasn’t the original post about avoiding a gazillion lines of config in the first place?


Lines of config are like lines of code: refactoring them regularly is basic hygiene, like brushing your teeth. You can "have no config lines" if you go ahead and click through your config, but then you don't have "infra as code" anymore, so that's also "going back to the 70s" I guess.


When your infra is in AWS, you need an "AWS Certified DevOps"[0] instead of a "typical DevOps" - I mean, the typical OSS hacker - which is kind of sad really, and makes me wonder: is this how we're going to build an Idiocracy[1]? Anyway, things like backups, monitoring, alerting, and debugging are going to be a requirement anyway, and for me that's harder when locked into proprietary stuff such as AWS; I'd rather have OSS I have the freedom to hack ;)

[0] https://aws.amazon.com/certification/certified-devops-engine... [1] https://en.wikipedia.org/wiki/Idiocracy#Plot


I don’t think your past experience reflects current practice.

My reality at a 20-person shop with a few devs is much closer to the OP's - a few hours' setup (we save the cloud config and can remake servers in a few minutes), servers run for months without intervention, uptime is on par with cloud services, and deploys take seconds. Managing servers or deploys just isn't a major problem for us.

Serverless is not a solution to deploy problems - mostly because they’re just not significant problems for us; if anything, the dev deploy story is worse than with rented VMs. It’s an interesting take on serving requests and I do believe some variant of it might be the future (a variant more like Cloudflare Workers, IMO), but at present there are serious drawbacks and it is not clearly better, particularly when provided by a predatory company like Amazon.


All you needed was PaaS, which has existed for a long time. Or the even older name: managed services, offered by pretty much any decent hosting provider for decades.

I remember using companies like MediaTemplate and FullControl 10 years ago for clients and not having to worry about OS updates.


Serverless is a PaaS. And yeah, those services could work too. Serverless isn’t the only solution that works.


> In reality, a few minutes per month is the most devops any startup running its own VMs or servers would need, after an hour or two of setup.

Right up until something crashes or some hardware fails.


> Right up until something crashes or some hardware fails.

Unlike the public cloud where Nothing Ever Goes Wrong (tm).

There are never brownouts.

There are never latency spikes.

No undocumented quota limits.

Documented but confusing quota limits.

Unexpected performance issues caused by other customers.

Forced upgrades.

Forced patching.

Shock bills.

No siree Bob, that kind of thing just doesn't go on in the public cloud! It's all rainbows and unicorns....


I fail to see how this supports your claim that you can get by with "a few minutes per month" of devops. In fact, it's a classic red herring.

Your original claim is quite unrealistic and you'd do well to retract it. It's bad advice for public cloud and self-managed efforts alike.


When I used to work for an MSP, we would sell Dell servers to rural banking groups.

We supported disaster recovery and high availability deployments.

Any failed component, Dell would send us a replacement in less than 24 hours.


24? I guess it's that long because it is rural? We use Dell servers and I sleep a bit better having seen their 4hr support replace a motherboard with a bad lifecycle controller within 2 hours of the initial call.

Try calling Amazon and getting that kind of response for your problems.


Yeah, my closest Target store and major airport are over 2 hours away.

But, yeah, that is the worst-case scenario.


But it is a few minutes per month. A properly architected application won’t depend on a single server, so a hardware failure will be as unimportant as it is in the serverless world.


Architecting a system properly is one of the key functions of devops. You don’t get it by investing a few minutes per month, whether public cloud or bare metal.


You have all of that with DIY too, except maybe shock bills, but it's your problem to fix. Like with everything, extremes are bad: all-managed and all-unmanaged are both bad. My personal guideline is to use off-the-shelf services if they are cheaper than DIY (including labor), or if DIY provides no competitive advantage and off-the-shelf is cheap enough. But if DIY is much cheaper or gives a competitive advantage, do it yourself.


With the right setup you do not notice that either. I would also rather hire a good SA and pay 10-100x less monthly (for that difference I can hire an army of devops people here in the EU, by the way), accepting the risk of all hardware dying at the same time (which does not happen - not now, and not in the 90s when I started hosting).

Sure, serverless etc. has its upsides, and in those cases I do use it; I just encounter close to zero companies per year that need it or benefit from it. AWS always benefits, though.


The choice is not only serverless or hosting on bare metal - there are a lot of options in between. You can rent VMs and have the vendor transparently deal with hardware faults, for example, without downtime. Running your own VMs doesn’t mean running your own hardware.

The original claim is not at all unrealistic. I’ve been running a small shop with about 20 VMs without issues for years; uptime seems to me about comparable with serverless or hosted services from the big players, who regularly have small outages, and there really is very little work involved.


I always have a standby copy deployed elsewhere. And I make it a point not to deploy to production until a single script can rebuild a new, clean system from backed-up data, configs, and software packages. So for me it is not a big deal at all. Well, if the hardware dies I have to pay for a replacement, which is not fun, but it is a very rare occurrence.
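
The rebuild script doesn't need to be fancy either; mine boils down to something like this (a trimmed-down sketch with made-up hostnames, paths, and package names):

    #!/usr/bin/env bash
    # Rebuild a clean production box from OS packages, version-controlled configs, and the latest backup.
    set -euo pipefail

    NEW_HOST="$1"   # freshly provisioned machine (or replacement hardware)

    # 1. install the software packages the app needs
    ssh "$NEW_HOST" 'sudo apt-get update && sudo apt-get install -y nginx postgresql'

    # 2. push configs kept in version control
    rsync -a ./config/ "$NEW_HOST:/etc/myapp/"

    # 3. restore the latest data backup from the standby/backup host
    rsync -a backup-host:/backups/latest/ "$NEW_HOST:/var/lib/myapp/"

    # 4. restart services and run a smoke test
    ssh "$NEW_HOST" 'sudo systemctl restart postgresql nginx'
    curl -fsS "http://$NEW_HOST/healthz" && echo "rebuild OK"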


Yes, "the cloud" is bunk from any technical point of view.

It exists to shift CapEx to OpEx for accounting. Any technical considerations are a distant third place on the list.


Well, if you mean CapEx because one needs fewer programmers and a higher AWS bill, then yes I agree with you. But if you mean CapEx because one needs fewer physical computers that they own and a higher AWS bill for computers that they rent, then no I do not agree with you.

First, the markup in just sheer ops terms is enormous. Minimum 4x, and in some cases 20x. Sure, you get a couple more nines of reliability, but 99% of startups don't need that much reliability.

Second, in countries like Canada, where capital expenditure on computer systems can be heavily written down[0], people still use AWS and DigitalOcean by default.

The main reason people don't do colocation is that the time is better spent elsewhere. DigitalOcean is cheap enough. AWS is cheap enough, and if the team doesn't have experience running Postgres, fine, they use RDS. Sometimes it's just sheer laziness at first that later blossoms into "holy shit, we're growing too fast to even think about migrating off of these giant AWS bills", which was essentially Dropbox's story if I recall correctly.

[0] 55% per year, and in practice they allow full expensing in the first year for petty amounts of outlay. For example, if your business is a data centre, sure it's 55% per year, but if you run a flower farm and fully expense a new laptop for sales the CRA is generally fine with it.


I'd guess that the vast majority of "cloud" customers are boring enterprises doing boring enterprise stuff, probably not even Internet-related stuff at all.


That can happen without serverless - renting VMs is OpEx enough.


Yes, if you manage to shift the responsibility of maintaining these VMs onto the developer positions you already have.


Leasing programs did that well before the cloud ever came along.


"The cloud" is a leasing program except even more optimized for OpEx - pay-as-you-go monthly billing and you now don't even need to hire ops people because it's "serverless".

(Of course you have to force your devs to do the ops stuff now, but that's not accounting's or HR's problem.)


> Everyone's chasing "scalable", mostly for no reason at all.

No technical reason, maybe. There is a social reason though: if you point out that there is no need to build for scale (yet), you are pointing out that the business will not achieve overnight success. Even if it's true, it kills excitement and is therefore not something that people want to hear. Who wants to realize that what they're spending large amounts of time on (possibly even neglecting other things like health, sleep, and social contacts) will not even make all their financial dreams come true? Much easier not to think too hard about it and keep on doing what everyone is doing.

So, building out scalable infra is signaling that you too believe in the company and are preparing for the moment (any time now!) that "webscale" is needed. It's showing that you are part of the team.


It's like a well-stocked grocery store with thousands of brands. I think the biggest draw of "the cloud" is the illusion of infinite plenty, where all you need to do is press a button (even though the reality really doesn't measure up, at least not in the ways people naively assume).


Scalability is just one reason for using serverless. There are many others, most of which have to do with being able to focus on the part of your code that actually does something instead of on infrastructure and integration. For example: not having to configure and maintain VMs, containers, or Kubernetes; off-loading part of security onto the cloud provider; and built-in integration with supporting services like logging and metrics.

Serverless, like every design decision, comes with its own set of trade-offs. Whether those trade-offs make sense for your environment is something you have to determine. At my company we have seen a productivity increase and shorter development times using serverless. Of course, that was after the learning curve which is certainly not trivial.


I sometimes think of how far we still are from utility-quality computing services. Every company needs electricity with near-perfect uptime, but almost no company has a full-time electrician in-house.


Does anyone know of a resource showing some parameters for estimating how far different managed servers can go? How many concurrent users, bandwidth, etc.? I have a hard time getting a mapping from expected concurrent users to hardware requirements. How quickly can you adjust the hardware resources in a non-cloud setup if your startup product is more successful than expected?


Impossible question: it depends on what your app does and what your users do. Even within the same company, 1000 concurrent users of Google Maps will use completely different resources than 1000 concurrent users of Gmail. It'll even change over time for the same app and the same users, as their behaviour or the world around them changes.


> by not having "devops"

Apparently, you don't even know what that word means. You cannot "have" DevOps. It's a methodology, not a thing. Besides, DevOps means exactly the opposite of what you are implying: it is the practice of not having a dedicated operations team and letting devs handle as much of the operations for their own applications as they can.

> few minutes per month

I can tell you my team spends a good 30% of its time on operations. Some of that (for example, administering K8s clusters) could be offloaded to cloud providers for money.


Makes me think of those old "Webscale" skits on YouTube.



I think what’s important to remember is that these kinds of “truths” are not the full story.

Yes, a technology probably won’t live up to its marketing.

Yes, what works is probably good enough.

Yes, changing has switching costs and sacrifices institutional knowledge.

Yes, we’ve seen these technologies before and switched away from them for various reasons.

That’s all fine and sometimes frustrating as an engineer. But it’s not the only reason decisions are made.

Sometimes pitching your exec team on The New Hotness gives cover for changes that actually do need to happen.

Sometimes adopting The New Hotness is marketing to recruit engineers who want to try something new.

Sometimes adopting The New Hotness is because some people are bored and uninspired in their jobs and just want to believe in something good again.

The success of things like MongoDB and blockchain is not because they are better or even very good at most of the things people use them for. They’ll never live up to the hype. But people need things to get excited about. It’s psychological. Attacking them purely on technical points can lead one to miss the actual point.


Thanks for putting this into words. People want a reason to log on, a reason to show up to work.

I wrote a blog post on Monday about a relatively old and not-hot language. The post nevertheless trended on Medium and received 50k views. A lot of the feedback was from people who were just happy to see someone senior say something positive about the not-hot language.

Anyway, working with The New Hotness also gives people something to put on their resumé: “I built system X on top of The New Hotness. Nobody had built such a system before for The New Hotness”.


Is being paid hundreds of thousands of dollars plus stocks and bonuses not enough of a reason to show up at work?


I suspect the majority of readers are being paid less than $100k with no stock or bonuses, and correctly intuit that new-hotness things on their résumé will help them climb the salary ladder towards that magic $100k threshold.


Well, the funny thing is that once you crack FAANG interviews you can usually get offers from several of them, and then money is no longer the deciding factor. The choice becomes, “work on X and make $400k or work on Y and make $400k”. You bet the tiebreaker often ends up being the prospect of working on cool shit.


You would think so, but as it turns out, it’s not. First world problems^2.


Not when that becomes table stakes, no. If you have five different companies offering you that kind of money, eventually you'll start thinking about more than just the money.


I think it's sad that the vast majority of people in our field (at least in my experience) are so damn excited about building new stuff all the time. There's no appreciation for maintaining, optimizing, and improving existing systems. There's only talk of "adding value" but very little of "keeping value."

So people keep moving on to the next project, leaving the old ones running live but left to rot, and at best every once in a while someone will be assigned the dreadful task of figuring out why the "legacy" system (even though it's still fully in operation and servicing customers) had an outage and can't recover.

Our field suffers from a serious lack of focus. People don't learn to work under constraints. Instead, they prefer to build a whole new thing, full of complexity and hidden issues, just to get around a few nuisances that, with the right mindset, actually force one to be creative (which should make work fun).

I've recently commented here on HN about how I feel like I'm outside the norm on these things [0]. I was happy to learn I'm not alone. I do think those of us who don't mind working on "legacy" systems and perhaps with limited tools are the minority though.

Sometimes it's not even about working on old systems. You work with current tech, build something, and as soon as it's barely deployable, you're expected to move on to the next thing. "It's a great opportunity for your career," they say, "you're gonna get to build something new and make key decisions."

Why does it have to be decided for me? It's my current predicament. I'm working on a project that took me two months to get in the right mindset for and get all the context I needed. I had to try things out, sort out ideas, until I got a clear picture of what to build. Now I've built the first iteration of it. It works, and I'm super excited about it. But it's not done by any means. It needs more features. It needs to be polished so it doesn't make operations miserable running the thing. And yet... I have to move on, and hand it over to someone who doesn't have the same excitement for it that I have, and has maybe 10% of the context.

All because building new things (and adopting The New Hotness, as you say) is what "everyone" wants.

[0] https://news.ycombinator.com/item?id=25404039


It doesn't help if your career path in a company is tied to 'visibility,' where certain roles will never see a promotion no matter how critical they are.

I've chosen to take on more than one heavy maintenance job, with no client-facing visibility in terms of fancy new features. Feedback in performance reviews would always be "it's hard for people to see what you're doing." In the end you have to find your own way to game the system to get appropriately recognised, but of course that kind of office politics isn't really what you want to spend time on either.

I'm doing more 'new feature' style work now but I'm happy to not be tumbling through the waves of hype. We get a little bit of new hotness, because of course you need to embrace some aspect of that to avoid stagnating and getting too stuck in your ways, but not so much of it that the engineering team is incentivised to churn through half-assed features.


It's great for those engineers that they are so far up Maslow's hierarchy that they're willing to bring significant risk to their employers in the pursuit of self actualization. However, there are a lot of engineers in the world (outside SF and NYC) that are still concerned about lower levels like physiological and safety needs. I wonder if the world getting more accustomed to remote work will change the cost benefit analysis on hiring those engineers instead.


I don't think burning hundreds of billions of dollars in total (or even more) because "it's OK to be irrational, because emotions" makes sense.


The business case is that you won't be able to retain good engineers if you refuse to rewrite your AS400 COBOL app.


I’ve wondered whether these good engineers are really a net benefit. They crank out something spiffy, move your big systems to it, and then disappear to work on the next hotness, leaving an already burdened tech team to support yet one more spiffy stack that is not quite as reliable or easy to use as we would like.


I think you need to either talk to your system architecture function or get a system architecture function in your organisation.

Managing this stuff is a job, and it requires discipline and organisation. It's one of the few functions of an IT department that is left once you go to cloud (as dealing with Oracle's bullshit is then gone). The younglings doing development in the business functions have to play by the rules - and the business functions need to agree that there are rules or they will:

a) spend much monies on the issue you describe

b) get cracked open like fresh eggs by any passing hacker who is keen on sports cars


I'm not sure I would call people who would rewrite something just because it gets boring otherwise "engineers". I think the proper term would be "children".


Yet the best engineers in the field aren’t working on legacy banking apps. The problem is that once a product is put into production and the hard problems are solved, the engineering becomes significantly less challenging (unless you’re in an ultra rare scenario of exponential growth).

The problem we have in software is that we have engineers build bridges and then somehow expect them to stay around to paint the thing, change light bulbs, etc and stay fulfilled. A person who is an expert at making new things won’t like being in maintenance. It’s that simple.


Sure, that's absolutely true. But it doesn't mean that you need to rebuild already working stuff over and over.

It's much more fun to improve the status quo by inventing something nice than to repeat the work of the past generation (only painting things in a different color).


"You deploy, get an error message, and login to CloudWatch to see what actually happened - it’s all batch-driven, just like the bad old days, so progress is slow."

Even without cloud APIs and containers and whatever, I find there's a creeping thing that happens here, one that it's been helpful to become consciously aware of and intentional about fighting back against.

Basically, the thing is that wrappers become interfaces. You write tool A, and then tool B, and then some kind of script C which calls the two tools and produces a result. Then someone says "dang, so convenient", and sets up a Jenkins/Rundeck/Airflow/whatever job that runs script C against some source of inputs automatically— maybe there's even a parameterized build capability where arbitrary inputs may be submitted to it on demand.

Before you know it, the fact that the original tools and script exist has been forgotten, because everyone's interface to this functionality is a handy webpage. And when someone wants to add more features, they don't even go back and change the script, they end up just adding it directly into the job definition. Before you know it, the full scope of what is being done on "the system" isn't even really available any more to be installed, invoked, and tinkered with locally, not in a meaningful way, anyway.

It's not even necessarily all bad, but when I see this happening in a domain where I have the awareness of the underlying tools (or I may even have written or contributed to them), I do try to help demystify them by writing up docs on how to do the local setup, helping others get going with it, etc.


You can solve anything by adding a layer of abstraction - except having too many layers of abstraction.

"Serverless" always reminds me of CGI/fastcgi, which was still a thing the last time I wrote web apps for money. It had the advantage of "scaling to zero" and could run a huge number of low-usage apps on a single 2000-era server.


I love the scalability "argument". The capabilities of a single thread of processing in a modern x86 CPU are underestimated by several orders of magnitude in most shops. The unfortunate part is that the more you chase scalability with the cloud bullshit, the more elusive it becomes. If you want to push tens of millions of transactions per second, you need a fast core being fed data in the most optimal way possible. Going inwards and reducing your latency per transaction is the best option for complex, synchronous problem domains. Going outwards will only speed things up if you can truly decouple everything in the time domain.


Judging by the occasional post on HN, even a lot of experienced software engineers (the "experienced" being my take) seem to have no good handle on what a single thread and a single machine should be capable of handling, what the latency should look like and what it should cost as a result. The posts that I mean are about "look how we handled the load for a moderately active web-app which is 90% cacheable with only five nodes in our k8s cluster".

I really don't know why that is. My guess would be that few people have really built something from beginning to end with the most boring tech possible (starting out with PHP+MySQL like everyone did 15 years ago). Or they always operate at levels of abstraction where everything is slow, so they have simply gotten used to it, like their text editor not running smoothly unless it's on an i9, because the text editor is now a pile of abstractions running on top of Electron when vim was able to run smoothly decades ago. It's both sad and an opportunity at the same time, because you can be the one with the correct gut feeling of "This should really be doable with a single machine; if not, we're doing something fundamentally wrong".


I think you're right on the money with people having gotten used to it. Once I truly started harnessing the power of Vim combined with shell scripts and terminal multiplexers, my patience for many other programs and tasks decreased even further.

We have the computing power to run complex physical simulations or AI training sequences on a normal home computer, but for some reason use programs that take 100 times longer to start than old software despite not having more features, and websites that sometimes even come with loading screens. Electron isn't even as bad or slow as many people think, but somehow developers still manage to throttle it so hard that I might as well just use a website instead.

As someone who is just starting with Node.js and web development, I find a lot of the tech feels nice but sometimes also unnecessarily abstract. Sure, it tends to make a lot of the code very elegant and simple, but every additional framework adds another guide and set of documentation to look at, another config file that has to be correctly referenced by an already existing one so npm knows what to do, another couple of seconds of build time, and another source of bugs and build problems. Then of course you need that trendy package for easier import maintenance - which IDEs automatically handled in the past, but now we've got to use an editor running in a hidden web browser, which started from scratch in terms of features but is just as slow in return.


Once I get our current product in a good spot WRT maintainability and sales pipeline, I am planning to spend some time (~6 months) looking at developing ultra-low latency developer tooling. I feel like I can deliver these experiences through a browser using some clever tricks.

Going through my current project has really worn me out with regard to tolerating UX slowdowns in developer tools. I am getting so incredibly tired of how shitty Visual Studio performs on a Threadripper with effectively infinite RAM. I've had to turn off all of the nice features to get our codebase to be even remotely tolerable. And don't get me wrong: this is not a plea to Microsoft to fix VS. I think everyone involved can agree that fundamentally VS will be unfixable WRT delivering what I would consider an "ultra-low" latency user experience (mouse/keyboard inputs show on display in <5ms). Any "fixing" is basically a complete rewrite with this new objective held in the highest regard.

Current developer experiences are like bad ergonomics for your brain. This madness needs to end. Our computers are more than capable. If you don't believe me, go play Overwatch or CS:GO. At least 2 independently-wealthy development organizations have figured it out. Why stop there?


> mouse/keyboard inputs show on display in <5ms

That's not possible, at least if you measure the full time from actuating the key to the screen showing the updated content. The record holder in that regard is still the Apple II with 30ms[0].

But I agree, modern software should react much faster. It's kind of weird how on one hand we have hardware that can run VR with 120Hz and extremely low input latency, and at the same time some programs that only need to display and change a couple bytes in a text file need 4 seconds to even start.

[0]: https://danluu.com/input-lag/


I am referring mostly to the latency that I do have control over in software, i.e. the time between a user input event being available for processing and the view state responding to the user.


> Or they always operate at levels of abstraction where everything is slow, so they have simply gotten used to it

That's a lot of it, I believe. One startup I dealt with was honestly proud of how they were scaling up to hundreds of requests per second with only a few dozen AWS VMs to handle it.

I'm old enough to know how ludicrous those numbers are, having built systems handling more load on a single desktop machine with 90s hardware.

But I've come to realize some of the younger developers today have actually never built any server implementation outside of AWS! So the perspective isn't there.


Yup. At the last place I worked, a colleague and I would joke that given the traffic our ecommerce platform actually had, we should be able to run the whole thing on a single Raspberry Pi. I think he even ran some numbers on it. What we actually had was two containerized app servers, a separate large RDS instance, and SES, SQS, SNS, ELB, and all the other bits and pieces AWS buy-in gets you. The cloud bills for the traffic we handled were ridiculous.


>"Invariably, their response has been to tell me that I just don't get it"

I cured this one when my native server, running on a not-so-expensive multicore CPU with enough RAM, left their cloudy setup in the dust, performing more than 100 times faster and still having a huge reserve of vertical scalability. They were just astonished at what a single piece of modern hardware combined with native code could do.


We need to meme C++ back into popularity.


I do not think there is any real need. To me it is doing just fine. It does not have to beat every other language into submission.


Congratulations, from now on you are on call 24/7 in case anything happens to that machine. It is now your own pet. When RAM fails in unpredictable, random ways, you'll spend hours reading the syslogs. When your SSDs fail, you'll spend hours ordering new ones, setting up RAID, and migrating the backups you've had to provision, automate, supervise, and secure yourself.

When the server gets even remotely suspected of having been hacked into, you must trash it overnight and never use any of it again.


You talk like using a physical server somehow prevents you from building a highly available system or using automation. Why would the hardware be any more of a pet than any random VM in the cloud? You could also just run a hypervisor on the host and get the benefits of virtualization along with full control of the hardware.

Sure, if a drive breaks you'd probably want to replace it instead of just replacing the entire machine; it's a thing you don't need to do with VMs, but that doesn't mean it's difficult. It definitely doesn't take "hours".

Managing physical setups is different from managing fleets of cloud instances, but it's not necessarily automatically inferior.


>"You talk like using a physical server somehow prevents building a highly available system or from using automation. Why would the hardware be any more of a pet than any random VM in the cloud?"

It is pure scaremongering, making not very nice assumptions about people's abilities without any real substance. If this is the idea of attracting customers or advocating for the approach, it seems like a very poor job of it.


I always have an up-to-date standby for a fraction of the money they spend on the cloud. In the last 10 years or so I do not remember RAM failing in unpredictable ways. When/if my SSD fails, it'll take me a whole 10 minutes to order another one from Amazon. Oh, and I always have spares lying around. My backups are automated.

Anyway, nice job advertising the cloud and trying to scare people into using it. You're happy with it? Good for you. No need to convince me, as I have my own reasons and do not have to buy into yours.


> "...I do not remember RAM failing in an unpredictable ways."

I'm about halfway through my career and I have seen something like that happen exactly once; the erratic behavior of the machine had my team scratching our heads for a fair bit until we ran memtest86 on it after running out of ideas. So I would say that it's not impossible, just extraordinarily unlikely.

The rest of your post I entirely agree with.


Things did happen with my stuff as well. I had a graphics card semi-fail in an unpredictable way, so I just replaced it. I also had one system randomly crashing because the thermal paste dried up on an ancient CPU. It is just a little annoyance that happens extremely rarely. Anything is possible and I'll deal with it when/if it happens. Nothing world-shattering here. The cloud can and does fail as well. The list of reported outages affecting large chunks of customers beats my accidental cases by a healthy margin.


The fun thing is that many C-level people are fed the VC kool-aid about scalability. They want everything to be scalable because they want to cash in. Except that not many companies reach the level where you need to scale your tech stack "exponentially". And you usually don't want to scale the dumb prototype that ended up in prod anyway. The trap here is that those systems will let you do just that in a totally unauditable way... Good luck replacing parts of those systems as well when everything is interconnected...


> I've mentioned as much to several C-suite executives and professional venture investors I know who tend to wax philosophical about "exponential growth in cloud services and big data" and whatnot. Invariably, their response has been to tell me that I just don't get it, or that I make no sense. They give me puzzled looks and maybe even get a bit exasperated with me.

One word: Worldcom

I still remember, in the run-up to 2000, sitting in a meeting room of an equipment provider I had recently joined, looking at projections of market growth, competitors, and our own sales and profitability. Shortly afterwards that part of the business was sold; I remained with the other, non-internet part, and we sailed through the .com crash unscathed.

A lot of what AWS offers makes no sense. A lot of what some AWS customers' teams are building is overly complex and a horror show to maintain. But there is also value: maintained images, backups, on-demand resources, and specialized resources. We hit one of the inflection points in Moore's Law, and that drives us toward horizontal scaling and specialized hardware.

As with Worldcom and the fantastic internet growth story back then, there is a kernel of truth and value. Find people who understand it, and be glad others are playing around and subsidizing the investments in the core.


The cloud makes money for people so it's important. Would you rather that money went to shareholders or to wage earners?


I find your quote here and the follow-up seriously unconvincing. You know what a miserable workflow is? Preparing all my build scripts according to the half-baked build tool that our in-house deployment team cooked up to use Fabric to spray my deployment onto a bunch of nodes in our data center; all the tests and script linting pass, so then I go push it through our CI/CD pipeline (which can’t do the above tests for me, because again this is all threadbare, maintained by in-house tooling teams who could not be less aligned to end product deliverables), and lo and behold it breaks. Good thing I’m not paying so much for CloudWatch, so I get to literally ssh to the worker node of the CI/CD system where my task failed, cross my fingers that it wasn’t reclaimed for the next task and that the log files weren’t rotated off, grep my way to tracebacks that probably involve things no one on my team has ever heard of, spend half a day chasing them all down (if I’m lucky), and finally kick the box to restart the build a few times until it goes.

Then it goes to the in-house homebrewed Kubernetes deployment scripts, where it promptly chokes on a secret that wasn’t configured right (thanks, home-made script linters), so then I have to go down a rabbit hole of eye-melting Helm config of secrets for an hour, but thank god I’m not paying for a simple cloud UI for this. So we get the secret cleared up and everything is great, until the Kubernetes scheduler can’t schedule my pods because our data center is at capacity, and now I literally can’t get work done until I chase down some probably incorrect and certainly unmaintained Grafana chart of who is using how much excess memory on which nodes, and wrangle someone with authority to force another team to stop eating lunch and downsize pods right now.

After all this, 6-7 hours and 20-30 labor hours later, my workload is finally deployed, but it will die overnight for an uncaught reason that doesn’t alert anyone. And even if it did, the team managing compute resources is in a permanent state of PTSD from never-ending pages, so they wouldn’t bother caring about why my workload failed and will just assume it’s application logic bugs or Docker bugs on my side for as long as they can, just to avoid yet another un-debuggable failure landing on their plate. That’ll be great for morale in the morning, when I get to derail yet another day and spend time doing every type of ops work that is not my job and that prevents us from delivering product goals.

This is every company I’ve ever worked for, or ever heard colleagues, friends, or associates describe, that tries to build its own SRE practice on top of data centers.

Large cloud platforms alleviate so, so much of this pain. The “back to the 70s” workflow sarcastically described in the quoted passage you shared sounds like an absolute dream compared to the nightmare of shipping code on internally created developer tool & deployment platforms interfacing to bare metal or space leased from data centers.

I can’t give the cloud providers my money fast enough and I have only ever felt happy about that for years and years now.


This is the type of cultural accuracy I've been looking for.

Are all the commenters in this thread just cynical engineers who really can't see past their own narrow worldview? I get that snake oil is snake oil, but there is such a strong brand of "anything new must be snake oil" that runs through the HN community that it boggles my mind.

The world is actually full of complex systems. And only once these systems are in place does the software engineer get to do his work! People on this site act as if they have all the understanding and knowledge necessary to decry something like cloud vendors, because clearly these same people know better than everyone else and these are actually simple problems, if only we had their perspective and knowledge.

The ego and arrogance of people in tech really shines through here.


> there is such a strong brand of "anything new must be snake oil" that runs through the HN community.

I would say there's at least as strong a "brand" of "if it is a new way it is probably a better way" on HN as well.

I think making blanket assumptions in either direction is a terrible idea. Best to be aware of your own biases, and try to be as objective as you can in your decisions.

Experience matters, and there are new solutions that are inferior to older ones in many cases (yes, there is plenty of "snake oil" for sale). However, sometimes new people look at an old problem with fresh eyes and come up with better ideas on how to solve it.

This really isn't anything new. When I was fresh out of college I remember older programmers talking about "such-and-such" is just the old way of doing it repackaged. There is lots of oscillation in the computer industry.


People on this site act as if they have all the understanding and knowledge necessary to decry something like cloud vendors, because clearly these same people know better than everyone else and these are actually simple problems, if only we had their perspective and knowledge.

A thousand times this. These are solved problems. Anything new has to be better to be taken seriously.


Exactly, which points out how much better cloud tools are (given the level of seriousness most organizations are applying to them).


> The ego and arrogance of people in tech really shines through here.

You mean, "the ego and arrogance required to bend code and machines to their will and not the other way around". Maybe try to pass that and learn a thing or two.


Cool story, but the majority of that is not related to serverless. You still need a CI/CD pipeline with serverless and I’ve dealt with just as many miserable setups there.

> I can’t give the cloud providers my money fast enough

People run VMs and k8s on the cloud as well and run into every one of the problems you just described (aside from literal hardware failures).

Your rant seems to mainly be about having a good CI/CD pipeline, and that’s not really related to cloud or serverless.


I think you completely miss the point. With cloud tooling, my team is unblocked to spin up the resources we need, manage CI/CD in our unique way that solves our problems, quickly try out serverless if it’s a good model for a given use case, change logging or alerting, get new compute resources like GPUs, etc., all without being blocked by a choke point of in-house engineering limitations, staffing limitations, and political in-fighting over who owns what and who gets to define how abstract vs. purpose-specific the tooling will be.

You toss out “build a good CI/CD pipeline” like that’s just some localized, solvable problem, and it’s exactly that lack of imagination and lack of understanding about the true multitude of use cases that causes this problem in the first place.

I agree “serverless” isn’t a panacea that solves everything, but externalizing and democratizing all infra tools to cloud vendors is a major, transformative solution that is worth a lot of money and solves a lot of problems.

For example, if someone like you had any leadership influence in my company’s infra org, I’d be trying to spend money on cloud tools just to get away from your authority as fast as I could, and I would be 1000% correct to feel that way and would end up with much healthier tool solutions and fewer blockages on use-case-specific delivery requirements by doing a giant spin move around your myopic perspective.


This has nothing to do with technology or tooling; this is about your team having the autonomy to do what works for you.

At my current client, my team is dependent on the "Platform Team" for _everything_ that isn't directly related to the code we're writing. Do we need a new Azure queue? That's a Platform ticket in JIRA. Do we need a new tool in our CI/CD pipeline? That's a Platform ticket in JIRA. Do we have a build failing for something other than tests? That's a Platform ticket in JIRA. Do we have an issue in production? We need to engage with Cloud Ops because we don't have access to anything in prod.

Plus, it takes six weeks to get a feature from development to production because of horrible workflows and manual processes.

Plus, development teams are prevented from implementing anything without the approval of Solution Architects who are too far removed from the business to understand what's needed and too far removed from the technology to be able to design a working solution.

It's all fucking horrible, and yet we're using all the modern cloud tooling: Spring Boot, Azure, containers, K8S, Gitlab pipelines.

In contrast, a couple clients ago my team had full autonomy to do what we believed was the right solution. Stories would often be groomed, implemented, and delivered into production all within a 2-week sprint. No muss no fuss. When they weren't, it was normally due to us having to wait on another team to deliver their work.


>"...compared to the nightmare of shipping code on internally created developer tool & deployment platforms..."

I feel sorry that it worked out this way for you (or rather, that it did not work). I have the totally opposite experience. It takes a single click of a button for me to build, test, and deploy a code change using on-premises and rented hardware.

I did cloud as well when it was mandated by a client, and it did not feel any simpler, faster, etc. It sure was more complex and way more expensive, but since it was not me paying the money I could safely ignore that part (not for my own projects and servers, though, as I do count my money).


I'm sure you will be able to set up proper pipelines on new projects within a 24h time frame and be brilliant at finding root causes by reading the stack trace and associated sources in the blink of an eye once you are a senior hacker; I'm not worried about that!

Anyway, I think your comment proves that once we decide to invest in something, our brain does everything it can to justify that choice. Human nature, really :)


I don’t understand your comment. It seems like you think it carries a rhetorical punch to (I guess) suggest that in the cloud version we’d still be slow and make mistakes? The writing is so unclear it’s hard to tell. But if that is what you’re saying, I think you deeply missed the point.

Yes of course my team won’t solve bugs instantly just because of cloud tools and of course we will make mistakes implementing bad infra designs, especially early on as we are still gaining experience with efficient and cheap cloud patterns. Nobody said otherwise and my comment before has no connection to anything like that.

Rather, if we have control over our own infra, we can adaptively fix those issues for ourselves without delays and philosophical or gatekeeper arguments from central data center admins. We would have the autonomy to change our own deployment system, provision more compute resources, generalize something to work with GPUs, test out different delivery and runtime frameworks, and create our own ingress failure robustness patterns, all without blockages and mandated central infra bottlenecks.

To create such a configurable and adaptable API-driven portal into bare metal or a data center yourself is way too slow and costly; only the largest companies can do it, and they then turned it around and sold it as cloud vendor offerings.

When medium-sized companies try to do it, the most common outcome is that a ton of exceedingly arrogant SRE types will create piss-poor tooling that barely works for maybe the top 2-3 use cases, and then will whine and complain that they can’t support generalizing it for the long tail of other use cases. And because they have political power as the gorilla in the room sitting on top of the only interface to the data center, nobody can argue with them, and you get a burnout mill with constant turnover on all the other teams, who have to do SRE’s job for them while listening to the arrogance about what a rip-off cloud tools are.


I don’t understand your comment. It seems like you think "anyone talking about the rip-off cloud tools" is "exceedingly arrogant SRE types creating piss-poor tooling that barely works, whining and complaining that they can’t support generalizing it, sitting in actual data centers, with a burnout mill with constant turnover on all the other teams". But apart from that, I'm sure you're a lovely person to discuss things with ;) Anyway, I was just saying that senior hackers have more options than you seem to have, and also that we tend to find excuses to justify our investments once we are locked into them; it's perfectly natural and nothing to be ashamed of.




