
Along with this, vertical scaling is severely underrated. You can do a lot, and quite possibly everything your company will ever need, with vertical scaling. That applies to 99% of companies, or even more.

Edit: Since people are confused, here is how StackOverflow handles all of its web operations. If SO can run with this, so can your 0.33 req/minute app which is mostly doomed to failure. I am only half joking.

StackOverflow architecture, current load (it will surprise you): https://stackexchange.com/performance

Every time you go to SO, it hits one of these 9 web servers, and all data on SO sits on those 2 massive SQL servers. That's pretty amazing.

I want to be clear though: horizontal scaling has a place in companies that have a team of corporate lawyers. Big ones. And in many, many other scenarios for ETL and backend microservices.




Several years ago, I was chatting with another engineer from a close competitor. He told me about how they'd set up a system to run hundreds of data processing jobs a day over a dozen machines, using Docker, load balancing, and a bunch of AWS stuff. I knew these jobs very well; they were basically identical for any company in the space.

He then mentioned that he'd noticed that somehow my employer had been processing thousands of jobs, much faster than his, and asked how many machines we were using.

I didn't have the heart to tell him we were running everything manually on my two-year-old macbook air.


F- me. I love this. This is a really important message.

It's like Jonathan Blow asks: why does Photoshop take longer to load today than it did in the 90s despite the (insane) advances in hardware?

I believe it's due to a bunch of things, but over-complicating the entire process is one of the big issues. If people (developers/engineers) would only sit back and realise just how much computing power they have available to them, and how much they could do by keeping things simple and efficient, they could build blazing-fast solutions overnight.

I cringe thinking about the wasted opportunities out there.


Jonathan Blow's Preventing the Collapse of Civilization [0] is an excellent talk on how too much abstraction destroys knowledge.

This is a problem with the software industry today. We have forgotten how to do the simple stuff that works and is robust.

[0]: https://www.youtube.com/watch?v=ZSRHeXYDLko


That was a really great watch, thanks!


Compare Photoshop between the 90s and now: 10x on features. The size of photos has grown exponentially as well.


I see the point you're trying to make, however the increase in features (and their complexity) plus the size of the average graphic that a high-end professional works with has maybe grown by 300-500% since the 90s. In fact I'll tell you what: I'll give you a growth of 10,000% in file sizes and feature complexity since the 90s...

... computational power has grown ~259,900% since the 90s.

The point being made is this: Photoshop does one job and has one focus (or should), yet it has gotten slower at doing that one job, not faster. Optimising the code, AND the incredible hardware now in the consumer market, should see Photoshop loading in milliseconds, in my opinion.


>The point being made is this: Photoshop does one job and has one focus (or should), yet it has gotten slower at doing that one job, not faster.

Has it though? Without measurements this is just idle talk.

And I've used Photoshop in the 90s and I use it today occasionally. I remember having 90s-sized web pictures (say, 1024x768) and waiting for a filter to be applied for tens of seconds - which I get instantly today with 24MP and more...

And if we're into idle talk I've always found Photoshop faster in large images and projects than competitors, including "lightweight" ones.

It's hella more optimized than them.

In any case, applying some image filter (that takes, e.g., 20 seconds vs 1 minute with them) just calls some optimized C++ code (perhaps with some asm thrown in) that does just that.

The rest of the "bloat" (in the UI, feature count, etc) has absolutely no bearing as to whether a filter or an operation (like blend, crop, etc) runs fast or not. At worst, it makes going around in the UI to select the operations you want slower.

And in many cases the code implementing a basic operation, filter, etc, hasn't even been changed since 2000 or so (if anything, it was optimized further, taken to use the GPU, etc).


I recall my dad requiring overnight sessions to have Photoshop render a particular filter on his Pentium 166MHz. That could easily take upwards of an hour, and a factor 10 more for the final edit. He'd be working on one photo for a week.


To me it feels as though, for the last decade and a half, computational power has not grown vertically. Instead Intel and AMD have grown computational power horizontally (i.e. adding more cores). I'm looking at the difference the M1 has made to compute performance as a sign that x86 strayed.


It has also grown substantially vertically: single-core speeds keep going up (about 10x from a decade and a half ago), even as core count increases. (And the M1 is not substantially faster than the top x86 cores; the remarkable thing is how power-efficient it is at those speeds.)


This was state of the art 15 years ago: https://en.wikipedia.org/wiki/Kentsfield_(microprocessor)

Modern processors are nowhere near 10x faster single-core. Maybe 4x.

If you take into account performance/watt you do get 10x or better.


The 3.4GHz P4 was released in 2004: https://en.m.wikipedia.org/wiki/List_of_Intel_Pentium_4_proc...

If the trend had continued we would have 1500GHz cores by now.


Clock speed != single-threaded performance. Clock speeds plateaued a long time ago; single-threaded performance is still improving exponentially (by executing multiple instructions from an instruction stream in parallel, as well as executing the same instructions in fewer clock cycles), though the exponent approximately halved around 2004 (if the trend had continued we would be at about a 100-500x improvement by now).

https://github.com/karlrupp/microprocessor-trend-data


Hard to say it's still "exponential"... what do you think the current doubling period is?

Here's the single-thread raw data from that repo. If you take into account clock speed increases (which, as you agree, have plateaued) we're looking at maybe a 2x increase in instructions per clock for conventional int (not vectorized) workloads.

Is there even another 2x IPC increase possible? At any time scale?

https://github.com/karlrupp/microprocessor-trend-data/blob/m...


No one is choosing Photoshop over competitors because of load time


And somehow Photopea is fast, free and in browser and suffices for 85% of whatever people do in Adobe Photoshop.


Those people for whom Photopea is fast and suffices didn't need Photoshop in the first place.


Opening PSDs is a big reason. A lot of designers will send PSDs and usually you also want them to check layers, extract backgrounds etc.


Yes, I use Photoshop because, as it happens, I need to do 100% of my job.


Back in the day I was using Intel Celeron 333 MHz and then AMD Duron 800 MHz.

I did not know how to use Winamp playlists because Winamp had been an "instant" app for me: I just clicked on a song and it played within milliseconds. That was my flow of using Winamp for years. This did not change between the Celeron and the Duron; the thing was instant on both.

Then Winamp 3 came out and I had to use playlists because a song, once clicked, took a good second or two to start playing. Winamp 5 from 2018 still starts slower than my beloved 2.73* did 20 years ago on a Celeron 333 and a 5400 RPM HDD with 256 MB of RAM. I think even the good old Winamp 2.x is not as fast as it was on Windows 98/XP.

Something went wrong.

* not sure if it was 2.73, but I think so

Note: I realise Winamp 3 was crappy as hell, but still...


This is why I chose and stuck with coolplayer at the time (before I converted to Linux): no install, so light, so fast, and it had everything I needed. I loved it when I could find such an elegant and versatile app. I don't need to get "more" every 6-12 months.

I learned that every time you gain something, you also lose something without realizing it, because you take it for granted.


Winamp 2.73 is still my default player and works on Win10 Pro x64 build $LATEST. I will never change. It is 2MB of pure gold.


This is sad to read :(

I remember having a 486 myself and everything being so damn snappy. Sigh.


Are you me? Perhaps a little bit later on and a different set of signifiers (foobar2000/xp/celeron 1.7) but the same idea. Things were so much snappier back then than on my previously-SOTA MBPR 2019. Sigh.


I was at a graphic design tradeshow back in the mid 90's and there was a guy there demonstrating this Photoshop alternative called Live Picture or Live Photos or something like that. And he had a somewhat large, at the time, print image on the screen, probably 16mb or so, and was zooming in and out and resizing the window and it was redrawing almost instantly.

This was AMAZING.

Photoshop at the time would take many many seconds to zoom in/out.

One person in the group asked, "Yeah, but how much memory is in that machine?"

The guy hemmed and hawed a bit and finally said "It's got a bit, but not a lot, just 24mb."

"Yeah, well that explains it." Nobody had 24mb of RAM at that time. Our "big" machine had 16mb.


Live Picture was the first to use what I think are called image proxies, where you can have an arbitrarily large image but only work with a screen-sized version of it. Once you have applied all the changes and click save, it will then grind through the full-size image if needed.

A feature that Photoshop has since added, but it appeared in Live Picture first.
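
Not claiming this is how Live Picture actually implemented it, but the proxy idea in a nutshell looks something like this (a rough sketch using Pillow; the filenames are placeholders):

    from PIL import Image, ImageEnhance, ImageFilter

    class ProxyEditor:
        def __init__(self, path, proxy_size=(1024, 1024)):
            self.path = path
            self.ops = []                       # edits recorded, not yet applied to the original
            with Image.open(path) as full:
                self.proxy = full.copy()
            self.proxy.thumbnail(proxy_size)    # screen-sized working copy

        def apply(self, op):
            self.ops.append(op)
            self.proxy = op(self.proxy)         # instant feedback on the small image
            return self.proxy

        def save(self, out_path):
            # Only now grind through the full-size image, replaying the recorded edits.
            with Image.open(self.path) as full:
                for op in self.ops:
                    full = op(full)
                full.save(out_path)

    editor = ProxyEditor("photo.tif")
    editor.apply(lambda im: ImageEnhance.Contrast(im).enhance(1.2))
    # Scale-dependent ops (e.g. blur radii) would need rescaling in a real tool.
    editor.apply(lambda im: im.filter(ImageFilter.GaussianBlur(2)))
    editor.save("photo_edited.tif")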


Yup, that must have been it. I think Adobe implemented that as a feature into Photoshop within a year of us seeing that other software.


This might be a long shot, but did the demo include zooming in on a picture of a skull to show a message written on one of the teeth? If so, I've been trying to find a video of it for years.


No, it was a photo of a building. The "demo" was the guy right there playing with the software, no video.


Perhaps, but the size of photos shouldn't affect load times when starting the app (and in my opinion, neither should most features, but that depends on your architecture I suppose).


> The size of photos has grown exponentially as well.

Photoshop can grow them itself, and the link and thread below are amazing: https://news.ycombinator.com/item?id=26448986


Yeah, Jonathan Blow isn't exactly a luminary in computer science. I once read him going on a meltdown over Linux ports because "programming is hard". This is the kind of minds Apple enable, i.e. "why isn't this easy?"


The entire point of computers is to make things easy.

The Apple question "why isn't this easy" is missing from 95% of UX in modern software. Stop acting as if software devs are the only users of software. And even then: computers should do work and not produce work or mental overhead.


> The Apple question "why isn't this easy" is missing from 95% of UX in modern software.

I switched to a MacBook at work a few months ago, and it's been an epic of frustration and Googling: e.g.,

1. I set the keyboard layout to UK PC, but it kept switching back to default. (I was accidentally mashing a keyboard shortcut designed to do just that. I've since disabled it.)

2. Clicking on a link or other web-page element will, occasionally and apparently randomly, take me back a page or two in my history rather than opening the link or activating the element. (At least in Chrome: I've not used another browser on macOS yet.)

3. Command-tabbing to a minimised application does not, as one would naively expect, automatically unminimise it. Instead I'm left staring at an apparently unchanged screen.

4. If I open Finder in a given folder, there is, as best I can tell, no easy way to navigate to that folder's parent.

Now arguably #1 was the fault of my own ignorance (though some kind of obvious feedback as to what was happening and why would have been nice), and #2 may be down to Google rather than Apple.

But #3 and #4 are plain bad design, bordering on user-hostile.

So far I'm not seeing that Apple's reputation for superior UX is justified, at least not in laptops.


Window switching in OS X is so unintuitive it drives me MAD when you are remoting into a Mac.

Finder is just god awful and does all it can to obscure your actual filesystem location but the go to folder (cmd+g?) can get you where you need to go.


The Apple reputation was absolutely justified back in the OS9 era, and the early iPhone as well. However both OS X and iOS7 and beyond were huge steps backwards in usability.

At this point I think Apple still deserves their reputation for superior UX, however that's a result of how epically bad Google, MS, and Facebook are at UX, not Apple doing a great job like they used to.


for 4. ...as best I can tell, no easy way to navigate to that folder's parent

You can add a button to the toolbar in Finder (customize it from options) that when dropped down will show the complete path to the current folder as a list. You can use that to move up the tree.


Cmd-Up brings you to the parent folder.


You can also cmd + click the current directory name in the menu bar to get the same menu.


Are you implying Apple software and devices are "easy"?

I think that's marketing, and not borne out by real experience.

Try setting up a new Apple device, there is very little in IT that is as frustrating and confusing.


>Try setting up a new Apple device, there is very little in IT that is as frustrating and confusing.

It's a 5 minute process, and there's very little to it that's not optimized for ease. Any champ can do it, and millions do.

Not sure what you're on about.

>I think that's marketing, and not borne out of real experience.

Yes, all those people are deluded.


Clearly you've never tried it, because it's certainly not 5 minutes, it's optimized for selling you bullshit Apple services and it's buggy as hell, with no feedback to the user why everything is broken and why you're having to re-authenticate five times in a row.

And good luck if you're setting up a family account for several devices with different iOS versions. You're gonna really need it.


>Clearly you've never tried it, because it's certainly not 5 minutes, it's optimized for selling you bullshit Apple services and it's buggy as hell

I've tried it tons of times, have had over 20 iOS/macOS devices over the years, and for some perverse reason, on macOS/OS X I like to install every major update on a clean disk too (and then re-import my data; it's an old Windows 95/XP-era reflex), so I do it at least once every year for my main driver (plus different new iOS devices).

What part of this sounds difficult?

https://www.youtube.com/watch?v=70u0x8Kf6j4

And the whole "optimized for selling you bullshit Apple services" is a couple of screens you can skip with one click -- and you might want to legitimately use too.


Honestly, literally millions of people do this every year, and for most of them, it's like 10 minutes, plus the time waiting for the iCloud restore. Even my dad was able to set up his new iPad, and he's as technophobic as it gets.


Technophobic people are exactly the target audience that has a huge tolerance for broken software built on piles of abusive bullshit.


My father has very little patience for broken things, software or otherwise, so I'm really not sure what you're talking about.


It is definitely easier than configuring any GNU/Linux distribution.


Watching the video that coldtea posted, no, it is not. Ubuntu and most of its derivatives have very easy installers, and take a fraction of the time. The video didn't even include the time it would take to read and understand all the terms and conditions!


Assuming one wants a partially working computer after installation.


I don't know what you mean by this- care to elaborate? I have several fully working computers running Ubuntu derivatives without having to do anything after the install.


I bet none of them is a laptop.


I currently have two laptops, one a Lenovo from work with Kubuntu and the other a cheap Asus with KDE Neon. Both required no additional work to be fully working after install.


>This is the kind of minds Apple enable, i.e. "why isn't this easy?"

So the kind of minds we want?

"Why isn't this easy?" should be the bread and butter question of a programmer...


Except for problems that are hard. I wholeheartedly disagree; Blow has shown many times that he does not have the right mindset.


> This is the kind of minds Apple enable, i.e. "why isn't this easy?"

I dunno, personally that's why I've used Apple products for the past decade, and I think it's also maybe part of why they have a 2T market cap, and are the most liquid publicly traded stock in the world?


So making simplistic products that treat users as dumb is profitable, yeah, I agree with that.


Believe it or not, lots of people have more important things to do with their computers than dick with them to make shit work.

Most people don't really know how their car works, and shouldn't have to. Same goes here.


Where's your evidence for that? The argument that it "treats users as dumb" and that doing so is "profitable" is oft trotted out, but I never see any substantiation for it. Plenty of companies do that. What's so special about Apple, then? I mean, it's gotta be something.

You gotta be careful about these arguments. They often have a slippery slope to a superiority complex (of "leet" *NIX users over the unwashed "proles") hiding deep within.

Needlessly complex or powerful user interfaces aren't necessarily good. They were quite commonplace before Apple. Apple understood the value of minimalism, of cutting away interaction noise until there's nothing left to subtract. Aesthetically speaking, this approach has a long, storied history with respect to mechanical design. It's successful because it works.

What Apple understood really acutely and mastered is human interface design. They perfected making human-centric interfaces that look and feel like fluid prosthetic extensions of one's body, rather than computational interfaces where power is achieved by anchoring one's tasks around the machine for maximal efficiency. Briefly, they understood intuition. Are you arguing that intuition is somehow worse than mastering arcana, simply because you've done the latter?

Now, I'm not going to say that one is better than the other. I love my command line vim workflow dearly, and you'll have to pry my keyboard out of my cold dead hands. But there's definitely the idea of "right tool for the right job" that you might be sweeping by here. Remember, simplicity is just as much a quality of the cherished *NIX tools you probably know and love. It's where they derive their power. Be careful of surface-level dismissals (visual interfaces versus textual) that come from tasting it in a different flavor. You might miss the forest for the trees!


It's easy to take a stab at someone online, behind a keyboard, but I'd suggest you show us all your work and we'll judge your future opinions based on it.


By no metric am I comparatively as successful as the guy, but I am still able to disagree with his point that Linux is held back by its tools. The fact that he did not want to, or did not have the time to, learn Linux's tooling doesn't mean anything in particular except that he's either very busy or very lazy. In any interview I read with him he's just ranting and crying over this or that "too complex" matter. If he does not want to deal with the complexity of modern computers, he should design and build board games.

As an example, read how the one-man-band of Execution Unit manages to, in house, write and test his game on three operating systems. https://www.executionunit.com/blog/2019/01/02/how-i-support-...

It's not a matter of "this is too hard", Blow just does not want to do it, let's be honest.


preferably future work and dreams and ramblings too..

preferably over a twitch stream..


If I wanted that experience I would have kept my PC with MS-DOS 3.3.


I think there are two things that cause this.

One is that there’s a minimum performance that people will tolerate. Beyond that you get quickly diminishing user satisfaction returns when trying to optimize. The difference between 30 seconds and 10 seconds in app startup time isn’t going to make anyone choose or not choose Photoshop. People who use PS a lot probably keep it open all day and everyone else doesn’t care enough about the 20 seconds.

The second problem is that complexity scales super-linearly with respect to feature growth, because each feature interacts with every other feature. This means that the difficulty of optimizing startup times gets harder as the application grows in complexity. No single engineer or team of engineers could fix the problem at this point; it would have to be a mandate from up high, which would be a silly mandate since the returns would likely be very small.


If you haven't come across it, you might appreciate this post: https://adamdrake.com/command-line-tools-can-be-235x-faster-...


I'll give it a read. Thank you.

(I like anything that echoes well in this chamber of mine... just kidding ;)


The problem is: if everything were simple enough, how would you set your goals this year? Complication creates lots of jobs and waste. It keeps us from starving, but at the cost of others somewhere else in the world, or in the future when the resources are all gone.


I see where you're coming from with this, but this is getting into the realm of social economics and thus politics.

To solve the problem you're describing we need to be better at protecting all members of society not in (high paying) jobs, such as universal basic income and, I don't know, actually caring about one another.

But I do see your point, and it's an interesting one to raise.


"But developer productivity!"

Most orgs (and most devs) feel developer productivity should come first. They're not willing to (and in a lot of cases, not able to) optimize the apps they write. When things get hard (usually about 2 years in) devs just move on to the next job.


> If people (developers/engineers) would only sit back...*

It is a matter of incentives. At many companies, developers are rewarded for shipping, and not for quality, efficiency, supportability, documentation, etc.* This is generally expected of a technology framework still in the profitability growth stage; once we reach a more income-oriented stage, those other factors will enter incentives to protect the income.


Maybe. I'm not convinced.

I think you can build something complex quickly and well at the same time. I built opskit.io in three weeks. It's about 90% automated.


> I think you can build something complex quickly and well at the same time.

One definitely can. It's a real crapshoot whether the average developer can. For what it's worth, I consider myself a below-average developer. There is no way I could grind l33tcode for months and even land a FAANG interview. I can't code a red-black tree to save my life unless I have a textbook in front of me. Code I build takes an enormous amount of time to deliver, and So. Much. Searching. You get the picture. I'm a reasonably good sysadmin/consultant/sales-engineer; all other roles I put ludicrous amounts of effort into, to become relevant. Good happenstance that I enjoy the challenge.

For the time being however, there is such enormous demand for any talent that I always find myself in situations where my below-average skills are treated as a scarcity. Like a near-100-headcount testing organization in a tech-oriented business with an explicit leadership mandate to automate with developer-written integration code from that organization... and two developers, both with even worse skills than mine. When a developer balks at writing a single regular expression to insert a single character at the front of an input string, that's nearly the definition of turning one wrench for ten years; while I'm slow and very-not-brilliant, I'm a smart enough bear to look up how to do it on different OSes or languages and implement it within the hour.

This is not unusual in our industry. That's why FizzBuzz exists. That's just to clear the bar of someone who knows the difference between a hash and a linked list.

To clear the bar of "something complex quickly and well at the same time" though, I've found it insufficient to clear only the technical hurdle and obtain consistent results. The developer has to care about all the stakeholders. Being able to put themselves into the shoes of the future developers maintaining the codebase, future operators who manage first line support, future managers who seek summarized information about the state and history of the platform, future users who apply business applications to the platform, future support engineers feeding support results back into developers, and so on. That expansive, empathetic orientation to balance trade-offs and nuances is either incentivized internally, or staffed at great expense externally with lots of project coordination (though really, you simply kick the can upstairs to Someone With Taste Who Cares).

I'd sure as hell like to know alternatives that are repeatable, consistently-performing, and sustainable though. Closest I can think of is long-term apprenticeship-style career progression, with a re-dedicated emphasis upon staffing out highly-compensated technical writers, because I strongly suspect as an industry we're missing well-written story communication to tame the complexity monster; but that's a rant for another thread.


Has Jonathan Blow actually measured these things or is this perhaps a skewed memory?


Reminded me of Fabrice Bellard's Pi digits record[1]

The previous Pi computation record of about 2577 billion decimal digits was published by Daisuke Takahashi on August 17th 2009. The main computation lasted 29 hours and used 640 nodes of a T2K Open Supercomputer (Appro Xtreme-X3 Server). Each node contains 4 Opteron Quad Core CPUs at 2.3 GHz, giving a peak processing power of 94.2 Tflops (trillion floating point operations per second).

My computation used a single Core i7 Quad Core CPU at 2.93 GHz giving a peak processing power of 46.9 Gflops. So the supercomputer is about 2000 times faster than my computer. However, my computation lasted 116 days, which is 96 times slower than the supercomputer for about the same number of digits. So my computation is roughly 20 times more efficient.

[1]: https://bellard.org/pi/pi2700e9/faq.html
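
The arithmetic checks out, roughly; a quick sanity check using only the numbers quoted above:

    supercomputer_flops = 94.2e12    # 640-node T2K peak, from the quote
    desktop_flops = 46.9e9           # single Core i7 peak, from the quote
    speed_ratio = supercomputer_flops / desktop_flops   # ~2009x more raw compute
    time_ratio = (116 * 24) / 29                        # 116 days vs 29 hours -> ~96x slower
    print(speed_ratio / time_ratio)                     # ~20.9: "roughly 20 times more efficient"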


I once joked with a colleague that my sqlite3 install is faster than his Hadoop cluster for running a report across a multi-gig file.

We benchmarked it; I was much, much faster.

Technically though, once that multi-gig file becomes many hundreds of gigs, my computer would lose by a huge margin.
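
For the curious, the single-machine version really can be as boring as this (a rough sketch with the Python standard library; the file and column names are made up):

    import csv, sqlite3

    con = sqlite3.connect("report.db")
    con.execute("CREATE TABLE IF NOT EXISTS events (user_id TEXT, amount REAL)")

    # Bulk-load the multi-gig CSV in a single transaction.
    with open("events.csv", newline="") as f:
        rows = ((r["user_id"], float(r["amount"])) for r in csv.DictReader(f))
        with con:
            con.executemany("INSERT INTO events VALUES (?, ?)", rows)

    # The "report" itself: a plain aggregation, no cluster required.
    query = ("SELECT user_id, SUM(amount) AS total FROM events "
             "GROUP BY user_id ORDER BY total DESC LIMIT 10")
    for user_id, total in con.execute(query):
        print(user_id, total)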


"Command-line Tools can be 235x Faster than your Hadoop Cluster"

https://web.archive.org/web/20200414235857/https://adamdrake...


I recently did some data processing on a single (albeit beefy) node that someone had been using a cluster for. I composed and ran in a day an ETL job that had taken them weeks in their infrastructure (they were actually still in the process of fixing it).


At that point you can just get a bigger computer though.


No you cannot, you cannot infinitely scale SQLite, you can’t load 100 G of data into a single SQLite file in any meaningful amount of time. Then try creating an index on it and cry.

I have tried this. I literally wanted to create a simple web app that is powered by the cheapest solution possible, but it had to serve from a database that cannot be smaller than 150GB. SQLite failed. Even Postgres by itself was very hard! In the end I now launch Redshift for a couple of days, process all the data, then pipe it to Postgres running on a Lightsail VPS via dblink. Haven't found a better solution.


My rule of thumb is that a single processor core can handle about 100MB/s, if using the right software (and using the software right). For simple tasks, this can be 200+ MB/s; if there is a lot of random access (both against memory and against storage), one can assume about 10k-100k IOPS per core.

For a 32 core processor, that means that it can process a data set of 100G in the order of 30 seconds. For some types of tasks, it can be slower, and if the processing is either light or something that lets you leverage specialized hardware (such as a GPU), it can be much faster. But if you start to take hours to process a dataset of this size (and you are not doing some kind of heavy math), you may want to look at your software stack before starting to scale out. Not only to save on hardware resources, but also because it may require less of your time to optimize a single node than to manage a cluster.
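
Back-of-the-envelope, the 100G example works out like this (using only the numbers above):

    per_core_throughput = 100e6                  # ~100 MB/s per core (the rule of thumb)
    cores = 32
    dataset_bytes = 100e9                        # 100 GB
    print(dataset_bytes / (per_core_throughput * cores))   # ~31 seconds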


> "using the right software (and using the software right)"

This is a great phrase that I'm going to use more.


This is a great rule of thumb which helps build a kind of intuition around performance I always try to have my engineers contextualizing. The "lazy and good" way (which has worked I'd say at least 9/10 times in my career when I run into these problems) is to find a way to reduce data cardinality ahead of intense computation. It's 100% for the reason you describe in your last sentence -- it doesn't just save on hardware resources, but it potentially precludes any timespace complexity bottlenecks from becoming your pain point.


>No you cannot, you cannot infinitely scale SQLite, you can’t load 100 G of data into a single SQLite file in any meaningful amount of time. Then try creating an index on it and cry.

Yes, you can. Without indexes to slow you down (you can create them afterwards), it isn't even much different than any other DB, if not faster.

>Even Postgres by itself was very hard!

Probably depends on your setup. I've worked with multi-TB sized Postgres single databases (heck, we had 100GB in a single table without partitions). Then again the machine had TB sized RAM.
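
To be concrete, the kind of bulk load I mean looks roughly like this (just a sketch; the schema, the pragmas you can tolerate, and read_source_rows() are placeholders to adapt):

    import sqlite3

    con = sqlite3.connect("big.db")
    # Relax durability for the duration of the load; restore afterwards if you need it.
    con.execute("PRAGMA journal_mode = OFF")
    con.execute("PRAGMA synchronous = OFF")
    con.execute("PRAGMA cache_size = -1000000")   # negative = size in KiB, so ~1 GB here

    con.execute("CREATE TABLE papers (id TEXT, author_id TEXT, year INT, title TEXT)")

    def read_source_rows():
        # Placeholder for whatever parses your source dump.
        yield ("p1", "a1", 2021, "Example title")

    def batches(rows, size=100_000):
        batch = []
        for row in rows:
            batch.append(row)
            if len(batch) == size:
                yield batch
                batch = []
        if batch:
            yield batch

    for batch in batches(read_source_rows()):
        with con:   # one transaction per batch instead of one per INSERT
            con.executemany("INSERT INTO papers VALUES (?, ?, ?, ?)", batch)

    # Pay for the indexes once, after all the data is in.
    con.execute("CREATE INDEX idx_papers_author ON papers(author_id)")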


> but it had to serve from a database that cannot be smaller than 150GB. SQLite failed. Even Postgres by itself was very hard!

The PostgreSQL database for a CMS project I work on weighs about 250GB (all assets are binary in the database), and we have no problem at all serving a boatload of requests (with the replicated database and the serving CMS running on each live server, with 8GB of RAM).

To me, it smells like you were lacking some indices or running on an RPi?


It sounds like the op is trying to provision and load 150GB in a reasonably fast manner. Once loaded, presumably any of the usual suspects will be fast enough. It’s the up front loading costs which are the problem.

Anyway, I’m curious what kind of data the op is trying to process.


I am trying to load and serve the Microsoft Academic Graph to produce author profile pages for all academic authors! Microsoft and Google already do this but IMO they leave a lot to be desired.

But this means there are a hundred million entities, publishing 3x that number of papers, and a bunch of associated metadata. On Redshift I can get all of this loaded in minutes and it takes like 100G, but Postgres loads are pathetic comparatively.

And I have no intention of spending more than 30 bucks a month! So hard problem for sure! Suggestions welcome!


There are settings in Postgres that allow for bulk loading.

By default you get a commit after each INSERT which slows things down by a lot.
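
Roughly, the shape of a fast load is COPY inside a single transaction (a sketch with psycopg2; the table and file names are made up, and whether an unlogged staging table is acceptable depends on your needs):

    import psycopg2

    conn = psycopg2.connect("dbname=papers_db")   # placeholder DSN
    with conn, conn.cursor() as cur:
        # Don't wait for a WAL flush on every commit during the load.
        cur.execute("SET synchronous_commit = off")
        cur.execute("CREATE UNLOGGED TABLE papers_stage (LIKE papers INCLUDING DEFAULTS)")

        # COPY streams the whole file in one statement instead of an INSERT+commit per row.
        with open("papers.tsv") as f:
            cur.copy_expert("COPY papers_stage FROM STDIN WITH (FORMAT text)", f)

        cur.execute("INSERT INTO papers SELECT * FROM papers_stage")
        # Create or rebuild indexes here, after the data is in, not before.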


How many rows are we talking about? In the end once I started using dblink to load via redshift after some preprocessing the loads were reasonable, and indexing too. But I’m looking at full data refreshes every two weeks and a tight budget (30 bucks a month) so am constrained on solutions. Suggestions welcome!


Try DuckDB! I've been getting 20x SQLite performance on one thread, and it usually scales linearly with threads!


Maybe I'm misunderstanding, but this seems very strange. Are you suggesting that Postgres can't handle a 150GB database with acceptable performance?


I’m trying to run a Postgres instance on a basic vps instance with a single vcpu and 8gb of ram! And I’ll need to erase and reload all 150 GB every two weeks..


Had a similar problem recently. Ended up creating a custom system using a file-based index (append to files named by the first 5 chars of the SHA1 of the key). Took 10 hours to parse my terabyte. Uploaded it to Azure Blob storage, and now I can query my 10B rows in 50ms for ~10^-7$. It's hard to evolve, but 10x faster and cheaper than other solutions.
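
The core of it is tiny; not my actual production code, just the idea sketched in Python:

    import hashlib, json, os

    INDEX_DIR = "index"            # ends up with at most 16^5 bucket files
    os.makedirs(INDEX_DIR, exist_ok=True)

    def bucket_path(key):
        prefix = hashlib.sha1(key.encode()).hexdigest()[:5]
        return os.path.join(INDEX_DIR, prefix)

    def put(key, value):
        # Append-only: a write never rewrites existing data.
        with open(bucket_path(key), "a") as f:
            f.write(json.dumps({"k": key, "v": value}) + "\n")

    def get(key):
        # Read one small bucket file and scan it; last write wins.
        result = None
        try:
            with open(bucket_path(key)) as f:
                for line in f:
                    rec = json.loads(line)
                    if rec["k"] == key:
                        result = rec["v"]
        except FileNotFoundError:
            pass
        return result

Once the bucket files are in blob storage, a lookup is a single GET of one small object.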


My original plan was to do a similar S3 idea, but I forgot about its charge per 1000 GETs and PUTs and had a 700 dollar bill I had to bargain with them to waive! Does Azure's model not have that expense?


I recall Azure was much cheaper and 30ms faster on average.


Curious if you tried this on an EC2 instance in AWS? The IOPS for EBS volumes are notoriously low, and possibly why a lot of self-hosted DB instances feel very slow vs similarly priced AWS services. Personal anecdote, but moving to a dedicated server from EC2 increased the max throughput by a factor of 80 for us.


Did you try to compare that to EC2 instances with ephemeral nvme drives? I'm seeing hdfs throughput of up to several GB/node using such instances.


You can use instances with locally attached SSDs. Then you're responsible for their reliability, so you're not getting all the 'cloud' benefits. Used them for provisioning our own CI cluster with raid-0 btrfs running PostgreSQL. Only backed up the provisioning and CI scripts.


Got burned there for sure! Speed is one thing but the cost is outrageous for io heavy apps! Anyways I moved to lightsail which doesn’t have io costs paradoxically so while io is slow at least the cost is predictable!


A couple of 64GB RAM sticks and you can fit your data in RAM.


It's all about the right tool for the job('s scale.)


You can skip Hadoop and go from SQLite to something like S3 + Presto that scales to extremely high volumes with low latency and better than linear financial scaling.


Does hundreds of gigs introduce a general performance hit or could it still be further optimized using some smart indexing strategy?


Everything is fast if it fits in memory and caches.


Unless it's accidentally quadratic. Then all the RAM in the world isn't going to help you.


And in 2021, almost everything* does.

*Obviously not really. But very very many things do, even doing useful jobs in production, as long as you have high enough specs.


I've had similar experiences. Sometimes we'll have a dataset with tens of thousands of records and it will give rise to the belief that it's a problem that requires a highly scalable solution, because "tens of thousands" is more than a human can hold in their head. In reality, if the records are just a few columns of data, the whole set can be serialized to a single file and consumed in one gulp into a single object in memory on commodity hardware, no sweat. Then process it with a for loop. Very few enterprises actually have big big data.


My solution started out as a 10-line Python script where I would manually clean the data we received, then process it. CEO: "Will this scale?"

Me: "No, absolutely not, at some point we'll need to hire someone who knows what they're doing."

As time passed and we got more data, I significantly improved the data cleaning portions so that most of it was automated, and the parts that weren't automated would be brought up as suggestions I could quickly handle. I learned the very basics of performance and why `eval` is bad, set up my script so I didn't need to hard-code the number of files to process each day, started storing data on a network drive and then eventually a db...

I still don't know what I'm doing, but by the time I left it took maybe 5 minutes of manual data cleaning to handle thousands of jobs a day, and then the remainder could be done on a single machine.


Said enterprise WISH they had big data. Or maybe it's fear, as in, 'what IF we get big data?'


I'm aware of a couple of companies who behave like that - "well, we could increase our user base by an order of magnitude at any point here, so better spring for the order-of-magnitude more expensive database, just in case we need it."

Feels like toxic optimism.


It's not just about scaling databases; some people are simply unable to assess reasonable limits on any system. A few years ago a certain Scandinavian publisher decided to replace their standard industry tools with a single "Digital Experience Platform" that was expected to do everything. After a couple of years they understood it was a stupid idea and gave up. Then later someone in management thought that since they had already spent some millions of euros they should continue anyway. This behemoth is so slow and buggy the end users work at 1/4th speed, but everyone is afraid to say anything as the ones who did have been fired. The current PM is sending weekly success messages. It's hilarious. And all because someone once had a fantasy of having one huge system that does everything.


I love the phrase 'toxic optimism'.


I've noticed business people have a different idea of what 'big data' means than tech guys do. The business guys think it means a lot of data, like the records of a million people, which is a lot of data but not the tech-guy definition, which tends to be data too large to process on a single machine.

Those come out at something like 1GB and 10TB which are obviously rather different.


Unfortunately, this kind of behavior will be rewarded by the job market, because he's now got a bunch more tech buzzwords on his resume than you. Call it the Resume Industrial Complex: engineers build systems with as many bells and whistles as possible, because they want to learn all the hot new tech stacks so they can show off their extensive "skills" to potential employers.


I wonder what percentage of data center power is wasted running totally unnecessary trendy abstractions...


God I love stories like this


My favorite part of conducting design interviews is when a candidate has pulled some complex distributed system out of their ass, and I ask them what the actual throughput/memory usage looks like.


And what would happen the day you were hit by a bus?


On that day, most probably nothing with regards to this task.

Then, later, probably someone would check out the scripts from a shared repo. Then, read an outdated README, try it out, swear a bit, check for correctness with someone dependent on the results, and finally learn how to do the task.

There are a lot of business processes that can tolerate days or weeks of delay in case of such a tragic (and hopefully improbable) event. The trick is to know which of them can't.


That second paragraph is very relatable.


> There are a lot of business processes that can tolerate days or weeks of delay in case of such a tragic (and hopefully improbable) event. The trick is to know which of them can't.

This is really true, BUT that kind of problem is OK - nobody cares - until somebody starts caring, and then all of a sudden it is urgent (exactly because it went undetected for weeks/months due to its periodicity).


I meant it's fine to take calculated risks.

E. g. we have less than 1% chance per year that a given person leaves us on bad terms or suffers a bad accident or illness. In case it really happens, it will cost us X in delays and extra work. To lower the probability of this risk to Y% would cost us Z (money, delay, etc).

If you do this math, you can tell if it's a good idea to optimize here, or if you have more pressing issues.

In my experience, this sort of one-man job gets automated, or at least well described and checked, for fear of mistakes and/or employee fraud rather than "downtime".


Mistakes are "downtime" as well in a way. Or maybe better, downtime is a mistake, cause errors and lead to problems.


Another guy installs Postgres on his machine, runs a git clone, connects to the VPN and initiates the jobs?


I wasn't even using a db at the time, it was 100% pandas. We did eventually set up more infrastructure, when I left the data was loaded into the company's SQL Server db, then pulled into pandas, then uploaded back into a different table.


It's true – at that point, if I had disappeared without providing any transition help, the company would have been in trouble for a few days. But that goes for any employee – we were only 7 people at the time!

Eventually I built out some more infrastructure to run the jobs automatically on a dedicated machine, but last I checked everything still runs on one instance.


All of my routine tasks are documented in the operations manual. I'd be missed but the work would still get done.


SO is always impressive - love that their redis servers with 256GB RAM peak at 2% CPU load :)

SO is also my go-to argument when some smart "architect" proposes redundant Kubernetes cluster instances for some company-local project. People seem to have lost the feeling for what is needed to serve a couple of thousand concurrent users (for company-internal usage, which I specialize in, you will hardly get more users). Everyone thinks they are Google or Netflix. Meanwhile, SO runs on 1-2 racks with a number of servers that would not even justify Kubernetes or even Docker.


SO really isn't a great example, they have considerations most companies don't - Windows and SQL Server licensing. When shit like that is involved, scale out rarely seems like a better choice.

It's not only about the amount of users, it's also a matter of availability. Even the most stupid low-use barely-does-anything internal apps at my company get deployed either to two machines or a Nomad cluster for redundancy ( across two DCs). Design for failure and all that. Failure is unlikely, but it's trivial to setup at least active-passive redundancy just in case, it will make failures much easier.


The "1-2 racks" perspective is great too, really makes you think the old XKCD joke [1] about tripping over the power cable might not be that far wrong. ;-)

[1] https://xkcd.com/908/


> SO is also my go-to argument when some smart "architect" proposes redundant Kubernetes cluster instances for some company-local project.

Technically you don't need Kubernetes, yes. But: There are advantages that Kubernetes gives you even for a small shop:

- assuming you have decent shared storage, it's a matter of about 30 minutes to replace a completely failed machine - plug the server in, install a bare-bones Ubuntu, kubeadm join, done. If you use Puppet and netboot install, you can go even faster (Source: been there, done that). And the best thing: assuming well-written health checks, users won't even notice you just had a node fail, as k8s will take care of rescheduling.

- no need to wrangle with systemd unit files (or, worse, classic init.d scripts) for your application. For most scenarios you will either find Docker-embedded healthchecks somewhere or you can easily write your own, so that Kubernetes can automatically restart or reschedule anything unhealthy

- no "hidden undocumented state" like wonky manual customizations somewhere in /etc that can mess up disaster recovery / horizontal scale, as everything relevant is included in either the Kubernetes spec or the Docker images. Side effect: this also massively reduces the ops load during upgrades, as all there is on a typical k8s node should be the base OS and Docker (or, in newest k8s versions, not even that anymore)

- it's easy to set up new development instances in a CI/CD environment

- generally, it's easier to get stuff done in corporate environments: just spin up a container on your cluster and that's it, no wrestling with finance and three levels of sign-off to get approval for a VM or, worse, bare metal.

I won't deny that there are issues though, especially if you're selfhosting:

- you will run into issues with basic network tasks very quickly during setup. MetalLB is a nightmare to set up, but smooth once you do have it set up. Most stuff is made with the assumption of every machine being in a fully Internet-reachable cluster (coughs in certbot); once you diverge from that (e.g. because corp requires you to have dedicated "load balancer" nodes that only serve to direct traffic from outside to inside, with "application" nodes not directly internet-reachable) you're on your own.

- most likely you'll end up with one or two sandwich layers of load balancing (k8s ingress for one, and if you have it an external LB/WAF), which makes stuff like XFF headers ... interesting to say the least

- same if you're running anything with UDP, e.g. RTMP streaming

- the various networking layers are extremely hard to debug as most of k8s networking (no matter the overlay you use) is a boatload of iptables black magic. Even if you have a decade of experience...


Your arguments are true, but you did not consider the complexity that you have now introduced into a small shop operation. You will need Kubernetes knowledge and experienced engineers on that matter. I would argue that the SO setup with 9 web servers, 2x2 DB servers and 2 Redis servers could easily be administered with 20-year-old knowledge about networks and Linux/Windows itself.

And I would also argue that a lack of experience fiddling with redundant Kubernetes is a more likely source of downtime than hardware failure in a setup that keeps things simple.


Thankfully in this case there is 13 years of knowledge in the DB itself.


> You will need kubernetes knowledge and experienced engineers on that matter.

For a small shop you'll need one person knowing that stuff, or you bring in an external consultant for setting up and maintaining the cluster, or you move to some cloud provider (k8s is basically a commodity that everyone and their dog offers, not just the big 3!) so you don't have to worry about that at all.

And a cluster for basic stuff is not even that expensive if you do want to run your own. Three worker machines and one (or, if you want HA, two) NAS systems... half a rack and you're set.

The benefit you have is your engineers will waste a lot less time setting up, maintaining and tearing down development and QA environments.

As for the SO setup: the day-to-day maintenance of them should be fairly simple - but AFAIK they had to do a lot of development effort to get the cluster to that efficiency, including writing their own "tag DB".


You will always need 2-3 experts, because in case of an incident, your 1 engineer might be on sick/holiday leave.

Well, but let's take one step back: looking at SO, they are a Windows shop (.NET, MS SQL Server), so I doubt k8s would be found in their setup.


Ah yes, I’ll make my critical infrastructure totally dependent on some outside consultant who may or may not be around when I really need him. That sounds like a great strategy. /s


SO is a great counter example to many over complicated setups, but they have a few important details going for them.

> Every time you go to SO, it hits one of these 9 web servers

This isn't strictly true. Most SO traffic is logged-out, most doesn't require strictly consistent data, and most can be cached at the CDN. This means most page views should never reach their servers.

This is obviously a great design! Caching at the CDN is brilliant. But there are a lot of services that can't be built like this.


CDN caches static assets. The request still goes to SO servers. Search goes to one of their massive Elastic Search servers.

I’m not saying we should all use SO’s architecture, I am trying to shed light on what’s possible.

YMMV obviously.


Are you an SO dev? I had thought I read about the use of CDNs and/or Varnish or something like that for rendered pages for logged out users? I don't want to correct you on your own architecture if you are!


No, not a dev at SO. I am guessing what would be rather a standard use of CDN (hosting static assets, caching them geographically).

What you're saying is probably right.


We went all-in on vertical scaling with our product. We went so far that we decided on SQLite, because we never planned to have a separate database server (or any separate host, for that matter). 6 years later that assumption has still held very strong and yielded incredible benefits.

The slowest production environment we run in today is still barely touched by our application during the heaviest parts of the day. We use libraries and tools capable of pushing millions of requests per second, but we typically only demand tens to hundreds throughout the day.

Admitting your scale fits on a single host means you can leverage benefits that virtually no one else is even paying attention to anymore. These benefits can put entire sectors of our industry out of business if more developers were to focus on them.


Do you have any more details on your application? Sounds like your architecture choice worked out really well. I'm curious to hear more about it.


Our technology choices for the backend are incredibly straightforward. The tricky bits are principally .NET Core and SQLite. One new technology we really like is Blazor, because their server-side mode of operation fits perfectly with our "everything on 1 server" grain, and obviates the need for additional front-end dependencies or APIs.


how do you handle backup/replication for your sqlite server?


Our backup strategy is to periodically snapshot the entire host volume via relevant hypervisor tools. We have negotiated RPOs with all of our customers that allow for a small amount of data loss intraday (I.e. w/ 15 minute snapshot intervals, we might lose up to 15 minutes of live business state). There are other mitigating business processes we have put into place which bridge enough of this gap for it to be tolerable for all of our customers.

In the industry we work in, as long as your RTO/RPO is superior to the system of record you interface with, you are never the sore thumb sticking out of the tech pile.

In our 6-7 years of operating in this manner, we still have not had to restore a single environment from snapshot. We have tested it several times though.

You will probably find that VM snapshot+restore is a ridiculously easy and reliable way to provide backups if you put all of your eggs into one basket.


>> You will probably find that VM snapshot+restore is a ridiculously easy and reliable way to provide backups if you put all of your eggs into one basket.

Yep, this is something we rely on whenever we perform risky upgrades or migrations. Just snapshot the entire thing and restore it if something goes wrong, and it's both fast and virtually risk-free.


I’m not the OP but I’m the author of an open source tool called Litestream[1] that does streaming replication of SQLite databases to AWS S3. I’ve found it to be a good, cheap way of keeping your data safe.

[1]: https://litestream.io/


I am definitely interested in a streaming backup solution. Right now, our application state is scattered across many independent SQLite databases and files.

We would probably have to look at a rewrite under a unified database schema to leverage something like this (at least for the business state we care about). Streaming replication implies serialization of total business state in my head, and this has some implications for performance.

Also, for us, backup to the cloud is a complete non-starter. We would have to have our customers set up a second machine within the same network (not necessarily same building) to receive these backups due to the sensitive nature of the data.

What I really want to do is keep all the same services & schemas we have today, but build another layer on top so that we can have business services directly aware of replication concerns. For instance, I might want to block on some targeted replication activity rather than let it complete asynchronously. Then, instead of a primary/backup, we can just have 4-5 application nodes operating as a cluster with some sort of scheme copying important entities between nodes as required. We already moved to GUIDs for a lot of identity due to configuration import/export problems, so that problem is solved already. There are very few areas of our application that actually require consensus (if we had multiple participants in the same environment), so this is a compelling path to explore.


You can stream backups of multiple database files with Litestream. Right now you have to explicitly name them in the Litestream configuration file, but in the future it will support using a glob or file pattern to pick up multiple files automatically.

As for cloud backup, that's just one replica type. It's usually the most common so I just state that. Litestream also supports file-based backups so you could do a streaming backup to an NFS mount instead. There's an HTTP replica type coming in v0.4.0 that's mainly for live read replication (e.g. distribute your query load out to multiple servers) but it could also be used as a backup method.

As for synchronous replication, that's something that's on the roadmap but I don't have an exact timeline. It'll probably be v0.5.0. The idea is that you can wait to confirm that data is replicated before returning a confirmation to the client.

We have a Slack[1] as well as a bunch of docs on the site[2] and an active GitHub project page. I do office hours[3] every Friday too if you want to chat over zoom.

[1]: https://join.slack.com/t/litestream/shared_invite/zt-n0j4s3c...

[2]: https://litestream.io/

[3]: https://calendly.com/benbjohnson/litestream


I really like what I am seeing so far. What is the rundown on how synchronous replication would be realized? Feels like I would have to add something to my application for this to work, unless we are talking about modified versions of SQLite or some other process hooking approach.


Litestream maintains a WAL position so it would need to expose the current local WAL position & the highest replicated WAL position via some kind of shared memory—probably just a file similar to SQLite's "-shm" file. The application can check the current position when a transaction starts and then it can block until the transaction has been replicated. That's the basic idea from a high level.


Does your application run on your own servers, your customers' servers, or some of each? I gather from your comments that you deploy your application into multiple production environments, presumably one per customer.


Both. We run a QA instance for every customer in our infrastructure, and then 2-3 additional environments per customer in their infrastructure.


Vertical scaling maybe works forever for the 99% of companies that are CRUD apps running a basic website. As soon as you add any kind of 2D or 3D processing like image, video, etc. you pretty much have to have horizontal scaling at some point.

The sad truth is that your company probably won't be successful (statistically). You pretty much never have to consider horizontal scaling until you have a few hundred thousand DAU.


You don't need to scale your application horizontally even with media processing; you just need to distribute that chunk of the work, which is a lot easier (no state).


Like someone else said, distributing work across multiple machines is a form of horizontal scaling.


> Like someone else said, distributing work across multiple machines is a form of horizontal scaling.

Sure, but it is the easy kind, when it comes to images or videos. Lambda, for example, can handle a huge amount of image processing for pennies per month and there is none of the additional machine baggage that comes with traditional horizontal scaling.


It really depends. A streaming video service that does any kind of reprocessing of the data would probably be better off with horizontal scaling.


I imagine it's still super simple to have one core app that handles most of the logic and then a job queue system that runs these high-load jobs on worker machines.

Much simpler than having everything split.


Worker machines sound like horizontal scaling to me.


Sure, but it's a massive amount simpler than some massively distributed microservice app where every component runs on multiple servers.

Most of these vertical scaling examples given actually do use multiple servers but the core is one very powerful server.


Definitely. There is certainly a place for horizontal scaling. Just wanted to highlight how underrated vertical scaling is, and that a good engineer would evaluate these scaling options with prudence and perspicacity, not the cult behavior so often observed in software engineering circles.


I think this is related to how business-minded people think, too. I went to a course where people learn to pitch their ideas to get funding, but the basics of business simply didn't exist much among the technical people.

One simple example (which I suspect most businesses use) is that you do all the work either manually yourself or on your laptop while advertising it as a resource-rich service. Only when you truly cannot handle the demand do you 'scale up' and turn your business into a 'real' business. There are plenty of tricks like this (as legal as possible).


> Everytime you go to SO, it hits one of these 9 web servers and all data on SO sits on those 2 massive SQL servers. That's pretty amazing.

I don't find it amazing at all. Functionality-wise, StackOverflow is a very simple Web application. Moreover, SO's range of 300-500 requests per second is not a mind-blowing load. Even in 2014, a powerful enough single physical server (running a Java application) was able to handle 1M requests per second[1]. A bit later, in 2017, similar performance was demonstrated on a single AWS EC2 instance using Python (and Japronto, a blazingly fast, HTTP-focused micro-framework), which is typically not considered a high-performance option for Web applications[2].

[1] https://www.techempower.com/blog/2014/03/04/one-million-http...

[2] https://www.freecodecamp.org/news/million-requests-per-secon...
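
For reference, the benchmark in [2] is essentially the framework's hello-world; from memory it looks roughly like this (treat the exact API as approximate), which is part of why the number is an upper bound rather than a typical workload:

    # Roughly the hello-world from the Japronto README (API quoted from memory).
    from japronto import Application

    def hello(request):
        return request.Response(text="Hello, world!")

    app = Application()
    app.router.add_route("/", hello)
    app.run()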


The amazing part is that the leadership allows it to be simple.

This is such a great competitive advantage.

Compare this to a leadership that thinks you absolutely must use Akamai for your 50 req/sec web server. You end up with tons of complexity for no reason.


Fair enough. Though still not too surprising, considering the original leadership of the company, one of whom (Joel Spolsky) is still on the board of directors. Having said that, the board's 5:4 VC-to-non-VC ratio looks pretty scary to me. But that's a different story ...


SO is a bit more complicated than returning a single character in a response. You can achieve high throughput with just about anything these days if you aren't doing any "work" on the server. 300-500 reqs/second is impressive for a web site/application with real-world traffic.


The thing is, 99% of companies could run like SO if their software were like SO.

But if you are confronted with a very large 15+ year old monolith that requires multiple big machines just to handle medium load, then you're not going to get this fixed easily.

It's very possible that you come to the conclusion that it is too complex to refactor for better vertical scaling. When demand increases, you simply buy another machine every now and then and spin up another instance of your monolith.


> if you are confronted with a very large 15+ year old monolith that requires multiple big machines just to handle medium load, then you're not going to get this fixed easily

The last 15+ year old monolith I touched needed multiple machines to run because it was constrained by the database, due to an insane homegrown ORM and poorly managed database schemas (a common theme, I find).

Tuning the SQL, rejigging things like session management, etc. would have made it go a lot quicker on a lot fewer machines, but management was insistent that it had to be redone as Node microservices under k8s.
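
The classic offender is the N+1 pattern that homegrown ORMs tend to generate. A schematic example (table names invented, "db" standing in for any DB-API style connection):

    # What the ORM effectively generated: one query per parent row,
    # i.e. N+1 round trips to the database.
    orders = db.execute("SELECT id FROM orders WHERE user_id = ?", (user_id,)).fetchall()
    for (order_id,) in orders:
        items = db.execute(
            "SELECT * FROM order_items WHERE order_id = ?", (order_id,)
        ).fetchall()

    # The hand-tuned version: one round trip, and the database does the join.
    rows = db.execute(
        """
        SELECT o.id, i.*
        FROM orders o
        JOIN order_items i ON i.order_id = o.id
        WHERE o.user_id = ?
        """,
        (user_id,),
    ).fetchall()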


I totally agree with your main point, and SO is kind of the perfect example. At the same time it's kind of the worst example, because for one, to the best of my knowledge, their architecture is pretty much an outlier, and for another, it is what it is for non-technical, historical reasons.

As far as I remember, they started that way because they were on a Microsoft stack and Microsoft's licensing policies were (are?) pretty much prohibitive for scaling out. It's an interesting question whether they would design their system the same way if they had the opportunity to start from scratch.


Most people responding here are nitpicking over whether SO's architecture is the right one. I wasn't trying to imply that at all.

I wanted to drive home a point, and SO is a good enough example: if a massive company the size of SO can run on that, so can your tiny app.

Don't scale prematurely. A lot can be done with reasonable vertical scaling.

For one $120k/year Kubernetes infra engineer, you could pay for an entire rack of beefy servers.

Obviously YMMV. Discussions about SO and licensing details are distracting.


Yes, but Stack Overflow is now mostly a graveyard of old closed questions, easily cached; I am only half joking. Most startup ideas today are a lot more interactive, so an SO model with two DBs would probably not serve them well. Horizontal scaling is not only for ETL, and I am unsure why you say it needs many lawyers.


Related, LetsEncrypt recently published a blogpost about their database servers and how they scale vertically: https://letsencrypt.org/2021/01/21/next-gen-database-servers...


Genuine question: how are 9 web servers vertical scaling? Also, a peak CPU usage of 12% means this is about 10x oversized for what is needed. Isn't it much better, mostly in terms of cost, to only scale up when actually needed?


Because they play in the major leagues, where most teams have hundreds or thousands of servers, while they have those nine.

Yes, there is some horizontal scaling, but the sheer amount of vertical scaling here is still mind-blowing.

I've run more servers than SO in what were basically hobby projects.


Probably 9 different geographical locations; nothing to do with the actual load per server.


Stack Overflow's use case has the benefit of being able to sit behind a Content Delivery Network (CDN), with a massive amount of infrastructure at the edge offloading much of the computational and database demand. This reduces the requirements on their systems dramatically. Given their experience in the segment, it's plausible that they understand how to optimize their user experience to balance out hardware demands and costs as well.


Looks like there are also two Redis servers with half a TB of RAM sitting between web and SQL. I'm sure that's a huge load off the SQL servers.

https://meta.stackoverflow.com/a/306604/2489265


I think I agree, but what do you mean exactly? Just keep getting beefier servers as opposed to serverless junk?


Not the OP, but yes, getting more powerful machines to run your program is what "vertical scaling" means (as opposed to running multiple copies of your program on similar-sized machines, aka "horizontal scaling").


A ‘single’ big box with multiple terabytes of RAM can probably outperform many ‘horizontally scaled’ solutions. It all depends on the workload, but I feel that sometimes it's more about being ’hip’ than being practical.

https://yourdatafitsinram.net/


Might apply to 99% of companies, but I doubt it applies to 99% of the companies that HN readers work for.


Their database query to page request ratio is about 20:1. Seems like this should be lower.


Stack Overflow has unique constraints (Microsoft licensing) which make vertical scaling a cheaper option, and IMHO that's rarely the case.


People keep fapping to this, but Stack Overflow is served read-only to most users, and is probably heavily cached.


What is vertical scaling?


Use bigger machines instead of more machines.

There's always a limit on how big you can go, and a smaller limit on how big you should go, but either way it's pretty big. I wouldn't go past dual Intel Xeon, because 4P gets crazy expensive. I haven't been involved in systems work on Epyc; 1P might be a sensible limit there, but maybe 2P makes sense for some uses.


If you have a single machine with 64 cores running 256 threads of your daemon, that's considered vertical scaling? Odd definition.


If multiple cores/threads shouldn't be considered vertical scaling, what should?

Overclocking a single-core processor from 2.8GHz to 4.2GHz can only take you so far, after all...
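
On a single box, "scaling" mostly means actually using all the cores you already paid for; a trivial sketch:

    from multiprocessing import Pool, cpu_count

    def crunch(chunk):
        return sum(x * x for x in chunk)  # stand-in for real CPU-bound work

    if __name__ == "__main__":
        data = [list(range(i, i + 10_000)) for i in range(0, 1_000_000, 10_000)]
        # One OS, one process table, all the cores: still one machine.
        with Pool(processes=cpu_count()) as pool:
            results = pool.map(crunch, data)
        print(sum(results))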


Yes, because it's still a single system image (one running OS with one process table).


'Scale up' vs 'scale out' are, to me, more intuitive terms for the same thing. Up/vertical are aligned, I guess


Usually it involves having an insane amount of RAM and keeping the entire DB in RAM.


I have found this to be a helpful resource when it comes to scaling in general.

https://alexpareto.com/scalability/systems/2020/02/03/scalin...


Get a more powerful single machine (in contrast to multiple machines). However, I wonder if multi-socket Xeons count as vertical or horizontal; I never understood how programmable those machines are...


Wow, thanks. This is the last place I expected to see C# mentioned. Very interesting!


It was mentioned in the article as being hard to build/test/deploy, but I disagree. Everything can be done in a few clicks using VS or Rider.


It might apply to the 99% who have specific requirements, but the vast majority of internet companies need more. Deployments, N+1 redundancy, HA, etc. are all valuable, even if some resources go to waste.


> Deployments, N+1 redundancy, HA etc

None of those things are mutually exclusive with vertical scaling?

Having two identical servers for redundancy doesn't mean you are scaling horizontally (assuming each can handle the load individually; nothing better than discovering that assumption was incorrect during an outage).


Indeed, all of those things existed before Docker.



