Biggest CPU for the Bad System (reddit.com)
50 points by opentokix 25 days ago | 32 comments


> The whole thing is about 10k APIs that all share the same cluster of 10 databases on the backend, which was never designed to scale like this. This company did 500 million revenue 2010 and now 15 billion this year, all running on this fking sql back end. They have a team of 500 devs writing for these apps, the complexity is unbelievable. No one knows how to untangle it and scale out to micro services.

Unfortunately the word “microservices” on HN brings out some low-quality discourse. Maybe this time will be different?

For this particular instance and company, no. Not because microservices can't work, but instead because of the coordination required between services/api/database that doesn't appear to exist at all in this company.

That said, depending on your application and how it's split up, quite often there are easy wins. An example I've seen is a company that had huge batch reporting jobs running at fixed intervals. They hammered the hell out of production, so they were moved to pull data from a read-only replica.

It really depends on what they are doing. At this stage, I'd suspect it needs some major rearchitecture, but they accumulated so much tech debt it's difficult to say what next steps could be. As was pointed out by others, starting by introducing some strong governance practices would be a good start.

My department was in a bad single-DB position like this. The only way out was big rewrites, and generally each team had one service with its own DB in the end. Dunno if you'd call that microservices or not.

I would if they can have their own release cycles.

Then yes

The important thing in these convos is to be clear each time what you mean by "microservices," because there isn't a solid definition for that. I've seen too many debates at work over whether we should use microservices, where both sides already agree on the actual design.

Adding micro-services is the way to make the company bankrupt.

"micro-services" is the worst solution to ANY problem. It is not a solution, only another way to write a problematic app.

The problem here is complexity and "micro-services" is the MOST efficient way to add complexity.

And the real problem here is complexity X scalability, which requires simplicity and tuning, stuff that "micro-services" IS NOT meant to solve.


I work in the enterprise space, and you bet you can cut 70%* (every time you find stuff like this) of the code and stay single-master-DB even for some very large companies if you have a half-decent architecture.

P.S.: Do note that cutting code does not mean the app will end up with just 30%; it's just that 70% is trash to be redone.

Are you saying that every company up to a certain size should have a single master DB?

It's hard to give significant advice with this little information - how much time the CPUs spend waiting for the memory, how many cache misses are happening, how many core execution units are doing something at any given time, etc.

HPE has single-image machines that can have up to 16 4th gen Xeons, which gives a top limit of 960 cores. IBM has POWER10 boxes that go up to 240 cores, but those are POWER10 cores that can do, IIRC, up to 8 threads per core (increasing cache misses, but reducing unused execution units).

Does SQL Server run on IBM Power?

I'd say one of the only options is a HPE Superdome Flex machine but as you said they might run into other bottlenecks at this scale.

I can't fathom what a database is doing on the CPU so much, usually I run out of I/O (both disk and network) on 128 core machines before maxing the CPUs. Also the post says they have 4 machines and 10 databases, very strange.

4 machines is easy: two redundant pairs. That's the minimum machine count I would run for an important use case (well, maybe 3x, one in each of three places).

If the 10 databases are independent, that seems like the easy way out --- siphon them off into separate clusters, and you should get some headroom; but if one database is 99% of the load, it won't be much.

Otherwise, you've got to find better hardware or partition the database somehow. The good news is, I don't think Azure has a 416 core server, but they do have 416 vCPU servers[1]. At 2 vCPUs per core, that's 208 cores, which is a lot, but a) these are Skylake cores and b) you can get a similar core count in a dual socket Epyc board these days, with much newer cores. Not sure if you can get one of those in a cloud though.

Edit to add: there's also a lot of potential to move compute out of the database. Without knowing anything about their queries, my experience has been that the most expensive queries are either unnecessary table scans (which can often be fixed) or joins. For joins, sometimes you can fix those to run better, and sometimes it's better to do a 'client assisted join': first do an indexed query to get the ids of things you want, then do a big union of queries to get the details. You can tell me how disgusting that is, but it can turn a thing that takes one round trip and hard processing on the server into a thing that takes two round trips and is pretty easy for the server. Maybe SQL Server is better at joins than MySQL though? Sometimes it might not be ok for data integrity/transactional reasons, but usually it's ok. Joining might be hard on the clients, too, but it's usually easier to add more database clients than to scale the database server.

[1] https://learn.microsoft.com/en-us/azure/virtual-machines/mv2...
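The 'client assisted join' described above could be sketched roughly like this. This is only an illustration under assumptions: the `customers` and `orders` tables, their columns, and the function names are hypothetical, SQLite stands in for the real server so the example is self-contained, and an `IN (...)` list is used in place of the "big union of queries" the comment mentions.

```python
import sqlite3


def demo_connection():
    """Build a tiny in-memory database with hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, country TEXT);
        CREATE INDEX idx_customers_country ON customers(country);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY, customer_id INTEGER, total INTEGER);
        CREATE INDEX idx_orders_customer ON orders(customer_id);
        INSERT INTO customers VALUES (1, 'IE'), (2, 'AU'), (3, 'IE');
        INSERT INTO orders VALUES (1, 1, 10), (2, 2, 20), (3, 3, 30), (4, 1, 40);
    """)
    return conn


def client_assisted_join(conn, country):
    """Fetch orders for customers in `country` without a server-side JOIN."""
    # Round trip 1: indexed lookup that returns only the ids we care about.
    ids = [row[0] for row in conn.execute(
        "SELECT id FROM customers WHERE country = ?", (country,))]
    if not ids:
        return []
    # Round trip 2: fetch the details for exactly those ids.
    placeholders = ",".join("?" for _ in ids)
    rows = conn.execute(
        "SELECT customer_id, total FROM orders "
        f"WHERE customer_id IN ({placeholders}) ORDER BY id", ids)
    return rows.fetchall()


conn = demo_connection()
print(client_assisted_join(conn, "IE"))
```

With a large id list the second query would need to be batched, since databases cap the number of bound parameters per statement. The two reads are also not one atomic snapshot unless they run inside a transaction, which is the data integrity caveat the comment raises.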

That's a good point. I guess they shot themselves in the foot big time.

It might run on ARM. IIRC, Ampere has some large ones with lots of memory bandwidth. Maybe CXL memory can also help mitigate any disk IO.

This is fake, just rage bait. Besides the numbers in the post just not making any sense, the OP states that the company is in healthcare[1] but then says he's a 43 year old director[2], which still tracks, but then he says he's been 20 years in "big law"[3], then an IT director in fintech[4]. He says he's changed jobs twice in the last two years[5]. I gave up looking after just the first page of his post history.

[1] https://old.reddit.com/r/sysadmin/comments/1cqn3qa/whats_the...

[2] https://old.reddit.com/r/ITManagers/comments/1cqa0cp/genai_i...

[3] https://old.reddit.com/r/sysadmin/comments/1cotpdb/how_is_wo...

[4] https://old.reddit.com/r/Ameristralia/comments/1cnyxsh/what_...

[5] https://old.reddit.com/r/Intune/comments/ncj7oa/ios_sso_exte...

I ran a stressed app some years ago. We only had a wee little backend because our revenue was v.low, but we wanted to do stuff like sleep inside and eat, and so were motivated to cut costs to make profit.

What I did was make a table of all the queries that were being run on my backend, and I ordered them by the number of times that they were called and the cost of calling them (I honestly can't remember the measure I used for that but it was like cputime*memory or similar). I then did two things for the top queries.

1) Optimised them where I could.

2) Looked for where they were being used and tried to stop it.

(2) was very successful.
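A table like the one described above can be built from any log of executed queries. The sketch below is a guess at the shape of it, not the commenter's actual tooling: the `(query, cost)` log format, the sample queries, and the `rank_queries` name are assumptions, and `cost` is whatever per-call measure is available (the comment mentions something like cputime*memory).

```python
from collections import defaultdict


def rank_queries(log):
    """Aggregate (query, cost) samples and rank by total cost, highest first."""
    stats = defaultdict(lambda: {"calls": 0, "total_cost": 0.0})
    for query, cost in log:
        stats[query]["calls"] += 1
        stats[query]["total_cost"] += cost
    return sorted(
        ((q, s["calls"], s["total_cost"]) for q, s in stats.items()),
        key=lambda row: row[2],
        reverse=True,
    )


# Hypothetical log entries: one (query text, cost) pair per execution.
sample_log = [
    ("SELECT * FROM sessions WHERE user_id = ?", 2.0),
    ("SELECT * FROM reports", 50.0),
    ("SELECT * FROM sessions WHERE user_id = ?", 2.0),
    ("SELECT * FROM sessions WHERE user_id = ?", 2.0),
]

for query, calls, total in rank_queries(sample_log):
    print(f"{total:8.1f}  {calls:4d}  {query}")
```

Sorting by total cost surfaces both kinds of offender: the rare but expensive query and the cheap query that is called far too often. The second kind is the one that step (2) removes.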

It's hard for me to believe that you would not start buying your own hardware at this scale, particularly when the hyperscalers (at first glance) don't have anything that matches the needs.

The biggest x86 machine I found tops out at 960 cores, but I'm not sure what exactly they need, if having more cores would solve their problems or would only make some other pipe burst.

To figure that out, we'd need to look deep into what's happening in the machine, down to counting cache misses, memory bandwidth usage (per channel), QPI link usage (because NUMA), and, maybe, even go down to the usage stats of the CPU execution units.

When they mention a lot of what was stored procedures has been moved to external web services, I get concerned they replaced memory and CPU occupancy with it waiting for network IO.

I would hazard a guess that they're not really CPU bound.
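A first, very coarse check of whether a box is CPU bound is to see how much CPU time goes to waiting on IO. The sketch below assumes a Linux host exposing `/proc/stat` (the system in the thread may well be Windows, where the counters differ); the function names and sample numbers are made up for illustration. It is far shallower than the cache-miss and execution-unit analysis described above.

```python
def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a dict of jiffies."""
    fields = line.split()
    if fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    names = ["user", "nice", "system", "idle",
             "iowait", "irq", "softirq", "steal"]
    return dict(zip(names, map(int, fields[1:9])))


def iowait_share(before, after):
    """Fraction of elapsed CPU time spent in iowait between two samples."""
    delta = {key: after[key] - before[key] for key in before}
    total = sum(delta.values())
    return delta["iowait"] / total if total else 0.0


# Two made-up samples taken some seconds apart.
first = parse_cpu_line("cpu 100 0 50 800 50 0 0 0 0 0")
second = parse_cpu_line("cpu 150 0 70 900 130 0 0 0 0 0")
print(f"iowait share: {iowait_share(first, second):.0%}")
```

On a real machine the two lines would come from reading the first line of `/proc/stat` twice with a pause in between. A high share suggests the cores are waiting on storage, though the kernel's iowait accounting is only approximate on multi-core systems.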

Assuming the poster Aussiepete80 is Australian, I should point out that the much higher salaries in the US and the favorable E3 visa have largely brain drained Australia of its best and brightest. This guy's army (dozens) of DBAs is likely the residual.

Not really a nice thing to say - not everyone is inclined to move to the US (for me, and I'm not that cool, it'd need to be more than 6 digits). I suspect there are plenty of competent people wishing to stay in Australia (in fact, a lot of healthcare folk here in Ireland have decided to move there - the weather is so much better, despite the giant spider and murderous fauna problem).

Predators here in Ireland are terrifying. The other day I was almost adopted into a family of feral cats.

One competent person probably, 'some' possibly, 'plenty' definitely not. The general advice for tech people in Australia is to move to the US ASAP - the more competent the tech person, the stronger the advice. The insidious thing about brain drain is that it is self-reinforcing, i.e. the more it happens the stronger the pull. In tech it's important to work with other competent people. Historically Australians are unusually good programmers, so I am not saying Australians are unskilled, but I am saying their best and brightest leave Australia ASAP.

Plus I would submit the reddit thread 'Biggest CPU for the Bad System' as evidence for my point. How could this person possibly think that 100s of CPU cores are the bottleneck and the solution is more CPU cores? How could the dozens of DBAs not figure it out? The amount of work that can be done on 100s of CPU cores is utterly insane, and the fact that they didn't rule out obvious IOWait bottlenecks in their initial posts suggests that none of them know what they're doing. I guess they figured that Azure would take care of all of that.

The initial step would be to bring that computer in house and design that computer around the problem - Azure is not going to be open and honest about their bottlenecks. Last I checked Azure was big on their Network Attached Storage and Managed Disks which add quite a lot of latency and throughput bottlenecks compared to a PCIe 5.0 Enterprise SSD.

I’d say the telltale sign is picking SQL Server in the first place.

If you think there is a risk your database will grow to be enormous, it’s likely it’ll be the worst possible choice, with the possible exception of Oracle and DB2 on AIX (but, on the last two, at least you have fewer limits to scaling up).

Could also be a B-player hires C-players situation, with borderline incompetence trickling down from the C-level.

Certainly, if you're planning on outgrowing a SQL Server you should probably DIY tailor-made indexes and query languages/engines. LLVM makes DIY languages much easier to build than they used to be, and with fast interpreters and AOT compilation they can be very performant.

The reason for my brain drain statement is that people don't understand how pernicious the brain drain is in Australia. The Australian government believes that there isn't a net brain drain because more people with degrees immigrate than those emigrating. But people are not substitutable cogs and the very smart are very rare and it is the loss of those few very smart Australians to the US which is so devastating.

If they really need 1600 cores, perhaps they should be looking at a cluster with an Infiniband connection? Infiniband isn't cheap, but they only have to beat nearly a million dollars per month, which gives them a lot of headroom.

They are also tied to MS SQL Server, it seems. Not sure it supports the fanciest gear out there. I've only seen Infiniband in HPC settings which are overwhelmingly non-Windows.

Another comment pointed out that the Reddit poster was probably just making the whole thing up: the account had made a bunch of different posts claiming a bunch of incompatible high-ranking jobs. The creative writing on that site is very odd.

It might have uptime requirements that they can't provide on their own.

This seems like a good use case for Spanner? The pain would be in migrating the backend to GKE, but if you are hitting the limits of what Azure can do you are going to have to migrate at some point.

It'd basically require rewriting everything, and even then, Spanner isn't necessarily a good way to do it.

You could have some breathing room by scaling up to a bigger box (AFAIK, x86 tops out at 960 cores per memory image).

Having been through similar situations in my past life, I can confidently say that they don't need more CPU cores, they need to start really looking at their architecture holistically and identifying the critical path that can be rewritten in priority order for performance. At this point, throwing more hardware at the problem is the wrong thing to do /even/ if it temporarily kicks the can down the road. They have a fundamental system design issue that needs to be addressed, likely piecemeal and prioritized. The first step should be adding more performance instrumentation.

Shame it is running MS SQL. With anything like Postgres, Oracle or Db2, it might have been a candidate for running on an IBM LinuxONE, which might even be a valid contender at the cost it is currently running at.

