Hacker News
Microsoft’s new x86 DataCenter class machines running Windows (2018) (anandtech.com)
22 points by rbanffy on July 5, 2020 | hide | past | favorite | 28 comments



From one of the comments on the article:

> I think Intel / Microsoft have found a way to make interconnects scalable, and that raises an interesting thought on the value of having more cores per CPU. Especially if the system was designed to be pluggable, so that if one of the CPUs failed it would not bring the entire system down.

I think this is an interesting point. If the UPI links on Xeon Platinum are restricted to three, I wonder what the 32-socket mesh looks like, and what implications that has for the availability of the system as a whole if one CPU were to fail. Also, unless these are fully NUMA-aware workloads, or each socket is dedicated to a VM, the latency for memory access has to be terrible, especially if the target is multiple interconnect hops away. Clearly it's not going to be a fully connected mesh?
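Out of curiosity, the hop-count question can be poked at with a toy graph. This is not Intel's actual topology, just a hypothetical 8-socket layout where each socket gets only three links (a ring plus cross-chords), to show how low the worst-case hop count can stay under that link budget:

```python
from collections import deque

# Hypothetical 8-socket topology: each socket has only 3 links
# (a ring 0-1-...-7-0 plus cross-chords i <-> i+4). This is NOT
# Intel's real layout, just a toy graph for counting hops.
N = 8
edges = [(i, (i + 1) % N) for i in range(N)] + [(i, i + 4) for i in range(4)]
adj = {i: set() for i in range(N)}
for a, b in edges:
    adj[a].add(b)
    adj[b].add(a)

def hops_from(src):
    """BFS distances (hop counts) from one socket to all others."""
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

links = {len(adj[i]) for i in range(N)}
worst = max(max(hops_from(s).values()) for s in range(N))
print("links per socket:", links)   # {3}
print("worst-case hops:", worst)    # 2
```

Even three links per socket keep every pair of the 8 sockets within 2 hops here; stretching the same per-socket link budget to 32 sockets necessarily grows the diameter, which is exactly the remote-access-latency worry.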


This starts to blur the line in my head between the low-level system concepts that your compiler/OS has to think about and the concepts you have to think about for distributed systems.

With the hardware systems and interconnects of today, we've drawn a line in the sand based on what we've seen work and fail. Something like Redis is fairly well understood for many architectures. But when you have crazy NUMA architectures and different memory models like Intel Optane, what changes? Where do you draw the "new" line between using Redis and keeping everything "local" to one box?

Will we see a local Kubernetes extension (or new software) that makes building for "massive boxes" easier?

Kind of a cool idea.


> Back in the days pre-Nehalem, the big eight socket 32-core servers were all the rage, but today not so much, and unless a company is willing to spend $250k+ (before support contracts or DRAM/NAND) on a single 8-socket system, it’s reserved for the big players in town. Today, those are the cloud providers.

SAP HANA is a major driver for single servers with the maximum number of cores, and the maximum memory per core, that Intel supports. This is true for on-premises deployments as well as Azure [1] and AWS [2] IaaS.

I think of it as commodity scale-up. The superior price-to-performance would apply to any hybrid in-memory column store, though HANA seems to be the only offering that exploits this approach. You can store a lot of data in 24 TB (or more) of in-memory compressed bitmap indexes.

[1] https://docs.microsoft.com/en-us/azure/virtual-machines/work...

[2] https://aws.amazon.com/sap/solutions/saphana/


Fun quote:

> From what we have heard from both OEMs and Intel, the vast majority of 8-socket servers today are sold in China. The most common reason for this that we hear is not because of the scale-up benefits. Instead, it is because of the numerology of the 8. It just so happens that 8 in China is a lucky number, much like 7 is considered by many in the US to be a lucky number.

https://www.servethehome.com/why-all-servers-are-not-4-socke...


SAP offers similar technology in their Enterprise Cloud, using interconnected POWER9 nodes:

https://insidesap.com.au/sap-hec-to-run-on-ibm-power-system-...

https://www.youtube.com/watch?v=djtqJ4P3OnU


The late 1990s called and want their NUMAlink back.


Some of the successful large-socket-count servers, specifically the SGI UltraViolet (UV) series (now sold by HPE, AFAIK), actually do use NUMAlink: a much-updated version, but the basics remain the same as with the MIPS and Itanium systems.


This is from 2018. Can we add that to the title?


Good catch. Added. Thanks!


Why does HN sometimes truncate leading words/numbers in submissions? The actual title is "869 Xeon Cores...."


Numbers at the start of titles are stripped, presumably to catch titles such as "10 great tips to improve your business".


And in this case it was a good thing. These are just normal machines (4 socket?) tied together with an interconnect.

So the rule works surprisingly well!


Those are common clickbait patterns.


No kidding. Despite HN posters saying it's "factual", this is likely just 4-socket systems tied together with an interconnect.

If Intel is really going to market these types of systems, they need to be a LOT more transparent about interconnect latency, memory-management issues, etc.

These headlines are almost always clickbait when you try to run high-contention workloads on these supposedly massive systems.

What is the name of the interconnect approach being used here? That's what I want to know.


If it runs a single system image, how is it different from any other NUMA system or a big.LITTLE arrangement?


Some big.LITTLE systems have very good interconnect and memory access. Some systems have remote NUMA memory: you go off one board, over an interconnect, to memory on another board.

The headline is so misleading. I could wire together a ton of machines over 10 Mb/s links, and yes, I could get NUMA going on it with a high core count, but the performance would be horrendous. Maybe you could even run super-NUMA over DB9 serial ports, with remote memory access latency measured in hours?



If it's a factual number is it really clickbait?


The exact number isn't usually relevant, so such numbers get removed, and the official guidelines request they not be included. In this case the number represents cores × sockets; the article mentions 2-, 4-, and 8-socket boards, and with a different processor it could vary a lot. The important part is combining the boards, not their particular hardware choices.

> If the title begins with a number or number + gratuitous adjective, we'd appreciate it if you'd crop it. E.g. translate "10 Ways To Do X" to "How To Do X," and "14 Amazing Ys" to "Ys." Exception: when the number is meaningful, e.g. "The 5 Platonic Solids."

https://news.ycombinator.com/newsguidelines.html
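HN's actual implementation isn't public, but the cropping the guideline describes amounts to roughly this (a sketch, with a made-up function name):

```python
import re

def crop_leading_number(title: str) -> str:
    # Drop a leading count (digits, optional commas) and the whitespace
    # after it, per the guideline: "10 Ways To Do X" -> "Ways To Do X".
    # Titles that don't start with a digit pass through unchanged.
    return re.sub(r"^\d[\d,]*\s+", "", title)

print(crop_leading_number("10 Ways To Do X"))           # Ways To Do X
print(crop_leading_number("869 Xeon Cores in one PC"))  # Xeon Cores in one PC
print(crop_leading_number("How To Do X"))               # How To Do X
```

Presumably the "meaningful number" exception ("The 5 Platonic Solids") is handled by humans rather than by the regex.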


The exact number isn't really relevant, but the fact that it is far higher than in any commonly available system IS relevant.

48+ Xeon-core systems are easily obtained and nothing special. The fact that this system has 896 cores is the interesting part.

"Xeon cores in one PC" doesn't even make sense.


I didn't realize that when submitting. It's too late for me to fix it.


The article states that it's actually four 8-socket machines connected over PCIe. Not what I would call "one PC".


Well, it's clearly not for Personal Computing, nor will anyone run DOS on it. It's a single-system-image (SSI) machine. Some DBs that can't be (effectively) clustered, and other software that doesn't scale out well (or hasn't been rewritten yet to make use of distributed computing), demand such a thing.


Then the obvious compromise title is "869 Xeon Cores in one SSI" :) That makes it clear that it's not a normal computer, but a weird, hyperexpensive, proprietary cluster thing that normal people won't be interested in.


It's a really big single computer; that's what SSI *means*. Yes, shared memory across the most distant cores will be slow.

> but a weird hyperexpensive proprietary cluster thing that normal people won't be interested in

Yes, I'm probably not going to use it to play Cyberpunk 2077 or run Microsoft Office. But this is Hacker News.

The existence of things like this should rationally lead people to question scale-out in some cases. If you know the ceiling on how large a machine you can throw at an unusual problem is very high, you may not want to dedicate expensive engineering effort to scaling out very early. After all, even very, very high-end computing is cheap, and people are very expensive.


Hmm, I don't know. It occupies a strange spot in the demand curve, doesn't it? Below C1M, you use commodity hardware. Above C100M, you're forced to use distributed systems, since no single machine can possibly handle the load.

But if you're in a startup, like many HN users, you have to fulfill Investor Storytime. In this consensual hallucination, everyone pretends the company can hockey stick at any minute. In this environment, you can't have hard limits. If the investors ask "What if traffic goes up by 1,000x?" you can't say "We die", because then you become unemployed.

So this thing is purely for companies with predictable loads... stable, established companies. Enterprises! Resume poison! Nobody in this economy would be so foolish as to admit experience with enterprise software, IBM z series(!), lest ye be cursed forevermore with its taint. Throw salt over your left shoulder, spit three times... A good HN user only uses cloud autoscaling, serverless, Lambda, anything to stay out of bank IT shops, where you're paid $70k a year and get outsourced at the drop of a hat.

I now understand why OP died with a mere 22 votes. Better I should forget the acronym "SSI" entirely, lest it contaminate me, become fallen... I clear my mind, think of web scale, BigTable...


Like lipstick on a pig.


Thank you for the mental image of a compute stick running lisp.



