SiPearl Lets Rhea Design Leak: 72x Zeus Cores, 4x HBM2E, 4-6 DDR5 (anandtech.com)
65 points by rbanffy on Feb 25, 2021 | hide | past | favorite | 19 comments



> hybrid memory subsystem comprising four HBM2E memory stacks as well as four or six regular DDR5 memory channels

YES! That's what I wanted to see happen. I'm surprised AMD wasn't first to do this, considering their history with HBM GPUs.


AMD was nearly sinking until a few years ago, so throwing the little money they had at expensive pipe dreams like this would have been the end of them.

It's best that they stuck to their strengths, like Ryzen CPUs and Radeon GPUs, and turned the ship around.


Processors designed for HPC, GPUs, and generic server CPUs have different evolutionary forces acting on them. We can be sure AMD at least ran some simulations of how HBM would impact EPYC performance on server workloads.

It all depends on the workload and demand. AMD can have much larger volume with a part tailored for the space between HEDT and server than with one tailored to HPC workloads.

I would love to misuse one of these as my personal workstation (just as I would have loved to use a Xeon Phi), but not all my purchasing decisions are strictly rational.


You can get a Phi for $100 now, live a little. :)


The question is: is it easy to program and use under Linux? Last I checked, some years ago, the tooling was proprietary, I think.


If you want to extract every bit of advertised performance, you'd probably be better off with Intel's proprietary compilers. If you can accept a little less, it runs fine with clang and gcc. It just looks like a lot of x86 cores with AVX512.


I don't want the PCI board. I want a Knights Mill workstation motherboard.

Those aren't easy to come by ;-)


Oh yeah, I am with you; I tried to map out what it would take to acquire one. Sometimes the parts come up, but the likelihood that you could get everything integrated properly and not end up with a pile of junk is pretty low.

Then you have to wait for a mislabeled workstation to come up, where they just say "HP bla-bla" and not that it's a Phi. At least a grand, maybe two, for crazily old hardware.

You could go for a SPARC T3; they are fairly cheap, but not the massive sea of cores that Knights Mill is.

https://www.ebay.com/itm/Intel-HJ8068303823700-SR3VD-Xeon-Ph...

Who is buying these? The vendor to show a sale? Is this just for money laundering?

I haven't disassembled a Xeon Phi PCIe board, but I wouldn't be surprised if it is literally just this same chip in a low-profile socket.


I think the PCIe versions need the host to provide all their I/O services. IIRC, the host loads the OS image and then boots the Phi.

OTOH, the early PCIe Phis had SMT4, so the thread count was higher than in the latest versions. But they were Cell PPU-like "threads", in that giving each core a single thread to work on would result in only 25% overall core utilization - useful if adding more threads results in more L2 cache misses. In that respect, the latest Phis are much nicer: the individual cores, while not doing SMT4 (or 2), are much better than the first-gen ones.

I don't remember any SPARC T3 workstations either. It would have been smart to seed developers with them, starting before the first Niagara, to encourage development for high thread counts, but Sun's management at the time wasn't known for its business acumen.


I am just saying a T3 server is much more attainable than a Knights Landing workstation. A friend of mine is trying to get an Itanium workstation - expensive and difficult.

I think simulators are probably far more appropriate now, even though real hardware somehow makes things feel more real.

Every computation is a simulation, why not run the simulator on a simulation engine?

https://fires.im/


Hybrid memory systems are famously difficult to work with.

See the Xbox 360's eDRAM and the Xeon Phi (the DDR4 + HMC version), among others. It's a computer architecture that keeps showing up every few years... programmers get frustrated with the split-memory architecture, so it's rare for such a system to ever become popular.

----------

HBM doesn't have any better latency characteristics than DDR4. Traversing a linked list in HBM will take the same amount of time as in DDR4 (maybe even a bit longer). As such, it's non-obvious how to optimize the split-memory operations.

NUMA-style mallocs/frees (malloc targeting a particular memory location) don't really match the performance characteristics either. It's not that one RAM has better latency than the other... it's that one RAM has more BANDWIDTH than the other. NUMA-style allocation is already more complicated than most programmers are willing to deal with, and yet it's still not enough to capture the performance attributes of HBM vs. DDR4.


> September 8, 2020



> PCIe board

That shows up on the roadmap. Does that mean it's only available as an add-in card?


"leak"


Yes, I'm very mindful of how companies and politicians have "accidents" like this - so it may well be what I class as tactical marketing. If we see a share offering or funding for the company soon, then this leak would have been handy PR.

[EDIT ADD] It seems they had funding just over a year ago, so I'd expect them to be due another round soon. That, combined with staggered press releases with a European-supercomputer slant, means this leak probably does them more good than harm PR-wise - more so if they are in a position to seek new investment, which it does look like they will be. At least, that's how it feels to me.


Needs (2020) in the title.

Also, I'm not sure what is worthy of discussion here, or whether there is anything new.


It hasn't been discussed on this site yet, so I don't have a problem with the post.


>the project uses Arm’s upcoming Neoverse “Zeus” cores which succeed the Neoverse N1 Ares cores

I would have much rather seen RISC-V for true independence. Maybe the cores will be swappable?

This does seem like a legitimately good use case for it, with the means to succeed.



